Java mmap MappedByteBuffer

Java mmap MappedByteBuffer - java

Let’s say I’ve mapped a memory region [0, 1000] and now I have MappedByteBuffer.
Can I read and write to this buffer from multiple threads at the same time without locking, assuming that each thread accesses different part of the buffer for exp. T1 [0, 500), T2 [500, 1000]?
If the above is true, is it possible to determine whether it’s better to create one big buffer for multiple threads, or smaller buffer for each thread?

Detailed Intro:
If you wanna learn how to answer those questions yourself, check their implementation source codes:
MappedByteBuffer: https://github.com/himnay/java7-sourcecode/blob/master/java/nio/MappedByteBuffer.java (notice it's still abstract, so you cannot instantiate it directly)
extends ByteBuffer: https://github.com/himnay/java7-sourcecode/blob/master/java/nio/ByteBuffer.java
extends Buffer: https://github.com/himnay/java7-sourcecode/blob/329bbb33cbe8620aee3cee533eec346b4b56facd/java/nio/Buffer.java (which only does index checks, and does not grant an actual access to any buffer memory)
Now it gets a bit more complicated:
When you wanna allocate a MappedByteBuffer, you will get either a
HeapByteBuffer: https://github.com/himnay/java7-sourcecode/blob/329bbb33cbe8620aee3cee533eec346b4b56facd/java/nio/HeapByteBuffer.java
or a DirectByteBuffer: https://github.com/himnay/java7-sourcecode/blob/329bbb33cbe8620aee3cee533eec346b4b56facd/java/nio/DirectByteBuffer.java
Instead of having to browse internet pages, you could also simply download the source code packages for your Java version and attach them in your IDE so you can see the code in development AND debug modes. A lot easier.
Short (incomplete) answer:
Neither of them does secure against multithreading.
So if you ever needed to resize the MappedByteBuffer, you might get stale or even bad (ArrayIndexOutOfBoundsException) access
If the size is constant, you can rely on either Implementation to be "thread safe", as far as your requirements are concerned
On a side note, here also lies an implementation failure creep in the Java implementation:
MappedByteBuffer extends ByteBuffer
ByteBuffer has the heap byte[] called "hb"
DirectByteBuffer extends MappedByteBuffer extends ByteBuffer
So DirectByteBuffer still has ByteBuffer's byte[] hb buffer,
but does not use it
and instead creates and manages its own Buffer
This design flaw comes from the step-by-step development of those classes (they were no all planned and implemented at the same time), AND the topic of package visibility, resulting in inversion of dependency/hierarchy of the implementation.
Now to the true answer:
If you wanna do proper object-oriented programming, you should NOT share resource unless utterly needed.
This ESPECIALLY means that each Thread should have its very own Buffer.
Advantage of having one global buffer: the only "advantage" is to reduce the additional memory consumption for additional object references. But this impact is SO MINIMAL (not even 1:10000 change in your app RAM consumption) that you will NEVER notice it. There's so many other objects allocated for any number of weird (Java) reasons everywhere that this is the least of your concerns. Plus you would have to introduce additional data (index boundaries) which lessens the 'advantage' even more.
The big Advantages of having separate buffers:
You will never have to take care of the pointer/index arithmetics
especially when it comes to you needing more threads at any given time
You can freely allocate new threads at any time without having to rearrange any data or do more pointer arithmetics
you can freely reallocate/resize each individual buffer when needed (without worrying about all the other threads' indexing requirement)
Debugging: You can locate problems so much easier that result from "writing out of boundaries", because if they tried, the bad thread would crash, and not other threads that would have to deal with corrupted data
Java ALWAYS checks each array access (on normal heap arrays like byte[]) before it accesses it, exactly to prevent side effects
think back: once upon a time there was the big step in operating systems to introduce linear address space so programs would NOT have to care about where in the hardware RAM they're loaded.
Your one-buffer-design would be the exact step backwards.
Conclusion:
If you wanna have a really bad design choice - which WILL make life a lot harder later on - you go with one global Buffer.
If you wanna do it the proper OO way, separate those buffers. No convoluted dependencies and side effect problems.

Related

Simulation thread and data writer thread parallelism

This a general programming question. Let's say I have a thread doing a specific simulation, where speed is quite important. At every iteration I want to extract data from it and write it to a file.
Is it a better practice to hand over the data to a different thread and let the simulation thread focus on his job, or since speed is very important, make the simulation thread do the data recording too without any copying of data. (in my case it is 3-5 deques of integers with a size of 1000-10000)
Firstly it surely depends on how much data we are copying, but what else can it depend on? Can the cost of synchronization and copying be worth? Is it a good practice to create small runnables at each iteration to handle the recording task in case of 50 or more iterations per second?

If you truly want low latency on this stat capturing, and you want it during the simulation itself then two techniques come to mind. They can be used together very effectively. Please note that these two approaches are fairly far from the standard Java trodden path, so measure first and confirm that you need these techniques before abusing them; they can be difficult to implement correctly.
The fastest way to write the data to file during a simulation, without slowing down the simulation is to hand the work off to another thread. However care has to be taken on how the hand off occurs, as a memory barrier in the simulation thread will slow the simulation. Given the writer only cares that the values will come eventually I would consider using the memory barrier that sits behind AtomicLong.lazySet, it requests a thread safe write out to a memory address without blocking for the write to actually become visible to the other thread. Unfortunately direct access to this memory barrier is currently only availble via lazySet or via class sun.misc.Unsafe, which obviously is not part of the public Java API. However that should not be too large of a hurdle as it is on all current JVM implementations and Doug Lea is talking about moving parts of it into the mainstream.
To avoid the slow, blocking file IO that Java uses; make use of a memory mapped file. This lets the OS perform async IO for you on your behalf, and is very efficient. It also supports use of the same memory barrier mentioned above.
For examples of both techniques, I strongly recommend reading the source code to HFT Chronicle by Peter Lawrey. In fact, HFT Chronicle may be just the library for you to use here. It offers a highly efficient and simple to use disk backed queue that can sustain a million or so messages per second.

In my work on a stress-testing HTTP client I stored the stats into an array and, when the array was ready to send to the GUI, I would create a new array for the tester client and hand off the full array to the network layer. This means that you don't need to pay for any copying, just for the allocation of a fresh array (an ultra-fast operation on the JVM, involving hand-coded assembler macros to utilize the best SIMD instructions available for the task).
I would also suggest not throwing yourself head-on into the realms of optimal memory barrier usage; the difference between a plain volatile write and an AtomicReference.lazySet() can only be measurable if your thread does almost nothing else but excercise the memory barrier (at least millions of writes per second). Depending on your target I/O throughput, you may not even need NIO to meet the goal. Better try first with simple, easily maintainable code than dig elbows-deep into highly specialized APIs without a confirmed need for that.

"Give a rough estimate of the overhead incurred by each system call." - what? [duplicate]

I am a student in Computer Science and I am hearing the word "overhead" a lot when it comes to programs and sorts. What does this mean exactly?

It's the resources required to set up an operation. It might seem unrelated, but necessary.
It's like when you need to go somewhere, you might need a car. But, it would be a lot of overhead to get a car to drive down the street, so you might want to walk. However, the overhead would be worth it if you were going across the country.
In computer science, sometimes we use cars to go down the street because we don't have a better way, or it's not worth our time to "learn how to walk".

The meaning of the word can differ a lot with context. In general, it's resources (most often memory and CPU time) that are used, which do not contribute directly to the intended result, but are required by the technology or method that is being used. Examples:
Protocol overhead: Ethernet frames, IP packets and TCP segments all have headers, TCP connections require handshake packets. Thus, you cannot use the entire bandwidth the hardware is capable of for your actual data. You can reduce the overhead by using larger packet sizes and UDP has a smaller header and no handshake.
Data structure memory overhead: A linked list requires at least one pointer for each element it contains. If the elements are the same size as a pointer, this means a 50% memory overhead, whereas an array can potentially have 0% overhead.
Method call overhead: A well-designed program is broken down into lots of short methods. But each method call requires setting up a stack frame, copying parameters and a return address. This represents CPU overhead compared to a program that does everything in a single monolithic function. Of course, the added maintainability makes it very much worth it, but in some cases, excessive method calls can have a significant performance impact.

You're tired and cant do any more work. You eat food. The energy spent looking for food, getting it and actually eating it consumes energy and is overhead!
Overhead is something wasted in order to accomplish a task. The goal is to make overhead very very small.
In computer science lets say you want to print a number, thats your task. But storing the number, the setting up the display to print it and calling routines to print it, then accessing the number from variable are all overhead.

Wikipedia has us covered:
In computer science, overhead is
generally considered any combination
of excess or indirect computation
time, memory, bandwidth, or other
resources that are required to attain
a particular goal. It is a special
case of engineering overhead.

Overhead typically reffers to the amount of extra resources (memory, processor, time, etc.) that different programming algorithms take.
For example, the overhead of inserting into a balanced Binary Tree could be much larger than the same insert into a simple Linked List (the insert will take longer, use more processing power to balance the Tree, which results in a longer percieved operation time by the user).

For a programmer overhead refers to those system resources which are consumed by your code when it's running on a giving platform on a given set of input data. Usually the term is used in the context of comparing different implementations or possible implementations.
For example we might say that a particular approach might incur considerable CPU overhead while another might incur more memory overhead and yet another might weighted to network overhead (and entail an external dependency, for example).
Let's give a specific example: Compute the average (arithmetic mean) of a set of numbers.
The obvious approach is to loop over the inputs, keeping a running total and a count. When the last number is encountered (signaled by "end of file" EOF, or some sentinel value, or some GUI buttom, whatever) then we simply divide the total by the number of inputs and we're done.
This approach incurs almost no overhead in terms of CPU, memory or other resources. (It's a trivial task).
Another possible approach is to "slurp" the input into a list. iterate over the list to calculate the sum, then divide that by the number of valid items from the list.
By comparison this approach might incur arbitrary amounts of memory overhead.
In a particular bad implementation we might perform the sum operation using recursion but without tail-elimination. Now, in addition to the memory overhead for our list we're also introducing stack overhead (which is a different sort of memory and is often a more limited resource than other forms of memory).
Yet another (arguably more absurd) approach would be to post all of the inputs to some SQL table in an RDBMS. Then simply calling the SQL SUM function on that column of that table. This shifts our local memory overhead to some other server, and incurs network overhead and external dependencies on our execution. (Note that the remote server may or may not have any particular memory overhead associated with this task --- it might shove all the values immediately out to storage, for example).
Hypothetically we might consider an implementation over some sort of cluster (possibly to make the averaging of trillions of values feasible). In this case any necessary encoding and distribution of the values (mapping them out to the nodes) and the collection/collation of the results (reduction) would count as overhead.
We can also talk about the overhead incurred by factors beyond the programmer's own code. For example compilation of some code for 32 or 64 bit processors might entail greater overhead than one would see for an old 8-bit or 16-bit architecture. This might involve larger memory overhead (alignment issues) or CPU overhead (where the CPU is forced to adjust bit ordering or used non-aligned instructions, etc) or both.
Note that the disk space taken up by your code and it's libraries, etc. is not usually referred to as "overhead" but rather is called "footprint." Also the base memory your program consumes (without regard to any data set that it's processing) is called its "footprint" as well.

Overhead is simply the more time consumption in program execution. Example ; when we call a function and its control is passed where it is defined and then its body is executed, this means that we make our CPU to run through a long process( first passing the control to other place in memory and then executing there and then passing the control back to the former position) , consequently it takes alot performance time, hence Overhead. Our goals are to reduce this overhead by using the inline during function definition and calling time, which copies the content of the function at the function call hence we dont pass the control to some other location, but continue our program in a line, hence inline.

You could use a dictionary. The definition is the same. But to save you time, Overhead is work required to do the productive work. For instance, an algorithm runs and does useful work, but requires memory to do its work. This memory allocation takes time, and is not directly related to the work being done, therefore is overhead.

You can check Wikipedia. But mainly when more actions or resources are used. Like if you are familiar with .NET there you can have value types and reference types. Reference types have memory overhead as they require more memory than value types.

A concrete example of overhead is the difference between a "local" procedure call and a "remote" procedure call.
For example, with classic RPC (and many other remote frameworks, like EJB), a function or method call looks the same to a coder whether its a local, in memory call, or a distributed, network call.
For example:
service.function(param1, param2);
Is that a normal method, or a remote method? From what you see here you can't tell.
But you can imagine that the difference in execution times between the two calls are dramatic.
So, while the core implementation will "cost the same", the "overhead" involved is quite different.

Think about the overhead as the time required to manage the threads and coordinate among them. It is a burden if the thread does not have enough task to do. In such a case the overhead cost over come the saved time through using threading and the code takes more time than the sequential one.

To answer you, I would give you an analogy of cooking Rice, for example.
Ideally when we want to cook, we want everything to be available, we want pots to be already clean, rice available in enough quantities. If this is true, then we take less time to cook our rice( less overheads).
On the other hand, let's say you don't have clean water available immediately, you don't have rice, therefore you need to go buy it from the shops first and you need to also get clean water from the tap outside your house. These extra tasks are not standard or let me say to cook rice you don't necessarily have to spend so much time gathering your ingredients. Ideally, your ingredients must be present at the time of wanting to cook your rice.
So the cost of time spent in going to buy your rice from the shops and water from the tap are overheads to cooking rice. They are costs that we can avoid or minimize, as compared to the standard way of cooking rice( everything is around you, you don't have to waste time gathering your ingredients).
The time wasted in collecting ingredients is what we call the Overheads.
In Computer Science, for example in multithreading, communication overheads amongst threads happens when threads have to take turns giving each other access to a certain resource or they are passing information or data to each other. Overheads happen due to context switching.Even though this is crucial to them but it's the wastage of time (CPU cycles) as compared to the traditional way of single threaded programming where there is never a time wastage in communication. A single threaded program does the work straight away.

its anything other than the data itself, ie tcp flags, headers, crc, fcs etc..

What is mean by creating some object will be expensive in java/oop

I heard this statement many times when reading some java books/articles.My question is very straightforward that when we say that creating some object will be very expensive?
Here expensive is for what and at what scenario we should use this term.It would be very easy for me to understand if some one illustrate with small example and how to avoid this?

Expensive usually means it'll take a while, but it can also mean it'll take a lot of some other resource, such as memory, bandwidth, hosting budget, disk space, or anything else you'd like to use less of. For example,
new int[1000000000]
will be expensive, because it allocates and zeros an incredible amount of memory.

Expensive means it requires quite a good amount of system resources like memory, disk I/O. A good example would be creation of a databse connection object which requires quite a number of steps before you get an actual connection object. Each step in itself may perform operations like reading configuration from a file which requires I/O, loading the database driver, registering the driver etc.
Creating big arrays are also expensive because it takes a big chunk of memory.

Expensive word in used in sense of Activity performed (like io operations), actions, or creation of object.
In simplest words, anything that can make a slight performance hit is called expensive.

In case of object oriented programming Expensive is related to memory, resources etc your object is using.If unnecessarily your object is uses lots of memory or resources then somewhere you are not doing good programming.

What is the best resizable byte buffer available in Java?

I need a byte buffer class in Java for single-threaded use. The buffer should resize when it's full, rather than throw an exception or something. Very important issue for me is performance.
What would you recommend?
ADDED:
At the momement I use ByteBuffer but it cannot resize. I need one that can resize.

Any reason not to use the boring normal ByteArrayOutputStream?
As mentioned by miku above, Evan Jones gives a review of different types and shows that it is very application dependent. So without knowing further details it is hard to speculate.
I would start with ByteArrayOutputStream, and only if profiling shows it is your performance bottleneck move to something else. Often when you believe the buffer code is the bottleneck, it will actually be network or other IO - wait until profiling shows you need an optimisation before wasting time finding a replacement.
If you are moving to something else, then other factors you will need to think about:
You have said you are using single threaded use, so BAOS's synchronization is not needed
what is the buffer being filled by and fed into? If either end is already wired to use Java NIO, then using a direct ByteBuffer is very efficient.
Are you using a circular buffer or a plain linear buffer? If you are then the Ostermiller Utils are pretty efficient, and GPL'd

You can use a direct ByteBuffer. Direct memory uses virtual memory to start with is only allocated to the application when it is used. i.e. the amount of main memory it uses re-sizes automagically.
Create a direct ByteBuffer larger than you need and it will only consume what you use.

you can also write manual code for checking the buffer content continously and if its full then make a new buffer of greater size and shift all the data in that new buffer.

ByteBuffer recycling class

I'm wondering how I'd code up a ByteBuffer recycling class that can get me a ByteBuffer which is at least as big as the specified length, and which can lock up ByteBuffer objects in use to prevent their use while they are being used by my code. This would prevent re-construction of DirectByteBuffers and such over and over, instead using existing ones. Is there an existing Java library which can do this very effectively? I know Javolution can work with object recycling, but does that extend to the ByteBuffer class in this context with the requirements set out?

It would be more to the point to be more conservative in your usage patterns in the first place. For example there is lots of code out there that shows allocation of a new ByteBuffer on every OP_READ. This is insane. You only need two ByteBuffers at most per connection, one for input and one for output, and depending on what you're doing you can get away with exactly one. In extremely simple cases like an echo server you can get away with one BB for the entire application.
I would look into that rather than paper over the cracks with yet another layer of software.

This is just advice, not an answer. If you do implement some caching for DirectByteBuffer, then be sure to read about the GC implications, because the memory consumed by DirectByteBuffer is not tracked by the garbage collector.
Some references:
A thread - featuring Stack Overflow's tackline
A blog post on the same subject
And the followup

Typically, you would use combination of ThreadLocal and SoftReference wrapper. Former to simplify synchronization (eliminate need for it, essentially); and latter to make buffer recycleable if there's not enough memory (keeping in mind other comments wrt. GC issues with direct buffers). It's actually quite simple: check if SoftReference has buffer with big enough size; if not, allocate; if yes, clear reference. Once you are done with it, re-set reference to point to buffer.
Another question is whether ByteBuffer is needed, compared to regular byte[]. Many developers assume ByteBuffers are better performance-wise, but that assumption is not usually backed by actual data (i.e. testing to see if there is performance difference, and to what direction). Reason why byte[] may often be faster is that code accessing it can be simpler, easier for HotSpot to efficiently JIT.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.