I don't understand what the Buffer classes are for. Aren't they for buffering? I would expect that a buffer object allows both reading and writing simultaneously and independently. Nevertheless that is not so: a buffer allows only one position, a single one for reading and writing. This means that if I write something into the buffer with relative put(), then I can't read anything sensible with relative get(). Also, if I call put() and get() alternately, I will get gibberish.
So are there any usage patterns (samples) for buffers that would make it evident that buffers are somehow better than conventional arrays?
ByteBuffers are used for reading and writing data; you can get/put many primitive types and control the endianness. They can be a wrapper for direct memory (off heap) and memory-mapped files (also off heap).
They can be used for performance (as they can access a long or double natively without assembling it from individual bytes), direct byte buffers can read/write data without an additional copy into "Java" memory, and memory-mapped files can be extended to the size of your disk space, allowing you to use lots of memory without impacting your GC times.
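To address the original confusion: the single position is managed with flip(). You fill the buffer with put(), flip() it, then drain it with get(). A minimal sketch of this fill/flip/drain cycle:

```java
import java.nio.ByteBuffer;

public class BufferCycle {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);

        // Fill phase: relative put() advances the single position.
        buf.putInt(42);
        buf.putDouble(3.14);

        // flip() sets the limit to the current position and resets the
        // position to 0, switching the buffer from writing to reading.
        buf.flip();

        // Drain phase: relative get() now reads exactly what was written.
        int i = buf.getInt();       // 42
        double d = buf.getDouble(); // 3.14
        System.out.println(i + " " + d);

        // clear() resets position and limit so the buffer can be refilled.
        buf.clear();
    }
}
```

This is why put() and get() must not be interleaved arbitrarily: the buffer is designed around distinct fill and drain phases, not simultaneous independent reading and writing.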
Related
Let’s say I’ve mapped a memory region [0, 1000] and now I have a MappedByteBuffer.
Can I read and write to this buffer from multiple threads at the same time without locking, assuming that each thread accesses a different part of the buffer, e.g. T1 [0, 500), T2 [500, 1000]?
If the above is true, is it possible to determine whether it’s better to create one big buffer for multiple threads, or a smaller buffer for each thread?
Detailed Intro:
If you want to learn how to answer these questions yourself, check the implementation source code:
MappedByteBuffer: https://github.com/himnay/java7-sourcecode/blob/master/java/nio/MappedByteBuffer.java (notice it's still abstract, so you cannot instantiate it directly)
extends ByteBuffer: https://github.com/himnay/java7-sourcecode/blob/master/java/nio/ByteBuffer.java
extends Buffer: https://github.com/himnay/java7-sourcecode/blob/329bbb33cbe8620aee3cee533eec346b4b56facd/java/nio/Buffer.java (which only does index checks, and does not grant actual access to any buffer memory)
Now it gets a bit more complicated:
When you want to allocate a MappedByteBuffer, you will get either a
HeapByteBuffer: https://github.com/himnay/java7-sourcecode/blob/329bbb33cbe8620aee3cee533eec346b4b56facd/java/nio/HeapByteBuffer.java
or a DirectByteBuffer: https://github.com/himnay/java7-sourcecode/blob/329bbb33cbe8620aee3cee533eec346b4b56facd/java/nio/DirectByteBuffer.java
Instead of having to browse internet pages, you could also simply download the source code package for your Java version and attach it in your IDE, so you can see the code in development AND debug modes. A lot easier.
Short (incomplete) answer:
Neither of them is safe against multithreaded access.
So if you ever needed to resize the MappedByteBuffer, you might get stale or even invalid (ArrayIndexOutOfBoundsException) accesses.
If the size is constant, you can rely on either implementation to be "thread safe", as far as your requirements are concerned.
On a side note, a design flaw has also crept into the Java implementation:
MappedByteBuffer extends ByteBuffer
ByteBuffer has the heap byte[] called "hb"
DirectByteBuffer extends MappedByteBuffer extends ByteBuffer
So DirectByteBuffer still has ByteBuffer's byte[] hb field, but does not use it, and instead creates and manages its own buffer.
This design flaw comes from the step-by-step development of those classes (they were not all planned and implemented at the same time), AND from the topic of package visibility, resulting in an inversion of dependency/hierarchy in the implementation.
Now to the true answer:
If you want to do proper object-oriented programming, you should NOT share a resource unless absolutely necessary.
This ESPECIALLY means that each Thread should have its very own Buffer.
Advantage of having one global buffer: the only "advantage" is the reduced memory consumption from fewer object references. But this impact is SO MINIMAL (not even a 1:10000 change in your app's RAM consumption) that you will NEVER notice it. There are so many other objects allocated for any number of weird (Java) reasons everywhere that this is the least of your concerns. Plus, you would have to introduce additional data (index boundaries), which lessens the 'advantage' even more.
The big advantages of having separate buffers:
You will never have to take care of pointer/index arithmetic
especially when it comes to needing more threads at any given time
You can freely add new threads at any time without having to rearrange any data or do more pointer arithmetic
you can freely reallocate/resize each individual buffer when needed (without worrying about all the other threads' indexing requirements)
Debugging: you can locate problems that result from "writing out of boundaries" much more easily, because the offending thread would crash, instead of other threads having to deal with corrupted data
Java ALWAYS checks each array access (on normal heap arrays like byte[]) before it accesses it, exactly to prevent side effects
Think back: once upon a time, there was the big step in operating systems of introducing linear address spaces, so programs would NOT have to care about where in the hardware RAM they were loaded.
Your one-buffer design would be the exact step backwards.
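A minimal sketch of the separate-buffers approach (the file name and region size here are hypothetical): each thread maps its very own region and indexes it independently, using absolute puts so no position is shared:

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PerThreadBuffers {
    public static void main(String[] args) throws Exception {
        try (FileChannel ch = FileChannel.open(Path.of("data.bin"),
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            int threads = 2;
            long regionSize = 500;
            Thread[] workers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                // Each thread gets its own mapping over its own region,
                // so no index arithmetic is shared between threads.
                MappedByteBuffer region =
                        ch.map(FileChannel.MapMode.READ_WRITE, t * regionSize, regionSize);
                workers[t] = new Thread(() -> {
                    for (int i = 0; i < region.capacity(); i++) {
                        region.put(i, (byte) 1); // absolute put: no shared position
                    }
                });
                workers[t].start();
            }
            for (Thread w : workers) w.join();
        }
    }
}
```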
Conclusion:
If you want a really bad design choice - one which WILL make life a lot harder later on - you go with one global buffer.
If you want to do it the proper OO way: separate those buffers. No convoluted dependencies and no side-effect problems.
How does a buffer actually optimize the process of reading/writing?
Every time we read a byte, we access the file. I read that a buffer reduces the number of accesses to the file. The question is: how? In the Buffered section of the picture, when we load bytes from the file into the buffer, we access the file just like in the Unbuffered section, so where is the optimization?
I mean... the buffer must access the file every time it reads a byte, so even if the data in the buffer is read faster, this will not improve performance in the process of reading. What am I missing?
The fundamental misconception is to assume that a file is read byte by byte. Most storage devices, including hard drives and solid-state discs, organize the data in blocks. Likewise, network protocols transfer data in packets rather than single bytes.
This affects how the controller hardware and low-level software (drivers and operating system) work. Often, it is not even possible to transfer a single byte on this level. So, requesting the read of a single byte ends up reading one block and ignoring everything but one byte. Even worse, writing a single byte may imply reading an entire block, changing one byte of it, and writing the block back to the device. For network transfers, sending a packet with a payload of only one byte implies using 99% of the bandwidth for metadata rather than actual payload.
Note that sometimes, an immediate response is needed or a write is required to be definitely completed at some point, e.g. for safety. That’s why unbuffered I/O exists at all. But for most ordinary use cases, you want to transfer a sequence of bytes anyway and it should be transferred in chunks of a size suitable to the underlying hardware.
Note that even if the underlying system injects buffering on its own, or when the hardware truly transfers single bytes, performing 100 operating system calls to transfer a single byte each is still significantly slower than performing a single operating system call telling it to transfer 100 bytes at once.
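As an illustration (data.bin is a hypothetical file), compare the two styles: the first may issue one system call per byte, the second a single call for the whole chunk:

```java
import java.io.FileInputStream;
import java.io.IOException;

public class ReadComparison {
    public static void main(String[] args) throws IOException {
        // 100 single-byte reads: potentially 100 system calls.
        try (FileInputStream in = new FileInputStream("data.bin")) {
            for (int i = 0; i < 100; i++) {
                int b = in.read();      // one call per byte
                if (b < 0) break;
            }
        }

        // One bulk read: a single system call transferring 100 bytes.
        try (FileInputStream in = new FileInputStream("data.bin")) {
            byte[] chunk = new byte[100];
            int n = in.read(chunk);     // one call for the whole chunk
            System.out.println("read " + n + " bytes");
        }
    }
}
```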
But you should not consider the buffer to be something between the file and your program, as suggested in your picture. You should consider the buffer to be part of your program. Just like you would not consider a String object to be something between your program and a source of characters, but rather a natural way to process such items. E.g. when you use the bulk read method of InputStream (e.g. of a FileInputStream) with a sufficiently large target array, there is no need to wrap the input stream in a BufferedInputStream; it would not improve the performance. You should just stay away from the single byte read method as much as possible.
As another practical example, when you use an InputStreamReader, it will already read the bytes into a buffer (so no additional BufferedInputStream is needed) and the internally used CharsetDecoder will operate on that buffer, writing the resulting characters into a target char buffer. When you use, e.g. Scanner, the pattern matching operations will work on that target char buffer of a charset decoding operation (when the source is an InputStream or ByteChannel). Then, when delivering match results as strings, they will be created by another bulk copy operation from the char buffer. So processing data in chunks is already the norm, not the exception.
This has been incorporated into the NIO design. So, instead of supporting a single byte read method and fixing it by providing a buffering decorator, as the InputStream API does, NIO’s ByteChannel subtypes only offer methods using application managed buffers.
So we could say that buffering does not improve the performance; it is the natural way of transferring and processing data. Rather, not buffering degrades the performance by requiring a translation from the natural bulk data operations to single-item operations.
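A minimal sketch of that NIO style, reading a hypothetical data.bin through a FileChannel into an application-managed buffer:

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelRead {
    public static void main(String[] args) throws Exception {
        try (FileChannel ch = FileChannel.open(Path.of("data.bin"),
                StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(8192); // application-managed buffer
            while (ch.read(buf) != -1) { // no single-byte read method exists here
                buf.flip();
                while (buf.hasRemaining()) {
                    byte b = buf.get();  // process the chunk in memory
                }
                buf.clear();
            }
        }
    }
}
```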
As shown in your picture, buffered file contents are held in memory, whereas an unbuffered file is not read until it is streamed to the program.
A File is only a representation of a pathname. Here is from the File Javadoc:
An abstract representation of file and directory pathnames.
Meanwhile, a buffer like ByteBuffer takes content from the file (depending on the buffer type, direct or indirect) and allocates it in memory.
The buffers returned by this method typically have somewhat higher allocation and deallocation costs than non-direct buffers. The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious. It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations. In general it is best to allocate direct buffers only when they yield a measureable gain in program performance.
It actually depends on the conditions: if the file is accessed repeatedly, then buffered is a faster solution than unbuffered. But if the file is larger than main memory and it is accessed only once, unbuffered seems to be the better solution.
Basically, for reading: if you request 1 byte, the buffer will read 1000 bytes and return the first byte to you; for the next 999 single-byte reads it will not read anything from the file but serve them from its internal buffer in RAM. Only after you have read all 1000 bytes will it actually read the next 1000 bytes from the actual file.
The same applies to writing, but in reverse. If you write 1 byte, it will be buffered, and only once you have written 1000 bytes may they be written to the file.
Note that choosing the buffer size changes the performance quite a bit, see e.g. https://stackoverflow.com/a/237495/2442804 for further details, respecting file system block size, available RAM, etc.
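A minimal sketch of this behavior (data.bin and the 1000-byte size are hypothetical); BufferedInputStream lets you pass the internal buffer size explicitly:

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class BufferedReads {
    public static void main(String[] args) throws IOException {
        try (BufferedInputStream in = new BufferedInputStream(
                new FileInputStream("data.bin"), 1000)) { // 1000-byte internal buffer
            // The first read() fills the internal buffer from the file;
            // the next 999 single-byte reads are served from RAM.
            for (int i = 0; i < 1000; i++) {
                int b = in.read();
                if (b < 0) break;
            }
        }
    }
}
```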
When I use glMapBuffer I get a float (byte) buffer as the return type, which I can use to modify the data on the server side.
But are there any performance advantages to doing so?
I have an example
Approach 1:
I create a float buffer with vertex data and pass it to glBufferData directly.
Approach 2:
I allocate space using glBufferData and pass no data...
I get the reference to a float buffer...
I write the float values to it to... and I unmap the buffer.
What are the pros and cons of the two approaches?
Am I doing the same thing in both?
I think the second approach avoids duplicate buffers.
There are two very related aspects to this:
Reducing memory usage.
Avoiding unnecessary copying of data, which can hurt performance.
Calling glBufferData() with data being passed in involves the following:
You allocate buffer memory to store your data.
You store your data in this buffer you allocated.
When you call glBufferData(), the OpenGL implementation allocates memory for the data.
The OpenGL implementation copies the data from your buffer into its own allocation.
Compare this with what happens when you do the same thing with buffer mapping:
When you call glBufferData(), the OpenGL implementation allocates memory for the data.
When you call glMapBuffer(), the OpenGL implementation returns a pointer to its memory.
You store your data in this memory.
You unmap the buffer.
If you compare the two sequences, you have an extra memory allocation in the first one, which means that it requires about twice the memory in total. And the OpenGL implementation has to copy the buffer data in the first one, which is not the case in the second.
In reality, things can get a bit more complicated. Particularly on systems that have dedicated graphics memory (VRAM), there might be more copies of the data. But the principle remains, you reduce extra memory allocations and copying.
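A sketch of both approaches, assuming the LWJGL bindings, a current OpenGL context, and a buffer object already bound to GL_ARRAY_BUFFER:

```java
import org.lwjgl.BufferUtils;
import org.lwjgl.opengl.GL15;

import java.nio.ByteBuffer;
import java.nio.FloatBuffer;

public class BufferUpload {
    // Approach 1: client-side buffer, extra allocation plus a copy by the driver.
    static void uploadWithBufferData(float[] vertices) {
        FloatBuffer data = BufferUtils.createFloatBuffer(vertices.length);
        data.put(vertices).flip();
        GL15.glBufferData(GL15.GL_ARRAY_BUFFER, data, GL15.GL_STATIC_DRAW);
    }

    // Approach 2: allocate storage only, then write straight into GL's memory.
    static void uploadWithMapBuffer(float[] vertices) {
        GL15.glBufferData(GL15.GL_ARRAY_BUFFER, vertices.length * 4L, GL15.GL_STATIC_DRAW);
        ByteBuffer mapped = GL15.glMapBuffer(GL15.GL_ARRAY_BUFFER, GL15.GL_WRITE_ONLY);
        mapped.asFloatBuffer().put(vertices); // no intermediate client-side copy
        GL15.glUnmapBuffer(GL15.GL_ARRAY_BUFFER);
    }
}
```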
Another aspect to keep in mind is what happens beyond the initial use of the buffer, if you want to modify the content of the buffer after it was already used. Again, glMapBuffer() will generally reduce the amount of extra data copying, but it might come at the price of undesired synchronization. So it could be more efficient to pay the price for an extra copy needed for glBufferData() or glBufferSubData() to avoid synchronization points.
If you have these more complex cases where you frequently modify buffer data, you really need to start benchmarking, and you have to expect differences between vendors. You can also look into schemes where you use buffer mapping, but use a pool of buffers you cycle through instead of a single buffer, to reduce/avoid the performance penalty from synchronization.
On top of this, if you work on devices where power/thermal considerations come into play, you may want to measure power usage in addition to just execution speed. Because the fastest solution might not necessarily be the most power efficient.
What differences are there (if any) between the regular java memory model for memory in the heap vs an mmap'd file accessed through a direct byte buffer?
E.g. if I have multiple threads writing to the byte buffer, is there any special synchronization necessary to ensure that a reader thread will see all the changes?
No difference. And yes, you have to establish synchronizes-with edges between the writers and the reader to ensure the data written to the buffer is visible.
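A minimal sketch of one way to establish such an edge, using a volatile flag to publish the writes (absolute put/get, so no buffer position is shared):

```java
import java.nio.ByteBuffer;

public class BufferPublication {
    private final ByteBuffer buf = ByteBuffer.allocateDirect(1024);
    private volatile boolean ready = false; // the synchronizes-with edge

    void writer() {
        buf.putLong(0, 42L); // plain store into the direct/mapped memory
        ready = true;        // volatile write publishes everything before it
    }

    void reader() {
        if (ready) {                  // volatile read pairs with the write
            long v = buf.getLong(0);  // guaranteed to see 42
            System.out.println(v);
        }
    }
}
```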
Can someone with the natural gift of explaining complex things in an easy and straightforward way address this question? To achieve the best performance, when should I use direct ByteBuffers versus regular ByteBuffers when doing network I/O with Java NIO?
For example: Should I read into a heap buffer and parse it from there, doing many get() (byte by byte) OR should I read it into a direct buffer and parse from the direct buffer?
To achieve the best performance, when should I use direct ByteBuffers versus regular ByteBuffers when doing network I/O with Java NIO?
Direct buffers have a number of advantages:
They avoid an extra copy of data passed between Java and native memory.
If they are re-used, only the pages actually used turn into real memory. This means you can make them much larger than they need to be, and they only waste virtual memory.
You can access multi-byte primitives in native byte order efficiently. (Basically one machine code instruction)
Should I read into a heap buffer and parse it from there, doing many get() (byte by byte) OR should I read it into a direct buffer and parse from the direct buffer?
If you are reading a byte at a time, you may not get much advantage. However, with a direct byte buffer you can read 2 or 4 bytes at a time and effectively parse multiple bytes at once.
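A sketch of what that might look like, assuming a hypothetical record format of an int id followed by a float, read from a socket into a native-order direct buffer:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.SocketChannel;

public class DirectParse {
    // Read into a native-order direct buffer, then pull whole
    // primitives instead of assembling single bytes.
    static void readAndParse(SocketChannel channel) throws Exception {
        ByteBuffer buf = ByteBuffer.allocateDirect(4096)
                                   .order(ByteOrder.nativeOrder());
        channel.read(buf);
        buf.flip();
        while (buf.remaining() >= 8) {   // one record = 4 + 4 bytes
            int id = buf.getInt();       // 4 bytes in one step
            float price = buf.getFloat();
            // ... handle (id, price)
        }
        buf.compact(); // keep any partial record for the next read
    }
}
```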
If you are parsing real-time data, I would avoid using selectors. I have found that blocking NIO or busy-waiting NIO can give you the lowest-latency performance (assuming you have a relatively small number of connections, e.g. up to 20).
A direct buffer is best when you are just copying the data, say from a socket to a file or vice versa, as the data doesn't have to traverse the JNI/Java boundary, it just stays in JNI land. If you are planning to look at the data yourself there's no point in a direct buffer.
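A minimal sketch of such a copy loop (a hypothetical socket-to-file transfer), where the bytes never enter a heap byte[]:

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

public class CopyThrough {
    // The bytes go channel -> direct buffer -> channel without ever
    // being copied into a Java heap array.
    static void copy(SocketChannel in, FileChannel out) throws Exception {
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        while (in.read(buf) != -1) {
            buf.flip();
            while (buf.hasRemaining()) {
                out.write(buf);
            }
            buf.clear();
        }
    }
}
```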