I have an Android application that reads many chunks of bytes one by one over the network, then combines them into a large buffer. For example:
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int i = 0;
while (i < 10) {
    // read() is an API in a lib that returns byte[].
    byte[] bytes = API.read();
    outputStream.write(bytes);
    i++;
}
...
The question is about the memory used for the bytes. Is there a way to force Java to use the same chunk of memory for all the reads, so it does not have to free and allocate memory so often? Will the Java runtime optimize this case? Thanks.
The byte[] will be garbage collected. It is not appropriate to use an NIO ByteBuffer in this case as you are getting byte[] anyway, though it could come in handy later.
With each loop iteration, a byte[] is created, filled with data, written into the stream, and then no longer referenced. Once memory runs low (or earlier, depending on how your JVM operates), the array will be garbage collected and the memory made available again.
You need not worry about such things most of the time (unless you are concatenating tons of strings, which is extremely inefficient for this reason).
What are intended use cases for the BitmapFactory.Options.inTempStorage option?
Documentation is pretty terse on this:
Temp storage to use for decoding. Suggest 16K or so.
If I'm not mistaken it means that if you don't provide the buffer explicitly, it would create and use one by itself.
So the only benefit I see is reusing the same 16K buffer for multiple decodings, which seems to have a rather questionable impact on performance/memory usage.
So why do the SDK authors give us control over the temp storage for decoding? Would providing a much larger buffer improve decoding performance?
Can someone expand on this?
It seems that your assumption is the correct one - this option is mainly for recycling the buffer itself.
From the Android Source Code:
// pass some temp storage down to the native code. 1024 is made up,
// but should be large enough to avoid too many small calls back
// into is.read(...) This number is not related to the value passed
// to mark(...) above.
byte [] tempStorage = null;
if (opts != null) tempStorage = opts.inTempStorage;
if (tempStorage == null) tempStorage = new byte[16 * 1024];
This means that if you do not pass in this buffer, it will be allocated for you. While it does not look like an optimization for most cases, if you load many small images, allocating a 16K buffer per image might be costly.
Regarding the buffer size, as you can see from the comments in the code, there is no magic number. The native code that decodes the image uses the managed InputStream to fetch the actual raw bytes (from disk/network etc.), and the allocated buffer is used to pass the bytes across for each read call. So it really depends on the InputStream. For example, a disk InputStream might read from disk in blocks of 4K, in which case 16K is more than enough; passing in a bigger buffer will not improve performance, since the buffer will not fill up with more than 4K per read call.
In any case, this kind of optimization should be considered only for really specific cases; if you have such a case, you can provide a bigger buffer and see whether it has any effect on performance.
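For illustration, a minimal sketch of reusing a single temp buffer across many small decodes might look like the following (the helper class and the file-path parameter are assumptions, not part of the question):

import android.graphics.Bitmap;
import android.graphics.BitmapFactory;

// Hypothetical helper: decodes many small images while reusing one scratch buffer
// instead of letting BitmapFactory allocate a fresh 16K array per decode.
public final class ThumbnailDecoder {
    private final byte[] tempStorage = new byte[16 * 1024];

    public Bitmap decode(String path) {
        BitmapFactory.Options opts = new BitmapFactory.Options();
        opts.inTempStorage = tempStorage; // reuse the same temp buffer for every decode
        return BitmapFactory.decodeFile(path, opts);
    }
}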
I'm trying to parse a ByteBuf that is not in the JVM heap into a Google Protocol Buffers object. In fact, it's the direct-memory byte buffer that Netty passed to me.
This is what I am currently doing:
ByteBuf buf = ...;
ByteBufInputStream stream = new ByteBufInputStream(buf);
Message msg = MyPbMessage.getDefaultInstance().getParserForType().parseFrom(stream);
This works. However, I found that this type of parsing introduces a new byte array per message and causes a lot of GC.
So is there a way to avoid creating these on-heap byte arrays, i.e. to parse the Google Protocol Buffers bytes directly from native memory?
You can do it the way the Guava guys do: store a small buffer (1024 bytes) in a ThreadLocal, use it if it suffices, and never put a bigger buffer in the TL.
This will work fine as long as most requests can be served by it. If the average/median size is too big, you could go for a soft/weak reference; however, without some real testing it's hard to tell whether it helps.
You could combine the approaches, i.e., use a strongly referenced small buffer and a weakly referenced big buffer in the TL. You could pool your buffers, you could...
... but note that it all has its dark side. Wasted memory, prolonged buffer lifetime leading to promoting them to the old generations, where garbage collecting is much more expensive.
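As a minimal sketch of the ThreadLocal idea, under the assumption that the message type is the MyPbMessage from the question and that 1024 bytes covers most messages:

import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.Message;
import io.netty.buffer.ByteBuf;

public final class PbDecoder {
    // Small, strongly referenced per-thread scratch buffer; bigger buffers are never cached.
    private static final ThreadLocal<byte[]> SCRATCH =
            ThreadLocal.withInitial(() -> new byte[1024]);

    public static Message parse(ByteBuf buf) throws InvalidProtocolBufferException {
        int len = buf.readableBytes();
        byte[] scratch = SCRATCH.get();
        // Use the cached array if it is big enough, otherwise allocate a one-off array.
        byte[] bytes = (len <= scratch.length) ? scratch : new byte[len];
        buf.readBytes(bytes, 0, len);
        return MyPbMessage.getDefaultInstance()
                .getParserForType()
                .parseFrom(bytes, 0, len);
    }
}

Whether this actually reduces GC pressure depends on how many messages fit in the small buffer, so it is worth measuring before committing to it.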
A few days ago I was given a solution for checking collision between two bitmaps that have the ALPHA_8 config. But upon using it I noticed my app started lagging oddly, and when I checked the logs I saw the garbage collector running every millisecond.
I tried removing a few lines, and found out that what was driving the garbage collector crazy were these lines:
byte[] pixelData = getPixels(bitmap1);
byte[] pixelData2 = getPixels(bitmap2);
which called this function:
public byte[] getPixels(Bitmap bmp) {
    int bytes = bmp.getRowBytes() * bmp.getHeight();
    ByteBuffer buffer = ByteBuffer.allocate(bytes);
    bmp.copyPixelsToBuffer(buffer);
    return buffer.array();
}
Why? What can I do to make it stop?
You are allocating large contiguous blocks of memory (i.e. a byte[]). Depending on how large your images are, this could be accounting for a significant amount of your available heap.
If you are going to be doing a lot of these type of operations, it may be worth considering pooling byte[] instances of fixed sizes to be reused.
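A rough sketch of such a pool, assuming all your bitmaps have the same byte size (the class and sizes are made up for illustration):

import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical fixed-size byte[] pool: arrays are borrowed, used, and returned
// instead of being allocated and garbage collected on every collision check.
public final class BytePool {
    private final int bufferSize;
    private final Deque<byte[]> free = new ArrayDeque<>();

    public BytePool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    public synchronized byte[] acquire() {
        byte[] buf = free.pollFirst();
        return (buf != null) ? buf : new byte[bufferSize];
    }

    public synchronized void release(byte[] buf) {
        if (buf.length == bufferSize) {
            free.addFirst(buf);
        }
    }
}

A getPixels variant could then acquire() a buffer sized for the bitmap, fill it via bmp.copyPixelsToBuffer(ByteBuffer.wrap(buf)), and release() it once the collision check is done.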
I am writing the bytes of an image to a ByteArrayOutputStream and then sending it over a socket.
The problem is, when I do
ImageIO.write(image, "gif", byteArray);
Memory usage goes up drastically, almost like a memory leak.
I send it using this:
ImageIO.write(image, "gif", byteArrayO);
byte [] byteArray = byteArrayO.toByteArray();
byteArrayO.flush();
byteArrayO.reset();
Connection.pw.println("" + byteArray.length);
int old = Connection.client.getSendBufferSize();
Connection.client.setSendBufferSize(byteArray.length);
Connection.client.getOutputStream().write(byteArray, 0, byteArray.length);
Connection.client.getOutputStream().flush();
image.flush();
image = null;
byteArrayO = null;
byteArray = null;
System.gc();
Connection.client.setSendBufferSize(old);
As you can see, I have tried everything I can think of. The error comes when I write to the ByteArrayOutputStream, not when I transfer it. The receiver does not get any errors.
Is there any way I can clear the byteArray and remove everything it holds from memory? I know reset() should, but it doesn't here. I want to dispose of the ByteArrayOutputStream directly when this is done.
@Christoffer Hammarström probably has the best solution, but I'll add this to try to explain the memory usage.
These 2 lines are creating 3 copies of your image data:
ImageIO.write(image, "gif", byteArrayO);
byte [] byteArray = byteArrayO.toByteArray();
After executing this you have one copy of the data stored in image, one copy in the ByteArrayOutputStream, and another copy in the byte array (toByteArray() does not return the internal buffer; it creates a copy).
Calling reset() does not release the memory inside the ByteArrayOutputStream, it just resets the position counter back to 0. The data is still there.
To allow the memory to be released earlier, you could assign each item to null as soon as you have finished with it. This will allow the memory to be collected by the garbage collector if it decides to run earlier, e.g.:
ImageIO.write(image, "gif", byteArrayO);
image = null;
byte [] byteArray = byteArrayO.toByteArray();
byteArrayO = null;
...
Why do you have to fiddle with the send buffer size? What kind of protocol are you using on top of this socket? It should be just as simple as:
ImageIO.write(image, "gif", Connection.client.getOutputStream());
If you have to use a ByteArrayOutputStream, at least use
byteArrayO.writeTo(Connection.client.getOutputStream())
so you don't make an extra redundant byte[].
This is not quite the answer you want, but something you might wish to consider.
Why not create a pool of byte arrays and reuse them every time you need to? This will be a little more efficient, as you won't be creating new arrays and throwing them away all the time. Less GC is always a good thing. You will also be able to guarantee that the application always has enough memory to operate.
You can request that the VM run garbage collection through System.gc(), but this is NOT guaranteed to actually happen. The virtual machine performs garbage collection when it decides it is necessary or at an appropriate time.
What you are describing is pretty normal. It has to put the bytes of the image you are creating somewhere.
Instead of memory you can use a FileOutputStream to write the bytes to. You then create a FileInputStream to read from the file you wrote, and a loop that reads bytes into a byte array buffer of, say, 64 KB and writes those bytes to the connection's output stream.
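A minimal sketch of that approach, reusing the image and Connection.client from the question (the temp-file handling is an assumption):

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.imageio.ImageIO;

// Write the image to a temp file, then stream it to the socket in 64 KB chunks
// so only one small buffer lives on the heap at a time.
File tmp = File.createTempFile("frame", ".gif");
ImageIO.write(image, "gif", tmp);
try (InputStream in = new FileInputStream(tmp)) {
    OutputStream out = Connection.client.getOutputStream();
    byte[] buffer = new byte[64 * 1024];
    int read;
    while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
    }
    out.flush();
} finally {
    tmp.delete();
}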
You mention an error. If you are getting an error, what is it?
If you use the client JVM (-client argument to java) then the memory might be given back to the OS and the Java process will shrink again. I'm not sure about this.
If you don't like how much memory JAI is using you can try using Sanselan: http://commons.apache.org/imaging/
In Java, I have a method
public int getNextFrame( byte[] buff )
that reads from a file into the buffer and returns the number of bytes read. I am reading from an .MJPEG file that has a 5-byte value, say "07939", followed by that many bytes for the JPEG.
The problem is that the JPEG byte size could overflow the buffer. I cannot seem to find a neat solution for the allocation. My goal is to not create a new buffer for every image. I tried a direct ByteBuffer so I could use its array() method to get direct access to the underlying buffer. The ByteBuffer does not expand dynamically.
Should I be returning a reference to the parameter? Like:
public ByteBuffer getNextFrame( ByteBuffer ref )
How do I find out how many bytes were read? Thanks.
java.io.ByteArrayOutputStream is a wrapper around a byte-array and enlarges it as needed. Perhaps this is something you could use.
Edit:
To reuse it, just call reset() and start over...
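A minimal sketch of that reuse for the frame-reading case (the stream parameter and chunk size are assumptions):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical frame reader: one ByteArrayOutputStream lives for the whole session,
// and reset() rewinds it so its internal array is reused for every frame.
class FrameBuffer {
    private final ByteArrayOutputStream frame = new ByteArrayOutputStream(64 * 1024);
    private final byte[] chunk = new byte[8 * 1024];

    // Reads up to frameSize bytes into the reused buffer and returns how many were read.
    int readFrame(InputStream in, int frameSize) throws IOException {
        frame.reset(); // rewind; the internal array is kept and grows only if needed
        int remaining = frameSize;
        while (remaining > 0) {
            int n = in.read(chunk, 0, Math.min(chunk.length, remaining));
            if (n == -1) break;
            frame.write(chunk, 0, n);
            remaining -= n;
        }
        return frame.size();
    }
}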
Just read the required number of bytes. Do not use read(buffer); use read(buffer, 0, size). If there are more bytes than fit, just discard them; the JPEG is broken anyway.
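For example, a minimal sketch of reading the 5-byte length prefix and then only as many bytes as the caller's buffer can hold (taking the stream as a parameter is an assumption; the question keeps it elsewhere):

import java.io.DataInputStream;
import java.io.IOException;

// Reads one frame: a 5-character ASCII length, then that many JPEG bytes.
// Bytes that would overflow buff are skipped instead of forcing a reallocation.
public int getNextFrame(DataInputStream in, byte[] buff) throws IOException {
    byte[] header = new byte[5];
    in.readFully(header);
    int size = Integer.parseInt(new String(header, "US-ASCII"));

    int toRead = Math.min(size, buff.length);
    in.readFully(buff, 0, toRead);   // equivalent to looping read(buffer, 0, size)
    in.skipBytes(size - toRead);     // discard anything that does not fit
    return toRead;
}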
EDIT:
Allocating a byte[] is so much faster than reading from a file or a socket that I would be surprised if it made much difference, unless you have a system where microseconds cost money.
The time it takes to read a 64 KB file is about 10 ms (unless the file is in memory).
The time it takes to allocate a 64 KB byte[] is about 0.001 ms, possibly faster.
You can use Apache IO's IOBuffer; however, this expands very expensively.
You can also use a ByteBuffer; its position() will tell you how much data was read.
If you don't know how big the buffer will be and you have a 64-bit JVM, you can create a large direct buffer. This will only consume memory (page by page) as it is used. The upshot is that you can allocate 1 GB but might only ever use 4 KB if that is all you need. A direct buffer doesn't support array(), however; you would have to read from the ByteBuffer using its other methods.
Another solution is to use an AtomicReference<byte[]>: the called method can increase the size as required, but if the existing array is large enough, it reuses the previous buffer.
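A minimal sketch of that grow-or-reuse idea (the stream parameter and the 5-byte header parsing are carried over from the question; the method shape is an assumption):

import java.io.DataInputStream;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

// The caller keeps the AtomicReference across calls; the array is replaced only
// when the next frame does not fit, otherwise the previous allocation is reused.
public int getNextFrame(DataInputStream in, AtomicReference<byte[]> bufRef) throws IOException {
    byte[] header = new byte[5];
    in.readFully(header);
    int size = Integer.parseInt(new String(header, "US-ASCII"));

    byte[] buf = bufRef.get();
    if (buf == null || buf.length < size) {
        buf = new byte[size];
        bufRef.set(buf);
    }
    in.readFully(buf, 0, size);
    return size;
}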
The usual way of accomplishing this in a high-level API is either to let the user provide an OutputStream that you fill with your data (it can be a ByteArrayOutputStream or something completely different), or to return an InputStream that the user can read to get the data (which dynamically loads the correct parts from the file and stops when finished).
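In terms of signatures, the two shapes might look like this (the interface and method names are purely illustrative):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical frame source showing both API shapes described above.
public interface FrameSource {
    // Caller supplies the sink; the implementation writes one frame into it.
    void writeNextFrame(OutputStream out) throws IOException;

    // Caller reads the frame lazily; the stream ends when the frame does.
    InputStream nextFrameStream() throws IOException;
}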