ByteBuffer = garbage collector spam - Java

A few days ago I was given a solution for checking collision between two bitmaps that have the ALPHA_8 config. But upon using it I noticed my app started lagging oddly, and when I checked the logs I saw the garbage collector spamming every millisecond.
I tried removing a few lines, and found out that what was sending the garbage collector into overdrive were these lines:
byte[] pixelData = getPixels(bitmap1);
byte[] pixelData2 = getPixels(bitmap2);
which called this function:
public byte[] getPixels(Bitmap bmp) {
    int bytes = bmp.getRowBytes() * bmp.getHeight();
    ByteBuffer buffer = ByteBuffer.allocate(bytes);
    bmp.copyPixelsToBuffer(buffer);
    return buffer.array();
}
Why? What can I do to make it stop?

You are allocating large contiguous blocks of memory (i.e. a byte[]). Depending on how large your images are, this could be accounting for a significant amount of your available heap.
If you are going to be doing a lot of this type of operation, it may be worth pooling byte[] instances of fixed sizes so they can be reused.
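A minimal sketch of such a pool, assuming both bitmaps share the same dimensions so one fixed buffer size works (the class and method names are made up for illustration):

import java.util.ArrayDeque;

// Hypothetical fixed-size byte[] pool: acquire() hands out a cached array
// (allocating one only when the pool is empty) and release() returns it
// for reuse, so short-lived pixel buffers stop flooding the GC.
public class PixelBufferPool {
    private final int bufferSize;
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();

    public PixelBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    public synchronized byte[] acquire() {
        byte[] buf = free.poll();
        return (buf != null) ? buf : new byte[bufferSize];
    }

    public synchronized void release(byte[] buf) {
        if (buf.length == bufferSize) {
            free.push(buf);
        }
    }
}

getPixels could then fill a pooled array via bmp.copyPixelsToBuffer(ByteBuffer.wrap(pooledArray)) and return the array to the pool once the collision check is done, instead of allocating a fresh buffer on every call.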

Related

Why use BitmapFactory.Options.inTempStorage?

What are the intended use cases for the BitmapFactory.Options.inTempStorage option?
Documentation is pretty terse on this:
Temp storage to use for decoding. Suggest 16K or so.
If I'm not mistaken it means that if you don't provide the buffer explicitly, it would create and use one by itself.
So the only benefit I see is reusing the same 16K buffer for multiple decodings, which seems to have a rather questionable impact on performance/memory usage.
So why do the SDK authors give us control over the temp storage for decoding? Would providing a much larger buffer improve decoding performance?
Can someone expand on this?
It seems that your assumption is the correct one - this option is mainly for recycling the buffer itself.
From the Android Source Code:
// pass some temp storage down to the native code. 1024 is made up,
// but should be large enough to avoid too many small calls back
// into is.read(...) This number is not related to the value passed
// to mark(...) above.
byte [] tempStorage = null;
if (opts != null) tempStorage = opts.inTempStorage;
if (tempStorage == null) tempStorage = new byte[16 * 1024];
This means that if you do not pass this buffer in, it will be allocated. While this does not look like an optimization for most cases, if you load many small images the allocation of a 16K buffer per image might be pricey.
Regarding the buffer size, as you can see from the comments in the code, there is no magic number. What happens is that the native code that decodes the image uses the managed InputStream code to fetch the actual raw bytes (from disk/network etc.), and it uses the allocated buffer to pass the bytes along on each read call. So it really depends on the InputStream. For example, a disk InputStream might read from the disk in 4K chunks, and then 16K is more than enough - passing in a bigger buffer will not improve performance, since the buffer will not fill up with more than 4K on each read call.
In any case, this kind of optimization should be considered only for really specific cases - if you have such a case, you can provide a bigger buffer and see if it has any effect on performance.
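For illustration, a minimal sketch of reusing one temp buffer across many small decodes, assuming the images are decoded from an InputStream (the class name is made up; the 16K size follows the default discussed above):

import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import java.io.InputStream;

// Hypothetical decoder that shares a single 16K temp buffer across all
// decodes, instead of letting BitmapFactory allocate a new one per image.
public final class ReusableTempStorageDecoder {
    private final BitmapFactory.Options opts = new BitmapFactory.Options();

    public ReusableTempStorageDecoder() {
        opts.inTempStorage = new byte[16 * 1024];
    }

    public Bitmap decode(InputStream in) {
        // decodeStream pulls raw bytes from 'in' through the shared temp buffer
        return BitmapFactory.decodeStream(in, null, opts);
    }
}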

RAM rapidly increases while running this Java program

The program receives image data in bytes from an IP camera and then processes the image. When the program starts it uses 470 MB of RAM, and every second it grows by up to 15 MB; this continues until there is not enough space left and the computer hangs.
The method getImage() is called every 100 ms.
I have done some experiments that I am going to share here. The original code is like this (the buffer is created only once and can be reused afterwards):
private static final int WIDTH = 640;
private static final int HEIGHT = 480;
private byte[] sJpegPicBuffer = new byte[WIDTH * HEIGHT];

private Mat readImage() throws Exception {
    boolean isGetSuccess = camera.getImage(lUserID, sJpegPicBuffer, WIDTH * HEIGHT);
    if (isGetSuccess) {
        return Imgcodecs.imdecode(new MatOfByte(sJpegPicBuffer), Imgcodecs.CV_LOAD_IMAGE_UNCHANGED);
    }
    return null;
}
With the above code, RAM usage climbs until the computer hangs (99%, 10 GB). Then I changed the code like this (a new buffer is created on every call):
private static final int WIDTH = 640;
private static final int HEIGHT = 480;

private Mat readImage() throws Exception {
    byte[] sJpegPicBuffer = new byte[WIDTH * HEIGHT];
    boolean isGetSuccess = camera.getImage(lUserID, sJpegPicBuffer, WIDTH * HEIGHT);
    if (isGetSuccess) {
        return Imgcodecs.imdecode(new MatOfByte(sJpegPicBuffer), Imgcodecs.CV_LOAD_IMAGE_UNCHANGED);
    }
    return null;
}
With this code the RAM usage goes up to about 43% (5 GB) and is then freed.
Now the question: the first block of code seems to be the optimized one, since the buffer can be reused and no new memory has to be allocated on every call, yet the result is not what we want. Why?
The second block of code seems less optimized than the first, yet it works better.
In general, why does the RAM grow up to 10 GB in the first case and only 5 GB in the second? How can we control this situation?
This is speculation, though I've seen a similar scenario in real life a few times.
Your Java code is interacting with a native camera SDK (a DLL). Native code tends to allocate buffers in non-JVM memory and use some internal Java objects to access those buffers. A common (and very poor) practice is to rely on a Java object's finalizer to deallocate the native buffer once it is no longer used.
Finalizers rely on the garbage collector to trigger them, and that is the reason this pattern often fails. Although a finalizer is guaranteed to run eventually, in practice it will not happen as long as there is enough space in the Java heap, so native memory will not be deallocated in a timely fashion.
The Java heap size has a hard limit, but the native memory pool used by C/C++ can grow as long as the OS allows it to.
Concerning your problem
I assume that in your first snippet Java heap traffic is low. The GC is idle and no finalizers are executed, so memory allocated outside the Java heap keeps growing.
In the second snippet you are creating pressure on the Java heap, forcing the GC to run frequently. As a side effect of GC, finalizers are executed and native memory is released.
Instead of finalizers and buffers allocated in native code, your camera SDK may rely on Java direct memory buffers (this memory is directly accessible to C code, so it is convenient for passing data across the JVM boundary). The effect would be mostly the same, though, because the Java direct buffer implementation uses the same pattern (with phantom references instead of finalizers).
Suggestions
The -XX:+PrintGCDetails and -XX:+PrintReferenceGC options print information about reference processing, so you can verify whether finalizers/phantom references are indeed being used.
Look at your camera SDK's docs to see whether it is possible to release native resources early via the API.
The -XX:MaxDirectMemorySize=X option can be used to cap direct buffer usage, if your camera's SDK relies on direct buffers. It is not a solution, but a safety net that lets your application OOM before the OS memory is exhausted.
Force a GC every few frames (e.g. with System.gc()). This is another poor option, as the behavior of System.gc() is JVM dependent; a rough sketch follows.
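As an illustration of that last (admittedly crude) suggestion, assuming readImage() is the method from the question and the frame interval is just a guess to be tuned:

// Hypothetical wrapper around readImage() from the question: every N frames
// it nudges the collector so finalizers / phantom references guarding the
// native camera buffers get a chance to run. Crude, and JVM dependent.
private static final int GC_EVERY_N_FRAMES = 50; // arbitrary; tune empirically
private int frameCounter = 0;

private Mat readFrame() throws Exception {
    Mat frame = readImage();
    if (++frameCounter % GC_EVERY_N_FRAMES == 0) {
        System.gc();
    }
    return frame;
}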
PS
This is my post about resource management with finalizers and phantom references.

How to parse Google Protocol Buffers that are in direct memory without allocating a heap byte array in Java?

I'm trying to parse a ByteBuf that is not in the JVM heap into a Google Protocol Buffer object. In fact it's the direct memory byte buffer which Netty passed to me.
This is what I am currently doing:
ByteBuf buf = ...;
ByteBufInputStream stream = new ByteBufInputStream(buf);
Message msg = MyPbMessage.getDefaultInstance().getParserForType().parseFrom(stream);
This works. However, I found that this type of parsing introduces a new byte array per message and causes a lot of GC.
So is there a way to avoid creating these on-heap byte arrays, i.e. to parse the Protocol Buffer bytes directly from native memory?
You can do it the way the Guava guys do: store a small buffer (1024 bytes) in a ThreadLocal, use it if it suffices, and never put a bigger buffer into the TL.
This will work fine as long as most requests can be served by it. If the average/median size is too big, you could go for a soft/weak reference; however, without some real testing it's hard to tell whether it helps.
You could combine the approaches, i.e. use a strongly referenced small buffer and a weakly referenced big buffer in the TL. You could pool your buffers, you could...
... but note that it all has its dark side: wasted memory, and prolonged buffer lifetimes leading to promotion into the old generation, where garbage collection is much more expensive.
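A minimal sketch of the ThreadLocal variant described above, reusing MyPbMessage from the question (the 1024-byte threshold mirrors the Guava example, and copying into the scratch array is just one way to bridge the direct ByteBuf to the parser):

import com.google.protobuf.Message;
import io.netty.buffer.ByteBuf;

// Small messages are copied into a per-thread scratch array that is reused
// indefinitely; larger messages fall back to a one-off allocation that is
// never cached, so the ThreadLocal never holds onto a big buffer.
private static final int SMALL = 1024;
private static final ThreadLocal<byte[]> SCRATCH =
        ThreadLocal.withInitial(() -> new byte[SMALL]);

static Message parse(ByteBuf buf) throws Exception {
    int len = buf.readableBytes();
    byte[] arr = (len <= SMALL) ? SCRATCH.get() : new byte[len];
    buf.readBytes(arr, 0, len);
    return MyPbMessage.getDefaultInstance()
                      .getParserForType()
                      .parseFrom(arr, 0, len);
}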

How does Java (Android) reuse freed memory?

I have an Android application that reads a lot of chunks of bytes one by one over the network, then combines them into a large buffer. For example:
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int i = 0;
while (i < 10) {
    // read is an API in a lib that returns byte[].
    byte[] bytes = API.read();
    outputStream.write(bytes);
    i++;
}
...
The question is about the memory for bytes. Is there a way to force Java to use the same chunk of memory for all reads, so that it does not have to free and allocate memory so often? Will the Java runtime optimize this case? Thanks.
The byte[] will be garbage collected. It is not appropriate to use an NIO ByteBuffer in this case as you are getting byte[] anyway, though it could come in handy later.
With each loop iteration, a byte[] is created, filled with data, written into the stream, and then no longer used. Once memory runs low (or earlier, depending on how your JVM operates) the array will be collected and the memory made available again.
You need not worry about such things most of the time (unless you are concatenating tons of strings, which is extremely inefficient for this reason).
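If the rough total size is known up front, one small related tweak (an assumption about your workload, not something the loop above requires) is to pre-size the ByteArrayOutputStream so its internal array does not repeatedly grow and copy; the chunk count and size below are placeholders:

import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Assumed: roughly 10 chunks of ~8 KB each; adjust to the real chunk sizes.
static byte[] combineChunks() throws IOException {
    int estimatedTotal = 10 * 8 * 1024;
    // Pre-sizing avoids repeated grow-and-copy of the internal array.
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream(estimatedTotal);
    for (int i = 0; i < 10; i++) {
        byte[] bytes = API.read(); // same library call as in the question
        outputStream.write(bytes);
    }
    return outputStream.toByteArray();
}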

ByteBuffer allocateDirect taking a really long time

I'm using float buffers backed by direct byte buffers, as required for OpenGL drawing on Android. The problem is that upon creating the byte buffer, the GC goes crazy, as in 30s+ crazy. I'm creating a mesh of 40x40 vertices, i.e. 1600 vertices, or 4800 floats. As per the profiler, the culprit that triggers the GC is ByteBuffer.allocateDirect.
Is this normal or expected for creating a mesh this size? It seems pretty tame.
The buffer init() code is below:
public static FloatBuffer createFloatBuffer(int capacity) {
    ByteBuffer vbb = ByteBuffer.allocateDirect(capacity * 4);
    vbb.order(ByteOrder.nativeOrder());
    return vbb.asFloatBuffer();
}
Your question says allocateDirect, but your code says allocate. Which are you using?
allocateDirect is known to call System.gc in an attempt to force DirectByteBuffer to be reclaimed before trying (and failing) to allocate a new direct byte buffer.
See this answer for one suggestion on avoiding the GC. Alternatively, you could try creating a pool of appropriately sized DirectByteBuffers rather than continuously creating new ones.
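A minimal sketch of that pooling idea, keyed by capacity (the class and method names are made up; assumes Java 8+ for computeIfAbsent):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical pool of direct FloatBuffers: allocateDirect is only hit the
// first time a given capacity is requested; afterwards buffers are cleared
// and reused, so the GC churn caused by allocateDirect goes away.
public final class FloatBufferPool {
    private final Map<Integer, ArrayDeque<FloatBuffer>> free = new HashMap<>();

    public synchronized FloatBuffer acquire(int capacity) {
        ArrayDeque<FloatBuffer> queue = free.get(capacity);
        FloatBuffer buf = (queue != null) ? queue.poll() : null;
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(capacity * 4)
                            .order(ByteOrder.nativeOrder())
                            .asFloatBuffer();
        }
        buf.clear();
        return buf;
    }

    public synchronized void release(FloatBuffer buf) {
        free.computeIfAbsent(buf.capacity(), k -> new ArrayDeque<>()).push(buf);
    }
}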
