Java heap size growing too big with Infinispan cache

Java heap size growing too big with Infinispan cache - java

I am using an Infinispan cache to store values. The code writes to the cache every 10 minutes and the cache reaches a size of about 400mb.
It has a time to live of about 2 hours, and the maximum entries is 16 million although currently in my tests the number of entries doesn't go above 2 million or so (I can see this by checking the mbeans/metrics in jconsole).
When I start jboss the java heap size is 1.5Gb to 2Gb. The -Xmx setting for the maximum allocated memory to jboss is 4Gb.
When I disable the Infinispan cache the heap memory usage stays flat at around 1.5Gb to 2Gb. It is very constant and stays at that level.
=> The problem is: when I have the Infinispan cache enabled the java heap size grows to about 3.5Gb/4Gb which is way more than expected.
I have done a heap dump to check the size of the cache in Eclipse MAT and it is only 300 or 400mb (which is ok).
So I would expect the memory usage to go to 2.5Gb and stay steady at that level, since the initial heap size is 2Gb and the maximum cache size should only be around 500mb.
However it continues to grow and grow over time. Every 2 or 3 hours a garbage collection is done and that brings the usage down to about 1 or 1.5Gb but it then increases again within 30 minutes up to 3.5Gb.
The number of entries stays steady at about 2 million so it is not due to just more entries going in to the cache. (Also the number of evictions stays at 0).
What could be holding on to this amount of memory if the cache is only 400-500mb?
Is it a problem with my garbage collection settings? Or should I look at Infinispan settings?
Thanks!
Edit: you can see the heap size over time here.
What is strange is that even after what looks like a full GC, the memory shoots back up again to 3Gb. This corresponds to more entries going into the cache.
Edit: It turns out this has nothing to do with Infinispan. I narrowed down the problem to a single line of code that is using a lot of memory (about 1Gb more than without the call).
But I do think more and more memory is being taken by the Infinispan cache, naturally because more entries are being added over the 2 hour time to live.
I also need to have upwards of 50 users query on Infinispan. When the heap reaches a high value like this (even without the memory leak mentioned above), I know it's not an error scenario in java however I need as much memory available as possible.
Is there any way to "encourage" a heap dump past a certain point? I have tried using GC options to collect at a given proportion of heap for the old gen but in general the heap usage tends to creep up.

Probably what you're seeing is the JVM not collecting objects which have been evicted from the cache. Cache's in general have a curious relationship with the prevailing idea of generational GC.
The generational GC idea is that, broadly speaking, there are two types of objects in the JVM - short lived ones, which are used and thrown away quickly, and longer lived ones, which are usually used throughout the lifetime of the application. In this model you want to tune your GC so that you put most of your effort attempting to identify the short lived objects. This means that you avoid looking at the long-lived objects as much as possible.
Cache's disrupt this pattern by having some intermediate-length object lifespans (i.e. a few seconds / minutes / hours, depending on your cache). These objects often get promoted to the tenured generation, where they're not usually looked at until a full GC becomes necessary, even after they've been evicted from the cache.
If this is what's happening then you've a couple of choices:
ignore it, let the full GC semantics do its thing and just be aware that this is what's happening.
try to tune the GC so that it takes longer for objects to get promoted to the tenured generation. There are some GC flags which can help with that.

Related

How to deal with long Full Garbage Collection cycle in Java

We inherited a system which runs in production and started to fail every 10 hours recently. Basically, our internal software marks the system that is has failed if it is unresponsive for a minute. We found that our problem that our Full GC cycles last for 1.5 minutes, we use 30 GB heap. Now the problem is that we cannot optimize a lot in a short period of time and we cannot partition of our service quickly but we need to get rid of 1.5 minutes pauses as soon as possible as our system fails because of these pauses in production. For us, an acceptable delay is 20 milliseconds but not more. What will be the quickest way to tweak the system? Reduce the heap to trigger GCs frequently? Use System.gc() hints? Any other solutions? We use Java 8 default settings and we have more and more users - i.e. more and more objects created.
Some GC stat

You have a lot of retained data. There is a few options which are worth considering.
increase the heap to 32 GB, this has little impact if you have free memory. Looking again at your totals it appears you are using 32 GB rather than 30 GB, so this might not help.
if you don't have plenty of free memory, it is possible a small portion of your heap is being swapped as this can increase full GC times dramatically.
there might be some simple ways to make the data structures more compact. e.g. use compact strings, use primitives instead of wrappers e.g. long for a timestamp instead of Date or LocalDateTime. (long is about 1/8th the size)
if neither of these help, try moving some of the data off heap. e.g. Chronicle Map is a ConcurrentMap which uses off heap memory can can reduce you GC times dramatically. i.e. there is no GC overhead for data stored off heap. How easy this is to add highly depends on how your data is structured.
I suggest analysing how your data is structured to see if there is any easy ways to make it more efficient.

There is no one-size-fits-all magic bullet solution to your problem: you'll need to have a good handle on your application's allocation and liveness patterns, and you'll need to know how that interacts with the specific garbage collection algorithm you are running (function of version of Java and command line flags passed to java).
Broadly speaking, a Full GC (that succeeds in reclaiming lots of space) means that lots of objects are surviving the minor collections (but aren't being leaked). Start by looking at the size of your Eden and Survivor spaces: if the Eden is too small, minor collections will run very frequently, and perhaps you aren't giving an object a chance to die before its tenuring threshold is reached. If the Survivors are too small, objects are going to be promoted into the Old gen prematurely.
GC tuning is a bit of an art: you run your app, study the results, tweak some parameters, and run it again. As such, you will need a benchmark version of your application, one which behaves as close as possible to the production one but which hopefully doesn't need 10 hours to cause a full GC.
As you stated that you are running Java 8 with the default settings, I believe that means that your Old collections are running with a Serial collector. You might see some very quick improvements by switching to a Parallel collector for the Old generation (-XX:+UseParallelOldGC). While this might reduce the 1.5 minute pause to some number of seconds (depending on the number of cores on your box, and the number of threads you specify for GC), this will not reduce your max pause to to 20ms.

When this happened to me, it was due to a memory leak caused by a static variable eating up memory. I would go through all recent code changes and look for any possible memory leaks.

Java UI application: slow CPU growth

I want to understand: is it a normal situation that CPU usage of the working java UI application is growing slowly (started from <= 1.5%, after 48 hours: <= 10%). I don't see memory leaks during the heapdump investigations.
Although, if I perform gc (using jvisualvm) and look at deltas (sample memory part), such classes as WeakReference, WeakListenerImpl are still growing (slowly).
Also, the problem is that major garbage collection occurs too often (practically every second), however at first few hours the situation was normal.
What could be the reason of a such application behavior?
JVM:
-Xms128m
-Xmx256m
GC:
default for jdk 1.8
Thank you in advance!

Also, the problem is that major garbage collection occurs too often (practically every second), however at first few hours the situation was normal.
Consider increasing the max heap size (Xmx) to give the GCs more breathing room.
Although, if I perform gc (using jvisualvm) and look at deltas (sample memory part), such classes as WeakReference, WeakListenerImpl are still growing (slowly).
There are two possibilities, either weak references themselves get cleared but the reference objects are not dequeued from a referencequeue (this would usually result in a very slow leak over time) or something is holding a strong reference to the objects.
You should take a heap dump and inspect what keeps the accumulating objects reachable from GC roots.

Does this memory usage pattern indicate that my Java application leaks memory?

I have a Java application that waits for the user to hit a key and then runs a task. Once done, it goes back and waits again. I was looking at memory profile for this application with jvisualvm, and it showed an increasing pattern.
Committed memory size is 16MB.
Used memory, on application startup, was 2.7 MB, and then it climbed with intermediate drops (garbage collection). Once this sawtooth pattern approached close to 16MB, a major drop occurred and the memory usage fell close to 4 MB. This major drop point has been increasing though. 4MB, 6MB, 8MB. The usage never goes beyond 16 MB but the whole sawtooth pattern is on a climb towards 16 MB.
Do I have a memory leak?
Since this is my first time posting to StackOverflow, do not have enough reputation to post an image.

Modern SunOracle JVMs use what is called a generational garbage collector:
When the collector runs it first tries a partial collection only releases memory that was allocated recently
recently created objects that are still active get 'promoted'
Once an object has been promoted a few times, it will no longer get cleaned up by partial collections even after it is ready for collection
These objects, called tenured, are only cleaned up when a full collection becomes necessary in order to make enough room for the program to continue running
So basically, bits of your program that stick around long enough to get missed by the fast 'partial' collections will hang around until JVM decides it has to do a full collection. If you let it go long enough you should eventually see the full collection happen and usage drop back down to your original starting point.
If that never happens and you eventually get an Out Of Memory exception, then you probably have a memory leak :)

That kind of sawtooth pattern is commonly observed and is not an indication of memory leak.
Because garbage collecting in big chunks is more efficient than constantly collecting small amounts, the JVM does the collecting in batches. That's why you see this pattern.

As stated by others, this behavior is normal. This is a good description of the garbage collection process. To summarize, the JVM usese a generational garbage collector. The vast majority of objects are very short-lived, and those that survive longer tend to last much longer. Knowing this, the GC will check the newer generation first to avoid having to repeatedly check the older objects which are less likely to be inaccessible. After a period of time, the survivors move to the older generation. This increasing saw-tooth is exactly what you're seeing- the rising troughs are due to the older generation growing larger as the survivors are being moved to it. If your program ran long enough eventually checking the newer generation wouldn't free up enough memory and it would have to GC the old generation as well.
Hope that helps.

What the frequency of the Garbage Collection in Java?

Page 6 of the the document Memory Management in the Java
HotSpot™ Virtual Machine contains the following paragraphs:
Young generation collections occur relatively frequently and are
efficient and fast because the young generation space is usually small
and likely to contain a lot of objects that are no longer referenced.
Objects that survive some number of young generation collections are
eventually promoted, or tenured, to the
old generation. See Figure 1. This generation is typically larger than the young generation and its occupancy
grows more slowly. As a result, old generation collections are infrequent, but take significantly longer to
complete
Could someone please define what "frequent" and "infrequent" mean in the statements above? Are we talking microseconds, milliseconds, minutes, days?

It is not possible to give a definite answer to this. It really depends on a lot of factors, including the platform (JVM version, settings, etc), the application, and the workload.
At one extreme, it is possible for an application to never trigger a garbage collector. It might simply sit there doing nothing, or it might perform an extremely long computation in which no objects are created after the JVM initialization and application startup.
At the other extreme it is theoretically possible for one garbage collection end and another one to start within few nanoseconds. For example, this could happen if your application is in the last stages of dying from a full heap, or if it is allocating pathologically large arrays.
So:
Are we talking microseconds, milliseconds, minutes, days?
Possibly all of the above, though the first two would definitely be troubling if you observed them in practice.
A well behaved application should not run the GC too often. If your application is triggering a young space collection more than once or twice a second, then this could lead to performance problems. And too frequent "full" collections is worse because their impact is greater. However, it is certainly plausible for a poorly designed / implemented application to behave like this.
There is also the issue that the interval between GC runs is not always meaningful. For instance some of the HotSpot GCs actually have GC threads running concurrently with normal application threads. If you have enough cores, enough RAM and enough memory bus bandwidth, then a constantly running concurrent GC may not appreciably affect application performance.
Terminology note:
Strictly speaking a concurrent GC is one where the GC can run at the same time as the application threads.
Strictly speaking a parallel GC is one where the GC itself uses multiple threads.
A GC can be concurrent without being parallel, and vice versa.

Its a relative term. Young collections could be many times a seconds up to a few hours. Old generations collections can be every few seconds, up to daily. You should expect to have many more young collections than old collections in a most systems.
Its highly unlikely to be many days. If the GC occurs too often e.g. << 100 ms apart you get get a OutOfMemoryError: GC Overhead Exceeded as the JVM prevenets that from happening.

As it is, the terms "frequent" , "infrequent" are relative. And the timings are, in fact, not fixed. It depends on the system in question. It depends on lots of things like:
Your heap size and settings for different parts of the heap (young, old gen, perm gen)
Your application's memory behaviour. How many objects does it create and how fast? how long those objects are referenced etc?
If your application is monster memory eater, gc would run as if its running for its life. If your application does not demand too much of memory, then gc would run at intervals decided by how full the memory is.

TL DL: "Frequent" and "infrequent" are relative terms that depends on the memory allocation rate and the heap size. If you want a precise answer, you need to measure it yourself for your particular application.
Let's say your app has two modes, mode-1 allocates memory and does computation and mode-2 sits idle.
If mode-1 allocation is smaller than the heap available, no gc need to occur until it finishes. Maybe it used so little RAM that it could do a second round of mode-1 without collection. However, eventually you'll run out of free heap, and jvm will perform an "infrequent" collection.
However, if mode-1 allocation is a significant fraction of, or larger, than the young-generation heap, collection would happen more "frequently". During the young gen collection, allocations that survive (imagine data is needed through the entire mode-1 operation), will be promoted to old-gen, giving the young-gen more room. Young-gen allocation and collection can now continue. Eventually old-gen heap would run out, and must be collected, thus "infrequently".
So then, how frequent is frequent? It depends on the allocation rate and the heap size. If jvm is bumping into the heap limit often, it'll collect often. If there is plenty of heap (let's say 100GB), then jvm doesn't need to collect for a long long time. The down side is that when it finally does a collection, it might take a long time to free 100GB, stopping the jvm for many seconds (or minutes!). The current JVMs are smarter than that and would occasionanlly force a collection (preferably in mode-2). And with parallel collectors, it could happen all the time if necessary.
Ultimately, the frequency is task and heap dependent, as well as how various vm parameters are set. If you want a precise answer, you must measure them yourself for your particular application.

Because spec says "relatively frequently" and infrequent (regarding Young generation), we can't estimate the frequency in absolute units like microseconds, milliseconds, minutes or days

Java heap size not entirely used

I'm currently monitoring my running java application with Visual VM: http://visualvm.java.net/
I'm stressing the memory usage by with -Xmx128m.
When running I see the heap size increasing to 128m (as expected) however the used heap converges to approximately 105m before I run into a java heap space error.
Why are these remaining 20m, not used?

You need to understand a central fact about garbage collector ergonomics:
The costly part of garbage collection is finding and dealing with the objects that are NOT garbage.
This means: as the heap gets close to its maximum capacity, the GC will spend more and more time for less and less return in reclaimed space. If the GC was to try and use every last byte of memory, the net result would be that your JVM would spend more and more time garbage collecting, until ... eventually ... almost no useful work was being done.
To avoid this pathological situation, the JVM monitors the ratio of time is spent GC'ing and doing useful work. When the ratio exceeds a configurable threshold value, the GC raises an OutOfMemoryError ... even though (technically) there is free memory available. This is probably what you are seeing, though the other explanations are equally plausible.
You can change the GC thresholds, generation sizes, etc via JVM options, but it is probably better not to. A better idea is to figure out why your application's memory usage is continually creeping upwards. There are most likely memory leaks ... i.e. a bugs ... in your code that are causing this. Spend your effort finding and fixing those bugs, rather than worrying about why you are not using all of the memory.
(In fact, you are using it ... but not all of the time.)

The heap is split up in Young-Generation (Eden-Space, and two Survivor-Spaces of identical size usually called From and To), Old Generation (Tenured) and Permanent Space.
The Xmx/Xms option sets the overall heap size. So a region (with a default size) is actually the Permanent Space - and maybe, we don't know details about your stress test, no objects are actually moved from eden to tenured or permanent, so those regions remain empty while Eden runs out of space.

Java splits its memory into generations. You can get a heap space error if the tenured generation fills. Normally, they resize dynamically but if you have set a fixed size it won't.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.