Java UI application: slow CPU growth - java

Is it normal for the CPU usage of a running Java UI application to grow slowly (it started at <= 1.5% and was at <= 10% after 48 hours)? I don't see memory leaks in my heap dump investigations.
However, if I perform a GC (using jvisualvm) and look at the deltas (memory sampler), instances of classes such as WeakReference and WeakListenerImpl are still growing (slowly).
Also, major garbage collection now occurs too often (practically every second), although during the first few hours the situation was normal.
What could be the reason for such application behavior?
JVM:
-Xms128m
-Xmx256m
GC:
default for jdk 1.8
Thank you in advance!

"Also, major garbage collection now occurs too often (practically every second), although during the first few hours the situation was normal."
Consider increasing the max heap size (Xmx) to give the GC more breathing room.
"However, if I perform a GC (using jvisualvm) and look at the deltas (memory sampler), instances of classes such as WeakReference and WeakListenerImpl are still growing (slowly)."
There are two possibilities: either the weak references themselves get cleared but the reference objects are never dequeued from a ReferenceQueue (this would usually result in a very slow leak over time), or something is holding a strong reference to the objects.
You should take a heap dump and inspect what keeps the accumulating objects reachable from GC roots.
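To illustrate the first possibility, here is a minimal sketch of a registry that holds listeners weakly and prunes the wrapper objects once the GC has enqueued them. The class name WeakListenerRegistry and its methods are hypothetical and are not taken from the application in question; the point is only that something has to poll the ReferenceQueue, otherwise cleared WeakReference instances accumulate.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

// Hypothetical registry: holds listeners weakly and prunes stale wrappers.
class WeakListenerRegistry<T> {
    private final ReferenceQueue<T> queue = new ReferenceQueue<>();
    // WeakReference uses identity equals/hashCode, so a HashSet works here.
    private final Set<Reference<? extends T>> refs = new HashSet<>();

    // Registers a listener without keeping it strongly reachable.
    WeakReference<T> add(T listener) {
        WeakReference<T> ref = new WeakReference<>(listener, queue);
        refs.add(ref);
        return ref;
    }

    // Removes every wrapper that has been enqueued; returns how many were dropped.
    int expungeStale() {
        int removed = 0;
        Reference<? extends T> stale;
        while ((stale = queue.poll()) != null) {
            if (refs.remove(stale)) {
                removed++;
            }
        }
        return removed;
    }

    // Number of wrappers currently held (live or not yet expunged).
    int size() {
        return refs.size();
    }
}
```

If expungeStale() were never called, size() would keep growing even though the listeners themselves are collected, which would look like the slow WeakReference growth described in the question.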

Related

How to deal with long Full Garbage Collection cycle in Java

We inherited a system that runs in production and recently started to fail every 10 hours. Basically, our internal software marks the system as failed if it is unresponsive for a minute. We found that the problem is that our Full GC cycles last 1.5 minutes; we use a 30 GB heap. We cannot optimize much in a short period of time, and we cannot partition our service quickly, but we need to get rid of the 1.5-minute pauses as soon as possible because they are causing failures in production. For us, an acceptable delay is 20 milliseconds, but no more. What is the quickest way to tweak the system? Reduce the heap to trigger GCs more frequently? Use System.gc() hints? Any other solutions? We use Java 8 with default settings, and we have more and more users, i.e. more and more objects created.
Some GC stat
You have a lot of retained data. There are a few options worth considering:
increase the heap to 32 GB; this has little impact if you have free memory. Looking again at your totals, it appears you are already using 32 GB rather than 30 GB, so this might not help.
if you don't have plenty of free memory, it is possible that a small portion of your heap is being swapped, which can increase full GC times dramatically.
there might be some simple ways to make the data structures more compact, e.g. use compact strings, or use primitives instead of wrappers, such as a long for a timestamp instead of Date or LocalDateTime (a long is about 1/8th the size).
if none of these help, try moving some of the data off heap, e.g. Chronicle Map is a ConcurrentMap which uses off-heap memory and can reduce your GC times dramatically, i.e. there is no GC overhead for data stored off heap. How easy this is to add depends heavily on how your data is structured.
I suggest analysing how your data is structured to see if there are any easy ways to make it more efficient.
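As a sketch of the "primitives instead of wrappers" suggestion, the hypothetical TimestampColumn class below stores timestamps as epoch milliseconds in a long[] and only creates an Instant when a caller asks for one. The class and its methods are illustrative assumptions, and this representation drops any sub-millisecond precision.

```java
import java.time.Instant;
import java.util.Arrays;

// Hypothetical column of timestamps kept as primitive epoch milliseconds.
class TimestampColumn {
    private long[] millis = new long[16];
    private int size;

    // Appends a timestamp, growing the backing array when it is full.
    void add(Instant t) {
        if (size == millis.length) {
            millis = Arrays.copyOf(millis, size * 2);
        }
        millis[size++] = t.toEpochMilli();
    }

    // Materializes an Instant on demand; nothing is retained per entry.
    Instant get(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return Instant.ofEpochMilli(millis[i]);
    }

    int size() {
        return size;
    }
}
```

Each entry then costs 8 bytes in one array instead of a separate object per timestamp, which also means fewer live objects for the collector to trace.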
There is no one-size-fits-all magic bullet solution to your problem: you'll need to have a good handle on your application's allocation and liveness patterns, and you'll need to know how that interacts with the specific garbage collection algorithm you are running (function of version of Java and command line flags passed to java).
Broadly speaking, a Full GC (that succeeds in reclaiming lots of space) means that lots of objects are surviving the minor collections (but aren't being leaked). Start by looking at the size of your Eden and Survivor spaces: if the Eden is too small, minor collections will run very frequently, and perhaps you aren't giving an object a chance to die before its tenuring threshold is reached. If the Survivors are too small, objects are going to be promoted into the Old gen prematurely.
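One way to look at the Eden, Survivor and Old gen sizes from inside the running JVM is the standard java.lang.management API. The sketch below lists the heap memory pools; the class name HeapPoolReport is made up for illustration, and the pool names it prints depend on which collector the JVM is running.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that reports the current size of each heap pool.
class HeapPoolReport {
    // Returns one line per heap pool; names vary by collector.
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache and other non-heap pools
            }
            MemoryUsage u = pool.getUsage();
            if (u == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined.
            long maxKb = u.getMax() < 0 ? -1 : u.getMax() / 1024;
            lines.add(String.format("%s: used=%d KB, committed=%d KB, max=%d KB",
                    pool.getName(), u.getUsed() / 1024, u.getCommitted() / 1024, maxKb));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeHeapPools().forEach(System.out::println);
    }
}
```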
GC tuning is a bit of an art: you run your app, study the results, tweak some parameters, and run it again. As such, you will need a benchmark version of your application, one which behaves as close as possible to the production one but which hopefully doesn't need 10 hours to cause a full GC.
As you stated that you are running Java 8 with the default settings, I believe that means that your Old collections are running with a Serial collector. You might see some very quick improvements by switching to a Parallel collector for the Old generation (-XX:+UseParallelOldGC). While this might reduce the 1.5 minute pause to some number of seconds (depending on the number of cores on your box, and the number of threads you specify for GC), this will not reduce your max pause to 20ms.
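Since the default collector depends on the Java version and on the machine the JVM runs on, it may be worth confirming which collectors are actually active before changing flags. The sketch below uses the standard GarbageCollectorMXBean API; the class name CollectorReport is hypothetical, and the collector names printed differ between collectors and JVM versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the collectors this JVM is running with.
class CollectorReport {
    // One line per collector: name, number of collections, accumulated time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", totalTimeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Logging these counters periodically also gives a rough view of how often old-generation collections run and how much time they take.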
When this happened to me, it was due to a memory leak caused by a static variable eating up memory. I would go through all recent code changes and look for any possible memory leaks.

Java heap size growing too big with Infinispan cache

I am using an Infinispan cache to store values. The code writes to the cache every 10 minutes and the cache reaches a size of about 400mb.
It has a time to live of about 2 hours, and the maximum entries is 16 million although currently in my tests the number of entries doesn't go above 2 million or so (I can see this by checking the mbeans/metrics in jconsole).
When I start jboss the java heap size is 1.5Gb to 2Gb. The -Xmx setting for the maximum allocated memory to jboss is 4Gb.
When I disable the Infinispan cache the heap memory usage stays flat at around 1.5Gb to 2Gb. It is very constant and stays at that level.
=> The problem is: when I have the Infinispan cache enabled the java heap size grows to about 3.5Gb/4Gb which is way more than expected.
I have done a heap dump to check the size of the cache in Eclipse MAT and it is only 300 or 400mb (which is ok).
So I would expect the memory usage to go to 2.5Gb and stay steady at that level, since the initial heap size is 2Gb and the maximum cache size should only be around 500mb.
However it continues to grow and grow over time. Every 2 or 3 hours a garbage collection is done and that brings the usage down to about 1 or 1.5Gb but it then increases again within 30 minutes up to 3.5Gb.
The number of entries stays steady at about 2 million so it is not due to just more entries going in to the cache. (Also the number of evictions stays at 0).
What could be holding on to this amount of memory if the cache is only 400-500mb?
Is it a problem with my garbage collection settings? Or should I look at Infinispan settings?
Thanks!
Edit: you can see the heap size over time here.
What is strange is that even after what looks like a full GC, the memory shoots back up again to 3Gb. This corresponds to more entries going into the cache.
Edit: It turns out this has nothing to do with Infinispan. I narrowed down the problem to a single line of code that is using a lot of memory (about 1Gb more than without the call).
But I do think more and more memory is being taken by the Infinispan cache, naturally because more entries are being added over the 2 hour time to live.
I also need to have upwards of 50 users query on Infinispan. When the heap reaches a high value like this (even without the memory leak mentioned above), I know it's not an error scenario in java however I need as much memory available as possible.
Is there any way to "encourage" a heap dump past a certain point? I have tried using GC options to collect at a given proportion of heap for the old gen but in general the heap usage tends to creep up.
Probably what you're seeing is the JVM not collecting objects which have been evicted from the cache. Caches in general have a curious relationship with the prevailing idea of generational GC.
The generational GC idea is that, broadly speaking, there are two types of objects in the JVM: short-lived ones, which are used and thrown away quickly, and longer-lived ones, which are usually used throughout the lifetime of the application. In this model you want to tune your GC so that you put most of your effort into identifying the short-lived objects. This means that you avoid looking at the long-lived objects as much as possible.
Caches disrupt this pattern by having some intermediate-length object lifespans (i.e. a few seconds / minutes / hours, depending on your cache). These objects often get promoted to the tenured generation, where they're not usually looked at until a full GC becomes necessary, even after they've been evicted from the cache.
If this is what's happening then you've a couple of choices:
ignore it, let the full GC semantics do its thing and just be aware that this is what's happening.
try to tune the GC so that it takes longer for objects to get promoted to the tenured generation. There are some GC flags which can help with that.

Does this memory usage pattern indicate that my Java application leaks memory?

I have a Java application that waits for the user to hit a key and then runs a task. Once done, it goes back and waits again. I was looking at memory profile for this application with jvisualvm, and it showed an increasing pattern.
Committed memory size is 16MB.
Used memory, on application startup, was 2.7 MB, and then it climbed with intermediate drops (garbage collection). Once this sawtooth pattern approached close to 16MB, a major drop occurred and the memory usage fell close to 4 MB. This major drop point has been increasing though. 4MB, 6MB, 8MB. The usage never goes beyond 16 MB but the whole sawtooth pattern is on a climb towards 16 MB.
Do I have a memory leak?
Since this is my first time posting to StackOverflow, I do not have enough reputation to post an image.
Modern Sun/Oracle JVMs use what is called a generational garbage collector:
When the collector runs, it first tries a partial collection, which only releases memory that was allocated recently
recently created objects that are still active get 'promoted'
Once an object has been promoted a few times, it will no longer get cleaned up by partial collections even after it is ready for collection
These objects, called tenured, are only cleaned up when a full collection becomes necessary in order to make enough room for the program to continue running
So basically, bits of your program that stick around long enough to get missed by the fast 'partial' collections will hang around until the JVM decides it has to do a full collection. If you let it run long enough, you should eventually see the full collection happen and usage drop back down to your original starting point.
If that never happens and you eventually get an Out Of Memory exception, then you probably have a memory leak :)
That kind of sawtooth pattern is commonly observed and is not an indication of memory leak.
Because garbage collecting in big chunks is more efficient than constantly collecting small amounts, the JVM does the collecting in batches. That's why you see this pattern.
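To separate a normal sawtooth from a suspicious one, it can help to look only at the low points of the series. The sketch below is a rough heuristic under stated assumptions, not a leak detector: SawtoothAnalyzer is a made-up name, the input is assumed to be used-heap samples in bytes, and the result is only meaningful when the low points correspond to full collections, because troughs after minor collections are expected to rise while objects are promoted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical heuristic over a series of used-heap samples.
class SawtoothAnalyzer {
    // Returns samples that are lower than the previous one and not higher
    // than the next one, i.e. the bottoms of the sawtooth.
    static List<Long> troughs(long[] usedBytes) {
        List<Long> result = new ArrayList<>();
        for (int i = 1; i < usedBytes.length - 1; i++) {
            if (usedBytes[i] < usedBytes[i - 1] && usedBytes[i] <= usedBytes[i + 1]) {
                result.add(usedBytes[i]);
            }
        }
        return result;
    }

    // True when there are at least two troughs and each one is strictly
    // higher than the one before it.
    static boolean troughsKeepRising(long[] usedBytes) {
        List<Long> t = troughs(usedBytes);
        if (t.size() < 2) {
            return false;
        }
        for (int i = 1; i < t.size(); i++) {
            if (t.get(i) <= t.get(i - 1)) {
                return false;
            }
        }
        return true;
    }
}
```

A series that keeps returning to the same floor yields false; a floor that climbs after every full collection yields true and is the case worth investigating with a heap dump.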
As stated by others, this behavior is normal. This is a good description of the garbage collection process. To summarize, the JVM uses a generational garbage collector. The vast majority of objects are very short-lived, and those that survive longer tend to last much longer. Knowing this, the GC will check the newer generation first to avoid having to repeatedly check the older objects, which are less likely to be inaccessible. After a period of time, the survivors move to the older generation. This increasing sawtooth is exactly what you're seeing: the rising troughs are due to the older generation growing larger as the survivors are moved to it. If your program ran long enough, eventually checking the newer generation wouldn't free up enough memory and it would have to GC the old generation as well.
Hope that helps.

Garbage Collector taking too much CPU Time

I've developed a web application which processes a huge amount of data and takes a lot of time to complete.
So now I am profiling my application, and I noticed one very bad thing about GC.
When a Full GC occurs, it stops all processing for 30-40 seconds.
I wonder if there is any way to improve this. I don't want to waste that much CPU time on GC alone. Below are some details that may be useful:
I am using Java 1.6.0.23
My application takes 20 GB max memory.
A full GC occurs every 14 minutes.
Memory before GC is 20 GB and after GC is 7.8 GB
Memory used according to Task Manager is 41 GB.
After the process has completed (the JVM is still running), used memory is 5 GB and free memory is 15 GB.
There are many algorithms that modern JVMs use for garbage collection. Some algorithms, such as reference counting, are very fast, and some, such as memory copying, are very slow. You could change your code so that it helps the JVM use the faster algorithms most of the time.
One of the fastest algorithms is reference counting, and as the name describes, it counts references to an object, and when it reaches zero, it is ready for garbage collection, and after that it decreases reference count to objects referenced by the current GCed object.
To help the JVM use this algorithm, avoid circular references (object A references B, then B references C, C references D ..., and Z references A again), because even when the whole object graph is unreachable, none of the objects' reference counters reaches zero.
You can break the circle only when you no longer need the objects in it (by assigning null to one of the references).
If you use a 64-bit architecture, add:
-XX:+UseCompressedOops (64-bit addresses are converted to 32-bit)
Use G1GC instead of CMS:
-XX:+UseG1GC (it collects in incremental steps)
Set the same initial and max size: -Xms5g -Xmx5g
Tune parameters (just an example):
-XX:MaxGCPauseMillis=100 -XX:GCPauseIntervalMillis=1000
See Java HotSpot VM Options, Performance Options
Either improve the app by reusing resources, or call System.gc() yourself in some critical regions of the app (which is not guaranteed to help you). Most likely you have a memory leak somewhere that you have to investigate, and then restructure the code accordingly.
The fewer things you new, the fewer things need to be collected.
Suppose you have class A.
You can include in it a reference to another instance of class A.
That way you can make a "free list" of instances of A.
Whenever you need an A, just pop one off the free list.
If the free list is empty, then new one.
When you no longer need it, push it on the free list.
This can save a lot of time.
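The free list described above could be sketched as follows. PooledItem stands in for "class A", and ItemPool is a hypothetical name; the sketch is not thread-safe and assumes callers never touch an instance after releasing it.

```java
// Hypothetical pooled type: each instance carries the link for the free list.
class PooledItem {
    PooledItem nextFree; // used only while the instance sits on the free list
    int payload;         // example state that must be reset before reuse
}

// Hypothetical single-threaded pool built on an intrusive free list.
class ItemPool {
    private PooledItem head; // top of the free list, null when empty

    // Pops a recycled instance, or allocates one when the list is empty.
    PooledItem acquire() {
        if (head == null) {
            return new PooledItem();
        }
        PooledItem item = head;
        head = item.nextFree;
        item.nextFree = null;
        return item;
    }

    // Resets the instance and pushes it back on the free list.
    void release(PooledItem item) {
        item.payload = 0;
        item.nextFree = head;
        head = item;
    }
}
```

Pooling like this tends to pay off mainly for objects that are expensive to construct; pooled instances stay reachable for as long as the pool does, so they count as live data in every collection.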
The amount of time spent in GC depends on two factors:
How many objects are live (= can still be reached from a GC root)
How many dead objects implement finalize()
Objects which can't be reached and which don't use finalize() cost nothing to clean up in Java, which is why Java is usually on par with other languages like C++ (and often much better, because C++ spends a lot of time deleting objects).
So what you need to do in your app is cut down on the number of objects that survive and/or cut references to objects (that you no longer need) earlier in the code. Example:
When you have a very long method, you will keep all the objects alive that you reference from local variables. If you split that method in many smaller methods, the references will be lost faster and the GC won't have to deal with those objects.
If you put everything that you might need in huge hash maps, the maps will keep all those instances alive until your code completes. So even when you don't need those anymore, the GC will still have to spend time on them.
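One possible way to keep a map from pinning everything it has ever seen is to bound it. The sketch below uses the standard LinkedHashMap.removeEldestEntry hook in access order, so the least recently used entry is dropped once a limit is exceeded. BoundedCache and the limit are illustrative assumptions; whether eviction is acceptable depends on whether the dropped values can be recomputed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-bounded map: evicts the least recently used entry.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

Once an entry is evicted and no other reference to its value remains, the value becomes unreachable and the GC no longer has to trace it.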

Why does java wait so long to run the garbage collector?

I am building a Java web app, using the Play! Framework. I'm hosting it on playapps.net. I have been puzzling for a while over the provided graphs of memory consumption. Here is a sample:
The graph comes from a period of consistent but nominal activity. I did nothing to trigger the falloff in memory, so I presume this occurred because the garbage collector ran as it has almost reached its allowable memory consumption.
My questions:
Is it fair for me to assume that my application does not have a memory leak, as it appears that all the memory is correctly reclaimed by the garbage collector when it does run?
(from the title) Why is java waiting until the last possible second to run the garbage collector? I am seeing significant performance degradation as the memory consumption grows to the top fourth of the graph.
If my assertions above are correct, then how can I go about fixing this issue? The other posts I have read on SO seem opposed to calls to System.gc(), ranging from neutral ("it's only a request to run GC, so the JVM may just ignore you") to outright opposed ("code that relies on System.gc() is fundamentally broken"). Or am I off base here, and I should be looking for defects in my own code that is causing this behavior and intermittent performance loss?
UPDATE
I have opened a discussion on PlayApps.net pointing to this question and mentioning some of the points here; specifically @Affe's comment regarding the settings for a full GC being set very conservatively, and @G_H's comment about settings for the initial and max heap size.
Here's a link to the discussion, though you unfortunately need a playapps account to view it.
I will report the feedback here when I get it; thanks so much everyone for your answers, I've already learned a great deal from them!
Resolution
Playapps support, which is still great, didn't have many suggestions for me. Their only thought was that if I was using the cache extensively, this might be keeping objects alive longer than need be, but that isn't the case. I still learned a ton (woo hoo!), and I gave @Ryan Amos the green check as I took his suggestion of calling System.gc() every half day, which for now is working fine.
Any detailed answer is going to depend on which garbage collector you're using, but there are some things that are basically the same across all (modern, sun/oracle) GCs.
Every time you see the usage in the graph go down, that is a garbage collection. The only way heap gets freed is through garbage collection. The thing is, there are two types of garbage collections, minor and full. The heap gets divided into two basic "areas": young and tenured. (There are lots more subgroups in reality.) Anything that is taking up space in young and is still in use when the minor GC comes along to free up some memory is going to get 'promoted' into tenured. Once something makes the leap into tenured, it sits around indefinitely until the heap has no free space and a full garbage collection is necessary.
So one interpretation of that graph is that your young generation is fairly small (by default it can be a fairly small % of total heap on some JVMs) and you're keeping objects "alive" for comparatively very long times (perhaps you're holding references to them in the web session?). So your objects are 'surviving' garbage collections until they get promoted into tenured space, where they stick around indefinitely until the JVM is well and truly out of memory.
Again, that's just one common situation that fits with the data you have. Would need full details about the JVM configuration and the GC logs to really tell for sure what's going on.
Java won't run the garbage cleaner until it has to, because the garbage cleaner slows things down quite a bit and shouldn't be run that frequently. I think you would be OK to schedule a cleaning more frequently, such as every 3 hours. If an application never consumes full memory, there should be no reason to ever run the garbage cleaner, which is why Java only runs it when the memory is very high.
So basically, don't worry about what others say: do what works best. If you find performance improvements from running the garbage cleaner at 66% memory, do it.
I am noticing that the graph isn't sloping strictly upward until the drop, but has smaller local variations. Although I'm not certain, I don't think memory use would show these small drops if there was no garbage collection going on.
There are minor and major collections in Java. Minor collections occur frequently, whereas major collections are rarer and diminish performance more. Minor collections probably tend to sweep up stuff like short-lived object instances created within methods. A major collection will remove a lot more, which is what probably happened at the end of your graph.
Now, some answers that were posted while I'm typing this give good explanations regarding the differences in garbage collectors, object generations and more. But that still doesn't explain why it would take so absurdly long (nearly 24 hours) before a serious cleaning is done.
Two things of interest that can be set for a JVM at startup are the maximum allowed heap size and the initial heap size. The maximum is a hard limit: once you reach it, further garbage collection doesn't reduce memory usage, and if you need to allocate new space for objects or other data, you'll get an OutOfMemoryError. However, internally there's a soft limit as well: the current heap size. A JVM doesn't immediately gobble up the maximum amount of memory. Instead, it starts at your initial heap size and then increases the heap when it's needed. Think of it a bit as the RAM of your JVM, which can increase dynamically.
If the actual memory use of your application starts to reach the current heap size, a garbage collection will typically be instigated. This might reduce the memory use, so an increase in heap size isn't needed. But it's also possible that the application currently does need all that memory and would exceed the heap size. In that case, it is increased provided that it hasn't already reached the maximum set limit.
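The three quantities discussed here (memory in use, the current heap size, and the maximum) can be read from the standard Runtime API. In the sketch below the class name HeapSizes is made up for illustration, and "used" is approximated as totalMemory() minus freeMemory(), which is a common convention rather than an exact figure.

```java
// Hypothetical helper that reads the current heap figures from Runtime.
class HeapSizes {
    // Returns { used, committed (current heap size), max } in bytes.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory();       // current heap size, the "soft limit"
        long used = committed - rt.freeMemory(); // approximate memory in use
        long max = rt.maxMemory();               // hard limit, roughly -Xmx
        return new long[] { used, committed, max };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.printf("used=%d MB, committed=%d MB, max=%d MB%n",
                s[0] >> 20, s[1] >> 20, s[2] >> 20);
    }
}
```

Logging these values over time shows whether a graph is tracking heap use or the committed heap size, which is the distinction this answer relies on.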
Now, what might be your case is that the initial heap size is set to the same value as the maximum. Suppose that would be so, then the JVM will immediately seize all that memory. It will take a very long time before the application has accumulated enough garbage to reach the heap size in memory usage. But at that moment you'll see a large collection. Starting with a small enough heap and allowing it to grow keeps the memory use limited to what's needed.
This is assuming that your graph shows heap use and not allocated heap size. If that's not the case and you are actually seeing the heap itself grow like this, something else is going on. I'll admit I'm not savvy enough regarding the internals of garbage collection and its scheduling to be absolutely certain of what's happening here, most of this is from observation of leaking applications in profilers. So if I've provided faulty info, I'll take this answer down.
As you might have noticed, this does not affect you. The garbage collection only kicks in if the JVM feels there is a need for it to run, and this happens for the sake of optimization: there's no use in doing many small collections if you can make a single full collection and do a full cleanup.
The current JVM contains some really interesting algorithms and the garbage collection itself is divided into 3 different regions. You can find a lot more about this here; here's a sample:
Three types of collection algorithms
The HotSpot JVM provides three GC algorithms, each tuned for a specific type of collection within a specific generation. The copy (also known as scavenge) collection quickly cleans up short-lived objects in the new generation heap. The mark-compact algorithm employs a slower, more robust technique to collect longer-lived objects in the old generation heap. The incremental algorithm attempts to improve old generation collection by performing robust GC while minimizing pauses.
Copy/scavenge collection
Using the copy algorithm, the JVM reclaims most objects in the new generation object space (also known as eden) simply by making small scavenges -- a Java term for collecting and removing refuse. Longer-lived objects are ultimately copied, or tenured, into the old object space.
Mark-compact collection
As more objects become tenured, the old object space begins to reach maximum occupancy. The mark-compact algorithm, used to collect objects in the old object space, has different requirements than the copy collection algorithm used in the new object space.
The mark-compact algorithm first scans all objects, marking all reachable objects. It then compacts all remaining gaps of dead objects. The mark-compact algorithm occupies more time than the copy collection algorithm; however, it requires less memory and eliminates memory fragmentation.
Incremental (train) collection
The new generation copy/scavenge and the old generation mark-compact algorithms can't eliminate all JVM pauses. Such pauses are proportional to the number of live objects. To address the need for pauseless GC, the HotSpot JVM also offers incremental, or train, collection.
Incremental collection breaks up old object collection pauses into many tiny pauses even with large object areas. Instead of just a new and an old generation, this algorithm has a middle generation comprising many small spaces. There is some overhead associated with incremental collection; you might see as much as a 10-percent speed degradation.
The -Xincgc and -Xnoincgc parameters control how you use incremental collection. The next release of HotSpot JVM, version 1.4, will attempt continuous, pauseless GC that will probably be a variation of the incremental algorithm. I won't discuss incremental collection since it will soon change.
This generational garbage collector is one of the most efficient solutions we have for the problem nowadays.
I had an app that produced a graph like that and acted as you describe. I was using the CMS collector (-XX:+UseConcMarkSweepGC). Here is what was going on in my case.
I did not have enough memory configured for the application, so over time I was running into fragmentation problems in the heap. This caused GCs with greater and greater frequency, but it did not actually throw an OOME or fail out of CMS to the serial collector (which it is supposed to do in that case), because the stats it keeps only count application paused time (GC blocks the world); application concurrent time (GC runs alongside application threads) is ignored for those calculations. I tuned some parameters, mainly gave it a whole crap load more heap (with a very large new space), set -XX:CMSFullGCsBeforeCompaction=1, and the problem stopped occurring.
You probably do have a memory leak that gets cleared every 24 hours.
