I would like to understand why the GC gets triggered even though I have plenty of heap left unused. I have allocated 1.7 GB of RAM, yet I still often see around 10% GC CPU usage.
I'm using -XX:+UseG1GC with Java 17.
JVMs always have some GC threads running because the JVM manages memory for you. (The exception is Epsilon GC, which performs no collection at all; I don't recommend it unless you know exactly why you need it.)
The heap in G1 is divided into two spaces: young and old. All objects are created in the young space. When the young space fills up (it eventually always does, unless your code allocates zero garbage), a young collection is triggered: unreferenced objects are cleaned out of the young space and objects that are still referenced are promoted to the old space.
The spikes in the right screenshot correspond to those young collection events (where unreferenced objects get cleaned). The young space is always much smaller than the old space, so it fills up frequently. That is why you see those spikes even though plenty of heap is still free.
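The effect is easy to reproduce. Below is a minimal, hypothetical Java sketch (the class name and allocation sizes are my own invention) that allocates nothing but short-lived garbage and reads the collection counters through the GC MXBeans; with default heap settings the count climbs even though no memory is ever retained:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class YoungGcDemo {

    // Total number of collections across all collectors (young + old).
    static long gcCount() {
        long n = 0;
        for (GarbageCollectorMXBean b : ManagementFactory.getGarbageCollectorMXBeans()) {
            n += Math.max(0, b.getCollectionCount()); // -1 means "unavailable"
        }
        return n;
    }

    public static void main(String[] args) {
        long before = gcCount();
        // ~5 GB of garbage that dies immediately: the heap as a whole never
        // "fills up", but the much smaller young space does, over and over.
        for (int i = 0; i < 50_000; i++) {
            byte[] junk = new byte[100_000];
            if (junk[0] != 0) System.out.println(); // keep the allocation observable
        }
        System.out.println("collections during loop: " + (gcCount() - before));
    }
}
```

Running jstat -gcutil <pid> 1000 against the same process shows the matching sawtooth in the E (Eden) column.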
DISCLAIMER: This is a very high-level explanation of memory management in the JVM. Some important concepts have not been mentioned.
You can read more about the G1 collector here.
Also take a look at jstat tool which will help you understand what is happening in your heap.
I'm able to use the jmap command to dump the JVM heap.
The issue is that my program shows heavier young-gen GC activity than the previous version (judging from the GC logs), but when I run a memory profiler, the biggest objects are always the ones in the old gen. That makes troubleshooting harder when the leak you need to find lives in the young gen.
Is there any way to profile the memory in the young gen heap?
Updated:
I used OpenJDK 1.8.
The profiling interface of the JVM does not provide GC-specific information, so profilers cannot tell you what GC generation an object is in.
However, some profilers have the ability to record how old an object is (effectively making their own "generations"), so you can restrict your analysis to younger objects only. JProfiler, for example, has a "Set mark" action that marks all objects in the heap.
After you create a heap dump, you can select objects that have been created after the mark action was invoked.
If you choose to record allocation times, you can select objects created in arbitrary time intervals in the "Time" view of the heap walker.
Disclaimer: My company develops JProfiler
We have a situation where the Metaspace of a Spring Boot microservice keeps growing but the heap is behaving well.
jmap -clstats shows there are thousands of dead classloaders of the following kind:
0x00000000e3c8e3c8 1 4087 0x00000000e0004d18 dead com/sun/org/apache/xalan/internal/xsltc/trax/TemplatesImpl$TransletClassLoader#0x000000010087c7e8
At the initial high-water mark, a GC is triggered and I see a drop in Metaspace. After this forced GC (triggered by the defined MetaspaceSize), Metaspace grows continuously and I see more dead classloaders of the same kind being kept in Metaspace. I do see a bit of GC activity but no drop in Metaspace consumption. However, if I force a GC through VisualVM, tons of classes are unloaded and Metaspace consumption goes back to where it was when the service started.
Why would a JVM-managed GC not unload these dead classloaders while a forced GC does? If weak/soft/phantom references are the reason, shouldn't that apply to a forced GC as well?
This is on Java 8. Can anyone give some pointers as to where I should look next? Obviously there is a leak, so is there a way to find the parent classloader of the TemplatesImpl$TransletClassLoader?
Appreciate any help.
First, the JVM clears Metaspace ONLY during a Full GC, NOT during a Young GC, so the behavior you are seeing is expected.
At the initial high watermark GC is being triggered and I see a drop in metaspace.
If you check the GC trace, you will see a System.gc() call. That's a Full GC.
However, if I force GC collection through visualvm, ton of classes are being unloaded and metaspace consumption is going back to the state when the service was started.
Again, that is a Full GC, this time triggered by VisualVM, and it can be seen in the GC trace. That is why you see the utilization drop.
From your description, I don't think you have a classloader leak, because everything is cleaned up during a Full GC. In my experience, I would only call something a "memory leak" if the app unnecessarily keeps objects alive so that they are never garbage collected, even during a Full GC. In your case I suggest limiting the Metaspace size with the -XX:MaxMetaspaceSize flag. When occupancy reaches that threshold, the JVM automatically triggers a Full GC and, as you noticed, Metaspace usage will drop. Set that limit judiciously: too low a value will cause java.lang.OutOfMemoryError: Metaspace.
Some more details on Metaspace can be found here.
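For context, the most common producer of those TemplatesImpl$TransletClassLoader instances is code that recompiles the same XSLT stylesheet on every request: each newTemplates() call generates a translet class inside a fresh loader. Here is a hedged sketch of the usual fix, compile once and cache the thread-safe Templates (the class and method names are made up for illustration):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TemplatesCache {

    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'><out><xsl:value-of select='/in'/></out></xsl:template>"
        + "</xsl:stylesheet>";

    // Each newTemplates() call compiles the stylesheet into a generated
    // "translet" class held by its own TransletClassLoader. Compiling per
    // request is what litters Metaspace with dead loaders, so compile once.
    private static volatile Templates CACHED;

    static Templates templates() throws TransformerException {
        Templates t = CACHED;
        if (t == null) {
            t = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(XSLT)));
            CACHED = t;
        }
        return t;
    }

    static String transform(String xml) throws TransformerException {
        // Transformer is not thread-safe, so create one per use, but from
        // the cached Templates, which loads no new classes.
        StringWriter out = new StringWriter();
        templates().newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform("<in>hello</in>"));
    }
}
```

As for finding the parent: in a heap dump, following the TransletClassLoader's parent field usually leads back to the loader of the code that called newTemplates().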
Two questions about the CMS collector:
Will ParNew run concurrently with a CMS old-gen collection?
In the GC log, I do not see the old-gen usage after a CMS collection. How can I check how much space was collected in the old gen and how much still remains?
Thanks,
Yes - ParNew will run whilst CMS is performing one of its concurrent phases. This can lead to GC log corruption as the JVM's logging is not thread-safe with respect to the GC threads.
CMS performs a concurrent sweep. While that sweep is running, a ParNew collection can promote objects into the old gen. The question "how much memory was collected by CMS?" is therefore neither very useful nor entirely meaningful.
We have a fairly big application running on a JBoss 7 application server. In the past, we used ParallelGC, but it gave us trouble on servers where the heap was large (5 GB or more) and usually nearly full: we would frequently get very long GC pauses.
Recently, we made improvements to our application's memory usage and in a few cases added more RAM to some of the servers where the application runs, but we also started switching to G1 in the hopes of making these pauses less frequent and/or shorter. Things seem to have improved but we are seeing a strange behaviour which did not happen before (with ParallelGC): the Perm Gen seems to fill up pretty quickly and once it reaches the max value a Full GC is triggered, which usually causes a long pause in the application threads (in some cases, over 1 minute).
We have been using 512 MB of max perm size for a few months, and during our analysis the perm size would usually stop growing at around 390 MB with ParallelGC. After we switched to G1, however, the behaviour above started happening. I tried increasing the max perm size to 1 GB and even 1.5 GB, but the Full GCs still happen (they are just less frequent).
In this link you can see some screenshots of the profiling tool we are using (YourKit Java Profiler). Notice how when the Full GC is triggered the Eden and the Old Gen have a lot of free space, but the Perm size is at the maximum. The Perm size and the number of loaded classes decrease drastically after the Full GC, but they start rising again and the cycle is repeated. The code cache is fine, never rises above 38 MB (it's 35 MB in this case).
Here is a segment of the GC log:
2013-11-28T11:15:57.774-0300: 64445.415: [Full GC 2126M->670M(5120M), 23.6325510 secs]
[Eden: 4096.0K(234.0M)->0.0B(256.0M) Survivors: 22.0M->0.0B Heap: 2126.1M(5120.0M)->670.6M(5120.0M)]
[Times: user=10.16 sys=0.59, real=23.64 secs]
You can see the full log here (from the moment we started up the server, up to a few minutes after the full GC).
Here's some environment info:
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
Startup options: -Xms5g -Xmx5g -Xss256k -XX:PermSize=1500M -XX:MaxPermSize=1500M -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy -Xloggc:gc.log
So here are my questions:
Is this the expected behaviour with G1? I found another post on the web of someone questioning something very similar and saying that G1 should perform incremental collections on the Perm Gen, but there was no answer...
Is there something I can improve/correct in our startup parameters? The server has 8 GB of RAM and we don't seem to be lacking hardware; the application performs fine until a Full GC is triggered, which is when users experience big lags and start complaining.
Causes of growing Perm Gen
Lots of classes, especially JSPs.
Lots of static variables.
There is a classloader leak.
For those that don't know, here is a simple way to think about how the PermGen fills up. The Young Gen doesn't get enough time to let things expire, so they get moved up to the Old Gen. The Perm Gen holds the classes for the objects in the Young and Old Gen. When the objects in the Young or Old Gen get collected and a class is no longer referenced, it gets 'unloaded' from the Perm Gen. If the Young and Old Gen don't get GC'd, then neither does the Perm Gen, and once it fills up it needs a Full stop-the-world GC. For more info see Presenting the Permanent Generation.
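The reference chain behind that 'unloading' rule can be made concrete. In this tiny, hypothetical sketch (class name made up), every object strongly references its Class, and every Class strongly references its defining ClassLoader, so one long-lived instance pins its loader's entire class set in the Perm Gen:

```java
public class RetentionDemo {
    public static void main(String[] args) {
        Object o = new RetentionDemo();
        Class<?> c = o.getClass();           // the object pins its Class
        ClassLoader l = c.getClassLoader();  // the Class pins its ClassLoader
        // While 'o' is reachable (say, promoted to Old Gen and sitting in a
        // cache), 'l' and every class 'l' defined stay in the Perm Gen; they
        // can only be unloaded by a collection that does class unloading
        // (here, a Full GC) after 'o' dies.
        System.out.println(c.getName() + " loaded by " + l);
    }
}
```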
Switching to CMS
I know you are using G1 but if you do switch to the Concurrent Mark Sweep (CMS) low pause collector -XX:+UseConcMarkSweepGC, try enabling class unloading and permanent generation collections by adding -XX:+CMSClassUnloadingEnabled.
The Hidden Gotcha
If you are using JBoss, RMI/DGC has gcInterval set to 1 minute. The RMI subsystem forces a full garbage collection once per minute. This in turn forces objects to be promoted instead of letting them be collected in the Young Generation.
You should change this to at least 1 hour, if not 24 hours, so that the GC can do proper collections:
-Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000
List of every JVM option
To see all the options, run this from the cmd line.
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version
If you want to see what JBoss is using, add the following to your standalone.xml. You will get a list of every JVM option and what it is set to. NOTE: the flags must be set in the JVM you want to inspect. If you run the command externally, you won't see what is happening inside the JVM that JBoss is running in.
set "JAVA_OPTS= -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal %JAVA_OPTS%"
There is a shortcut to use when you are only interested in the modified flags:
-XX:+PrintCommandLineFlags
Diagnostics
Use jmap to determine what classes are consuming permanent generation space. Output will show
class loader
# of classes
bytes
parent loader
alive/dead
type
totals
jmap -permstat JBOSS_PID >& permstat.out
JVM Options
These settings worked for me but depending how your system is set up and what your application is doing will determine if they are right for you.
-XX:SurvivorRatio=8 – Sets survivor space ratio to 1:8, resulting in larger survivor spaces (the smaller the ratio, the larger the space). The SurvivorRatio is the size of the Eden space compared to one survivor space. Larger survivor spaces allow short lived objects a longer time period to die in the young generation.
-XX:TargetSurvivorRatio=90 – Allows 90% of the survivor spaces to be occupied instead of the default 50%, allowing better utilization of the survivor space memory.
-XX:MaxTenuringThreshold=31 – To prevent premature promotion from the young to the old generation. Allows short lived objects a longer time period to die in the young generation (and hence, avoid promotion). A consequence of this setting is that minor GC times can increase due to additional objects to copy. This value and survivor space sizes may need to be adjusted so as to balance overheads of copying between survivor spaces versus tenuring objects that are going to live for a long time. The default settings for CMS are SurvivorRatio=1024 and MaxTenuringThreshold=0 which cause all survivors of a scavenge to be promoted. This can place a lot of pressure on the single concurrent thread collecting the tenured generation. Note: when used with -XX:+UseBiasedLocking, this setting should be 15.
-XX:NewSize=768m – Sets the initial young generation size.
-XX:MaxNewSize=768m – Sets the maximum young generation size.
Here is a more extensive JVM options list.
Is this the expected behaviour with G1?
I don't find it surprising. The base assumption is that stuff put into permgen almost never becomes garbage. So you'd expect that permgen GC would be a "last resort"; i.e. something the JVM would only do if it was forced into a full GC. (OK, this argument is nowhere near a proof, but it's consistent with the following.)
I've seen lots of evidence that other collectors have the same behaviour; e.g.
permgen garbage collection takes multiple Full GC
What is going on with java GC? PermGen space is filling up?
I found another post on the web of someone questioning something very similar and saying that G1 should perform incremental collections on the Perm Gen, but there was no answer...
I think I found the same post. But someone's opinion that it ought to be possible is not really instructive.
Is there something I can improve/corrrect in our startup parameters?
I doubt it. My understanding is that this is inherent in the permgen GC strategy.
I suggest that you either track down and fix whatever is using so much permgen in the first place, or switch to Java 8, which no longer has a permgen heap: see PermGen elimination in JDK 8.
While a permgen leak is one possible explanation, there are others; e.g.
overuse of String.intern(),
application code that is doing a lot of dynamic class generation; e.g. using DynamicProxy,
a huge codebase ... though that wouldn't cause the permgen churn you seem to be observing.
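To make the dynamic-class-generation point concrete, here is a small, hypothetical sketch (names made up) using java.lang.reflect.Proxy: each distinct proxy definition is a class generated at runtime, and such generated classes are exactly the kind of thing that occupies permgen:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDemo {

    interface Greeter { String greet(String name); }

    static Greeter greeter() {
        InvocationHandler h = (proxy, method, a) -> "hello " + a[0];
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                h);
    }

    public static void main(String[] args) {
        Greeter g = greeter();
        System.out.println(g.greet("world"));
        // The proxy's class was generated by the JVM at runtime. Proxy classes
        // are cached per (classloader, interface set), so this pattern alone
        // does not churn permgen; churn comes from frameworks that keep
        // generating bytecode for new interfaces or new classloaders.
        System.out.println(Proxy.isProxyClass(g.getClass()));
    }
}
```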
I would first try to find the root cause for the PermGen getting larger before randomly trying JVM options.
You could enable classloading logging (-verbose:class, -XX:+TraceClassLoading -XX:+TraceClassUnloading, ...) and check the output.
In your test environment, you could try monitoring (over JMX) when classes get loaded (java.lang:type=ClassLoading LoadedClassCount). This might help you find out which part of your application is responsible.
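When a remote JMX connection is inconvenient, the same java.lang:type=ClassLoading counters can also be read in-process; here is a minimal sketch (class name made up):

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class ClassLoadWatch {
    public static void main(String[] args) {
        // The same counters JMX exposes under java.lang:type=ClassLoading.
        ClassLoadingMXBean clb = ManagementFactory.getClassLoadingMXBean();
        System.out.printf("loaded=%d totalLoaded=%d unloaded=%d%n",
                clb.getLoadedClassCount(),       // classes currently loaded
                clb.getTotalLoadedClassCount(),  // loaded since JVM start
                clb.getUnloadedClassCount());    // unloaded since JVM start
        // A steadily climbing totalLoaded with a flat unloaded count is the
        // signature of whatever is filling the permgen.
    }
}
```

Log these periodically (or graph them in VisualVM) and correlate the growth with application activity.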
You could also try listing all the classes using the JVM tools (sorry, but I still mostly use JRockit, where you would do it with jrcmd; I hope Oracle has migrated those helpful features to HotSpot...).
In summary, find out what generates so many classes and then think how to reduce that / tune the gc.
Cheers,
Dimo
I agree with the answer above in that you should really try to find what is actually filling your permgen, and I'd strongly suspect a classloader leak whose root cause you want to find.
There's a thread in the JBoss forums that goes through a couple of such diagnosed cases and how they were fixed. This answer and this article discuss the issue in general as well. That article mentions possibly the easiest test you can do:
Symptom
This will happen only if you redeploy your application without restarting the application server. The JBoss 4.0.x series suffered from just such a classloader leak. As a result I could not redeploy our application more than twice before the JVM would run out of PermGen memory and crash.
Solution
To identify such a leak, un-deploy your application and then trigger a full heap dump (make sure to trigger a GC before that). Then check if you can find any of your application objects in the dump. If so, follow their references to their root, and you will find the cause of your classloader leak. In the case of JBoss 4.0 the only solution was to restart for every redeploy.
This is what I'd try first, IF you think redeployment might be involved. This blog post is an earlier one doing the same thing, but discussing the details as well. Based on that post, though, it may be that you're not actually redeploying anything and permgen is just filling up by itself. In that case, examining the classes (and anything else added to permgen) might be the way to go, as mentioned in the previous answer.
If that doesn't give more insight, my next step would be to try the Plumbr tool. They offer a sort of guarantee on finding the leak for you, as well.
You should start your server (e.g. server.bat) with the java command including -verbose:gc.