Metaspace growth, dead classloader and GC - java

We have a situation where the metaspace of a Spring Boot microservice keeps growing while the heap is behaving well.
jmap -clstats shows there are thousands of dead classloaders of the following kind:
0x00000000e3c8e3c8 1 4087 0x00000000e0004d18 dead com/sun/org/apache/xalan/internal/xsltc/trax/TemplatesImpl$TransletClassLoader#0x000000010087c7e8
At the initial high-water mark a GC is triggered and I see a drop in metaspace. After this forced GC, triggered by the configured MetaspaceSize, metaspace keeps growing and more dead classloaders of the same kind are being kept in metaspace. I do see a bit of GC activity, but no drop in metaspace consumption. However, if I force a GC through VisualVM, a ton of classes are unloaded and metaspace consumption goes back to roughly what it was when the service started.
Why would the JVM-managed GC not unload these dead classloaders when a forced GC does? If weak/soft/phantom references are the reason, shouldn't that apply to a forced GC as well?
This is on Java 8. Can anyone give some pointers as to where I should look next? Obviously there is a leak, so is there a way to find the parent classloader of the TemplatesImpl$TransletClassLoader?
Appreciate any help.

The first thing to note is that the JVM clears Metaspace ONLY during a Full GC, NOT during a Young GC. So the behavior you are seeing is expected.
At the initial high-water mark a GC is triggered and I see a drop in metaspace.
If you check the GC trace, you will see a Full GC at that point (on Java 8 it is logged with a cause such as Metadata GC Threshold rather than as a normal young collection).
However, if I force a GC through VisualVM, a ton of classes are unloaded and metaspace consumption goes back to roughly what it was when the service started.
Again, that will be a Full GC triggered by VisualVM, and it can be seen in the GC trace. That is why you see the utilization drop.
From the description you provided, I don't think you have a classloader leak, because everything is getting cleaned up during a Full GC. From my experience, I would only consider something a "memory leak" if the app unnecessarily keeps objects alive so that they cannot be garbage collected even during a Full GC. In your case I would suggest limiting Metaspace size with the -XX:MaxMetaspaceSize flag. When occupancy reaches that threshold, the JVM will automatically trigger a Full GC and, as you noticed, Metaspace usage will drop. Set that limit judiciously, because too low a value will cause java.lang.OutOfMemoryError: Metaspace.
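If you want to watch the high-water mark yourself (for example to correlate metaspace growth with GC activity before and after setting -XX:MaxMetaspaceSize), here is a minimal sketch using the standard java.lang.management API. The pool name "Metaspace" is what HotSpot on Java 8 reports; treat it as an assumption if you run a different JVM.

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;

    public class MetaspaceWatcher {
        public static void main(String[] args) throws InterruptedException {
            while (true) {
                for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                    // "Metaspace" is the HotSpot pool name; other JVMs may name it differently.
                    if ("Metaspace".equals(pool.getName())) {
                        System.out.printf("Metaspace used=%d KB, committed=%d KB%n",
                                pool.getUsage().getUsed() / 1024,
                                pool.getUsage().getCommitted() / 1024);
                    }
                }
                Thread.sleep(5000); // sample every 5 seconds
            }
        }
    }

Logging this alongside the GC trace makes it easy to see whether class unloading only happens on Full GCs, which is the behavior described above.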
Some more details on Metaspace can be found here.

Related

GC gets triggered often

I would like to understand why GC gets triggered even though I have plenty of heap left unused. I have allocated 1.7 GB of RAM, and I still often see around 10% GC CPU usage.
I use -XX:+UseG1GC with Java 17.
The JVM always has some GC threads running, because it manages memory for you (unless you use Epsilon GC, which performs no GC; I do not recommend it unless you know exactly why you need it).
The heap in G1 is divided into two spaces: young and old. All objects are created in the young space. When the young space fills up (it always does eventually, unless you are writing zero-garbage code), a GC is triggered that cleans unreferenced objects out of the young space and promotes objects that are still referenced into the old space.
The spikes in the right-hand screenshot correspond to those young collection events (where unreferenced objects get cleaned up). The young space is always much smaller than the old space, so it fills up frequently. That is why you see those spikes even though there is plenty of free memory.
DISCLAIMER: This is a very high-level explanation of memory management in the JVM; some important concepts have not been mentioned.
You can read more about the G1 collector here.
Also take a look at the jstat tool, which will help you understand what is happening in your heap.
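If you prefer to check this from inside the JVM rather than with jstat, below is a minimal sketch using the standard GarbageCollectorMXBean API. With G1 the collectors are usually named "G1 Young Generation" and "G1 Old Generation", but the exact names depend on the collector in use, so treat them as assumptions.

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcCounters {
        public static void main(String[] args) {
            // Each collector (young and old with G1) exposes its own count and accumulated time.
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: collections=%d, time=%d ms%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
        }
    }

A young-generation count that keeps climbing while the old-generation count stays flat matches the spikes described above: frequent young collections even though most of the heap is free.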

Tenured Generation Garbage Collection is Not clearing

I have a Java Spring Boot project. When the code related to multi-threading (ExecutorService) is executed, memory fills up and GC is not clearing it. After reading the GC docs, I came to understand that the tenured memory is not getting cleared.
By monitoring the JVM with a Java profiler, I noticed that the Tenured Generation never gets cleared (until it is full, in my case).
How can I make the GC clear the tenured space?
We are running the app using a Docker image.
There are two potential issues here.
The garbage collector can only release objects that are unreachable. So if the tenured objects are still reachable, they won't ever be released. This is a memory leak scenario.
The JVM could simply not be running the old / tenured space collection because it doesn't need to. The JVM will typically only run the old space collector when it thinks it is necessary or economical to do so. This is actually a good thing, because most of the easily (cheaply) collectable garbage is normally in the new space.
After reading the GC docs, I came to understand that the tenured memory is not getting cleared.
That's probably a misreading of the documentation. The GC will collect objects in tenured space. It should happen before you start getting OOMEs.
How can I make the GC clear the tenured space?
You can call System.gc() ... which gives the JVM a hint that it would be a good thing to run the GC now. However:
It is only a hint. The JVM may ignore it.
It may or may not run an old space collection.
It won't "clear" the old generation. It won't necessarily even release all of the unreachable objects.
It is generally inefficient to call System.gc(). The JVM typically has a better idea of the most efficient time to run the GC, and which kind of GC to run.
Running the GC is probably not actually going to release memory for the rest of the system to use.
In short, it is probably a BAD IDEA to call System.gc(). In most cases, it doesn't achieve anything.
It certainly won't help if your real problem is a memory leak. If you are getting OOMEs, it won't make them go away.
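If you want to test point 1 (whether the tenured objects are actually unreachable), one quick diagnostic, not something to leave in production code, is to read the old-gen pool usage before and after a System.gc() hint. This is only a sketch; the old-gen pool name ("PS Old Gen", "G1 Old Gen", "Tenured Gen", ...) depends on which collector you run.

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;

    public class OldGenCheck {
        public static void main(String[] args) {
            printOldGen("before");
            System.gc(); // only a hint; the JVM may ignore it
            printOldGen("after");
        }

        static void printOldGen(String label) {
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                // Old-gen pool names vary by collector: "PS Old Gen", "G1 Old Gen", "Tenured Gen", ...
                if (pool.getName().contains("Old") || pool.getName().contains("Tenured")) {
                    System.out.printf("%s: %s used=%d MB%n",
                            label, pool.getName(), pool.getUsage().getUsed() / (1024 * 1024));
                }
            }
        }
    }

If usage barely drops even when the hint is honored, the tenured objects are still reachable and you are back in the memory-leak scenario from point 1.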

Metaspace out of memory even with 20-30% or more space free

A Keycloak/JBoss server running on Java 8 was switched to G1GC and -XX:MaxMetaspaceSize was set to 256MB. It soon stopped responding, with logs filling up with OutOfMemoryError: Metaspace errors. GC logs were not enabled. The server was hooked up to the Dynatrace monitoring service, which showed plenty of metaspace available:
191MB used out of 256MB allocated on one occasion.
165MB used out of 256MB allocated on a second occurrence.
I understand that to free metaspace a GC was invoked and, being unable to free enough space, it ran GC after GC, so it makes sense that the process was stuck. However, what I am unable to understand is why the process was running out of metaspace even when plenty was free. No other JVM parameters were provided (except -Xms/-Xmx). All heap memory sections had plenty of free space. Was something trying to allocate 60MB+ of metaspace in one go? Is that the only possible explanation?

Why GC happens even when there is lots of unused memory left

(Committed and Max lines are the same)
I am looking at the memory usage of a Java application in New Relic. Here are several questions:
# 1
The committed PS Survivor Space heap varied over the past few days. But shouldn't it be constant, since it is configured by the JVM?
# 2
From what I understand, heap memory usage should decrease when there is a garbage collection. Eden usage can decrease when a major GC or a minor GC happens, while Old usage can only decrease when a major GC happens.
But if you look at the Old memory usage, some time between June 6th and 7th the usage went up and then later came down. That should mean a major GC happened, right? However, there was still lots of unused memory left; it didn't look anywhere near the limit. So how was the major GC triggered? The same goes for Eden usage: it never reached the limit but still decreased.
The application fetches a file from another system. The file can be large and is processed in memory. Could this explain the behavior above?
You need to provide more information about your configuration to answer this definitively; I will assume you are using the HotSpot JVM from Oracle and the G1 collector. Posting the flags you start the JVM with would also be useful.
The key term here is 'committed'. This is memory reserved by the JVM, but not necessarily in use (or even mapped to physical pages, it's just a range of virtual memory that can be used by the JVM). There's a good description of this in the MemoryUsage class of the java.lang.management package (check the API docs). It says, "committed represents the amount of memory (in bytes) that is guaranteed to be available for use by the Java virtual machine. The amount of committed memory may change over time (increase or decrease). The Java virtual machine may release memory to the system..." This is why you see it change.
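To see the used/committed/max distinction for yourself, here is a minimal sketch using the MemoryMXBean and MemoryUsage classes mentioned above:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;

    public class HeapUsage {
        public static void main(String[] args) {
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            // 'committed' is what the JVM has currently reserved for the heap;
            // 'used' is what objects occupy right now; 'max' roughly corresponds to -Xmx.
            System.out.printf("used=%d MB, committed=%d MB, max=%d MB%n",
                    heap.getUsed() / (1024 * 1024),
                    heap.getCommitted() / (1024 * 1024),
                    heap.getMax() / (1024 * 1024));
        }
    }

A monitoring agent such as New Relic reports essentially these numbers, which is why the committed line can move up and down independently of the used line.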
Assuming you are using G1, the collector performs incremental compaction. You are correct that if the collector could not keep up with allocation in the old gen and was getting low on space, it would perform a full compacting collection. That is not happening here, as the last graph shows you are using nowhere near the allocated heap space. However, to avoid that situation G1 collects and compacts concurrently with your application. This is why you see usage go up (as your application instantiates more objects) and then go down (as the G1 collector reclaims space from objects that are no longer required). For a more detailed explanation of how G1 works there is a good read in the documentation: https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/g1_gc.html.

Identify old gen in heap dump (or take heap dump of old gen only)

I think I have a memory leak.
(they say the first step is admitting the problem, right?)
Anyway, I think I do - see the attached image of heap usage by region.
Green is Eden, blue/red is S0/S1, purple is old. I have unlimited tenuring (>15), and lots of time passed between memory being allocated and it spilling to old gen. Hence - a memory leak. I think.
So - the question - how can I analyze what is leaking? As you can see, my Eden is very active; lots of objects are being created and destroyed all the time.
Is there a way of taking a heap dump of the old gen only? Or of somehow identifying the old gen in a full heap dump (if so, with what tool)?
Edit 1:
Clarification: I'm not doing anything that should retain objects in memory. Everything I allocate after the initial startup should die young.
Edit 2:
New findings: I took a heap dump, GCed like crazy, and took another. The second one shows a significantly reduced level of old gen usage, and the main difference between the two was objects held by finalizers.
Don't finalizers run in young GC cycles? Do they always wait for a full GC before being cleaned up?
Seeing some things propagate to the old gen isn't a huge concern. Once your old gen reaches a certain threshold a full GC will kick off; if that isn't able to reclaim the memory, then you have an issue. The fact that some memory gets promoted during a young collection shouldn't be alarming.
lots of time passed between memory being allocated and it spilling to old gen. Hence - a memory leak. I think
Not really. Just because memory is being added to the old gen doesn't mean it is a memory leak. It is normal for older objects that survive young collections to get promoted to the old gen. This may just be your application still ramping up: in large-scale applications there may be features that are not used every day, which get loaded into memory later than you expected.
That being said, if you really are concerned about memory being added to the old gen and want to investigate further, I would recommend running the application in a demo environment. Attach a profiler (VisualVM will work) and load-test your application (JMeter is good and free). If you look at the objects you can get an idea of which generation each object is in. You also want to see what happens when your old gen reaches the threshold where a full GC kicks off (normally in the 70%-90% range). If your old gen recovers back to around the 20% level, then there is no leak. In some cases the old gen may never reach the point where a full GC gets kicked off, and instead levels off as you expected; the load test will help identify that.
If it doesn't recover and you confirm you have a memory leak then you will want to capture a heap dump (hprof) and use a tool like MAT (Memory Analyzer Tool) to analyze the dump to find the culprit.
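If you cannot attach a profiler to the environment, you can also capture the dump programmatically. The sketch below uses the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean, so it assumes an Oracle/OpenJDK HotSpot JVM; passing live=true restricts the dump to reachable objects, which gives you roughly the "dump, GC like crazy, dump again" comparison from the question in a single step.

    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    public class DumpHeap {
        public static void main(String[] args) throws Exception {
            HotSpotDiagnosticMXBean diag =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // live=true keeps only objects that are still reachable;
            // live=false keeps everything, including garbage not yet collected.
            diag.dumpHeap("heap-live.hprof", true);
        }
    }

Open the resulting .hprof file in MAT and compare the dominator trees of a live-only dump and a full dump to see what is actually being retained.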
Using JVisualVM (part of the JDK since Java 6 Build 10 or something like that), you can look at the TYPE of objects that are in memory. That will help you track down where the leak is. Of course, it takes a lot of digging into the code, but it's the best tool I've used that is always available and reliable.
Watch out for objects being passed around; it could be that you have a handle being kept in a list or array that is never cleared out. I find that if I watch the number of objects being created, and kept, in JVisualVM over a few minutes, I usually get an idea of where in the code to dig for the offending objects that are not being released.
