GC gets triggered often - java

I would like to understand why the GC gets triggered even though I have plenty of unused heap left. I have allocated 1.7 GB of RAM, yet I still often see about 10% GC CPU usage.
I use -XX:+UseG1GC with Java 17.

A JVM always has some GC threads running (unless you use Epsilon GC, which performs no GC; I do not recommend it unless you know why you need it), because the JVM manages memory for you.
The heap in G1 is divided into two spaces: young and old. All objects are created in the young space. When the young space fills up (it always does eventually, unless you are writing zero-garbage code), it triggers a collection that cleans unreferenced objects out of the young space and promotes some of the objects that are still referenced to the old space.
The spikes in the right-hand screenshot correspond to young collection events (where unreferenced objects get cleaned). The young space is always much smaller than the old space, so it fills up frequently. That is why you see those spikes even though there is much more memory free.
DISCLAIMER: This is a very high-level explanation of memory management in the JVM. Some important concepts have not been mentioned.
You can read more about the G1 collector here
Also take a look at the jstat tool, which will help you understand what is happening in your heap.
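As a complement to jstat, the collection counts can also be read from inside the process through the standard java.lang.management API. The following is only a minimal sketch: the class name GcOverview and the allocation loop are made up for illustration, and the collector names printed depend on the JVM and the selected collector (with G1 they are typically a young and an old generation collector).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcOverview {

    /** One line per collector: name, number of collections, accumulated time. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are -1 if the collector does not report them.
            lines.add(String.format("%s: count=%d, time=%d ms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create short-lived garbage so the young space has a chance to fill up.
        // The loop size is arbitrary and only for illustration.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] chunk = new byte[1024];
            chunk[0] = (byte) i;
            checksum += chunk[0];
        }
        System.out.println("checksum=" + checksum);
        describeCollectors().forEach(System.out::println);
    }
}
```

Running it with a small heap (for example -Xmx64m) makes it more likely that the young collector count is non-zero even though most of the heap is still free.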

Related

Minor GC pause times are too high. Possible reasons?

I am experiencing regular high minor GC pause times (~9 seconds).
The application is a server written in Java, executing 3 transactions/second.
This happens even though there is no excessive I/O activity.
Heap parameters are:
-Xms1G
-Xmx14G
-XX:+UseConcMarkSweepGC
-XX:+DisableExplicitGC
-XX:+PrintGC
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
What are the possible reasons for such minor gc pause times values?
As the other answer says, without GC log snippets it's not possible to answer this definitively. Some things to think about:
Assuming no impact from the underlying OS (scheduling, CPU thrashing), the time taken for a minor collection is proportional to the amount of live data in the young gen when the collector runs (every live object in the young gen gets copied during a minor GC). Looking at the graph of your old gen, you are seeing consistent growth, which would indicate that you are promoting significant amounts of data during minor GCs. You're either creating a lot of long-lived objects or maintaining references unnecessarily.
To reduce the pauses you could try reducing the size of the Eden space (so there is less potential data to copy on each minor GC) and also reduce the tenuring threshold so that objects get moved out of the survivor spaces more quickly. The downside of this will be your minor GCs will happen more frequently so you will probably see a degradation in throughput.
I would also change the -Xms value. You clearly need more than 1 GB in your heap, so it would be best to set it to 14 GB (the same as -Xmx) to avoid the heap having to be resized by the JVM as the amount of data increases.
For questions in the category "Why is my GC pause that long?" you should always provide some snippets from the GC logs.
As pure speculation, here are a few reasons why a minor GC may be abnormally slow:
The JVM was suspended from execution by the OS (e.g. CPU starvation, swapping, virtual server freeze)
There is some issue with putting the JVM at a safepoint (unlikely, looking at your pause pattern, though)
Object survival spikes
Reference object processing overhead (you need to add -XX:+PrintReferenceGC to get reference processing info into the GC log)
Try using
-XX:+UseG1GC -XX:MaxGCPauseMillis=1000
This will try to keep max GC pause below 1s.
You need to assign enough memory using -Xmx and set the MaxGCPauseMillis as per need.
IMHO:
There is not sufficient evidence that there is indeed a minor GC "pause time" problem. What is shown is garbage collection time, and GC time != GC pause time. Garbage collection activity time and garbage collection pause time are two different beasts; in technical terms, these are separate JVM ergonomics/performance goals.
Minor GC pause time is commonly negligible, and it is the major GC pause time that affects application responsiveness.
There is something called JVM ergonomics goals, and the "throughput goal" is one of the three. So I think the most that can be said in this case is that the throughput goal is not being met very well for the young generation, as a lot of time is spent on GC.
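To make the throughput view concrete, the share of elapsed time attributed to GC can be estimated from the collector MXBeans. This is a rough sketch, not a pause-time measurement: getCollectionTime() reports approximate accumulated elapsed collection time, which is not necessarily time the application was stopped. The class and method names (GcTimeShare, totalGcMillis, gcShare) are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcTimeShare {

    /** Sum of the accumulated collection time over all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 means "not available"
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of elapsed time attributed to GC; 0.0 if the uptime is not positive. */
    static double gcShare(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, 100.0 * gcShare(gc, uptime));
    }
}
```

A high share here says the throughput goal is suffering; it says nothing by itself about how long any individual pause was. For that, the GC log is still needed.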

Why does GC happen even when there is lots of unused memory left

(Committed and Max lines are the same)
I am looking at the memory usage for a Java application in newrelic. Here are several questions:
# 1
The committed PS Survivor Space heap varied over the past few days. Shouldn't it be constant, since it is configured by the JVM?
# 2
From what I understand, heap usage should decrease when there is a garbage collection. Eden usage can decrease when a major or a minor GC happens, while Old usage can decrease when a major GC happens.
But if you look at Old memory usage, some time between June 6th and 7th the memory went up and then later went down. This should mean that a major GC happened, right? However, there was still lots of unused memory left; usage did not seem to get anywhere near the limit. So how was the major GC triggered? The same goes for Eden memory usage: it never reached the limit, but it still decreased.
The application fetches a file from other places. This file could be large and be processed in memory. Could this explain the issue above?
You need to provide more information about your configuration for this to be answered definitively. I will assume you are using the HotSpot JVM from Oracle and that you are using the G1 collector. Posting the flags you start the JVM with would also be useful.
The key term here is 'committed'. This is memory reserved by the JVM, but not necessarily in use (or even mapped to physical pages, it's just a range of virtual memory that can be used by the JVM). There's a good description of this in the MemoryUsage class of the java.lang.management package (check the API docs). It says, "committed represents the amount of memory (in bytes) that is guaranteed to be available for use by the Java virtual machine. The amount of committed memory may change over time (increase or decrease). The Java virtual machine may release memory to the system..." This is why you see it change.
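The difference between used, committed and max can be inspected per memory pool with the same java.lang.management package. A minimal sketch follows; the class name HeapPools is made up for illustration, and the pool names printed (for example an Eden, a Survivor and an Old pool) depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class HeapPools {

    /** One line per heap memory pool with its used, committed and max sizes in bytes. */
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // max is -1 when it is undefined for the pool
            lines.add(String.format("%s: used=%d committed=%d max=%d",
                    pool.getName(), usage.getUsed(), usage.getCommitted(), usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeHeapPools().forEach(System.out::println);
    }
}
```

Printing this periodically should show the committed value of individual pools moving over time while the configured maximum stays the same.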
Assuming you are using G1 then the collector performs incremental compaction. You are correct that if the collector could not keep up with allocation in the old gen and it was getting low on space it would perform a full compacting collection. This is not happening here as the last graph shows you are using nowhere near the allocated heap space. However, to avoid this G1 will collect and compact concurrently with your application. This is why you see usage go up (as your application instantiates more objects) and then go down (as the G1 collector reclaims space from no longer required objects). For a more detailed explanation of how G1 works there is a good read in the documentation, https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/g1_gc.html.

What causes the JVM to do a major garbage collection?

I have a Java app which shows different GC behaviors in different environments. In one environment, the heap usage graph is a slow sawtooth with major GCs every 10 hours or so, only when the heap is >90% full. In another environment, the JVM does major GCs every hour on the dot (the heap is normally between 10% and 30% at these times).
My question is, what are the factors which cause the JVM to decide to do a major GC?
Obviously it collects when the heap is nearly full, but there is some other cause at play which I am guessing is related to an hourly scheduled task within my app (although there is no spike in memory usage at this time).
I assume GC behaviour depends heavily on the JVM; I am using:
Java HotSpot(TM) 64-Bit Server VM 1.7.0_21 Oracle Corporation
No specific GC options, so using the default settings for 64-bit server (PS MarkSweep and PS Scavenge)
Other info:
This is a web app running in Tomcat 6.
Perm gen hovers around 10% in both environments.
The environment with the sawtooth behaviour has 7Gb max heap, the other has 14Gb.
Please, no guesswork. The JVM must have rules for deciding when to perform a major GC, and these rules must be encoded deep in the source somewhere. If anyone knows what they are, or where they are documented, please share!
I have found four conditions that can cause a major GC (given my JVM config):
The old gen area is full (even if it can be grown, a major GC will still be run first)
The perm gen area is full (even if it can be grown, a major GC will still be run first)
Someone is manually calling System.gc(): a bad library or something related to RMI (see links 1, 2 and 3)
The young gen areas are all full and nothing is ready to be moved into old gen (see 1)
As others have commented, cases 1 and 2 can be improved by allocating plenty of heap and permgen, and setting -Xms and -Xmx to the same value (along with the perm equivalents) to avoid dynamic heap resizing.
Case 3 can be avoided using the -XX:+DisableExplicitGC flag.
Case 4 requires more involved tuning, e.g., -XX:NewRatio=N (see Oracle's tuning guide).
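One way to see which of these conditions actually fires is to log the cause the JVM reports for each collection. The sketch below assumes a HotSpot-based JVM, because it uses the com.sun.management extension API (GarbageCollectionNotificationInfo) rather than a standard one, and it uses lambdas, so it needs Java 8 or later. The class name GcCauseLogger is made up for illustration. On HotSpot an explicit request is typically reported with the cause "System.gc()".

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

final class GcCauseLogger {

    /** Registers a logging listener on every collector that emits notifications; returns how many. */
    static int install() {
        NotificationListener listener = (notification, handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                    .equals(notification.getType())) {
                return; // not a GC notification
            }
            GarbageCollectionNotificationInfo info = GarbageCollectionNotificationInfo
                    .from((CompositeData) notification.getUserData());
            System.out.printf("%s | %s | cause: %s | %d ms%n",
                    info.getGcName(), info.getGcAction(), info.getGcCause(),
                    info.getGcInfo().getDuration());
        };
        int installed = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gc instanceof NotificationEmitter) {
                ((NotificationEmitter) gc).addNotificationListener(listener, null, null);
                installed++;
            }
        }
        return installed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("listeners installed: " + install());
        System.gc();        // explicit request; ignored when -XX:+DisableExplicitGC is set
        Thread.sleep(1000); // notifications are delivered asynchronously
    }
}
```

Calling install() once at application startup and watching the output around the hourly collections would show whether they are explicit requests or triggered by a full generation.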
Garbage collection is a pretty complicated topic, and while you could learn all the details about this, I think what’s happening in your case is pretty simple.
Sun’s Garbage Collection Tuning guide, under the “Explicit Garbage Collection” heading, warns:
applications can interact with garbage collection … by invoking full garbage collections explicitly … This can force a major collection to be done when it may not be necessary … One of the most commonly encountered uses of explicit garbage collection occurs with RMI … RMI forces full collections periodically
That guide says that the default time between garbage collections is one minute, but the sun.rmi Properties reference, under sun.rmi.dgc.server.gcInterval says:
The default value is 3600000 milliseconds (one hour).
If you’re seeing major collections every hour in one application but not another, it’s probably because the application is using RMI, possibly only internally, and you haven’t added -XX:+DisableExplicitGC to the startup flags.
Disable explicit GC, or test this hypothesis by setting -Dsun.rmi.dgc.server.gcInterval=7200000 and observing if GCs happen every two hours instead.
It depends on your configuration, since HotSpot configures itself differently in different Java environments. For example, on a server with more than 2GB and two processors some JVMs will be configured in '-server' mode instead of the default '-client' mode, which configures the sizes of the memory spaces (generations) differently, and that has an impact on when garbage collection will occur.
A full GC can occur automatically, but also if you call the garbage collector in your code (ex: using System.gc()). Automatically, it depends on how the minor collections are behaving.
There are at least two algorithms being used. If you are using defaults, a copying algorithm is used for minor collections, and a mark-sweep algorithm for major collections.
A copying algorithm consists of copying used memory from one block to another, and then clearing the space containing the blocks with no references to them. The copying algorithm in the JVM uses a large area for objects that are created for the first time (called Eden), and two smaller ones (called survivors). Surviving objects are copied once from Eden and several times from the survivor spaces during each minor collection until they become tenured and are copied to another space (called tenured space) where they can only be removed in a major collection.
Most of the objects in Eden die quickly, so the first collection copies the surviving objects to the survivor spaces (which are by default much smaller). There are two survivors s1 and s2. Every time the Eden fills, the surviving objects from Eden and s1 are copied to s2, Eden and s1 are cleared. Next time, survivors from Eden and s2 are copied back to s1. They keep on being copied from s1 to s2 to s1 until a certain number of copies is reached, or because a block is too big and doesn't fit, or some other criteria. Then the surviving memory block is copied to the tenured generation.
The tenured objects are not affected by the minor collections. They accumulate until the area gets full (or the garbage collector is called). Then the JVM will run a mark-sweep algorithm in a major collection which will preserve only the surviving objects that still have references.
If you have larger objects that don't fit into the survivors, they might be copied directly to the tenured space, which will fill more quickly and you will get major collections more frequently.
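The copying scheme described above can be illustrated with a toy model. This is not how HotSpot is implemented; it is a simplified simulation in which each object is given a made-up "life" (the number of minor collections it stays referenced), and the names ToyGenerationalHeap, tenuringThreshold and survivorCapacity are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyGenerationalHeap {

    /** Toy object: stays referenced for 'life' more minor collections; age counts survived ones. */
    static final class Obj {
        int life;
        int age;
        Obj(int life) { this.life = life; }
    }

    final int tenuringThreshold; // age at which an object is promoted
    final int survivorCapacity;  // number of objects one survivor space can hold
    final List<Obj> eden = new ArrayList<>();
    final List<Obj> tenured = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();   // empty survivor space, target of the next copy

    ToyGenerationalHeap(int tenuringThreshold, int survivorCapacity) {
        this.tenuringThreshold = tenuringThreshold;
        this.survivorCapacity = survivorCapacity;
    }

    void allocate(int life) {
        eden.add(new Obj(life)); // new objects always start in Eden
    }

    void minorCollection() {
        copyLive(eden);
        copyLive(from);
        eden.clear(); // whatever was not copied is garbage
        from.clear();
        List<Obj> tmp = from; // swap the roles of the two survivor spaces
        from = to;
        to = tmp;
    }

    private void copyLive(List<Obj> space) {
        for (Obj o : space) {
            if (o.life == 0) {
                continue; // no longer referenced: not copied, so effectively collected
            }
            o.life--;
            o.age++;
            if (o.age >= tenuringThreshold || to.size() >= survivorCapacity) {
                tenured.add(o); // old enough, or the survivor space is full: promote
            } else {
                to.add(o);
            }
        }
    }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = new ToyGenerationalHeap(2, 10);
        heap.allocate(0); // dies before the first collection
        heap.allocate(1); // survives one collection
        heap.allocate(5); // long-lived
        heap.minorCollection();
        System.out.println("survivors=" + heap.from.size() + " tenured=" + heap.tenured.size());
        heap.minorCollection();
        System.out.println("survivors=" + heap.from.size() + " tenured=" + heap.tenured.size());
    }
}
```

In this model, lowering survivorCapacity or tenuringThreshold makes objects reach the tenured list sooner, which mirrors the premature promotion discussed above.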
Also, the sizes of the survivor spaces, the number of copies between s1 and s2, the Eden size relative to the size of s1 and s2, and the size of the tenured generation may all be configured differently in different environments by JVM ergonomics, which may automatically select a -server or -client behavior. You might try to run both JVMs as -server or -client and check if they still behave differently.
Even if this will get down votes... My best guess (you will have to test this) would be that the heap needs to expand and when this happens a full gc will be triggered. Not all memory is allocated at once to JVM.
You can test this by setting -Xms and -Xmx to the same value, for example 7GB each
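Whether the heap still has room to expand can be checked from inside the application with the standard Runtime API. This is only a rough sketch; the class name HeapHeadroom and the helper canGrow are made up for illustration, and depending on the collector the two values may differ slightly even when -Xms and -Xmx are set to the same value.

```java
final class HeapHeadroom {

    /** True if the committed heap is smaller than the maximum, i.e. the heap can still grow. */
    static boolean canGrow(long committedBytes, long maxBytes) {
        return committedBytes < maxBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory(); // heap currently committed to the JVM
        long max = rt.maxMemory();         // upper bound the JVM will try to use (roughly -Xmx)
        System.out.printf("committed=%d MiB, max=%d MiB, can grow: %b%n",
                committed >> 20, max >> 20, canGrow(committed, max));
    }
}
```

If the committed value is well below the maximum in the environment with the frequent collections, heap expansion is at least a plausible candidate to test.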

Identify old gen in heap dump (or take heap dump of old gen only)

I think I have a memory leak.
(they say the first step is admitting the problem, right?)
Anyway, I think I do - see attached image for heap by regions.
Green is Eden, blue/red is S0/S1, purple is old. I have unlimited tenuring (>15), lots of time passed between memory being allocated and it spilling to old gen. Hence - a memory leak. I think.
So - the question - how can I analyze what is leaking? As you can see, my Eden is very active. Lots of objects are being created and destroyed all the time.
Is there a way of taking a heap dump of the old gen only? Or somehow identify the old gen in a full heap dump (if so, with what tool)?
Edit 1:
Clarification: I'm not doing anything that should retain objects in memory. Everything I allocate after the initial startup should die young.
Edit2:
New findings: I took a heap dump, GCed like crazy and took another. The second one shows a significantly reduced level of old gen usage. The main difference between the two were objects held by finalizers.
Don't finalizers run in young GC cycles? Do they always wait for a full GC to be cleaned?
Seeing some things propagate to old gen isn't a huge concern. After your old gen reaches a certain threshold a full GC will kick off. If that isn't able to reclaim the memory then you have an issue. The fact that you are seeing some memory being promoted during a young collection shouldn't be alarming.
"lots of time passed between memory being allocated and it spilling to old gen. Hence - a memory leak. I think"
Not really. Just because memory is being added to old gen doesn't mean it is a memory leak. It is normal during a young collection for older objects to get promoted to old gen; that is exactly when objects get added there. This may just be your application still ramping up. In large-scale applications there may be features not used every day, which may be getting into memory later than you expected.
That being said, if you really are concerned with any memory being added to the old gen and want to investigate further, I would recommend running this application on a demo environment. Attach a profiler (VisualVM will work) and load test (JMeter is good and free) your application. If you look at the objects you can get an idea of what generation an object is. You also want to see what happens when your old gen reaches a threshold where a full GC will kick off (normally in the 70%-90% range). If your old gen recovers back to the 20% threshold, then there is no leak. In some cases the old gen may never reach the point where a full GC gets kicked off, but instead level off as you expected. The load test will help identify that.
If it doesn't recover and you confirm you have a memory leak then you will want to capture a heap dump (hprof) and use a tool like MAT (Memory Analyzer Tool) to analyze the dump to find the culprit.
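Besides tools such as jmap or VisualVM, a dump can also be triggered from inside the application. The sketch below assumes a HotSpot-based JVM, since it relies on com.sun.management.HotSpotDiagnosticMXBean; the class name HeapDumper and the file name app.hprof are made up for illustration. The target file must not exist yet, and recent JDKs expect the .hprof extension. With live set to true only reachable objects are dumped, which typically means a full collection runs first.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

final class HeapDumper {

    /** Writes an hprof heap dump to the given file; the file must not exist yet. */
    static void dump(Path file, boolean live) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toString(), live);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        Path file = dir.resolve("app.hprof");
        dump(file, true); // live=true: dump only objects that are still reachable
        System.out.println("wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

The resulting file can then be opened in MAT. Comparing a dump taken with live=false against one taken with live=true gives a rough idea of how much of the heap was unreachable at that moment.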
Using JVisualVM (part of the JDK since Java 6 Build 10 or something like that), you can look at the TYPE of objects that are in memory. That will help you track down where the leak is. Of course, it takes a lot of digging into the code, but that's the best tool I've used that's always available and reliable.
Watch out for objects being passed around, it could be that you have a handle that's being kept in a list or array that's not being cleared out. I find that if I watch the number of objects being created, and kept, in JVisualVM over a period of a few minutes, I usually get an idea of where in the code to go dig for the offending objects not being released.

jvm conf for normal gc at high load

I have a server application based on Netty. It decodes a message (from JSON) and sends it back to the client (a simple echo). When a lot of messages are sent from one client (more than 15k/second), the garbage collector doesn't start and memory usage grows.
How can I configure the JVM to decrease GC pauses and decrease memory usage?
Your description sounds like a memory leak. Does the garbage collector eventually start, or do you end up with an OutOfMemoryError?
If you don't, then it sounds like you're running into a situation where objects are living long enough to get into the tenured generation (I'm assuming Sun JVM here). And the solution to that is to increase the size of the young generation relative to the tenured generation.
Here's a link that explains the Sun JVM generational collector (it's for the 1.5 JVM, but I believe that the options haven't changed for 1.6): http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html
The options that you would want to experiment with are NewRatio, which is the ratio between the young and tenured generations, and SurvivorRatio, which is the ratio between Eden and the two survivor spaces. I might try the following:
-XX:NewRatio=1 gives the young generation half of the object heap
-XX:SurvivorRatio=2 makes each survivor space be half that of Eden
These two settings will make the "Eden" space for new objects take 1/4 of the heap. This is pretty big, so hopefully most objects will spend their entire lives in Eden. The survivor ratio gives another 1/4 of the heap to the survivor spaces (1/8 to each), to hold objects with a medium life.
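The arithmetic behind those fractions can be written out as a small sketch. It assumes the usual meaning of the two flags (NewRatio is the old:young ratio, SurvivorRatio is the Eden:survivor ratio) and ignores the alignment and adaptive resizing a real JVM applies; the class name GenerationSizes and the 1024 MiB heap are made up for illustration.

```java
final class GenerationSizes {

    /** NewRatio is old:young, so the young generation gets heap / (NewRatio + 1). */
    static long young(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    /** SurvivorRatio is Eden:survivor and there are two survivors: young / (SurvivorRatio + 2) each. */
    static long survivor(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    /** Eden is what remains of the young generation after both survivor spaces. */
    static long eden(long young, int survivorRatio) {
        return young - 2 * survivor(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMiB = 1024; // hypothetical heap size
        long youngMiB = young(heapMiB, 1);
        System.out.println("young    = " + youngMiB + " MiB");              // 512, half of the heap
        System.out.println("eden     = " + eden(youngMiB, 2) + " MiB");     // 256, a quarter of the heap
        System.out.println("survivor = " + survivor(youngMiB, 2) + " MiB"); // 128 each, an eighth of the heap
    }
}
```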
Of course, don't blindly set options. Instead, use jconsole (part of the JDK distribution) to see what's really happening with your heap. You might find that the default survivor ratio of (1:6) is better than what I've suggested.
To configure the JVM to decrease GC pauses and memory usage, you need to choose an appropriate garbage collector. CMS is a low-pause collector. You can set -XX:+UseConcMarkSweepGC to enable it. You can also fine-tune other parameters such as
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=nn
to control when the concurrent collection cycle starts, which in turn influences GC pauses.
