Shrinking survivor spaces lead to continuous full GC - java

I've had this troubling experience with a Tomcat server, which runs:
our Hudson server;
a staging version of our web application, redeployed 5-8 times per day.
The problem is that we end up with continuous garbage collection, even though the old generation is nowhere near being filled. I've noticed that the survivor spaces are next to nonexistent, and the garbage collector output looks like this:
[GC 103688K->103688K(3140544K), 0.0226020 secs]
[Full GC 103688K->103677K(3140544K), 1.7742510 secs]
[GC 103677K->103677K(3140544K), 0.0228900 secs]
[Full GC 103677K->103677K(3140544K), 1.7771920 secs]
[GC 103677K->103677K(3143040K), 0.0216210 secs]
[Full GC 103677K->103677K(3143040K), 1.7717220 secs]
[GC 103679K->103677K(3143040K), 0.0219180 secs]
[Full GC 103677K->103677K(3143040K), 1.7685010 secs]
[GC 103677K->103677K(3145408K), 0.0189870 secs]
[Full GC 103677K->103676K(3145408K), 1.7735280 secs]
The heap information before restarting Tomcat is:
Attaching to process ID 10171, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 14.1-b02
using thread-local object allocation.
Parallel GC with 8 thread(s)
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 3221225472 (3072.0MB)
NewSize = 2686976 (2.5625MB)
MaxNewSize = 17592186044415 MB
OldSize = 5439488 (5.1875MB)
NewRatio = 2
SurvivorRatio = 8
PermSize = 21757952 (20.75MB)
MaxPermSize = 268435456 (256.0MB)
Heap Usage:
PS Young Generation
Eden Space:
capacity = 1073479680 (1023.75MB)
used = 0 (0.0MB)
free = 1073479680 (1023.75MB)
0.0% used
From Space:
capacity = 131072 (0.125MB)
used = 0 (0.0MB)
free = 131072 (0.125MB)
0.0% used
To Space:
capacity = 131072 (0.125MB)
used = 0 (0.0MB)
free = 131072 (0.125MB)
0.0% used
PS Old Generation
capacity = 2147483648 (2048.0MB)
used = 106164824 (101.24666595458984MB)
free = 2041318824 (1946.7533340454102MB)
4.943684861063957% used
PS Perm Generation
capacity = 268435456 (256.0MB)
used = 268435272 (255.99982452392578MB)
free = 184 (1.7547607421875E-4MB)
99.99993145465851% used
The relevant JVM flags passed to Tomcat are:
-verbose:gc -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFFE -Xmx3g -XX:MaxPermSize=256m
Please note that the survivor spaces are sized at about 40 MB at startup.
How can I avoid this problem?
Updates:
The JVM version is
$ java -version
java version "1.6.0_15"
Java(TM) SE Runtime Environment (build 1.6.0_15-b03)
Java HotSpot(TM) 64-Bit Server VM (build 14.1-b02, mixed mode)
I'm going to look into bumping up the PermGen size and seeing if that helps; the sizing of the survivor spaces was probably unrelated.

The key is probably the PS Perm Generation, which is at 99.999% usage (only 184 bytes out of 256 MB free).
Usually I'd suggest giving it more perm gen, but you already gave it 256 MB, which should be plenty. My guess is that you have a memory leak in some code-generation library. Perm gen is mostly used for the bytecode of classes.
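If you want to confirm which classloaders are holding on to those classes, JDK 6's jmap has a permgen statistics mode; as a starting point (using the same process ID as in the jmap output above):
jmap -permstat 10171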

It's very easy to end up with ClassLoader leaks: all it takes is a single object loaded through the webapp's ClassLoader being referenced by an object that was not loaded by it. A constantly redeployed app will then quickly fill the PermGen space.
This article explains what to look out for, and a follow-up describes how to diagnose and fix the problem.
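As a hypothetical illustration of that pattern (the class and method names here are invented, not taken from the articles above), a class loaded by the shared parent classloader that caches an object created inside the webapp is enough to pin the webapp's entire ClassLoader, and with it every class it loaded:
// Hypothetical class loaded by the shared (parent) classloader, e.g. from a jar in tomcat/lib.
public class SharedRegistry {
    // A static cache lives as long as the shared classloader itself.
    private static final java.util.Map<String, Object> CACHE = new java.util.HashMap<String, Object>();
    public static void put(String key, Object value) {
        CACHE.put(key, value);
    }
}
// Hypothetical class packaged inside the webapp, loaded by the webapp's classloader.
public class WebappComponent {
    public void register() {
        // The cached reference keeps WebappComponent.class alive, which keeps the
        // webapp classloader alive, which keeps every class it loaded in PermGen,
        // even after the webapp is undeployed or redeployed.
        SharedRegistry.put("component", this);
    }
}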

I think this is not that uncommon for an application server that gets continuously deployed to. The perm gen space, which is full in your case, is where classes go. Keep in mind that JSPs are compiled into Java classes, and when you change a JSP, a new class gets generated and loaded.
We have had this problem, and our solution is to have the app server restart occasionally.
This is what I'd do:
Deploy Hudson to a separate server from your staging server
Configure Hudson to restart your staging server from time to time. You can do this in one of two ways (a minimal restart script is sketched after this list):
Restart periodically (e.g., every night at midnight, regardless of whether there's been any build activity); or
Have the web app deployment job trigger the server restart job. If you do this, make sure there's a really long quiet period for the restart job (we set ours to 2 hours), so that you don't get a server restart for every build (i.e., if two web app deployments happen within 2 hours, they'll only trigger one server restart).
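The restart job itself can be a trivial shell step; something along these lines, where the paths and the sleep duration are placeholders for whatever your environment needs:
$CATALINA_HOME/bin/shutdown.sh
sleep 30
$CATALINA_HOME/bin/startup.sh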

The flag -XX:SurvivorRatio sets the ratio between Eden and the survivor spaces. According to the JDK 1.5 tuning doc, the default value is 32, which gives a 1:32 ratio. This is in accordance with what you're seeing. It seems incredibly small to me, although I understand that only a very small number of objects are expected to make their way from Eden to the survivor space.
So, assuming that you have a lot of long-lived objects, you should decrease the survivor ratio. The risk is that you only have those long-lived objects during a startup phase, and so are limiting the Eden size. For a testing server, I doubt this is going to be an issue.
I'd probably also reduce the size of the Eden space by increasing -XX:NewRatio (the default is 3). My gut says that a hundred MB or so is sufficient for the young generation, and you'll just be increasing the cost of garbage collection by having such a large amount of space allocated (i.e., objects will live in Eden far too long). But that's just instinct, and it should definitely be validated for your environment.
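For example (illustrative numbers only, to be validated against your own GC logs), with the existing 3 GB heap something like the following would cap the young generation at about 768 MB and give each survivor space roughly 96 MB:
-Xmx3g -XX:NewRatio=3 -XX:SurvivorRatio=6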
And a semi-related comment, after reading other replies: if you're not seeing errors for running out of permgen space, don't spend your time fiddling with it. The permgen is managed separately from the rest of the heap.

Related

High GC pause time in GC logs [duplicate]

We have a Java-based web application running on JBoss with a maximum heap size of about 1.2 GB (the machine has 2 GB of physical memory in total). At some point the application stops responding (to clients) for several minutes. After some analysis we found that the culprit is the full GC. Here's an excerpt from the verbose GC log:
74477.402: [Full GC [PSYoungGen: 3648K->0K(332160K)] [PSOldGen: 778476K->589497K(819200K)] 782124K->589497K(1151360K) [PSPermGen: 102671K->102671K(171328K)], 646.1546860 secs] [Times: user=3.84 sys=3.72, real=646.17 secs]
What I don't understand is how the real time spent on the full GC can be about 11 minutes (646 seconds) while the user+sys time is just 7.5 seconds. 7.5 seconds sounds like a much more reasonable time for cleaning <200 MB out of the old generation. Where does all the other time go?
Thanks a lot.
Where does all the other time go?
It is most likely that your application is causing virtual memory thrashing. Basically, your application needs significantly more pages of virtual memory than there are physical pages available to hold them. As a result, it spends most of its time waiting for virtual memory pages to be read from and written to disk.
For more information, read this wikipedia page.
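If the machine runs Linux, one quick way to confirm this (purely a diagnostic suggestion) is to watch swap activity while a pause is in progress:
vmstat 5    (sustained non-zero values in the si/so columns indicate swapping)
free -m     (shows how much swap is actually in use)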
The cure is to either reduce virtual memory usage or increase the amount of physical memory on the system. For example, you could:
run fewer applications on the machine,
reduce Java application heap sizes, or
if you are running in a virtual machine, increase the VM's allocation of physical memory.
(Note however that reducing the JVM heap size can be a two-edged sword. If you reduce the heap size too much the application will either die from OutOfMemoryErrors, spend too much time garbage collecting, or suffer from not being able to cache things effectively.)

glassfish full gc once an hour

I'm seeing a full GC about once an hour in our Glassfish application. Extract from the GC log:
9.210: [Full GC 28311K->27979K(6422528K), 0.3770238 secs]
...
3609.647: [Full GC 1186957K->597880K(6478208K), 4.5102977 secs]
...
7214.192: [Full GC 742184K->595596K(6469504K), 4.3726625 secs]
...
10818.805: [Full GC 756228K->570803K(6455936K), 4.8630472 secs]
And this pattern roughly repeats for as long as Glassfish is up. The "..." in between are incremental GCs. The timing seems awfully suspicious: why would we be seeing full GCs about once an hour?
JVM startup parameters:
-Xms6400m
-Xmx6400m
-XX:NewSize=1024m
-XX:MaxNewSize=1024m
-XX:PermSize=256m
-XX:MaxPermSize=1024m
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-Xloggc:C:\glassfish3\glassfish\domains\domain1\logs\gc\gc.log
-XX:+AggressiveOpts
-Xss1024k
-XX:+CMSClassUnloadingEnabled
According to JVisualVM, we're nowhere close to running out of heap space.
Glassfish 3.1.2.2, Oracle JDK 1.6.0_45, Windows Server 2008
I suspect your RMI usage is triggering a full clean-up.
http://docs.oracle.com/javase/6/docs/technotes/guides/rmi/sunrmiproperties.html
both
sun.rmi.dgc.server.gcInterval
When it is necessary to ensure that unreachable remote objects are unexported and garbage collected in a timely fashion, the value of this property represents the maximum interval (in milliseconds) that the Java RMI runtime will allow between garbage collections of the local heap. The default value is 3600000 milliseconds (one hour).
and
sun.rmi.dgc.client.gcInterval
When it is necessary to ensure that DGC clean calls for unreachable remote references are delivered in a timely fashion, the value of this property represents the maximum interval (in milliseconds) that the Java RMI runtime will allow between garbage collections of the local heap. The default value is 3600000 milliseconds (one hour).
default to hourly checks.
I would set these to a day or a week if you believe you don't need them.
You could also try disabling explicit GC (-XX:+DisableExplicitGC) and see if the full GCs go away.
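For example, to stretch the DGC-driven collections out to once a day (86400000 ms) and, if you're confident you don't rely on explicit collections, suppress System.gc() calls entirely:
-Dsun.rmi.dgc.client.gcInterval=86400000 -Dsun.rmi.dgc.server.gcInterval=86400000 -XX:+DisableExplicitGC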

Serial Mark-Sweep-Compact (PSOldGen) PS stands for?

When I searched for the PSOldGen garbage collector that I saw in the GC log, I found out that it is the serial mark-sweep-compact collector. If this GC is serial, what does the PS in PSOldGen stand for? AFAIK it is Parallel Scavenge, but this confuses me.
[Full GC [PSYoungGen: 647K->0K(60352K)] [PSOldGen: 45361K->45875K(54528K)] 46008K->45875K(114880K) [PSPermGen: 10201K->10201K(21248K)], 0.0359430 secs]
There are two collectors at work in the JVM: a young-space collector and an old-space collector. The HotSpot JVM implements a bunch of algorithms, but only certain combinations of collectors can work together.
PSYoungGen is the "parallel scavenge" young-space algorithm, but it's not compatible with the default serial old-space algorithm (Tenured). PSOldGen is a serial old-space algorithm that was added specifically to work with the parallel scavenge young-space algorithm, PSYoungGen.
You can enable a parallel algorithm for the old space too (-XX:+UseParallelOldGC); in that case you will see the PSYoungGen, ParOldGen pair of algorithms at work.
You can also enable another parallel young-space algorithm, -XX:+UseParNewGC, which works in tandem with the default serial old-space algorithm, Tenured.
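To summarise the pairings described above, the combinations show up in the GC log like this:
-XX:+UseParallelGC    -> PSYoungGen (parallel scavenge) + PSOldGen (serial mark-sweep-compact)
-XX:+UseParallelOldGC -> PSYoungGen (parallel scavenge) + ParOldGen (parallel mark-sweep-compact)
-XX:+UseParNewGC      -> ParNew (parallel copying) + Tenured (serial mark-sweep-compact)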
Have I lost you already? :)
You can read more about algorithms implemented in HotSpot JVM in my blog.
You are correct, in a way, except it really depends on how you configured your JVM command line options. The young gen GC is Parallel Scavenge and multithreaded.
Interestingly, if you start it using -XX:+UseParallelGC, you'll get a serial (single-threaded) old gen GC. If you use -XX:+UseParallelOldGC, then you get both a multi-threaded, parallel young gen GC and a multi-threaded, parallel old gen GC.
Source: Java Performance, chapter 7, Garbage Collectors section.
Surprising, isn't it? There's a lot of scope for tinkering here too! The Java Performance book is well worth a read!

Optimizing Tomcat / Garbage Collection

Our server has 128GB of RAM and 64 cores, running Tomcat 7.0.30 and Oracle jdk1.6.0_38, on CentOS 6.3.
Every 60 minutes we were seeing a garbage collection that took 45-60 seconds. Adding -XX:-UseConcMarkSweepGC increased page load times by about 10% but got that down to about 3 seconds, which is an acceptable trade-off.
Our config:
-Xms30g
-Xmx30g
-XX:PermSize=8g
-XX:MaxPermSize=8g
-Xss256k
-XX:-UseConcMarkSweepGC
We set the heap to 30 GB to keep 32-bit addressing (I read that above 32 GB, 64-bit addressing takes up more memory, so you have to go to about 48 GB to see an improvement).
Using VisualGC I can see that the Eden space is cycling through every 30 - 60 minutes, but not much happens with the Survivor 0, Survivor 1, Old Gen, and Perm Gen.
We have a powerful server. What other optimizations can we make to further decrease the 3 second GC time?
Any recommendations to improve performance or scaling?
Any other output or config info that would help?
It might sound counter-intuitive, but have you tried allocating a lot less memory? E.g., do you really need a 30 GB heap? If you can get by with 4 GB or even less, garbage collection might be more frequent, but when it happens it will be a lot faster. Typically I find this more desirable than allocating a lot of memory and then suffering from the time it takes to clean it up.
Even if this doesn't help you because you really need 30 GB of memory, others who come along with a similar problem might benefit from allocating less.
It seems that you need incremental GC to reduce the pauses:
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
and for tracing without VisualGC, this has always worked well for me (output ends up in catalina.out):
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
2013-01-05T22:52:13.954+0100: 15918369.557: [GC 15918369.557: [DefNew: 65793K->227K(98304K), 0.0031220 secs] 235615K->170050K(491520K), 0.0033220 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
Afterwards you can play with these:
-XX:NewSize=ABC -XX:MaxNewSize=ABC
-XX:SurvivorRatio=ABC
-XX:NewRatio=ABC
Reference: Virtual Machine Garbage Collection Tuning
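As a purely illustrative starting point for a 30 GB heap (these numbers are guesses and need to be validated against your own GC log), explicit young-generation sizing might look like:
-XX:NewSize=2g -XX:MaxNewSize=2g -XX:SurvivorRatio=8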

Cannot understand Intellij IDEA's memory usage and management

I've been developing with IDEA again for a few years now, and I'm happy so far.
The problem is weird memory usage behaviour and GC activity while I'm working on projects, which causes my IDE to freeze for a few seconds while the GC does its job.
Regardless of how big the project I'm working on is, after a few days the memory usage climbs to around 500 MB (my maximum heap is 512 MB, which I assume should be sufficient for web projects with roughly 100 Java files). After the GC has done its job, I'm left with 400 MB used (not collected) and only about 100 MB free on the heap, and within a few minutes the memory usage grows until the heap is full again.
JVM version is 19.0-b09
using thread-local object allocation.
Parallel GC with 2 thread(s)
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 536870912 (512.0MB)
NewSize = 178257920 (170.0MB)
MaxNewSize = 178257920 (170.0MB)
OldSize = 4194304 (4.0MB)
NewRatio = 2
SurvivorRatio = 8
PermSize = 16777216 (16.0MB)
MaxPermSize = 314572800 (300.0MB)
Heap Usage:
PS Young Generation
Eden Space:
capacity = 145489920 (138.75MB)
used = 81242600 (77.4789810180664MB)
free = 64247320 (61.271018981933594MB)
55.84070704004786% used
From Space:
capacity = 16384000 (15.625MB)
used = 0 (0.0MB)
free = 16384000 (15.625MB)
0.0% used
To Space:
capacity = 16384000 (15.625MB)
used = 0 (0.0MB)
free = 16384000 (15.625MB)
0.0% used
PS Old Generation
capacity = 358612992 (342.0MB)
used = 358612992 (342.0MB)
free = 0 (0.0MB)
100.0% used
PS Perm Generation
capacity = 172621824 (164.625MB)
used = 172385280 (164.3994140625MB)
free = 236544 (0.2255859375MB)
99.86296981776765% used
This is what my heap space looks like. It is remarkable that the Old Generation and Perm Generation are both at about 100% usage, even though I have triggered GC manually several times. The question is: how can I get the IDE to sweep those objects out of the old generation without restarting the IDE? (After startup, the memory usage is about 60-90 MB.) And how can I find out what these objects are? There are some threads running that can be watched in VisualVM, such as RMI TCP Connection, RMI TCP Accept, and XML RPC Weblistener; although I'm doing nothing in the IDE, they're still consuming memory, even 5-10 MB per second.
$ uname -a
Linux bagdemir 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10 21:21:01 UTC 2011 i686 GNU/Linux
$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) Server VM (build 19.1-b02, mixed mode)
UPDATE:
memory configuration:
-Xms256m -Xmx512m -Xmn170m -XX:MaxPermSize=300m
You may find this useful: Intellij Idea JVM options benchmark: default settings are worst
The right way to go is to get a memory snapshot and submit a ticket to the JetBrains tracker with the snapshot attached.
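The tracker usually wants a snapshot captured from the IDE itself, but for your own inspection a standard JDK heap dump is a reasonable starting point (the process ID is a placeholder for IDEA's pid):
jmap -dump:live,format=b,file=idea-heap.hprof <pid>
You can then open the .hprof file in VisualVM to see which objects dominate the old generation.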
Excess memory usage persists to this day, April 2, 2019. IntelliJ IDEA Ultimate Edition has 131 plugins enabled by default.
I turned off about 50 of those plugins.
Go to File >> Settings >> Plugins to manage plugins, then click Installed to view the plugins that are already active.
