Allocation failure in Java - java

I have a method which hits DB and fetches lot of records into memory for processing. After I fetch the records and before I start processing, I get the following log message. What does it mean ?
164575.034: [GC (Allocation Failure) 4937664K->3619624K(5602816K), 0.0338580 secs]
Options:
java.opts=-d64 -Xmx8g -XX:+PrintGCTimeStamps -verbose:gc -XX:MaxPermSize=512m -XX:+UseParallelGC -XX:+UseParallelOldGC

It just basically tells you that it had to run GC to allocate additional memory, it does not fit in memory otherwise. So it is just a reason for GC.

Related

Xmx and gc log data not matching

I ran a test program (adds list with string in a infinite loop) using below jvm arguments
java -XX:+PrintGCDetails -Xloggc:gc.log Test -Xmx=1k -Xms=1k
and got the below exception
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
In the gc log i see the below collection as the last entry
11.242: [Full GC (Allocation Failure) [PSYoungGen: 0K->0K(92160K)] [ParOldGen: 274084K->274072K(371712K)] 274084K->274072K(**463872K**), [Metaspace: 2522K->2522K(1056768K)], 3.0296130 secs] [Times: user=3.28 sys=0.00, real=3.02 secs]
If min and max is 1k how come the heap memory available is showing as 463872K?
Plus in the oracle site
https://docs.oracle.com/cd/E13150_01/jrockit_jvm/jrockit/jrdocs/refman/optionX.html
I see a note ,
Note: -Xmx does not limit the total amount of memory that the JVM can use.
What does this means?
You have specified Xmx and Xms parameters after the name of your class. Thus java executable command interprets them as parameters to your class, not as options to configure JVM. The correct way would be:
java -Xmx1k -Xms1k -XX:+PrintGCDetails -Xloggc:gc.log Test
Also note, that the correct format is -Xmx1k, not -Xmx=1k.
But please not, that JVM will not start with such low value of Xms.
#Nikem responded correctly and I would not like to respond to that part of the question. However, I would like to add something to your second question -
Note: -Xmx does not limit the total amount of memory that the JVM can use. What does this imply?
-Xmx only limits the memory available for consumption by your application heap.
But JVM will need memory space for PermGen and stack sizes which are not accounted for with the -Xmx parameter.

GC real time significant longer

The application is using -XX:+UseParallelGC.
There was a GC and real time took significant longer time than the user + sys (~1.05 secs):
[Times: user=0.04 sys=0.01, real=1.05 secs]
The PSYoungGen for the previous GC were cleaning same amount of memory and much less time. I don't have any flag enable to check if jit caused this. There was load on the machine this was running but was caused from the same jvm process that did the GC.
What reasons could cause something like this and what can I do to in the future to figure it out what's happening (if it happens again)?
[GC
Desired survivor size 720864 bytes, new threshold 1 (max 15)
[PSYoungGen: 18864K->454K(21121K)] 38016K->17722K(64833K), 0.0224350 secs] [Times: user=0.04 sys=0.01, real=1.05 secs]
To be able to diagnose this better in the future run with -XX:+PrintSafepointStatistics –XX:PrintSafepointStatisticsCount=1 -XX:+PrintGCDetails -XX:+PrintGCCause for more useful logs. Which JVM version you're using may also be relevant.
Beyond that you'll have to monitor system stats, mostly paging and CPU utilization to see whether other processes or disk IO starve the JVM.

glassfish full gc once an hour

I'm seeing a full GC about once an hour in our Glassfish application. Extract from the GC log:
9.210: [Full GC 28311K->27979K(6422528K), 0.3770238 secs]
...
3609.647: [Full GC 1186957K->597880K(6478208K), 4.5102977 secs]
...
7214.192: [Full GC 742184K->595596K(6469504K), 4.3726625 secs]
...
10818.805: [Full GC 756228K->570803K(6455936K), 4.8630472 secs]
And this pattern roughly repeats as long as Glassfish is up. The "..." in between are incremental GCs. The timing seems awfully suspicious- why would we be seeing full GC's about once an hour?
JVM startup parameters:
-Xms6400m
-Xmx6400m
-XX:NewSize=1024m
-XX:MaxNewSize=1024m
-XX:PermSize=256m
-XX:MaxPermSize=1024m
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-Xloggc:C:\glassfish3\glassfish\domains\domain1\logs\gc\gc.log
-XX:+AggressiveOpts
-Xss1024k
-XX:+CMSClassUnloadingEnabled
According to JVisualVM, we're no where close to running out of heap space.
Glassfish 3.1.2.2, Oracle JDK 1.6.0_45, Windows Server 2008
I suspect your RMI is triggering a Full clean up.
http://docs.oracle.com/javase/6/docs/technotes/guides/rmi/sunrmiproperties.html
both
sun.rmi.dgc.server.gcInterval
When it is necessary to ensure that unreachable remote objects are unexported and garbage collected in a timely fashion, the value of this property represents the maximum interval (in milliseconds) that the Java RMI runtime will allow between garbage collections of the local heap. The default value is 3600000 milliseconds (one hour).
and
sun.rmi.dgc.client.gcInterval
When it is necessary to ensure that DGC clean calls for unreachable remote references are delivered in a timely fashion, the value of this property represents the maximum interval (in milliseconds) that the Java RMI runtime will allow between garbage collections of the local heap. The default value is 3600000 milliseconds (one hour).
default to hourly checks.
I would set these to a day or a week for you believe you don't need these.
You could also try to disable explicit GC (-XX:+DisableExplicitGC) and see if the FullGCs go away.

Java ConcurrentMarkSweep garbage collector not happening

Environment Details:
OS: Linux RedHat
Java: JRE 6 Update 21
I am using following GC setting for my app.
-server -d64 -Xms8192m -Xmx8192m -javaagent:lib/instrum.jar -XX\:MaxPermSize=256m -XX\:+UseParNewGC -X\:+ParallelRefProcEnabled -XX\:+UseConcMarkSweepGC -XX\:MaxGCPauseMillis=250 -XX\:+CMSIncrementalMode -XX\:+CMSIncrementalPacing -XX\:+CMSParallelRemarkEnabled -verbose\:gc -Xloggc\:/tmp/my-gc.log -XX\:DisableExplicitGC -XX\:+PrintGCTimeStamps -XX\:+PrintGCDetails -XX\:+UseCompressedOops
With there setting, there is single Full GC at the begining of application
2.946: [Full GC 2.946: [CMS: 0K->7394K(8111744K), 0.1364080 secs] 38550K->7394K(8360960K), [CMS Perm : 21247K->21216K(21248K)], 0.1365530 secs] [Times: user=0.10 sys=0.04, real=0.14 secs]
Which is followed by a 4-5 successful of CMS collections, But after this there is no trace of CMS in logs, there are entries on only minor collections.
379022.293: [GC 379022.293: [ParNew: 228000K->4959K(249216K), 0.0152000 secs] 7067945K->6845720K(8360960K) icms_dc=0 , 0.0153940 secs]
The heap is growing continuously and it has reached 7GB. We have to restart the application as we can not afford OOM or any kind of breakdown in production system.
I am not able to understand as to why CMS collector has stopped cleaning. Any clues/suggestions are welcome. Thanks in Advance.
======================================================================================
Updated 23rd Jan.
Thanks everyone for the responses till now. I have setup the application in test environment and tested the app with following set of JVM options:
Option #1
-server -d64 -Xms8192m -Xmx8192m -javaagent\:instrum.jar -XX\:MaxPermSize\=256m -XX\:+UseParNewGC -XX\:+UseConcMarkSweepGC -verbose\:gc -Xloggc\:my-gc.log -XX\:+PrintGCTimeStamps -XX\:+PrintGCDetails
Option #2
-server -d64 -Xms8192m -Xmx8192m -javaagent\:instrum.jar -XX\:MaxPermSize\=256m -XX\:+UseParNewGC -XX\:+UseConcMarkSweepGC -verbose\:gc -Xloggc\:my-gc.log -XX\:+DisableExplicitGC -XX\:+PrintGCTimeStamps -XX\:+PrintGCDetails
I ran the test with both settings for 2 days in parallel. These are my observations:
Option #1
The heap memory is stable but there are 90 ConcurrentMarkSweep collections and JVM spent 24 minutes. That’s too high. And I see following lines in GC logs and the pattern continues every one hour...
318995.941: [GC 318995.941: [ParNew: 230230K->8627K(249216K), 0.0107540 secs] 5687617K->5466913K(8360960K), 0.0109030 secs] [Times: user=0.11 sys=0.00, real=0.01 secs]
319050.363: [GC 319050.363: [ParNew: 230195K->9076K(249216K), 0.0118420 secs] 5688481K->5468316K(8360960K), 0.0120470 secs] [Times: user=0.12 sys=0.01, real=0.01 secs]
319134.118: [GC 319134.118: [ParNew: 230644K->8503K(249216K), 0.0105910 secs] 5689884K->5468704K(8360960K), 0.0107430 secs] [Times: user=0.11 sys=0.00, real=0.01 secs]
319159.250: [Full GC (System) 319159.250: [CMS: 5460200K->5412132K(8111744K), 19.1981050 secs] 5497326K->5412132K(8360960K), [CMS Perm : 72243K->72239K(120136K)], 19.1983210 secs] [Times: user=19.14 sys=0.06, real=19.19 secs]
I don’t see the concurrent mark and sweep logs. Does this mean CMS switched to throughput collector? If so why?
Option #2:
Since I see the Full GC (System) logs, I thought of adding -XX\:+DisableExplicitGC. But with that option the collection is not happening and the current heap size is 7.5G. What I am wondering is why CMS is doing the Full GC instead of concurrent collection.
This is a theory ...
I suspect that those CMS collections were not entirely successful. The event at 12477.056 looks like the CMS might have decided that it is not going to be able to work properly due to the 'pre-clean' step taking too long.
If that caused the CMS to decide to switch off, then I expect it will revert to using the classic "throughput" GC algorithm. And there's a good chance it would wait until the heap is full and then it would run a full GC. In short, if you'd just let it continue it would have been OK (modulo that you'd get big GC pauses every now and then.)
I suggest you run your application on a test server with the same heap size and other GC parameters, and see what happens when the server hits the limit. Does it actually throw an OOME?
CMS is running for you :P
You are using incremental mode on CMS (although really you should not bother as its likely punishing your throughput)
The icms_dc in your posted log line is a give away, the only thing that logs this in the JVM is ... the CMS collector, its saying for that GC run you did a small amount of tenure cleanup interwoven with the application.
This part of your log relates to parallel new (the give away there is the heap size)
379022.293: [GC 379022.293: [ParNew: 228000K->4959K(249216K), 0.0152000 secs]
this part is incremenatal CMS (iCMS)
7067945K->6845720K(8360960K) icms_dc=0 , 0.0153940 secs]
I would ask, why are you using iCMS, do you have a lot of Soft/Weak/Phantom references (or why are you using the ParallelRefProcEnabled flag) and have you actually seen an Out of memory, or insufferable pause.
Try backing down to CompressedOops, ParNewGC and CMS without anything else fancy and see if that works out for you.
I can see that the initial heap size -Xms is :8192m and max heap size is -Xmx8192m, which might be one of the reasons why GC is still waiting to start sweeping.
I would suggest to decrease the heap size and then check if the GC kicks in.
When you set the maximum size, it allocates that amount of virtual memory immediately.
When you set the minimum size, it has already allocated the maximum size. All the minimum size does is to take minimal steps to free up memory until this maximum is reached. This could be reducing the number of full GCs because you told it to use up to 8 GB freely.
You have a lot of options turned on (some of them the default) I suggest you strip back to a minimum set as they can have odd interactions when you turn lots of.
I would start with (assuming you have Solaris)
-mx8g -javaagent:lib/instrum.jar -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -verbose\:gc -Xloggc\:/tmp/my-gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
The options -server is the default on server class machines, -XX:+UseCompressedOops is the default on recent versions of Java and -XX:MaxGCPauseMillis=250 is just a hint.
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

JVM CMS Garbage Collecting Issues

I'm seeing the following symptoms on an application's GC log file with the Concurrent Mark-Sweep collector:
4031.248: [CMS-concurrent-preclean-start]
4031.250: [CMS-concurrent-preclean: 0.002/0.002 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
4031.250: [CMS-concurrent-abortable-preclean-start]
CMS: abort preclean due to time 4036.346: [CMS-concurrent-abortable-preclean: 0.159/5.096 secs] [Times: user=0.00 sys=0.01, real=5.09 secs]
4036.346: [GC[YG occupancy: 55964 K (118016 K)]4036.347: [Rescan (parallel) , 0.0641200 secs]4036.411: [weak refs processing, 0.0001300 secs]4036.411: [class unloading, 0.0041590 secs]4036.415: [scrub symbol & string tables, 0.0053220 secs] [1 CMS-remark: 16015K(393216K)] 71979K(511232K), 0.0746640 secs] [Times: user=0.08 sys=0.00, real=0.08 secs]
The preclean process keeps aborting continously. I've tried adjusting CMSMaxAbortablePrecleanTime to 15 seconds, from the default of 5, but that has not helped. The current JVM options are as follows...
Djava.awt.headless=true
-Xms512m
-Xmx512m
-Xmn128m
-XX:MaxPermSize=128m
-XX:+HeapDumpOnOutOfMemoryError
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:BiasedLockingStartupDelay=0
-XX:+DoEscapeAnalysis
-XX:+UseBiasedLocking
-XX:+EliminateLocks
-XX:+CMSParallelRemarkEnabled
-verbose:gc
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+PrintHeapAtGC
-Xloggc:gc.log
-XX:+CMSClassUnloadingEnabled
-XX:+CMSPermGenPrecleaningEnabled
-XX:CMSInitiatingOccupancyFraction=50
-XX:ReservedCodeCacheSize=64m
-Dnetworkaddress.cache.ttl=30
-Xss128k
It appears the concurrent-abortable-preclean never gets a chance to run. I read through https://blogs.oracle.com/jonthecollector/entry/did_you_know which had a suggestion of enabling CMSScavengeBeforeRemark, but the side effects of pausing did not seem ideal. Could anyone offer up any suggestions?
Also I was wondering if anyone had a good reference for grokking the CMS GC logs, in particular this line:
[1 CMS-remark: 16015K(393216K)] 71979K(511232K), 0.0746640 secs]
Not clear on what memory regions those numbers are referring to.
Edit Found a link to this http://www.sun.com/bigadmin/content/submitted/cms_gc_logs.jsp
[Times: user=0.00 sys=0.01, real=5.09 secs]
I would try investigate why CMS-concurrent-abortable-preclean-start doesn't get neither user nor sys CPU time in 5 seconds.
My suggestion is starting from a 'clean' JVM CMS startup flags like
-Djava.awt.headless=true
-Xms512m
-Xmx512m
-Xmn128m
-Xss128k
-XX:MaxPermSize=128m
-XX:+UseConcMarkSweepGC
-XX:+HeapDumpOnOutOfMemoryError
-Xloggc:gc.log
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+PrintHeapAtGC
then check if the problem reproduces and keep tweaking one parameter at a time.
As someone has already mentioned, the first step would be to increase the CMSInitiatingOccupancyFraction.
As a second step, I would use the flag -XX:-PrintTenuringDistribution and make sure that there is no premature promotion from the young generation to the old one. This would lead to old-to-young references which might lead to a longer abortable preclean phase.
If there is such a premature promotion, try to adjust the ratio between the eden and the survior spaces.
There is a good explanation here about this phenomenon:
Quote:
So when the system load is light(which means there will be no
minor gc), precleaning will always time out and full gc will always
fail. cpu is waste.
It won't fail. It'll be less parallel (i.e. less efficient, and would
have a longer pause time, for lesser work).
So all in all: this seems to be normal operation - the thread just waits for a minor GC to happen for 5 seconds, but there is no big issue when this does not happen: the JVM chooses a different (less efficient) strategy to continue with the GC.
For the service I'm using I added:
-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=80
This configures the JVM to start the marking only after 80% is full and it's worth giving it a try.

Categories

Resources