The Java Virtual Machine supports several garbage collection strategies.
This article explains them.
Now I am wondering which (automatically selected) strategy my application is using. Is there any way to make the JVM (version 1.6) print this information?
Edit: The JVM detects whether it is in client or server mode. So the question really is: how can I see which one has been detected?
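For the client/server part specifically, one quick check (a minimal sketch; the exact string depends on your JDK build) is to print the java.vm.name system property, which contains "Client VM" or "Server VM":

public class WhichVm {
    public static void main(String[] args) {
        // Prints e.g. "Java HotSpot(TM) Server VM" or "Java HotSpot(TM) Client VM"
        System.out.println(System.getProperty("java.vm.name"));
    }
}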
jmap -heap
Prints a heap summary: the GC algorithm used, the heap configuration, and generation-wise heap usage.
http://java.sun.com/javase/6/docs/technotes/tools/share/jmap.html
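For example (the PID is illustrative; jps lists the PIDs of running Java processes):

jps -l
jmap -heap 12345

The heap summary includes a line naming the collector in use (e.g. "Parallel GC with 4 thread(s)").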
http://java.sun.com/j2se/1.5.0/docs/guide/vm/gc-ergonomics.html (which applies to J2SE 6 as well) states that the default is the Parallel Collector.
We tested this once on a JVM 1.5 by setting only
-server -Xms3g -Xmx3g -XX:PermSize=128m -XX:LargePageSizeInBytes=4m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
and the output showed
41359.597: [GC [PSYoungGen: 90499K->32K(377344K)] 268466K->181862K(2474496K), 0.0183138 secs]
41359.615: [Full GC [PSYoungGen: 32K->0K(377344K)] [PSOldGen: 181830K->129760K(2097152K)] 181862K->129760K(2474496K) [PSPermGen: 115335K->115335K(131072K)], 4.4590942 secs]
where PS stands for Parallel Scavenging
Put this in the JAVA_OPTS:
-XX:+UseSerialGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
For the UseSerialGC we will see in the log:
7.732: [GC 7.732: [DefNew: 419456K->47174K(471872K), 0.1321800 secs] 419456K->47174K(1520448K), 0.1322500 secs] [Times: user=0.10 sys=0.03, real=0.14 secs]
For the UseConcMarkSweepGC we will see in the log:
5.630: [GC 5.630: [ParNew: 37915K->3941K(38336K), 0.0123210 secs] 78169K->45163K(1568640K), 0.0124030 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
For the UseParallelGC we will see in the log:
30.250: [GC [PSYoungGen: 441062K->65524K(458752K)] 441062K->76129K(1507328K), 0.1870880 secs] [Times: user=0.33 sys=0.03, real=0.19 secs]
It looks like we have a more convenient way to determine which GC is used at runtime. My suggestion: always use tools.
To determine the GC in use we need two tools that ship with the JDK (located in your jdk/bin directory):
VisualVM - start it and profile some process (for example, you can profile VisualVM itself). The profile will show you the PID of the process (see the green rectangles in the screenshot).
jmap - run this tool with the -heap <PID> option and look for the line describing the garbage collector type (see the pink line in the screenshot).
As Joachim already pointed out, the article you refer to describes the strategies offered by Sun's VM. The VM specification itself does not mandate specific GC algorithms, and hence it would not make sense to have, e.g., enumerated values for them in the API.
You can, however, get some information from the Management API:
List<GarbageCollectorMXBean> beans =
        ManagementFactory.getGarbageCollectorMXBeans();
Iterating through these beans, you can get the name of the GC (although only as a string) and the names of the memory pools managed by the different GCs.
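For example, a minimal standalone sketch that prints each collector together with the memory pools it manages:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;

public class PrintGcNames {
    public static void main(String[] args) {
        // Each bean represents one collector in the running VM,
        // e.g. "PS Scavenge" and "PS MarkSweep" for the parallel collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " manages "
                    + Arrays.toString(gc.getMemoryPoolNames()));
        }
    }
}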
The best way to get this is to go to the command line and enter the following command.
java -XX:+PrintCommandLineFlags -version
It will show you a result like:
C:\windows\system32>java -XX:+PrintCommandLineFlags -version
-XX:InitialHeapSize=132968640 -XX:MaxHeapSize=2127498240 -XX:+PrintCommandLineFlags -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:-UseLargePagesIndividualAllocation **-XX:+UseParallelGC**
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
You can write a simple program that connects to your Java process via JMX:
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class PrintJMX {
    public static void main(String[] args) throws Exception {
        String rmiHostname = "localhost";
        String defaultUrl = "service:jmx:rmi:///jndi/rmi://" + rmiHostname + ":1099/jmxrmi";
        JMXServiceURL jmxServiceURL = new JMXServiceURL(defaultUrl);
        JMXConnector jmxConnector = JMXConnectorFactory.connect(jmxServiceURL);
        MBeanServerConnection mbsc = jmxConnector.getMBeanServerConnection();

        // Query all GarbageCollector MXBeans exposed by the remote VM
        ObjectName gcName = new ObjectName(ManagementFactory.GARBAGE_COLLECTOR_MXBEAN_DOMAIN_TYPE + ",*");
        for (ObjectName name : mbsc.queryNames(gcName, null)) {
            GarbageCollectorMXBean gc = ManagementFactory.newPlatformMXBeanProxy(mbsc,
                    name.getCanonicalName(),
                    GarbageCollectorMXBean.class);
            System.out.println(gc.getName());
        }
    }
}
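Note that this will only connect if the target JVM was started with remote JMX enabled on the port used above, for example (disabling authentication and SSL here is purely for illustration):

-Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false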
Related
I have a Java web app running on heroku which keeps generating "Memory quota exceeded" messages. The app itself is quite big and has a lot of libraries, but it is getting only very few requests (it is only used by a handful of users so if none of the users are online the system may not get a single request for hours) and thus performance is not a primary problem.
Even though there is very little happening in my app, the memory consumption is consistently high:
Before deploying the app on Heroku I deployed it using Docker containers and never worried much about memory settings, leaving everything at the defaults. The whole container usually consumed about 300 MB.
The first thing I tried was to reduce memory consumption by using -Xmx256m -Xss512k, however this did not seem to have any effect.
The Heroku manual suggests logging some data about garbage collection, so I used the following flags to run my application: -Xmx256m -Xss512k -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC. This gives me e.g. the following output:
2017-01-11T22:43:39.605180+00:00 heroku[web.1]: Process running mem=588M(106.7%)
2017-01-11T22:43:39.605545+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2017-01-11T22:43:40.431536+00:00 app[web.1]: 2017-01-11T22:43:40.348+0000: [GC (Allocation Failure) 2017-01-11T22:43:40.348+0000: [ParNew
2017-01-11T22:43:40.431566+00:00 app[web.1]: Desired survivor size 4456448 bytes, new threshold 1 (max 6)
2017-01-11T22:43:40.431579+00:00 app[web.1]: - age 1: 7676592 bytes, 7676592 total
2017-01-11T22:43:40.431593+00:00 app[web.1]: - age 2: 844048 bytes, 8520640 total
2017-01-11T22:43:40.431605+00:00 app[web.1]: - age 3: 153408 bytes, 8674048 total
2017-01-11T22:43:40.431772+00:00 app[web.1]: : 72382K->8704K(78656K), 0.0829189 secs] 139087K->78368K(253440K), 0.0830615 secs] [Times: user=0.06 sys=0.00, real=0.08 secs]
2017-01-11T22:43:41.298146+00:00 app[web.1]: 2017-01-11T22:43:41.195+0000: [GC (Allocation Failure) 2017-01-11T22:43:41.195+0000: [ParNew
2017-01-11T22:43:41.304519+00:00 app[web.1]: Desired survivor size 4456448 bytes, new threshold 1 (max 6)
2017-01-11T22:43:41.304537+00:00 app[web.1]: - age 1: 7271480 bytes, 7271480 total
2017-01-11T22:43:41.304705+00:00 app[web.1]: : 78656K->8704K(78656K), 0.1091697 secs] 148320K->81445K(253440K), 0.1092897 secs] [Times: user=0.10 sys=0.00, real=0.11 secs]
2017-01-11T22:43:42.589543+00:00 app[web.1]: 2017-01-11T22:43:42.526+0000: [GC (Allocation Failure) 2017-01-11T22:43:42.526+0000: [ParNew
2017-01-11T22:43:42.589562+00:00 app[web.1]: Desired survivor size 4456448 bytes, new threshold 1 (max 6)
2017-01-11T22:43:42.589564+00:00 app[web.1]: - age 1: 6901112 bytes, 6901112 total
2017-01-11T22:43:42.589695+00:00 app[web.1]: : 78656K->8704K(78656K), 0.0632178 secs] 151397K->83784K(253440K), 0.0633208 secs] [Times: user=0.06 sys=0.00, real=0.06 secs]
2017-01-11T22:43:57.653300+00:00 heroku[web.1]: Process running mem=587M(106.6%)
2017-01-11T22:43:57.653498+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
Unfortunately I am no expert on reading those logs, but on a first naive look it looks like the app is actually not consuming an amount of memory that would be a problem (or am I horribly misreading the logs?).
My Procfile reads:
web: java $JAVA_OPTS -jar target/dependency/webapp-runner.jar --port $PORT --context-xml context.xml app.war
Update
As codefinger suggested, I added the Heroku Java agent to my app. For some reason, after adding the Java agent the problem did not occur anymore. But now I have been able to capture the problem. In the following excerpt the memory limit was exceeded only for a short moment:
2017-01-24T10:30:00.143342+00:00 app[web.1]: measure.mem.jvm.heap.used=92M measure.mem.jvm.heap.committed=221M measure.mem.jvm.heap.max=233M
2017-01-24T10:30:00.143399+00:00 app[web.1]: measure.mem.jvm.nonheap.used=77M measure.mem.jvm.nonheap.committed=78M measure.mem.jvm.nonheap.max=0M
2017-01-24T10:30:00.143474+00:00 app[web.1]: measure.threads.jvm.total=41 measure.threads.jvm.daemon=24 measure.threads.jvm.nondaemon=2 measure.threads.jvm.internal=15
2017-01-24T10:30:00.147542+00:00 app[web.1]: measure.mem.linux.vsz=4449M measure.mem.linux.rss=446M
2017-01-24T10:31:00.143196+00:00 app[web.1]: measure.mem.jvm.heap.used=103M measure.mem.jvm.heap.committed=251M measure.mem.jvm.heap.max=251M
2017-01-24T10:31:00.143346+00:00 app[web.1]: measure.mem.jvm.nonheap.used=101M measure.mem.jvm.nonheap.committed=103M measure.mem.jvm.nonheap.max=0M
2017-01-24T10:31:00.143468+00:00 app[web.1]: measure.threads.jvm.total=42 measure.threads.jvm.daemon=25 measure.threads.jvm.nondaemon=2 measure.threads.jvm.internal=15
2017-01-24T10:31:00.153106+00:00 app[web.1]: measure.mem.linux.vsz=4739M measure.mem.linux.rss=503M
2017-01-24T10:31:24.163943+00:00 heroku[web.1]: Process running mem=517M(101.2%)
2017-01-24T10:31:24.164150+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2017-01-24T10:32:00.143066+00:00 app[web.1]: measure.mem.jvm.heap.used=108M measure.mem.jvm.heap.committed=248M measure.mem.jvm.heap.max=248M
2017-01-24T10:32:00.143103+00:00 app[web.1]: measure.mem.jvm.nonheap.used=108M measure.mem.jvm.nonheap.committed=110M measure.mem.jvm.nonheap.max=0M
2017-01-24T10:32:00.143173+00:00 app[web.1]: measure.threads.jvm.total=40 measure.threads.jvm.daemon=23 measure.threads.jvm.nondaemon=2 measure.threads.jvm.internal=15
2017-01-24T10:32:00.150558+00:00 app[web.1]: measure.mem.linux.vsz=4738M measure.mem.linux.rss=314M
2017-01-24T10:33:00.142989+00:00 app[web.1]: measure.mem.jvm.heap.used=108M measure.mem.jvm.heap.committed=248M measure.mem.jvm.heap.max=248M
2017-01-24T10:33:00.143056+00:00 app[web.1]: measure.mem.jvm.nonheap.used=108M measure.mem.jvm.nonheap.committed=110M measure.mem.jvm.nonheap.max=0M
2017-01-24T10:33:00.143150+00:00 app[web.1]: measure.threads.jvm.total=40 measure.threads.jvm.daemon=23 measure.threads.jvm.nondaemon=2 measure.threads.jvm.internal=15
2017-01-24T10:33:00.146642+00:00 app[web.1]: measure.mem.linux.vsz=4738M measure.mem.linux.rss=313M
In the following case the limit was exceeded for a much longer time:
2017-01-25T08:14:06.202429+00:00 heroku[web.1]: Process running mem=574M(111.5%)
2017-01-25T08:14:06.202429+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2017-01-25T08:14:26.924265+00:00 heroku[web.1]: Process running mem=574M(111.5%)
2017-01-25T08:14:26.924265+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2017-01-25T08:14:48.082543+00:00 heroku[web.1]: Process running mem=574M(111.5%)
2017-01-25T08:14:48.082615+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2017-01-25T08:15:00.142901+00:00 app[web.1]: measure.mem.jvm.heap.used=164M measure.mem.jvm.heap.committed=229M measure.mem.jvm.heap.max=233M
2017-01-25T08:15:00.142972+00:00 app[web.1]: measure.mem.jvm.nonheap.used=121M measure.mem.jvm.nonheap.committed=124M measure.mem.jvm.nonheap.max=0M
2017-01-25T08:15:00.143019+00:00 app[web.1]: measure.threads.jvm.total=40 measure.threads.jvm.daemon=23 measure.threads.jvm.nondaemon=2 measure.threads.jvm.internal=15
2017-01-25T08:15:00.149631+00:00 app[web.1]: measure.mem.linux.vsz=4740M measure.mem.linux.rss=410M
2017-01-25T08:15:09.339319+00:00 heroku[web.1]: Process running mem=574M(111.5%)
2017-01-25T08:15:09.339319+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2017-01-25T08:15:30.398980+00:00 heroku[web.1]: Process running mem=574M(111.5%)
2017-01-25T08:15:30.399066+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2017-01-25T08:15:51.140193+00:00 heroku[web.1]: Process running mem=574M(111.5%)
2017-01-25T08:15:51.140280+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2017-01-25T08:16:00.143016+00:00 app[web.1]: measure.mem.jvm.heap.used=165M measure.mem.jvm.heap.committed=229M measure.mem.jvm.heap.max=233M
2017-01-25T08:16:00.143084+00:00 app[web.1]: measure.mem.jvm.nonheap.used=121M measure.mem.jvm.nonheap.committed=124M measure.mem.jvm.nonheap.max=0M
2017-01-25T08:16:00.143135+00:00 app[web.1]: measure.threads.jvm.total=40 measure.threads.jvm.daemon=23 measure.threads.jvm.nondaemon=2 measure.threads.jvm.internal=15
2017-01-25T08:16:00.148157+00:00 app[web.1]: measure.mem.linux.vsz=4740M measure.mem.linux.rss=410M
For the latter log, here is the bigger picture:
(memory consumption dropped because I restarted the server)
At the time the memory limit is first exceeded, a cron job (Spring scheduled) imports CSV files. The CSV files are processed in batches of 10,000 lines, so there are never more than 10,000 rows referenced in memory. Nevertheless, a lot of memory is of course consumed overall, as many batches are processed. I also tried to trigger the imports manually to check whether I could reproduce the memory consumption peak, but I couldn't; it does not always happen.
This isn't really an answer, but may help:
It looks like you have a surge (or possibly a leak) in off-heap memory consumption. The source is almost certainly the CSV processing. Here's a good article that describes a similar problem:
http://www.evanjones.ca/java-native-leak-bug.html
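One common source of this kind of off-heap growth is input streams that wrap native resources and are not closed promptly. Below is a minimal defensive sketch, assuming the import reads a gzipped CSV; the file name and the use of GZIPInputStream are assumptions, not taken from your code:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.zip.GZIPInputStream;

public class CsvImportSketch {
    public static void main(String[] args) throws IOException {
        // try-with-resources guarantees close() runs, releasing the native
        // zlib memory held by GZIPInputStream even when an error occurs.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new FileInputStream("import.csv.gz")), "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // process one row at a time instead of accumulating all rows
            }
        }
    }
}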
If you are exceeding roughly:
Process running mem=574M(111.5%)
it may simply not be possible to get below 500 MB in your app. I had similar problems with an app, and now it works correctly under 512 MB.
You can try some of these options (my Docker example):
ENTRYPOINT ["java","-Dserver.port=$PORT","-Xmx268M","-Xss512K","-XX:CICompilerCount=2","-Dfile.encoding=UTF-8","-XX:+UseContainerSupport","-Djava.security.egd=file:/dev/./urandom","-Xlog:gc","-jar","/app.jar"]
Of course, you have to match the Xmx value to your case (maybe more, maybe less).
My application's gc.log reveals the following information:
2015-05-23T03:51:10.086+0800: 648560.384: [GC 648560.384: [ParNew: 311342K->3965K(409600K), 0.0025980 secs] 390090K->82715K(1433600K), 0.0028290 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
2015-05-23T03:51:10.506+0800: 648560.804: [GC 648560.804: [ParNew: 311165K->3784K(409600K), 0.0030820 secs] 389915K->82536K(1433600K), 0.0032760 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
2015-05-25T15:20:54.421+0800: 862744.719: [GC 862744.719: [ParNew: 310984K->3625K(409600K), 0.0032810 secs] 389736K->82379K(1433600K), 0.0036910 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
2015-05-25T15:20:54.549+0800: 862744.846: [GC 862744.846: [ParNew: 310825K->11547K(409600K), 0.0037930 secs] 389579K->90305K(1433600K), 0.0040420 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
As you can see, the gc.log stopped printing from 2015-05-23 03:51:10 to 2015-05-25 15:20:54.
My application is a long-running server-side application based on Netty. So it is basically not possible that there was no garbage to collect during those hours.
And it is not because the disk is full; there is plenty of space left.
These are my JVM arguments:
-Xmx2048m -Xms1024m -verbose:gc -Xloggc:./gc.log -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSCompactAtFullCollection -XX:MaxTenuringThreshold=10 -XX:-UseAdaptiveSizePolicy -XX:PermSize=256M -XX:MaxPermSize=512M -XX:SurvivorRatio=3 -XX:NewRatio=2 -XX:+PrintGCDateStamps -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails
After a long, long Google search, I found the problem.
It's actually a bug in Linux. Please refer to this Google Groups post.
This is really a big "ouch".
I run the Java application with the following configuration:
-Xmx512M
-Xms32M
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:Desktop/Gen/Gen/gc.log
-XX:+PrintGCTimeStamps
-XX:+PrintGCApplicationConcurrentTime
-XX:+PrintGC
But it is not printing GCTimeStamps in the gc.log file.
The gc.log file contents are as below:
Application time: 0.0272860 seconds
2015-01-23T17:18:14.054+0100: 0.731: [GC [PSYoungGen: 94627K->58213K(108928K)] 226219K->217525K(287744K), 0.0607860 secs] [Times: user=0.35 sys=0.12, real=0.06 secs]
2015-01-23T17:18:14.115+0100: 0.792: [Full GC [PSYoungGen: 58213K->38649K(108928K)] [PSOldGen: 159312K->178815K(251904K)] 217525K->217465K(360832K) [PSPermGen: 4237K->4237K(21248K)], 0.1840190 secs] [Times: user=0.18 sys=0.01, real=0.19 secs]
Total time for which application threads were stopped: 0.2449170 seconds
Application time: 0.0107920 seconds
From your log
Application time: 0.0272860 seconds
The line above is produced by -XX:+PrintGCApplicationConcurrentTime; it has no date stamp.
2015-01-23T17:18:14.054+0100: 0.731: [GC [PSYoungGen: 94627K->58213K(108928K)] 226219K->217525K(287744K), 0.0607860 secs] [Times: user=0.35 sys=0.12, real=0.06 secs]
2015-01-23T17:18:14.115+0100: 0.792: [Full GC [PSYoungGen: 58213K->38649K(108928K)] [PSOldGen: 159312K->178815K(251904K)] 217525K->217465K(360832K) [PSPermGen: 4237K->4237K(21248K)], 0.1840190 secs] [Times: user=0.18 sys=0.01, real=0.19 secs]
The lines above are produced by -XX:+PrintGCDetails; they do have date stamps.
Total time for which application threads were stopped: 0.2449170 seconds
The line above is produced by -XX:+PrintGCApplicationStoppedTime; it has no date stamp.
Application time: 0.0107920 seconds
The line above is produced by -XX:+PrintGCApplicationConcurrentTime; it has no date stamp.
In summary:
-XX:+PrintGCDateStamps works only with output produced by -XX:+PrintGCDetails.
Unfortunately, the output of -XX:+PrintGCApplicationConcurrentTime and -XX:+PrintGCApplicationStoppedTime cannot be prefixed with date stamps.
You can find more details about GC diagnostic options here.
Recently we upgraded the Java (1.6.0_18->1.6.0_38) and Tomcat (6.0.32->7.0.34) versions of a third party J2EE web application that runs in our production environment. We soon received alerts that the CPU on the server was spiking over 50% a couple of times a day. Upon further analysis, I observed that the spikes were taking place at the same time as the Concurrent Mark Sweep major gc’s, and that the total CPU time required to complete them had greatly increased, particularly in the CMS-concurrent-mark and CMS-concurrent-sweep phases:
Before:
2013-03-08T14:36:49.861-0500: 553875.681: [GC [1 CMS-initial-mark: 4152134K(8303424K)] 4156673K(8380096K), 0.0067893 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2013-03-08T14:36:49.868-0500: 553875.688: [CMS-concurrent-mark-start]
2013-03-08T14:36:55.682-0500: 553881.503: [GC 553881.503: [ParNew: 72675K->4635K(76672K), 0.0322031 secs] 4224809K->4157567K(8380096K), 0.0327540 secs] [Times: user=0.12 sys=0.01, real=0.03 secs]
2013-03-08T14:36:58.224-0500: 553884.045: [CMS-concurrent-mark: 8.320/8.356 secs] [Times: user=9.18 sys=0.02, real=8.36 secs]
2013-03-08T14:36:58.224-0500: 553884.045: [CMS-concurrent-preclean-start]
2013-03-08T14:36:58.276-0500: 553884.097: [CMS-concurrent-preclean: 0.051/0.052 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
2013-03-08T14:36:58.277-0500: 553884.097: [CMS-concurrent-abortable-preclean-start]
2013-03-08T14:37:01.458-0500: 553887.279: [GC 553887.279: [ParNew: 72795K->4887K(76672K), 0.0332472 secs] 4225727K->4158532K(8380096K), 0.0337703 secs] [Times: user=0.13 sys=0.00, real=0.03 secs]
CMS: abort preclean due to time 2013-03-08T14:37:03.296-0500: 553889.117: [CMS-concurrent-abortable-preclean: 1.462/5.020 secs] [Times: user=2.04 sys=0.02, real=5.02 secs]
2013-03-08T14:37:03.299-0500: 553889.119: [GC[YG occupancy: 22614 K (76672 K)]553889.120: [Rescan (parallel) , 0.0151518 secs]553889.135: [weak refs processing, 0.0356825 secs] [1 CMS-remark: 4153644K(8303424K)] 4176259K(8380096K), 0.0620445 secs] [Times: user=0.11 sys=0.00, real=0.06 secs]
2013-03-08T14:37:03.363-0500: 553889.183: [CMS-concurrent-sweep-start]
2013-03-08T14:37:07.248-0500: 553893.069: [GC 553893.069: [ParNew: 73047K->5136K(76672K), 0.0510894 secs] 3182253K->3115235K(8380096K), 0.0516111 secs] [Times: user=0.19 sys=0.00, real=0.05 secs]
2013-03-08T14:37:08.277-0500: 553894.097: [CMS-concurrent-sweep: 4.856/4.914 secs] [Times: user=5.67 sys=0.02, real=4.91 secs]
2013-03-08T14:37:08.277-0500: 553894.097: [CMS-concurrent-reset-start]
2013-03-08T14:37:08.325-0500: 553894.145: [CMS-concurrent-reset: 0.048/0.048 secs] [Times: user=0.07 sys=0.00, real=0.05 secs]
After:
2013-03-07T17:18:01.323-0500: 180055.128: [CMS-concurrent-mark: 10.765/20.646 secs] [Times: user=50.25 sys=3.32, real=20.65 secs]
2013-03-07T17:18:01.323-0500: 180055.128: [CMS-concurrent-preclean-start]
2013-03-07T17:18:01.401-0500: 180055.206: [CMS-concurrent-preclean: 0.076/0.078 secs] [Times: user=0.08 sys=0.00, real=0.08 secs]
2013-03-07T17:18:01.401-0500: 180055.206: [CMS-concurrent-abortable-preclean-start]
2013-03-07T17:18:03.074-0500: 180056.879: [GC 180056.880: [ParNew: 76670K->8512K(76672K), 0.1024039 secs] 5980843K->5922977K(8380096K), 0.1028797 secs] [Times: user=0.28 sys=0.04, real=0.10 secs]
2013-03-07T17:18:05.447-0500: 180059.253: [CMS-concurrent-abortable-preclean: 3.132/4.046 secs] [Times: user=3.94 sys=0.07, real=4.05 secs]
2013-03-07T17:18:05.448-0500: 180059.254: [GC[YG occupancy: 51161 K (76672 K)]180059.254: [Rescan (parallel) , 0.0243232 secs]180059.279: [weak refs processing, 0.2053571 secs] [1 CMS-remark: 5914465K(8303424K)] 5965627K(8380096K), 0.2569077 secs] [Times: user=0.33 sys=0.01, real=0.26 secs]
2013-03-07T17:18:05.706-0500: 180059.512: [CMS-concurrent-sweep-start]
2013-03-07T17:18:12.511-0500: 180066.316: [CMS-concurrent-sweep: 6.804/6.804 secs] [Times: user=13.98 sys=0.80, real=6.80 secs]
2013-03-07T17:18:12.511-0500: 180066.316: [CMS-concurrent-reset-start]
2013-03-07T17:18:12.558-0500: 180066.363: [CMS-concurrent-reset: 0.047/0.047 secs] [Times: user=0.11 sys=0.02, real=0.05 secs]
During these spikes, which lasted about a minute, the Tomcat server response time went from an average of 2ms to approximately 90 seconds. After 3 days in production, we rolled back the changes and have not seen a CPU spike since. Do you know of any changes in the JDK or Tomcat that may have caused this behavior? One note: this web application caches a very large amount of data in the heap (up to 3GB at startup).
Here are the JVM settings:
(Before) Tomcat 6 / JDK 1.6.0_18:
JAVA_HOME="/usr/local/java/jdk1.6.0_18"
JAVA_OPTS="$JAVA_OPTS -server -d64 -XX:PermSize=128m -XX:MaxPermSize=128m"
CATALINA_OPTS="$CATALINA_OPTS -Xms8192m -Xmx8192m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=50 -XX:+UseCMSInitiatingOccupancyOnly -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:/env/tomcat-instance/logs/gc.log -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=(omitted) -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
(After) Tomcat 7 / JDK 1.6.0_38:
JAVA_HOME="/usr/local/java/jdk1.6.0_38"
JAVA_OPTS="$JAVA_OPTS -server -d64 -XX:PermSize=128m -XX:MaxPermSize=128m"
CATALINA_OPTS="$CATALINA_OPTS -Xms8192m -Xmx8192m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=50 -XX:+UseCMSInitiatingOccupancyOnly -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:/env/tomcat-instance/logs/gc.log -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=(omitted) -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
Any help is very much appreciated.
Anecdotal feedback - we hit a serious memory leak bug in 6u2n that was not fixed until 7:
http://bugs.sun.com/view_bug.do?bug_id=7013538
http://bugs.sun.com/view_bug.do?bug_id=7042582
6u21 is the safest Java 6 JRE that I have experience with.
You upgraded both Tomcat and the JVM, so the spikes could be caused by either one of them. You can limit the number of threads for the GC.
-XX:ParallelGCThreads=12
If you run more than one JVM, make sure there are not more GC threads than cores. Look into JVM 1.7, too.
Use these flags to see the effective JVM parameters and look for changes (see the example after the list):
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintFlagsFinal
-XX:+LogVMOutput
-XX:LogFile=logs/jvm.log
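For example, you can also dump the effective flag values directly and filter for GC-related ones (the grep pattern is only illustrative):

java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep -i gc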
I'm currently having problems with very long garbage collection times. Please see the following. My current setup uses -Xms1g and -Xmx3g, and my application runs on Java 1.4.2. I don't have any garbage collection flags set. By the looks of it, 3 GB is not enough, and I really have a lot of objects to garbage collect.
Questions:
Should I change my garbage collection algorithm?
What should I use? Is it better to use -XX:+UseParallelGC or -XX:+UseConcMarkSweepGC?
Or should I use this combination:
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
The objects occupying the memory are largely report data, not cache data. Also, the machine has 16 GB of memory, and I plan to increase the heap to 8 GB.
What are the differences between the two options? I still find them hard to understand.
The machine has multiple processors. I can take pauses of up to 5 seconds, but 30 to 70 seconds is really hard.
Thanks for the help.
Line 151493: [14/Jan/2012:11:47:48] WARNING ( 8710): CORE3283: stderr: [GC 1632936K->1020739K(2050552K), 1.2462436 secs]
Line 157710: [14/Jan/2012:11:53:38] WARNING ( 8710): CORE3283: stderr: [GC 1670531K->1058755K(2050552K), 1.1555375 secs]
Line 163840: [14/Jan/2012:12:00:42] WARNING ( 8710): CORE3283: stderr: [GC 1708547K->1097282K(2050552K), 1.1503118 secs]
Line 169811: [14/Jan/2012:12:08:02] WARNING ( 8710): CORE3283: stderr: [GC 1747074K->1133764K(2050552K), 1.1017273 secs]
Line 175879: [14/Jan/2012:12:14:18] WARNING ( 8710): CORE3283: stderr: [GC 1783556K->1173103K(2050552K), 1.2060946 secs]
Line 176606: [14/Jan/2012:12:15:42] WARNING ( 8710): CORE3283: stderr: [Full GC 1265571K->1124875K(2050552K), 25.0670316 secs]
Line 184755: [14/Jan/2012:12:25:53] WARNING ( 8710): CORE3283: stderr: [GC 2007435K->1176457K(2784880K), 1.2483770 secs]
Line 193087: [14/Jan/2012:12:37:09] WARNING ( 8710): CORE3283: stderr: [GC 2059017K->1224285K(2784880K), 1.4739291 secs]
Line 201377: [14/Jan/2012:12:51:08] WARNING ( 8710): CORE3283: stderr: [Full GC 2106845K->1215242K(2784880K), 30.4016208 secs]
xaa:1: [11/Oct/2011:16:00:28] WARNING (17125): CORE3283: stderr: [Full GC 3114936K->2985477K(3114944K), 53.0468651 secs] --> garbage collection is occurring too often, as seen from the timestamps. The amount of garbage collected is quite low, and the occupancy after collection is quite close to the heap size. During the 53 seconds, this is effectively a pause.
xaa:2087: [11/Oct/2011:16:01:35] WARNING (17125): CORE3283: stderr: [Full GC 3114943K->2991338K(3114944K), 58.3776291 secs]
xaa:3897: [11/Oct/2011:16:02:33] WARNING (17125): CORE3283: stderr: [Full GC 3114940K->2997077K(3114944K), 55.3197974 secs]
xaa:5597: [11/Oct/2011:16:03:00] WARNING (17125): CORE3283: stderr: [Full GC[Unloading class sun.reflect.GeneratedConstructorAccessor119]
xaa:7936: [11/Oct/2011:16:04:36] WARNING (17125): CORE3283: stderr: [Full GC 3114938K->3004947K(3114944K), 55.5269911 secs]
xaa:9070: [11/Oct/2011:16:05:53] WARNING (17125): CORE3283: stderr: [Full GC 3114937K->3012793K(3114944K), 70.6993328 secs]
Since you have extremely long GC pauses, I don't think that changing the GC algorithm would help.
Note that it is highly suspicious that you have only full collections. Perhaps you need to increase the size of the young generation and/or the survivor space.
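For example, with a 3 GB heap you could explicitly reserve a larger young generation (the sizes below are purely illustrative and must be tuned against your own workload):

-Xms3g -Xmx3g -XX:NewSize=1g -XX:MaxNewSize=1g -XX:SurvivorRatio=6 -verbose:gc -XX:+PrintGCDetails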
See also:
Tuning Garbage Collection with the 1.4.2 Java[tm] Virtual Machine
Your heap is too small. The pause is so large because it's busy repeatedly scanning the entire heap desperately looking for anything to collect.
You need to do one or possibly more of the following:
find and fix a memory leak
tune the application to use less memory
configure the JVM to use a bigger heap
Are you tied to 1.4.2 for some reason? GC implementations really have moved on since then, so you should consider upgrading if possible. I realise this may be a non-trivial undertaking, but it's worth considering anyway.
If you have a high survival rate, your heap may be too large. The larger the heap, the longer the JVM can go without GC'ing, so once a collection hits, it has that much more to move around.
Step 1:
Make sure that you have set enough memory for your application.
Make sure that you don't have memory leaks in your application. The Eclipse Memory Analyzer Tool or VisualVM will help you identify leaks in your application.
Step 2:
If Step 1 did not reveal any issues with memory leaks, refer to the Oracle documentation on use cases for specific garbage collection algorithms in the "Java Garbage Collectors" section and the GC tuning article.
Since you have decided to configure larger heaps (>= 8 GB), G1GC should work fine for you. Refer to this related SE question on fine-tuning key parameters:
Java 7 (JDK 7) garbage collection and documentation on G1
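For example, a minimal G1 configuration for a large heap might look like this (the pause-time goal is only illustrative):

-Xms8g -Xmx8g -XX:+UseG1GC -XX:MaxGCPauseMillis=200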