How to measure Java GC Stop The World time?

I know we can get the GC duration from the GarbageCollectionNotificationInfo object, but the duration there seems to be the entire collection duration (e.g., I once saw 5+ seconds), which can be much larger than the actual stop-the-world pause (typically less than 1 second in my experience). Is there any way to get the actual stop-the-world pause duration? Either calculated somehow from the available sources (I don't think GarbageCollectionNotificationInfo provides those details, but I could be wrong) or through some other means? I know the jstat tool prints the FGCT column, which seems to reflect exactly the stop-the-world pause time; how is that measured? Thanks in advance!

To get all STW pauses in the VM log output you need to pass the following two options. This includes non-GC safepoints.
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
Alternatively there's -XX:+PrintGCApplicationStoppedTime
Keep in mind that non-safepoint things can induce pauses too (e.g. the kernel's thread scheduler). There's jHiccup to measure those.
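For reference, here is a minimal sketch of the GarbageCollectionNotificationInfo listener the question refers to; it uses the com.sun.management notification API. Keep in mind that the duration it reports is the collector's own duration, which for a mostly concurrent collector covers the whole cycle, not just the stop-the-world pauses.

import com.sun.management.GarbageCollectionNotificationInfo;
import javax.management.NotificationEmitter;
import javax.management.openmbean.CompositeData;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcNotificationExample {
    public static void main(String[] args) throws Exception {
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) {
            NotificationEmitter emitter = (NotificationEmitter) gcBean;
            emitter.addNotificationListener((notification, handback) -> {
                if (GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                        .equals(notification.getType())) {
                    GarbageCollectionNotificationInfo info = GarbageCollectionNotificationInfo
                            .from((CompositeData) notification.getUserData());
                    // Duration reported by the collector, in milliseconds. For a concurrent
                    // collector this spans the whole cycle, not only the pauses.
                    System.out.println(info.getGcName() + " / " + info.getGcAction()
                            + ": " + info.getGcInfo().getDuration() + " ms");
                }
            }, null, null);
        }
        // Allocate some garbage (and keep references) so collections actually happen while we listen.
        byte[][] keep = new byte[16][];
        for (int i = 0; i < 1_000_000; i++) {
            keep[i % keep.length] = new byte[1024];
        }
        Thread.sleep(1000);
    }
}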

Related

What is the best way to measure GC pause times?

What is the best way to track the GC pause/stall times in a Java instance?
Can it be retrieved from GarbageCollectorMXBean?
Can it be read from gc.log?
Does gc.log have any additional information that the MXBean doesn't have? I would prefer option 1 because I want the instance to emit that metric to our monitoring system.
I have read through a few posts like this on SO, but I don't seem to be getting the right answer. I am specifically looking for the GC stall times and not the total time spent on GC.
Garbage collection is not the only reason for JVM stop-the-world pauses.
You may want to count other reasons, too.
The first way to monitor safepoint pauses is to parse VM logs:
for JDK 8 and earlier add -XX:+PrintGCApplicationStoppedTime JVM option;
starting from JDK 9 add -Xlog:safepoint.
Then look for Total time for which application threads were stopped messages in the log file.
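A minimal sketch of summing those pauses from a log file follows. It assumes the JDK 8 style "Total time for which application threads were stopped: N seconds" wording; adjust the regex if your JVM prints a different format.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoppedTimeSummer {
    // Assumes the -XX:+PrintGCApplicationStoppedTime message format.
    private static final Pattern STOPPED = Pattern.compile(
            "Total time for which application threads were stopped: ([0-9.]+) seconds");

    public static void main(String[] args) throws IOException {
        double totalSeconds = 0;
        double worstSeconds = 0;
        for (String line : Files.readAllLines(Paths.get(args[0]))) {
            Matcher m = STOPPED.matcher(line);
            if (m.find()) {
                double seconds = Double.parseDouble(m.group(1));
                totalSeconds += seconds;
                worstSeconds = Math.max(worstSeconds, seconds);
            }
        }
        System.out.printf("total stopped: %.3f s, worst pause: %.3f s%n", totalSeconds, worstSeconds);
    }
}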
The second way is to use an undocumented HotSpot internal MXBean:

// Undocumented sun.management API: cumulative time and count of all safepoint pauses.
sun.management.HotspotRuntimeMBean runtime =
        sun.management.ManagementFactoryHelper.getHotspotRuntimeMBean();
System.out.println("Safepoint time: " + runtime.getTotalSafepointTime() + " ms");
System.out.println("Safepoint count: " + runtime.getSafepointCount());
It gives you the cumulative time of all JVM pauses. See the discussion in this answer.
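To emit this to a monitoring system, one sketch (using the same undocumented MBean, which on newer JDKs may require --add-exports java.management/sun.management=ALL-UNNAMED) is to sample the cumulative counter periodically and report the delta; the println below is a placeholder for your metrics client.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SafepointTimeReporter {
    public static void main(String[] args) {
        sun.management.HotspotRuntimeMBean runtime =
                sun.management.ManagementFactoryHelper.getHotspotRuntimeMBean();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long[] lastTotal = {runtime.getTotalSafepointTime()};
        scheduler.scheduleAtFixedRate(() -> {
            long total = runtime.getTotalSafepointTime(); // cumulative ms spent in safepoints
            long delta = total - lastTotal[0];            // ms paused during the last interval
            lastTotal[0] = total;
            // Placeholder: replace with a call to your monitoring client.
            System.out.println("safepoint_time_ms_last_minute=" + delta);
        }, 1, 1, TimeUnit.MINUTES);
    }
}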
I am specifically looking for the GC stall times
There is more to stall times than GC itself. Time to bring threads to the safepoint is also an application stall and is only available through logging, not through MXBeans.
But really, if you're concerned about application stalls, then neither GC pauses nor overall safepoint time is what you should actually measure. You should measure the stalls themselves, e.g. via jHiccup.

Java Execution Time Peaks within a Loop

I have a loop with 701 iterations of similar complex calculations. I measured the execution time of each iteration over three runs. As you can see in the chart, I'm getting strange peaks. Is there any common approach that can explain these peaks without analyzing the code inside the loop?
[chart: execution time per iteration]
Is it possible that the GC is starting at these points and slowing down the other parts?
It depends.
If you don't want to analyze the code inside the loop, you at least have to analyze memory usage during execution. For example, if the algorithm does not create much garbage, you have configured enough heap, and you have chosen the right GC algorithm, then the GC does not require any stop-the-world pause.
First, activate GC logging (see here: http://www.oracle.com/technetwork/articles/javase/gcportal-136937.html) and check whether the GC peaks coincide with your time peaks.
If you want to analyze the GC log, you can use this tool: http://www.tagtraum.com/gcviewer.html.
Then follow the link posted by Turing85 about micro-benchmarks; it is more complete.
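One way to check the GC hypothesis without reading the GC log is to sample the standard GarbageCollectorMXBeans around each iteration and flag the iterations during which a collection ran. This is only a sketch; doComplexCalculation is a placeholder for the real loop body.

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class GcCorrelatedLoop {
    // Sum of collection counts across all collectors.
    private static long totalGcCount(List<GarbageCollectorMXBean> beans) {
        long count = 0;
        for (GarbageCollectorMXBean bean : beans) {
            count += bean.getCollectionCount();
        }
        return count;
    }

    public static void main(String[] args) {
        List<GarbageCollectorMXBean> gcBeans = ManagementFactory.getGarbageCollectorMXBeans();
        for (int i = 0; i < 701; i++) {
            long gcBefore = totalGcCount(gcBeans);
            long start = System.nanoTime();
            doComplexCalculation(i);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            long gcDuring = totalGcCount(gcBeans) - gcBefore;
            System.out.println("iteration " + i + ": " + elapsedMs + " ms, collections during it: " + gcDuring);
        }
    }

    private static void doComplexCalculation(int i) {
        // stand-in for the real work
    }
}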

jHiccup analysis doesn't add up

I have the following jHiccup result.
Obviously there are huge peaks of few secs in the graph. My app outputs logs every 100 ms or so. When I read my logs I never see such huge pauses. Also I can check the total time spent in GC from the JVM diagnostics and it says the following:
Time:       2013-03-12 01:09:04
Used:       1,465,483 kbytes
Committed:  2,080,128 kbytes
Max:        2,080,128 kbytes
GC time:    2 minutes on ParNew (4,329 collections)
            8.212 seconds on ConcurrentMarkSweep (72 collections)
The total big-GC time is around 8 seconds spread over 72 separate collections. All of them are below 200ms per my JVM hint to limit the pauses.
On the other hand I observed exactly one instance of network response time of 5 seconds in my independent network logs (wireshark). That implies the pauses exist, but they are not GC and they are not blocked threads or something that can be observed in profiler or thread dumps.
My question is what would be the best way to debug or tune this behavior?
Additionally, I'd like to understand how jHiccup does the measurement. Obviously it is not GC pause time.
Glad to see you are using jHiccup, and that it seems to show reality-based hiccups.
jHiccup observes "hiccups" that would also be seen by application threads running on the JVM. It does not glean the reason; it just reports the fact. Reasons can be anything that would cause a process to not run perfectly ready-to-run code: GC pauses are a common cause, but a temporary ^Z at the keyboard, or one of those "live migration" things across virtualized hosts, would be observed just as well. There is a multitude of possible reasons, including scheduling pressure at the OS or hypervisor level (if one exists), power management craziness, swapping, and many others. I've seen Linux file system pressure and Transparent Huge Page "background" defragmentation cause multi-second hiccups as well.
A good first step at isolating the cause of the pause is to use the "-c" option in jHiccup: it launches a separate control process (with an otherwise idle workload). If both your application and the control process show hiccups that are roughly correlated in size and time, you'll know you are looking for a system-level (as opposed to process-local) reason. If they do not correlate, you'll know to suspect the insides of your JVM, which most likely indicates your JVM paused for something big; either GC or something else, like lock debiasing or a class-loading-derived deoptimization, which can take a really long [and often unreported in logs] time on some JVMs if time-to-safepoint is long for some reason (and on most JVMs, there are many possible causes for a long time-to-safepoint).
jHiccup's measurement is so dirt-simple that it's hard to get wrong. The entire thing is less than 650 lines of Java code, so you can look at the logic for yourself. jHiccup's HiccupRecorder thread repeatedly goes to sleep for 1 msec, and when it wakes up it records any difference in time (from before the sleep) that is greater than 1 msec as a hiccup. The simple assumption is that if one ready-to-run thread (the HiccupRecorder) did not get to run for 5 seconds, other threads in the same process also saw a similarly sized hiccup.
As you note above, jHiccup's observations seem to be corroborated by your independent network logs, where you saw a 5-second response time. Note that not all hiccups would have been observed in the network logs, since only requests actually made during a hiccup would have been seen by a network logger. In contrast, no hiccup larger than ~1 msec can hide from jHiccup, since it attempts a wakeup 1,000 times per second even with no other activity.
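For illustration, here is a stripped-down sketch of that measurement idea. This is not jHiccup itself, just the basic sleep-and-diff loop described above.

public class MiniHiccupRecorder implements Runnable {
    private static final long RESOLUTION_NANOS = 1_000_000; // sleep 1 ms between samples

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            long before = System.nanoTime();
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                return;
            }
            long excessNanos = System.nanoTime() - before - RESOLUTION_NANOS;
            if (excessNanos > RESOLUTION_NANOS) {
                // Anything that kept this ready-to-run thread from waking up on time
                // (GC pause, scheduling, swapping, ...) shows up here as a "hiccup".
                System.out.printf("hiccup of %.1f ms%n", excessNanos / 1_000_000.0);
            }
        }
    }

    public static void main(String[] args) {
        new Thread(new MiniHiccupRecorder(), "mini-hiccup-recorder").start();
    }
}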
This may not be GC, but before you rule out GC, I'd suggest you look into the GC logging a bit more. To start with, a JVM hint to limit pauses to 200 msec is useless on all known JVMs; a pause hint is the equivalent of saying "please". In addition, don't believe your GC logs unless you include -XX:+PrintGCApplicationStoppedTime in the options (and suspect them even then). There are pauses, and parts of pauses, that can be very long and go unreported unless you include this flag. E.g., I've seen pauses caused by the occasional long-running counted loop taking 15 seconds to reach a safepoint, where GC reported only the 0.08-second part of the pause where it actually did some work. There are also plenty of pauses whose causes are not considered part of "GC" and can thereby go unreported by GC logging flags.
-- Gil. [jHiccup's author]

How to tune a jvm to crash instead of heroically GC till 100% CPU utilization?

We have a JVM process that infrequently pegs the CPU at 100%, with what appears to be (according to visualgc) a very nearly exhausted heap. Our supposition is that the process is heroically GC'ing causing a CPU spike, which is affecting the overall health of the entire system (consisting of other JVMs doing different things).
This process is not critical and can be restarted. Is there a way to tune the JVM via the command line which starts it to make it fall on its own sword rather than it keep GC'ing and causing the entire box to suffer?
Of note is that we are not getting OutOfMemoryErrors, so the heap isn't TOTALLY exhausted, but just barely not, we think.
Alternatively, something to give us some insight as to what in the JVM is actually using the CPU in the way that it is to confirm/deny our GC supposition?
We can get statistics as follows:
1) The option -XX:+PrintGCTimeStamps adds a timestamp at the start of each collection. This is useful for seeing how frequently garbage collections occur.
With the above option you can get a rough indication of whether your supposition, that the process is heroically GC'ing and causing a CPU spike, is right or not.
If your supposition is right, then start tuning your GC.
Both the parallel collector and the concurrent collector will throw an OutOfMemoryError if too much time is being spent in garbage collection: if more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, an OutOfMemoryError is thrown. This behavior (the option -XX:+UseGCOverheadLimit) is enabled by default for both the parallel and concurrent collectors; check whether it has been disabled on your system.
For more information about GC tuning in the JVM refer to this, and for VM debugging options check this.
The parallel and concurrent collectors have an "overhead limit" that might do what you want:
if more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, an OutOfMemoryError will be thrown
See http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html for more information.
The best thing to do is to find out the memory leak and fix it.
A simple way to exit on high memory usage:
// Note: totalMemory() is the heap currently committed to the JVM, not the portion in use.
if (Runtime.getRuntime().totalMemory() > 100 * 1024 * 1024) {
    System.exit(0);
}
Try to look at what is currently going on inside the JVM:
with jstack you can make a thread dump (there are other ways to do that as well)
with jvisualvm you could peek into the current state of the JVM (takes some resources)
also turn on verbosegc (to prove your assumption that GC is frequent)
You need to find a way to gather some statistics about GC work. There are several methods to do this; rather than copy-pasting, here is a link to a similar question:
Can you get basic GC stats in Java?
From there you can work out how to analyze those statistics and decide when the GC is constantly active.
Because this question adds the idea of acting on GC statistics, I don't think it is a duplicate.
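As a sketch of that idea, using only the standard GarbageCollectorMXBean counters: sample the cumulative collection time periodically and terminate the JVM if GC has consumed most of the recent wall-clock time. The class name, thresholds, and install() entry point are illustrative; call it once at application startup, e.g. GcOverheadWatchdog.install(10_000, 0.9).

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class GcOverheadWatchdog {
    // Exits the JVM if GC used more than maxGcFraction of any intervalMs window.
    public static void install(long intervalMs, double maxGcFraction) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "gc-overhead-watchdog");
            t.setDaemon(true); // do not keep the JVM alive on our account
            return t;
        });
        long[] lastGcTimeMs = {currentGcTimeMs()};
        scheduler.scheduleAtFixedRate(() -> {
            long gcTimeMs = currentGcTimeMs();
            double fraction = (gcTimeMs - lastGcTimeMs[0]) / (double) intervalMs;
            lastGcTimeMs[0] = gcTimeMs;
            if (fraction > maxGcFraction) {
                System.err.println("GC used " + (int) (fraction * 100)
                        + "% of the last " + intervalMs + " ms, exiting");
                System.exit(1);
            }
        }, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    // Cumulative collection time across all collectors, in milliseconds.
    private static long currentGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += bean.getCollectionTime();
        }
        return total;
    }
}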

Methods of limiting emulated cpu speed

I'm writing a MOS 6502 processor emulator as part of a larger project I've undertaken in my spare time. The emulator is written in Java, and before you say it, I know it's not going to be as efficient and optimized as if it were written in C or assembly, but the goal is to make it run on various platforms, and it's pulling 2.5 MHz on a 1 GHz processor, which is pretty good for an interpreted emulator. My problem is quite the contrary: I need to limit the number of cycles to 1 MHz. I've looked around but not seen many strategies for doing this. I've tried a few things, including checking the time after a number of cycles and sleeping for the difference between the expected and the actual elapsed time, but checking the time slows down the emulation by a factor of 8. Does anyone have any better suggestions, or perhaps ways to optimize time polling in Java to reduce the slowdown?
The problem with using sleep() is that you generally only get a granularity of 1ms, and the actual sleep that you will get isn't necessarily even accurate to the nearest 1ms as it depends on what the rest of the system is doing. A couple of suggestions to try (off the top of my head-- I've not actually written a CPU emulator in Java):
stick to your idea, but check the time between a large-ish number of emulated instructions (execution is going to be a bit "lumpy" anyway especially on a uniprocessor machine, because the OS can potentially take away the CPU from your thread for several milliseconds at a time);
as you want to execute in the order of 1000 emulated instructions per millisecond, you could also try just hanging on to the CPU between "instructions": have your program periodically work out by trial and error how many runs through a loop it needs to go between instructions to "waste" enough CPU to make the timing work out at 1 million emulated instructions / sec on average (you may want to see if setting your thread to low priority helps system performance in this case).
I would use System.nanoTime() in a busy wait as #pst suggested earlier.
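A sketch of that approach, pacing the emulator in batches so the System.nanoTime() calls are amortized over many cycles. The Cpu6502 interface and its step() method are placeholders for the real emulator core.

public class ThrottledCpu {
    private static final long TARGET_HZ = 1_000_000; // 1 MHz
    private static final long BATCH_CYCLES = 1_000;  // check the clock once per 1000 cycles

    public static void run(Cpu6502 cpu) {
        final long nanosPerBatch = BATCH_CYCLES * 1_000_000_000L / TARGET_HZ; // ~1 ms
        long nextDeadline = System.nanoTime() + nanosPerBatch;
        while (true) {
            long executed = 0;
            while (executed < BATCH_CYCLES) {
                executed += cpu.step(); // returns the cycle count of the executed instruction
            }
            // Busy-wait until the batch's deadline; cheaper per cycle than calling
            // nanoTime() after every instruction, and finer-grained than Thread.sleep().
            while (System.nanoTime() < nextDeadline) {
                // spin
            }
            nextDeadline += nanosPerBatch;
        }
    }

    // Placeholder for the emulator core; step() executes one instruction.
    interface Cpu6502 {
        int step();
    }
}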
You can speed up the emulation by generating byte code. Most instructions should translate quite well and you can add a busy wait call so each instruction takes the amount of time the original instruction would have done. You have an option to increase the delay so you can watch each instruction being executed.
To make it really cool you could generate 6502 assembly code as text with matching line numbers in the byte code. This would allow you to use the debugger to step through the code, breakpoint it and see what the application is doing. ;)
A simple way to emulate the memory is to use direct ByteBuffer or native memory with the Unsafe class to access it. This will give you a block of memory you can access as any data type in any order.
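A minimal sketch of the direct-ByteBuffer idea for the 6502's 64 KB address space; the class and method names are illustrative.

import java.nio.ByteBuffer;

public class Memory6502 {
    // One flat 64 KB address space, backed by native (off-heap) memory.
    private final ByteBuffer ram = ByteBuffer.allocateDirect(0x10000);

    public int readByte(int address) {
        return ram.get(address & 0xFFFF) & 0xFF;   // unsigned 8-bit read
    }

    public void writeByte(int address, int value) {
        ram.put(address & 0xFFFF, (byte) value);
    }

    // Copy a program or ROM image into memory starting at the given address.
    public void load(int address, byte[] image) {
        for (int i = 0; i < image.length; i++) {
            writeByte(address + i, image[i]);
        }
    }
}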
You might be interested in examining the Java Apple Computer Emulator (JACE), which incorporates 6502 emulation. It uses Thread.sleep() in its TimedDevice class.
Have you looked into creating a Timer object that goes off at the cycle length you need? You could have the timer itself initiate the next loop.
Here is the documentation for the Java 6 version:
http://download.oracle.com/javase/6/docs/api/java/util/Timer.html
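A sketch of that idea: java.util.Timer has roughly millisecond resolution, so each tick has to run a batch of emulated cycles rather than a single one. The stepCpu() method is a stand-in for the real emulator core.

import java.util.Timer;
import java.util.TimerTask;

public class TimerPacedEmulator {
    public static void main(String[] args) {
        // At 1 MHz with a 1 ms tick, each tick must account for ~1000 emulated cycles.
        final long cyclesPerTick = 1_000;
        Timer timer = new Timer("cpu-clock"); // non-daemon, keeps the JVM running
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                long executed = 0;
                while (executed < cyclesPerTick) {
                    executed += stepCpu();
                }
            }
        }, 0, 1); // fire every 1 ms
    }

    // Stand-in for executing one instruction and returning its cycle count.
    private static int stepCpu() {
        return 2;
    }
}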
