Extremely long garbage collection times

We have a web application running Java 6, Tomcat 6, Spring Framework 3, Hibernate 4, EhCache.
We have a problem with extremely long garbage collection times which can take 30 seconds or longer, leaving the application unresponsive.
We're currently in testing but apart from the obvious: add more memory, I was wondering if there are aspects we could tune to reduce garbage collection time.
The major contributor to memory use is EhCache, as we are caching aggressively. But I always find it hard to size the EhCache stores (the new EhCache byte-size stores lead to all sorts of problems for us because the cached object graphs can be quite large).
These are my settings for the JVM
JAVA_OPTS="$JAVA_OPTS -server -Xms256m -Xmx704m -XX:OnOutOfMemoryError=/usr/share/scripts/on_server_crash.sh -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/tomcat6 -XX:MaxPermSize=192m -XX:+UseConcMarkSweepGC"

To reduce GC times, the best thing you can do is use off-heap memory. If you can move as much of your large data as possible off the heap, you can reduce your full GC time to as low as 10 milliseconds, even with 100s of MB of off-heap memory. I believe Ehcache supports off-heap data stores, but if it doesn't or you can't use it, I suggest you look at alternatives which do.
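A minimal sketch of what "off heap" means in plain Java, independent of Ehcache (whose own off-heap store configuration is not shown here). The class name OffHeapLongStore is hypothetical. The data sits in a direct ByteBuffer outside the Java heap, so the collector only has to track the small buffer object, not its contents.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration: a fixed-size array of longs kept outside the Java heap.
final class OffHeapLongStore {
    private final ByteBuffer buffer;

    OffHeapLongStore(int capacity) {
        // allocateDirect reserves native memory; it is limited by
        // -XX:MaxDirectMemorySize rather than by -Xmx.
        this.buffer = ByteBuffer.allocateDirect(capacity * Long.BYTES);
    }

    void put(int index, long value) {
        buffer.putLong(index * Long.BYTES, value);
    }

    long get(int index) {
        return buffer.getLong(index * Long.BYTES);
    }

    boolean isOffHeap() {
        return buffer.isDirect();
    }

    public static void main(String[] args) {
        OffHeapLongStore store = new OffHeapLongStore(1_000_000); // about 8 MB off heap
        store.put(42, 123456789L);
        System.out.println(store.get(42));
    }
}
```

The trade-off is that objects have to be serialized into and out of the buffer, which is the work an off-heap cache store does for you.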
Given you only have a 700 MB maximum memory size it appears you are running on a server with very limited memory. Otherwise I would suggest you start with a maximum of 8 or 16 GB and reduce the memory size if you believe you don't really need it.

There is an excellent tool from the Foursquare folks: the Foursquare Heap tool. Check the link and the quick example they have. Based on the diagnostics you find with any of the above-mentioned tools, the most straightforward way to resolve the issue will be either to add more RAM or to add CPU power. If you are open to some infrastructure changes, check Zing from Azul Systems. But I think the second option might be a stretch.

Related

Java Heap Dump : How to find the objects/class that is taking memory by 1. io.netty.buffer.ByteBufUtil 2. byte[] array

I found that one of my spring boot project's memory (RAM consumption) is increasing day by day. When I uploaded the jar file to the AWS server, it was taking 582 MB of RAM (Max Allocated RAM is 1500 MB), but each day, the RAM is increasing by 50MB to 100 MB and today after 5 days, it's taking 835 MB. Right now the project is having 100-150 users and with normal usage of Rest APIs.
Because of this increase in the RAM, couple of times the application went down with the following error (error found from the logs):
Exception in thread "http-nio-3384-ClientPoller" java.lang.OutOfMemoryError: Java heap space
So to resolve this, I found that with a Java heap dump I can find the objects/classes that are taking the memory. Using jmap on the command line, I created a heap dump and uploaded it to Heap Hero and the Eclipse Memory Analyzer Tool. In both of them I found the following:
1. Total wasted memory is 64.69 MB (73%) (check below screenshot).
2. Out of this, 34.06 MB is taken by byte[] arrays and LinkedHashMap[] (check below screenshot), which I have never used in my whole project. I searched for them in my project but didn't find them.
3. The following 2 large objects take 32 MB and 20 MB respectively:
   1. Java Static io.netty.buffer.ByteBufUtil.DEFAULT_ALLOCATOR
   2. Java Static com.mysql.cj.jdbc.AbandonedConnectionCleanupThread.connectionFinalizerPhantomRefs
So I tried to find this netty.buffer in my project, but I didn't find anything matching netty or buffer.
Now my question is: how can I fix this memory leak, or how can I find the exact objects/classes/variables consuming the memory so that I can reduce the heap usage?
I know a few of the experts will ask for the source code or something similar, but I believe that from the heap dump we can find the memory leak or the live objects in memory. I am looking for that option, or anything that reduces the heap usage!
I have been working on this issue for the past 3 weeks. Any help would be appreciated.
Thank you!

Start by enabling the JVM native memory tracker, adding the flag -XX:NativeMemoryTracking=summary, to get an idea of which part of the memory is increasing. There is some performance overhead according to the documentation (5-10%), but if this isn't an issue I would recommend running the JVM with this flag enabled even in production.
Then you can check the values using jcmd <PID> VM.native_memory (there's a good writeup in this answer: Java native memory usage)
If there is indeed a big chunk of native memory allocated, it's likely this is allocated by Netty.
How do you run your application in AWS? If it's running in a Docker image, you might have stumbled upon this issue: What would cause a java process to greatly exceed the Xmx or Xss limit?
If this is the case, you may need to set the environment variable MALLOC_ARENA_MAX if your application is using native memory (which Netty does) and running on a server with a large number of cores. It's perfectly possible that the JVM allocates this memory for Netty but doesn't see any reason to release it, so it will appear to only continue to grow.
If you want to control how much native memory can be allocated by Netty, you can use the JVM flag -XX:MaxDirectMemorySize for this (I believe the default is the same value as Xmx) and lower it in case your application doesn't require that much memory.
JVM memory tuning is a complex process and it becomes even more complex when native memory is involved - as the linked answer shows it's not as easy as simply setting the Xms and Xmx flag and expecting that no more memory will be used.
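As a complement to the native memory tracker, direct buffer usage can also be read from inside the process through the standard BufferPoolMXBean. This is only a sketch: the class name DirectMemoryReport is hypothetical, and some libraries allocate native memory through paths this pool does not count, so treat the figure as a lower bound.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: reports how much memory NIO buffer pools currently hold.
final class DirectMemoryReport {

    // Bytes used by the "direct" buffer pool, or -1 if the JVM does not expose it.
    static long directMemoryUsed() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Print every pool (typically "direct" and "mapped") with its counters.
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s: count=%d used=%d bytes capacity=%d bytes%n",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed(), pool.getTotalCapacity());
        }
    }
}
```

Logging this value periodically alongside heap usage helps show whether the growth is on the heap or in native memory.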

A heap dump alone is not enough to detect memory leaks.
You need to look at the difference between two consecutive heap snapshots, both taken after triggering a GC.
Or you need a profiling tool that can give the generation count for each class.
Then you should only look at your domain objects (not generic objects like bytes or strings, etc.) that survived the GC and passed from the old snapshot to the new one.
Or, if using the profiling tool, look for old domain objects that are still alive and growing over many generations.
Objects that live for many generations and keep growing are still referenced, so the GC is not able to reclaim them. However, living for many generations alone is not enough to indicate a leak, because cached or static objects may stay for many generations. The other important factor is that they keep growing.
After you have detected which object is being leaked, you can use a heap dump to analyse those objects and get the references.
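The snapshot comparison described above can be sketched as a diff of two per-class instance counts. The counts could come, for example, from two runs of jmap -histo:live <pid>, which collects garbage before counting. The class HistogramDiff and the com.example class names are hypothetical, and parsing the jmap output is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: finds classes whose instance count grew between two snapshots.
final class HistogramDiff {

    // Returns class name -> growth, only for classes that have more instances
    // in the newer snapshot than in the older one.
    static Map<String, Long> growth(Map<String, Long> older, Map<String, Long> newer) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> entry : newer.entrySet()) {
            long before = older.getOrDefault(entry.getKey(), 0L);
            long delta = entry.getValue() - before;
            if (delta > 0) {
                result.put(entry.getKey(), delta);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up counts for illustration only.
        Map<String, Long> first = new HashMap<>();
        first.put("com.example.Order", 100L);
        first.put("com.example.Session", 50L);

        Map<String, Long> second = new HashMap<>();
        second.put("com.example.Order", 250L);
        second.put("com.example.Session", 50L);

        System.out.println(growth(first, second));
    }
}
```

A class that shows up in this diff across several consecutive snapshots is the kind of candidate worth inspecting in the heap dump.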

JVM performance with these garbage collection settings

I have an enterprise level Java application that serves a few thousand users per day. This is a JAXB web service on weblogic 10.3.6 (Java 1.6 JVM), using Hibernate to hit an Oracle database. It also calls other web services.
We have tuned it with the following GC settings on our production system:
-server -Xms2048m -Xmx2048m -XX:PermSize=512m -XX:MaxPermSize=512m
What is the effect of this GC sizing? The hardware has more than enough capacity to handle it.
I know that this sets the heap size and perm gen at a stable level. But what's the impact of that when you eventually have to do garbage collection?
To me it seems that it would make GC happen less frequently, but take longer when it does happen. Does that sound correct?

I would say please monitor the GC before deciding on the sizing, as you never know how the application will behave under load. Have a look at this link and this one; they have some good references about GC and tools to calculate it.

it would make GC happen less frequently, but take longer when it does happen
It might; it depends on your use case. You might even find that the GC is shorter in rare cases.
A 2 GB heap isn't that much and I would use up to 26 GB without worrying about heap size. Above this size memory accesses are a little slower or use more memory.

Setting -Xmx & -Xms and PermSize & MaxPermSize to equal sizes will stop the JVM from resizing the heaps based on your requirement. These resizes are expensive as they trigger a Full GC.
-server will allow the JVM to make use of the Server Compiler, which will do more aggressive optimizations before compiling your code to native assembly instructions. Nowadays, though, any machine with 2 or more cores and 2 GB+ of memory will have the server compiler on by default.
Increasing the memory doesn't always fix a problem. Sometimes adding more memory will be an overhead.
If you need details regarding GC, you can try this link
The very reason to tune something is to improve your application's performance and thereby achieve your throughput and latency goals.

Appropriate JVM/GC tuning for 4GB JVM with 3GB cache

I am looking for the appropriate settings to configure the JVM for a web application. I have read about the old/young/perm generations, but I have trouble making the best use of those parameters for this configuration.
Out of the 4 GB, around 3 GB are used for a cache (application-level cache using EhCache), so I'm looking for the best setup considering that. FYI, the cache is static during the lifetime of the application (loaded from disk, never expires), but heavily used.
I have profiled my application already, and I have performed optimization regarding the DB queries, the application's architecture, the cache size, etc. I am just looking for JVM configuration advice here. I have measured 99% throughput for the garbage collector, and 6-8 s pauses when the Full GC runs (approximately once every half hour).
Here are the current JVM parameters:
-XX:+UseParallelGC -XX:+AggressiveHeap -Xms2048m -Xmx4096m
-XX:NewSize=64m -XX:PermSize=64m -XX:MaxPermSize=512m
-verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log
Those parameters may be completely off because they have been written a long time ago... Before the application became that big.
I am using Java 1.5 64 bits.
Do you see any possible improvements?
Edit: the machine has 4 cores.

-XX:+UseParallelOldGC should speed up the Full GCs on a multi-core machine.
You could also profile with different NewRatio values. Your cached objects will live in the tenured generation so profile it with -XX:NewRatio=7 and then again with some higher and lower values.
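To see what a given NewRatio means for this setup, here is a small arithmetic sketch. In HotSpot, -XX:NewRatio=N makes the old generation N times the size of the young generation, so the young generation gets heap / (N + 1). The class name NewRatioMath is hypothetical, and the real JVM aligns and adjusts these sizes, so the numbers are approximate.

```java
// Hypothetical helper: approximate generation sizes implied by -XX:NewRatio.
final class NewRatioMath {

    // Young generation size: heap / (newRatio + 1).
    static long youngGenBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    // Old (tenured) generation size: whatever the young generation does not take.
    static long oldGenBytes(long heapBytes, int newRatio) {
        return heapBytes - youngGenBytes(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long heap = 4096L * 1024 * 1024; // -Xmx4096m from the question
        for (int ratio : new int[] {3, 7, 15}) {
            System.out.printf("NewRatio=%d: young=%d MB, old=%d MB%n",
                    ratio,
                    youngGenBytes(heap, ratio) >> 20,
                    oldGenBytes(heap, ratio) >> 20);
        }
    }
}
```

With a 4096 MB heap, NewRatio=7 leaves about 3584 MB for the old generation, while NewRatio=3 leaves about 3072 MB, which a 3 GB cache would fill almost completely.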
You may not be able to accurately replicate realistic use during profiling, so make sure you monitor GC when it is in real life use and then you can make minor changes (e.g. to survivor space etc) and see what effect they have.
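For monitoring in real-life use, the GC log is the primary source, but the standard GarbageCollectorMXBean also exposes cumulative counts and times from inside the running JVM. A minimal sketch, with the hypothetical class name GcThroughput; note that the reported time is accumulated collection time, which for concurrent collectors is not the same as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: estimates GC throughput from the JVM's own counters.
final class GcThroughput {

    // Fraction of elapsed time not spent in GC, e.g. 0.99 for "99% throughput".
    static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0;
        }
        return 1.0 - (double) gcMillis / uptimeMillis;
    }

    // Sum of accumulated collection time over all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC throughput so far: %.2f%%%n",
                100.0 * throughput(totalGcMillis(), uptime));
    }
}
```

Sampling these counters before and after a change to NewRatio or the survivor spaces gives a quick comparison without parsing gc.log.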
Old advice was not to use AggressiveHeap with Xms and Xmx, I am not sure if that is still true.
Edit: Please let us know which OS/hardware platform you are deployed on.
Full collections every 30 mins indicates the old generation is quite full. A high value for newRatio will give it more space at the expense of the young gen. Can you give the JVM more than 4g or are you limited to that?
It would also be useful to know what your goals / non functional requirements are. Do you want to avoid these 6 / 7 second pauses at the risk of lower throughput or are those pauses an acceptable compromise for highest possible throughput?
If you want to minimise the pauses, try the CMS collector by removing both
-XX:+UseParallelGC -XX:+UseParallelOldGC
and adding
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
Profile with that with various NewRatio values and see how you get on.
One downside of the CMS collector is that unlike the parallel old and serial collectors, it doesn't compact the old generation. If the old generation gets too fragmented and a minor collection needs to promote a lot of objects to the old gen at once, a full serial collection may be invoked which could mean a long pause. (I've seen this once in prod but with the IBM JVM which went out of memory instead of invoking a compacting collection!)
This might not be a problem for you - it depends on the nature of the application - but you can insure against it by restarting nightly or weekly.

I would use Java 6 update 30 or 7 update 2, 64-bit as they are much more efficient. e.g. they use 32-bit references by default.
I would also configure Ehcache to use direct memory or a memory mapped file if possible. This should minimise the impact on GC.
Using these options it's possible to almost eliminate your heap footprint. E.g. I have an app which uses up to 180 GB of memory-mapped files on a machine with 16 GB of memory, and the heap size is 6 MB. A full GC takes up to 11 ms when triggered manually, not that it ever GCs. ;)
If you want a simple example, here is one where I map an 8 TB file into memory and update it: http://vanillajava.blogspot.com/2011/12/using-memory-mapped-file-for-huge.html
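For readers who have not used memory-mapped files, here is a much smaller sketch of the same idea. It is not the code from the linked post: the class name MappedFileDemo, the scratch file and the 4 KB size are all assumptions for illustration. A single mapping is limited to 2 GB, so very large files have to be mapped in several regions.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo: store a value in a memory-mapped file and read it back.
final class MappedFileDemo {

    static long roundTrip(long value) {
        try {
            Path file = Files.createTempFile("mapped-demo", ".bin"); // scratch file
            file.toFile().deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
                 FileChannel channel = raf.getChannel()) {
                // Mapping 4096 bytes read-write grows the empty file to that size.
                MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                map.putLong(128, value); // the data lives in the page cache, not on the heap
                map.force();             // flush the change to the file
                return map.getLong(128);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
    }
}
```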

I hope you just removed -server to not inflate the post, otherwise you should enable it immediately. Apart from the slightly longer startup time (which really isn't an issue for a web application that should run for days) I don't see any reason to use anything but C2. That could give some nice performance improvements in general. Anyway, back to the topic:
Sadly the best thing I can think of won't work with your ancient JVM. The G1 garbage collector was basically designed to reduce latency. Not only does it try to reduce pauses in general, it also offers some tuning parameters to set pause goals and intervals. See this page.
There is an experimental backport to java6 though I doubt it's kept up to date. And nobody is wasting any time on optimizing GC or anything else for Java 1.5 anymore I fear.
PS: There would also be IBM's JVM and obviously azul systems (ok that wasn't a serious proposition ;) ), but those are obviously out of the question.. just wanted to mention them.

Can Sun JVM handle gigantic heap sizes without problems, and how?

I have heard several people claiming that you can not scale the JVM heap size up. I've heard claims of the practical limit being 4 gigabytes (I heard an IBM consultant say that), 10 gigabytes, 32 gigabytes, and so on... I simply can not believe any of those numbers and have been wondering about the issue now for a while.
So, I have three part question I would hope someone with experience could answer:
Given the following case how would you tune the heap and GC settings?
Would there be noticeable hiccups (pauses of JVM etc) that would be noticed by the end users?
Should this really still work? I think it should.
The case:
64 bit platform
64 cores
64 gigabytes of memory
The application server is client facing (ie. Jboss/tomcat web application server) - complete pauses of JVM would probably be noticed by end users
Sun JVM, probably 1.5
To prove I am not asking you guys to do my homework this is what I came up with:
-XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:+UnlockDiagnosticVMOptions -XX:-EliminateZeroing -Xmn768m -Xmx55000m
CMS should reduce the amount of pauses, although it comes with overhead. The other settings for CMS seem to default automatically to the number of CPUs so they seem sane to me. The rest that I added are extras that might do good or bad generally for performance, and they should probably be tested.
Definitely.

I think it's going to be difficult for anybody to give you anything more than general advice, without having further knowledge of your application.
What I would suggest is that you use VisualGC (or the VisualGC plugin for VisualVM) to actually look at what the garbage collection is doing when your app is running. Once you have a greater understanding of how the GC is working alongside your application, it'll be far easier to tune it.

#1. Given the following case how would you tune the heap and GC settings?
First, having 64 gigabytes of memory doesn't imply that you have to use them all for one JVM. Actually, it rather means you can run many of them. Then, it is impossible to answer your question without any access to your machine and application to measure and analyse things (knowing what your application is doing isn't enough). And no, I'm not asking to get access to your environment :)
#2. Would there be noticeable hiccups (pauses of JVM etc) that would be noticed by the end users?
The goal of tuning is to find a good compromise between frequency and duration of (major) GCs. With a ~55g heap, GC won't be frequent but will take noticeable time, for sure (the bigger the heap, the longer the major GC). Using a Parallel or Concurrent garbage collector will help on multiprocessor systems but won't entirely solve this issue. Why do you need ~55g (this is mega ultra huge for a webapp IMO), that's my question. I'd rather run many clustered JVMs to handle load if required (at some point, the database will become the bottleneck anyway with a data oriented application).
#3. Should this really still work? I think it should.
Hmm... not sure I get the question. What is "this"? Instantiating a JVM with a big heap? Yes, it should. Is it equivalent to running several JVMs? No, certainly not.
PS: 4G is the maximum theoretical heap limit for the 32-bit JVM running on a 64-bit operating system (see Why can't I get a larger heap with the 32-bit JVM?)
PPS: On 64-bit VMs, you have 64 bits of addressability to work with resulting in a maximum Java heap size limited only by the amount of physical memory and swap space your system provides. (see How large a heap can I create using a 64-bit VM?)

Obviously heap size is not unlimited, and the larger the heap, the more time your JVM will eventually spend on GC. Though I think it is possible to set the heap size quite high on a 64-bit JVM, I still think it's not really practical. The advice here is that it is better to have several JVMs running with the same parameters, i.e. a cluster of JBoss/Tomcat nodes running on the same physical machine, and you will get better throughput.
EDIT: Also your GC behavior depends on the taxonomy of your heap. If you have a lot of short-living objects and each request to the server creates a lot of those, then your GC will collect a lot of garbage very often and thus on large heap size this will result in longer pauses. If you have very many long-living objects (e.g. caching most of your data in memory) and the amount of short-living objects is not that big, then having bigger heap size is OK.

As Chris Rice already wrote, I wouldn't expect any obvious problems with the GC for heap sizes up to 32-64GB, although there may of course be some point of your application logic, which can cause problems.
Not directly related to GC, but I would still recommend you to perform a realistic load test on your production system. I used to work on a project, where we had a similar setup (relatively large, clustered JBoss/Tomcat setup to serve a public web application) and without exaggeration, JBoss is not behaving very well under high load or with a high number of concurrent calls if you are using EJBs. JBoss is spending a lot of time in synchronized blocks when accessing and managing the EJB instance pools and if you opt for a cluster, it will even wait for intra-cluster network communication within these synchronized blocks. Be especially aware of poorly performing state replication, if you are using SFSBs.

Only to add some more switches I would use by default: -Xms55g can help to reduce the rampup time because it frees Java from the need to check if it can fall back to the initial size and allows also better internal initial sizing of memory areas.
Additionally, we have had good experience with NewSize to give you a large young generation to get rid of short-term garbage: -XX:NewSize=1g. Most webapps create a lot of short-lived garbage that will never survive the request processing. You can even make that bigger. With Xms55g, the VM reserves a large chunk already. Maybe downsizing can help.
-Xincgc helps to clean the young generation incrementally and return the cpu often to the user threads.
-XX:CMSInitiatingOccupancyFraction=70 If you really fill all that memory, try to start CMS garbage collection earlier.
-XX:+CMSIncrementalMode puts the CMS into incremental mode to return the cpu to the user threads more often.
Attach to the process with jstat -gc -h 10 <pid> 1s and watch the GC working.
Will you really fill up the memory? I assume that 64cpus for request processing might even be able to work with less memory. What do you store in there?

Depending on your GC pause analysis, you may wish to implement Incremental mode whereby the long pause may be broken out over a period of time.

I have found memory architecture plays a part in large memory sizes. Applications in general don't perform as well if they use more than one memory bank. The JVM appears to suffer as well, esp the GC which has to sweep the whole memory.
If you have an application which doesn't fit into one memory bank, your application has to pull in memory which is not local to a processor and use memory local to another processor.
On linux you can run numactl --hardware to see the layout of processors and memory banks.

Which heap size do you prefer?

I know there is no "right" heap size, but which heap size do you use in your applications (application type, jdk, os)?
The JVM Options -Xms (initial/minimum) and -Xmx (maximum) allow for controlling the heap size. What settings make sense under which circumstances? When are the defaults appropriate?

You have to try your application and see how it performs. For example, I used to always run IDEA out of the box until I got this new job where I work on this huge monolithic project. IDEA was running very slowly and regularly throwing out-of-memory errors when compiling the full project.
The first thing I did was ramp up the heap to 1 gig. This got rid of the out-of-memory issues but it was still slow. I also noticed IDEA was regularly freezing for 10 seconds or so, after which the used memory was cut in half only to ramp up again, and that gave me the garbage collection idea. I now use it with -Xms512m and -Xmx768m, but I also added -Xincgc to activate incremental garbage collection.
As a result, I've got my old IDEA back: it runs smooth, doesn't freeze anymore and never uses more than 600m of heap.
For your application you have to use a similar approach. Try to determine the typical memory usage and tune your heap for the application to run well in those conditions. But also let advanced users tune the setting, to address out-of-the-ordinary data loads.

It depends on the application type. A desktop application is much different than a web application. An application server is much different than a standalone application.
It also depends on the JVM that you are using. JDK 5 and later (including JDK 6) include enhancements that help you understand how to tune your application.
Heap size is important, but it's also important to know how it plays with the garbage collector.
JDK1.4 Garbage Collector Tuning
JDK5 Garbage Collector Tuning
JDK6 Garbage Collector Tuning

Actually I always considered it very strange that Java limits the heap size. A native application can usually use as much heap as it wants, until it runs out of virtual address space. The only reason to limit the heap in Java seems the garbage collector, which has a certain kind of "laziness" and may not garbage collect objects, unless there is a necessity to do so. That means if you choose the heap too big, your app constantly uses more memory than is really necessary.
However, Sun has improved the GC a lot over the years and to emulate the behavior of a native C app, I would set the initial heap size to 32 MB (for small programs) or 64 MB (for bigger ones) and the maximum to something between 1-2 GB. If your app really needs over 1 GB of memory, it is most likely broken (unless you deal with data objects that large), but I see no reason why your app should be killed, just because it goes over a certain heap size.
Of course, this is referring to normal PCs. If you create Java code for mobile phones or other limited devices, you should probably adapt the initial and maximum heap size to the limitations of that device.

Typically I try not to use heaps which are larger than 1 GB.
It will cost you on major garbage collections.
Sometimes it is better to split your application into a few JVMs on the same machine rather than use large heap sizes.
A major collection with a large heap size can take >10 minutes (in applications with unoptimized GC).

This is entirely dependent on your application and any hardware limitations you may have. There is no one size fits all.
jmap can be used to have a look at what heap you are actually using and is a good starting point for right-sizing the heap.

You need to spend quite some time in JConsole or visualvm to get a clear picture on what the plateau memory usage is. Wait until everything is stable and you see the characteristic sawtooth curve of heap memory usage. The peaks should be your 70-80% heap, depending on what garbage collector you use.
Most garbage collectors trigger full GCs when heap usage reaches a certain percentage. This percentage is from 60% to 80% of max heap, depending on what strategy is involved.
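If JConsole or VisualVM cannot be attached, the same sawtooth can be sampled from inside the application with the standard MemoryMXBean. A rough sketch; the class name HeapOccupancy and the five one-second samples are arbitrary choices for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: samples heap occupancy as a percentage of the maximum heap.
final class HeapOccupancy {

    // Used heap as a percentage of max heap, or -1 when the maximum is undefined.
    static double percentOf(long used, long max) {
        if (max <= 0) {
            return -1.0;
        }
        return 100.0 * used / max;
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 5; i++) {
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            System.out.printf("used=%d MB, max=%d MB, occupancy=%.1f%%%n",
                    heap.getUsed() >> 20, heap.getMax() >> 20,
                    percentOf(heap.getUsed(), heap.getMax()));
            Thread.sleep(1000); // one sample per second
        }
    }
}
```

Run over a longer period under typical load, the peaks of this series are what the 70-80% guideline above refers to.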

1.3Gb for a heavy GUI application.
Unfortunately on Linux the JVM seems to pre-request 1.3G of virtual memory in that situation, which looks bad even if it's not needed (and causes a lot of confused grumbling from users)

On my most memory intensive app:
-Xms250M -Xmx1500M -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC
