I have a Java web application running on Tomcat 7 that appears to have a memory leak. The application's average memory usage increases linearly over time under load (as observed in JConsole). Once memory usage reaches a plateau, performance degrades significantly: response times go from ~100ms to anywhere between 300ms and 2500ms, so this is causing real problems.
JConsole memory profile of my application:
Using VisualVM, I see that at least half the memory is being used by character arrays (i.e. char[]), and that most of the strings (roughly 300,000 instances of each) have one of the following values: "Allocation Failure", "Copy", or "end of minor GC", all of which appear to be related to garbage collection notifications. As far as I know, the application doesn't monitor the garbage collector at all. VisualVM can't find a GC root for any of these strings, so I'm having a hard time tracking this down.
Memory Analyzer heap dump:
I can't explain why the memory usage plateaus like that, but I have a theory as to why performance degrades once it does. If memory is fragmented, the application could take a long time to allocate a contiguous block of memory to handle new requests.
Comparing this to the built-in Tomcat server status application, its memory also increases and then levels off, but it doesn't settle at a high "floor" the way my application does. It also doesn't have the high number of unreachable char[] instances.
JConsole memory profile of Tomcat server status application:
Memory Analyzer heap dump of the Tomcat server status application:
Where could these strings be allocated, and why are they not being garbage collected? Are there Tomcat or Java settings that could affect this? Are there specific packages that could be causing it?
I removed the following JMX configuration from tomcat\bin\setenv.bat:
set "JAVA_OPTS=%JAVA_OPTS%
-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.port=9090
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false"
I can't get detailed memory heap dumps anymore, but the memory profile looks much better:
24 hours later, the memory profile looks the same:
I would suggest using Memory Analyzer (MAT) to analyze your heap; it gives far more information.
http://www.eclipse.org/mat/
There is a standalone application as well as an Eclipse plugin.
You just need to run jmap against your application and analyze the resulting heap dump with it.
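For example, a heap dump can be captured with jmap and opened in MAT like this (the PID and file name are placeholders):
jmap -dump:live,format=b,file=heap.hprof <pid>
Note that the live option records only reachable objects; leave it out if you also want to inspect unreachable ones, as in this question.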
The plateau is caused by the available memory dropping below the default percentage threshold, which triggers a Full GC. This explains why performance drops: the JVM is constantly pausing while it tries to find and free memory.
I would usually advise looking at object caches, but in your case I think your heap size is simply too low for a Tomcat instance plus webapp. I would recommend increasing the heap to 1G (-Xms1024m -Xmx1024m) and then reviewing your memory usage again.
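For example, on Windows these flags could be added in tomcat\bin\setenv.bat, the file already mentioned in the question (the exact file and variable are assumptions about your setup):
set "JAVA_OPTS=%JAVA_OPTS% -Xms1024m -Xmx1024m"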
If you still see the same kind of behaviour, take another heap dump and look at the largest consumers after String and char[]. In my experience these are usually caching mechanisms. Either increase your memory further or reduce the cache sizes if possible. Some caches only limit the number of objects, so you need to understand how big each cached object is.
Once you understand your memory usage, you may be able to lower it again but IMHO 512MB would be a minimum.
Update:
You need not worry about unreachable objects, as they should be cleaned up by the GC. Also, it's normal for the largest consumers by type to be String and char[] - most objects contain some kind of String, so it makes sense that they are the most common by frequency. Understanding what holds the objects that contain the Strings is the key to finding the memory consumers.
I can recommend jvisualvm, which comes with every Java installation. Start the program and connect to your web application. Go to Monitor -> Heap Dump. This may take some time (depending on the heap size).
Navigating through the heap dump is quite easy, but you have to figure out the meaning yourself (it's not too complicated though), e.g.:
Go to Classes (within the heap dump), select java.lang.String, right-click and choose Show in Instances View. You'll then see, in the table on the left, the String instances currently live in your system.
Click on one String instance and you'll see some of its properties in the upper-right part of the right-hand table, such as the value of the String.
In the bottom-right part of the right-hand table you'll see where this String instance is referenced from. Here you have to check where most of your Strings are being referenced from. In your case (176/210, so there's a good probability of quickly finding example Strings that cause your problems) it should become clear after some inspection where the problem lies.
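If you prefer to trigger the heap dump from code instead of the VisualVM UI (for example on a headless server), a minimal sketch using the HotSpot diagnostic MXBean looks like this; the output path is just an illustration:
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // false = include unreachable objects, which is what you want to inspect here;
        // pass true to dump only live (reachable) objects
        bean.dumpHeap("/tmp/webapp-heap.hprof", false);
    }
}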
I just encountered the same problem in a totally different application, so tomcat7 is probably not to blame. Memory Analyzer shows 10M unreachable String instances in the process (which has been running for about 2 months), and most/all of them have values that relate to Garbage Collection (e.g., "Allocation Failure", "end of minor GC")
Memory Analyzer
Full GC is now running every 2s but those Strings don't get collected. My guess is that we've hit a bug in the GC code. We use the following java version:
$ java -version
java version "1.7.0_06"
Java(TM) SE Runtime Environment (build 1.7.0_06-b24)
Java HotSpot(TM) 64-Bit Server VM (build 23.2-b09, mixed mode)
and the following VM parameters:
-Xms256m -Xmx768m -server -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC
-XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:NewSize=32m -XX:MaxNewSize=64m
-XX:SurvivorRatio=8 -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
-Xloggc:/path/to/file
By accident, I stumbled across the following lines in our Tomcat's conf/catalina.properties file that activate String caching. This might be related to your case if you have any of them turned on. It seems others have warned against using this feature.
tomcat.util.buf.StringCache.byte.enabled=true
#tomcat.util.buf.StringCache.char.enabled=true
#tomcat.util.buf.StringCache.trainThreshold=500000
#tomcat.util.buf.StringCache.cacheSize=5000
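If you suspect this cache, a quick experiment (using the same property names as above) is to disable it explicitly and restart Tomcat:
tomcat.util.buf.StringCache.byte.enabled=false
tomcat.util.buf.StringCache.char.enabled=false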
Try using MAT, and make sure that when you parse the heap dump you do not drop the unreachable objects.
To do so, follow the tutorial here.
Then you can run a simple Mem Leak Analysis (This is a good tutorial)
That should quickly lead you to the root cause.
As this sounds unspecific, one candidate would be JSF; but then I would have expected hash maps to be leaking too.
Should you use JSF:
In web.xml you could try:
javax.faces.STATE_SAVING_METHOD client
com.sun.faces.numberOfViewsInSession 0
com.sun.faces.numberOfLogicalViews 1
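For reference, the first of these would be declared in web.xml roughly like this (a sketch; the value shown is the one listed above):
<context-param>
    <param-name>javax.faces.STATE_SAVING_METHOD</param-name>
    <param-value>client</param-value>
</context-param>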
As for tools: JavaMelody might be interesting for continuous statistics, but it takes some effort to set up.
Related
My WebLogic server was configured with 16 GB of heap space, but it was 90% used within 1 hour of production usage once most of the users started work. I observed several stuck threads whenever this happened.
I captured a heap dump when the heap was approximately 10% free. How do I inspect the heap dump to find the memory leak, or the process/code that is causing this issue?
I have tried to understand the memory leak by running tools like jmap and Eclipse MAT, but, maybe due to lack of experience, I couldn't understand what these tools are showing. What should I look out for, and how?
I have both the before/after GC heap dump to analyze.
I reviewed the thread dumps; there were no threads "waiting to lock" objects. The threads looked similar to the one shown below, stuck with no obvious reason.
According to your heap dump, your biggest memory consumer is int arrays; indeed they take nearly 70% of your heap (yes, sort by the Size column instead).
Select it in your heap dump, right-click, and select Show in Instances View.
Then browse the biggest objects, and for each of them right-click and select Show Nearest GC Root to see which object still holds a hard reference to the int array and prevents it from being eligible for GC.
This could help you find your memory leak, assuming it is indeed a memory leak.
Below is an example of Nearest GC Root identifying a leak that I added intentionally to my program just to show the idea. As you can see in the screenshot, I have an int array that cannot be eligible for GC because it is stored in a HashMap called leak in my class Application, so I know my memory issue could be due to this particular HashMap, especially if many other objects lead to it.
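For reference, the kind of intentional leak described above can be reproduced with a few lines like these (a sketch matching the class and field names mentioned, not the actual code behind the screenshot):
import java.util.HashMap;
import java.util.Map;

public class Application {
    // Static map that is never cleared, so every int[] put here stays reachable forever.
    private static final Map<Integer, int[]> leak = new HashMap<Integer, int[]>();

    public static void main(String[] args) {
        // Each array's Nearest GC Root is Application.leak, so none of them can be collected;
        // this will eventually fail with java.lang.OutOfMemoryError: Java heap space.
        for (int i = 0; ; i++) {
            leak.put(i, new int[1024]);
        }
    }
}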
NB: Be patient when you try to identify a leak, as it is not always obvious. The ideal situation is when you have one huge object taking up the whole heap, but that is clearly not your case: nothing is really obvious here, which is why I propose investigating the int arrays first. Don't forget that the culprit could also be small int arrays, but thousands of them sharing the same Nearest GC Root.
Another trick: if you have JProfiler, you can simply follow this wonderful tutorial to find your leak.
Response Update:
One simple way to better identify the root cause of the memory leak is to take at least two heap dumps and then compare them using a tool like jhat, with the syntax:
jhat -J-Xmx2G -baseline ${path-to-the-first-heap-dump} ${path-to-the-second-heap-dump}
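The two dumps themselves can be taken some time apart with jmap, for example (the PID and file names are placeholders):
jmap -dump:format=b,file=first-heap.hprof <pid>
jmap -dump:format=b,file=second-heap.hprof <pid>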
jhat will then launch a small HTTP server on port 7000, so:
Launch http://localhost:7000/
Then click on Show instance counts for all classes (including platform)
You will then see the list of classes ordered by the total number of new instances created. You can then use VisualVM to do what I described in the first part of my answer to find the root cause of your memory leak.
You can also use jhat
Select one of the Top Classes, then for each instance:
click on one "Reference to this Object"
then click on Exclude weak refs
You will then see the GC root of each instance, as in the next screenshot:
Another way is to use Eclipse Memory Analyzer also called MAT.
Open the second snapshot with it
Select the histogram view
Then for each of the Top Classes, right-click
Choose Merge Shortest Paths To GC Roots / Exclude All references
You will then see something like the next screenshot:
The JDK's "jmap -histo" command will dump object counts/bytes for all classes to a text file. If you capture/compare a few of these dumps over time, you will see which ones grow continually -- your memory leak. The overhead of -histo is much lower than that of capturing a full heap dump.
Comparing just a few dumps (like the python script detailed here) seems like too small of a sample, so I wrote an open-source tool (here) that runs this jmap -histo command in the background (at an interval). It has a live display and tracks the % of time that the byte count for each class is on the rise.
It seems you probably have a memory leak. Your best approach is to use Java Mission Control with Flight Recorder to identify the leaking class and method.
You should set up your WebLogic managed server with the following parameters:
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=8999
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-XX:+UnlockCommercialFeatures
-XX:+FlightRecorder
When you set this up, follow the instructions here to detect the leak.
Hope it helps !!
I am one of the developers of the tool called Plumbr. Among other things, it performs automatic analysis of heap contents in cases of excessive memory usage. You may find it useful.
Per your comments: you have Java 7 with a 16 GB heap and no GC algorithm explicitly specified, so you get Java 7's default, the throughput (parallel) collector, which is not suitable for most web apps because it leads to long GC pauses on big heaps.
Switch to the ConcurrentMarkSweep (CMS) collector; that way the GC will not wait until memory fills up, and will do its best to collect garbage concurrently, so you will have fewer long stop-the-world pauses.
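In practice, on Java 7 that typically means adding flags along these lines to the managed server's JVM arguments (a sketch, not a tuned configuration):
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC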
Did you try the YourKit profiler? It's not free, but you can evaluate it for 30 days. If your dump contains all objects (not only live ones), you will be able to check their roots as well, because it could be that you don't have a memory leak but simply too big a memory footprint. It would also be useful to enable GC logs and count how many Full GC pauses you have:
grep "Full GC" jvm_gc.log | wc -l
In an ideal world it should be 0 :)
By the way, this whole article could be helpful for you.
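For reference, a jvm_gc.log like the one grepped above can be produced with GC-logging flags along these lines (the log path is a placeholder):
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/path/to/jvm_gc.log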
I'm running java with java -Xmx240g mypackage.myClass
OS is Ubuntu 12.10.
top says "MiB Mem 245743 total", and shows that the java process has had VIRT at 254g from the very beginning, while RES steadily increases up to 169g. At that point it looks like the JVM starts garbage collecting a lot: the program is single-threaded up to that stage and CPU% is mostly 100%, but then it jumps to around 1300-2000 (so I conclude the multithreaded garbage collector is running), and RES slowly moves to 172g. At that point java crashes with
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at the line with new double[2000][5]
java -version says
java version "1.7.0_15"
OpenJDK Runtime Environment (IcedTea7 2.3.7) (7u15-2.3.7-0ubuntu1~12.10)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
Hardware is Amazon cr1.8xlarge instance
It seems to me that java crashes even though there's a lot of memory available. That is clearly not possible, so I must be interpreting some numbers wrong. Where should I look to understand what's going on?
Edit:
I don't specify any GC options. The only command-line option is -Xmx240g
My program works successfully on many inputs, and top sometimes reported it using up to 98.3% of memory. However, I can reproduce the situation described above with a certain program input.
Edit2:
This is a scientific application. It has a gigantic tree (1-10 million nodes); each node holds a couple of double arrays of size approximately 300x3 to 900x5. After the initial tree creation, the program does not allocate much memory. Most of the time, arithmetic operations are being performed on these arrays.
Edit3:
The HotSpot JVM died the same way: it used a lot of CPU around the 170-172g mark and crashed with the same error. It looks like 70-75% of memory is some magical line that the JVM does not want to cross.
Final solution:
With -XX:+UseConcMarkSweepGC -XX:NewRatio=12 the program made it past the 170g mark and is happily working onward.
Analysis
The first thing you need to do is get a heap dump so you can figure out exactly what the heap looks like when the JVM crashes. Add this set of flags to the command line:
-XX:+HeapDumpOnOutOfMemoryError -verbose:gc -XX:+PrintGCDetails
When a crash happens, the JVM will write the heap out to disk, and frankly, it's going to take a long time for a heap that size. Download Eclipse MAT, or install the plugin if you're already running Eclipse. From there, you can load the heap dump and run a couple of canned reports. You'll want to check the Leak Suspects report and the Dominator Tree to see where your memory is going and confirm that you don't have an actual leak.
After that, I would recommend you read this document by Oracle about Garbage Collection, however here are some things you can consider:
Concurrent GC
-XX:+UseConcMarkSweepGC
I've never heard of anyone getting away with using the parallel-only collector on a heap that size. You can activate the concurrent collector, and you'll want to read up on incremental mode and determine whether it's right for your workload/hardware combination.
Heap Free Ratio
-XX:MinHeapFreeRatio=25
Dial this down to lower the bar for the garbage collector when you do a full collection. This may prevent you from running out of memory doing a full collection. 40% is the default, experiment with smaller values.
New Ratio
-XX:NewRatio
We'd need to hear more about your actual workload: is this a webapp? A Swing app? How long objects are expected to remain alive on the heap will influence the right new-ratio value. Server-mode VMs like the one you're running have a fairly high new ratio by default (8:1); this may not be ideal if you have a lot of long-lived objects.
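For what it's worth, the command line that eventually worked for the asker (see the final solution above) combined these ideas:
java -Xmx240g -XX:+UseConcMarkSweepGC -XX:NewRatio=12 mypackage.myClass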
If I understood your question correctly, it looks like the memory leak is actually happening before the program hits the line new double[2000][5]. It seems memory is already low when that line is reached, so the error is thrown when this line asks for more memory.
I would use jvisualvm or similar tools to find out where the memory leak is. The memory leaks I've encountered have mostly been due to Strings being created in a loop, caches not being cleared, etc.
As general advice: NEVER use OpenJDK, and even less so for production environments; it is much slower than the one from Sun/Oracle.
Apart from that, I have never seen a VM using so much memory, but I guess that is what you need (or maybe your code uses more memory than necessary?).
EDIT: OpenJDK for servers is fine; the only differences from the Sun/Oracle JDK concern desktop stuff (sound, GUI...), so ignore that part.
I have a project I'm writing (in Java) for a class where the prof says we're not allowed to use more than 200m
I limit the heap memory to 50m (just to be absolutely sure) with -Xmx50m, but according to top it's still using 300m.
I tried running Eclipse Memory Analyzer and it reports only 26m
Could this all be memory on the stack? I'm pretty sure I never go more than about 300 method calls deep (yes, it is a recursive DFS), so that would mean every stack frame is using almost a megabyte, which seems hard to believe.
The program is single-threaded. Does anyone know any other places in which I might reduce memory usage? Also, how can I check/limit how much memory the stack is using?
UPDATE: I'm using the following JVM options now with no effect (still about 300m according to top): -Xss104k -Xms40m -Xmx40m -XX:MaxPermSize=1k
Another UPDATE: Actually, if I let it run a little longer (with all these options), about half the time it suddenly drops to 150m after 4 or 5 seconds (the other half it doesn't drop). What makes this really strange is that my program is not stochastic (and, as I said, single-threaded), so there's no reason it should behave differently on different runs.
Could it have something to do with the JVM I'm using?
java version "1.6.0_27"
OpenJDK Runtime Environment (IcedTea6 1.12.3) (6b27-1.12.3-0ubuntu1~10.04)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
According to java -h, the default JVM is -server. I tried adding -cacao and now (with all the other options) it's only 59m. So I suppose this solves my problem. Can anyone explain why this was necessary? Also, are there any drawbacks I should know about?
One more update: cacao is really really slow compared to server. This is an awful option
The top command reflects the total amount of memory used by the Java process. This includes, among other things:
A basic memory overhead of the JVM itself
The heap space (bounded by -Xmx)
The permanent generation space (-XX:MaxPermSize - not standard in all JVMs)
Thread stack space (-Xss per thread), which may grow significantly depending on the number of threads
Space used by native allocations (e.g. the ByteBuffer class, or JNI)
Max memory = [-Xmx] + [-XX:MaxPermSize] + number_of_threads * [-Xss]
where -Xmx is the maximum heap size (-Xms being the minimum heap size), -Xss is the per-thread stack size, and -XX:MaxPermSize is the maximum permanent generation size.
The following example illustrates this situation. I have launched my tomcat with the following startup parameters:
-Xmx168m -Xms168m -XX:PermSize=32m -XX:MaxPermSize=32m -Xss1m
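Plugging these into the formula above and assuming, say, 50 threads (a made-up number just for illustration), the expected upper bound would be roughly 168m (heap) + 32m (PermGen) + 50 x 1m (stacks) = 250m, plus the JVM's own overhead and any native allocations.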
With -Xmx you are configuring the heap size. To configure the stack size, use the -Xss parameter. The sum of those two parameters should be approximately what you want:
-Xmx150m -Xss50m
for example.
Additionally, there is the -XX:MaxPermSize parameter, which controls the size of the permanent generation. Its default value is 32 MB for -client and 64 MB for -server. Factor it into your calculation as well. PermGen space is:
The permanent generation is used to hold reflective data of the VM itself, such as class objects and method objects.
So basically it stores internal data of the JVM, like class definitions and interned strings.
Finally, I must say there is one part you can't control: the memory used by the native Java process itself. Java is a program like any other, so it uses memory too. If you watch memory usage in Task Manager, you will see this memory as well, together with your program's memory consumption.
It's important to note that "total memory used" (RSS in Linux land) includes JDK heap (+ other JDK areas) as well as any "native memory" allocated.
For instance, these people found that allocating too many JAXBContexts (which have associated native memory) between GCs could cause the process to use a lot of extra RAM. Another common one is apparently the zip Inflater if you don't call close on it (or GZIP streams, etc.).
http://sleeplessinslc.blogspot.com/2014/08/jvm-native-memory-leak.html
His final workaround/fix was either to GC "more often" (by using the G1 garbage collector, or by specifying a smaller [ironically] -Xmx setting) or to cache the JAXBContext objects (since they have no close method, you can't control the leak directly).
Also note that sometimes you can find memory culprits just by examining jstack output: http://javaeesupportpatterns.blogspot.com/2011/09/jaxbcontext-performance-problem-case.html
It's also sometimes possible to accidentally "miss" closing, for instance, GZIP streams: http://kohsuke.org/2011/11/03/quiz-time-memory-leak-in-java
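As a small illustration of that kind of leak, a stream that wraps a native Inflater should always be closed; a try-with-resources sketch (the file name is a placeholder):
import java.io.FileInputStream;
import java.util.zip.GZIPInputStream;

public class GzipRead {
    public static void main(String[] args) throws Exception {
        // try-with-resources guarantees close(), which releases the Inflater's native memory
        try (GZIPInputStream in = new GZIPInputStream(new FileInputStream("data.gz"))) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // process the decompressed bytes here
            }
        }
    }
}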
Have you tried using JVisualVM?
http://docs.oracle.com/javase/6/docs/technotes/tools/share/jvisualvm.html
I've often found it helps me track this stuff down. It will show you how much of each kind of memory is being used, and it even lets you drill in and find out by what.
My Linux server needs to be able to handle 30+ Eclipse instances for developers. I did a quick test running 10 Eclipse instances. The Java process associated with each Eclipse instance initially used around 200 MB of RSS memory, increasing to around 550 MB as more projects were loaded.
But the Java process doesn't seem to release memory after closing/deleting all projects within the Eclipse instances. I still see it using over 550 MB RSS.
How can I change Eclipse or Java settings so that the memory footprint is reduced when developers close projects or are idle for a while?
Thanks
You may want to experiment with these (and other) JVM tuning options to make the JVM less reluctant to return memory to the OS:
-XX:MaxHeapFreeRatio Maximum percentage of heap free after GC to avoid shrinking. Default is 70.
-XX:MinHeapFreeRatio Minimum percentage of heap free after GC to avoid expansion. Default is 40.
However, I suspect that you won't see the eclipse process shrink to anywhere near its initial size, since eclipse is a huge, complex application that probably lazy-loads (but does not unload, once used) a lot of classes and associated data structures.
I've never seen Java release memory.
I don't think you will get any value out of trying to get Eclipse to release memory; I've watched that little memory counter for YEARS and never once seen the allocated memory drop.
You might try one of these.
After each session, exit the JVM and restart.
Set your -Xmx lower.
Separate your instances into categories with high -Xmx and low -Xmx and let the user determine which one he wants.
As a side-thought, if it really mattered to you, you MIGHT be able to run multiple eclipse instances under one VM. It would probably be WAY too much work (man-weeks to man-years), but if you could get it right you could reduce overhead by like 150-200mb/instance. The disadvantage would be that a VM crash (Pretty rare these days) would kill everyone.
Testing this theory would be a matter of calling eclipse's main from within an existing JVM and trying to get it to display somewhere useful. The rest of the man-year is spent trying to figure out where they used evil static variables or singletons and changing them to something else.
Switch Java to the G1 garbage collector with the HeapFreeRatio parameters. Use these options in eclipse.ini:
-XX:+UnlockExperimentalVMOptions
-XX:+UseG1GC
-XX:MinHeapFreeRatio=5
-XX:MaxHeapFreeRatio=25
Now, when Eclipse eats up more than 1 GB of RAM for a complicated operation and drops back to 300 MB after garbage collection, the memory will be released back to the operating system.
I would suggest looking into garbage collection; setting the right options, or even forcing GC periodically, might increase the time before Eclipse's memory usage grows high.
The following link might be useful: http://www.eclipsezone.com/eclipse/forums/t93757.html
We are running a web application on 64-bit Java with a maximum heap size (-Xmx) of 5 GB. We have no control over the Java code; we can only tweak configuration parameters. The situation we are facing is that once the Java process takes the full heap allocated at startup, it starts responding very slowly to web site requests. My guess is that it is waiting for the GC to collect unused objects.
The image below shows the output of top on Linux, illustrating the critical state of the process.
top image of java process http://cp.images.s3.amazonaws.com/ForumImages/java-gc-issue.jpg
Is there any way we can help Java reclaim the used memory inside the allocated space?
EDIT 1:
I used some of the answers below to get to the answer to my question. Since my question was hard to answer definitively and turned into a discussion, I will post how I was able to monitor the GC cycles and pick the answer with the most votes. I used JConsole through RealVNC Viewer to connect from my Windows machine to my Linux machine running Tomcat.
I used these parameters to start the Java process:
-Djava.awt.headless=true -server -Xms512m -Xmx5120m -Dbuild.compiler.emacs=true -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=4999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
This is the type of output I got, from jconsole through vnc viewer.
GC Sample Image http://cp.images.s3.amazonaws.com/ForumImages/Sample_GC_Image.jpg
I'd recommend that you not guess. Get some data to see exactly what's going on. You can use visual gc to see what's happening.
If it's the perm space that's being filled up, there won't be much you can do.
Which JVM? If it's 5 or higher there are additional parameters besides just max heap size you can adjust. Check out http://blog.springsource.com/2008/10/14/optimising-and-tuning-apache-tomcat-part-2/
It sounds like you have a memory leak if your application is getting progressively slower. The GC will always start cleaning up unused objects as soon as it needs to. If you add -verbose:gc you will be able to see how often a GC is performed and how much memory is free after each GC. If the heap is more than 80% used, you either have to increase the max memory or fix the program so it doesn't use so much.
Can you run numactl --hardware? I suggest you not use more than 80% of one memory bank, or your GC times will increase dramatically.
Sounds like you need details on what's using up the heap. For that I recommend jmap (http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jmap.html), which you can run against the process ID (PID) to see what's using memory. Take jmap snapshots several times while the application is running and see which classes are not being freed from the heap.
Cheers,
-Richard
Try running the app with the -verbose:gc -Xloggc:/path/to/where/you/want/gc.log parameters, and study the resulting gc.log; it should tell you how much time is being spent in garbage collection. Or, as Duffymo suggests above, use visualGC to give you the same data.
Make sure you're using an appropriate collector - you probably want either the parallel or the low-pause (CMS) collector, assuming you're on Java 5.
Have a read of Sun's GC tuning document to see what else you can tweak. On occasion I have found very large heaps to be counter-productive (assuming the application doesn't actually need all that space); more frequent, smaller collections can sometimes end up less disruptive than occasional massive ones.