I have an instance of Tomcat which periodically crashes for unknown reasons.
There are no errors left in the logs, only a line in Event Viewer saying "Tomcat terminated unexpectedly".
In a test environment I have been unable to replicate the issue. I am therefore mostly restricted to passive monitoring of the production environment.
The problem does not seem to be related to memory as the unexpected terminations show no obvious correlation to the process' memory usage.
What steps could I take to further diagnose this problem?
EDIT:
Some corrections/clarifications:
It is actually not a single "instance" of Tomcat, rather several instances with similar configurations.
OS is Windows 2003.
Java version is Java 6.
UPDATE:
Looks like the issue might be related to memory after all. Discovered some crash dumps which were created in the Tomcat directory (not .../Tomcat/logs).
The dumps mostly contained errors such as:
java.lang.OutOfMemoryError: requested 32756 bytes for ChunkPool::allocate. Out of swap space?
This is unexpected, as the process sometimes crashed when its memory usage was at a relatively low point (compared to historical usage).
In all dumps, perm gen space is at 99% usage, but in absolute terms this usage is not consistent, and is nowhere near the limit specified in -XX:MaxPermSize.
This indicates to me that the whole JVM crashed, which is a rather unusual thing. I would consider the following steps:
First check that the hardware is OK. Run memtest86+ (http://www.memtest86.com/, also available on an Ubuntu live CD) to test the memory, and let it run for a while to be reasonably certain.
Then see if the version of Java you use is OK. Some Java 6 releases broke subtle functionality; the latest Java 5 update might be a sensible fallback at this point.
Disable the Tomcat native library (APR/Tomcat Native), which Tomcat uses to improve I/O performance. Since you have a crashing JVM, getting rid of native code is a very good start.
See if there are any restrictions in the version of Windows you use, such as a CPU usage limit before termination or any other quota.
Generally, when a process crashes on Windows, a dump file is created. Load the dump file in WinDbg (the Windows debugger) and get a stack trace of the thread that caused the exception. This should give you a better idea of what the problem is.
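As a sketch, once a crash dump is opened in WinDbg (File > Open Crash Dump, or windbg -z on the dump file), commands along these lines usually identify the faulting thread:
!analyze -v     (automated analysis; reports the exception code and faulting module)
.ecxr           (switch to the exception context record)
kb              (stack trace of the faulting thread)
~* kb           (stack traces of all threads)
If the faulting frames sit inside jvm.dll or a third-party native DLL, that points at the JVM itself or that native library rather than at your Java code.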
Related
I'm having difficulty understanding / ignoring the current memory commitments (virtual memory) of my JVMs.
For better context:
We're running a closed-source Java application. We just host/run the application by deploying the delivered .war files, but overall we are responsible for running the application (this construct, as obscure as it is, is non-negotiable ;) )
OS is RHEL 7, Tomcat v8. Overall we're running several Tomcats with an Apache 2.2 in front of them as a load balancer.
We're experiencing massive memory swapping. It seems like every time we give the server VM more RAM, it immediately gets consumed by the Tomcats. Currently we run 7 Tomcats with a total of 48 GB of RAM. The plan is to upgrade to 64 GB or even higher, but I fear this won't change a thing given Tomcat's hunger for memory...
For security reasons I've blacked out some internal paths/IPs/ports etc.
I do know that the memory consumption of a JVM consists of more than just the assigned heap space, but in my case Xmx is set to 4096m and committed memory is about 10 GB, as seen in the JConsole screenshot and also confirmed by top on the server.
And this just doesn't seem right. I mean, more than twice the heap space?
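One way I could break this down (assuming the Tomcats run on Java 8, which seems likely with Tomcat 8, and that I'm allowed to add JVM options) is Native Memory Tracking; a rough sketch:
Add to the Tomcat JVM options:       -XX:NativeMemoryTracking=summary
Then, against the running process:   jcmd <pid> VM.native_memory summary
The summary breaks committed memory down into heap, classes/Metaspace, thread stacks, code cache, GC bookkeeping and internal allocations, which should show where the gap between 4 GB of -Xmx and ~10 GB of committed memory comes from.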
I've also read some memory-related Java material, but as stated above, the application is closed source, which doesn't give me much leeway in debugging or trying things out.
And since I am relatively new to Tomcat, I figured I might as well turn to a more experienced community :)
So here are the main questions, which came up:
Is there a setting I am missing which definitively caps the maximum committed memory size for a Tomcat?
Is the behaviour we're experiencing normal?
How should I talk to the devs about this problem? We've already done heap dumps, but they weren't as telling as we hoped. The heap is sized correctly and is almost always at around 50% usage.
Am I missing something here?
I know it's hard to tell from a distance, especially if no source code can be provided.
But any help is appreciated :)
Thanks!
Alex
I'm profiling my webapp using YourKit Java Profiler. The webapp is running on Tomcat 7 v30, and I can see that the heap of the JVM is ~30 megabytes, but Tomcat.exe is using 200 megabytes and keeps rising.
Screenshot: http://i.imgur.com/Zh9NGJ1.png
(On left is how much memory profiler says Java is using, on right is Windows usage of tomcat.exe)
I've tried adding different flags to Tomcat, but the memory usage still keeps rising. I've also tried precompiling my .jsp files in case that helps, but it hasn't.
The flags I've added to Tomcat's Java options:
-XX:+UseG1GC
-XX:MinHeapFreeRatio=10
-XX:MaxHeapFreeRatio=10
-XX:GCTimeRatio=1
Tomcat is also running as a Windows service, if that matters at all.
I need assistance figuring out how to get Tomcat to use less memory, or at least knowing why it's using so much. As it is now, it keeps growing until it uses the whole system's memory.
So the solution that I found was to add some flags to the Tomcat startup options.
I'm not sure which flag did it. I think it might have been the JACOB library we were using, or some combination of these flags with that. Hopefully this can help people in the future.
-XX:+UseG1GC
-XX:MinHeapFreeRatio=10
-XX:MaxHeapFreeRatio=10
-XX:GCTimeRatio=1
-Dcom.jacob.autogc=true
-Dorg.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER=true
You should look for memory leaks in your application, or large sessions that live too long and are never invalidated. Try to think about which functionality holds many objects for long periods.
You could dump your memory and see what is using it. Probably it will be a long list of your application's objects, or strings you unknowingly intern.
You might use a tool like jvisualvm, or the Eclipse Memory Analyzer (http://www.eclipse.org/mat/), to do that.
If you do that and still don't know why, then post what objects are in your memory.
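As a sketch (assuming a HotSpot JDK with jmap on the PATH; the dump file name here is just an example), you can capture a heap dump and open it in MAT or jvisualvm:
jmap -dump:live,format=b,file=tomcat-heap.hprof <tomcat_pid>
The live option forces a full GC first, so only reachable objects end up in the dump.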
I am new to Java and have been given the task of fixing a bug. The issue is as follows; it would be really great if you could give suggestions or ideas on what this issue is and how I can fix it:
HTTP Status 500 -
--------------------------------------------------------------------------------
type Exception report
message
description The server encountered an internal error () that prevented it from fulfilling this request.
exception
org.apache.jasper.JasperException
org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:453)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:375)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:368)
root cause
javax.servlet.ServletException
org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:858)
org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:791)
org.apache.jsp.CustMaint.Jsp.ProfProfileDetails_jsp._jspService(ProfProfileDetails_jsp.java:4016)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:368)
root cause
java.lang.OutOfMemoryError
note The full stack trace of the root cause is available in the Apache Tomcat/5.5.17 logs.
--------------------------------------------------------------------------------
Apache Tomcat/5.5.17
Here's what the Tomcat guys have to say:
An OutOfMemoryError can be thrown for several reasons:
A servlet trying to load a several-gigabyte file into memory will surely kill the server. These kinds of errors must be considered a simple bug in our program.
To compensate for the data your servlet tries to load, you increase the heap size so much that there is no room left to create the stacks for the threads that need to be created. The memory required by each thread varies by OS but can be as high as 2 MB by default, and on some OSes (like Debian Sarge) it is not reducible with the -Xss parameter. One rule of thumb: use no more than 1 GB of heap space in a 32-bit web application.
Deep recursive algorithms can also lead to out-of-memory problems. In this case, the only fixes are increasing the thread stack size (-Xss), or refactoring the algorithms to reduce the recursion depth or the local data size per call.
A webapp that uses lots of libraries with many dependencies, or a server maintaining lots of webapps, can exhaust the JVM PermGen space. This space is where the VM stores class and method data. In those cases, the fix is to increase its size; the Sun VM has the flag -XX:MaxPermSize that allows you to set it (the default value is 64M).
Hard references to classes can prevent the garbage collector from reclaiming the memory allocated for them when a ClassLoader is discarded. This occurs on JSP recompilations and webapp reloads. If these operations are common in a webapp having these kinds of problems, it is only a matter of time until the PermGen space fills up and an OutOfMemoryError is thrown.
Source: Tomcat Wiki: OutOfMemory
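As a sketch, the flags mentioned above are typically set via CATALINA_OPTS; the values here are only illustrative, not recommendations:
CATALINA_OPTS="-Xms256m -Xmx1024m -Xss512k -XX:MaxPermSize=256m"
-Xmx caps the heap, -Xss sets the per-thread stack size, and -XX:MaxPermSize raises the PermGen limit discussed above.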
Well... who really caused the out of memory error?
If you ate 8 slices of pizza and you are full, is it the last slice that caused the out of stomach error?
Use the Java heap analysis tools, jhat or Eclipse MAT (http://www.eclipse.org/mat/), to analyse what's going on inside the JVM and what is eating how much memory. Look at the profile and then at the code causing it.
You can also use JConsole; it's dead easy to set up, and you can see things 'live'. TPTP is also a good option, though unfortunately I find it hard to configure.
This kind of problem is not easy to nail down based on just the stack trace. It boils down to one of two things: either you have a memory leak in your application (the code is unnecessarily keeping too many objects in memory for too long), or the server simply doesn't have enough memory to run your webapp (because the webapp genuinely requires that much memory).
To detect and fix memory leaks, use a Java profiler. If you don't have any memory leaks, i.e. the memory usage is stable and the code just really needs that much memory, then simply give the server more memory to work with. The profiler is still useful, though, to spot memory hogs in your webapp and optimize the code accordingly.
If you're using Eclipse, use the TPTP profiler; if you're using NetBeans, use the built-in VisualVM profiler. Or, when using standalone VisualVM, check this blog for how to monitor Tomcat with it.
I've got (the currently latest) JDK 1.6.0_18 crashing unexpectedly while running a web application on (the currently latest) Tomcat 6.0.24, after anywhere from 4 hours to 8 days of stress testing (30 threads hitting the app at 6 million pageviews/day). This is on RHEL 5.2 (Tikanga).
The crash report is at http://pastebin.com/f639a6cf1 and the consistent parts of the crash are:
a SIGSEGV is being thrown
on libjvm.so
eden space is always full (100%)
JVM runs with the following options:
CATALINA_OPTS="-server -Xms512m -Xmx1024m -Djava.awt.headless=true"
I've also tested the memory for hardware problems using http://memtest.org/ for 48 hours (14 passes of the whole memory) without any error.
I've enabled -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps to inspect for any GC trends or space exhaustion, but there is nothing suspicious there. GC and full GC happen at predictable intervals, almost always freeing the same amounts of memory.
My application does not, directly, use any native code.
Any ideas of where I should look next?
Edit - more info:
1) There is no client vm in this JDK:
[foo@localhost ~]$ java -version -server
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
[foo@localhost ~]$ java -version -client
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
2) Changing the O/S is not possible.
3) I don't want to change the JMeter stress test variables since this could hide the problem. Since I've got a use case (the current stress test scenario) which crashes the JVM I'd like to fix the crash and not change the test.
4) I've done static analysis on my application but nothing serious came up.
5) The memory does not grow over time. The memory usage equilibrates very quickly (after startup) at a very steady trend which does not seem suspicious.
6) /var/log/messages does not contain any useful information before or during the time of the crash
More info: Forgot to mention that there was an Apache (2.2.14) fronting Tomcat using mod_jk 1.2.28. Right now I'm running the test without Apache, just in case the JVM crash relates to the mod_jk native code which connects to the JVM (the Tomcat connector).
After that (if the JVM crashes again) I'll try removing some components from my application (caching, Lucene, Quartz) and later on will try using Jetty. Since the crash currently happens anywhere between 4 hours and 8 days in, it may take a lot of time to find out what's going on.
Do you have compiler output? I.e. -XX:+PrintCompilation (and, if you're feeling particularly brave, -XX:+LogCompilation).
I have debugged a case like this in the past by watching what the compiler was doing and, eventually (it took a long time until the light-bulb moment), realising that my crash was caused by compilation of a particular method in the Oracle JDBC driver.
Basically what I'd do is:
switch on PrintCompilation
since that doesn't give timestamps, write a script that watches that logfile (like a sleep every second and print new rows) and reports when methods were compiled (or not)
repeat the test
check the compiler output to see if the crash corresponds with compilation of some method
repeat a few more times to see if there is a pattern
If there is a discernible pattern, then use .hotspot_compiler (or .hotspotrc) to make it stop compiling the offending method(s), repeat the test and see if it doesn't blow up. Obviously in your case this process could theoretically take months, I'm afraid.
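As a sketch, a .hotspot_compiler file placed in the JVM's working directory looks like this (the class and method names below are placeholders, not taken from your application):
exclude java/lang/String indexOf
exclude com/example/SomeClass someMethod
Each exclude line tells HotSpot not to JIT-compile that method; it will only ever be interpreted, which is slower but avoids the compiler path you suspect.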
some references
for dealing with logcompilation output --> http://wikis.sun.com/display/HotSpotInternals/LogCompilation+tool
for info on .hotspot_compiler --> http://futuretask.blogspot.com/2005/01/java-tip-7-use-hotspotcompiler-file-to.html or http://blogs.oracle.com/javawithjiva/entry/hotspotrc_and_hotspot_compiler
a really simple, quick & dirty script for watching the compiler output --> http://pastebin.com/Haqjdue9
Note that this was written for Solaris, whose utilities always have bizarre options compared to the GNU equivalents, so there are no doubt easier ways to do this on other platforms or using different languages.
The other thing I'd do is systematically change the GC algorithm you're using and check the crash times against GC activity (e.g. does it correlate with a young or old GC? what about TLABs?). Your dump indicates you're using the parallel scavenge collector, so try:
the serial (young) collector (IIRC it can be combined with a parallel old)
ParNew + CMS
G1
If it doesn't recur with a different GC algorithm, then you know it's down to that (and you have no fix other than changing the GC algorithm and/or walking back through older JVMs until you find a version of that algorithm that doesn't blow up).
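For reference, a sketch of the corresponding HotSpot flags on a 1.6.0_18 JVM (G1 was still experimental at that point and has to be unlocked explicitly):
-XX:+UseSerialGC                                   (serial collector)
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC           (ParNew + CMS)
-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC      (G1)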
A few ideas:
Use a different JDK, Tomcat and/or OS version
Slightly modify test parameters, e.g. 25 threads at 7.2 M pageviews/day
Monitor or profile memory usage
Debug or tune the Garbage Collector
Run static and dynamic analysis
Have you tried different hardware? It looks like you're using a 64-bit architecture. In my own experience 32-bit is faster and more stable. Perhaps there's a hardware issue somewhere too; a failure window of anywhere from 4 hours to 8 days is quite spread out for a purely software issue. Although you do say the system log has no errors, so I could be way off. Still, I think it's worth a try.
Does your memory grow over time? If so, I suggest lowering the memory limits to see whether the system fails more frequently when memory is exhausted.
Can you reproduce the problem faster if:
You decrease the memory available to the JVM?
You decrease the available system resources (i.e. drain system memory so the JVM does not have enough)?
You change your use cases to a simpler model?
One of the main strategies that I have used is to determine which use case is causing the problem. It might be a generic issue, or it might be use case specific. Try logging the start and stopping of use cases to see if you can determine which use cases are more likely to cause the problem. If you partition your use cases in half, see which half fails the fastest. That is likely to be a more frequent cause of the failure. Naturally, running a few trials of each configuration will increase the accuracy of your measurements.
I have also been known to either change the server to do little work or loop on the work that the server is doing. One makes your application code work a lot harder, the other makes the web server and application server work a lot harder.
Good luck,
Jacob
Try switching your servlet container from Tomcat to Jetty http://jetty.codehaus.org/jetty/.
If I were you, I'd do the following:
Try slightly older Tomcat/JVM versions. You seem to be running the newest and greatest; I'd go down two versions or so, and possibly try the JRockit JVM.
Do a thread dump (kill -3 java_pid) while the app is running to see the full stacks. Your current dump shows lots of threads being blocked, but it is not clear where they block (I/O? some internal lock starvation? anything else?). I'd even schedule kill -3 to run every minute so that any random thread dump can be compared with the one taken just before the crash.
I have seen cases where the Linux JDK just dies whereas the Windows JDK is able to gracefully catch an exception (a StackOverflowError in that case), so if you can modify the code, add a "catch Throwable" somewhere in the top-level class, just in case; a sketch is shown after this list.
Play with GC tuning options: turn concurrent GC on/off, adjust NewSize/MaxNewSize. And yes, this is not scientific, rather a desperate search for a working solution. More details here: http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html
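Here is a minimal sketch of the "catch Throwable" idea as a catch-all servlet filter; the class name is a placeholder and it assumes the standard javax.servlet API (map it first in web.xml with url-pattern /*):

import java.io.IOException;
import javax.servlet.*;

// Wraps the whole filter chain so any Throwable escaping the webapp is at least
// logged before it propagates to the container.
public class CatchAllFilter implements Filter {
    public void init(FilterConfig config) {}
    public void destroy() {}

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        try {
            chain.doFilter(req, res);
        } catch (Throwable t) {
            t.printStackTrace();              // replace with real logging
            throw new ServletException(t);    // fail the request visibly
        }
    }
}

Note that this can catch errors like StackOverflowError, but it will not save you from a native JVM crash (SIGSEGV), and an OutOfMemoryError may still take the JVM down regardless.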
Let us know how this was sorted out!
Is it an option to go to the 32-bit JVM instead? I believe it is the most mature offering from Sun.
I've been tasked with debugging a Java (J2SE) application which, after some period of activity, begins to throw OutOfMemory exceptions. I am new to Java but have programming experience. I'm interested in your opinions on what a good approach to diagnosing a problem like this might be.
So far I've employed JConsole to get a picture of what's going on. I have a hunch that there are objects which are not being released properly and are therefore not being cleaned up during garbage collection.
Are there any tools I might use to get a picture of the object ecosystem? Where would you start?
I'd start with a proper Java profiler. JConsole is free, but it's nowhere near as full featured as the ones that cost money. I used JProfiler, and it was well worth the money. See https://stackoverflow.com/questions/14762/please-recommend-a-java-profiler for more options and opinions.
Try the Eclipse Memory Analyzer, or any other tool that can process a Java heap dump, and run your app with the flag that generates a heap dump when you run out of memory.
Then analyze the heap dump and look for suspiciously high object counts.
See this article for more information on the heap dump.
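As a sketch, the relevant HotSpot flags are the following (the dump path is just an example):
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps
When an OutOfMemoryError is thrown, the JVM writes an .hprof file there, which MAT can open directly.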
EDIT: Also, please note that your app may just legitimately require more memory than you initially thought. You might first try increasing the Java minimum and maximum memory allocation (-Xms and -Xmx) to something significantly larger and see whether your application runs indefinitely or simply gets slightly further.
The latest version of the Sun JDK includes VisualVM, which is essentially the NetBeans profiler packaged on its own. It works really well.
http://www.yourkit.com/download/index.jsp is the only tool you'll need.
You can take snapshots at (1) app start time, and (2) after running app for N amount of time, then comparing the snapshots to see where memory gets allocated. It will also take a snapshot on OutOfMemoryError so you can compare this snapshot with (1).
For instance, the latest project I had to troubleshoot threw OutOfMemoryError exceptions, and after firing up YourKit I realised that most memory was in fact being allocated to some Ehcache "LFU" class; the point being that we had specified loads of a certain POJO to be cached in memory, but had not specified enough -Xms and -Xmx (initial and maximum JVM memory allocation).
I've also used Linux's vmstat; e.g. some Linux platforms just don't have enough swap enabled, or don't allocate contiguous blocks of memory. And then there's jstat (bundled with the JDK).
UPDATE see https://stackoverflow.com/questions/14762/please-recommend-a-java-profiler
You can also add an UncaughtExceptionHandler to your application's threads. This will catch 'uncaught' exceptions, like an OutOfMemoryError, and you will at least have an idea of where the exception was thrown. Usually that is not where the problem is, but rather just the 'new' that couldn't be satisfied. As a rule I always add an UncaughtExceptionHandler to a thread, if for nothing else than to add logging.
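A minimal sketch of that idea (the System.err logging is just a placeholder):

// Registers a JVM-wide handler for exceptions that escape any thread's run() method.
// Note: on an OutOfMemoryError the handler itself may fail if it has to allocate.
public class UncaughtLogger {
    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread t, Throwable e) {
                System.err.println("Uncaught throwable in thread " + t.getName());
                e.printStackTrace();
            }
        });
    }

    public static void main(String[] args) {
        install();
        throw new RuntimeException("demo: reported by the handler instead of the default mechanism");
    }
}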