Trying to diagnose some bizarre Tomcat (7.0.21) and/or JVM errors on a 64-bit Linux (CentOS) machine.
I'm load testing our server application and tried hitting it with 100K messages. I launched jvisualvm and kept my eye on the heap the whole time. Everything was looking great (see below) until I got to about 93K processed messages, and then Tomcat just died. I ran ps against Tomcat's PID to confirm it was dead.
Up until this crash:
The load test had been running for about 90 minutes; it should have finished shortly thereafter, since we were at 93K/100K
CPU was holding strong around 45%
Used heap was around 2GB (plus or minus a bunch after GCs) but heap size grew from 4GB to MAX_HEAP after about 30 minutes
Class loading/unloading was cycling normally
Thread dumps were normal
Nowhere in the server code are any calls to System.exit() - so we can rule that right out (and yes I've double-checked!!!).
I'm not sure if this is Tomcat crashing or the JVM (how do I tell?). And even if I did know, I can't seem to find any indication of what went wrong:
All of the server app's logs just stop without any ERROR messages (even though we have logging universally set to DEBUG and higher)
Tomcat's catalina.out and the respective localhost_access_* files just stop without any info
I've heard it is possible to have Tomcat log a core dump when it crashes, but I'm not sure how to do that, and online examples aren't helping much.
How would SO go about diagnosing this? What steps should I take to start ruling out all of the possible factors?
Thanks in advance!
If the JVM crashes, you should have a hs_err_pidNNN.log file; you don't have to do anything to enable this. Its location depends on your OS and how you are running Tomcat. On Windows, they can show up on your desktop, unless you are running as a service. Otherwise, they should be in the current working directory of the crashed process.
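If you're not sure which directory the JVM was started from, a quick way to locate any fatal error log is to search for the default file name pattern (the paths below are just examples to adapt):

find /tmp /var /usr/share/tomcat* -name 'hs_err_pid*.log' 2>/dev/null   # likely locations
find / -name 'hs_err_pid*.log' 2>/dev/null                              # brute force, whole filesystem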
Your operating system probably provides additional tools for process monitoring; you could describe your environment more, or perhaps ask at serverfault.com.
It's also possible that jvisualvm is actually causing the crash.
I'd try reproducing the problem, and progressively simplify the scenario to help isolate the cause.
Another possibility is that the OS is running out of memory and the OOM Killer is killing your process. In this case, the JVM wouldn't get an opportunity to write a heap dump, or an hs_err_pid file.
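A rough way to check for that, assuming a typical syslog location (it varies by distro; on CentOS it is usually /var/log/messages):

dmesg | grep -i -E 'oom-killer|out of memory|killed process'
grep -i 'killed process' /var/log/messages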
You can start the JVM with the -XX:+HeapDumpOnOutOfMemoryError option to have it write a heap dump when it dies with an OutOfMemoryError.
More details here: Using HeapDumpOnOutOfMemoryError parameter for heap dump for JBoss.
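For Tomcat, a minimal sketch of wiring that flag in via bin/setenv.sh (the dump directory here is just an example; pick any path the Tomcat user can write to):

# bin/setenv.sh - picked up by catalina.sh on startup
export CATALINA_OPTS="$CATALINA_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/tomcat/dumps"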
Sorry I had to remove the green check from #erickson. I finally figured out what was killing Tomcat.
It looks like a profiler plugin was not configured correctly in VisualVM, and attempting to profile the Tomcat process killed it.
Investigating why right now, and will update this answer once I know more.
Related
I have a problem, and my job depends on it.
There are some Java apps running in Tomcat under Linux that crash randomly (the apps are not mine and cannot be modified).
Every morning we find some app broken.
I want to see the Java stack trace from the moment the app crashed, so I can read the JVM's message (OutOfMemoryError, NullPointerException, etc.) and hopefully get a hint for fixing the problem.
I do not know how to do this.
Searching the internet I found VisualVM and JConsole. Are they enough for what I want to do?
I want to see the JVM's stack trace messages at the moment of the crash.
I need help. Thank you very much.
It looks like you have a memory leak issue. Does the app work for a particular period of time after a restart?
You might want to see what is happening inside the Java heap; for that you can take a heap dump. Use the jcmd utility for this; you can find it in the JDK installed on your server.
jcmd <process id/main class> GC.heap_dump filename=filename
NOTE: This will do a GC every time this runs.
To schedule this, you can set up a cron job.
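A minimal cron sketch, assuming jcmd is on cron's PATH and /var/dumps is writable by the tomcat user (all of these are assumptions to adapt):

# /etc/cron.d/tomcat-heapdump (hypothetical file) - hourly heap dump of the Tomcat JVM
# jcmd must run as the same user as the target JVM; % is escaped because this is a crontab
0 * * * * tomcat jcmd org.apache.catalina.startup.Bootstrap GC.heap_dump filename=/var/dumps/heap-$(date +\%H).hprof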
Alternatively, if you specify the -XX:+HeapDumpOnOutOfMemoryError command-line option when running your application, the JVM will generate a heap dump file (java_pid<pid>.hprof by default, in the working directory or wherever -XX:HeapDumpPath points) when an OutOfMemoryError is thrown.
Hope this helps. :)
We are trying to access an application on a Tomcat instance that runs on a different host, but it is not loading even though Tomcat is running. It had been running fine for the past 3 months. We restarted Tomcat and now it is working fine again.
But we have not been able to zero in on what happened.
Any idea how to trace / what might have caused this?
The CPU usage was normal and the Tomcat memory was 1205640.
The memory settings of Tomcat are 1024-2048 (min-max).
We are using tomcat 7.
Help much appreciated....thanks in advance.....cheers!!
...also (I'm not sure about Windows) you may be running out of file descriptors. This typically happens when streams are not properly closed in finally blocks.
In addition, check with netstat if you have a lot of sockets remaining open or accumulating in wait state.
Less likely, the application is creating threads and never releasing them.
The application is leaking something (memory, file descriptors, sockets, threads,...) and running over a limit.
There are different ways to track this. A profiler may help; or, more simply, take JVM dumps at regular intervals and check what is accumulating. The excellent MAT will help you analyze the dumps.
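As a crude sketch of watching those counters from the outside (the pgrep pattern and interval are placeholders; run it as the same user as the server or as root):

PID=$(pgrep -f org.apache.catalina.startup.Bootstrap)    # assumed way to find the Tomcat PID
while true; do
  date
  echo "fds:     $(ls /proc/$PID/fd | wc -l)"             # open file descriptors
  echo "threads: $(ls /proc/$PID/task | wc -l)"           # live threads
  echo "sockets in wait: $(netstat -tan | grep -c WAIT)"  # TIME_WAIT / CLOSE_WAIT sockets
  sleep 60
done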
Memory leak problems are not uncommon. If your Tomcat instance was running for three months and the contained application suddenly became unresponsive, maybe that was the case. One solution (if your resources allow it) could be to monitor that Tomcat instance through JMX, using jconsole, to see how it behaves.
I have been having occasional problems with a server I wrote. It's in Clojure, but I don't think that matters, and we can pretend it's in Java. Anyway, it works fine for hours at a time, but goes into fits where it behaves very badly: all activity stops, for around fifteen seconds, and then it works normally for a few seconds, then stops for fifteen seconds...and so on for (usually) about ten minutes or so, after which it goes back to behaving normally.
I've done a lot of profiling of it with YourKit, and I've ruled out a number of plausible suspects:
It's not a garbage collection issue: I'm running it with -XX:+UseConcMarkSweepGC, and I've verified that the server continues to run just fine during both minor and major collections, due to the concurrent nature of this garbage collector. And we're not thrashing as we run out of total memory or something: the current heap size is well below its max.
I don't think it's a locking/synchronization issue, but I'm not 100% sure on that. The YourKit profiler shows threads waiting sometimes, eg competing over the lock for System.out to produce log messages, but the only long waits are for worker threads in threadpools when there's nothing to do. And of course YourKit says it's never detected any deadlocks.
It's not something caused by having the profiler attached, because it still happens even if I boot the server up and then leave it alone without ever attaching the profiler.
It's not some other process on the system taking up all the CPU time: top shows CPU usage at 100% for my java process, and basically 0% for everything else.
My biggest problem is that I can't see what the server is doing during these strange funks, because the profiler stops receiving samples. Here's a graph of the CPU usage chart:
The left side of the graph is normal operation, during which we get profiler samples every second or so. The right side is "broken", and is very spiky because the profiler is only getting samples every ten seconds or so. In the samples it does get, the server seems to be doing its usual business: responding to requests and so on; and the logs confirm that it is doing normal stuff, but only at the times the profiler has samples for: during the upward-sloping "straight lines" on the graph, for which the profiler has no samples, the server is doing nothing at all.
So, does this graph look familiar to anyone? Have you had this problem before and fixed it? Or can you point me in the direction of a tool that can figure out what my server is doing during the time when YourKit can't? In case it matters, the server machine is running Ubuntu 10.04, and
$ java -version
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.10) (rhel-1.28.1.10.10.el5_8-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
Okay, from the comments it seems clear to me we are not going to be able to figure this out with the information you've given so far. The best we can do is to give suggestions on how to debug it...
I would try to use jstack during one of the spikes and see if you can use that to figure out where it hangs.
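For example, something along these lines while the server is in one of its funks, then diff the dumps to see which threads are stuck (the PID is a placeholder):

# take three thread dumps ten seconds apart and compare them
for i in 1 2 3; do jstack -l <java-pid> > jstack-$i.txt; sleep 10; done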
If you have no way to measure or debug in the code, try to look at it from the outside.
First, I would try to reproduce the problem. In other words, is there an external event that produces the behavior? Try changing the load on the server; change everything you can in order to reproduce the problem.
It may also be a good idea to sniff the network traffic (tcpdump) and look for something interesting around the time your server hangs.
You can also run it on another operating system to check whether it depends on your installation environment.
If you can't reproduce a situation where the problem occurs, try to find situations where you don't get the problem. For instance, disconnect the server from the network, or shut down all other services.
If none of that changes the behavior of your program, try reducing the complexity of your working code and see if you can find an internal module that seems to be related to the problem.
"Have you had this problem before and fixed it? Or can you point me in the direction of a tool that can figure out what my server is doing during the time when YourKit can't?"
If you have shell access on the server and can see stdout, try taking a thread dump when the server becomes unresponsive. Not sure if this will give you anything different than what jstack (mentioned in the other answer) would give you or not.
On Ubuntu: kill -QUIT <java-pid> (will not actually kill the Java process).
http://www.crazysquirrel.com/computing/java/basics/java-thread-dump.jspx
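If you want several dumps in a row to compare, a small sketch (each dump goes to wherever the process's stdout is redirected):

# each SIGQUIT makes the JVM print a full thread dump to its stdout
for i in 1 2 3; do kill -QUIT <java-pid>; sleep 10; done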
Short description of my problem: I start up Tomcat with my deployed Wicket application. When I want to shut down tomcat I get this error message:
Error occurred during initialization of VM
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640)
at java.lang.ref.Reference.<clinit>(Reference.java:145)
I am running the following setup:
Ubuntu Linux: 10.04 (lucid) with a 2.6.18-028stab094.3 kernel
Java Version: "1.6.0_26" Java HotSpot(TM) 64-Bit Server VM
Tomcat Version: 7.0.23
jvm_args: -Xms512m -Xmx512m -XX:MaxPermSize=205m (these are added via CATALINA_OPTS, nothing else)
Wicket 1.5.1
Tomcat is configured with two virtual hosts on subdomains with mod_proxy
My application is deployed as ROOT.war in the appbase directory (it makes no difference if I deploy one or both applications)
With no application deployed, shutdown does not result in an OOM, unless I mess around with the JVM args
The size of the war is about 500k, all libraries are deployed in tomcat/common/lib (directory which I added to common.loader in conf/catalina.properties)
ulimit -u -> unlimited
When I check the Tomcat manager app it says the following about the JVM memory:
Free memory: 470.70 MB Total memory: 490.68 MB Max memory: 490.68 MB
(http connector) Max threads: 200 Current thread count: 6 Current thread busy: 1
'top' and 'free -m' show similar numbers:
Mem: 2097152k total, 1326772k used, 770380k free, 0k buffers
20029 myuser 18 0 805m 240m 11m S 0 11.7 0:19.24 java
I tried to start jmap to get a dump of the heap, but it also fails with an OutOfMemoryError. In fact, as long as one or both of my applications are deployed, any other Java process fails with the same OOM error (see top).
The problem occurs while the application is deployed, so something is seriously wrong with it. However, the application actually runs smoothly for quite a while. But I have seen OOMs in the application as well, so I don't trust the calm.
My application is using a custom filter class. Could that be it?
For completeness (hopefully), here's the list of libraries in my common/lib:
activation-1.1.jar
antlr-2.7.6.jar
antlr-runtime-3.3.jar
asm-3.1.jar
asm-commons-3.1.jar
asm-tree-3.1.jar
c3p0-0.9.1.1.jar
commons-collections-3.1.jar
commons-email-1.2.jar
dependencies-provided.tgz
dom4j-1.6.1.jar
ejb3-persistence-1.0.2.GA.jar
geronimo-annotation_1.0_spec-1.1.1.jar
geronimo-jaspic_1.0_spec-1.0.jar
geronimo-jta_1.1_spec-1.1.1.jar
hibernate-annotations-3.4.0.GA.jar
hibernate-commons-annotations-3.1.0.GA.jar
hibernate-core-3.3.0.SP1.jar
hibernate-entitymanager-3.4.0.GA.jar
hibernate-search-3.1.0.GA.jar
javassist-3.4.GA.jar
joda-time-1.6.2.jar
jta-1.1.jar
log4j-1.2.16.jar
lombok-0.9.3.jar
lucene-core-2.4.0.jar
mail-1.4.1.jar
mysql-connector-java-5.1.14.jar
persistence-api-1.0.jar
quartz-2.1.1.jar
servlet-api-2.5.jar
slf4j-api-1.6.1.jar
slf4j-log4j12-1.6.1.jar
stringtemplate-4.0.2.jar
wicket-auth-roles-1.5.1.jar
wicket-core-1.5.1.jar
wicket-datetime-1.5.1.jar
wicket-extensions-1.5.1.jar
wicket-request-1.5.1.jar
wicket-util-1.5.1.jar
xml-apis-1.0.b2.jar
I appreciate any hint or even speculation that gives me additional ideas what to try.
Update: I tested some more and found that this behaviour only occurs while one or both of my applications are deployed. The behaviour does not occur on "empty" tomcat (that was a mistake on my part messing with jvm args)
Update2: I am currently experimenting, trying to reproduce this behaviour in a virtual box, because I want to debug this with a profiler. I am still not convinced that it should be impossible to run my setup on 2GB of RAM.
Update3 (10/01/12): I am trying to run jenkins instead of my own application. Same behaviour, so it is definitely server configuration issues. Jenkins jobs fail when maven is called, so I need not even try the shutdown hack suggested below because I need a second java process while running Jenkins. It was suggested to me that because this is a Virtual Server ulimits may be imposed from outside and I would not be able to see them. I think I'll ask a new question regarding this. Thx all.
Update4 (02/05/12): see below for the answer that contains the hint. I'll clarify some more up here: I am now 95% sure that the errors occur because I am reaching my thread limit. However because this is a virtual server the method described below would not work to check this value because it is not visible with ulimit, that was what was confusing me and only today I found out that this is the "numproc" value that I can see in the Parallels Power Panel that I can log into for my virtual server. There were Resource Alerts for numproc but I did not see those either until just now. The value has a hard limit of 96 which I cannot change of course. The current value of numproc corresponds to the number of processes I see with "top" after toggling "H" to see threads. I had a very hard time finding this because this numproc value is hidden deep inside the panel. Sadly 96 is a rather low number if you want to run a tomcat with apache and mysql. I am also very sad that I cannot even find this value in the small print of my hosting contract and it is rather relevant to my application I dare say. So I guess I'll need a server upgrade.
Thanks all for your helpful answers in the end everyone helped me a bit to find out what the problem was.
The Tomcat shutdown procedure consists of sending a command word over a TCP port to the running Tomcat VM. This port is configured in server.xml (if I remember correctly; I'm writing on my phone right now). So far so good.
Unfortunately, the shutdown script does this by starting a second JVM using the same Java options used for Tomcat. Your system simply does not have enough memory for this.
As a solution, you could write your own stop script using telnet or something similar.
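A minimal sketch of such a stop script, assuming the default shutdown port 8005 and shutdown command "SHUTDOWN" from server.xml (check your <Server> element; both are configurable):

#!/bin/sh
# send the shutdown command straight to Tomcat's shutdown port
# instead of starting a second JVM via catalina.sh stop
echo SHUTDOWN | nc localhost 8005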
I could help with that later if needed.
Hope that helps.
Best regards, Bert
Seems you have too many threads open.
Use this command:
ulimit -u
What is the result?
It should be something like:
max user processes (-u) 100
If this is correct, you can edit this file :
/etc/security/limits.conf
and add the following modifications:
#<domain> <type> <item> <value>
user soft nproc 10000
user hard nproc 10000
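To see how close you are to the limit, you can count your user's processes and threads (nproc counts every thread; the username tomcat below is a placeholder):

ps -eLf | grep -c '^tomcat'   # threads/processes owned by the tomcat user
ulimit -u                     # the per-user limit (run as that user)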
You can probably survive for a while like this. All you need to do is kill the tomcat process whenever you need to restart it. It is not a nice approach, but the main concern is that your application runs correctly.
It seems to me though, that on the long run, you might need to order a hosting plan with more RAM available.
I was having a similar problem with a tomcat installation just last week. I managed to fix it by giving tomcat a smaller heap. Something like this:
export CATALINA_OPTS="-Xms256m -Xmx512m"
Before starting Tomcat may help. In the meantime you'll have to kill it the old fashioned way, with a kill -9 ;)
EDIT: you could also take a look here; it appears Tomcat automatically creates a bunch of "spare" threads, but you can limit those, as well as your max thread count, in the config. Hope it helps.
Recently, while working on a JSF web app using NetBeans 6.8, I have constantly been getting PermGen: Out Of Memory Errors. I have also noticed that this is not related to hot-swapping the code, as some people suggested on the forums; I generally restart my local web server, Tomcat 6.0, whenever I redeploy the code. This used to happen to me once in a while, but of late it has been occurring constantly. I usually can't go more than two minutes before it crashes.
The important observation I've made about this problem, is that it only seems to happen when running the debugger. If I launch the server, regularly, it will run indefinitely. As soon as I run in debug mode, this problem occurs.
I've tried all the tips I've found so far for increasing the JAVA_OPTS memory settings for Java in Tomcat; I've tried increasing the available memory for NetBeans in netbeans.conf. Still no luck. If you want to see the specific configuration changes I've made, I can post them as well.
I've also read that this can be a result of memory leaks in Java. I've tried running Netbean's profiler, but it would generally crash as well before I could do anything really useful. Additionally, when it did run, all the object allocations with ridiculous generations were things in java libraries, or primitives -- char[]s were the biggest memory hog of the app, for example, with the largest generations.
I would really like to know if anyone has had a similar problem before, and if so, how they solved it. This is starting to seriously impede my ability to do my work.
Thanks for any help.
Add this entry in catalina.sh (or .bat); it worked for me:
JAVA_OPTS="-Djava.awt.headless=true -Dfile.encoding=UTF-8
-server -Xms1536m -Xmx1536m
-XX:NewSize=256m -XX:MaxNewSize=512m -XX:PermSize=512m
-XX:MaxPermSize=512m -XX:+DisableExplicitGC
Something I have found useful for tracking down memory leaks without running a profiler or a debugger is the "jmap -histo <pid>" command (it comes with the JDK). Save the output of this command to a file, and run it every few minutes while your application is running. Collect the outputs and look for objects that keep increasing in number and size. I even wrote a quick app to graph selected objects over time, to really highlight runaway objects and make it easier to see where leaks might be occurring.
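A rough sketch of that approach (the PID and interval are placeholders):

# collect a class histogram every five minutes; diff the files later to spot classes that only grow
while true; do jmap -histo <pid> > histo-$(date +%H%M%S).txt; sleep 300; done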