Here's a tricky one for you: we have a Java web application deployed on Tomcat web servers on Amazon Elastic Beanstalk (EBS), and we believe we have a memory leak because the JVM seems to crash every night with an OutOfMemoryError.
The problem is that after the crash, EBS automatically scraps the old EC2 instance and starts a fresh one, so all the logs and info get scrapped too.
I am now developing a custom CloudWatch metric to monitor the memory of the JVM (you would think there would be a ready-made one), but that won't help me generate heap dumps.
Has anyone gone through a similar problem and knows how to catch these errors on EBS?
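For the custom metric mentioned above, the value to publish can be read from inside the JVM with the standard java.lang.management API. The following is only a minimal sketch: the class and method names are made up for illustration, and the actual upload to CloudWatch is left out because it depends on the AWS SDK version in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: computes the heap usage figure a custom metric could report.
final class HeapMetricSample {

    /** Percentage of the maximum heap in use, or -1 if the maximum is undefined. */
    static double usedPercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0; // MemoryUsage.getMax() returns -1 when no maximum is defined
        }
        return 100.0 * usedBytes / maxBytes;
    }

    /** Reads the current heap usage of this JVM through the MemoryMXBean. */
    static double currentHeapUsedPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return usedPercent(heap.getUsed(), heap.getMax());
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %.1f%%%n", currentHeapUsedPercent());
    }
}
```

A scheduled task inside the web application could call currentHeapUsedPercent() every minute or so and hand the number to whatever publishes the metric.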
This certainly sounds like unusual EC2 (not EBS) instance behaviour. It's interesting that if Tomcat falls over, the machine instance is affected (in terms of stopping or terminating).
This is what I would suggest to diagnose:
get a running instance ready to examine / play with
take a look at the "Termination Protection" setting: is it "enabled" or not? That could explain the "scrapping" part of your problem (if by scrapping you mean the instance terminates and is removed). You can find this in the properties of your EC2 instance in the AWS console.
take a look at the Java memory settings your Tomcat server is configured with. Perhaps the maximum heap (Xmx) is bigger than the memory the virtual machine has? If so, Tomcat may literally be running the machine out of memory, which could explain some of the EC2 response to your out-of-memory condition. I assume you mean "stopped" rather than "scrapped"; otherwise how would you know you are getting an out-of-memory error?
if you manually kill the tomcat/java process on a working instance, does the instance stay operational (or do you get booted off and the instance gets stopped)? If something happens simply because you stop tomcat, it means some monitoring process is kicking in and taking down the machine explicitly.
use the -XX:+HeapDumpOnOutOfMemoryError option to produce a dump file when the error occurs - this will help you work out where your leak is and hopefully fix the root cause.
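To confirm on a running instance that the heap-dump-on-OOM option is actually switched on, the JVM can be asked directly. This is a rough sketch that assumes a HotSpot-based JVM (the com.sun.management bean is not available on every vendor's runtime); the class name is invented for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports whether the heap-dump-on-OOM option is active.
final class HeapDumpFlagCheck {

    /** Returns the current value of a HotSpot -XX option as a string. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    /** True when the JVM was started with (or later switched to) -XX:+HeapDumpOnOutOfMemoryError. */
    static boolean heapDumpOnOomEnabled() {
        return Boolean.parseBoolean(vmOption("HeapDumpOnOutOfMemoryError"));
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = " + heapDumpOnOomEnabled());
        System.out.println("HeapDumpPath = '" + vmOption("HeapDumpPath") + "'");
    }
}
```

If the instance is replaced after the crash, the dump path should point somewhere that survives the replacement, otherwise the dump is lost together with the logs.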
Good luck. Hope that helps.
Consider a log collection service like Sumo Logic. The log files you specify are collected and made available for analysis online, so even if your EC2 instances get replaced you can do forensics to see what happened to them.
My application is just a bigger version of the default JHipster app. I don't even have a cache.
I deployed it successfully on an Amazon free tier t1.micro instance.
I experienced some random 503 errors. I checked the health of the instance and it sometimes said "no data sent" some other times "93% of memory is in use". Now it's down (red).
I cloned the environment, then terminated the original one. I get those various errors.
I deployed the war with the dev Spring profile, but I don't believe that is what is causing this much horror.
Do I need to configure the Java memory usage? Why could the app be this memory hungry?
I posted the question on StackOverflow because I care more about performance tuning of the deployed JHipster war, but if you think it's more of a problem with Amazon, please let me know why you think that.
Thanks
Deploy the application on an instance with much more memory, e.g. a t2.large (8 GB)
The size of an existing instance can be altered from the console: "stop" the instance, then under "instance settings" change the "instance type" and start it again
Ensure that your application has a method for attaching jconsole to it available (apparently the development version does, with jmx). See http://docs.oracle.com/javase/8/docs/technotes/guides/management/jconsole.html for more information on jconsole
Run the application and monitor the nice graphs in jconsole
See what the peak is over a few days of normal use. Also log on to the server with ssh and use free -m to see the system memory use ( see http://www.linuxatemyram.com/ for a guide to interpreting the data )
Once you know the actual amount of RAM it uses choose an appropriate instance size, see http://www.ec2instances.info/
You might need to adjust the -Xmx setting. I don't know the specifics with JHipster, but this is a common requirement for Java applications.
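Before changing -Xmx it helps to know what maximum heap the JVM is actually running with, since the default depends on the machine. A tiny sketch using only Runtime (the class name is invented for illustration):

```java
// Hypothetical helper: prints the maximum heap this JVM will attempt to use.
final class MaxHeapReport {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Effective maximum heap in MiB, as set by -Xmx or by the JVM's default. */
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Comparing this figure with the total shown by free -m tells you whether the heap limit fits inside the instance's RAM with room left for the rest of the system.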
We are trying to access an application on a Tomcat that runs on a different host, but it is not loading even though Tomcat is running. It had been running fine for the past 3 months. We restarted Tomcat and now it is working fine.
But we have not been able to zero in on what happened.
Any idea how to trace / what might have caused this?
The CPU usage was normal and the tomcat memory was 1205640.
The memory settings of Tomcat are 1024-2048 (min-max).
We are using tomcat 7.
Help much appreciated....thanks in advance.....cheers!!
...also - not sure on Windows - you may be running out of file descriptors. This typically happens when streams are not properly closed in finally blocks.
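A rough sketch of both points, under some assumptions: the descriptor count is only available on Unix-like systems through the com.sun.management bean of HotSpot-based JVMs, and try-with-resources needs Java 7 or later (on older code the equivalent is a close() in a finally block). Class and method names are invented for illustration.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper: watches descriptor usage and shows a leak-free read.
final class DescriptorCheck {

    /** Open file descriptors of this JVM, or -1 where the Unix bean is unavailable (e.g. Windows). */
    static long openDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** Reads the first line; try-with-resources closes the reader even if readLine() throws. */
    static String firstLine(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("open descriptors: " + openDescriptors());
        System.out.println("first line: " + firstLine(new StringReader("first\nsecond")));
    }
}
```

Logging openDescriptors() at intervals shows whether the count keeps climbing towards the limit between restarts.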
In addition, check with netstat if you have a lot of sockets remaining open or accumulating in wait state.
Less likely, the application is creating threads and never releasing them.
The application is leaking something (memory, file descriptors, sockets, threads,...) and running over a limit.
There are different ways to track this. A profiler may help or more simply, running JVM dumps at regular intervals and checking what is accumulating. The excellent MAT will help you analyze the dumps.
Memory leak problems are not uncommon. If your Tomcat instance had been running for three months and the contained application suddenly became unresponsive, maybe that was the case. One solution (if your resources allow it) could be monitoring that Tomcat instance through JMX using jconsole to see how it behaves.
Trying to diagnose some bizarre Tomcat (7.0.21) and/or JVM errors on a 64-bit linux (CentOS) machine.
I'm load testing our server application and tried hitting it with 100K messages. Launched jvisualvm and kept my eye on the heap the whole time. Everything was looking great* (see below) until I got to about 93K processed messages and then Tomcat just died. Ran a ps on Tomcat's PID number to confirm it was dead.
Up until this crash:
Load test had been running for about 90 minutes (should have finished shortly thereafter since we were at 93K/100K)
CPU was holding strong around 45%
Used heap was around 2GB (plus or minus a bunch after GCs) but heap size grew from 4GB to MAX_HEAP after about 30 minutes
Class loading/unloading was cycling normally
Thread dumps were normal
Nowhere in the server code are any calls to System.exit() - so we can rule that right out (and yes I've double-checked!!!).
I'm not sure if this is Tomcat crashing or the JVM (how do I tell?). And even if I did know, I can't seem to find any indication of what went wrong:
All of the server app's logs just stop without any ERROR messages (even though we have logging universally set to DEBUG and higher)
Tomcat's catalina.out and the respective localhost_access_* files just stop without any info
I've heard it is possible to have Tomcat log a coredump when it dies, but I'm not sure how to do that and online examples aren't helping much.
How would SO go about diagnosing this? What steps should I take to start ruling out all of the possible factors?
Thanks in advance!
If the JVM crashes, you should have a hs_err_pidNNN.log file; you don't have to do anything to enable this. Its location depends on your OS and how you are running Tomcat. On Windows, they can show up on your desktop, unless you are running as a service. Otherwise, they should be in the current working directory of the crashed process.
Your operating system probably provides additional tools for process monitoring; you could describe your environment more, or perhaps ask at serverfault.com.
It's also possible that jvisualvm is actually causing the crash.
I'd try reproducing the problem, and progressively simplify the scenario to help isolate the cause.
Another possibility is that the OS is running out of memory and the OOM Killer is killing your process. In this case, the JVM wouldn't get an opportunity to write a heap dump, or an hs_err_pid file.
You can start java with the option -XX:+HeapDumpOnOutOfMemoryError to have the JVM write a heap dump when it fails with an OutOfMemoryError.
More details here Using HeapDumpOnOutOfMemoryError parameter for heap dump for JBoss.
Sorry I had to remove the green check from @erickson. I finally figured out what was killing Tomcat.
It looks like a profiler plugin is not configured correctly with VisualVM and attempting to run a profile on the Tomcat process killed it.
Investigating why right now, and will update this answer once I know more.
There is a Java Struts application running on Tomcat that has some memory errors. Sometimes it becomes slow and hoards all of Tomcat's memory until it crashes.
I know how to find and repair "normal code errors" using tests, debugging, etc., but I don't know how to deal with memory errors (How can I reproduce them? How can I test? In which places in the code is it most common to create a memory error?).
In one question: Where can I start? Thanks
EDIT:
A snapshot sent by the IT department (I don't have direct access to the production application)
Use one of the many "profilers". They hook into the JVM and can tell you things like how many new objects are being created per second, and what type they are etc.
Here's just one of many: http://www.ej-technologies.com/products/jprofiler/overview.html
I've used this one and it's OK.
http://kohlerm.blogspot.com/
It is quite a good intro on how to find memory leaks using Eclipse Memory Analyzer.
If you prefer video tutorials, try YouTube; although it is Android-specific, it is very informative.
If your application becomes slow, you could create a heap dump and compare it to another heap dump created when the system is in a healthy condition. Look for differences in the larger data structures.
You should run it under a profiler (JProfiler or YourKit, for example) for some time and watch the memory/resource usage. Also try taking thread dumps.
There are a couple of options. A profiler is one of them; another is to dump the Java heap to a file and analyze it with a special tool (e.g. the IBM JVM provides a very cool tool called Memory Analyzer that presents a very detailed report of the memory allocated at the time of the JVM crash - http://www.ibm.com/developerworks/java/jdk/tools/memoryanalyzer/).
A third option is to start your server with the JMX server enabled and connect to it via JConsole; with this approach you would be able to monitor memory usage/allocation at runtime. JConsole is provided with the standard Sun JDK under the bin directory (here you may find how to connect to Tomcat via JConsole - Connecting remote tomcat JMX instance using jConsole)
I have an application running on WebSphere Application Server 6.0 and it crashes nearly every day because of out-of-memory errors. From the verbose GC output it is certain that there are memory leaks (many of them).
Unfortunately the application is provided by an external vendor and getting things fixed is a slow and painful process. As part of the process I need to gather the logs and heap dumps each time the OOM occurs.
Now I'm looking for a way to automate this. The fundamental problem is how to detect the OOM condition. One way would be a shell script that periodically searches for new heap dumps, but this approach seems kind of dirty to me. Another approach might be to leverage JMX somehow, but I have little or no experience in this area and not much idea how to do it.
Or does WAS have some kind of trigger or hook for this? Thank you very much for any advice!
You can pass the following arguments to the JVM on startup and a heap dump will be automatically generated on an OutOfMemoryError. The second argument lets you specify the path for the heap dump file. By using this at least you could check for the existence of a specific file to see if a heap dump has occurred.
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=<value>
I see two options if you want heap dumping automated but @Mark's solution with heap dump on OOM isn't satisfactory.
You can use the MemoryMXBean to detect high memory pressure, and then programmatically create a heap dump if the usage (or usage delta) seems high.
You can periodically get memory usage info and generate heap dumps with a cron'd shell script using jmap (works both locally and remote).
It would be nice if you could have a callback on OOM, but, uhm, that callback probably would just crash with an OOM error. :)
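The first option could look roughly like the sketch below. It is not a definitive implementation: it assumes a HotSpot-based JVM (for the com.sun.management dumpHeap call) and Java 8 or later for the lambda, uses usage-threshold notifications on the heap memory pools rather than polling, and all class, method and path names are invented for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;

// Hypothetical helper: dumps the heap once a heap pool crosses a usage threshold.
final class HeapPressureWatcher {

    /** Threshold in bytes for a pool of the given maximum size; 0 means "no threshold". */
    static long thresholdBytes(long maxBytes, double fraction) {
        if (maxBytes <= 0 || fraction <= 0.0 || fraction > 1.0) {
            return 0L;
        }
        return (long) (maxBytes * fraction);
    }

    /** Writes a dump of live objects; the target file must not already exist. */
    static void dumpHeap(String file) {
        try {
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class)
                    .dumpHeap(file, true);
        } catch (IOException e) {
            System.err.println("heap dump failed: " + e);
        }
    }

    /** Arms thresholds on the heap pools that support them and registers the listener. */
    static void install(double fraction, String dumpDir) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
                MemoryUsage usage = pool.getUsage();
                long threshold = usage == null ? 0L : thresholdBytes(usage.getMax(), fraction);
                if (threshold > 0) {
                    pool.setUsageThreshold(threshold);
                }
            }
        }
        // The MemoryMXBean emits a notification when a pool's threshold is exceeded.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener((notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                dumpHeap(dumpDir + "/heap-" + System.currentTimeMillis() + ".hprof");
            }
        }, null, null);
    }

    public static void main(String[] args) {
        install(0.9, System.getProperty("java.io.tmpdir"));
        System.out.println("heap threshold listener installed");
    }
}
```

Because the dump is taken at, say, 90% usage rather than at the OutOfMemoryError itself, the JVM still has some headroom to write it; the price is that a busy but healthy heap can also trigger a dump.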
Have you looked at JConsole ? It uses JMX to give you visibility of a variety of JVM metrics, including memory info. It would probably be worth monitoring your application using this to begin with, to get a feel for how/when the memory is consumed. You may find the memory is consumed uniformly over the day, or when using certain features.
Take a look at the detecting low memory section of the above link.
If you need you can then write a JMX client to watch the application automatically and trigger whatever actions required. JConsole will indicate which JMX methods you need to poll.
An alternative to waiting until the application has crashed may be to script a controlled restart, e.g. every night, if you're optimistic that it can survive for twelve hours.
Maybe even websphere can do that for you !?
You could add a listener class (a session-scoped or application-scoped attribute listener) that would be called each time a new object is added to the session/application scope.
In it, you can check the total memory used by the app (and log it) as well as request a GC run (note that requesting it does not guarantee that the GC will actually run).
(The above is for the logging part and gc based on usage growth)
For scheduled gc:
In addition you can keep a timer task class that runs every few hours and requests a GC.
Our experience with ITCAM has been less than stellar from the monitoring perspective. We dumped it in favor of CA Wily Introscope.
Have you had a look at the jvisualvm tool in the latest Java 6 JDKs?
It is great for inspecting running code.
I'd dispute that you need the heap dumps from the moment the OOM occurs. Periodic gathering of the information over time should give a picture of what's going on.
As has been observed, various tools exist for analysing these problems. I have had success with ITCAM for WebSphere; as an IBMer I have ready access to that. We were very quickly able to identify the exact lines of code in our problem situation.
If there's any way you can get a tool of that nature then that's the way to go.
It should be possible to write a simple program to get the process list from the kernel and scan it to see if your WAS process is still running. On a Unix box you could probably whip up something in Perl in a few minutes (if you know Perl), not sure how difficult it would be under Windows. Run it as a scheduled task every five minutes or so, and if the process doesn't show up you could have it fork off another process that would deal with the heap dump and re-start WAS.
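The same check can also be written as a small standalone Java program instead of Perl. This sketch assumes the watcher itself runs on Java 9 or later (for ProcessHandle), independently of the Java version WAS uses, and that the WAS process id is known, for instance from a pid file; the class name is invented for illustration.

```java
// Hypothetical watcher: reports whether a process with the given pid is still alive.
final class ProcessWatch {

    /** True if a process with this pid exists and has not terminated. */
    static boolean isRunning(long pid) {
        return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
    }

    public static void main(String[] args) {
        // Without an argument, check this JVM's own pid as a demonstration.
        long pid = args.length > 0 ? Long.parseLong(args[0]) : ProcessHandle.current().pid();
        System.out.println("process " + pid + (isRunning(pid) ? " is running" : " is not running"));
    }
}
```

Run from a scheduled task every few minutes, a "not running" result would be the trigger for collecting the heap dump and logs and restarting the server.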