We have a Spring/Hibernate/JPA web application in production. We suspect a memory leak in session objects. We are uploading Excel records using Apache POI into MySQL. The commit frequency is 10 records, but each commit pauses for 5 to 10 seconds and CPU stays at almost 100% throughout the import process. Is there any way to profile the Hibernate sessions in my application and find out what is causing such high CPU usage? I was checking out Hibernating Rhinos' Hibernate Profiler, but it seems too confusing to configure and needs changes in the code. Since we need to profile a production or staging instance, is there any JPA/Hibernate session profiler that doesn't require many changes to the application configuration or code?
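Aside from profiling: with a commit every 10 records, pauses like this often come from the persistence context accumulating every imported entity. Below is a minimal sketch of a batched import loop with periodic flush/clear; ImportRecord and the method names are illustrative (not from the question), and hibernate.jdbc.batch_size is assumed to be set in persistence.xml (e.g. to 50).

    import java.util.List;
    import javax.persistence.Entity;
    import javax.persistence.EntityManager;
    import javax.persistence.EntityTransaction;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;

    public class ExcelImporter {

        private static final int BATCH_SIZE = 50;  // align with hibernate.jdbc.batch_size

        // Persist rows parsed from the Excel sheet, flushing and clearing the
        // persistence context every BATCH_SIZE rows so the session does not
        // keep every imported entity in memory for the whole import.
        public void importRecords(EntityManager em, List<ImportRecord> records) {
            EntityTransaction tx = em.getTransaction();
            tx.begin();
            for (int i = 0; i < records.size(); i++) {
                em.persist(records.get(i));
                if ((i + 1) % BATCH_SIZE == 0) {
                    em.flush();   // push the batch of INSERTs to MySQL
                    em.clear();   // detach the persisted entities
                }
            }
            tx.commit();
        }
    }

    @Entity
    class ImportRecord {          // hypothetical entity standing in for one Excel row
        @Id @GeneratedValue Long id;
        String value;
    }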
Use VisualVM, with all the plug-ins installed. Attach it to your app's JVM PID when you start it up. It'll show you memory, threads, and lots more.
I don't think it'll be a good idea to profile on a production server. Put that code on another box and run a significant load for a long period of time. That'll show the problem.
Is this the JConsole plugin you're looking for?
http://hibernate-jcons.sourceforge.net/
We have a major challenge which has been stumping us for months now.
A couple of months ago we took over the maintenance of a legacy application, where the last developer to touch the code left the company several years ago.
This application needs to be more or less always online. It was developed many years ago without staging or test environments, and without a redundant infrastructure setup.
We're dealing with a legacy Java EJB application running on Payara application server (Glassfish derivative) on an Ubuntu server.
Within the last year or two, it has been necessary to restart Payara approximately once a week, and the Ubuntu server once a month.
This is due to a memory leak which slows down the application over a period of around a week. The GUI becomes almost entirely non-responsive, but a restart of Payara fixes this, at least for a while.
However, after each Payara restart there is still some kind of residual memory use. The baseline memory usage increases, thereby reducing the time between Payara restarts. Roughly every month we therefore do a full Ubuntu reboot, which fixes the issue.
Naturally we want to find the memory leak, but we are unable to run a profiler on the server because it's resource intensive, and would need to run for several days in order to capture the memory leak.
We have also tried several times to dump the heap using the "gcore" command, but it always results in a segfault and then we need to reboot the Ubuntu server.
What other options / approaches do we have to figure out which objects in the heap are not being garbage collected?
I would try to clone the server in some way to another system where you can perform tests without clients being affected. It could even be a system with fewer resources, if you want to trigger a resource-based problem.
To be able to observe the memory leak without having to wait for days, I would create a load test, maybe with Apache JMeter, to simulate a week's worth of accesses within a day or even hours or minutes (I don't know whether the base load is at a level where that is feasible for the server and network infrastructure).
First you could set up the load test to use a "regular" mix of requests like those seen in the wild. Once you can trigger the loss of responsiveness, you can try to find out whether there are specific requests that are more likely to cause the leak than others. (It could also be that some basic component that is reused in nearly every call contains the leak, in which case you cannot single out "the" call with the leak.)
Then you can instrument this test server with a profiler.
As another approach (which you could pursue in parallel), you can use a static code analysis tool like SonarQube to scan the source code for typical memory leak patterns.
One more idea comes to mind, but it has many preconditions: if you have recorded typical scenarios for the backend calls, if you have enough development resources, and if it is a stateless web application where each call can be inspected more or less individually, then you could try to set up partial integration tests that simulate the incoming web calls, with database and file access but, if possible, without the application server, and record the increase in heap usage after each of the calls. Statistically you might be able to identify the "bad" call this way. (So this is something I would try only as a very last option.)
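To make that last idea concrete, here is a rough sketch of a per-call heap measurement; simulateCall() is a placeholder for whatever reproduces one type of backend call in the partial integration test, and System.gc() is only a hint to the JVM, so the numbers are approximate.

    public class HeapDeltaProbe {

        // Approximate retained heap: request a GC first so mostly-reachable
        // memory is measured rather than collectable garbage.
        static long usedHeap() {
            System.gc();
            Runtime rt = Runtime.getRuntime();
            return rt.totalMemory() - rt.freeMemory();
        }

        public static void main(String[] args) {
            long before = usedHeap();
            for (int i = 0; i < 1_000; i++) {
                simulateCall();
            }
            long after = usedHeap();
            System.out.printf("Heap growth after 1000 calls: %d KB%n", (after - before) / 1024);
        }

        // Placeholder: would invoke the backend logic of one call type,
        // with database and file access but without the application server.
        static void simulateCall() {
        }
    }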
Apart from heap dumps, have you tried any real-time application performance monitoring (APM) such as AppDynamics or an open-source alternative like https://github.com/scouter-project/scouter?
An alternative approach would be to look for known issues in your stack, e.g. Payara issues like https://github.com/payara/Payara/issues/4098, or perhaps in the Ubuntu patch level you are currently running the app on.
You can use jmap, a tool bundled with the JDK, to check the memory. From the documentation:
jmap prints shared object memory maps or heap memory details of a given process or core file or a remote debug server.
For more information see the documentation, or the Stack Overflow question How to analyse the heap dump using jmap in java.
There is also a tool called jhat which can be used to analyze the Java heap.
From the documentation:
The jhat command parses a java heap dump file and launches a webserver. jhat enables you to browse heap dumps using your favorite webbrowser. jhat supports pre-designed queries (such as 'show all instances of a known class "Foo"') as well as OQL (Object Query Language) - a SQL-like query language to query heap dumps. Help on OQL is available from the OQL help page shown by jhat. With the default port, OQL help is available at http://localhost:7000/oqlhelp/
See the jhat documentation, or How to analyze the heap dump using jhat.
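If jmap is too heavy for the box or fails the way gcore does, one alternative (a sketch, assuming a HotSpot-based JVM, which Payara normally runs on) is to trigger the dump from inside the running JVM itself via the HotSpotDiagnosticMXBean, for example from an admin-only servlet or over JMX:

    import java.io.IOException;
    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    public final class HeapDumper {

        // Writes an .hprof file that can be analyzed offline (Eclipse MAT, VisualVM, jhat).
        // live = true dumps only reachable objects, which is what you want for leak hunting.
        public static void dumpHeap(String filePath, boolean live) throws IOException {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(filePath, live);
        }

        private HeapDumper() {
        }
    }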
I ran into a problem with Solr going OutOfMemory. The situation is as follows. We had 2 Amazon EC2 small instances (3.5G) each running a Spring/BlazeDS backend in Tomcat 6 (behind a load balancer). Each instance has its own local Solr instance. The index size on disk is about 500M. The JVM settings had been unchanged for months (-Xms512m, -Xmx768m). We use Solr to find people based on properties they entered in their profile and documents they uploaded. We're not using the Solr update handler, only select. Updates are done using delta imports. The Spring app in each Tomcat instance has a job that triggers the /dataimport?command=delta-import handler every 30 seconds.
This worked well for months, even for over a year if I'm correct (I haven't been on the project that long). CPU load was minimal, with only occasional peaks.
The past week we suddenly had OutOfMemory crashes of Solr on both machines. I reviewed my changes over the past few weeks, but none of them seemed related to Solr. Bug fixes in the UI, something email related, but again: nothing in the Solr schema or queries.
Today we changed the EC2 instances to m1.large (7.5G) and the Solr JVM settings to -Xms2048m / -Xmx3072m. This helped a bit; they now run for 3 to 4 hours, but eventually they crash too.
Oh, and the dataset (number of rows, documents, entities, etc.) did not change significantly. There is constant growth, but it doesn't make sense to me that it still crashes when I triple the JVM memory...
The question: do you have any directions to point me in?
Measure, don't guess. Instead of guessing what has changed and could be causing your problems, attach a memory leak detection tool, e.g. Plumbr. Run your Solr with the tool attached and see whether it tells you the exact cause of the memory leak.
Take a look at your Solr cache settings. Reducing the size of the document cache helped us stabilize a Solr 3.6 server that was also experiencing OutOfMemory errors. The query result cache size may also be relevant in your case; it was not in mine.
You can see your Solr cache usage on the admin page for your core:
http://localhost:8983/solr/core0/admin/stats.jsp#cache
(Replace core0 with the name of your Solr core)
documentCache
https://wiki.apache.org/solr/SolrCaching#documentCache
queryResultCache
https://wiki.apache.org/solr/SolrCaching#queryResultCache
I am developing a web application with Hibernate, JPA, Spring and Struts2. When I run the application for a few hours on my web server (VPS Tomcat), the OS sends a SIGKILL to Tomcat because of its memory usage. My server has 288 MB of RAM; Tomcat gets killed when it reaches approximately 200 MB. Someone has told me that I need more memory, but my application is small and doesn't have much traffic; it is not in production yet. I am using PostgreSQL and my database is about 150 MB; it contains many images. I have tried to use a memory profiler with NetBeans, but the IDE becomes too slow and I have not been able to find anything.
I'd appreciate any help.
Do you properly close your connections in a finally block?
It's hard to answer without the code, with only this information.
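For illustration, a minimal sketch of the pattern that answer is asking about (the query, table and DataSource here are placeholders, not code from the question); with Java 7+ a try-with-resources block closes the Connection, PreparedStatement and ResultSet even when an exception is thrown, which is equivalent to closing them in a finally block:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    public class ImageDao {

        private final DataSource dataSource;

        public ImageDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        // Connection, statement and result set are closed in all cases,
        // so no JDBC resources leak even if the query throws.
        public byte[] loadImage(long id) throws SQLException {
            String sql = "SELECT data FROM images WHERE id = ?";  // placeholder query
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setLong(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getBytes("data") : null;
                }
            }
        }
    }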
I have used JProfiler and YourKit, but I was not satisfied with their output for actual performance tuning and memory usage; we have now switched to JavaMelody. It helps with performance optimization not only in development but also in production. JavaMelody is very easy to integrate and configure, and in production you can enable or disable it just by updating web.xml.
I have a Grails application that is deployed on a Tomcat 6 server. The application runs fine for a while (a day or two), but slowly eats up more and more memory over time until it grinds to a halt and eventually exceeds the maximum heap. Once I restart the container, everything is fine. I have been verifying this with the Grails JavaMelody plugin as well as the Application Info plugin, but I need help in determining what I should be looking for.
It sounds like an application leak, but to my knowledge there is no access to any unmanaged resources. Also, the Hibernate cache seems to be in check. It looks like if I run the garbage collector I get a decent chunk of memory back, but I don't know how to do this sustainably.
So:
How can I use these (or other) monitoring tools to figure out where the problem is?
Is there any other advice that could help me?
Thanks so much.
EDIT
I am using Grails 1.3.7 and I am using the Quartz plugin.
You can use the VisualVM application in the Oracle JDK to attach to the Tomcat instance while it is running (if you are already using an Oracle JVM) and inspect what goes on. The memory profiler can tell you quite a bit and point you in the right direction. You are most likely looking either for objects that grow or for types of objects that get allocated more and more.
If you need more than the free VisualVM application can tell you, a commercial profiler may be useful.
Depending on your usage of Quartz, it may be directly related to a known memory leak in the Quartz plugin involving persistence and thread-locals. You may want to double check and see if this applies to your situation.
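For context, a purely illustrative sketch (not Quartz's or the plugin's actual code) of why thread-locals combined with a thread pool can leak: pool threads are long-lived, so any value set on a ThreadLocal and never removed stays reachable for the lifetime of the worker thread.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class ThreadLocalLeakDemo {

        // Each pooled worker thread keeps its own copy of this buffer alive
        // for as long as the thread lives, unless remove() is called.
        private static final ThreadLocal<byte[]> SCRATCH = new ThreadLocal<>();

        public static void main(String[] args) {
            ExecutorService pool = Executors.newFixedThreadPool(4);
            for (int i = 0; i < 10_000; i++) {
                pool.submit(() -> {
                    SCRATCH.set(new byte[1_000_000]);  // ~1 MB parked on the worker thread
                    // ... job logic ...
                    // Without SCRATCH.remove() here, the last buffer set on each
                    // worker thread stays reachable as long as the pool lives.
                });
            }
            pool.shutdown();
        }
    }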
We are running a Java EE web application in JBoss that is using PostgreSQL 8.0.9 as the database.
One page in the application runs a big and complicated query when it is loaded. We had a problem that manifested if a user requested this page and closed their browser window before the requested page was returned to the client. The problem was that the closing of the window would spawn a new PostgreSQL thread/process (viewable via top) and the new thread/process would take a long time to switch from SELECT to idle in the top output. If approximately 5 or more users did this (closed the browser window before the large complicated query page returned to the client) in a small window of time the spawned threads/processes were growing and not switching to idle (staying in SELECT) and consuming a lot of CPU, causing major performance problems. It is important to mention that if the users that closed the browser window logged out, the associated thread/process would switch to idle and the CPU use would decrease. It is also important to mention that if JBoss was restarted the applicable threads/processes would switch to idle (as all the users would be logged out by the restart).
The problem of the hanging threads/processes seems to have been resolved by a database backup and RESTORE. Now the new threads/processes that are spawned are switched from SELECT to idle in a generally short period of time and the CPU is not burdened by them as much. Also, performance on large complicated queries in general seems to have improved significantly since the RESTORE.
We run VACUUM every 24 hours on the database. We do not run REINDEX on the database because of data corruption risks. We do tend to have rather high await numbers on iostat outputs, especially in the performance problem cases described above.
What happens to a database when it is dumped and restored (e.g. REINDEX, etc.)? Which of these effects seems to be the key to our solution?
Is there a setting that manages the number of threads/processes that are spawned when browser windows are closed before a page with a large complicated query is returned to the client? Is there a setting to manage the transition of such threads/processes from SELECT to idle? Is there a way to manage either of these at the application level?
Version 8.0 is already EOL and version 8.0.9 hasn't been patched in a long time either: 8.0.26 was the last release. You are missing many patches and should at least update to the latest 8.0 version, but also start a migration to a version that is still supported. Since versions 8.2 and 8.3, performance has become much better.
Question: why do you think REINDEX corrupts your data? Data corruption would make that statement pretty useless... REINDEX is not something you would do every day, but sometimes you need it.