Java memory leak while using Axis2 and WAS 7

I have a standalone application running in IBM WebSphere 7.0.0.19. It runs on Java 6 and we pack an Axis2 JAR in our EAR. We use 'parent last' class loading and have disabled the Axis service that ships with WAS 7 by default.
Recently, after 6+ weeks of continuous operation, the application hit an OOM. The perplexing point is that the application is deployed separately on 2 different machines, but only one machine went down. The second machine is still up.
We checked OS, server configuration like classloader policy using WAS console and they are similar in both machines.
When the application crashed, it generated a .phd file, which we analysed using the Eclipse Memory Analyzer Tool (MAT). The analysis is shown in the screenshot.
If I'm correct, the bootstrap class loader is repeatedly loading and holding on to references to AxisConfiguration, so GC is unable to collect them when it runs. But if that were the case, both servers should have gone down, and only one server experienced an OOM. The memory allocated to the JVM is the same on both machines.
We are not sure whether the issue is with WAS 7 or with axis2-kernel-1.4.1.jar or with something else.
http://www.slideshare.net/leefs/axis2-client-memory-leak
https://issues.apache.org/jira/browse/AXIS2-3870
http://java.dzone.com/articles/12-year-old-bug-jdk-still-out
(Links may not refer to the current issue. But they are just pointers)
Has anyone experienced something similar?

We saw memory growth and sockets left open on WebSphere 6.1 with Axis 2 1.4 in the past. It's been a long time, but my notes suggest it might be worth considering an upgrade to at least Axis 2 1.5.1 to fix this bug with the open sockets and also to ensure you're not creating new objects repeatedly where a singleton exists (e.g. the Service object).
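The advice about reusing one instance instead of creating a new object per call can be sketched roughly as follows. This is only an illustration: `ExpensiveClient` is a hypothetical stand-in, not an Axis2 class, and whether the real Axis2 object in question can safely be shared between threads is an assumption you would need to verify for your version.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a costly client-side object; NOT a real Axis2 class. */
final class ExpensiveClient {
    /** Counts constructions so repeated creation is easy to spot. */
    static final AtomicInteger INSTANCES = new AtomicInteger();

    ExpensiveClient() {
        INSTANCES.incrementAndGet();
    }

    String call(String payload) {
        return "echo:" + payload;
    }
}

/** Hands out one shared ExpensiveClient for the whole application. */
final class SharedClient {
    private SharedClient() {
    }

    // Initialization-on-demand holder: the JVM initializes Holder lazily,
    // exactly once, and in a thread-safe way on the first call to get().
    private static final class Holder {
        static final ExpensiveClient INSTANCE = new ExpensiveClient();
    }

    static ExpensiveClient get() {
        return Holder.INSTANCE;
    }
}
```

Callers would use `SharedClient.get().call(...)` everywhere instead of `new ExpensiveClient()`, so the construction count stays at one no matter how many requests are served.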

Related

Increasing class loaded count in Spring Boot application deployed as WAR on Tomcat

We have a Spring Boot application running on Tomcat; it is a RESTful web service. The same WAR file is deployed on 3 Tomcat instances in our test environment as well as in the Production environment. While running a performance test we noticed a peculiar problem with some servers: they stop responding after processing about 2500 requests. The issue happens on 2 out of 3 Production servers and on 1 out of 3 Test servers.
On the servers that have the issue, our JVM monitoring shows the loaded-class count increasing whenever we run the performance test. The count goes from 20k to around 2 million. When it gets close to 2 million, the monitoring also shows GC taking too long, more than 40 seconds. Once it reaches that point, the application stops responding and throws an OutOfMemoryError: "Compressed class space". If we continue sending requests, the application logs show that the service still receives them but stops processing midway.
On the other servers, without the issue, the loaded-class count stays at a constant 20k. GC is normal too, taking less than 1 second.
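The loaded-class count described above is also available from inside the JVM through the standard `java.lang.management` API, which can help when external monitoring is not set up on every server. A minimal sketch (the class name `ClassCountProbe` is made up for illustration):

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

/** Reads the JVM's class loading counters via JMX. */
final class ClassCountProbe {
    private ClassCountProbe() {
    }

    /** Returns a one-line snapshot that can be written to the application log. */
    static String snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return String.format("loaded=%d totalLoaded=%d unloaded=%d",
                bean.getLoadedClassCount(),       // classes currently loaded
                bean.getTotalLoadedClassCount(),  // loaded since JVM start
                bean.getUnloadedClassCount());    // unloaded since JVM start
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Logging `ClassCountProbe.snapshot()` every few hundred requests should make it visible whether the loaded count climbs with traffic while the unloaded count stays flat.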
Other tests and behaviors we have noticed:
The issue happens on local Tomcat instances installed on Windows PC. The servers are on Linux. The issue happens on both OpenJDK and Oracle JDK 1.8.
We verified the Tomcat instances are equal to each other - we even cloned from the working servers and put them on the bad servers.
Tested with different GC policies (PS, CMS and G1); the issue happens with all three.
Tested by running the application as a standalone Spring Boot JAR and the issue goes away. The class count stays constant and GC behaves normally.
The application is currently using JAXB libraries to perform XML marshalling/unmarshalling and we found places in the code where we can optimize the code. Refactoring the code and moving to Jackson library is another option.
My questions are:
What would be causing the difference between multiple servers when we are deploying the same WAR file?
What would be causing the difference between the application running as WAR deployed on Tomcat versus running as standalone Spring boot application?
If we take a heap dump of the JVM or do a profiling, what are the things to look out for?
So it turns out this was due to jaxb 2.1 jar in our classpath. Thanks to Mark for pointing out the known bug with jaxb.
Our application did not explicitly declare jaxb-impl as a dependency, so it was hard to see at first. Looking at the Maven dependency tree, we found that two different versions were being pulled in by other projects and libraries: our application had jaxb-impl versions 2.1 and 2.2.6 on the classpath. We added an exclusion for the 2.1 version in our application's pom.xml and that fixed the issue.
My guess is that different servers were loading different versions upon the application startup. That could be why some servers were working fine and others that loaded the 2.1 version had issues. Similarly with running as a standalone Spring boot app, it might have loaded the 2.1 version.
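One way to test that guess on each server is to ask the running JVM which JAR a given class was actually loaded from. A minimal sketch using only the standard library (the helper name `ClassOrigin` is made up, and the class name in the usage note is an assumption about which JAXB implementation class is on your classpath):

```java
import java.security.CodeSource;

/** Reports where a class was loaded from, to tell duplicate JARs apart. */
final class ClassOrigin {
    private ClassOrigin() {
    }

    /** Returns the code source location of a class, or a marker for JDK classes. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes loaded by the bootstrap loader have no code source.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Looks the class up by name through the thread context class loader. */
    static String locationOf(String className) {
        try {
            ClassLoader loader = Thread.currentThread().getContextClassLoader();
            // initialize=false: we only want to locate the class, not run its static init.
            return locationOf(Class.forName(className, false, loader));
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }
}
```

Called from inside the web application, for example as `ClassOrigin.locationOf("com.sun.xml.bind.v2.ContextFactory")`, this should print the path of the jaxb-impl JAR that won on that particular server, which can then be compared across the good and bad machines.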

Jhipster app memory consumption on Amazon ec2

My application is just a bigger version of the default Jhipster app. I don't even have a cache.
I deployed it successfully on an Amazon free tier t1.micro instance.
I experienced some random 503 errors. I checked the health of the instance and it sometimes said "no data sent" some other times "93% of memory is in use". Now it's down (red).
I cloned the environment, then terminated the original one. I get those various errors.
I deployed the war with Dev spring profile but I believe it's not what is causing this much horror.
Do I need to configure the java memory usage ? Why could the app be this memory hungry?
I posted the question on StackOverflow because I care more about performance tuning of the deployed Jhipster war, but if you think it's more of a problem with Amazon, please let me know why.
Thanks
Deploy the application on an instance with much more memory, e.g. a t2.large (8 GB)
The size of an existing instance can be changed from the console: "stop" the instance, open "instance settings", change the "instance type" and start it again
Ensure that your application has a method for attaching jconsole to it available (apparently the development version does, with jmx). See http://docs.oracle.com/javase/8/docs/technotes/guides/management/jconsole.html for more information on jconsole
Run the application and monitor the nice graphs in jconsole
See what the peak is over a few days of normal use. Also log on to the server with ssh and use free -m to see the system memory use ( see http://www.linuxatemyram.com/ for a guide to interpreting the data )
Once you know the actual amount of RAM it uses choose an appropriate instance size, see http://www.ec2instances.info/
You might need to adjust the -Xmx setting. I don't know the specifics with jhipster, but this is a common requirement for Java applications
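To confirm which heap limit the JVM actually ended up with after changing -Xmx, the application can report it at startup. A minimal sketch using `java.lang.Runtime` (the class name `HeapReport` is made up for illustration):

```java
/** Prints the heap limit the running JVM was started with and current usage. */
final class HeapReport {
    private static final long MIB = 1024L * 1024L;

    private HeapReport() {
    }

    /** Upper bound the heap may grow to; this is what -Xmx controls. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently in use: committed heap minus the free part of it. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap  = " + maxHeapMiB() + " MiB");
        System.out.println("used heap = " + usedHeapMiB() + " MiB");
    }
}
```

Note that this covers the Java heap only. The total footprint seen by `free -m` also includes thread stacks, class metadata and native buffers, so the instance needs noticeably more RAM than the -Xmx value.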

Permgen out of memory

Running Tomcat for an Enterprise level App. Been getting "Permgen out of memory" messages.
I am running this on:
Windows 2008 R2 server,
Java 1.6_43,
Running Tomcat as a service.
No multiple deployments. Service is started, and App runs. Eventually I get Permgen errors.
I can delay the errors by increasing the perm size, however I'd like to actually fix the problem. The vendor is disowning the issue. I don't know if it is a memory leak, as the vendor simply says "runs fine with Jrockit". Of course, that would have been nice to have in the documentation, like 3 months ago. Plus, some posts suggest that Jrockit just expands permspace to fit, up to 4 GB if you have the memory (not sure that is accurate...).
Anyway, I see some posts for a potential fix in Java 1.5 with the options
"-XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled"
However, these seem to have been deprecated in Java 1.6, and now the only GC that seems to be available is "-XX:+UseG1GC".
The best link I could find, anywhere, is:
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html#G1Options
Does anyone know if the new G1 garbage collector includes the permspace? Or am I missing an option or 2 in the new Java 6 GC settings that maybe I am not understanding?
Any help appreciated!
I wouldn't just increase the permgen space, as this error is usually a sign of something wrong in the software or setup. Is there a specific webapp that causes this? Without more info, I can only give basic advice.
1) Use the memory leak detector (Tomcat 6+) called Find Leaks
2) Turn off auto-deployment
3) Move JDBC drivers and logging software to the java classpath instead of tomcat per this blog entry
In earlier versions of Sun Java 1.6, the CMSPermGenSweepingEnabled option is functional only if UseConcMarkSweepGC is also set. See these answers:
CMSPermGenSweepingEnabled vs CMSClassUnloadingEnabled
What does JVM flag CMSClassUnloadingEnabled actually do?
I don't know if it's functional in later versions of 1.6 though.
A common cause for these errors/bugs in the past was dynamic class generation, particularly for libraries and frameworks that created dynamic proxies or used aspects. Subtle misuse of Spring and Hibernate (or more specifically cglib and/or aspectj) were common culprits. The underlying issue was that new dynamic classes were getting created on every request, eventually exhausting permgen space. The CMSPermGenSweepingEnabled option was a common workaround/fix. Recent versions of those frameworks no longer have the problem.
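To see whether the class metadata area really keeps growing under load, the JVM's memory pools can be listed through the standard `java.lang.management` API. This is a rough sketch (the class name `PoolReport` is made up); pool names differ by JVM version and collector, so the permanent generation may show up under a name such as "Perm Gen" on Java 6 and is replaced by "Metaspace" from Java 8 on.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Lists every JVM memory pool with its current and maximum size. */
final class PoolReport {
    private PoolReport() {
    }

    static List<String> lines() {
        List<String> out = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when no maximum is defined for the pool.
            long maxKiB = usage.getMax() < 0 ? -1 : usage.getMax() / 1024;
            out.add(String.format("%s (%s): used=%d KiB, max=%d KiB",
                    pool.getName(), pool.getType(), usage.getUsed() / 1024, maxKiB));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : lines()) {
            System.out.println(line);
        }
    }
}
```

If the non-heap pool that holds class metadata grows with every request and never shrinks, that points toward per-request class generation rather than a limit that is simply too small.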

NetBeans/Glassfish and PermGen space bug when redeploying (yes, STILL happening)

I know this has probably been asked many times before, but I still haven't seen an actual fix for it.
My day-to-day development environment is as follows:
1. NetBeans (latest), 2. Glassfish (latest as bundled with NB), 3. JPA, JSF, JAXB, Jersey for JAX-RS
I have about 600 classes in my project, spread across two EJB projects and one WAR project, all inside an EAR.
I am on latest JDK 7 (on OS X) and I am on an hourly basis getting the infamous "PermGen space" bug. Let's say if I am doing 3 incremental re-deploys a minute, I can only work for a short while before either:
Glassfish runs out of PermGen space, so I just have to kill the process.
Deployment becomes extremely slow, due to me having increased max permgen space (as one is advised to do in dozens of answers on S.O.)
Often the only solution is to kill glassfish every 30 minutes or so. It's definitely due to a bug somewhere that simply loads new classes for every new incremental re-deploy instead of getting rid of the old ones. I thought this was supposed to be fixed in JDK 7?
This has been a long-standing bug in this kind of development environment, and I am rather shocked that it's still going on after my 5+ years of Java development. It's just so frustrating and incredibly unproductive.
(Just before anyone suggests increasing permgen space, believe me I've tried that, and the only thing it "solves" is to prolong the inevitable. I've seen redeployments take up to 400 seconds at its worst. Redeployment is supposed to take 5-6 seconds for a project this size, no more.)
EDIT: I ran jmap and jhat on the Glassfish process after the following steps:
Start glassfish
Deploy my EAR
Undeploy my EAR
Then did a heap dump with jmap
It turns out that all my classes (which should have been unloaded) are still loaded! Hopefully this is useful information to someone reading this...
Surely that is a bug, and I don't think there is an easy solution for it. (If there were, you would probably have it already.)
What you can try: use a hot code replacement tool, for example JRebel. This way you don't have to deploy all the time; instead the tool watches for changes to the .class files (and even other web resources, if you configure it so) and replaces the class definition within the running JVM. Sounds cool, right?
It works as a Java agent, it starts when your JVM starts.
There are 3 drawbacks to this solution: deployment is a bit slower, it's harder to debug, and it's proprietary software (but does not cost much)
When developing with Netbeans + Glassfish and using "Deploy on Save" we've found that libraries packaged within an application are not unloaded when the project is re-deployed; this causes GF to slow down and quickly run out of memory.
Try de-selecting "Package" for all compile-time libraries and place those not already in the Glassfish classpath in the domainX/lib directory.
Not sure but this may be related to GLASSFISH-17449 or GLASSFISH-16283.
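The observation that classes stay loaded after undeploy usually means something still holds a strong reference to the application's class loader. Outside the server, the same check can be sketched with a weak reference. This is an illustration only: `LoaderLeakCheck` is a made-up helper, and `System.gc()` is merely a hint, so a "still reachable" result is a strong suggestion of a leak rather than proof.

```java
import java.lang.ref.WeakReference;

/** Best-effort check for whether a class loader can be garbage-collected. */
final class LoaderLeakCheck {
    private LoaderLeakCheck() {
    }

    /** Tracks a loader without keeping it alive. */
    static WeakReference<ClassLoader> track(ClassLoader loader) {
        return new WeakReference<>(loader);
    }

    /**
     * Requests GC a few times and reports whether the loader is still
     * reachable. A loader that something references strongly (a thread,
     * a static field, a driver registry) will never be cleared.
     */
    static boolean stillReachable(WeakReference<ClassLoader> ref) {
        for (int i = 0; i < 5 && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
        }
        return ref.get() != null;
    }
}
```

In a real investigation the heap dump already taken with jmap is the better tool: opening it in an analyzer and following the path from the leftover class loader to a GC root shows what is holding it.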

Running webapps in separate processes

I'd like to run a web container where each webapp runs in its own process (JVM). Incoming requests get forwarded by a proxy webapp running on port 80 to individual webapps, each (webapp) running on its own port in its own JVM.
This will solve three problems:
Webapps using JNI (where the JNI code changes between restarts) cannot be restarted. There is no way to guarantee that the old webapp has been garbage-collected before loading the new webapp, so when the code invokes System.loadLibrary() the JVM throws: java.lang.UnsatisfiedLinkError: Native Library x already loaded in another classloader.
Libraries leak memory every time a webapp is reloaded, eventually forcing a full server restart. Tomcat has made headway in addressing this problem but it will never be completely fixed.
Faster restarts. The mechanism I'm proposing would allow near-instant webapp restarts. We no longer have to wait for the old webapp to finish unloading, which is the slowest part.
I've posted a RFE here and here. I'd like to know what you think.
Does any existing web container do this today?
I'm closing this question because I seem to have run into a dead end: http://tomcat.10.n6.nabble.com/One-process-per-webapp-td2084881.html
As a workaround, I'm manually launching a separate Jetty instance per webapp.
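The manual workaround of one container process per webapp could be scripted from a small supervisor using `ProcessBuilder`. A minimal sketch, with loud assumptions: the runner JAR path and the `--port=` argument are hypothetical placeholders for whatever your embedded container actually expects, and `WebappLauncher` is a made-up name.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Starts each webapp in its own child JVM so it can be restarted independently. */
final class WebappLauncher {
    private WebappLauncher() {
    }

    /** Builds the command line for one child JVM. */
    static List<String> command(String runnerJar, int port, int maxHeapMb) {
        // Reuse the java binary of the JVM that runs the supervisor.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeapMb + "m"); // separate heap limit per webapp
        cmd.add("-jar");
        cmd.add(runnerJar);                // hypothetical container runner JAR
        cmd.add("--port=" + port);         // hypothetical argument name
        return cmd;
    }

    /** Launches the child and forwards its output to the supervisor's console. */
    static Process launch(String runnerJar, int port, int maxHeapMb) throws IOException {
        return new ProcessBuilder(command(runnerJar, port, maxHeapMb))
                .inheritIO()
                .start();
    }
}
```

Restarting a webapp then amounts to calling `destroy()` on its `Process` and launching a fresh one, which sidesteps both the native library and the class loader leak problems because the old JVM is gone entirely. A reverse proxy on port 80 would still be needed to route requests to the per-webapp ports.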
Can't you just deploy one app per container and then use DNS entries and reverse proxies to do the exact same thing? I believe Weblogic has something like this in the form of managed domains.
No, AFAIK, none of them do, probably because Java web containers emphasize following the servlet API, which spins off a thread per HTTP request. What you want would be a fork at the JVM level, and that simply isn't a standard Java idiom.
If I understand correctly, you are asking for the standard features of enterprise-quality servers such as IBM's WebSphere Network Deployment (disclaimer: I work for IBM), where you can distribute applications across many JVMs, and those JVMs can in fact be distributed across many physical machines.
I'm not sure that your fundamental premise is correct though. It's not necessary to restart a whole JVM in order to deploy a new version of an application. Many app servers will use a class-loader strategy that allows them to discard a version of an app and load a new one.
