ThreadLocal memory leak in Glassfish - java

Will a ThreadLocal cause a memory leak in the Glassfish server the way it does in Tomcat? Why?
http://wiki.apache.org/tomcat/MemoryLeakProtection

Yes, it will leak and Glassfish won't even warn you according to this relatively recent Glassfish JIRA issue:
http://java.net/jira/browse/GLASSFISH-14128
It should be said, however, that ThreadLocal leaks are not a 'bug' in app/web servers per se, but a problem with the code of components running in those containers (whether these components are servlets, session beans or whatever).
In general, app servers and web containers try to shield the developer from writing a lot of maintenance code so that he can focus on business logic. The developer still needs some understanding of how the application server works (thread pools, classloaders, the deploy/undeploy mechanism, ...) so that things like ThreadLocal usage are done properly or avoided. That is not always easy and can be very tricky. I remember reading about a memory leak issue (in Glassfish, I think) related to the use of custom log levels.
Apache Tomcat has a helper mechanism that warns the user about, or deals with, some commonly occurring memory leaks in user code. But even the link provided in the question says that not all possible ThreadLocal memory leaks are handled automatically by this mechanism.
Glassfish does not seem to have this added functionality yet.
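To make the point about component code concrete, here is a minimal sketch of the usual defensive pattern: set the ThreadLocal at the start of the request and clear it in a finally block before the pooled thread goes back to the container. The class and the CURRENT_USER field are hypothetical names used only for illustration, and a plain ExecutorService stands in for the server's thread pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalCleanupDemo {
    // Hypothetical per-request context, for illustration only.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Handles one request and always clears the ThreadLocal before returning. */
    static String handleRequest(String user) {
        CURRENT_USER.set(user);
        try {
            return "hello " + CURRENT_USER.get();
        } finally {
            // Without this, the value stays attached to the pooled thread
            // and keeps its class (and class loader) reachable.
            CURRENT_USER.remove();
        }
    }

    /** Returns whatever is still stored for the calling thread (null when clean). */
    static String leftover() {
        return CURRENT_USER.get();
    }

    public static void main(String[] args) throws Exception {
        // A single-thread pool, so both tasks run on the same worker thread.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> request = () -> handleRequest("alice");
            Callable<String> check = ThreadLocalCleanupDemo::leftover;
            System.out.println(pool.submit(request).get());
            System.out.println(pool.submit(check).get());
        } finally {
            pool.shutdown();
        }
    }
}
```

If the remove() call is dropped, the second task would still see the first request's value, which is exactly the state that survives an undeploy on a pooled thread.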

This problem causes all sorts of issues. I posted about it a while ago in "I need help finding my memory leak using MAT".
We're manually freeing the objects ourselves. I think I saw in the GF bug lists that this had been fixed in the 3.1x release.

Related

If a Java app is running as a service how can the Spring context be closed?

I wrote a Java application meant to run as a service, which uses Spring for DI.
Since this app runs "forever", it will never reach the point where it closes the Spring context.
Can this cause issues in the long term? In such cases, is it better to create objects the standard way rather than injecting them with Spring?
I can also see the heap usage slowly increasing; I am not sure whether this could be the cause.
Thank you.
It shouldn't be a problem if you don't have memory leaks in your application.
If you see your heap increasing continuously, the cause may well have nothing to do with Spring.
If you have no idea what's causing the memory consumption, I'd suggest taking a look at some JVM tools; JVisualVM, for example, is one option which is available in the JDK. With it you can easily check what your threads are doing and which objects in memory are growing.

How to isolate user sessions in a Java EE?

We are considering development of a mission critical application in Java EE, and one thing that really impressed me is the lack of session isolation in the platform. Let me explain the scenario.
We have a native Windows application (a complete ERP solution) that receives about 2k LoC and 50 bug-fixes per month from sparse contributors. It also supports scripting, so the customer can add their own logic and we have no clue about what such logic does. Instead of using a thread pool, each server node has a broker and a process pool. The broker receives a client request, enqueues it until a pooled instance is free, sends the request to that instance, delivers the response to the client, and releases the instance back to the process pool.
This architecture is robust because with so many sparse contributions and custom scripting, it's not uncommon for a deployed version to have some serious bug such as an infinite loop, a long-waiting pessimistic lock, a memory corruption or memory leakage. We implemented a memory limit, a timeout for requests, and a simple watchdog. Whenever some process fails to answer correctly and on time, the broker simply kills it, so the watchdog detects and starts another instance. If a process crashes before it started to answer a request, the broker sends the same request to another pooled instance, and the user doesn't know about any failure on the server side (except in admin logs). This is nice because some instances are slowly trashed by bogus code as they work on requests. Because most session data is held at the client or (in rare cases) at a shared storage, it seems to work perfectly.
Now considering a move to Java EE, I couldn't find anything similar in the spec or in popular application servers such as Glassfish and JBoss. Yes, I know that most cluster implementations do transparent fail-over with session replication, but we have small companies that use our system on a simple 2-node cluster (and we also have adventurers that use the system on a 1-node server). With a thread pool, I understand that a buggy thread can bring an entire node down, because the server cannot detect and safely kill it. Bringing an entire node down is much worse than killing a single process - we have deployments where each node has about 100 pooled process instances.
I know that IBM and SAP are aware of this problem, based on http://www.trl.ibm.com/people/kawatiya/pub/Kawachiya07vee.pdf and http://java.sys-con.com/node/47362, respectively. But based on recent JSRs, forums and open-source tools, there isn't much activity in the community.
Now come the questions!
If you have a similar scenario and use Java EE, how did you solve it?
Do you know about an upcoming open-source product or change in the Java EE spec that can address this issue?
Does .NET have the same problem? Can you explain or cite references?
Do you know about some modern and open platform that can address this issue and is worth the task of doing ERP business logic?
Please, I have to ask you not to suggest more testing or any kind of QA investment, because we cannot force our customers to do this on their own scripts. We also have cases where urgent bug-fixes must bypass QA, and while we can make the customer accept this, we cannot make him accept that a buggy piece of software can affect a range of unrelated features. This issue is about robust architectures, not development process.
Thanks for your attention!
What you have stumbled upon is a fundamental issue regarding the use of Java and "hostile" applications.
It's a fundamental issue not just at the Java EE level, but at the core JVM level. The typical JVMs available have all sorts of issues with loading "unsafe code". From memory leaks, class loader leaks, resource exhaustion, and unclean thread kills, the typical JVM is simply not robust enough to handle badly behaving code well in a shared environment.
A simple example is memory exhaustion of the Java heap. As a basic rule, NOBODY (and by nobody, I specifically mean the core Java library and just about every other 3rd party library out there) catches OutOfMemoryError. There are the rare few who do, but even they can do little about it. Typical code handles the exceptions it "expects" to handle, but lets the others fall through. Unchecked throwables (OutOfMemoryError is one; strictly speaking it is an Error rather than a RuntimeException) will happily bubble up through the call stack all the way to the top, leaving behind a wreckage of unchecked critical path code and all sorts of things in an unknown state.
Things such as constructors or static initializers which "can't fail" leave behind uninitialized class members which are "never null". These damaged classes simply don't know they're damaged. Nobody knows they're damaged, and there's no way to clean them up. A heap that hits OOM is an unsafe image and pretty much needs to be restarted (unless, of course, you wrote or audited ALL of the code yourself, which, naturally, you won't -- who would?).
Now, there may well be vendor specific JVMs which are better behaved and give you better control. The ones based on the Sun/Oracle JVM (i.e. most of them) do not.
So, it's not necessarily a Java EE issue, it's a JVM issue.
Hosting hostile code in the JVM is a bad idea. The only way it's practical is if you host a scripting language, and that scripting language implements some kind of resource control. That could be done, and you can tweak the existing ones as a start (JavaScript, Groovy, Jython, JRuby). The fact that these languages give users direct access to Java libraries makes them potentially dangerous, so you may have to restrict that as well to only those aspects wrapped by script handlers. At this point, though, the "why use Java at all" question floats up.
You'll note Google App Engine does none of these. It spools up a separate JVM for each application that's being run, but even then it greatly restricts what can be done within those JVMs, notably through the existing Java security model. The distinction here is that these instances tend to be "long lived" so as not to endure the processing costs of startup and shutdown. I should say, they SHOULD be long lived, and those that are not do incur those costs.
You can make several instances of the JVM yourself, give them a bit of infrastructure to handle requests for logic, give them custom class loader logic to try and protect from class loader leaks, and minimally let you kill the instances off (they're simply a process) if you want. That can work, and probably work "ok" depending on the granularity of the calls, and the "start up" time for your logic. The start up time will minimally be the loading of the classes for the logic from run to run, that alone may make this a bad idea. And it certainly WON'T be "Java EE". Java EE is not set up to do this kind of thing. But you're not clear what Java EE features you're looking at either.
Effectively, this is what Apache and "mod_php" do. Several instances, as processes, individually handle requests, with badly behaving ones being killed off as necessary. This is why PHP is common in the shared hosting business. In this structure, it's basically "safe".
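As a rough illustration of the "instances are simply processes" idea, the sketch below starts a worker as a child process and kills it if it does not finish within a deadline. It is only the watchdog part, not a broker or a pool; the class name, the method name and the use of `java -version` as a stand-in worker are assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class WorkerProcessWatchdog {

    /**
     * Starts the command as a child process and waits up to timeoutMillis.
     * Returns true if it exited in time, false if it had to be killed.
     */
    static boolean runWithTimeout(List<String> command, long timeoutMillis) {
        Process worker;
        try {
            // inheritIO() keeps the child from blocking on a full output pipe.
            worker = new ProcessBuilder(command).inheritIO().start();
        } catch (IOException e) {
            throw new UncheckedIOException("could not start worker", e);
        }
        try {
            if (worker.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            // The OS reclaims the whole process, whatever state its heap is in.
            worker.destroyForcibly();
            worker.waitFor();
            return false;
        } catch (InterruptedException e) {
            worker.destroyForcibly();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in worker: the current JVM's launcher printing its version.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        boolean finished = runWithTimeout(Arrays.asList(javaBin, "-version"), 30_000);
        System.out.println("worker finished in time: " + finished);
    }
}
```

The trade-off mentioned above still applies: every fresh worker JVM pays class-loading and start-up cost, so this only makes sense when workers are reused across many requests.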
I believe your scenario is highly atypical, so it is improbable that there is a ready-made framework/platform addressing this need. Java EE more or less assumes that the request processing code is written by the same team as the rest of the app, so it need not be isolated, watched and reset that often, and bug fixes would be handled the same way in all parts of the system. This assumption greatly simplifies development, deployment, testing etc. for most projects by not forcing them to pay for something they don't need. And yes, it isn't suitable for everyone. If you want something fundamentally different, you probably need to implement a fair amount of failover logic yourself. Java EE does provide the fundamental building blocks for this though.
I believe (although have no concrete experience to prove it) that .NET or other platforms are basically built on similar assumptions.
We had a similar - though not so severe - port of a really enormous Perl site to Java. On receiving an HTTP request we instantiate a class and call its processRequest method, surrounded by try-catch and time measurement. Adding a timer and a thread would suffice to be able to kill the thread. This probably is sufficient in real life.
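A minimal sketch of the "try-catch plus timer" approach, using an ExecutorService and Future.get with a timeout rather than a hand-rolled timer thread. The class and method names are hypothetical. Note the limitation: Future.cancel(true) only interrupts the worker thread, so code that ignores interruption (a tight infinite loop, for instance) keeps running; Java has no safe way to forcibly stop a thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedRequestRunner {
    private final ExecutorService workers = Executors.newCachedThreadPool();

    /** Runs the handler; returns its result, or the fallback if it fails or is too slow. */
    String process(Callable<String> handler, long timeoutMillis, String fallback) {
        Future<String> pending = workers.submit(handler);
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Requests interruption of the worker; this is cooperative, not a hard kill.
            pending.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            // The handler threw; a real server would log e.getCause() here.
            return fallback;
        } catch (InterruptedException e) {
            pending.cancel(true);
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    /** Interrupts any handlers that are still running and stops the pool. */
    void shutdown() {
        workers.shutdownNow();
    }

    public static void main(String[] args) {
        TimedRequestRunner runner = new TimedRequestRunner();
        System.out.println(runner.process(() -> "done", 1_000, "timed out"));
        System.out.println(runner.process(() -> {
            Thread.sleep(5_000); // simulates a slow handler that does respond to interrupts
            return "late";
        }, 50, "timed out"));
        runner.shutdown();
    }
}
```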
A Java EE server like Glassfish is an OSGi container, so you might have more means of isolation there.
You could also run an array of (web or local) applications to which you dispatch your requests via a central web application. Those applications are then isolated.
Even more isolated are serialized sessions and operating system processes starting a new JVM.

ClassLoader Leak - Are they worth solving?

ClassLoader leaks usually result in java.lang.OutOfMemoryError: PermGen. When working with application servers you may see this as a result of many redeploys of a common application. The explanation and possible resolutions to this problem can be found at these two links (among others):
http://blogs.oracle.com/fkieviet/entry/classloader_leaks_the_dreaded_java
http://dev.eclipse.org/blogs/memoryanalyzer/2008/05/17/the-unknown-generation-perm/
Now for the most part they are easy to get around. Simply increase the -XX:MaxPermSize and when the inevitable happens, restart the JVM completely. The problem with trying to solve this is that in large applications many classes can cause the classloader to leak and thus the classes to stay within the permgen.
Two questions arise from this:
Is it reasonable to say that an issue like this is better to just increase the max perm size and restart where necessary or should finding a resolution be a higher priority?
Are there easier ways to resolve a classloader leak?
It really depends on the application, or rather, the deployment process being used. Many applications are only ever redeployed during development: new releases happen once every few months, and the application server is restarted for other reasons far more often than the app is deployed. In those circumstances, chasing Classloader leaks is a waste of time.
Of course, if you plan on implementing a continuous deployment process, especially in a high-availability environment, then Classloader leaks are something you really need to tackle. But there are a lot of other things you need to do better than most projects before that becomes an issue.
@biziclop is right. You need to be pragmatic about this.
If the problem is only in test servers, you can probably dismiss this as not worth the effort to solve.
If the problem is in production servers then you need a solution or a workaround. The solution is hard work, but the workarounds may be less work:
Workaround #1 - don't do hot deploys to production servers; only do full redeployments and restarts.
Workaround #2 - periodically do a full restart of the production servers to avoid running out of permgen space. Combine this with increasing the permgen space.
In a well resourced / well run environment you should be doing all of your testing on separate servers. If the downtime of a full deployment is a concern, you should be minimizing redeployment disruptions using server replication and progressive redeployment. Hot deployments to production should be unnecessary.
If you are in the position where you have no test environment and are doing frequent hot deploys to a production machine to minimize downtime, you are skating on thin ice. The chances are that you will eventually make a mistake that results in damage which takes a long time to recover from ...
Those are among the worst leaks... but any leak is evil. So I, personally, resolve them. Profiling helps as well.
There are no easy ways per se but:
Threads go into ThreadGroups, plus a starter thread for each module to ensure that every new Thread() ends up in that group.
Special care of Thread.inheritedAccessControlContext (which holds a reference to the classloader).
Use WeakReferences when you need to keep classes; actually, use WeakReferences for listeners too, so no one can skip the de-registration (and use only anonymous classes). Having a framework for WeakListeners does help.
Extra care for DB drivers and java.security.Provider.
A few more tricks (incl. dynamic enhancement of class files, but that's usually overkill).
bottom line:
leaks are evil.
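The WeakReference-for-listeners idea from the list above could look roughly like the sketch below. It is a hypothetical helper, not the framework the answer refers to: the registry holds listeners only weakly, so a listener that is never de-registered cannot pin its class loader.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class WeakListenerRegistry<L> {
    private final List<WeakReference<L>> listeners = new ArrayList<>();

    /** Registers a listener without keeping it strongly reachable. */
    synchronized void add(L listener) {
        listeners.add(new WeakReference<>(listener));
    }

    /** Returns the listeners that are still alive and prunes the collected ones. */
    synchronized List<L> snapshot() {
        List<L> live = new ArrayList<>();
        for (Iterator<WeakReference<L>> it = listeners.iterator(); it.hasNext();) {
            L listener = it.next().get();
            if (listener == null) {
                it.remove(); // the owner is gone, so drop the stale entry
            } else {
                live.add(listener);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        WeakListenerRegistry<Runnable> registry = new WeakListenerRegistry<>();
        // The owner must keep a strong reference for as long as it wants callbacks.
        Runnable onEvent = () -> System.out.println("event received");
        registry.add(onEvent);
        for (Runnable listener : registry.snapshot()) {
            listener.run();
        }
    }
}
```

One consequence of this design: a listener passed in without any other strong reference to it may be collected at any time, so the owning component has to hold it in a field.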
Yes, there are easier - and more proper - ways to resolve the leaks. Add the ClassLoader Leak Prevention library to your project, and it should take care of the problem for you!
In case you want to track down the leaks yourself, this blog series will be of help.
I'd approach the problem pragmatically:
Is it causing problems in production environments?
Have you got enough time and resources to track it down?
If the answer to both these questions is yes, then by all means go for it. If it's one yes, one no, it's probably up to the management to decide, if both are nos, don't bother.

tomcat isolate webapps

Multiple webapps are running on the same Tomcat, using the same JVM. Sometimes one webapp that has a memory leak causes the entire JVM to crash and affects the other webapps. Any recommendation on how to isolate that without needing to use multiple JVMs and Tomcats?
Within the same JVM everything shares the same memory. There is no system to allocate separate pools or quotas.
If one of your applications behaves really badly in this regard, the only thing you can do is run it isolated in a separate JVM (separate Tomcat).
Are the applications running as separate processes? Or the same one?
First off you should look at profiling to find the memory leak https://stackoverflow.com/questions/1716597/java-memory-leak-detection-tools.
However, as a quick solution from inside the JVM, you could use Runtime.getRuntime().totalMemory() minus Runtime.getRuntime().freeMemory() to see how much heap is in use (totalMemory() alone only reports how much heap the JVM has currently reserved). If usage grows above a certain limit, and you know which app is causing the problem, you could restart that app.
You could also try calling System.gc(), but that is a terrible way to do it and really shouldn't be relied on, as the JVM is free to ignore it.
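A small sketch of the heap check described above. The class and method names are made up for the example; the Runtime calls are standard. Used heap is the reserved heap (totalMemory) minus the free part of it (freeMemory), and maxMemory is the ceiling the JVM will try to grow to. Keep in mind this measures the whole JVM, not a single webapp.

```java
final class HeapUsageProbe {

    /** Approximate number of heap bytes currently in use by the whole JVM. */
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** True when used heap exceeds the given fraction (0.0 to 1.0) of the maximum heap. */
    static boolean isAboveThreshold(double fraction) {
        return usedHeapBytes() > fraction * Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("used heap bytes: " + usedHeapBytes());
        if (isAboveThreshold(0.9)) {
            // Hypothetical reaction: alert an operator or restart the suspect app.
            System.out.println("heap usage above 90% of max");
        }
    }
}
```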
To the best of my knowledge, the short answer is: No, it can't be done. Tomcat uses a single memory space for all running apps.
My knee-jerk response is that you should fix the memory leak rather than trying to isolate the misbehaving app. Cure is better than quarantine. As I don't know the details of your problem, maybe this isn't practical for some reason.
You can't isolate apps in the same JVM (though you can do things like instrument a particular app's ClassLoader for diagnostics).
If your concern is administration/configuration though (and not total memory consumption) you can run multiple instances of Tomcat off the same install by using catalina.home and catalina.base
JSR 121 was designed to solve this, but it hasn't been implemented yet.
There is no standard system in Java to truly isolate memory used by web applications.
However, you could write some byte-code weaving logic to track how much memory a particular app has allocated. If it goes over a particular threshold, you could throw an exception and stop the app from allocating anymore memory. What do you want to do if you could track all the memory consumed by a web app? What are you trying to implement?
Note that this would only really work effectively for figuring out how much memory a webapp has allocated, not how much it is currently consuming in the system. In order to get that metric, you'd have to byte-code weave finalize() for all objects. Since finalize() gets run in a best-effort fashion by the JVM, this may not get you the most accurate value should the system be under load. The JVM would deprioritize these finalize threads and your value will never get updated even though objects have been cleaned up.
To bring this up to date, it is now possible to run multiple applications on a single JVM. Applications run in isolated java virtual containers which protect your applications from 'noisy neighbours' as well as allowing you to share resources across your applications. This gives you isolation, elasticity and increased application density for Apache Tomcat. Download it from www.elasticat.com NB I do work for Waratek who developed this new JVM

Pitfalls of deploying/redeploying app to Tomcat without restarting

I have read that it is possible with Tomcat 5.5+ to deploy a war to a Tomcat server without a restart. That sounds fantastic, but I guess I am too skeptical about this functionality and its reliability. My previous experience (with Websphere) was that it was a best practice to restart the server to avoid memory problems, etc. So I wanted to get feedback as to what pitfalls might exist with Tomcat.
(To be clear about my experience, I developed java web apps for 5 years for a large company that partitioned the app developers from the app server engineers - we used Websphere - so I don't have a lot of experience with running/configuring any app servers myself)
In general, there are multiple types of leaks, and they all apply to redeploy scenarios. For production systems, it's really best to perform restarts if possible, as there are so many different components and libraries used in today's applications that it's very hard to find them all and even harder to fix them, especially if you don't have access to all the source code.
Memory leaks
Thread and ThreadLocal leaks
ClassLoader leaks
System resource leaks
Connection leaks
ClassLoader leaks are the ones which bite at redeployment.
They can be caused by everything. Really, I mean everything:
Timers: Timers have Threads, and Threads created at runtime inherit the current context class loader, which means the WebappClassloader of Tomcat.
ThreadLocals: ThreadLocals are bound to the thread. App servers use thread pools. When a ThreadLocal is bound to a Thread and the Thread is given back to the pool, the ThreadLocal value will stay there if nobody calls remove() on it properly. This happens quite often and is very hard to find (ThreadLocals do not have a name, except the rarely used Spring NamedThreadLocal). If the ThreadLocal holds a class loaded by the WebappClassloader, you have a ClassLoader leak.
Caches: e.g. EhCache CacheManager
Reflection: JavaBeans Introspector (e.g. holding Class or Method caches)
JDBC Drivers: they shouldn't be in the .war file anyway. Leak due to static registry
Static libraries which cache ClassLoaders, such as Commons-Logging LogFactory
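For the JDBC driver item in the list above, one commonly used mitigation is to deregister, at undeploy time, the drivers that were loaded by the web application's own class loader. The sketch below is an illustration under that assumption; the class and method names are hypothetical, while DriverManager.getDrivers() and DriverManager.deregisterDriver() are standard java.sql calls.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

final class JdbcDriverCleanup {

    /**
     * Deregisters every JDBC driver whose class was loaded by the given
     * class loader and returns how many were removed.
     */
    static int deregisterDriversLoadedBy(ClassLoader appLoader) {
        int removed = 0;
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver driver = drivers.nextElement();
            // Leave drivers owned by the server or the JDK alone.
            if (driver.getClass().getClassLoader() != appLoader) {
                continue;
            }
            try {
                DriverManager.deregisterDriver(driver);
                removed++;
            } catch (SQLException e) {
                // Best effort: a driver that cannot be removed is simply skipped.
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        ClassLoader current = Thread.currentThread().getContextClassLoader();
        System.out.println("drivers deregistered: " + deregisterDriversLoadedBy(current));
    }
}
```

In a web application this would typically be called from ServletContextListener.contextDestroyed with the context class loader, which is how the static DriverManager registry stops referencing the webapp's classes.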
Specific to Tomcat, my experience is as follows:
For simple apps with "clean" libraries, it works fine in Tomcat
Tomcat tries very hard to clean up classes loaded by the WebappClassloader. For example, all static fields of classes are set to null when a webapp is undeployed. This sometimes leads to NullPointerExceptions when code is run while the undeployment is happening, e.g. background jobs using a Logger.
Tomcat has a Listener which cleans up even more stuff. It's called org.apache.catalina.core.JreMemoryLeakPreventionListener and was recently added to Tomcat 6.x.
I wrote a blog post about my experience with leaks when doing redeployment stresstesting - trying to "fix" all possible leaks of an enterprise-grade Java Web Application.
Hot deployment is very nice as it usually is much faster than bringing the server up and down.
mhaller has written a lot about avoiding leaks. Another issue is for active users to have their session survive the application "reboot". There are several things that must be taken care of, but all in all it means that their session must be serializable and THEN deserialize properly afterwards. This can be a bit tricky if you have stateful database connections etc., but if your code is robust against database hiccups anyway, that shouldn't be too bad.
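A quick way to check the "serializable and then deserializes properly" requirement is to push a session attribute through Java serialization in a test. The sketch below uses a hypothetical Cart attribute with a transient field standing in for a stateful connection; it is an assumption-laden illustration, not how any particular container persists sessions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SessionRoundTrip {

    /** Hypothetical session attribute, for illustration only. */
    static final class Cart implements Serializable {
        private static final long serialVersionUID = 1L;
        final String user;
        final int itemCount;
        // Not serialized: must be re-acquired lazily after the session is restored.
        transient Object dbConnection;

        Cart(String user, int itemCount) {
            this.user = user;
            this.itemCount = itemCount;
        }
    }

    /** Serializes the cart to bytes and reads it back, as a session restore would. */
    static Cart roundTrip(Cart cart) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(cart);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Cart) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("session attribute did not survive serialization", e);
        }
    }

    public static void main(String[] args) {
        Cart original = new Cart("alice", 3);
        original.dbConnection = new Object();
        Cart restored = roundTrip(original);
        System.out.println(restored.user + " has " + restored.itemCount + " items");
        System.out.println("connection restored: " + (restored.dbConnection != null));
    }
}
```

The transient field coming back as null is the point: code that uses it after a redeploy has to cope with re-acquiring the resource.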
Also note that some IDEs allow updating code inside the WAR (in the same way as for applications) when a modified source file is saved, instead of requiring a redeploy. MyEclipse does this rather nicely.
