I know that static fields can cause a memory leak, because the objects they reference will not be GCed.
But when a web application is deployed in a container (such as Tomcat), each application has its own ClassLoader, and it can be undeployed.
My question is: does the garbage collector reclaim objects referenced by static members of classes that are going to be unloaded?
The simplest case is a singleton (implemented by a static variable referencing itself): will it be GCed if the application is undeployed?
This might answer your question:
When an app is stopped, Tomcat (even before 6.0.24) nullifies the
value of all static class variables of classes loaded by the
WebAppClassLoader. In some cases, it may fix a classloader leak (for
example because of a custom ThreadLocal class, see above), but even if
we still have a leak, it may decrease the amount of memory lost
You can read more here
Cheers !!
My application has a lot of Spring beans (singleton and effectively stateless).
Do the bean instances add to the application's total memory?
If the beans are unused for a long time, is it possible to release them to be garbage collected?
(I know that the prototype scope is available, but I want only a single instance of the bean to exist at any given time.)
The whole point of a singleton bean is to be a single instance per application context. When a Spring application starts, these beans are created and put into the context.
They'll reside in the context as long as the application context is available (usually it means as long as the application process is up and running).
Now, if the beans are stateless, I doubt they'll significantly impact the memory footprint. The JVM can allocate millions of objects, and usually the memory is consumed by the internal state of the allocated objects...
If you do have a state - you can define a cache with an expiration so that it will be cleaned and eventually garbage collected...
Spring has caching support or alternatively you can use the in-memory cache implementations directly (like caffeine or guava cache) - no need to roll your own cache implementation from scratch.
Do the bean instances add to the application's total memory?
Sure, they will contribute to the application's memory footprint. Your singleton beans are there from application startup, fully initialized and ready to serve requests without any delay; that's the whole point of keeping them alive in Spring's context.
If the beans are unused for a long time, is it possible to release them to be garbage collected?
It feels like you want to address the problem backwards, by trying to mitigate the symptoms without touching the root.
If you have too many beans that are unused for a significant amount of time (and presumably the application is constantly busy), the problem might be rooted at the architectural level. The solution (not an easy one) might be to split the application into multiple services, each shipped with a more granular subset of the functionality you now have in a single application, and deployed and managed separately.
The title might be a bit strong, but let me explain how I understand what happens. I guess this happened with Tomcat (and the message cited comes from Tomcat), but I'm not sure anymore.
TL;DR At the bottom there's a summary why I'm claiming that it is the web servers' fault.
I might be wrong (but without the possibility of being wrong there would be no reason to ask):
An application uses a library
the library uses a ThreadLocal
the ThreadLocal refers to an object from the library
each object refers to its ClassLoader
The webserver
pools its worker threads for efficiency
lends an arbitrary thread to an application
does nothing special (w.r.t. the thread pool) when an application stops or redeploys
If I understand it correctly, after a redeploy the old "dirty" threads continue to be reused. Their ThreadLocals refer to the old classes, which refer to their ClassLoader, which refers to the whole old class hierarchy. So a lot of stuff stays in the PermGen space, which over time leads to an OutOfMemoryError. Is this right so far?
I'm assuming two things:
the redeploy frequency is a few times per hour
the thread creation overhead is a fraction of a millisecond
So a complete thread pool renewal upon each redeploy costs a fraction of a millisecond a few times per hour, i.e., there's a time overhead of 0.0001 * 12/3600 * 100% i.e. 0.000033%.
But instead of accepting this tiny overhead, there are countless problems. Is my calculation wrong or what am I overlooking?
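For reference, the estimate above can be reproduced in a few lines of Java. This is only a sketch of the arithmetic: the class and method names are made up for illustration, the sample values (0.0001 s per thread creation, 12 redeploys per hour) are the assumptions stated above, and it deliberately ignores the pool size and any waiting for in-flight requests.

```java
final class RedeployOverhead {
    /** Share of wall-clock time spent recreating a thread, in percent. */
    static double overheadPercent(double threadCreationSeconds, double redeploysPerHour) {
        double secondsPerHour = 3600.0;
        return threadCreationSeconds * redeploysPerHour / secondsPerHour * 100.0;
    }

    public static void main(String[] args) {
        // 0.0001 s per creation and 12 redeploys per hour, as assumed in the text.
        System.out.println(overheadPercent(0.0001, 12) + " %");
    }
}
```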
As a warning we get the message
The web application ... created a ThreadLocal with key of type ... and a value of type ... but failed to remove it when the web application was stopped.
which should be better stated as
The web server ... uses a thread pool but failed to renew it after stopping (or redeploying) an application.
Or am I wrong? The time overhead is negligible even when all threads get recreated from time to time. But clearing their ThreadLocals before they are provided to the applications would suffice and be even faster.
Summary
There are some real problems (recently this one) and the user can do nothing about them. The library writers sometimes can and sometimes cannot. IMHO the web servers could solve it pretty easily. The thing happens and has a cause. So I'm blaming the only party which could do anything about it.
Proposal for what the web server should exactly do
The title of this question is more provocative than correct, but it has its point. And so does the answer by raphw. This linked question has another open bounty.
I think the web servers could solve it as follows:
ensure that each thread gets reused (or killed) sometime
store a LastCleanupTimestamp in a ThreadLocal (for new threads it's the creation time)
when re-using a thread, check if the cleanup timestamp is below some threshold (e.g., now minus some delta, e.g., 1 hour)
if so, clean all ThreadLocals and set a new LastCleanupTimestamp
This would assure that no such leak exists longer than delta plus the duration of the longest request plus the thread turnaround time. The cost would compose as follows:
checking a single ThreadLocal (i.e., some nanoseconds) per request
cleaning all ThreadLocals reflectively (i.e., some more nanoseconds once each delta per thread)
the cost from removing the data possibly useful for the application which stored them. This can't break an application as no application can assume to see a thread containing the thread locals it has set (since it can't even assume to see the thread itself anymore), but it may cost time needed to recreate the data (e.g., a cached DateFormat instance if someone still uses such a terrible thing).
It could be switched off by simply setting the threshold, if no app has been undeployed or redeployed recently.
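The proposal above could be sketched roughly as follows. This is a hypothetical illustration, not code from any web server: the class name, the injected clock, and the clearAllThreadLocals callback are all assumptions. In particular, the reflective cleanup itself is left to the callback, because it depends on JDK internals.

```java
import java.util.function.LongSupplier;

final class ThreadLocalCleanupPolicy {
    private final long deltaMillis;
    private final LongSupplier clock;
    // Hypothetical hook that wipes the current thread's ThreadLocals.
    private final Runnable clearAllThreadLocals;
    // LastCleanupTimestamp; for a new thread it starts at first use,
    // which approximates the thread's creation time.
    private final ThreadLocal<Long> lastCleanup;

    ThreadLocalCleanupPolicy(long deltaMillis, LongSupplier clock, Runnable clearAllThreadLocals) {
        this.deltaMillis = deltaMillis;
        this.clock = clock;
        this.clearAllThreadLocals = clearAllThreadLocals;
        this.lastCleanup = ThreadLocal.withInitial(clock::getAsLong);
    }

    /**
     * To be called on the worker thread right before it is lent to an application.
     * Returns true if a cleanup was performed.
     */
    boolean beforeRequest() {
        long now = clock.getAsLong();
        if (now - lastCleanup.get() < deltaMillis) {
            return false; // cleaned recently enough: only one ThreadLocal read was paid
        }
        clearAllThreadLocals.run();
        lastCleanup.set(now); // re-set afterwards, in case the cleanup wiped this entry too
        return true;
    }
}
```

With a delta of one hour, a server would construct it as new ThreadLocalCleanupPolicy(3_600_000L, System::currentTimeMillis, cleaner), where cleaner is whatever cleanup routine the server provides.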
TL;DR It's not web servers that create memory leaks. It's you.
Let me first state the problem more explicitly: ThreadLocal variables often refer to an instance of a Class that was loaded by a ClassLoader that was meant to be exclusively used by a container's application. When this application gets undeployed, the ThreadLocal reference gets orphaned. Since each instance keeps a reference to its Class and since each Class keeps a reference to its ClassLoader and since each ClassLoader keeps a reference to all classes it ever loaded, the entire class tree of the undeployed application cannot get garbage collected and the JVM instance suffers a memory leak.
Looking at this problem, you can optimize for either:
Allow as many requests per second as possible even throughout a redeploy (thus keep response time short and reuse threads from a thread pool)
Make sure that threads stay clean by discarding threads once they were used when a redeploy occurred (thus patch forgotten manual cleaning)
Most developers of web applications would argue that the first is more important, since the second can be achieved by writing good code. And what would happen if a redeploy happened concurrently with long-lasting requests? You cannot shut down the old thread pool since this would interrupt running requests. (There is no globally defined maximum for how long a request cycle can take.) In the end, you would need a quite complex protocol for that, and that would bring its own problems.
The ThreadLocal induced leak can however be avoided by always writing:
myThreadLocal.set( ... );
try {
// Do something here.
} finally {
myThreadLocal.remove();
}
That way, your thread will always turn out clean. (On a side note, this is almost like creating global variables: It is almost always a terrible idea. There are some web frameworks like for example Wicket that make a lot of use of this. Web frameworks like this are terrible to use when you need to do things concurrently and get very unintuitive for others to use. There is a trend away from the typical Java one thread per request model such as demonstrated with Play and Netty. Do not get stuck with this anti-pattern. Do use ThreadLocal sparingly! It is almost always a sign of bad design.)
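To make the effect of the try/finally pattern concrete, here is a minimal, self-contained sketch. The names (ThreadLocalCleanupDemo, CURRENT_USER, leakyTask, cleanTask) are invented for illustration, and a single-thread executor stands in for a container's worker pool. It reports what the next task on the same pooled thread sees after a task that forgets to call remove() and after one that does not.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalCleanupDemo {
    // Hypothetical per-thread slot, standing in for myThreadLocal.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Sets the value and forgets to remove it. */
    static void leakyTask(String user) {
        CURRENT_USER.set(user);
    }

    /** Sets the value and always removes it again. */
    static void cleanTask(String user) {
        CURRENT_USER.set(user);
        try {
            // Do something here.
        } finally {
            CURRENT_USER.remove();
        }
    }

    /** Runs the task on a one-thread pool and returns what the next task on that thread sees. */
    static String valueSeenByNextTask(Runnable task) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(task).get();
            Callable<String> readSlot = CURRENT_USER::get;
            return pool.submit(readSlot).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByNextTask(() -> leakyTask("alice"))); // prints "alice"
        System.out.println(valueSeenByNextTask(() -> cleanTask("alice"))); // prints "null"
    }
}
```

In a container the stale value would typically be an instance of an application class, which is what keeps the old ClassLoader reachable.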
You should further be aware that memory leaks that are induced by ThreadLocal are not always detected. Memory leaks are detected by scanning the web server's worker thread pool for ThreadLocal variables. If a ThreadLocal variable was found the variable's Class reveals its ClassLoader. If this ClassLoader or one of its parents is that of the web application that just got undeployed, the web server can safely assume a memory leak.
However, imagine that you stored some large array of Strings in a ThreadLocal variable. How can the web server assume that this array belongs to your application? The String.class was of course loaded with the JVM's bootstrap ClassLoader instance and cannot be associated with a particular web application. By removing the array, the web server might break some other application that is running in the same container. By not removing it, the web server might leak a large amount of memory. (This time, it is not a ClassLoader and its Classes that are leaked. Depending on the size of the array, this leak might however even be worse.)
And it gets worse. This time, imagine that you stored an ArrayList in your ThreadLocal variable. The ArrayList is part of the Java standard library and is therefore, like String, loaded by the JVM's bootstrap ClassLoader. Again, there is no way of telling that the instance belongs to a particular web application. However, this time your ClassLoader and all its Classes will leak as well, if instances of such classes are stored in the thread local ArrayList. This time, the web server cannot even determine with certainty that a memory leak occurred when it finds that the ClassLoader was not garbage collected, since garbage collection can only be recommended to a JVM (via System#gc()) but not enforced.
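The detection rule described above, and its blind spot for JDK types, can be sketched in a few lines. This is an illustration of the idea only, not the code of any actual web server. The class and method names are invented, and the demo class's own loader stands in for the web application's ClassLoader.

```java
import java.util.ArrayList;

final class LeakDetectionSketch {
    /** True if the value's class was loaded by webappLoader or by one of its descendants. */
    static boolean loadedByOrBelow(Object value, ClassLoader webappLoader) {
        if (value == null) {
            return false;
        }
        // getClassLoader() returns null for classes loaded by the bootstrap loader.
        ClassLoader loader = value.getClass().getClassLoader();
        while (loader != null) {
            if (loader == webappLoader) {
                return true;
            }
            loader = loader.getParent();
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for the web application's class loader.
        ClassLoader appLoader = LeakDetectionSketch.class.getClassLoader();

        System.out.println(loadedByOrBelow(new LeakDetectionSketch(), appLoader));      // true
        System.out.println(loadedByOrBelow(new String[] {"big", "array"}, appLoader)); // false
        System.out.println(loadedByOrBelow(new ArrayList<Object>(), appLoader));       // false
    }
}
```

The last two calls return false even if the array or list was created by the application, which is exactly why such values cannot be attributed to it.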
Renewing the thread pool is not as cheap as you might assume.
A web server cannot just go and throw away all threads in a thread pool whenever an application is undeployed. What if you stored some values in those threads? When a web server recycles a thread, it should (I am not sure if all web servers do this) find all non-leaking thread local variables and reregister them in the replacement Thread. The numbers you stated about efficiency would therefore no longer hold.
At the same time, the web server needs to implement some logic that manages the replacement of all the thread pool's Threads, which does not work in favor of your proposed time calculation either. (You might have to deal with long-lasting requests - think of running an FTP server in a servlet container - such that this thread pool transition logic might be active for quite a long time.)
Furthermore, ThreadLocal is not the only possibility of creating a memory leak in a servlet container.
Setting a shutdown hook is another example. (And it is unfortunately a common one. Here, you should manually remove the shutdown hook when your application is undeployed. This problem would not be solved by discarding threads.) Shutdown hooks are furthermore usually instances of custom subclasses of Thread that were loaded by the application's class loader.
In general, any application that keeps a reference to an object that was loaded by a child class loader might create a memory leak. (This is generally possible via Thread#getContextClassLoader().) In the end, it is the developer's responsibility not to cause memory leaks, even in Java applications, where many developers misinterpret automatic garbage collection as meaning that there are no memory leaks. (Think of Joshua Bloch's famous stack implementation example.)
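As a minimal sketch of removing a hook again, the class below registers a shutdown hook and unregisters it on demand. The class name and the hook's body are made up. In a servlet container, unregister() would typically be called from wherever your application handles its own shutdown, for example a ServletContextListener.

```java
final class ShutdownHookLifecycle {
    // The hook Thread is created by application code, so it pins the application's class loader
    // for as long as the JVM keeps it registered.
    private final Thread hook =
            new Thread(() -> System.out.println("JVM is shutting down"), "app-shutdown-hook");

    void register() {
        Runtime.getRuntime().addShutdownHook(hook);
    }

    /** Returns true if the hook was registered and has now been removed. */
    boolean unregister() {
        return Runtime.getRuntime().removeShutdownHook(hook);
    }
}
```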
After this general statement, I want to comment on Tomcat's memory leak protection:
Tomcat does not promise you to detect all memory leaks but covers specific types of such leaks as they are listed in their wiki. What Tomcat actually does:
Each Thread in the JVM is examined, and the internal structures of the
Thread and ThreadLocal classes are introspected to see if either the
ThreadLocal instance or the value bound to it were loaded by the
WebAppClassLoader of the application being stopped.
Some versions of Tomcat even try to compensate for the leak:
Tomcat 6.0.24 to 6.0.26 modify internal structures of the JDK
(ThreadLocalMap) to remove the reference to the ThreadLocal instance,
but this is unsafe (see #48895) so that it became optional and
disabled by default from 6.0.27. Starting with Tomcat 7.0.6, the
threads of the pool are renewed so that the leak is safely fixed.
However, you have to configure Tomcat properly to do so. The wiki entry on its memory leak protection even warns you that you can break other applications when TimerThreads are involved, and that you might still leak memory when starting your own Threads or ThreadPoolExecutors or when sharing common dependencies between several web applications.
All the cleanup work offered by Tomcat is a last resort! It is nothing you want to rely on in production.
Summarized: It is not Tomcat that creates a memory leak, it is your code. Some versions of Tomcat try to compensate for detectable leaks of this kind if they are configured to do so. However, it is your responsibility to take care of memory leaks, and you should see Tomcat's warnings as an invitation to fix your code rather than to reconfigure Tomcat to clean up your mess. If Tomcat detects memory leaks in your application, there might be even more. So take a heap and thread dump of your application and find out where your code is leaking.
I am debugging a problem that I've had for years in a Tomcat application: a memory leak caused when restarting an application, since the Webapp classloader cannot be GC'd. I've been taking snapshots of the heap with JProfiler and it seems that at least some of my static variables aren't getting freed up.
Certain classes have a static final member which is initialized when the class is first loaded, and because it's final I can't set it to null on app shutdown.
Are static final variables an anti-pattern in Tomcat, or am I missing something? I've just started poking around with JProfiler 8, so I may be misinterpreting what the incoming references are telling me.
Cheers!
Luke
It is from a few years ago but this presentation I gave at JavaOne covers this topic exactly. The key steps to find the leak are in slide 11 but there is a lot of background information that might be useful as well.
The short version is:
Trigger the leak
Force GC
Use a profiler to find an instance of org.apache.catalina.loader.WebappClassLoader that has the property started=false
Trace the GC roots of that object - those are your leaks
As I note in the presentation, finding the leaks is one thing, finding what triggered them can be a lot harder.
I would recommend running on the latest stable Tomcat version as we are always improving the memory leak detection and prevention code and the warnings and errors that that generates may also provide some pointers.
Static variables should be garbage collected when the class itself is garbage collected, which in turn is the case when its class loader is garbage collected.
You can easily create a memory leak by having anything that wasn't loaded by the application's class loader hold a reference to any of your classes (or to an instance of your classes). Look for things like callback listeners etc. that you didn't remove properly (inner/anonymous classes are easily overlooked).
A single reference to one of your classes prevents its class loader, and in turn any class loaded by that class loader, from being garbage collected.
Edit, example of leaking an object that prevents GC of all your classes:
MemoryMXBean mx = ManagementFactory.getMemoryMXBean();
NotificationListener nl = new NotificationListener() { ... };
((NotificationEmitter) mx).addNotificationListener(nl, ..., ...);
If you register a listener (NotificationListener here) with an object that exists outside of your application's scope (MemoryMXBean here), your listener will stay 'live' until it is explicitly removed. Since your listener instance holds a reference to its ClassLoader (your application class loader), you have now created a strong reference chain preventing GC of the class loader, and in turn of all the classes it loaded, and through that of any static variables those classes hold.
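A minimal sketch of breaking that chain is to keep a handle on the listener and remove it when the application stops. The class name and the listener body below are invented for illustration. In a web application, stop() would be called from your own shutdown handling.

```java
import java.lang.management.ManagementFactory;
import javax.management.ListenerNotFoundException;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

final class MemoryListenerLifecycle {
    // The platform MemoryMXBean lives outside the application's scope.
    private final NotificationEmitter emitter =
            (NotificationEmitter) ManagementFactory.getMemoryMXBean();
    // Keep the listener in a field so the very same instance can be removed later.
    private final NotificationListener listener = (notification, handback) ->
            System.out.println("Memory notification: " + notification.getType());
    private boolean registered;

    void start() {
        emitter.addNotificationListener(listener, null, null);
        registered = true;
    }

    void stop() {
        if (!registered) {
            return;
        }
        try {
            emitter.removeNotificationListener(listener);
        } catch (ListenerNotFoundException e) {
            // Already gone; nothing left to clean up.
        }
        registered = false;
    }

    boolean isRegistered() {
        return registered;
    }
}
```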
Edit2: Basically you need to avoid this situation:
[Root ClassLoader]
|
v
[Application ClassLoader]
|
v
(Type loaded by Root).addSomething()
The JVM running the application server has loaded the JRE through the root class loader (and possibly the application server, too). That means those classes will never become eligible for GC, since there will always be live references to some of them. The application server will load your application in a separate class loader, and it will no longer hold a reference to that class loader once your application is redeployed (or at least it should not). But your application will share at least the JRE classes with the application server (and commonly the application server's own classes as well).
In the hypothetical case where the application server created a separate class loader (with no parent, practically a second root class loader) and tried to load the JRE a second time (as private to your application), it would cause a lot of problems. Classes intended to be singletons would exist twice, and the two class hierarchies would be incapable of holding any references to each other (because the same class loaded by different class loaders is a different type for the JVM). They couldn't even use java.lang.Object as a reference type for the respective "other" class loader's objects.
This Blog can give you an idea about the memory leak in your application.
When I redeploy my application in tomcat, I get the following issue:
The web application [] created a ThreadLocal with key of type
[java.lang.ThreadLocal] (value [java.lang.ThreadLocal@10d16b])
and a value of type [com.sun.xml.bind.v2.runtime.property.SingleElementLeafProperty]
(value [com.sun.xml.bind.v2.runtime.property.SingleElementLeafProperty@1a183d2]) but
failed to remove it when the web application was stopped.
This is very likely to create a memory leak.
Also, I am using ehcache in my application. This also seems to result in the following message.
SEVERE: The web application [] created a ThreadLocal with key of type [null]
(value [com.sun.xml.bind.v2.ClassFactory$1@24cdc7]) and a value of type
[java.util.WeakHashMap...
The ehcache seems to create a weak hash map and I get the message that this is very likely to create a memory leak.
I searched the net and found this,
http://jira.pentaho.com/browse/PRD-3616 but I don't have access to the server as such.
Please let me know whether these warnings have any functional impact or whether they can be ignored. I used the "Find Memory leaks" option in Tomcat manager and it says "No memory leaks found".
When you redeploy your application, Tomcat creates a new class loader. The old class loader must be garbage collected, otherwise you get a permgen memory leak.
Tomcat cannot check if the garbage collection will work or not, but it knows about several common points of failures. If the webapp class loader sets a ThreadLocal with an instance whose class was loaded by the webapp class loader itself, the servlet thread holds a reference to that instance. This means that the class loader will not be garbage collected.
Tomcat does a number of such detections, see here for more information. Cleaning thread locals is difficult: you would have to call remove() on the ThreadLocal in each of the threads that it was accessed from. In practice this is only important during development, when you redeploy your web app multiple times. In production, you probably do not redeploy, so this can be ignored.
To really find out which instances define the thread locals, you have to use a profiler. For example the heap walker in JProfiler (disclaimer: my company develops JProfiler) will help you to find those thread locals. Select the reported value class (com.sun.xml.bind.v2.runtime.property.SingleElementLeafProperty or com.sun.xml.bind.v2.ClassFactory) and show the cumulated incoming references. One of those will be a java.lang.ThreadLocal$ThreadLocalMap$Entry. Select the referenced objects for that incoming reference type and switch to the allocations view. You will see where the instance has been allocated. With that information you can decide whether you can do something about it or not.
Mattias Jiderhamn has an excellent 6-part article that explains very clearly the theory and practice of classloader leaks. Even better, he also released a jar file that we can include in our war files. I tried it on my web apps, and the jar file worked like a charm! The jar file is called classloader-leak-prevention.jar. Using it is as simple as adding this to our web.xml
<listener>
<listener-class>se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor</listener-class>
</listener>
and then adding this to our pom.xml
<dependency>
<groupId>se.jiderhamn</groupId>
<artifactId>classloader-leak-prevention</artifactId>
<version>1.15.2</version>
</dependency>
For more information, please refer to the project home page hosted on GitHub or Part 6 of his article.
Creating Threads without cleaning them up correctly will eventually run you out of memory - been there, done that.
Those who are still looking for a quick solution/workaround can try the following:
If running standalone Tomcat, kill javaw.exe or the process hosting it.
If running from Eclipse, kill eclipse.exe and java.exe or the enclosing process.
If it is still not resolved, check the task manager: the process causing this will likely be the one with the highest memory usage. Do your analysis and kill that.
You should then be able to redeploy and proceed without memory issues.
I guess you have probably seen this, but just in case: the ehcache doc recommends putting the lib in Tomcat and not in WEB-INF/lib.
I recommend initializing thread locals in a ServletRequestListener.
I recommend initializing thread locals, in a ServletRequestListener.
ServletRequestListener has 2 methods: one for initialization and one for destruction.
This way, you can cleanup your ThreadLocal. Example:
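import javax.servlet.ServletRequestEvent;
import javax.servlet.ServletRequestListener;

public class ContextInitiator implements ServletRequestListener {

    // One ThreadLocal shared by all requests; each thread gets its own value.
    private static final ThreadLocal<ContextThreadLocal> context =
            new ThreadLocal<ContextThreadLocal>() {
                @Override
                protected ContextThreadLocal initialValue() {
                    return new ContextThreadLocal();
                }
            };

    @Override
    public void requestInitialized(ServletRequestEvent sre) {
        // Bind the current request to the thread that serves it.
        context.get().setRequest(sre.getServletRequest());
    }

    @Override
    public void requestDestroyed(ServletRequestEvent sre) {
        // Clear the value so the pooled thread keeps no reference to webapp classes.
        context.remove();
    }
}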
web.xml:
<listener>
<listener-class>ContextInitiator</listener-class>
</listener>
In Java, I have an application running in an app server like GlassFish, deployed as an EJB. When I undeploy the EJB, what happens to the singleton classes that are loaded into memory? I understand that they stay there until I restart the container and can be garbage collected, but I am not sure where and when that will happen. So if I deploy the EJB once again, might it pick up the old objects from the JVM?
Each deployed app is loaded with its own separate classloader. Since the classloader is part of a class's identity, the same class can be loaded multiple times (with different configuration) without the different instances interfering with each other.
This effectively isolates different applications within an app server from each other and even allows the same application to be run twice in parallel.
When an application is undeployed, all its objects (including the classloader and thus the classes themselves) will be garbage collected, if no references to them remain. Unfortunately, it can happen that references remain in some system class and prevent the garbage collection - this is called a classloader leak.
Garbage collection runs at arbitrary times and you cannot control it. A redeployed application does not pick up the old objects, though, because its classes are loaded by a new classloader. When you undeploy, the singleton gets destroyed. When you deploy, a new one is created.