I am getting ThreadLocal memory-leak errors in Tomcat. I am using a thread pool, but I have no ThreadLocal implementation in my webapp.
SEVERE: The web application [/myWebApp] created a ThreadLocal with key of type [org.apache.http.impl.cookie.DateUtils$DateFormatHolder$1] (value [org.apache.http.impl.cookie.DateUtils$DateFormatHolder$1#4c2849]) and a value of type [java.lang.ref.SoftReference] (value [java.lang.ref.SoftReference#1e67280]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.
What I don't understand is why I am getting a ThreadLocal error although I have not implemented one. I want to get rid of these messages, so I searched the web, and here it is written that in order to clean up the ThreadLocal I need to use:
ThreadLocal.remove()
but I have no implementation of ThreadLocal. I'd appreciate it if someone could show me a way.
Clearly, something is creating that / those ThreadLocal instances. If it is not your code, then it must be some library you are using, or (unlikely) Tomcat itself.
I would start by looking at what might be creating instances of
org.apache.http.impl.cookie.DateUtils$DateFormatHolder$1
(That's an anonymous class in a nested class in DateUtils, by the way ... so unless something weird is going on, the creation will be occurring in the DateUtils.java file.)
If examining the source code doesn't help, try debugging the Tomcat instance and setting a breakpoint on the ThreadLocal constructor(s).
The problem lies within your third-party library. You cannot use thread locals in a thread-pooled environment unless you really clean them up at the end of each request.
This article explains the problem:
http://blog.maxant.co.uk/pebble/2008/09/23/1222200780000.html
The ThreadLocal was obviously created by some framework or library you use (look at which one is using HttpClient), but as you can see in the log the value is a SoftReference, which should minimize the memory leak.
In fact, you can see in the code for DateUtils that it is creating the ThreadLocal...
Here's the HttpClient JIRA: https://issues.apache.org/jira/browse/HTTPCLIENT-1216
As of release 4.2.2 there's a clearThreadLocal() method; starting with 4.3 the cookie DateUtils is deprecated and replaced with org.apache.http.client.utils.DateUtils.
Calling DateUtils.clearThreadLocal() once on shutdown is not enough: it only clears the ThreadLocal of the current thread, so you need to invoke it on the same thread, after performing an HTTP request that parses or formats dates. This removes most of the performance benefit of using a ThreadLocal, though.
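A minimal sketch of that per-request cleanup, assuming HttpClient 4.3+ where the replacement class org.apache.http.client.utils.DateUtils carries the clearThreadLocal() method mentioned above (the exact class hosting the method differs between 4.2.2 and 4.3):

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.utils.DateUtils;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.util.EntityUtils;

public class HttpCallWithCleanup {

    // Executes one request and clears the date-format ThreadLocal afterwards.
    public static void fetch(CloseableHttpClient client, String url) throws Exception {
        try (CloseableHttpResponse response = client.execute(new HttpGet(url))) {
            // Consuming the response (cookies, Date headers) is what may populate
            // the DateFormatHolder ThreadLocal on the current pooled thread.
            EntityUtils.consume(response.getEntity());
        } finally {
            // Must run on the same thread that executed the request; calling it
            // once at shutdown only cleans whatever thread happens to run it.
            DateUtils.clearThreadLocal();
        }
    }
}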
Alternatively, if you perform HTTP requests from a thread under your control (not created by Tomcat), remember to shut down any thread pools/executors on application shutdown.
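For the executor case, a sketch of shutting down an application-owned pool from a ServletContextListener; the pool and listener names here are hypothetical:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;

@WebListener
public class ExecutorShutdownListener implements ServletContextListener {

    // An application-owned pool; its threads are created by us, not by Tomcat.
    public static final ExecutorService HTTP_POOL = Executors.newFixedThreadPool(4);

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // Nothing to do; the pool above is created eagerly.
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        HTTP_POOL.shutdown(); // stop accepting new work
        try {
            if (!HTTP_POOL.awaitTermination(10, TimeUnit.SECONDS)) {
                HTTP_POOL.shutdownNow(); // interrupt stragglers
            }
        } catch (InterruptedException e) {
            HTTP_POOL.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}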
The big shame is that HttpClient could easily be modified to not subclass ThreadLocal; then the ThreadLocal wouldn't reference the webapp and its classloader, avoiding the bulk of the leak, I believe :(
I noticed something interesting.
I was told (and from what I read) that it is safe to hold request-scope variables in a ThreadLocal (let's say you don't have access to the request object and can't use request attributes)
Well, it seems to work (at least when I checked in Tomcat). For example, even if I have 10 threads in the pool, the thread-local variables seem to live only for the scope of a single request. I mean, even when I see the same thread name again (say I have 10 threads in the pool, so after 10 requests I ought to see some repeats), each request "magically" resets all thread-local variables for that thread.
Is that true?
Is it that every time a request thread is returned to the pool, it clears all thread-local vars? How?
Later Tomcat versions have a thread-local protection mechanism to avoid memory leaks and especially to defend against class loaders that hang around after an application redeploy.
Tomcat 6.0.24 detects it and 7.0.6 removes the thread locals, as documented here: http://wiki.apache.org/tomcat/MemoryLeakProtection
So this is not normal behaviour for threads/thread pools, but a Tomcat feature.
Is it that every time a request thread is returned to the pool, it clears all thread-local vars?
It seems that Tomcat took care of the housekeeping, but in general the answer is no, and this was one of the reasons for OutOfMemory errors in Glassfish 3.0.1:
In Glassfish 3.0.1, for instance, during application deployment some code was creating a ThreadLocal variable holding a reference to some instances which in turn seem to hold references to a lot of other objects (mostly related to proxy classes generated during EJB and CDI deployment). Glassfish doesn't seem to clean this reference after the deployment finishes, which wouldn't be much of a problem if the thread that deploys the application terminated.
Unfortunately the application deployment thread never dies, because it is not created solely for the purpose of application deployment. It is instead taken from the Glassfish thread pool, and the expected behaviour is that it is returned to the pool once it finishes the deployment task. That means the ThreadLocal reference never gets cleaned up and over time causes the heap to overflow.
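To see the general, container-independent behaviour for yourself (the "in general the answer is no" part), a toy servlet like the following hypothetical one keeps its per-thread counter across requests served by the same pooled thread, unless the container cleans up or renews its threads:

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/counter")
public class ThreadLocalCounterServlet extends HttpServlet {

    // Counts how many requests this particular worker thread has served.
    private static final ThreadLocal<Integer> HITS = ThreadLocal.withInitial(() -> 0);

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        HITS.set(HITS.get() + 1);
        // Without per-request cleanup or thread renewal, the count keeps
        // growing for a given thread name across requests.
        resp.getWriter().println(Thread.currentThread().getName() + " -> " + HITS.get());
    }
}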
I intend to use Spring's ThreadLocalTargetSource for accessing the user context for my web app and web services application.
I have browsed the net for some time now and all I see are scary results about memory leaks. I see that the ThreadLocalTargetSource implementation already has a destroy method that cleans up and nullifies the ThreadLocal object. I believe all this is good. Then why do we get memory leaks when Spring is handling it the right way? Is there anything that we need to do explicitly?
It does no great harm to use it, but you have to be a bit careful.
As the API docs suggest, every thread will have its own copy of the target, and there might also be a few thread-bound objects that stay in memory until the application actually shuts down.
The API statements just suggest that this class has to be used with a bit of extra care, even though Spring provides a destroy method that makes the objects available for GC. Otherwise, it is fine to use.
Unset the object for every request cycle if you want to use ThreadLocal.
Please refer to the links below to a blog post and the API doc:
http://tigrou.nl/2009/05/09/springs-threadlocaltargetsource/
http://docs.spring.io/spring/docs/2.0.x/api/org/springframework/aop/target/ThreadLocalTargetSource.html
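A minimal sketch of the "unset the object for every request cycle" advice, using a plain servlet Filter; it is a generic illustration, not tied to ThreadLocalTargetSource's own API, and the CURRENT_USER holder is a made-up name:

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;

@WebFilter("/*")
public class UserContextCleanupFilter implements Filter {

    // The thread-bound state that must never outlive the request.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    @Override
    public void init(FilterConfig filterConfig) {
        // nothing to initialize
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        CURRENT_USER.set(request.getParameter("user")); // bind per-request state
        try {
            chain.doFilter(request, response);
        } finally {
            CURRENT_USER.remove(); // always unbind before the thread returns to the pool
        }
    }

    @Override
    public void destroy() {
        // nothing to clean up at filter level
    }
}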
The title might be a bit strong, but let me explain how I understand what happens. I guess this happened with Tomcat (and the message cited comes from Tomcat), but I'm not sure anymore.
TL;DR At the bottom there's a summary of why I'm claiming that it is the web servers' fault.
I might be wrong (but without the possibility of being wrong there would be no reason to ask):
An application uses a library
the library uses a ThreadLocal
the ThreadLocal refers to an object from the library
each object refers to its ClassLoader
The webserver
pools its worker threads for efficiency
lends an arbitrary thread to an application
does nothing special (w.r.t. the thread pool) when an application stops or redeploys
If I understand it correctly, after a redeploy the old "dirty" threads continue to be reused. Their ThreadLocals refer to the old classes, which refer to their ClassLoader, which refers to the whole old class hierarchy. So a lot of stuff stays in the PermGen space, which over time leads to an OutOfMemoryError. Is this right so far?
I'm assuming two things:
the redeploy frequency is a few times per hour
the thread creation overhead is a fraction of a millisecond
So a complete thread pool renewal upon each redeploy costs a fraction of a millisecond a few times per hour, i.e., with 12 redeploys per hour and 0.1 ms per renewal there's a time overhead of 0.0001 * 12/3600 * 100%, i.e., 0.000033%.
But instead of this tiny overhead being accepted, there are countless problems. Is my calculation wrong, or what am I overlooking?
As a warning we get the message
The web application ... created a ThreadLocal with key of type ... and a value of type ... but failed to remove it when the web application was stopped.
which should be better stated as
The web server ... uses a thread pool but failed to renew it after stopping (or redeploying) an application.
Or am I wrong? The time overhead is negligible even when all threads get recreated from time to time. But clearing their ThreadLocals before they are provided to the applications would suffice and be even faster.
Summary
There are some real problems (recently this one) and the user can do nothing about it. The library writers sometimes can and sometimes cannot. IMHO the web servers could solve it pretty easily. The thing happens and has a cause. So I'm blaming the only party which could do anything about it.
Proposal for what exactly the web server should do
The title of this question is more provocative than correct, but it has its point. And so does the answer by raphw. This linked question has another open bounty.
I think the web servers could solve it as follows (a rough sketch follows the list):
ensure that each thread gets reused (or killed) sometime
store a LastCleanupTimestamp in a ThreadLocal (for new threads it's the creation time)
when re-using a thread, check if the cleanup timestamp is below some threshold (e.g., now minus some delta, e.g., 1 hour)
if so, clean all ThreadLocals and set a new LastCleanupTimestamp
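Here is a rough sketch of those four steps, assuming a hypothetical container hook that is called before a pooled thread is handed out; the reflective reset relies on JDK internals (java.lang.Thread#threadLocals) and may need --add-opens on recent JVMs:

import java.lang.reflect.Field;

public final class ThreadLocalJanitor {

    // The "delta" from the proposal: clean at most once per hour per thread.
    private static final long CLEANUP_INTERVAL_MS = 60 * 60 * 1000L;

    // The one ThreadLocal the container itself owns: the last cleanup timestamp.
    private static final ThreadLocal<Long> LAST_CLEANUP =
            ThreadLocal.withInitial(System::currentTimeMillis);

    /** Call this before reusing a pooled worker thread for a new request. */
    public static void cleanIfStale() {
        long last = LAST_CLEANUP.get();
        if (System.currentTimeMillis() - last < CLEANUP_INTERVAL_MS) {
            return; // cheap path: a single ThreadLocal read per request
        }
        wipeAllThreadLocals(Thread.currentThread());
        LAST_CLEANUP.set(System.currentTimeMillis()); // re-set after the wipe
    }

    // Drops the whole ThreadLocalMap of the thread; the JDK lazily recreates it.
    private static void wipeAllThreadLocals(Thread thread) {
        try {
            Field f = Thread.class.getDeclaredField("threadLocals");
            f.setAccessible(true);
            f.set(thread, null);
        } catch (ReflectiveOperationException e) {
            // Fall back to retiring the thread instead of reusing it.
            throw new IllegalStateException("Cannot reset thread locals", e);
        }
    }
}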
This would assure that no such leak exists longer than delta plus the duration of the longest request plus the thread turnaround time. The cost would compose as follows:
checking a single ThreadLocal (i.e., some nanoseconds) per request
cleaning all ThreadLocals reflectively (i.e., some more nanoseconds once each delta per thread)
the cost of removing data that is possibly still useful for the application which stored it. This can't break an application, as no application can assume it will ever see again a thread containing the thread locals it has set (since it can't even assume to see the thread itself anymore), but it may cost the time needed to recreate the data (e.g., a cached DateFormat instance, if someone still uses such a terrible thing).
It could be switched off simply by setting the threshold accordingly if no app has been undeployed or redeployed recently.
TL;DR It's not web servers that create memory leaks. It's you.
Let me first state the problem more explicitly: ThreadLocal variables often refer to an instance of a Class that was loaded by a ClassLoader that was meant to be exclusively used by a container's application. When this application gets undeployed, the ThreadLocal reference gets orphaned. Since each instance keeps a reference to its Class and since each Class keeps a reference to its ClassLoader and since each ClassLoader keeps a reference to all classes it ever loaded, the entire class tree of the undeployed application cannot get garbage collected and the JVM instance suffers a memory leak.
Looking at this problem, you can optimize for either:
Allow as many requests per second as possible even throughout a redeploy (thus keep response time short and reuse threads from a thread pool)
Make sure that threads stay clean by discarding threads once they were used when a redeploy occurred (thus patch forgotten manual cleaning)
Most developers of web applications would argue that the first is more important, since the second can be achieved by writing good code. And what would happen if a redeploy occurred concurrently with long-lasting requests? You cannot shut down the old thread pool, since this would interrupt running requests. (There is no globally defined maximum for how long a request cycle can take.) In the end, you would need quite a complex protocol for that, and that would bring its own problems.
The ThreadLocal induced leak can however be avoided by always writing:
myThreadLocal.set( ... );
try {
// Do something here.
} finally {
myThreadLocal.remove();
}
That way, your thread will always turn out clean. (On a side note, this is almost like creating global variables: it is almost always a terrible idea. There are some web frameworks, such as Wicket, that make a lot of use of this. Web frameworks like this are terrible to use when you need to do things concurrently, and they get very unintuitive for others to use. There is a trend away from the typical Java one-thread-per-request model, as demonstrated by Play and Netty. Do not get stuck with this anti-pattern. Do use ThreadLocal sparingly! It is almost always a sign of bad design.)
You should further be aware that memory leaks that are induced by ThreadLocal are not always detected. Memory leaks are detected by scanning the web server's worker thread pool for ThreadLocal variables. If a ThreadLocal variable was found the variable's Class reveals its ClassLoader. If this ClassLoader or one of its parents is that of the web application that just got undeployed, the web server can safely assume a memory leak.
However, imagine that you stored some large array of Strings in a ThreadLocal variable. How can the web server assume that this array belongs to your application? The String.class was of course loaded with the JVM's bootstrap ClassLoader instance and cannot be associated with a particular web application. By removing the array, the web server might break some other application that is running in the same container. By not removing it, the web server might leak a large amount of memory. (This time, it is not a ClassLoader and its Classes that are leaked. Depending on the size of the array, this leak might however even be worse.)
And it gets worse. This time, imagine that you stored an ArrayList in your ThreadLocal variable. The ArrayList is part of the Java standard library and therefore loaded with the system ClassLoader. Again, there is no way of telling that the instance belongs to a particular web application. However, this time your ClassLoader and all its Classes will leak, as well as all instances of such classes that are stored in the thread-local ArrayList. This time, the web server cannot even determine with certainty that a memory leak occurred when it finds that the ClassLoader was not garbage collected, since garbage collection can only be suggested to a JVM (via System#gc()) but not enforced.
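For illustration, a rough sketch of the detection idea described above: walk a thread's ThreadLocalMap reflectively and flag entries whose key or value class was loaded by the web application's class loader. The field and class names are JDK internals, the reflection may need --add-opens on recent JVMs, and this is not Tomcat's actual implementation:

import java.lang.ref.Reference;
import java.lang.reflect.Array;
import java.lang.reflect.Field;

public final class LeakSniffer {

    public static boolean looksLeaky(Thread thread, ClassLoader webappLoader) throws Exception {
        Field mapField = Thread.class.getDeclaredField("threadLocals");
        mapField.setAccessible(true);
        Object map = mapField.get(thread);               // ThreadLocal.ThreadLocalMap
        if (map == null) {
            return false;
        }
        Field tableField = map.getClass().getDeclaredField("table");
        tableField.setAccessible(true);
        Object table = tableField.get(map);              // ThreadLocalMap.Entry[]
        for (int i = 0; i < Array.getLength(table); i++) {
            Object entry = Array.get(table, i);
            if (entry == null) {
                continue;
            }
            Object key = ((Reference<?>) entry).get();   // the ThreadLocal itself (Entry extends WeakReference)
            Field valueField = entry.getClass().getDeclaredField("value");
            valueField.setAccessible(true);
            Object value = valueField.get(entry);
            if (loadedBy(key, webappLoader) || loadedBy(value, webappLoader)) {
                return true;                             // webapp classes are pinned by this thread
            }
        }
        return false;
    }

    // True if the object's class loader is the given loader or has it among its parents.
    private static boolean loadedBy(Object o, ClassLoader loader) {
        if (o == null) {
            return false;
        }
        for (ClassLoader cl = o.getClass().getClassLoader(); cl != null; cl = cl.getParent()) {
            if (cl == loader) {
                return true;
            }
        }
        return false;
    }
}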
Renewing the thread pool is not as cheap as you might assume.
A web server cannot just go and throw away all threads in a thread pool whenever an application is undeployed. What if you stored some values in those threads? When a web server recycles a thread, it should (I am not sure whether all web servers do this) find all non-leaking thread-local variables and re-register them on the replacement Thread. The numbers you stated about efficiency would therefore no longer hold.
At the same time, the web server needs to implement some logic that manages the replacement of all of the thread pool's Threads, which does not work in favor of your proposed time calculation either. (You might have to deal with long-lasting requests - think of running an FTP server in a servlet container - such that this thread pool transition logic might be active for quite a long time.)
Furthermore, ThreadLocal is not the only possibility of creating a memory leak in a servlet container.
Setting a shutdown hook is another example. (And it is unfortunately a common one. Here, you should manually remove the shutdown hook when your application is undeployed. This problem would not be solved by discarding threads.) Shutdown hooks are furthermore instances of custom Thread subclasses that were loaded by the application's class loader.
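A sketch of registering a JVM shutdown hook from a webapp and, crucially, removing it again on undeploy as described above; the listener and the hook's work are placeholders:

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;

@WebListener
public class ShutdownHookListener implements ServletContextListener {

    private Thread hook;

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        hook = new Thread(() -> {
            // ... flush buffers, close files, etc. ...
        });
        Runtime.getRuntime().addShutdownHook(hook);
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        // Without this, the JVM keeps a reference to the hook, whose Runnable's
        // class was loaded by the webapp's class loader: a class loader leak on undeploy.
        if (hook != null) {
            Runtime.getRuntime().removeShutdownHook(hook);
        }
    }
}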
In general, any application that keeps a reference to an object that was loaded by a child class loader might create a memory leak. (This is generally possible via Thread#getContextClassLoader().) In the end, it is the developer's responsibility not to cause memory leaks, even in Java, where many developers misinterpret automatic garbage collection as meaning there are no memory leaks. (Think of Joshua Bloch's famous stack implementation example.)
After this general statement, I want to comment on Tomcat's memory leak protection:
Tomcat does not promise you to detect all memory leaks but covers specific types of such leaks as they are listed in their wiki. What Tomcat actually does:
Each Thread in the JVM is examined, and the internal structures of the Thread and ThreadLocal classes are introspected to see if either the ThreadLocal instance or the value bound to it were loaded by the WebAppClassLoader of the application being stopped.
Some versions of Tomcat even try to compensate for the leak:
Tomcat 6.0.24 to 6.0.26 modify internal structures of the JDK (ThreadLocalMap) to remove the reference to the ThreadLocal instance, but this is unsafe (see #48895) so that it became optional and disabled by default from 6.0.27. Starting with Tomcat 7.0.6, the threads of the pool are renewed so that the leak is safely fixed.
However, you have to configure Tomcat properly to do so. The wiki entry on its memory leak protection even warns you how you can break other applications when TimerThreads are involved, or how you might still cause memory leaks when starting your own Threads or ThreadPoolExecutors, or when using common dependencies for several web applications.
All the cleanup work offered by Tomcat is a last resort! It's nothing you want to have in your production code.
Summarized: it is not Tomcat that creates a memory leak, it is your code. Some versions of Tomcat try to compensate for such leaks, where they are detectable, if configured to do so. However, it is your responsibility to take care of memory leaks, and you should see Tomcat's warnings as an invitation to fix your code rather than to reconfigure Tomcat to clean up your mess. If Tomcat detects memory leaks in your application, there might be even more. So take a heap dump and a thread dump of your application and find out where your code is leaking.
Will a ThreadLocal cause a memory leak in the Glassfish server like it does in Tomcat? Why?
http://wiki.apache.org/tomcat/MemoryLeakProtection
Yes, it will leak, and Glassfish won't even warn you, according to this relatively recent Glassfish JIRA issue:
http://java.net/jira/browse/GLASSFISH-14128
What needs to be said, however, is that the ThreadLocal-specific leaking is not a 'bug' in app/web servers per se, but a problem with code in components running in those containers (whether these components are servlets, session beans or whatever).
What app servers/web containers try to do in general is to shield the developer from writing a lot of maintenance code and let them focus on business logic. There needs to be, however, some understanding on their part of how the application server works (thread pools, classloaders, deploy/undeploy mechanisms, ...) so that things like this ThreadLocal issue are handled properly or avoided. It is not always easy and it can be very tricky. I remember reading about a memory leak issue (in Glassfish, I think) related to the use of custom log levels.
What Apache Tomcat does is provide a helper mechanism to warn the user about, and deal with, some commonly occurring memory leak issues in user code. But even in the link provided in the question, you may read that not all possible ThreadLocal memory leaks are handled automatically by this mechanism.
Glassfish does not seem to have this added functionality yet.
This problem causes all sorts of issues. I posted about it a while ago
I need help finding my memory leak using MAT
We're manually freeing the objects ourselves. I think I saw in the GF bug lists that this had been fixed in the 3.1x release.
One of the first things I've learned about Java EE development is that I shouldn't spawn my own threads inside a Java EE container. But when I come to think about it, I don't know the reason.
Can you clearly explain why it is discouraged?
I am sure most enterprise applications need some kind of asynchronous jobs, like mail daemons, idle-session handling, cleanup jobs, etc.
So, if indeed one shouldn't spawn threads, what is the correct way to do it when needed?
It is discouraged because all resources within the environment are meant to be managed, and potentially monitored, by the server. Also, much of the context in which a thread is being used is typically attached to the thread of execution itself. If you simply start your own thread (which I believe some servers will not even allow), it cannot access other resources. What this means is that you cannot get an InitialContext and do JNDI lookups to access other system resources such as JMS connection factories and data sources.
There are ways to do this "correctly", but it is dependent on the platform being used.
The CommonJ WorkManager is common to WebSphere and WebLogic, as well as others.
More info here
And here
Also somewhat duplicates this one from this morning
UPDATE: Please note that this question and answer relate to the state of Java EE in 2009; things have improved since then!
For EJBs, it's not only discouraged, it's expressly forbidden by the specification:
An enterprise bean must not use thread synchronization primitives to synchronize execution of multiple instances.
and
The enterprise bean must not attempt to manage threads. The enterprise bean must not attempt to start, stop, suspend, or resume a thread, or to change a thread's priority or name. The enterprise bean must not attempt to manage thread groups.
The reason is that EJBs are meant to operate in a distributed environment. An EJB might be moved from one machine in a cluster to another. Threads (and sockets and other restricted facilities) are a significant barrier to this portability.
The reason that you shouldn't spawn your own threads is that they won't be managed by the container. The container takes care of a lot of things that a novice developer can find hard to imagine. For example, things like thread pooling, clustering, and crash recovery are performed by the container. When you start a thread yourself, you may lose some of those. Also, the container lets you restart your application without affecting the JVM it runs on. How would this be possible if there were threads outside the container's control?
This is the reason that timer services were introduced in J2EE 1.4. See this article for details.
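As a sketch of that container-managed alternative, here is a programmatic EJB timer in its later annotation form (EJB 3.x); the bean name, interval, and cleanup work are placeholders:

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.ejb.Timeout;
import javax.ejb.Timer;
import javax.ejb.TimerService;

@Stateless
public class CleanupJobBean {

    @Resource
    private TimerService timerService;

    public void scheduleCleanup() {
        // Fire every 30 minutes on a container-managed thread; "cleanup" is the timer info.
        timerService.createTimer(0L, 30 * 60 * 1000L, "cleanup");
    }

    @Timeout
    public void onTimeout(Timer timer) {
        // ... do the periodic cleanup work here ...
    }
}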
Concurrency Utilities for Java EE
There is now a standard, correct way to create threads with the core Java EE API:
JSR 236: Concurrency Utilities for Java™ EE
By using the Concurrency Utilities, you ensure that your new thread is created and managed by the container, guaranteeing that all EE services are available.
Examples here
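A minimal sketch of JSR 236 usage, assuming a Java EE 7+ container; the container injects the spec-defined default managed executor, so the submitted work runs on a container-managed thread:

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.enterprise.concurrent.ManagedExecutorService;

@Stateless
public class ReportService {

    // The JNDI name below is the spec-defined default executor.
    @Resource(lookup = "java:comp/DefaultManagedExecutorService")
    private ManagedExecutorService executor;

    public void generateReportAsync(long reportId) {
        // The task runs on a container-managed thread, not one we created ourselves.
        executor.submit(() -> {
            // ... long-running report generation goes here ...
            System.out.println("Generating report " + reportId);
        });
    }
}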
There is no real reason not to do so. I used Quartz with Spring in a webapp without problems. The java.util.concurrent framework may also be used. If you implement your own thread handling, set the threads to daemon or use your own daemon thread group for them so the container can unload your webapp at any time.
But be careful: the session and request bean scopes do not work in spawned threads! Also, other code based on ThreadLocal does not work out of the box; you need to transfer the values to the spawned threads yourself.
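A small sketch of manually handing a ThreadLocal-bound value over to a spawned daemon thread, since the child thread will not see the parent's ThreadLocal entries; the REQUEST_ID holder is a made-up example:

public class HandOffExample {

    private static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    public void startBackgroundWork() {
        final String requestId = REQUEST_ID.get(); // capture on the request thread
        Thread worker = new Thread(() -> {
            REQUEST_ID.set(requestId);             // re-bind on the worker thread
            try {
                // ... background work that relies on REQUEST_ID ...
            } finally {
                REQUEST_ID.remove();               // keep the worker thread clean
            }
        });
        worker.setDaemon(true);                    // as suggested above, so undeploy isn't blocked
        worker.start();
    }
}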
You can always tell the container to start stuff as part of your deployment descriptors. These can then do whatever maintenance tasks you need to do.
Follow the rules. You will be glad some day you did :)
Threads are prohibited in Java EE containers according to the blueprints. Please refer to the blueprints for more information.
I've never read that it's discouraged, except from the fact that it's not easy to do correctly.
It is fairly low-level programming, and like other low-level techniques you ought to have a good reason. Most concurrency problems can be resolved far more effectively using built-in constructs like thread pools.
One reason I have found: if you spawn some threads in your EJB and then you try to have the container unload or update your EJB, you are going to run into problems. There is almost always another way to do something where you don't need a Thread, so just say NO.