How to properly dispose of ThreadLocal variables? - java

What is the cleanest way to dispose of ThreadLocal variables so that they are subject to garbage collection? I read from the docs that:
...after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).
But sometimes threads can be pooled or are not expected to die. Does the ThreadLocal#remove() method actually make the value subject to garbage collection?

ThreadLocal.remove() does remove the thread's reference to the value, and if no other live reference to it exists, the value becomes eligible for garbage collection.
When a thread dies, it is no longer a GC root, so its entry in the ThreadLocal becomes subject to GC, and so does the value held by that entry. Once again, if another live reference to the value exists, it won't be garbage collected.
If the thread is reused (because it belongs to a pool, for example), it is important to call remove() so that the value can be garbage collected, and also to avoid unexpected behavior when a new job runs on a recycled thread (the new job should not see the value left by the previous job).
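To make the pooled-thread case concrete, here is a minimal sketch. It assumes a single-thread executor so that both jobs are guaranteed to run on the same recycled thread; the class name PooledThreadLocalDemo and the CURRENT_USER variable are hypothetical and only serve the illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; names are illustrative and not from the question.
final class PooledThreadLocalDemo {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two jobs on the same pooled thread and returns what the second job sees.
    static String valueSeenBySecondJob(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstJob = () -> {
                CURRENT_USER.set("alice");
                try {
                    // ... the first job would do its work using CURRENT_USER here ...
                } finally {
                    if (cleanUp) {
                        // Drop this thread's reference to the value.
                        CURRENT_USER.remove();
                    }
                }
            };
            Callable<String> secondJob = CURRENT_USER::get;

            pool.submit(firstJob).get();          // wait for the first job to finish
            return pool.submit(secondJob).get();  // runs on the same recycled thread
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the cleanup, the second job sees the value left behind by the first one, and the pooled thread keeps that value reachable. With remove() in a finally block, the second job sees null and the value can be collected once nothing else refers to it.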

Related

Is the weakReference.get() method safe to call from multiple threads at the same time?

Is the weakReference.get() method safe to call from multiple threads at the same time?
The documentation says that:
Once an object has been determined to be garbage collectable at that time it will atomically clear all weak references to that object
WeakReference.get() has to be thread safe, because the reference is cleared by the GC thread while application threads may be calling get(). Otherwise, there would be a risk that you could see an object which has previously been collected.
Note: As @Pillar may be suggesting, there is one operation that needs care, clear(): a thread that obtained the value through get() before clear() was called can still be using it afterwards.
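As a small sketch of how calling code usually deals with this: read the reference once into a local variable and work with that local. The class and method names below are hypothetical.

```java
import java.lang.ref.WeakReference;

// Hypothetical helper; the names are illustrative only.
final class WeakReferenceAccess {
    // Returns the referent's string form, or "cleared" if the reference no longer holds an object.
    static String describe(WeakReference<?> ref) {
        // Read the reference once. The local variable is a strong reference,
        // so the object cannot be collected while we use it.
        Object target = ref.get();
        if (target == null) {
            return "cleared";
        }
        return target.toString();
    }
}
```

Calling get() twice, once for the null check and once for the use, would leave a window in which the GC or another thread's clear() empties the reference between the two calls.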

Run thread until object/parent is dereferenced

I have a Java class which needs a monitor running in parallel once it is instantiated. I want to keep this monitor running until the instance is no longer running or is no longer referenced.
Usually I use an active flag as a variable, which is cleared when the class is shut down/closed; however, this has to be managed carefully, and the shutdown has to be called explicitly when closing.
I am also aware of the finalize member of Object, but as I remember it is not safe to use. Or is it suitable for this purpose?
Additionally, a monitor might of course have circular references to the monitored object, but this might be another issue.
You could link the object to be monitored to the thread using a WeakReference. This allows the garbage collector to collect and destroy the object.
In the thread you would have to check whether the referenced object still exists every time you perform your checks. If it no longer exists, you can safely exit the thread.
As the garbage collector does not destroy objects immediately, there may be an unknown time span during which the thread is still active but the monitored object is no longer used.
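A minimal sketch of such a monitor thread follows. It assumes a simple polling loop; the class name WeakMonitor, the thread name and the poll interval parameter are hypothetical. The important detail is that the thread's code captures only the WeakReference, never the monitored object itself.

```java
import java.lang.ref.WeakReference;

// Hypothetical monitor; names and the polling approach are assumptions for illustration.
final class WeakMonitor {
    // Starts a daemon thread that polls the target until the weak reference is empty.
    static Thread start(WeakReference<?> targetRef, long pollMillis) {
        Thread monitor = new Thread(() -> {
            while (true) {
                Object target = targetRef.get();
                if (target == null) {
                    return; // the monitored object is gone, so the monitor can stop
                }
                // ... inspect 'target' here ...
                target = null; // do not keep a strong reference while sleeping
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    return; // treat interruption as a request to stop
                }
            }
        }, "weak-monitor");
        monitor.setDaemon(true); // do not keep the JVM alive just for monitoring
        monitor.start();
        return monitor;
    }
}
```

The monitor stops at the first poll after the collector has cleared the reference, which may be some time after the object became unreachable.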

How is an object marked as finalized in Java (so that the finalize method wouldn't be called the second time)?

The main question is in the title, but let me describe my understanding of the finalization process in Java so that I can ask a little more.
The gc starts garbage collection by marking all live objects. Once all reachable objects are marked as "live", all other objects are unreachable. The next step is to check every unreachable object and determine whether it can be swept right now or should be finalized first. The gc reasons as follows: if the object's finalize method has a body, then the object is finalizable and should be finalized; if the object's finalize method has an empty body (protected void finalize(){ }), then it is not finalizable and can be swept by the gc right now. (Am I right about this?)
All finalizable objects are put in the same queue to be finalized later, one by one. As I understand it, a finalizable object can spend a lot of time in the queue while waiting for its turn to be finalized. This can happen because normally only one thread, called Finalizer, takes objects from the queue and calls their finalize methods, and when some object's finalize method performs time-consuming operations, the other objects in the queue wait a long time to be finalized. When an object has been finalized, it is marked as FINALIZED and removed from the queue. During the next garbage collection, the collector will see that this object is unreachable (again) and has a non-empty finalize method (again), so the object should be put in the queue (again), but it won't be, because the collector somehow sees that the object was marked as FINALIZED. (This is my main question: in what way is the object marked as FINALIZED, and how does the collector know that this object shouldn't be finalized again?)
As long as we are talking about the HotSpot JVM ...
The object itself IS NOT marked as finalized.
Each time you create a new finalizable object, the JVM creates an extra object, FinalizerRef (which is somewhat similar to Weak/Soft/Phantom references).
Once your object is proven to be unreachable via strong references, the special references to this object are processed. The FinalizerRef for your object will be added to the finalizer queue (which is a linked list, same as with other reference types).
When the finalizer thread consumes a FinalizerRef from the queue, it nulls its pointer to the object (though the thread keeps a strong reference to the object until the finalizer has finished).
Once the FinalizerRef is nullified, the object cannot get into the finalizer queue any more.
BTW
You can see reference processing times (and the number of references) in GC logs with -XX:+PrintReferenceGC (see more GC diagnostic JVM options)
The JVM stores metadata in the object header. For any object whose class overrides finalize(), the method is called, even if its body is empty. Placing the object in the queue doesn't take long, but it can wait in the queue for a long time.
I don't know exactly how the real finalization process is implemented, but if I had to do it, I'd do it this way: store a tri-state flag in the object metadata that tells the GC whether the object has just stopped being in use, needs its finalizer to be run, or may be removed. You'll probably have to check the Java source for details, but this should be the overall pattern:
(in new)
    object.metadata.is_finalized = NEEDS_FINALIZE;
(in gc)
    while ((object = findUnreachableObject()) != null) {
        if (object.metadata.is_finalized == NEEDS_FINALIZE) {
            if (hasNonNullBody(object.finalize)) {
                Finalizer.addForProcessing(object);
                object.metadata.is_finalized = IN_FINALIZER_QUEUE;
            } else {
                object.metadata.is_finalized = REMOVE_NOW;
            }
        }
        if (object.metadata.is_finalized == REMOVE_NOW) {
            // destroy the object and free the memory
        }
    }
(in Finalizer)
    while ((object = getObjectForProcessing()) != null) {
        object.finalize();
        object.metadata.is_finalized = REMOVE_NOW;
    }

ThreadLocal garbage collection

From the javadoc:
Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).
From that it seems that objects referenced by a ThreadLocal variable are garbage collected only when thread dies. But what if the ThreadLocal variable a is no longer referenced and is subject to garbage collection? Will an object referenced only by variable a be subject to garbage collection if the thread that holds a is still alive?
For example, there is the following class with a ThreadLocal variable:
public class Test {
    private static final ThreadLocal a = ...; // references object b
}
This class references some object and this object has no other references to it. Then during context undeploy application classloader becomes a subject for garbage collection, but thread is from a thread pool so it does not die. Will object b be subject for garbage collection?
TL;DR: You cannot count on the value of a ThreadLocal being garbage collected when the ThreadLocal object is no longer referenced. You have to call ThreadLocal.remove or cause the thread to terminate.
(Thanks to @Lii)
Detailed answer:
from that it seems that objects referenced by a ThreadLocal variable are garbage collected only when thread dies.
That is an over-simplification. What it actually says is two things:
The value of the variable won't be garbage collected while the thread is alive (hasn't terminated), AND the ThreadLocal object is strongly reachable.
The value will be subject to normal garbage collection rules when the thread terminates.
There is an important third case where the thread is still live but the ThreadLocal is no longer strongly reachable. That is not covered by the javadoc. Thus, the GC behaviour in that case is unspecified, and could potentially be different across different Java implementations.
In fact, for OpenJDK Java 6 through OpenJDK Java 8 (and other implementations derived from those code-bases) the actual behaviour is rather complicated. The values of a thread's thread-locals are held in a ThreadLocalMap object. The comments say this:
ThreadLocalMap is a customized hash map suitable only for maintaining thread local values. [...] To help deal with very large and long-lived usages, the hash table entries use WeakReferences for keys. However, since reference queues are not used, stale entries are guaranteed to be removed only when the table starts running out of space.
If you look at the code, stale map entries (with broken WeakReferences) may also be removed in other circumstances. If a stale entry is encountered in a get, set, insert or remove operation on the map, the corresponding value is nulled. In some cases, the code does a partial scan heuristic, but the only situation where we can guarantee that all stale map entries are removed is when the hash table is resized (grows).
So ...
Then during context undeploy application classloader becomes a subject for garbage collection, but thread is from a thread pool so it does not die. Will object b be subject for garbage collection?
The best we can say is that it may be ... depending on how the application manages other thread locals on the thread in question.
So yes, stale thread-local map entries could be a storage leak if you redeploy a webapp, unless the web container destroys and recreates all of the request threads in the thread pool. (You would hope that a web container would / could do that, but AFAIK it is not specified.)
The other alternative is to have your webapp's Servlets always clean up after themselves by calling ThreadLocal.remove on each one on completion (successful or otherwise) of each request.
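One way to sketch the "always clean up, successful or otherwise" idea outside of any particular web framework is a small wrapper around the unit of work. The class name ThreadLocalCleanup and the method cleaningUp are hypothetical; in a servlet container the same try/finally would typically sit in a request filter or in the servlet itself.

```java
// Hypothetical helper; names are illustrative and not part of any library.
final class ThreadLocalCleanup {
    // Wraps a task so that the given thread-locals are removed on the executing
    // thread when the task finishes, whether it returns normally or throws.
    static Runnable cleaningUp(Runnable task, ThreadLocal<?>... locals) {
        return () -> {
            try {
                task.run();
            } finally {
                for (ThreadLocal<?> local : locals) {
                    local.remove();
                }
            }
        };
    }
}
```

Because the cleanup sits in a finally block, a task that throws still leaves no value behind on the pooled thread.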
ThreadLocal variables are held in Thread:
ThreadLocal.ThreadLocalMap threadLocals;
This field is initialized lazily on the first ThreadLocal.set/get invocation in the current thread, and the Thread holds the reference to the map as long as it is alive. However, ThreadLocalMap uses WeakReferences for keys, so its entries may be removed when the ThreadLocal is no longer referenced from anywhere else. See the ThreadLocal.ThreadLocalMap javadoc for details.
If the ThreadLocal itself is collected because it's not accessible anymore (there's an "and" in the quote), then all its content can eventually be collected, depending on whether it's also referenced somewhere else and whether other ThreadLocal manipulations happen on the same thread, triggering the removal of stale entries (see for example the replaceStaleEntry or expungeStaleEntry methods in ThreadLocalMap). The ThreadLocal is not (strongly) referenced by the threads: each thread's map holds it only through a weak key. Conceptually, you can think of ThreadLocal<T> as a WeakHashMap<Thread, T>, although in OpenJDK the map actually lives in the Thread and is keyed by the ThreadLocal.
In your example, if the classloader is collected, it will unload the Test class as well (unless you have a memory leak), and the ThreadLocal a will be collected.
Each Thread contains a reference to a ThreadLocalMap, a map with weak keys that holds the key-value pairs.
It depends. It will not be garbage collected if you are referencing it statically or from a singleton and your class is not unloaded. That is why, in an application server environment, when using ThreadLocal values you have to use a listener or request filter to be sure that you dereference all thread-local variables at the end of request processing. Alternatively, use the request-scope functionality of your framework.
You can look here for some other explanations.
EDIT: In the context of a thread pool as asked, of course if the Thread is garbage collected, its thread locals are too.
Object b will not be subject to garbage collection if it somehow refers to your Test class. This can happen without your intention, for example if you have code like this:
import java.util.HashSet;
import java.util.Set;

public class Test {
    private static final ThreadLocal<Set<Integer>> a =
        new ThreadLocal<Set<Integer>>() {
            @Override public Set<Integer> initialValue() {
                // Double brace initialization: creates an anonymous HashSet subclass.
                return new HashSet<Integer>() {{ add(5); }};
            }
        };
}
The double brace initialization {{add(5);}} creates an anonymous class which refers to your Test class, so this object will never be garbage collected even if you no longer have a reference to your Test class. If that Test class is used in a web app, then it will refer to its class loader, which will prevent all the other classes from being GCed.
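As a minimal sketch of how to avoid that particular trap (assuming Java 8 or later for ThreadLocal.withInitial), the value can be built as a plain java.util.HashSet instead of an anonymous subclass. The class name PlainValueThreadLocal is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical variant of the Test class above; the name is illustrative only.
final class PlainValueThreadLocal {
    // The per-thread value is an ordinary java.util.HashSet, so the value object
    // itself is not an instance of a class defined by the application.
    static final ThreadLocal<Set<Integer>> A = ThreadLocal.withInitial(() -> {
        Set<Integer> set = new HashSet<>();
        set.add(5);
        return set;
    });
}
```

This only removes the hidden reference carried by the anonymous subclass. The value still stays in the thread's map until remove() is called or the stale entry is expunged.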
Moreover, even if your b object is a simple object, it will not immediately become subject to GC. Object b is only guaranteed to become subject to GC when the ThreadLocal.ThreadLocalMap in the Thread class is resized.
However I created a solution for this problem so when you redeploy your web app you will never have class loader leaks.

Runnable is not garbage collected in java

I have seen many Java thread examples in which Runnable objects are created as tasks and passed to a thread.
As there is no reference to these tasks, why is the task not garbage collected by Java?
Or is it garbage collected, and I am asking the wrong question here?
Please share your valuable thoughts.
The fact that you don't have an explicit reference to an object doesn't mean that an internal JVM object doesn't hold one to it.
Take an example:
frame.add(new JButton("foobar"));
There is no reference to it from the developer's point of view, but internally the frame has a list of components. This is what happens with threads: the internal scheduler must keep a reference to them.
The thread itself will be garbage collected only once it is released from the scheduler (so that no reference to it effectively exists anymore).
