Clear ThreadLocal value while returning to ThreadPool - java

When a thread from a thread pool is reused, its thread-local variables still hold the values from the thread's last execution.
I understand that a thread-local is part of the Thread, so it gets reused along with the thread when we use a thread pool. But my problem is that I do not want to see the thread-local value set during the last execution (a valid use case for many people).
Is there a better way of clearing thread-local values when a thread goes back to the pool after the current execution?

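There is no public API for clearing all of a thread's ThreadLocals at once. Nevertheless, ThreadPoolExecutor has a hook method, beforeExecute, in which we can clear the thread's ThreadLocals using reflection before each task runs. (On newer JDKs that strongly encapsulate internals, reflective access to these Thread fields also requires opening java.lang, for example with --add-opens.)
public class ThreadPoolExecutor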
protected void beforeExecute(Thread t, Runnable r) {
... set t.threadLocals and t.inheritableThreadLocals fields to null using reflection
}
...
public class Thread
...
ThreadLocal.ThreadLocalMap threadLocals;
ThreadLocal.ThreadLocalMap inheritableThreadLocals;
...
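If the thread-locals that need resetting are known, the same hook mechanism can be used without reflection. The following is a minimal sketch under that assumption: `CleaningExecutor` and `CONTEXT` are hypothetical names, not part of the JDK. `afterExecute` is invoked by the worker thread that ran the task, so calling `remove()` there clears that thread's value before it picks up the next task. It does not touch thread-locals owned by other code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical executor that resets one known ThreadLocal after every task.
class CleaningExecutor extends ThreadPoolExecutor {
    // Hypothetical per-task state; not part of any library.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    CleaningExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        CONTEXT.remove(); // runs in the worker thread that executed r
    }

    // Runs two tasks on a single worker and returns what the second task sees.
    static String valueSeenBySecondTask() {
        CleaningExecutor pool = new CleaningExecutor(1);
        try {
            Runnable write = () -> CONTEXT.set("left over");
            Callable<String> read = CONTEXT::get;
            pool.submit(write).get();
            return pool.submit(read).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenBySecondTask());
    }
}
```

With a plain ThreadPoolExecutor of one thread, the second task would read the value left behind by the first.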

You can clear the thread local map for a thread using reflection. This will clear all thread locals no matter where they are.
Ideally, you should write your code so that your thread locals do not carry state across executions like this. Such state makes unit testing much harder.

It should be the responsibility of the "thread user" to clear the ThreadLocal - that is, the code that put the ThreadLocal in place shall also clear it. Use e.g. a try-finally-block to make sure this happens.
However, you can hack what you want! Seriously, it is not difficult - you just introspect your way into the Thread object/class, find the Map where the ThreadLocals reside, and clear it.
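The try-finally advice in this answer can be sketched as follows. `RequestTask`, `CURRENT_USER` and `handle()` are hypothetical names used only for illustration; the point is that the code that sets the thread-local is also the code that removes it.

```java
// Hypothetical task: the code that sets the value also removes it.
class RequestTask implements Runnable {
    // Hypothetical thread-local holding the current user name.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    private final String user;

    RequestTask(String user) {
        this.user = user;
    }

    @Override
    public void run() {
        CURRENT_USER.set(user);
        try {
            handle();
        } finally {
            CURRENT_USER.remove(); // always runs, even if handle() throws
        }
    }

    private void handle() {
        // Placeholder for real work that reads CURRENT_USER.get().
        if (CURRENT_USER.get() == null) {
            throw new IllegalStateException("user not set");
        }
    }
}
```

Because the cleanup sits in `finally`, a pooled thread that runs this task returns to the pool without the value, whether the task succeeded or failed.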

Related

Java How to uniquely identify a Runnable being executed in a Thread

I want to find a way to identify a Runnable instance during execution. Basically I am creating a temporary cache that is accessible by the thread via ThreadLocal, but having it tied to a Thread is not enough as my application is using thread pooling so the same thread will be reused over and over. Since the Runnable that is passed into the thread will not be reused, I wanted to find a way to get to the Runnable so that I can have a way to identify the same runnable during execution. It's going to be used as a key to a Map so even just the return from a toString() would be adequate.
I am not creating the thread pool and the threads are created from multiple poolers so I'd rather not try to augment the Thread / Runnable creation process if possible.
I can't seem to find a way to get to anything useful from Thread.currentThread(), but using that would be preferred, if it's possible.
If you want to use Thread.currentThread() you can inspect the thread's stack trace, which will allow you to determine what's running. If you subclass Runnable for each task you can easily determine which runnable is executing. Otherwise you can inspect deeper in the stack (i.e. whatever code the runnable is calling) to heuristically determine what is being executed.
Use a Map whose values are the Runnables and whose keys are their System.identityHashCode() values. Not perfect (identity hash codes are not guaranteed to be unique), but as good as you will get.
Or else make each Runnable have its own UUID attribute and use that as a key. Now that's perfect. But more expensive.
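The UUID suggestion might look like the sketch below. `IdentifiedTask`, `CACHE` and `execute()` are hypothetical names, and the approach assumes you are able to control the Runnable classes, which the question would prefer to avoid.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical base class: every task instance carries its own unique id.
abstract class IdentifiedTask implements Runnable {
    // Hypothetical temporary cache, keyed by task id rather than by thread.
    static final Map<UUID, String> CACHE = new ConcurrentHashMap<>();

    private final UUID id = UUID.randomUUID();

    UUID id() {
        return id;
    }

    @Override
    public final void run() {
        try {
            execute();
        } finally {
            CACHE.remove(id); // drop this task's temporary entry
        }
    }

    // Subclasses do their work here and may use CACHE with id() as the key.
    protected abstract void execute();
}
```

Keying by task id means a reused pool thread never sees another task's entry, and the `finally` block keeps the map from growing.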

Do I need extra synchronization when using a BlockingQueue?

I have a simple bean, an @Entity named Message (Message.java), that has some normal properties. The life-cycle of that object is as follows:
Instantiation of Message happens on Thread A, which is then enqueued into a blockingQueue
Another thread from a pool obtains that object, does some work with it, and changes the state of the Message; after that, the object goes back into the blockingQueue. This step is repeated until a condition makes it stop. Each read/write of the object potentially happens on a different thread, but with the guarantee that only one thread at a time will be reading or writing it.
Given these circumstances, do I need to synchronize the getters/setters? Perhaps make the properties volatile? Or can I just leave them without synchronization?
Thanks and hope I could clarify what I am having here.
No, you do not need to synchronize access to the object properties, or even use volatile on the member variables.
All actions performed by a thread before it queues an object on a BlockingQueue "happen-before" the object is dequeued. That means that any changes made by the first thread are visible to the second. This is common behavior for concurrent collections. See the last paragraph of the BlockingQueue class documentation:
Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a BlockingQueue happen-before actions subsequent to the access or removal of that element from the BlockingQueue in another thread.
As long as the first thread doesn't make any modifications after queueing the object, it will be safe.
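A small sketch of the hand-off described above, assuming a simplified stand-in for the Message bean with a single hypothetical `hops` property and no volatile or synchronized anywhere. The only coordination is the pair of queues.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the Message bean: plain field, no volatile.
class Message {
    private int hops;

    int getHops() {
        return hops;
    }

    void setHops(int hops) {
        this.hops = hops;
    }
}

class HandOffDemo {
    // Passes one Message from the calling thread to a worker and back via queues.
    static int roundTrip() {
        BlockingQueue<Message> toWorker = new LinkedBlockingQueue<>();
        BlockingQueue<Message> toCaller = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            try {
                Message m = toWorker.take();   // sees the write made before put()
                m.setHops(m.getHops() + 1);
                toCaller.put(m);               // publishes the update
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            Message m = new Message();
            m.setHops(1);
            toWorker.put(m);                   // do not modify m after this point
            return toCaller.take().getHops();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The guarantee only covers writes made before `put()`; a thread that keeps modifying the object after enqueueing it is outside the happens-before edge.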
You don't need to do synchronization yourself, because the queue does it for you already.
Visibility is also guaranteed.
If you're sure that only one thread at a time will access your object, then you don't need synchronisation.
However, you can enforce that by using the synchronized keyword: each time you want to access this object and be sure that no other thread is using the same instance, wrap your code in a synchronized block:
Message myMessage = // ...
synchronized (myMessage) {
// You're the only one to have access to this instance, do what you want
}
The synchronized block will acquire an implicit lock on the myMessage object. So, no other synchronized block will have access to the same instance until you leave this block.
It sounds like you could leave synchronized off the methods. The synchronized keyword simply locks the object so that only a single thread can access it at a time. You've already handled that with the blocking queue.
Volatile would be good to use, as that would ensure that each thread has the latest version, instead of a thread local cache value.

Java starting a thread pool in objects constructor

Is it safe to start a thread pool in an object's constructor? I know that you shouldn't start a thread from a constructor, something about the "this" pointer escaping (I don't exactly understand this, but will do some more searches to try and figure it out).
The code would look something like this:
private ExecutorService pool;
public handler()
{
pool = Executors.newCachedThreadPool();
}
public void queueInstructionSet(InstructionSet set)
{
pool.submit(new Runnable that handles this instruction set);
}
If that doesn't work, i could just create this class as a Runnable and start it in a new thread. However, that seems like it would be adding an unnecessary thread to the program where it doesn't really need one.
Thanks.
EDIT:
Thanks for the replies everyone, they definitely helped make sense of this.
As per the code, in my mind it makes sense that this constructor creates the thread pool, but let me explain what specifically this code is doing, because i may be thinking about this in a weird way.
The entire point of this object is to take "Instruction Sets" objects, and act on them accordingly. The instruction sets come from clients connected to a server. Once a full instruction set is received from a client, that instruction set is sent to this object (handler) for processing.
This handler object contains a reference to every object that an instruction set can act upon. It will submit the instruction set to a thread pool, which will find which object this instruction set wants to interact with, and then handle the instruction set on that object.
I could handle the instruction set object in the IO server, but my thoughts are having a separate class for it makes the entire code more readable, as each class is focusing on doing only one specific thing.
Thoughts? Advice?
Thanks
Your sample code doesn't let "this" escape at all. It's reasonably safe to start a new thread in a constructor (and even use this as the Runnable, which you don't in this example) so long as you're sure that you've already initialized the object as far as the new thread will need it. For example, setting a final field which the new thread will rely on after starting the thread would be a really bad idea :)
Basically letting the "this" reference escape is generally nasty, but not universally so. There are situations in which it's safe. Just be careful.
Having said that, making a constructor start a thread might be seen as doing too much within the constructor. It's hard to say whether it's appropriate in this case or not - we don't know enough about what your code is doing.
EDIT: Yup, having read the extra information, I think this is okay. You should probably have a method to shut down the thread pool as well.
I agree with Jon.
Furthermore, let me point that you're not actually starting any actions on the thread pool in the constructor. You're instantiating the thread pool, but it has no tasks to run at that point. Therefore, as written, you're not going to have something start operating on this instance before it finishes construction.
It sounds like the thread pool would be owned and used by the object; threads wouldn't be passed out of the object. If that's the case, it shouldn't be an issue.
Constructors create an object and initialize its state. I can't imagine a use case where long-running processes are required to do so.
I can see where an object might interact with a thread pool to accomplish a task, but I don't see the need for that object to own the thread pool.
More details might help.
I think it's OK to start a thread pool in the constructor of the object as long as that object fully manages the lifetime of that thread pool.
If you go this path, you will have to work extra hard to provide the following guarantees:
If your constructor throws any exception (either runtime or checked), you must have cleanup code in the constructor that shuts down the thread pool. If you don't do this and create a thread pool with non-daemon threads, then, for example, a little console program that uses your object may stay up forever, leaking valuable system resources.
You need to provide something that I call a destructor method, similar to close in Java I/O. I usually call it releaseResources. Notice that finalize is not a substitute for this method, because it is called by the GC, and for an object with a reasonably small memory footprint it may never be called.
When using this object, follow this pattern:
MyThreadPoolContainer container =
new MyThreadPoolContainer( ... args to initialize the object... );
try
{
methodThatUsesContainer( container );
}
finally
{
container.releaseResources( );
}
Document that object constructor allocates limited resources and the destructor method has to be called explicitly to prevent their leakage.
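The guarantees listed in this answer could be sketched like this. The class body is an assumption: the `name` argument and its validation are hypothetical stand-ins for initialization that may fail, and implementing `AutoCloseable` is an optional variation that the answer does not mention.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical container that owns its pool for its whole lifetime.
class MyThreadPoolContainer implements AutoCloseable {
    private final ExecutorService pool;

    MyThreadPoolContainer(String name) {
        pool = Executors.newCachedThreadPool();
        try {
            // Hypothetical validation standing in for initialization that may fail.
            if (name == null || name.isEmpty()) {
                throw new IllegalArgumentException("name must not be empty");
            }
        } catch (RuntimeException e) {
            pool.shutdown(); // do not leak threads when construction fails
            throw e;
        }
    }

    void submit(Runnable task) {
        pool.submit(task);
    }

    boolean isReleased() {
        return pool.isShutdown();
    }

    // The "destructor method" from the answer.
    void releaseResources() {
        pool.shutdown();
    }

    @Override
    public void close() {
        releaseResources();
    }
}
```

With `AutoCloseable` in place, a try-with-resources statement can replace the explicit try/finally, although the explicit `releaseResources()` call works just as well.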

How to re-use a thread in Java?

I am building a console Sudoku solver where the main objective is raw speed.
I now have a ManagerThread that starts WorkerThreads to compute the neighbors of each cell. So one WorkerThread is started for each cell right now. How can I re-use an existing thread that has completed its work?
The Thread Pool Pattern seems to be the solution, but I don't understand what to do to prevent the thread from dying once its job has been completed.
ps : I do not expect to gain much performance for this particular task, just want to experiment how multi-threading works before applying it to the more complex parts of the code.
Thanks
Have a look at the Java SE provided java.util.concurrent API. You can create a threadpool using Executors#newFixedThreadPool() and you can submit tasks using the ExecutorService methods. No need to reinvent your own threadpool. Also see the Sun tutorial on the subject.
When using a thread pool (java.util.concurrent), you never actually create a thread yourself; you pass Runnables to the thread pool instead.
You don't need to worry about the thread life-cycle; just do whatever work you need to do in the Runnable and let it return when it's done.
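A short sketch of what these two answers describe, using `Executors.newFixedThreadPool`. The class and method names (`PoolReuseDemo`, `threadsUsed`) are hypothetical; the method records which threads ran the tasks so the reuse becomes visible.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolReuseDemo {
    // Runs taskCount small tasks on poolSize threads; returns the thread names used.
    static Set<String> threadsUsed(int poolSize, int taskCount) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            // Each task just records which pool thread ran it.
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // stop accepting work; already queued tasks still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

Calling `threadsUsed(2, 20)` runs twenty tasks but never reports more than two distinct thread names, because the pool keeps its threads alive between tasks.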
Have a look into using CyclicBarrier synchro: http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/CyclicBarrier.html
Well, if I had to code this logic my self instead of using a package like Quartz from OpenSymphony, I would do the following:
I'd have a WorkerThread which extends Thread. This class will also have private property called runnable which is Runnable. This property will hold a reference to the code you'd like to execute. Have a public setter for it.
The thread's main code will start by running the runnable you initialized it with and then switch to a wait state. Before doing that, it will signal to the pool manager that it has finished and can be returned to the pool. The next time you need a thread, you pick one from the pool, call setRunnable, which sets the runnable property, and then wake up the thread. It will go back to work inside its infinite loop: execute the runnable, then go back to the wait state.
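For experimentation only, the hand-written worker described above might be sketched as follows. This is a simplified assumption of the design: there is no pool manager, the worker holds at most one pending task, and `stopWorker` and `WorkerThreadDemo` are hypothetical additions. A private lock object is used for wait/notify instead of the Thread object itself.

```java
// Hypothetical hand-written reusable worker, loosely following the description.
class WorkerThread extends Thread {
    private final Object lock = new Object();
    private Runnable runnable;  // next task to run; guarded by lock
    private boolean stopping;   // guarded by lock

    // Hands a task to the worker and wakes it; blocks while a task is still pending.
    void setRunnable(Runnable r) throws InterruptedException {
        synchronized (lock) {
            while (runnable != null) {
                lock.wait();
            }
            runnable = r;
            lock.notifyAll();
        }
    }

    // Lets the worker exit once any pending task has run.
    void stopWorker() {
        synchronized (lock) {
            stopping = true;
            lock.notifyAll();
        }
    }

    @Override
    public void run() {
        while (true) {
            Runnable task;
            synchronized (lock) {
                while (runnable == null && !stopping) {
                    try {
                        lock.wait(); // idle until a task arrives
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                if (runnable == null) {
                    return; // stopping and nothing left to do
                }
                task = runnable;
                runnable = null;
                lock.notifyAll(); // the slot is free again
            }
            task.run(); // run outside the lock
        }
    }
}

class WorkerThreadDemo {
    // Runs two tasks on one worker and reports whether both ran on that thread.
    static boolean sameThreadRanBoth() {
        Thread[] seen = new Thread[2];
        WorkerThread worker = new WorkerThread();
        worker.start();
        try {
            worker.setRunnable(() -> seen[0] = Thread.currentThread());
            worker.setRunnable(() -> seen[1] = Thread.currentThread());
            worker.stopWorker();
            worker.join(); // after join, the worker's writes to seen are visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0] == worker && seen[1] == worker;
    }
}
```

In production code an ExecutorService covers the same ground with far less room for error, as the earlier answers point out.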

Why is java.lang.ThreadLocal a map on Thread instead on the ThreadLocal?

Naively, I expected a ThreadLocal to be some kind of WeakHashMap of Thread to the value type. So I was a little puzzled when I learned that the values of a ThreadLocal are actually saved in a map in the Thread. Why was it done that way? I would expect that the resource leaks associated with ThreadLocal would not be there if the values were saved in the ThreadLocal itself.
Clarification: I was thinking of something like
public class AlternativeThreadLocal<T> {
private final Map<Thread, T> values =
Collections.synchronizedMap(new WeakHashMap<Thread, T>());
public void set(T value) { values.put(Thread.currentThread(), value); }
public T get() { return values.get(Thread.currentThread());}
}
As far as I can see, this would prevent the weird problem that neither the ThreadLocal nor its leftover values could ever be garbage-collected until the Thread dies if the value somehow strongly references the ThreadLocal itself.
(Probably the most devious form of this occurs when the ThreadLocal is a static variable on a class the value references. Now you have a big resource leak on redeployments in application servers since neither the objects nor their classes can be collected.)
Sometimes you get enlightened by just asking a question. :-) Now I just saw one possible answer: thread-safety. If the map with the values is in the Thread object, the insertion of a new value is trivially thread-safe. If the map is on the ThreadLocal you have the usual concurrency issues, which could slow things down. (Of course you could use a ReadWriteLock instead of synchronized, but the problem remains.)
You seem to be misunderstanding the problem of ThreadLocal leaks. ThreadLocal leaks occur when the same thread is used repeatedly, such as in a thread pool, and the ThreadLocal state is not cleared between usages. They're not a consequence of the ThreadLocal remaining when the Thread is destroyed, because nothing references the ThreadLocal Map aside from the thread itself.
Having a weakly referenced map of Thread to thread-local objects would not prevent the ThreadLocal leak problem because the thread still exists in the thread pool, so the thread-local objects are not eligible for collection when the thread is reused from the pool. You'd still need to manually clear the ThreadLocal to avoid the leak.
As you said in your answer, concurrency control is simplified with the ThreadLocal Map being a single instance per thread. It also makes it impossible for one thread to access another's thread local objects, which might not be the case if the ThreadLocal object exposed an API on the Map you suggest.
I remember some years ago Sun changed the implementation of thread locals to its current form. I don't remember what version it was and what the old impl was like.
Anyway, for a variable that each thread should have a slot for, Thread is the natural container of choice. If we could, we would also add our thread local variable directly as a member of Thread class.
Why would the Map be on ThreadLocal? That doesn't make a lot of sense. So it'd be a Map of ThreadLocals to objects inside a ThreadLocal?
The simple reason it's a Map of Threads to Objects is because:
It's an implementation detail ie that Map isn't exposed in any way;
It's always easy to figure out the current thread (with Thread.currentThread()).
Also the idea is that a ThreadLocal can store a different value for each Thread that uses it so it makes sense that it is based on Thread, doesn't it?
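The per-thread behaviour these answers describe can be observed with a short sketch. `PerThreadValueDemo` and `COUNTER` are hypothetical names; `ThreadLocal.withInitial` gives every thread its own lazily created value.

```java
class PerThreadValueDemo {
    // Each thread that calls get() lazily receives its own one-element counter.
    static final ThreadLocal<int[]> COUNTER = ThreadLocal.withInitial(() -> new int[1]);

    // Increments the counter on another thread, then returns the caller's own value.
    static int callerValueAfterOtherThreadIncrements() {
        Thread other = new Thread(() -> COUNTER.get()[0] += 5);
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return COUNTER.get()[0]; // the other thread changed only its own copy
    }
}
```

The single static `COUNTER` object is shared, but the values live with each thread, which is why the other thread's increment is invisible to the caller.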
