How to get a reference to a thread, and verification on ThreadLocal

I'm looking for verification on the following use of ThreadLocal.
I have a service, say ServiceA running on a set of processes, say processSetX in the system.
Which processSetX will be on ServiceA isn't known until runtime and may vary.
The processes in processSetX may run on different threads.
ServiceA has to recognize all processes in processSetX the same way.
For this, I'm supposed to write an ID value, say, of type String, to thread local storage (TLS) of a new thread and read this value later on when needed.
So, the ID of the first thread invoking ServiceA will be this ID for ServiceA to recognize them all. When this first thread starts another thread, it'll go onto this new thread's TLS and write this ID. From there on, every thread in this chain will pass this ID to the new one.
I'm looking to verify ThreadLocal is the way to work this.
I haven't used it before - I want to make sure.
TIA.
//==================
EDIT:
Is there a way to get the calling thread's reference?
E.g.:
A thread, say threadX, is making a call to, say, methodA(). Is there a way for methodA() to know "who" is calling it?
If so, methodA() would be able to invoke a getter method of threadX to read a value from its thread-local storage.
TIA
//=================
EDIT-2:
Thread.currentThread() returns something like Thread[main,5,main]. This may collide across threads.

I think at first, you just need a normal member variable.
For example:
// Thread
public class CalledThread extends Thread {
    public String myId;

    @Override
    public void run() {
        // ....
    }
}

// Caller
CalledThread t = new CalledThread();
t.myId = "the thread ID";
t.start();
However, you won't be able to access myId once you start calling your service classes, so for that you could use a ThreadLocal.
In CalledThread, assign myId to your ThreadLocal in run():
threadLocal.set(myId);
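
A minimal sketch of how the pieces above could fit together. The names RequestId, ServiceA.methodA() and ThreadIdDemo are hypothetical and only serve the illustration. It also touches on the question's edits: code inside methodA() executes on the calling thread, so Thread.currentThread() there is the caller, and a ThreadLocal read there sees the caller's value.

```java
// Hypothetical holder for the per-thread ID; the name is illustrative only.
final class RequestId {
    private static final ThreadLocal<String> ID = new ThreadLocal<String>();

    static void set(String id) { ID.set(id); }
    static String get() { return ID.get(); }
    static void clear() { ID.remove(); }
}

// Stand-in for the service; it never receives the ID as a parameter.
final class ServiceA {
    static String methodA() {
        // Runs on the caller's thread, so this reads the caller's value.
        return Thread.currentThread().getName() + " -> " + RequestId.get();
    }
}

class CalledThread extends Thread {
    private final String myId;   // handed over by the parent thread
    volatile String seen;        // what methodA() reported, kept for inspection

    CalledThread(String name, String myId) {
        super(name);
        this.myId = myId;
    }

    @Override
    public void run() {
        RequestId.set(myId);     // copy the field into this thread's own storage
        try {
            seen = ServiceA.methodA();
        } finally {
            RequestId.clear();
        }
    }
}

class ThreadIdDemo {
    public static void main(String[] args) throws InterruptedException {
        RequestId.set("chain-42");   // the first thread picks the ID
        CalledThread child = new CalledThread("worker-1", RequestId.get());
        child.start();
        child.join();
        System.out.println(child.seen);   // worker-1 -> chain-42
        RequestId.clear();
    }
}
```

Thread[main,5,main] is only the text produced by Thread.toString() (name, priority, group). If a distinguishing value is needed, Thread.getId() is unique among live threads. If every child should pick up the parent's value without explicit hand-over, java.lang.InheritableThreadLocal copies the parent's value when the child Thread object is created; pooled threads are created once, so that inheritance does not follow whoever submits a task later.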

Related

Questions about using ThreadLocal in a Spring singleton scoped service

In my singleton scoped service class below, all methods in the class require some user context that is known when Service.doA() is called. Instead of passing around info across methods, I was thinking about storing those values in ThreadLocal. I have two questions about this approach:
1) Does the implementation below use ThreadLocal correctly? That is, it is thread-safe and the correct values will be read/written into the ThreadLocal?
2) Does ThreadLocal userInfo need to be cleaned up explicitly to prevent any memory leaks? Will it be garbage collected?
@Service
public class Service {
    private static final ThreadLocal<UserInfo> userInfo = new ThreadLocal<>();

    public void doA() {
        // finds user info
        userInfo.set(new UserInfo(userId, name));
        doB();
        doC();
    }

    private void doB() {
        // needs user info
        UserInfo userInfo = userInfo.get();
    }

    private void doC() {
        // needs user info
        UserInfo userInfo = userInfo.get();
    }
}

1) The example code is ok, except for the name clashes in doB and doC where you're using the same name for the static variable referencing the ThreadLocal as you are for the local variable holding what you pull from the ThreadLocal.
2) The object you store in the ThreadLocal stays attached to that thread until explicitly removed. If your service executes in a servlet container, for instance, the thread returns to the pool when a request completes. If you haven't cleaned up the thread's ThreadLocal contents, that data will stick around to accompany whatever request the thread gets allocated for next. Each thread is a GC root, so thread-local values attached to the thread won't be garbage-collected until after the thread dies. According to the API doc:
Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).
If your context information is limited to the scope of one service, you're better off passing the information around through parameters rather than using ThreadLocal. ThreadLocal is for cases where information needs to be available across different services or in different layers; if it will be used by only one service, you're only overcomplicating your code. Now, if you have data that would be used by AOP advice on different disparate objects, putting that data in a ThreadLocal could be a valid usage.
To perform the clean-up, you would typically identify a point where the thread is done with the current processing, for instance in a servlet filter, where the ThreadLocal value can be removed before the thread is returned to the thread pool. You wouldn't use a try-finally block because the place where you insert the ThreadLocal object is nowhere near where you are cleaning it up.
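
To make the "data will stick around" point concrete, here is a small sketch that uses a plain one-thread pool as a stand-in for a servlet container's worker pool. The class and method names (StaleThreadLocalDemo, secondRequestSees) are made up for the illustration, and the tasks only mimic requests.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StaleThreadLocalDemo {
    private static final ThreadLocal<String> USER = new ThreadLocal<String>();

    // Runs two "requests" back to back on a pool with a single worker thread
    // and returns what the second request finds in the ThreadLocal.
    static String secondRequestSees(final boolean cleanUp) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // First request: stores its user.
            pool.submit(new Callable<Void>() {
                public Void call() {
                    USER.set("alice");
                    // ... request processing would happen here ...
                    if (cleanUp) {
                        // End of the request: the point where a filter would clear the value.
                        USER.remove();
                    }
                    return null;
                }
            }).get();

            // Second request: never sets a user, it only reads.
            return pool.submit(new Callable<String>() {
                public String call() {
                    return USER.get();
                }
            }).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(secondRequestSees(false)); // alice (left over from the first request)
        System.out.println(secondRequestSees(true));  // null
    }
}
```

Both tasks run on the same pooled thread, which is why the second one can observe the first one's value when nothing removes it.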

When you use a ThreadLocal you need to make sure that you clean it up whatever happens because:
It can cause a memory leak, because the value cannot be collected by the GC: an object is eligible for collection only when no other object holds a hard reference to it, directly or indirectly. Here, the thread holds an indirect hard reference to your value through its internal ThreadLocalMap, and the way to get rid of that reference is to call ThreadLocal#remove(), which removes the value from the current thread's ThreadLocalMap. The other potential way to make your value eligible for the GC would be for the ThreadLocal instance itself to become eligible for the GC, but here it is a constant in the class Service, so it will never be eligible, which is what we want in your case. So the only expected way is to call ThreadLocal#remove().
It creates hard-to-find bugs, because most of the time the thread that uses your ThreadLocal is part of a thread pool and will be reused for another request. If your ThreadLocal has not been cleaned up properly, the thread will reuse the object stored in it, which may have nothing to do with the new request. Here, for example, we could get the result of a different user just because the ThreadLocal has not been cleaned up.
So the pattern is the following:
try {
    userInfo.set(new UserInfo(userId, name));
    // Some code here
} finally {
    // Clean up your thread local whatever happens
    userInfo.remove();
}
About thread safety: it is of course thread safe even if UserInfo is not, because each thread uses its own instance of UserInfo. No instance of UserInfo stored in a ThreadLocal will be accessed or modified by several threads, because a ThreadLocal value is scoped to the current thread.
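
The isolation described above can be checked with a short sketch. StringBuilder stands in for a non-thread-safe UserInfo, and the class name PerThreadValueDemo is hypothetical.

```java
import java.util.concurrent.CountDownLatch;

class PerThreadValueDemo {
    // StringBuilder is not thread safe; each thread lazily gets its own instance.
    private static final ThreadLocal<StringBuilder> BUFFER = new ThreadLocal<StringBuilder>() {
        @Override
        protected StringBuilder initialValue() {
            return new StringBuilder();
        }
    };

    static String[] runTwoThreads() throws InterruptedException {
        final String[] results = new String[2];
        final CountDownLatch bothWritten = new CountDownLatch(2);
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int index = i;
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    try {
                        BUFFER.get().append("user-").append(index);
                        bothWritten.countDown();
                        bothWritten.await();   // wait until the other thread has written too
                        results[index] = BUFFER.get().toString();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        BUFFER.remove();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] results = runTwoThreads();
        System.out.println(results[0]); // user-0
        System.out.println(results[1]); // user-1
    }
}
```

Each thread reads back only what it appended itself, even though both writes have happened by the time either thread reads.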

It definitely should be cleaned up after use. ThreadLocals leak memory extremely easily, both heap memory and permgen/metaspace memory via leaking classloaders. In your case the best way would be:
public void doA() {
    // finds user info
    userInfo.set(new UserInfo(userId, name));
    try {
        doB();
        doC();
    } finally {
        userInfo.remove();
    }
}

Is public synchronized void run() a bad idea?

I have a class that extends Thread that downloads files. I want to ensure that only one download is occurring at once, so I have a static reference to the class, and check to see if it is null before creating a new reference. However, occasionally I notice that another instance of this class is created, and is therefore downloading on a different thread. I'm trying to figure out what could cause this. Would it be a bad idea in general to mark the run() method of the Thread (or the method that calls start()) as synchronized? Are there any side effects to be aware of?

You need to ensure that only a single instance of that object gets created in the lifetime of the JVM. The well-known singleton pattern ensures this.
Make the constructor private and provide a static factory method to create the instance.
Example:
public class Downloader {
    private static volatile Downloader iDownloader = null;

    private Downloader() {
    }

    public static Downloader createDownloader() {
        if (iDownloader == null) {
            synchronized (Downloader.class) {
                if (iDownloader == null)
                    iDownloader = new Downloader();
            }
        }
        return iDownloader;
    }
}

If you want to limit the number of downloads running at any time, you should use a semaphore. That way you can scale the number of downloads without needing a synchronized run(), and if you later need two downloads to run at once, you just increase the semaphore's permit count.
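
A rough sketch of the semaphore idea, using java.util.concurrent.Semaphore. DownloadLimiter and its bookkeeping fields are hypothetical, and the "download" is just a short sleep so the effect can be observed.

```java
import java.util.concurrent.Semaphore;

class DownloadLimiter {
    private final Semaphore permits;
    private int running;      // guarded by "this"
    private int maxRunning;   // highest number of simultaneous downloads seen

    DownloadLimiter(int maxConcurrent) {
        permits = new Semaphore(maxConcurrent);
    }

    // Blocks until a permit is free, runs the transfer, and always returns the permit.
    void download(Runnable transfer) throws InterruptedException {
        permits.acquire();
        try {
            enter();
            try {
                transfer.run();
            } finally {
                exit();
            }
        } finally {
            permits.release();
        }
    }

    private synchronized void enter() {
        running++;
        maxRunning = Math.max(maxRunning, running);
    }

    private synchronized void exit() {
        running--;
    }

    synchronized int maxRunning() {
        return maxRunning;
    }
}

class DownloadLimiterDemo {
    static int maxRunningWith(int permits, int downloads) throws InterruptedException {
        final DownloadLimiter limiter = new DownloadLimiter(permits);
        Thread[] threads = new Thread[downloads];
        for (int i = 0; i < downloads; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    try {
                        limiter.download(new Runnable() {
                            public void run() {
                                try {
                                    Thread.sleep(20);   // pretend to transfer a file
                                } catch (InterruptedException e) {
                                    Thread.currentThread().interrupt();
                                }
                            }
                        });
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return limiter.maxRunning();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(maxRunningWith(1, 4)); // 1: the downloads ran one at a time
    }
}
```

A semaphore permit has no owner, so it may be released by a different thread than the one that acquired it. That fits a design where the starting method takes the permit and the download thread gives it back.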

Yes, you need to synchronize access to the static flag, and it's fine to do that with synchronized methods. However, when you're all done you will have implemented a lock, so have a look at the Java Lock interface. The method that starts the file transfer needs to grab the lock before starting the download thread. The thread releases it after the download has either completed or failed. As the docs point out, release must occur with 100% certainty, or all downloads will be blocked.

You can make your thread a pipeline thread by using the Looper class from the Android framework and enqueue your download requests through a Handler instance.
Here is a nice tutorial that might help you:
http://mindtherobot.com/blog/159/android-guts-intro-to-loopers-and-handlers/

Is this custom interface implementation threadsafe

I have many threads using the ServerMessage interface below.
If many threads call dataChanged() at the same time, I'm
using SwingUtilities.invokeLater to fire off the job.
My feeling was that this would hold, but could more than
one thread not enter dataChanged() and reassign the runnable
before the SwingUtilities.invokeLater(runnable); is processed?
Maybe I should put a lock on the runnable like this.
synchronized (runnable) {
work...
}
And what about the sO? Wouldn't that one also change when the next thread enters?
My interface used by many threads
public interface ServerMessage{
public void dataChanged(ServerEventObjects se);
public void logChanged(String txt);
}
This is the interface method
@Override
public void dataChanged(final ServerEventObjects sO) {
    Runnable runnable = new Runnable() {
        public void run() {
            // create new ServerEventObject to avoid making changes to original.
            // could clone() it but mongoDb didn't like that
            ServerEventObject serverEventObject = new ServerEventObject();
            serverEventObject.eventUuid = sO.eventUuid;
            serverEventObject.message = sO.message;
            jTextAreaLog.append(serverEventObject.message + "\n");
            JTableModel.add(serverEventObject);
        }
    };
    SwingUtilities.invokeLater(runnable);
}

Thread-safety is an issue that derives from sharing the state of a given object between different threads.
Interfaces have no implementation and therefore they do not have safety issues. On the other hand, the classes implementing these interfaces may, depending on usage, need to deal with thread-safety issues.
Is your method modifying any state in the declaring class?
Is your method modifying the state of any other objects shared with other threads?
In the code that you shared, there is no shared state between threads that is being compromised.
Since you are using invokeLater(), you are making sure that any changes made to Swing components are done on the Event Dispatch Thread controlled by Swing.
We could argue that the object that you receive in the method as a parameter could also be compromised, but since you do not change its state, it is safe for the time being.
So, it appears your code is thread safe.

My feeling was that this would hold but could not more than
one thread enter the dataChanged() and reassign the runnable before the SwingUtilities.invokeLater(runnable); is processed?
There is nothing shared between the threads that would allow this kind of behavior. Inside dataChanged you create a new Runnable instance and every thread that invokes the method would do the same. Given your code, no thread would be able to modify the same Runnable instance since none of the threads share the same instance.
In other words: your implementation of the given interface is thread safe.
Update
I think a good article to read on the topic is The Architecture of the Java Virtual Machine. Since each thread has its own stack, the method invocation, including any local variables, calculations and return values (if any), will be pushed on the calling thread's stack. Since thread stacks are not shared, no other thread will be able to "see" the local variables. The only way threads can "interfere" with each other is if they share something, and in this case they're not sharing anything.
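
To illustrate the point about local variables, here is a sketch in which many threads call a dataChanged-style method at once. It is an assumption-laden stand-in: a single-threaded executor plays the role of the Swing event dispatch thread instead of SwingUtilities.invokeLater, and LocalRunnableDemo is a made-up class.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LocalRunnableDemo {
    // Stand-in for the event dispatch thread: one thread, tasks run in submission order.
    private final ExecutorService eventThread = Executors.newSingleThreadExecutor();
    // Only ever touched from eventThread, like a Swing model.
    private final List<String> log = new ArrayList<String>();

    void dataChanged(final String message) {
        // A new Runnable object per call; the local variable referring to it
        // exists only in the calling thread's stack frame.
        Runnable runnable = new Runnable() {
            public void run() {
                log.add(message);
            }
        };
        eventThread.execute(runnable);
    }

    // Copies the log on the event thread, then stops that thread.
    List<String> snapshotAndStop() throws Exception {
        try {
            return eventThread.submit(new Callable<List<String>>() {
                public List<String> call() {
                    return new ArrayList<String>(log);
                }
            }).get();
        } finally {
            eventThread.shutdown();
        }
    }

    // Returns how many distinct messages arrived.
    static int deliveredCount(int callers, final int callsPerCaller) throws Exception {
        final LocalRunnableDemo demo = new LocalRunnableDemo();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            final int id = i;
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < callsPerCaller; n++) {
                        demo.dataChanged("caller-" + id + "-event-" + n);
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return new HashSet<String>(demo.snapshotAndStop()).size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(deliveredCount(4, 50)); // 200
    }
}
```

No message is lost or overwritten, because no caller can ever reach another caller's runnable or message parameter.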

This particular method is thread safe.
Because runnable is a local variable, each such thread will allocate its own independent runnable. No reassignment can therefore happen.
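And if you ask, jTextArea.append was documented as thread safe in the Java 6 API, although most of Swing is not. Later Javadoc no longer makes that promise, so calling it through invokeLater, as you do, is the safer choice.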

Java starting a thread pool in an object's constructor

Is it safe to start a thread pool in an object's constructor? I know that you shouldn't start a thread from a constructor, something about the "this" pointer escaping (I don't exactly understand this, but will do some more searches to try and figure it out).
The code would look something like this:
private ExecutorService pool;

public handler()
{
    pool = Executors.newCachedThreadPool();
}

public void queueInstructionSet(final InstructionSet set)
{
    pool.submit(new Runnable() {
        public void run() {
            // handle this instruction set
        }
    });
}
If that doesn't work, I could just create this class as a Runnable and start it in a new thread. However, that seems like it would be adding an unnecessary thread to the program where it doesn't really need one.
Thanks.
EDIT:
Thanks for the replies everyone, they definitely helped make sense of this.
As per the code, in my mind it makes sense that this constructor creates the thread pool, but let me explain what specifically this code is doing, because I may be thinking about this in a weird way.
The entire point of this object is to take "Instruction Sets" objects, and act on them accordingly. The instruction sets come from clients connected to a server. Once a full instruction set is received from a client, that instruction set is sent to this object (handler) for processing.
This handler object contains a reference to every object that an instruction set can act upon. It will submit the instruction set to a thread pool, which will find which object this instruction set wants to interact with, and then handle the instruction set on that object.
I could handle the instruction set object in the IO server, but my thinking is that having a separate class for it makes the entire code more readable, as each class focuses on doing only one specific thing.
Thoughts? Advice?
Thanks

Your sample code doesn't let "this" escape at all. It's reasonably safe to start a new thread in a constructor (and even use this as the Runnable, which you don't in this example) so long as you're sure that you've already initialized the object as far as the new thread will need it. For example, setting a final field which the new thread will rely on after starting the thread would be a really bad idea :)
Basically letting the "this" reference escape is generally nasty, but not universally so. There are situations in which it's safe. Just be careful.
Having said that, making a constructor start a thread might be seen as doing too much within the constructor. It's hard to say whether it's appropriate in this case or not - we don't know enough about what your code is doing.
EDIT: Yup, having read the extra information, I think this is okay. You should probably have a method to shut down the thread pool as well.

I agree with Jon.
Furthermore, let me point out that you're not actually starting any actions on the thread pool in the constructor. You're instantiating the thread pool, but it has no tasks to run at that point. Therefore, as written, you're not going to have something start operating on this instance before it finishes construction.

It sounds like the thread pool would be owned and used by the object; threads wouldn't be passed out of the object. If that's the case, it shouldn't be an issue.
Constructors create an object and initialize its state. I can't imagine a use case where long-running processes are required to do so.
I can see where an object might interact with a thread pool to accomplish a task, but I don't see the need for that object to own the thread pool.
More details might help.

I think it's OK to start a thread pool in the constructor of the object as long as that object fully manages the lifetime of that thread pool.
If you go this path, you will have to work extra hard to provide the following guarantees:
If your constructor throws any exception (either runtime or checked), you must have cleanup code in the constructor that shuts down the thread pool. If you don't do this and create a thread pool with non-daemon threads then, for example, a little console program that uses your object may stay up forever, leaking valuable system resources.
You need to provide something that I call destructor method, similar to close in Java I/O. I usually call it releaseResources. Notice that finalize is not a substitute for this method, because it is called by GC, and for an object with reasonably small memory footprint it may never be called.
When using this object, follow this pattern:
MyThreadPoolContainer container =
    new MyThreadPoolContainer( ... args to initialize the object... );
try
{
    methodThatUsesContainer( container );
}
finally
{
    container.releaseResources( );
}
Document that the object's constructor allocates limited resources and that the destructor method has to be called explicitly to prevent their leakage.
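
The guarantees listed above could look roughly like this. It is a sketch under assumptions: the name argument, the validation rule, and the submit and isReleased methods are invented for the example, and only MyThreadPoolContainer and releaseResources come from the text.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MyThreadPoolContainer {
    private final ExecutorService pool;
    private final String name;

    MyThreadPoolContainer(String name) {
        pool = Executors.newCachedThreadPool();
        try {
            // Any initialization after the pool exists may fail (hypothetical check).
            if (name == null || name.length() == 0) {
                throw new IllegalArgumentException("name is required");
            }
            this.name = name;
        } catch (RuntimeException e) {
            pool.shutdownNow();   // do not leave a pool behind if construction fails
            throw e;
        }
    }

    Future<String> submit(final String task) {
        return pool.submit(new Callable<String>() {
            public String call() {
                return name + " handled " + task;
            }
        });
    }

    // The "destructor method": lets queued tasks finish, accepts no new ones.
    void releaseResources() {
        pool.shutdown();
    }

    boolean isReleased() {
        return pool.isShutdown();
    }
}

class ContainerDemo {
    static String useAndRelease() throws Exception {
        MyThreadPoolContainer container = new MyThreadPoolContainer("files");
        try {
            return container.submit("report.txt").get();
        } finally {
            container.releaseResources();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(useAndRelease()); // files handled report.txt
    }
}
```

Because releaseResources() is called in a finally block, the pool's non-daemon threads cannot keep the JVM alive after the caller is done, whether or not the work succeeded.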

Is there a way to search for and access Threads that are currently running?

Using Java 6:
I have a method that uses a Thread to run a task in the background. This task accesses files, so the method should not be able to have multiple threads running.
I am trying to figure out if there is a way that I can search for active Threads at the beginning of my method. I want to know if there is an active Thread that is already running my task, so that I can handle the situation properly.
Is this possible without having an actual instance of a previous Thread handy? I would like to avoid saving instances of the Thread globally.

You may want something a little more robust, like using a ReentrantLock to prevent concurrent access to those resources.

Just for reference: you can get all active threads in the current thread's group and its subgroups (for a standalone program, this usually can get you all threads) with java.lang.Thread.enumerate(Thread[]). But this is not the way to solve your problem - as Brian said, use a lock.
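
For completeness, a small sketch of Thread.enumerate in use. Looking a thread up by name is an assumption of this example (names are not guaranteed to be unique), ActiveThreads is a made-up helper, and the result is only a snapshot that can be stale by the time it is read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class ActiveThreads {
    // Names of the active threads in the current thread's group and its subgroups.
    static List<String> names() {
        // activeCount() is only an estimate, so leave some headroom in the array.
        Thread[] buffer = new Thread[Thread.activeCount() + 8];
        int copied = Thread.enumerate(buffer);
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < copied; i++) {
            names.add(buffer[i].getName());
        }
        return names;
    }

    static boolean isRunning(String threadName) {
        return names().contains(threadName);
    }
}

class ActiveThreadsDemo {
    public static void main(String[] args) throws InterruptedException {
        final CountDownLatch finish = new CountDownLatch(1);
        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    finish.await();   // pretend to work until told to stop
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "file-task");
        worker.start();
        System.out.println(ActiveThreads.isRunning("file-task")); // true
        finish.countDown();
        worker.join();
        System.out.println(ActiveThreads.isRunning("file-task")); // false
    }
}
```

Checking and then starting a thread are two separate steps, so two callers can both see "not running" and both start the task. That gap is the reason a lock is the better tool here.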

Use [ReentrantLock.tryLock](http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/locks/ReentrantLock.html#tryLock()). If it returns false, bingo! Some other thread currently holds the lock.
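
A minimal sketch of the tryLock approach. FileTask and runIfIdle are hypothetical names, and the work to run is passed in as a Runnable purely for the sake of the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class FileTask {
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Runs the work unless another thread is already running it.
    // Returns false immediately in that case instead of waiting.
    static boolean runIfIdle(Runnable work) {
        if (!LOCK.tryLock()) {
            return false;
        }
        try {
            work.run();
            return true;
        } finally {
            LOCK.unlock();   // always release, even if the work throws
        }
    }
}

class TryLockDemo {
    private static final Runnable NO_OP = new Runnable() {
        public void run() {
        }
    };

    // Returns { result while another thread holds the lock, result once it is free }.
    static boolean[] demo() throws InterruptedException {
        final CountDownLatch holding = new CountDownLatch(1);
        final CountDownLatch finish = new CountDownLatch(1);
        Thread first = new Thread(new Runnable() {
            public void run() {
                FileTask.runIfIdle(new Runnable() {
                    public void run() {
                        holding.countDown();   // signal: the lock is held now
                        try {
                            finish.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
            }
        });
        first.start();
        holding.await();
        boolean whileBusy = FileTask.runIfIdle(NO_OP);
        finish.countDown();
        first.join();
        boolean whenIdle = FileTask.runIfIdle(NO_OP);
        return new boolean[] { whileBusy, whenIdle };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] results = demo();
        System.out.println(results[0]); // false
        System.out.println(results[1]); // true
    }
}
```

Note that the lock is reentrant: a call to runIfIdle from inside the running work, on the same thread, would succeed rather than return false.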

Even if you had access to these Threads, what would you do with that knowledge? How would you tell what that Thread is currently doing?
If you have a service that can be accessed from multiple places, but you want to guarantee only a single thread will be used by this service, you can set up a work queue like this:
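import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FileService
{
    private final BlockingQueue<Object> workQueue = new ArrayBlockingQueue<Object>(100/*capacity*/);

    public FileService()
    {
        new Thread()
        {
            public void run()
            {
                try
                {
                    while(true)
                    {
                        Object request = workQueue.take(); // blocks until available
                        doSomeWork(request);
                    }
                }
                catch(InterruptedException e)
                {
                    // interrupted while waiting: let the worker thread end
                }
            }
        }.start();
    }

    public boolean addTask(Object param)
    {
        return workQueue.offer(param); // true on success, false if the queue is full
    }

    private void doSomeWork(Object request)
    {
        // placeholder: perform the file operation for this request
    }
}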
Here, the ArrayBlockingQueue takes care of all the thread safety issues. addTask() can be called safely from any other thread; it will simply add a "job" to the workQueue. Another, internal thread will constantly read from the workQueue and perform some operation if there's work to do, otherwise it will wait quietly.
