What is the use of ThreadLocal? - java

What is the use of ThreadLocal when a thread normally works on a variable by keeping it in its local cache?
That means thread1 does not know the value of the same variable in thread2, even if no ThreadLocal is used.

With multiple threads, although you have to do work to make sure you read the "most recent" value of a variable, you expect there to be effectively one variable per instance (assuming we're talking about instance fields here). You might read an out of date value unless you're careful, but basically you've got one variable.
With ThreadLocal, you're explicitly wanting to have one value per thread that reads the variable. That's typically for the sake of context. For example, a web server with some authentication layer might set a thread-local variable early in request handling so that any code within the execution of that request can access the authentication details, without needing any explicit reference to a context object. So long as all the handling is done on the one thread, and that's the only thing that thread does, you're fine.
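To make the request-context idea concrete, here is a minimal sketch. The names RequestContext, RequestDemo, setUser, and greet are made up for illustration and do not come from any framework; a real server would set and clear the value in a filter or interceptor.

```java
// Hypothetical per-thread request context (illustrative names only).
final class RequestContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    private RequestContext() {}

    static void setUser(String user) { CURRENT_USER.set(user); }
    static String getUser() { return CURRENT_USER.get(); }
    static void clear() { CURRENT_USER.remove(); }
}

final class RequestDemo {
    // Code deep in the call stack: no context object is passed in.
    static String greet() {
        return "hello " + RequestContext.getUser();
    }

    // Simulates handling one request on its own thread and returns the greeting.
    static String handle(String user) {
        String[] result = new String[1];
        Thread t = new Thread(() -> {
            RequestContext.setUser(user);      // set early in request handling
            try {
                result[0] = greet();
            } finally {
                RequestContext.clear();        // always clean up when the request ends
            }
        });
        t.start();
        try {
            t.join();                          // join() makes result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

The thread that calls handle never sees the user set by the request thread, which is the whole point.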

A thread doesn't have to keep variables in its local cache -- it's just that it's allowed to, unless you tell it otherwise.
So:
If you want to force a thread to share its state with other threads, you have to use synchronization of some sort (including synchronized blocks, volatile variables, etc).
If you want to prevent a thread from sharing its state with other threads, you have to use ThreadLocal (assuming the object that holds the variable is known to multiple threads -- if it's not, then everything is thread-local anyway!).

It's kind of a global variable for the thread itself, so that any code running in the thread can access it directly. (A "really" global variable can be accessed by any code running in the "process"; we could call it ProcessLocal:)
Are global variables bad? Maybe; they should be avoided if we can. But sometimes we have no choice because we cannot pass the object through method parameters, and ThreadLocal proves useful in many designs without causing too much trouble.

ThreadLocal is useful when an object is not thread-safe but you want to avoid synchronizing access to it. Each thread then stores its own instance in its thread-local storage. Without ThreadLocal, a field is shared between all threads that can reach it.

Related

Class level, instance level and local ThreadLocals

I understand how class level thread locals make sense. Being associated with a thread, thread locals need to be shared among different instances and classes across that thread, so we need to make them class level. If we want to share a thread local across different instances of the same class, we can make it private static. If we want to share a thread local across different classes, we can make it public static.
Q0. Correct me if I am wrong about the above.
My doubts are about instance scoped (non-static) thread locals and local (defined inside some method) thread locals:
Q1. Is there any valid use case for instance scoped (non-static) thread locals?
Q2. Is there any valid use case for local (defined inside some method) thread locals?
Q3. Are instance scoped (non-static) thread locals deleted when an instance is garbage collected?
Q4. Are local (defined inside some method) thread locals deleted when method returns?
ThreadLocal when implemented correctly as a static variable acts essentially as an instance variable for all threads that have access to it. Even though there's a single ThreadLocal variable, the mechanism makes it so that each thread has its own instance of the value in it.
Therefore
Q1. No, it doesn't make sense to have an instance scoped ThreadLocal. This doesn't mean you couldn't write code that uses an instance scoped TL, but you would need to keep track (in your developer mind) of both the instance and the thread being used for it to work correctly. Even if you found a use case that such code would solve, there would be a much better way to handle it.
Q2. No. As a local variable can never have more than a single thread access it, it would not differ from a regular local variable.
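Q3. The ThreadLocal<> wrapper becomes unreachable, but the actual value is still contained in the thread's map, as you correctly said. This can cause a resource/memory leak: in OpenJDK the map holds the ThreadLocal key only weakly, but the stale value is cleaned up lazily (on later ThreadLocal operations in that thread) or, at the latest, when the thread stops. Call remove() when you are done with the value.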
Q4. Same as with Q3, if you lose the wrapper reference, it's an instant leak. If you assign the reference somewhere, it's just weird programming. A method local ThreadLocal variable would be extremely worrying code.
The class is not something you'd want to use much anyway in modern code (or even older code), and it's not compatible with reactive programming, but if you do use it, the usage is straightforward: a single ThreadLocal is most easily implemented as a class level variable.
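As a sketch of the "single class level ThreadLocal" shape described above (the Scratch class and its method names are invented for this example):

```java
// One ThreadLocal object for the whole class; one StringBuilder per thread.
final class Scratch {
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    private Scratch() {}

    static String join(String a, String b) {
        StringBuilder sb = BUFFER.get();   // this thread's own builder
        sb.setLength(0);                   // reuse it instead of allocating a new one
        return sb.append(a).append('-').append(b).toString();
    }

    // Call when the thread is done with the buffer so the value can be collected.
    static void release() {
        BUFFER.remove();
    }
}
```

Because the field is static and final, the ThreadLocal wrapper itself never becomes unreachable, which avoids the lost-wrapper situation from Q3 and Q4.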
Q2. Is there any valid use case for local (defined inside some method) thread locals?
First, let's just be clear. If you say "a local Foobar" (for any class Foobar), then it's not entirely clear what you are talking about. Variables can be "class level" (i.e., static), "instance level," or "local," but a Foobar instance is not a variable. The variables in a Java program can only refer to Foobar instances that are allocated on the heap. It's very easy, and very common, to have more than one variable in a program refer to the same instance.
ThreadLocal is a class, and instances of ThreadLocal are objects on the heap. The same ThreadLocal object could be referenced by a static ThreadLocal variable and also, at the same time, referenced by local variables in one or more threads.
When you say "a local ThreadLocal," you could be talking about a local variable that holds a reference to a ThreadLocal instance that is shared with other threads, -OR- you could be talking about a ThreadLocal instance that is only referenced by one local variable. The second case would not make any sense because that instance could not be shared by multiple threads.
Q1. Is there any valid use case for instance scoped (non-static) thread locals?
Maybe so, but I would call it a "code smell." (That is, a reason to look closely at the code and see whether it could be better organized.) I personally would never use ThreadLocal in new code. The only times I have ever used it were while porting older, single-threaded code into a multi-threaded system, and when I did, the variables in question were always static (i.e., class level) variables.
I personally try never to use static in new code except in cases where some function is clearly labelled as returning a reference to a "singleton object."
Q3., Q4. [...when are instances deleted...]?
An instance will be eligible to be deleted when there is no "live" variable in the program that refers to it. One way that can happen is if the only variable that refers to it is a local variable of some function, and that function returns. A second way is if the only variable that refers to the instance is assigned to refer to some other instance. A third way is if the only references to it are from instance variables of other objects, and all of those other objects are themselves eligible to be deleted.
In most Java implementations, the instance will not be deleted immediately when it becomes eligible for deletion. The actual deletion happens some time later. Exactly when depends on the strategies employed by the JRE's garbage collector and on the patterns of object use by the program.

When should one prefer ThreadLocal over synchronization, apart from for the performance improvement?

When should one prefer ThreadLocal over synchronization, apart from for the performance improvement? Please explain using a real life example.
ThreadLocal is not an alternative to synchronized. The main problem solved by ThreadLocal is how to manage per-thread static data in an application.
static is something that you should try to avoid whenever you can: It's a recipe for un-testable, brittle code.
When you use ThreadLocal variables, they are seen and manipulated ONLY by the thread using them; no other thread can see them. A thread-local value dies when its thread does.
So be careful when using ThreadLocal variables with thread pools, where threads are reused and outlive individual tasks.
ThreadLocal values are reached through a map that is private to each thread (the values themselves are ordinary heap objects, not stack data).
Shared variables live in heap memory, where all threads can reach them, whether or not access to them is synchronized.
So it is more about use case than performance.
One can use a ThreadLocal variable to hold a connection to some DB, where the connection is associated with the current thread ONLY, so no other thread needs to see it and there is no need to synchronize it. A cache (a shared in-memory map or list, for example), however, is shared among all threads in a server application and must be synchronized.
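The thread pool warning is easy to demonstrate. In this sketch (PoolDemo and TENANT are hypothetical names), two tasks run one after the other on the same pooled thread, and the second task sees whatever the first one left behind unless it was removed:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolDemo {
    private static final ThreadLocal<String> TENANT = new ThreadLocal<>();

    // Runs two tasks on a single pooled thread; returns what the second task sees.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable first = () -> {
            TENANT.set("tenant-A");
            if (cleanUp) {
                TENANT.remove();   // in real code this belongs in a finally block
            }
        };
        Callable<String> second = TENANT::get;
        try {
            pool.submit(first).get();          // wait for the first task to finish
            return pool.submit(second).get();  // same worker thread, same thread-local slot
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the cleanup, the value set by one task leaks into the next task that happens to run on that worker.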
The only reason to use threads is performance (or perhaps you like confusion ;).
AFAICS, if you discount performance, there is no reason to use threads, ThreadLocal, or synchronized.
ThreadLocal provides global variable access within a thread. This helps when you want to share a variable across methods and still retain thread scope.
J2EE application servers use ThreadLocal for tracking transaction and security context without passing it around.

Does a volatile variable make sense here (multi-core processor)?

I declared an instance variable as volatile. Say two threads are created by two processors on a multi-core machine, and one thread updates the variable. To ensure instantaneous visibility, I believe declaring the variable as volatile is the right choice here, so that an update done by one thread happens in main memory and is visible to the other thread.
Right?
The intention here is to understand the concept in terms of a multi-core processor.
I am assuming you are considering using volatile vs. not using any special provisions for concurrency (such as synchronized or AtomicReference).
It is irrelevant whether you are running single-core or multicore: sharing data between threads is never safe without volatile. There are many more things the runtime is allowed to do without it; basically it can pretend the accessing thread is the only thread running on the JVM. The thread can read the value once and store it on the call stack forever; a loop reading the value, but never writing it, may be transformed such that the value is read only once at the outset and never reconsidered, and so on.
So the message is simple: use volatile—but that's not necessarily all you need to take care of in concurrent code.
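The classic illustration is a stop flag read in a loop. This is only a sketch (StopFlag and stopsPromptly are names invented here); with the volatile keyword the loop is guaranteed to observe the write, while without it the JIT would be allowed to hoist the read out of the loop.

```java
final class StopFlag {
    // volatile: every read in the loop must observe the latest write.
    private volatile boolean running = true;

    void stop() {
        running = false;
    }

    // Starts a spinning worker, asks it to stop, and reports whether it ended in time.
    static boolean stopsPromptly() {
        StopFlag flag = new StopFlag();
        Thread worker = new Thread(() -> {
            while (flag.running) {
                // busy-wait; nothing in here writes the flag
            }
        });
        worker.setDaemon(true);   // never keep the JVM alive because of this demo
        worker.start();
        flag.stop();
        try {
            worker.join(5_000);   // wait up to five seconds for the loop to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```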
It doesn't matter if it's done by different processors or not. When you don't have multiple processors, you can still run into concurrency problems because context switches may happen at any time.
If a field is not volatile, it may still be in one thread's cache while its context is switched out and the other thread's context switches in. In that case, the thread that just took over the (single) processor will not see that the field has changed.
Since these things can happen even with one processor, they are bound to happen with more than one processor, so indeed, you need to protect your shared data.
Whether volatile is the right choice or not depends on what type it is and what kind of change you are trying to protect from. But again, that has nothing to do with the number of processors.
If the field is a reference type, then volatile only ensures the visibility of new assignments to the field. It doesn't protect against changes in the object it points to - for that you need to synchronize.
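For the reference-type case, here is a small sketch of guarding the object itself rather than the field (the Registry class is hypothetical). Marking the names field volatile would not help here, since the field is never reassigned; only the list behind it changes.

```java
import java.util.ArrayList;
import java.util.List;

final class Registry {
    // The reference never changes; the ArrayList it points to does.
    private final List<String> names = new ArrayList<>();

    synchronized void add(String name) {
        names.add(name);
    }

    synchronized int size() {
        return names.size();
    }

    // Two threads add perThread entries each; returns the final size.
    static int fillFromTwoThreads(int perThread) {
        Registry registry = new Registry();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                registry.add("n" + i);
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return registry.size();
    }
}
```

With the synchronized methods no update is lost; with an unsynchronized ArrayList the result would be unpredictable.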

How is a threadlocal variable different from a method level variable?

If I use a threadlocal variable, then each thread gets a local copy of the variable. My first question is, if each thread mutates the variable, will the mutated value stay in its local copy only? Or at some point will it try to update the 'global variable' too and we will run into concurrency issues?
My other question is: if I declare a variable in a method, then each thread executing the method in its own stack will get its own copy. So is declaring a method level variable the same as making it threadlocal?
First question: each thread updates its own copy of the threadlocal variable; no global state is shared between threads.
Second question: if you declare a local variable, it behaves similarly to a threadlocal in that every thread has its own copy. But you don't have global access to it, e.g. from another method, and that's when threadlocal is useful.
The easiest way to look at a ThreadLocal<T> object is as a Map<Thread, T>, where the ThreadLocal#get() call would lookup the proper value by calling Map#get(Thread.currentThread()) on the underlying Map. Note that this is not the actual implementation, but the easiest way to look at it.
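That mental model can be written down directly. The sketch below is NOT how java.lang.ThreadLocal works internally (in OpenJDK each Thread owns its own map), and NaiveThreadLocal and NaiveDemo are invented names; it only shows the lookup-by-current-thread idea.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Mental model only: values keyed by the thread that stored them.
final class NaiveThreadLocal<T> {
    private final Map<Thread, T> values = new ConcurrentHashMap<>();

    T get() {
        return values.get(Thread.currentThread());
    }

    void set(T value) {
        // Unlike the real class, null is rejected (ConcurrentHashMap forbids null values).
        values.put(Thread.currentThread(), value);
    }

    void remove() {
        values.remove(Thread.currentThread());
    }
}

final class NaiveDemo {
    // Sets a value on the calling thread and returns what a second thread sees.
    static String otherThreadSees() {
        NaiveThreadLocal<String> tl = new NaiveThreadLocal<>();
        tl.set("main");
        String[] seen = new String[1];
        Thread other = new Thread(() -> {
            seen[0] = tl.get();   // nothing stored for this thread
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

One visible drawback of this naive version is that the map keeps every Thread key alive until remove() is called, which the real implementation avoids.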
ThreadLocal variables are only useful as a member that can actually be accessed by multiple threads at the same time. Local declarations of a variable in a method are just that, local, and therefore not accessible to other threads. I would not say they are 'the same', but that they are both threadsafe.
Typical usage would be an instance member variable of a singleton object, or a static member variable of a class, in a multi-threaded environment.
Mostly, you will see them used to pass around request context information in a servlet environment.
If I use a threadlocal variable, then each thread gets a local copy of the variable
I think there is some confusion regarding the term local copy of the variable. There is no copy. Every thread gets its own variable; these are independent of each other. It doesn't mean, however, that they cannot hold a reference to a shared object. So, just using threadlocal variables alone does not save you from concurrency issues.
Regarding your second question: No. Local variables and threadlocal variables are different. Local variables are not accessible outside the block in which they are defined. Therefore, for example, calling the same method twice creates a fresh variable each time. On the other hand, threadlocal variables keep their values as long as the thread exists.
Basically, threadlocal variables are kind of 'static' variables for one single thread.
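A tiny sketch of that difference, using a made-up CallCounter class: the local variable starts over on every call, while the thread-local value survives between calls on the same thread.

```java
final class CallCounter {
    private static final ThreadLocal<Integer> CALLS = ThreadLocal.withInitial(() -> 0);

    private CallCounter() {}

    static int withLocal() {
        int calls = 0;    // fresh variable on every invocation
        calls++;
        return calls;     // always 1
    }

    static int withThreadLocal() {
        int calls = CALLS.get() + 1;   // previous value from this thread, plus one
        CALLS.set(calls);
        return calls;
    }

    // Forget this thread's count.
    static void reset() {
        CALLS.remove();
    }
}
```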
An important point about a ThreadLocal variable is its global access: it can be read from anywhere inside the thread, that is, from any method called in that thread's context.
If you want to maintain a single instance of a variable for all instances of a class, you will use static-class member variables to do it. If you want to maintain an instance of a variable on a per-thread basis, you'll use thread-local variables. ThreadLocal variables are different from normal variables in that each thread has its own individually initialized instance of the variable, which it accesses via get() or set() methods.
Let's say you're developing a multithreaded code tracer whose goal is to uniquely identify each thread's path through your code. The challenge is that you need to coordinate multiple methods in multiple classes across multiple threads. Without ThreadLocal, this would be a complex problem. When a thread started executing, it would need to generate a unique token to identify it in the tracer and then pass that unique token to each method in the trace.
With ThreadLocal, things are simpler. The thread initializes the thread-local variable at the start of execution and then accesses it from each method in each class, with assurance that the variable will only host trace information for the currently executing thread. When it's done executing, the thread can pass its thread-specific trace to a management object responsible for maintaining all traces.
Using ThreadLocal makes sense when you need to store variable instances on a per-thread basis.
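The tracer described above could be sketched roughly like this. Tracer, Orders, and Billing are hypothetical classes invented for the example; a real tracer would record more than method names.

```java
import java.util.ArrayList;
import java.util.List;

final class Tracer {
    // Each thread lazily gets its own, initially empty, trace list.
    private static final ThreadLocal<List<String>> TRACE =
            ThreadLocal.withInitial(ArrayList::new);

    private Tracer() {}

    static void enter(String method) {
        TRACE.get().add(method);
    }

    // Hands the finished trace to the caller and clears this thread's slot.
    static List<String> finish() {
        List<String> done = TRACE.get();
        TRACE.remove();
        return done;
    }
}

final class Orders {
    static void place() {
        Tracer.enter("Orders.place");   // no token parameter needed
        Billing.charge();
    }
}

final class Billing {
    static void charge() {
        Tracer.enter("Billing.charge");
    }
}
```

A thread would call Tracer.finish() at the end of its work and pass the returned list to whatever object collects all traces.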

ThreadLocal Vs Cloning

I have been reading about threadlocal and scenarios where it is useful.
I like the concept but was wondering how is it different from cloning?
So a threadlocal will return a new copy of a variable, which means that we do not have to use synchronization. A good example is the SimpleDateFormat object, which is not thread-safe; ThreadLocal provides a good way to use it.
But why can't we simply create a new copy of the variable using clone?
What is the value add provided by ThreadLocal class as compared to cloning?
ThreadLocal is not a replacement for synchronization or thread-safe object access. If the same object is assigned to a ThreadLocal from different threads, then the program is no more thread-safe than it was before: the same object will still be shared among the different threads.
ThreadLocal acts like a variable; that is, it "names" or "refers to" an object:
[ThreadLocal] provides thread-local variables [.. such that] each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable.
That is, what ThreadLocal does is it provides get/set isolation between threads that use the same ThreadLocal object. So each thread could assign/retrieve its own different object to the ThreadLocal; but this would still require a "clone" or new instantiation to assign the different objects to begin with!
Remember, an assignment (or method invocation) never creates an implicit clone/copy/duplicate of an object - and this extends to ThreadLocal.
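To see that no copy is made, here is a short sketch (SharedViaThreadLocal is an invented name) in which two threads store the same StringBuilder in the same ThreadLocal. Each thread has its own slot, but both slots point at one object.

```java
final class SharedViaThreadLocal {
    private static final ThreadLocal<StringBuilder> SLOT = new ThreadLocal<>();

    // Returns true if both threads ended up holding the very same object.
    static boolean sameObjectInBothThreads() {
        StringBuilder shared = new StringBuilder();
        SLOT.set(shared);                      // calling thread's slot
        StringBuilder[] seenByOther = new StringBuilder[1];
        Thread other = new Thread(() -> {
            SLOT.set(shared);                  // other thread's slot, same object
            seenByOther[0] = SLOT.get();
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        boolean same = SLOT.get() == seenByOther[0];
        SLOT.remove();
        return same;
    }
}
```

Any unsynchronized append from both threads would still be a data race, ThreadLocal or not.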
By using ThreadLocal you create as many variables as there are threads, without the need for any further checking. Remember however, that the storage itself does not guarantee thread-safety. You must make sure that each object stored in local storage is used only from that thread!
Should you clone objects manually, you would have to clone an object every time it is used, or check in which thread we are and then clone.
Besides - is cloning operation thread-safe? What would happen if two different threads attempted to clone an object? I actually do not know, but I think that it would not be good practice.
Using ThreadLocal is faster: the SimpleDateFormat instance stored in a ThreadLocal can be reused multiple times in the same thread, while cloning means creating a new object every time.
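A minimal sketch of the SimpleDateFormat case follows. The Dates class, the pattern, and the UTC time zone are choices made for this example only.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class Dates {
    // Each thread builds its formatter once, on first use, and then reuses it.
    private static final ThreadLocal<SimpleDateFormat> ISO_DAY =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
                format.setTimeZone(TimeZone.getTimeZone("UTC"));
                return format;
            });

    private Dates() {}

    static String format(Date date) {
        return ISO_DAY.get().format(date);
    }

    // True if two lookups on this thread return the same formatter instance.
    static boolean reusesInstance() {
        return ISO_DAY.get() == ISO_DAY.get();
    }
}
```

In code that can use java.time, DateTimeFormatter is immutable and thread-safe, so it does not need this treatment at all.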
