If I use a threadlocal variable, then each thread gets a local copy of the variable. My first question is, if each thread mutates the variable, will the mutated value stay in its local copy only? Or at some point will it try to update the 'global variable' too and we will run into concurrency issues?
My other question is: if I declare a variable in a method, then each thread executing the method in its own stack will get its own copy. So is declaring a method level variable the same as making it threadlocal?
First question: each thread updates its own copy of the ThreadLocal variable; no global state is shared between threads.
Second question: a local variable behaves similarly to a ThreadLocal in that every thread has its own copy, but you have no global access to it, e.g. from another method. That's when ThreadLocal is useful.
The easiest way to look at a ThreadLocal<T> object is as a Map<Thread, T>, where the ThreadLocal#get() call would lookup the proper value by calling Map#get(Thread.currentThread()) on the underlying Map. Note that this is not the actual implementation, but the easiest way to look at it.
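To make that mental model concrete, here is a minimal sketch of it. MapBackedThreadLocal is a hypothetical class invented for illustration; as noted above, the real java.lang.ThreadLocal is not implemented this way.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical class: illustrates the Map<Thread, T> mental model only.
class MapBackedThreadLocal<T> {
    private final Map<Thread, T> values = new ConcurrentHashMap<>();
    private final Supplier<T> initial;

    MapBackedThreadLocal(Supplier<T> initial) {
        this.initial = initial;
    }

    T get() {
        // Look up the value that belongs to the calling thread,
        // creating it from the supplier on first access.
        return values.computeIfAbsent(Thread.currentThread(), t -> initial.get());
    }

    void set(T value) {
        // Only the calling thread's entry is touched.
        values.put(Thread.currentThread(), value);
    }

    // Demo helper: a worker thread sets 42, then we report what the caller sees.
    static int demo() {
        MapBackedThreadLocal<Integer> counter = new MapBackedThreadLocal<>(() -> 0);
        counter.set(1);
        Thread worker = new Thread(() -> counter.set(42));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get(); // the caller's own entry, untouched by the worker
    }
}
```

The worker's set(42) lands in the worker's own map entry, so the calling thread still reads the value it stored itself.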
ThreadLocal variables are only useful as a member that can actually be accessed by multiple threads at the same time. Local declarations of a variable in a method are just that, local, and therefore not accessible to other threads. I would not say they are 'the same', but that they are both threadsafe.
Typical usage would be an instance member variable of a singleton object, or a static member variable of a class, in a multi-threaded environment.
Mostly, you will see them used to pass around request context information in a servlet environment.
If I use a threadlocal variable, then each thread gets a local copy of the variable
I think there is some confusion regarding the term "local copy of the variable". There is no copy. Every thread gets its own variable; these are independent of each other. It doesn't mean, however, that they cannot hold a reference to a shared object. So, just using ThreadLocal variables alone does not save you from concurrency issues.
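A small sketch of that pitfall, using made-up names (SharedReferenceDemo, HOLDER): two threads each store a reference in their own thread-local slot, but both slots point at the same list, so the list itself is still shared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class SharedReferenceDemo {
    private static final ThreadLocal<List<String>> HOLDER = new ThreadLocal<>();

    static boolean bothThreadsSeeSameObject() {
        List<String> shared = new ArrayList<>();
        Object[] seen = new Object[2];

        // Each thread sets its own thread-local slot, but to the same object.
        Thread a = new Thread(() -> { HOLDER.set(shared); seen[0] = HOLDER.get(); });
        Thread b = new Thread(() -> { HOLDER.set(shared); seen[1] = HOLDER.get(); });
        a.start();
        b.start();
        join(a);
        join(b);

        // Two separate thread-local variables, one underlying list.
        return seen[0] == seen[1] && seen[0] == shared;
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

If both threads went on to mutate that ArrayList concurrently, the usual synchronization rules would apply; ThreadLocal isolates the slots, not the objects they refer to.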
Regarding your second question: No. Local variables and ThreadLocal variables are different. Local variables are not accessible outside the block in which they are defined. Therefore, for example, calling the same method twice gives you a fresh variable each time; nothing is carried over from the previous call. ThreadLocal variables, on the other hand, keep their values as long as the thread exists (or until remove() is called).
Basically, threadlocal variables are kind of 'static' variables for one single thread.
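The difference can be sketched like this (LocalVersusThreadLocal and its method names are invented for the example): the local variable starts from scratch on every call, while the thread-local value survives between calls made on the same thread.

```java
// Hypothetical demo class contrasting a local variable with a ThreadLocal.
class LocalVersusThreadLocal {
    private static final ThreadLocal<Integer> CALLS = ThreadLocal.withInitial(() -> 0);

    static int countWithLocal() {
        int calls = 0; // a fresh variable on every invocation
        calls++;
        return calls;  // nothing is remembered from earlier calls
    }

    static int countWithThreadLocal() {
        // The value persists across invocations on the same thread.
        CALLS.set(CALLS.get() + 1);
        return CALLS.get();
    }
}
```

Calling countWithThreadLocal() repeatedly from one thread yields an increasing count for that thread only; another thread would start again from the initial value.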
An important point about a ThreadLocal variable is its global access. It can be accessed from anywhere inside the thread, i.e. from any method that is called in that thread's context.
If you want to maintain a single instance of a variable for all instances of a class, you will use static-class member variables to do it. If you want to maintain an instance of a variable on a per-thread basis, you'll use thread-local variables. ThreadLocal variables are different from normal variables in that each thread has its own individually initialized instance of the variable, which it accesses via get() or set() methods.
Let's say you're developing a multithreaded code tracer whose goal is to uniquely identify each thread's path through your code. The challenge is that you need to coordinate multiple methods in multiple classes across multiple threads. Without ThreadLocal, this would be a complex problem. When a thread started executing, it would need to generate a unique token to identify it in the tracer and then pass that unique token to each method in the trace.
With ThreadLocal, things are simpler. The thread initializes the thread-local variable at the start of execution and then accesses it from each method in each class, with assurance that the variable will only host trace information for the currently executing thread. When it's done executing, the thread can pass its thread-specific trace to a management object responsible for maintaining all traces.
Using ThreadLocal makes sense when you need to store variable instances on a per-thread basis.
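A rough sketch of the tracer idea described above. Tracer, OrderService and Billing are hypothetical names, and a real tracer would record much more than method names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracer: each thread collects its own list of steps.
class Tracer {
    private static final ThreadLocal<List<String>> TRACE =
            ThreadLocal.withInitial(ArrayList::new);

    // Any method in any class can record a step without being handed a token.
    static void record(String step) {
        TRACE.get().add(step);
    }

    // Called when the thread is done: hand over the trace and clear the slot.
    static List<String> finish() {
        List<String> result = TRACE.get();
        TRACE.remove();
        return result;
    }
}

// Hypothetical classes on the traced path.
class OrderService {
    static void place() {
        Tracer.record("OrderService.place");
        Billing.charge();
    }
}

class Billing {
    static void charge() {
        Tracer.record("Billing.charge");
    }
}
```

Neither OrderService nor Billing needs a trace parameter; each thread that calls OrderService.place() ends up with its own list, which finish() could pass on to whatever object manages all traces.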
I understand how class level thread locals make sense. Being associated with a thread, thread locals need to be shared among different instances and classes across that thread. So we need to make them class level. If we want to share a thread local across different instances of the same class, we can make it private static. If we want to share a thread local across different classes, we can make it public static.
Q0. Correct me if I am wrong about the above.
My doubts are about instance scoped (non-static) thread locals and local (defined inside some method) thread locals:
Q1. Is there any valid use case for instance scoped (non-static) thread locals?
Q2. Is there any valid use case for local (defined inside some method) thread locals?
Q3. Are instance scoped (non-static) thread locals deleted when an instance is garbage collected?
Q4. Are local (defined inside some method) thread locals deleted when method returns?
ThreadLocal when implemented correctly as a static variable acts essentially as an instance variable for all threads that have access to it. Even though there's a single ThreadLocal variable, the mechanism makes it so that each thread has its own instance of the value in it.
Therefore
Q1. No, it doesn't make sense to have an instance scoped ThreadLocal. This doesn't mean you couldn't write code that uses an instance scoped TL, but you would need to keep track (in your developer mind) of both the instance and the thread being used for it to work correctly. Even if you found a use case that such code would solve, there would be a much better way to handle it.
Q2. No. As a local variable can never have more than a single thread access it, it would not differ from a regular local variable.
Q3. The ThreadLocal<> wrapper becomes unreachable, but the actual value can still be held in the thread's map, as you correctly said. This can cause a resource/memory leak: the value is only released once the stale map entry is cleaned up or the thread stops, and pooled threads may never stop.
Q4. Same as with Q3, if you lose the wrapper reference, it's an instant leak. If you assign the reference somewhere, it's just weird programming. A method local ThreadLocal variable would be extremely worrying code.
The class is not something you'd want to use too much anyway in modern code (or even older code), and it's not compatible with reactive programming, but if you do use it the usage is straightforward. A single ThreadLocal is most easily implemented as a class level variable.
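If you do use it with pooled threads, the usual precaution is to call remove() in a finally block. The sketch below (StaleValueDemo is a made-up name) runs two tasks on the same pooled thread and reports what the second task finds in the thread-local, with and without that cleanup.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo: a value left behind by one task is seen by the next
// task that happens to run on the same pooled thread.
class StaleValueDemo {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CURRENT.set("first task");
                try {
                    // ... work that reads CURRENT would go here ...
                } finally {
                    if (cleanUp) {
                        CURRENT.remove(); // clear the slot before the thread is reused
                    }
                }
            };
            Callable<String> second = CURRENT::get;

            pool.submit(first).get();         // wait for the first task
            return pool.submit(second).get(); // same pooled thread, second task
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the cleanup the second task inherits the first task's value; with it, the second task sees an empty slot.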
Q2. Is there any valid use case for local (defined inside some method) thread locals?
First, let's just be clear. If you say "a local Foobar" (for any class Foobar), then it's not entirely clear what you are talking about. Variables can be "class level" (i.e., static), "instance level," or "local;" but a Foobar instance is not a variable. The variables in a Java program can only refer to Foobar instances that are allocated on the heap. It's very easy, and very common, to have more than one variable in a program refer to the same instance.
ThreadLocal is a class, and instances of ThreadLocal are objects on the heap. The same ThreadLocal object could be referenced by a static ThreadLocal variable and also, at the same time, referenced by local variables in one or more threads.
When you say "a local ThreadLocal," you could be talking about a local variable that holds a reference to a ThreadLocal instance that is shared with other threads, -OR- you could be talking about a ThreadLocal instance that is only referenced by one local variable. The second case would not make any sense because that instance could not be shared by multiple threads.
Q1. Is there any valid use case for instance scoped (non-static) thread locals?
Maybe so, but I would call it a "code smell." (That is, a reason to look closely at the code and see whether it could be better organized.) I personally would never use ThreadLocal in new code. The only times I have ever used it were while porting older, single-threaded code into a multi-threaded system; and when I did, the variables in question were always static (i.e., class level) variables.
I personally try never to use static in new code except in cases where some function is clearly labelled as returning a reference to a "singleton object."
Q3., Q4. [...when are instances deleted...]?
An instance will be eligible to be deleted when there is no "live" variable in the program that refers to it. One way that can happen is if the only variable that refers to it is a local variable of some function, and that function returns. A second way it can happen is if the only variable that refers to the instance is assigned to refer to some other instance. A third way is if the only references to it are from instance variables of other objects, and all of those other objects are themselves eligible to be deleted.
In most Java implementations, the instance will not be immediately deleted when it becomes eligible for deletion. The actual deletion will happen some time later. When, depends on the strategies employed by the JRE's garbage collector and on the patterns of object use by the program.
I'm creating a game using java and at one point I create a thread to initialize a class that initializes other classes etc. After I do that and the thread is not active, my main thread accesses the class I initialized and calls a method that uses variables initialized using the other thread which then calls another class's methods which has variables initialized by the other thread etc.
My question is, if I want to initialize a lot of variables using a separate thread that are in a bunch of different classes, do I need to make all the variables that I initialize volatile or is there a better way so that all the variables I initialize using that one thread are automatically accessible by other threads.
This question isn't "should I use the volatile keyword?"; it's more of a "should I not use the volatile keyword, and is there a better option than making a lot of my variables volatile?"
Also, if it helps, my program has an object oriented structure.
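I think you're confused as to the use of the volatile keyword. Marking every variable volatile is not what makes data written by one thread usable by another. volatile is used to establish a happens-before relationship between a write to a variable and subsequent reads of that variable. This is often described as forcing the variable to be read from main memory each time rather than from a CPU cache, although the Java memory model only specifies the visibility guarantee, not the mechanism.
You don't need to do anything special to make an object or its fields visible to other threads, as long as something already establishes happens-before between the initializing thread and the reader, for example the reading thread calling Thread.join() on the initializing thread before it touches the data.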
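For the scenario in the question, one way to get that guarantee without any volatile fields is to wait for the initializing thread with Thread.join(): the Java memory model says that everything a thread did happens-before another thread returning from join() on it. A minimal sketch, with GameWorld and LoaderDemo as invented names:

```java
// Hypothetical game state with plain, non-volatile fields.
class GameWorld {
    int level;
    String mapName;
}

class LoaderDemo {
    static GameWorld loadInBackground() {
        GameWorld world = new GameWorld();

        // The separate thread that initializes the fields.
        Thread loader = new Thread(() -> {
            world.level = 3;
            world.mapName = "forest";
        });
        loader.start();

        try {
            // After join() returns, all writes made by the loader are visible here.
            loader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return world;
    }
}
```

If the main thread instead started reading while the loader was still running, some other form of synchronization would be needed.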
What is the use of ThreadLocal when a thread normally works on a variable by keeping it in its local cache?
That means thread1 does not know the value of the same variable in thread2 even if no ThreadLocal is used.
With multiple threads, although you have to do work to make sure you read the "most recent" value of a variable, you expect there to be effectively one variable per instance (assuming we're talking about instance fields here). You might read an out of date value unless you're careful, but basically you've got one variable.
With ThreadLocal, you're explicitly wanting to have one value per thread that reads the variable. That's typically for the sake of context. For example, a web server with some authentication layer might set a thread-local variable early in request handling so that any code within the execution of that request can access the authentication details, without needing any explicit reference to a context object. So long as all the handling is done on the one thread, and that's the only thing that thread does, you're fine.
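A sketch of that authentication example, assuming one thread handles the whole request. AuthContext, RequestController and DataRepository are hypothetical names, not part of any servlet API.

```java
// Hypothetical per-request context held in a ThreadLocal.
class AuthContext {
    private static final ThreadLocal<String> USER = new ThreadLocal<>();

    // Set early in request handling, cleared when the request is done.
    static void handleRequest(String user, StringBuilder out) {
        USER.set(user);
        try {
            RequestController.handle(out);
        } finally {
            USER.remove();
        }
    }

    static String currentUser() {
        return USER.get();
    }
}

// Hypothetical layers: neither takes the user as a parameter.
class RequestController {
    static void handle(StringBuilder out) {
        DataRepository.load(out);
    }
}

class DataRepository {
    static void load(StringBuilder out) {
        out.append("loading data for ").append(AuthContext.currentUser());
    }
}
```

The code deep in the call chain reads the authentication details without an explicit context object being passed down, which only works because it runs on the thread that set them.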
A thread doesn't have to keep variables in its local cache -- it's just that it's allowed to, unless you tell it otherwise.
So:
If you want to force a thread to share its state with other threads, you have to use synchronization of some sort (including synchronized blocks, volatile variables, etc).
If you want to prevent a thread from sharing its state with other threads, you have to use ThreadLocal (assuming the object that holds the variable is known to multiple threads -- if it's not, then everything is thread-local anyway!).
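The two cases can be put side by side in a small sketch (SharedVersusPerThread is an invented name). Here an AtomicInteger stands in for "synchronization of some sort", and a static ThreadLocal holds the state that should stay private to each thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: one counter shared by all threads, one counter per thread.
class SharedVersusPerThread {
    private static final ThreadLocal<Integer> PER_THREAD = ThreadLocal.withInitial(() -> 0);

    // Returns {shared total, count seen by worker 1, count seen by worker 2}.
    static int[] run() {
        AtomicInteger shared = new AtomicInteger();
        int[] result = new int[3];
        Thread[] workers = new Thread[2];

        for (int i = 0; i < workers.length; i++) {
            final int slot = i + 1;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    shared.incrementAndGet();             // state shared on purpose
                    PER_THREAD.set(PER_THREAD.get() + 1); // state kept per thread
                }
                result[slot] = PER_THREAD.get();
            });
            workers[i].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        result[0] = shared.get();
        return result;
    }
}
```

The shared counter ends up with the increments of both workers, while each worker's thread-local count reflects only its own loop.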
It's kind of a global variable for the thread itself, so that any code running in the thread can access it directly. (A "really" global variable can be accessed by any code running in the "process"; we could call it ProcessLocal:)
Is global variable bad? Maybe; it should be avoided if we can. But sometimes we have no choice, we cannot pass the object through method parameters, and ThreadLocal proves to be useful in many designs without causing too much trouble.
ThreadLocal is useful when an object is not thread-safe but you want to avoid synchronizing access to it. Each thread then stores its data in its own thread-local storage. By default, data is shared between threads.
I have been reading about threadlocal and scenarios where it is useful.
I like the concept but was wondering how is it different from cloning?
So a threadlocal will return a new copy of a variable, which means that we do not have to use synchronization. A good example is the SimpleDateFormat object, which is not thread safe; ThreadLocal provides a good way to use it.
But why can't we simply create a new copy of the variable using clone?
What is the value add provided by ThreadLocal class as compared to cloning?
ThreadLocal is not a replacement for synchronization or thread-safe object access. If the same object is assigned to a ThreadLocal from different threads, then the program is no more thread-safe than it was before: the same object will still be shared among the different threads.
ThreadLocal acts like a variable; that is, it "names" or "refers to" an object:
[ThreadLocal] provides thread-local variables [.. such that] each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable.
That is, what ThreadLocal does is it provides get/set isolation between threads that use the same ThreadLocal object. So each thread could assign/retrieve its own different object to the ThreadLocal; but this would still require a "clone" or new instantiation to assign the different objects to begin with!
Remember, an assignment (or method invocation) never creates an implicit clone/copy/duplicate of an object - and this extends to ThreadLocal.
By using ThreadLocal you create as many variables as there are threads, without the need for any further checking. Remember however, that the storage itself does not guarantee thread-safety. You must make sure that each object stored in local storage is used only from that thread!
Should you clone objects manually, you would have to clone an object every time it is used, or check in which thread we are and then clone.
Besides - is cloning operation thread-safe? What would happen if two different threads attempted to clone an object? I actually do not know, but I think that it would not be good practice.
Using ThreadLocal is faster: the SimpleDateFormat instance stored in a ThreadLocal can be reused multiple times in the same thread, while cloning means creating a new object every time.
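A minimal sketch of that reuse, assuming a fixed pattern, time zone and locale (DateFormats is a made-up helper class): each thread creates its SimpleDateFormat once, on first use, and then keeps formatting with the same instance.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: one SimpleDateFormat per thread, created lazily.
class DateFormats {
    private static final ThreadLocal<SimpleDateFormat> ISO_DAY =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
                format.setTimeZone(TimeZone.getTimeZone("UTC"));
                return format;
            });

    // Safe without synchronization: the formatter never leaves its thread.
    static String format(Date date) {
        return ISO_DAY.get().format(date);
    }

    // Exposed only to show that repeated calls return the same instance.
    static SimpleDateFormat formatterForCurrentThread() {
        return ISO_DAY.get();
    }
}
```

For example, DateFormats.format(new Date(0L)) gives "1970-01-01". In practice you would not hand the formatter out the way formatterForCurrentThread() does, since passing it to another thread would bring the thread-safety problem back.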
I am a bit confused about the requirements for synchronizing access to private instance variables in Java.
I have an application which executes scheduled tasks in multiple threads. These tasks (instances of a class) have an instance variable that holds a value object. Further, these tasks have run methods that execute the task by calling some other classes that hold the execution logic (they in turn use more value objects as part of the processing).
Now at a high level it looks like all the parallel threads will spawn a chain of these tasks, instance variables, implementation classes and value objects. Do all of these need to be made thread safe? All instance variables in all the possible classes and value objects that can potentially be invoked in parallel?
You need to make objects thread safe if multiple threads are going to access them at the same time and if their state is going to change.
It sounds like your task objects are not multi-threaded in that different threads won't access the same task. If that is true you wouldn't need to make your task objects thread safe.
Are the value objects mutable, and are they shared in such a way that the same value object instance could be accessed by multiple threads at the same time? If the answer to both is yes, then you need to make them thread safe.
The easiest way to make an object thread safe is to make it immutable. If its internal state can't change after the object is constructed then it is inherently thread safe. If you can't make your objects immutable then you need to synchronize access to any instance variables whose state could be changed.
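As an illustration of the immutable option, here is a small value object sketch. Money is a hypothetical class, not something from the question: all fields are final, there are no setters, and "changing" it produces a new instance.

```java
// Hypothetical immutable value object: safe to share between threads
// because its state cannot change after construction.
final class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    String currency() {
        return currency;
    }

    long cents() {
        return cents;
    }

    // Returns a new object instead of modifying this one.
    Money plus(long moreCents) {
        return new Money(currency, cents + moreCents);
    }
}
```

Any number of tasks can read the same Money instance concurrently without synchronization, since no thread can ever observe it changing.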