Class level, instance level and local ThreadLocals - java

I understand how class level thread locals make sense. Being associated with a thread, thread locals need to be shared among different instances and classes across that thread, so we need to make them class level. If we want to share a thread local across different instances of the same class, we can make it private static. If we want to share a thread local across different classes, we can make it public static.
Q0. Correct me if I am wrong about the above.
My doubts are about instance scoped (non-static) thread locals and local (defined inside some method) thread locals:
Q1. Is there any valid use case for instance scoped (non-static) thread locals?
Q2. Is there any valid use case for local (defined inside some method) thread locals?
Q3. Are instance scoped (non-static) thread locals deleted when an instance is garbage collected?
Q4. Are local (defined inside some method) thread locals deleted when method returns?

ThreadLocal when implemented correctly as a static variable acts essentially as an instance variable for all threads that have access to it. Even though there's a single ThreadLocal variable, the mechanism makes it so that each thread has its own instance of the value in it.
Therefore
Q1. No, it doesn't make sense to have an instance scoped ThreadLocal. You could write code that uses an instance scoped TL, but you would have to keep track (in your developer mind) of both the instance and the thread being used for it to work correctly. Even if you found a use case that such code would solve, there would be a much better way to handle it.
Q2. No. As a local variable can never have more than a single thread access it, it would not differ from a regular local variable.
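Q3. The ThreadLocal<> wrapper becomes unreachable, but the actual value can still be contained in the thread's map, as you correctly said. This can cause a resource/memory leak: in the JDK implementation the map refers to the ThreadLocal only weakly and cleans up stale entries lazily during later map operations, so the value may stay reachable for a long time, at worst until the thread stops.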
Q4. Same as with Q3, if you lose the wrapper reference, it's an instant leak. If you assign the reference somewhere, it's just weird programming. A method local ThreadLocal variable would be extremely worrying code.
The class is not something you'd want to use too much anyway in modern code (or even older code), and it's not compatible with reactive programming, but if you do use it the usage is straightforward. A single ThreadLocal is most easily implemented as a class level variable.
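As a minimal sketch of the class level usage described in this answer (the class and method names are made up for illustration), one static ThreadLocal can hold an independent counter for every thread that touches it:

```java
// Hypothetical demo class; names are illustrative and not from the question.
class PerThreadCounter {
    // One ThreadLocal object for the whole class; each thread sees its own value.
    private static final ThreadLocal<Integer> COUNT = ThreadLocal.withInitial(() -> 0);

    static int increment() {
        int next = COUNT.get() + 1;
        COUNT.set(next);
        return next;
    }

    static void clear() {
        // Removing the value when done avoids stale data on reused (pooled) threads.
        COUNT.remove();
    }

    // Runs 'times' increments on a fresh thread and returns that thread's final count.
    static int countOnNewThread(int times) {
        int[] result = new int[1];
        Thread t = new Thread(() -> {
            for (int i = 0; i < times; i++) {
                result[0] = increment();
            }
            clear();
        });
        t.start();
        try {
            t.join(); // after join() returns, the worker's write to result[0] is visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        clear();
        increment();                              // main thread's value is now 1
        System.out.println(countOnNewThread(3));  // 3: the worker started from 0
        System.out.println(increment());          // 2: main thread continues from its own value
    }
}
```

The worker never sees the main thread's count, even though both go through the same static field.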

Q2. Is there any valid use case for local (defined inside some method) thread locals?
First, let's just be clear. If you say "a local Foobar" (for any class Foobar), then it's not entirely clear what you are talking about. Variables can be "class level" (i.e., static), "instance level," or "local," but a Foobar instance is not a variable. The variables in a Java program can only refer to Foobar instances that are allocated on the heap. It's very easy, and very common, to have more than one variable in a program refer to the same instance.
ThreadLocal is a class, and instances of ThreadLocal are objects on the heap. The same ThreadLocal object could be referenced by a static ThreadLocal variable and also, at the same time, referenced by local variables in one or more threads.
When you say "a local ThreadLocal," you could be talking about a local variable that holds a reference to a ThreadLocal instance that is shared with other threads, -OR- you could be talking about a ThreadLocal instance that is only referenced by one local variable. The second case would not make any sense because that instance could not be shared by multiple threads.
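The two cases can be sketched roughly as follows. The class and method names are hypothetical, and the second method is only there to show why a ThreadLocal object reachable from a single local variable gains nothing:

```java
// Hypothetical illustration of the two meanings of "a local ThreadLocal".
class LocalReferenceDemo {
    static final ThreadLocal<String> NAME = new ThreadLocal<>();

    // Case 1: a local variable that refers to the shared ThreadLocal object.
    static String readThroughLocalVariable() {
        ThreadLocal<String> local = NAME; // same heap object as the static field
        return local.get();               // the calling thread's value
    }

    // Case 2: a ThreadLocal object that only one local variable ever refers to.
    static String methodLocalThreadLocal() {
        ThreadLocal<String> tl = new ThreadLocal<>(); // a new object on every call
        String before = tl.get(); // always null: nothing was ever set on this object
        tl.set("value");          // unreachable through any variable once the method returns
        return before;
    }
}
```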
Q1. Is there any valid use case for instance scoped (non-static) thread locals?
Maybe so, but I would call it a "code smell." (That is, a reason to look closely at the code and see whether it could be better organized.) I personally would never use ThreadLocal in new code. The only times I have ever used it were while porting older, single-threaded code into a multi-threaded system, and when I did, the variables in question were always static (i.e., class level) variables.
I personally try never to use static in new code except in cases where some function is clearly labelled as returning a reference to a "singleton object."
Q3., Q4. [...when are instances deleted...]?
An instance will be eligible to be deleted when there is no "live" variable in the program that refers to it. One way that can happen is if the only variable that refers to it is a local variable of some function, and that function returns. A second way it can happen is if the only variable that refers to the instance is assigned to refer to some other instance. A third way is if the only references to it are from instance variables of other objects, and all of those other objects are themselves eligible to be deleted.
In most Java implementations, the instance will not be immediately deleted when it becomes eligible for deletion. The actual deletion will happen some time later. When, depends on the strategies employed by the JRE's garbage collector and on the patterns of object use by the program.

Related

Java forcing volatile access

Consider a situation like this.
There are two threads and a shared resource(like a HashMap). One thread created the HashMap and initialized it with some key-value pairs and after the shared resource is initialized it will never be modified again.
Now, the second thread is created strictly after the shared resource is initialized and wants to use that resource. At this point I would like some guarantee that the second thread will use the correct version of the shared resource. I presume it is possible that the first thread didn't flush the changes to main memory before the second thread is created, so the second thread will take the old value of the shared resource into its cache.
Is this analysis correct, and how to force flush to main memory in Java by hand after initializing the shared resource as in this particular situation where I do not want or require volatile or synchronized.
The documentation says:
A call to start on a thread happens-before any action in the started thread.
So, if your code matches your description, it's safe.
If you declare and initialize your HashMap as static field it will be initialized by Java class loader in a thread safe fashion.
If map initialisation happens before the start of the second thread, then everything is correct. To simplify the analysis, you can convert the initialized map into some immutable map implementation and pass it to the created thread explicitly. This way you would not need to use a shared variable at all.
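A minimal sketch of that suggestion, assuming the map is completely filled before start() is called (the class name, keys, and values are invented for the example):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: build the map, wrap it read-only, then start the reader thread.
class StartHappensBefore {
    static String lookupOnNewThread(String key) {
        Map<String, String> mutable = new HashMap<>();
        mutable.put("host", "localhost");
        mutable.put("port", "8080");
        // Read-only view; nothing modifies 'mutable' after this point.
        Map<String, String> config = Collections.unmodifiableMap(mutable);

        String[] result = new String[1];
        // Everything written before start() is visible to the started thread.
        Thread reader = new Thread(() -> result[0] = config.get(key));
        reader.start();
        try {
            reader.join(); // makes the reader's write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

The map is handed to the thread through the captured variable, so no shared mutable field is involved.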
Is this analysis correct, and how to force flush to main memory in Java by hand after initializing the shared resource as in this particular situation where I do not want or require volatile or synchronized.
It's not possible to not require volatile or synchronized. You have to use some form of memory synchronization between threads or stuff doesn't work.
You could use a static initializer as Andrei mentioned (*), or final, both of which imply a memory barrier. But you have to use something.
You may need a synchronized map (Collections.synchronizedMap()) or a ConcurrentHashMap, but you still need to use volatile, synchronized, final or static to guard the field itself.
Cf. Java Concurrency in Practice by Brian Goetz, and also this related question on Stack Overflow (note that the OP gets the name of the book wrong).
(* The whole static initializer thing is kinda complicated, and you should read Mr. Goetz's book, but I'll try to describe it briefly: static fields are part of class initialization. Each static field or static initializer block is written, or executed, by a thread (which could be the thread that called new or accessed the class object for the first time, or could be a different thread). When the process of writing all static fields for the first time is done, the JVM inserts a memory barrier so that the class object, with all its static fields, is visible to all threads in the system as required by the spec.
You do NOT get a memory barrier per field write, like volatile. The class load tries to be efficient and only inserts one barrier at the very end of initialization. Thus you should only use static initializers for what they're supposed to be for: filling in fields for the first time, and don't try to write entire programs inside a static initializer block. It's not efficient and your options for thread safety are actually more limited.
However, the memory barrier that's part of class initialization is available to use, and that's why, as Andrei Amarfii said, the pattern of using a static initializer in Java is used to ensure visibility of objects. It's important enough that Brian Goetz calls it out as one of his four "Safe Publication" patterns.)
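A rough sketch of the static initializer pattern discussed here; the class name and map contents are placeholders, not anything from the question:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder class: the map is filled once, during class initialization.
class Limits {
    private static final Map<String, Integer> LIMITS;

    static {
        Map<String, Integer> m = new HashMap<>();
        m.put("maxConnections", 4);
        m.put("maxRetries", 20);
        // Published read-only; it is never modified after class initialization.
        LIMITS = Collections.unmodifiableMap(m);
    }

    // Safe to call from any thread: class initialization completes before first use.
    static int limit(String name) {
        return LIMITS.get(name);
    }
}
```

The initializer only fills the field for the first time, which is the use this answer recommends for it.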

Multithreading -How should you initialize variables without using the volatile keyword constantly

I'm creating a game using java and at one point I create a thread to initialize a class that initializes other classes etc. After I do that and the thread is not active, my main thread accesses the class I initialized and calls a method that uses variables initialized using the other thread which then calls another class's methods which has variables initialized by the other thread etc.
My question is, if I want to initialize a lot of variables using a separate thread that are in a bunch of different classes, do I need to make all the variables that I initialize volatile or is there a better way so that all the variables I initialize using that one thread are automatically accessible by other threads.
This question isn't "should I use the volatile keyword?" It's more of a "should I not use the volatile keyword, and is there a better option than making a lot of my variables volatile?"
Also, if it helps, my program has an object oriented structure.
I think you're confused as to the use of the volatile keyword. You don't use the volatile keyword to make variables visible to other threads. volatile is used to establish a happens-before relationship between writes to a variable and subsequent reads of that variable. This is accomplished by forcing the variable to be read from main memory each time it is read rather than allowing the variable to be read from a CPU cache.
You don't need to do anything special to make an object or elements of an object visible to other threads.
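For the scenario in the question, where the main thread only uses the objects after the initializer thread has finished, one common way to get a guaranteed ordering is to wait for that thread with Thread.join(). The Java Language Specification states that all actions in a thread happen-before another thread successfully returns from join() on it. A minimal sketch with made-up class names and plain, non-volatile fields:

```java
// Hypothetical game classes; the fields are deliberately not volatile.
class GameWorld {
    int width;
    int height;
    String name;
}

class WorldLoader {
    static GameWorld loadOnWorkerThread() {
        GameWorld[] holder = new GameWorld[1];
        Thread init = new Thread(() -> {
            GameWorld w = new GameWorld();
            w.width = 800;
            w.height = 600;
            w.name = "level-1";
            holder[0] = w;
        });
        init.start();
        try {
            // Everything the init thread wrote is visible once join() returns.
            init.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder[0];
    }
}
```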

What is the use of ThreadLocal?

What is the use of ThreadLocal when a thread normally works on a variable while keeping it in its local cache?
That means thread1 does not know the value of the same variable in thread2 even if no ThreadLocal is used.
With multiple threads, although you have to do work to make sure you read the "most recent" value of a variable, you expect there to be effectively one variable per instance (assuming we're talking about instance fields here). You might read an out of date value unless you're careful, but basically you've got one variable.
With ThreadLocal, you're explicitly wanting to have one value per thread that reads the variable. That's typically for the sake of context. For example, a web server with some authentication layer might set a thread-local variable early in request handling so that any code within the execution of that request can access the authentication details, without needing any explicit reference to a context object. So long as all the handling is done on the one thread, and that's the only thing that thread does, you're fine.
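The authentication example could look roughly like this. RequestContext and its methods are hypothetical names for the sketch, not part of any servlet or web framework API:

```java
// Hypothetical per-request context; assumes each request is handled on one thread.
class RequestContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Called by the (imaginary) authentication layer early in request handling.
    static void handleRequest(String user, Runnable handler) {
        CURRENT_USER.set(user);
        try {
            handler.run();
        } finally {
            // The thread may be reused for another request, so always clean up.
            CURRENT_USER.remove();
        }
    }

    // Any code running inside the request can ask who the user is.
    static String currentUser() {
        return CURRENT_USER.get();
    }
}
```

Code deep inside the handler can call RequestContext.currentUser() without a context object being passed through every method.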
A thread doesn't have to keep variables in its local cache -- it's just that it's allowed to, unless you tell it otherwise.
So:
If you want to force a thread to share its state with other threads, you have to use synchronization of some sort (including synchronized blocks, volatile variables, etc).
If you want to prevent a thread from sharing its state with other threads, you have to use ThreadLocal (assuming the object that holds the variable is known to multiple threads -- if it's not, then everything is thread-local anyway!).
It's kind of a global variable for the thread itself, so that any code running in the thread can access it directly. (A "really" global variable can be accessed by any code running in the "process"; we could call it ProcessLocal:)
Is global variable bad? Maybe; it should be avoided if we can. But sometimes we have no choice, we cannot pass the object through method parameters, and ThreadLocal proves to be useful in many designs without causing too much trouble.
Use of ThreadLocal is when an object is not thread-safe, but you want to avoid synchronizing access. So each thread stores data on its own Thread local storage memory. By default, data is shared between threads.

How threadlocal variable is different from a method level variable

If I use a threadlocal variable, then each thread gets a local copy of the variable. My first question is, if each thread mutates the variable, will the mutated value stay in its local copy only? Or at some point will it try to update the 'global variable' too and we will run into concurrency issues?
My other question is: if I declare a variable in a method, then each thread executing the method in its own stack will get its own copy. So is declaring a method level variable the same as making it threadlocal?
First question: each thread updates its copy of threadlocal variable, no global state is shared between threads.
Second question: if you declare a local variable, it behaves similarly to a threadlocal, in that every thread has its own copy, but you don't have global access to it (e.g. from another method). That's when threadlocal is useful.
The easiest way to look at a ThreadLocal<T> object is as a Map<Thread, T>, where the ThreadLocal#get() call would lookup the proper value by calling Map#get(Thread.currentThread()) on the underlying Map. Note that this is not the actual implementation, but the easiest way to look at it.
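That mental model can be written down as a small class. This is only the conceptual picture from the paragraph above, not how java.lang.ThreadLocal works internally (in the JDK each Thread carries its own map keyed by ThreadLocal objects), and NaiveThreadLocal is a made-up name:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Conceptual model only: a map from Thread to value.
class NaiveThreadLocal<T> {
    // Note: this keeps strong references to Thread objects, and
    // ConcurrentHashMap rejects null values; both are limits of the sketch.
    private final Map<Thread, T> values = new ConcurrentHashMap<>();

    T get() {
        return values.get(Thread.currentThread());
    }

    void set(T value) {
        values.put(Thread.currentThread(), value);
    }

    void remove() {
        values.remove(Thread.currentThread());
    }
}
```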
ThreadLocal variables are only useful as a member that can actually be accessed by multiple threads at the same time. Local declarations of a variable in a method are just that, local, and therefore not accessible to other threads. I would not say they are 'the same', but that they are both threadsafe.
Typical usage would be an instance member variable of a singleton object, or a static member variable of a class, in a multi-threaded environment.
Mostly, you will see them used to pass around request context information in a servlet environment.
If i use a threadlocal variable, then each thread gets a local copy of the variable
I think there is some confusion regarding the term local copy of the variable. There is no copy. Every thread gets its own variable; these are independent of each other. It doesn't mean, however, that they cannot hold a reference to a shared object. So, just using threadlocal variables alone does not save you from concurrency issues.
Regarding your second question: No. Local variables and threadlocal variables are different. Local variables are not accessible outside the block in which they are defined. Therefore, for example, calling the same method twice creates a fresh variable each time, so nothing carries over from the first call to the second. On the other hand, threadlocal variables keep their values as long as the thread exists.
Basically, threadlocal variables are kind of 'static' variables for one single thread.
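The difference in lifetime can be seen in a small sketch (a hypothetical class, written only to contrast the two kinds of variable):

```java
// Hypothetical comparison of a method-local variable and a ThreadLocal.
class LocalVsThreadLocal {
    private static final ThreadLocal<Integer> CALLS = ThreadLocal.withInitial(() -> 0);

    static int withLocalVariable() {
        int calls = 0; // created afresh on every invocation
        calls++;
        return calls;  // always 1
    }

    static int withThreadLocal() {
        int calls = CALLS.get() + 1; // value survives between invocations on this thread
        CALLS.set(calls);
        return calls;
    }

    static void reset() {
        CALLS.remove();
    }
}
```

Calling withThreadLocal() repeatedly on one thread keeps counting up, while withLocalVariable() starts over each time.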
An important point about a ThreadLocal variable is the global access. It can be accessed from anywhere inside the thread, that is, inside any method that is called in that thread's context.
If you want to maintain a single instance of a variable for all instances of a class, you will use static-class member variables to do it. If you want to maintain an instance of a variable on a per-thread basis, you'll use thread-local variables. ThreadLocal variables are different from normal variables in that each thread has its own individually initialized instance of the variable, which it accesses via get() or set() methods.
Let's say you're developing a multithreaded code tracer whose goal is to uniquely identify each thread's path through your code. The challenge is that you need to coordinate multiple methods in multiple classes across multiple threads. Without ThreadLocal, this would be a complex problem. When a thread started executing, it would need to generate a unique token to identify it in the tracer and then pass that unique token to each method in the trace.
With ThreadLocal, things are simpler. The thread initializes the thread-local variable at the start of execution and then accesses it from each method in each class, with assurance that the variable will only host trace information for the currently executing thread. When it's done executing, the thread can pass its thread-specific trace to a management object responsible for maintaining all traces.
Using ThreadLocal makes sense when you need to store variable instances on a per-thread basis.
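A very small version of the tracer idea might look like the following. Tracer, enter, and finish are invented names, and a real tracer would record far more than method names:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracer: each thread collects its own list of visited methods.
class Tracer {
    private static final ThreadLocal<List<String>> TRACE =
            ThreadLocal.withInitial(ArrayList::new);

    // Called from any method in any class; no token has to be passed around.
    static void enter(String method) {
        TRACE.get().add(method);
    }

    // Hands the finished trace over and clears this thread's state.
    static List<String> finish() {
        List<String> result = TRACE.get();
        TRACE.remove();
        return result;
    }
}
```

The list returned by finish() is what the thread would pass on to the management object mentioned above.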

ThreadLocal Vs Cloning

I have been reading about threadlocal and scenarios where it is useful.
I like the concept but was wondering how is it different from cloning?
So a threadlocal will return a new copy of a variable, which means that we do not have to use synchronization. A good example is the SimpleDateFormat object, which is not thread safe; ThreadLocal provides a good way to use it.
But why can't we simply create a new copy of the variable using clone?
What is the value add provided by ThreadLocal class as compared to cloning?
ThreadLocal is not a replacement for synchronization or thread-safe object access. If the same object is assigned to a ThreadLocal from different threads, then the program is no more thread-safe than it was before: the same object will still be shared among the different threads.
ThreadLocal acts like a variable; that is, it "names" or "refers to" an object:
[ThreadLocal] provides thread-local variables [.. such that] each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable.
That is, what ThreadLocal does is it provides get/set isolation between threads that use the same ThreadLocal object. So each thread could assign/retrieve its own different object to the ThreadLocal; but this would still require a "clone" or new instantiation to assign the different objects to begin with!
Remember, an assignment (or method invocation) never creates an implicit clone/copy/duplicate of an object - and this extends to ThreadLocal.
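To make the "no implicit copy" point concrete, here is a small sketch (hypothetical class name) in which two threads store the same list in one ThreadLocal and both end up holding the very same object:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: ThreadLocal isolates the slot, not the object stored in it.
class NoImplicitCopy {
    static final ThreadLocal<List<String>> HOLDER = new ThreadLocal<>();

    static boolean sameObjectSeenByBothThreads() {
        List<String> shared = new ArrayList<>();
        HOLDER.set(shared); // calling thread's slot
        boolean[] same = new boolean[1];
        Thread t = new Thread(() -> {
            HOLDER.set(shared);               // stores a reference, not a clone
            same[0] = HOLDER.get() == shared; // identity comparison
            HOLDER.remove();
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        boolean result = same[0] && HOLDER.get() == shared;
        HOLDER.remove();
        return result; // true: both slots pointed at one shared list
    }
}
```

If each thread should have its own list, each thread has to create (or clone) one before calling set, or the ThreadLocal has to be given an initial-value supplier.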
By using ThreadLocal you create as many variables as there are threads, without the need for any further checking. Remember however, that the storage itself does not guarantee thread-safety. You must make sure that each object stored in local storage is used only from that thread!
Should you clone objects manually, you would have to clone an object every time it is used, or check in which thread we are and then clone.
Besides - is cloning operation thread-safe? What would happen if two different threads attempted to clone an object? I actually do not know, but I think that it would not be good practice.
Using ThreadLocal is faster: the SimpleDateFormat instance stored in a ThreadLocal can be reused multiple times in the same thread, while cloning means creating a new object every time.
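A minimal sketch of the SimpleDateFormat case, assuming a fixed pattern, UTC, and Locale.ROOT (all three are choices made for the example, as is the class name):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: one SimpleDateFormat per thread, created lazily and reused.
class DateFormats {
    private static final ThreadLocal<SimpleDateFormat> ISO_DAY =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
                f.setTimeZone(TimeZone.getTimeZone("UTC"));
                return f;
            });

    static String formatDay(Date date) {
        // No synchronization needed: the formatter is never shared between threads.
        return ISO_DAY.get().format(date);
    }

    static boolean reusesInstanceOnSameThread() {
        return ISO_DAY.get() == ISO_DAY.get(); // same object on repeated access
    }
}
```

Each thread pays for one formatter the first time it calls formatDay and then reuses it, instead of cloning a formatter on every call.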
