Confused about ThreadLocal

I just learned about ThreadLocal this morning. I read that it should always be final and static like:
private static final ThreadLocal<Session> threadLocal = new ThreadLocal<Session>();
(Session is a Hibernate Session)
My confusion is this: Because it is static, it is available to any thread in the JVM. Yet it will hold information local to each thread which accesses it? I'm trying to wrap my head around this so I apologize if this is unclear. Each thread in the application has access to the same ThreadLocal object, but the ThreadLocal object will store objects local to each thread?

Yes, the instance is the same, but get and set associate the value with Thread.currentThread(), both when you set it and when you retrieve it. A value set through set is therefore only accessible through get from within the same thread.
It's really easy to understand.
Imagine that each Thread has a map that associates a value with a ThreadLocal instance. Every time you perform a get or a set on a ThreadLocal, the implementation of ThreadLocal gets the map associated with the current Thread (Thread.currentThread()) and performs the get or set in that map using itself as the key.
Example:
ThreadLocal<Object> tl = new ThreadLocal<>();
tl.set(new Object()); // at this point the implementation does something similar to Thread.currentThread().threadLocals.put(tl, [object you gave])
Object obj = tl.get(); // at this point the implementation does something similar to Thread.currentThread().threadLocals.get(tl)
Note that a plain ThreadLocal is not hierarchical: a value set in a parent thread is not visible in a child thread. If you want a child thread to start with the parent's value, use the InheritableThreadLocal subclass, which copies the parent's values when the child thread is created.
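Inheritance by child threads is a property of InheritableThreadLocal, not of a plain ThreadLocal, and this is easy to check. The following is a minimal sketch; the class name InheritanceDemo and the method valuesSeenByChild are made up for illustration.

```java
class InheritanceDemo {
    // Plain ThreadLocal: a child thread starts without a value (get() returns null here).
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    // InheritableThreadLocal: a child thread receives the parent's value when it is created.
    static final ThreadLocal<String> INHERITABLE = new InheritableThreadLocal<>();

    /** Returns what a freshly created child thread sees: {plain value, inheritable value}. */
    static String[] valuesSeenByChild() {
        PLAIN.set("parent");
        INHERITABLE.set("parent");
        String[] seen = new String[2];
        Thread child = new Thread(() -> {
            seen[0] = PLAIN.get();
            seen[1] = INHERITABLE.get();
        });
        child.start();
        try {
            child.join(); // join() also makes the child's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = valuesSeenByChild();
        System.out.println("plain=" + seen[0] + ", inheritable=" + seen[1]);
        // prints "plain=null, inheritable=parent"
    }
}
```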

You always access the same instance of ThreadLocal for a specific problem, but this instance returns a different value for each thread calling the get method.
That's the point: it's easy to find the object, but each thread has its own specific value. Thus you can, for example, make sure your specific value won't be accessed by two different threads.
You could see it (conceptually) as a kind of HashMap<Thread, V> which is always accessed with Thread.currentThread() as the key.
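To see the per-thread behaviour in action, here is a small sketch in which several threads write to the same static ThreadLocal and each reads back only its own value. The names IsolationDemo, CURRENT and run are hypothetical and chosen only for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IsolationDemo {
    // One shared static instance, as in the question.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Starts one thread per name; each thread sets its name, then reads it back. */
    static Map<String, String> run(String... names) {
        Map<String, String> readBack = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[names.length];
        for (int i = 0; i < names.length; i++) {
            final String name = names[i];
            threads[i] = new Thread(() -> {
                CURRENT.set(name);
                Thread.yield(); // give the other threads a chance to call set() in between
                readBack.put(name, CURRENT.get());
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return readBack;
    }

    public static void main(String[] args) {
        // Every key maps to itself: no thread ever observes another thread's value.
        System.out.println(run("alice", "bob", "carol"));
    }
}
```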

That works because the thread-specific values are not stored in the ThreadLocal object itself, but in the current Thread's ThreadLocalMap. The ThreadLocal object merely serves as the key in these maps.
For details, read the JavaDoc of ThreadLocal and its subclasses, or, if you are curious about the implementation, the source code available in the src.zip of every recent JDK.

Related

How to make my code thread-safe when my shared variable can change anytime?

Here is a question that has been asked many times, I have double-checked numerous issues that have been raised formerly but none gave me an answer element so I thought I would put it here.
The question is about making my code thread-safe in java knowing that there is only one shared variable but it can change anytime and actually I have the feeling that the code I am optimizing has not been thought for a multi-threading environment, so I might have to think it over...
Basically, I have one class which can be shared between, say, 5 threads. This class has a private property 'myProperty' which can take 5 different values (one for each thread). The problem is that, once it's instantiated by the constructor, that value should not be changed anymore for the rest of the thread's life.
I am pretty well aware of some techniques used to make most pieces of code "thread-safe", including locks, the "synchronized" keyword, volatile variables and atomic types, but I have the feeling that these won't help in the current situation as they do not prevent the variable from being modified.
Here is the code :
// The thread that uses the class containing the shared variable //
public class myThread implements Runnable {
    @Autowired
    private Shared myProperty;
    // some code
}
// The class containing the shared variable //
public class Shared {
    private String operator;
    private Lock lock = new ReentrantLock();
    public void initiate() {
        this.lock.lock();
        try {
            this.operator.initiate(); // Gets a different value depending on the calling thread
        } finally {
            this.lock.unlock();
        }
    }
    // some code
}
As it happens, the above code only guarantees that two threads won't change the variable at the same time, but the variable will still change. A "naive" workaround would consist in creating a table (operatorList), for instance (or a list, a map, etc.), associating an operator with its calling thread's ID. This way each thread would just have to access its operator using its ID in the table, but doing this would make us change all the thread classes which access the shared variable, and there are many. Any idea as to how I could store the different operator string values in an exclusive manner for each calling thread with minimal changes (without using magic)?

I'm not 100% sure I understood your question correctly, but I'll give it a shot anyway. Correct me if I'm wrong.
A "naive" workaround would consist in creating a table (operatorList)
for instance (or a list, a map, etc. ) associating an operator with
its calling thread's ID, this way each thread would just have to
access its operator using its id in the table but doing this would
make us change all the thread classes which access the shared variable
and there are many.
There's already something similar in Java: the ThreadLocal class.
You can create a thread-local copy of any object:
private static final ThreadLocal<MyObject> operator =
    new ThreadLocal<MyObject>() {
        @Override
        protected MyObject initialValue() {
            // return the thread-local copy of MyObject; a fresh instance is assumed here
            return new MyObject();
        }
    };
Later in your code, when a specific thread needs to get its own local copy, all it needs to do is: operator.get(). In reality, the implementation of ThreadLocal is similar to what you've described - a Map of ThreadLocal values for each Thread. Only the Map is not static, and is actually tied to the specific thread. This way, when a thread dies, it takes its ThreadLocal variables with it.
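Applied to the question, the idea might look like the sketch below. It is only an illustration under assumptions: PerThreadOperator, getOperator and operatorSeenByNewThread are hypothetical names, and the operator value is derived from a counter because the question does not say how the real value is computed. ThreadLocal.withInitial requires Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicInteger;

class PerThreadOperator {
    private static final AtomicInteger NEXT = new AtomicInteger();

    // Assumption: the operator can be computed lazily on a thread's first access.
    private final ThreadLocal<String> operator =
            ThreadLocal.withInitial(() -> "operator-" + NEXT.incrementAndGet());

    /** Same thread: always the same value. Another thread: its own, different value. */
    String getOperator() {
        return operator.get();
    }

    /** Demo helper: reads the operator from a freshly started thread. */
    static String operatorSeenByNewThread(PerThreadOperator shared) {
        String[] seen = new String[1];
        Thread t = new Thread(() -> seen[0] = shared.getOperator());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Callers keep using a single shared object and a plain getter, so the classes that access it would not need to know about thread IDs at all.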

I'm not sure if I totally understand the situation, but if you want to ensure that each thread uses a thread-specific instance for a variable, the solution is to use a variable of type ThreadLocal<T>.

Cost effectiveness of ThreadLocal Variable

How does a ThreadLocal variable reduce the cost of creating expensive objects?
For example:
private ThreadLocal<String> myThreadLocal = new ThreadLocal<String>();
In the above line we are creating a ThreadLocal object which will hold a separate object for each thread. But I am not able to understand how it can reduce the cost of creating instances.

Expensive usually means it'll take a while, but it can also mean it'll take a lot of some other resource.
Just as an instance variable is per instance, a ThreadLocal variable is per thread. It's a way to achieve thread safety for expensive-to-create objects.
For example, SimpleDateFormat is not thread-safe, but by keeping it in a ThreadLocal you can use it safely from many threads. Since that class is expensive to create, it is not a good idea to use it in local scope, which requires a separate instance on each invocation.
By providing each thread with its own copy:
1) The number of instances of the expensive object is reduced, because a fixed number of instances (one per thread) is reused.
2) Thread safety is achieved without the cost of synchronization or immutability.
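The "fixed number of instances" point can be made concrete by counting how many formatters actually get built. This is a rough sketch, not a benchmark; FormatterCache, format and formattersCreated are hypothetical names, and the counter exists only for the demonstration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

class FormatterCache {
    // Counts how often the expensive object is constructed (demo only).
    private static final AtomicInteger CREATED = new AtomicInteger();

    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> {
                CREATED.incrementAndGet();
                return new SimpleDateFormat("yyyyMMdd");
            });

    /** Safe to call from any thread: each thread uses its own formatter. */
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    /** Runs callsPerThread format calls on each of `threads` new threads; returns formatters built. */
    static int formattersCreated(int threads, int callsPerThread) {
        int before = CREATED.get();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int c = 0; c < callsPerThread; c++) {
                    format(new Date());
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return CREATED.get() - before;
    }
}
```

With 4 threads making 1000 calls each, formattersCreated(4, 1000) returns 4, whereas creating a formatter inside the method would have built 4000.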

How does a ThreadLocal variable reduce the cost of creating expensive objects?
It does not reduce the cost of creating objects. A single instance of ThreadLocal can store a different value for each thread independently.
The ThreadLocal construct allows us to store data that will be accessible only by a specific thread.
Let’s say that we want to have an Integer value that will be bundled with the specific thread:
ThreadLocal<Integer> threadLocalValue = new ThreadLocal<>();
Next, when we want to use this value from a thread we only need to call a get() or set() method. Simply put, we can think that ThreadLocal stores data inside of a map – with the thread as the key.
Due to that fact, when we call a get() method on the threadLocalValue we will get an Integer value for the requesting thread:
threadLocalValue.set(1);
Integer result = threadLocalValue.get();
For more information, see: When should I use a ThreadLocal variable?

A variable should always be declared in the smallest scope possible, but a ThreadLocal provides a much bigger scope and should be used only for a variable that is needed across many lexical scopes. As per the doc:
These variables differ from their normal counterparts in that each
thread that accesses one (via its get or set method) has its own,
independently initialized copy of the variable. ThreadLocal instances
are typically private static fields in classes that wish to associate
state with a thread (e.g., a user ID or Transaction ID).
So they are used when you have common code and you want to save state on a per-thread basis. An example is provided in the doc:
import java.util.concurrent.atomic.AtomicInteger;
public class ThreadId {
    // Atomic integer containing the next thread ID to be assigned
    private static final AtomicInteger nextId = new AtomicInteger(0);
    // Thread local variable containing each thread's ID
    private static final ThreadLocal<Integer> threadId =
        new ThreadLocal<Integer>() {
            @Override protected Integer initialValue() {
                return nextId.getAndIncrement();
            }
        };
    // Returns the current thread's unique ID, assigning it if necessary
    public static int get() {
        return threadId.get();
    }
}
In the above example, the class ThreadId generates a unique identifier for each thread, and that identifier does not change on subsequent calls. Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection.
How does a ThreadLocal variable reduce the cost of creating expensive
objects?
Until some benchmark supports this claim, I am not sure this is even the case with the latest JVMs.

It doesn't reduce the cost of creating instances. You simply create an instance of ThreadLocal with new ThreadLocal(), and when you call myThreadLocal.set("anyString"), it puts a String instance (which already exists) into the current thread's threadLocals map.

Safe publication example in Java Concurrency in Practice

Java Concurrency in Practice says you can safely publish an effectively immutable object (say, a Date object that you construct and never change again) by sticking it into a synchronized collection like the following (from the book, page 53):
public Map<String, Date> lastLogin =
    Collections.synchronizedMap(new HashMap<String, Date>());
I understand that any Date object put into this map will be visible (at least in its initial but completely constructed state) once placed into this synchronized map, but only once other threads can obtain the reference to this Map object.
Since the reference field lastLogin has none of the properties of fields that guarantee visibility (final, volatile, guarded, or initialized by a static initializer), I think that it's possible the map itself will not show up in a completely constructed state to other threads, therefore putting the cart before the horse. Or am I missing something?

Your suspicion is half right, in that the value of lastLogin is not guaranteed to be visible to other threads. Because lastLogin is not volatile or final, another thread may read it as null.
However, you do not need to worry that other threads will see an incomplete version of the map. Collections.synchronizedMap(...) returns an instance of a private class with final fields. JLS section 17.5 says:
The usage model for final fields is a simple one: Set the final fields for an object in that object's constructor; and do not write a reference to the object being constructed in a place where another thread can see it before the object's constructor is finished. If this is followed, then when the object is seen by another thread, that thread will always see the correctly constructed version of that object's final fields.
SynchronizedMap follows these rules, so another thread reading lastLogin will either read null or a reference to the fully constructed map, never a reference to an incomplete or unsafe version of the map.
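If seeing null is not acceptable, one common remedy is to make the field itself final. The sketch below assumes the map lives in a class of our own (LoginTracker, recordLogin and lastLoginOf are hypothetical names, not from the book) and that `this` does not escape during construction.

```java
import java.util.Collections;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

class LoginTracker {
    // final: a thread that obtains a reference to a fully constructed LoginTracker
    // is guaranteed to see this field initialised (JLS 17.5), never null.
    private final Map<String, Date> lastLogin =
            Collections.synchronizedMap(new HashMap<String, Date>());

    /** Publishes an effectively immutable Date through the synchronized map. */
    void recordLogin(String user, Date when) {
        lastLogin.put(user, when);
    }

    /** Returns the last recorded login, or null if the user never logged in. */
    Date lastLoginOf(String user) {
        return lastLogin.get(user);
    }
}
```

Declaring the field volatile, or initialising it in a static initializer, would be alternative ways to get a visibility guarantee for the reference.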

Is setting this reference thread safe?

I keep getting mixed answers as to whether this code is thread-safe or not. I am working in Java 8.
private final Object lock = new Object();
private volatile Object reference = null;
public Object getOrCompute(Supplier<Object> supplier) {
    if (reference == null) {
        synchronized (lock) {
            if (reference == null) {
                reference = supplier.get();
            }
        }
    }
    return reference;
}
My expectation is that given a new instance of this class, multiple calls to getOrCompute() will only ever result in one supplier being called and the result of that supplier being the result of all calls (and future calls) to getOrCompute().

It is safe because whatever is done in supplier.get() must not be reordered with the assignment to reference. (Or to be more precise, it mustn't appear to be reordered when you do a volatile read of reference.)
The lock provides exclusivity and the volatile write/read semantics provide visibility. Note that this has only been true since Java 5, which was released a long-long time ago, but you'll still find outdated articles on the Internet about how double-checked locking (for that's the official name of this idiom) isn't working. They were right at the time but they are obsolete now.
What can be unsafe though is the supplier itself, if it supplies a mutable object. But that's a different matter.
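One way to convince yourself is to hammer the method from several threads and count supplier invocations. The sketch below is a slight variation on the code in the question, not the asker's exact code: it copies the volatile field into a local variable so the fast path performs a single volatile read. Lazy and supplierCalls are hypothetical names. Note that a supplier returning null would be invoked again on the next call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class Lazy {
    private final Object lock = new Object();
    private volatile Object reference = null;

    Object getOrCompute(Supplier<Object> supplier) {
        Object local = reference;          // single volatile read on the fast path
        if (local == null) {
            synchronized (lock) {
                local = reference;         // re-check under the lock
                if (local == null) {
                    local = supplier.get();
                    reference = local;     // volatile write publishes the result
                }
            }
        }
        return local;
    }

    /** Calls getOrCompute from `threads` threads at once; returns how often the supplier ran. */
    static int supplierCalls(int threads) {
        Lazy lazy = new Lazy();
        AtomicInteger calls = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();         // line all threads up before racing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                lazy.getOrCompute(() -> {
                    calls.incrementAndGet();
                    return new Object();
                });
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return calls.get();
    }
}
```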

Synchronization by itself does not make the object thread-safe. It keeps threads from entering the block at the same time, but it has no control over which thread gets in first or what a thread does with the object once it has gained access to it. Synchronization only limits access to one thread at a time: the thread that gets there first gets access first.
In this case, the only thing it does is prevent several threads from instantiating the object. Once the object has been instantiated, it is handed out to whichever thread wants it with no thread safety whatsoever.
Imagine one thread calls the method and instantiates the object. While it is retrieving the object, another thread calls the method; it is not allowed to instantiate the object because it already exists, so it jumps straight to retrieving it, just like thread number one. Both threads can now modify the object at the same time, ergo, not thread-safe. But the instantiation of the object is thread-safe in the sense that the object can only be instantiated once.

Thread Confinement

I am reading Java Concurrency in Practice and kind of confused with the thread confinement concept. The book says that
When an object is confined to a thread, such usage is automatically thread-safe even if the confined object itself is not
So when an object is confined to a thread, no other thread can have access to it? Is that what it means to be confined to a thread? How does one keep an object confined to a thread?
Edit:
But what if I still want to share the object with another thread? Let's say that after thread A finishes with object O, thread B wants to access O. In this case, can O still be confined to B after A is done with it?
Using a local variable is one example for sure but that just means you don't share your object with other thread (AT ALL). In case of JDBC Connection pool, doesn't it pass one connection from one thread to another once a thread is done with that connection (totally clueless about this because I never used JDBC).

So when an object is confined to a thread, no other thread can have access to it?
No, it's the other way around: if you ensure that no other thread has access to an object, then that object is said to be confined to a single thread.
There's no language- or JVM-level mechanism that confines an object to a single thread. You simply have to ensure that no reference to the object escapes to a place that could be accessed by another thread. There are tools that help avoid leaking references, such as the ThreadLocal class, but nothing that ensures that no reference is leaked anywhere.
For example: if the only reference to an object is from a local variable, then the object is definitely confined to a single thread, as other threads can never access local variables.
Similarly, if the only reference to an object is from another object that has already been proven to be confined to a single thread, then that first object is confined to the same thread.
Ad Edit: In practice you can have an object that's only accessed by a single thread at a time during its lifetime, but for which that single thread changes (a JDBC Connection object from a connection pool is a good example).
Proving that such an object is only ever accessed by a single thread is much harder than proving it for an object that's confined to a single thread during its entire life, however.
And in my opinion those objects are never really "confined to a single thread" (which would imply a strong guarantee), but could be said to "be used by a single thread at a time only".
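The "used by a single thread at a time" pattern can be sketched as a hand-off through a BlockingQueue. This is a hypothetical illustration, not how any particular connection pool is implemented: HandOff and run are made-up names, and a StringBuilder stands in for the non-thread-safe object. The queue's put and take provide the happens-before edge that safely publishes the object to the receiving thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class HandOff {
    /** Thread A builds an object and hands it to thread B; returns what B produced. */
    static String run() {
        BlockingQueue<StringBuilder> queue = new ArrayBlockingQueue<>(1);
        String[] result = new String[1];

        Thread a = new Thread(() -> {
            StringBuilder sb = new StringBuilder("from A");
            try {
                queue.put(sb); // ownership passes to whoever takes it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // By convention, A must not touch sb after the hand-off.
        });

        Thread b = new Thread(() -> {
            try {
                StringBuilder sb = queue.take(); // B is now the only user
                result[0] = sb.append(", then B").toString();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0]; // "from A, then B"
    }
}
```

Nothing in the language enforces the rule that A stops using the object after the hand-off, which is why this is a weaker guarantee than true confinement.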

The most obvious example is use of thread local storage. See the example below:
class SomeClass {
    // This map needs to be thread-safe
    private static final Map<Thread,UnsafeStuff> map = new ConcurrentHashMap<>();
    void calledByMultipleThreads(){
        UnsafeStuff mystuff = map.get(Thread.currentThread());
        if (mystuff == null){
            map.put(Thread.currentThread(),new UnsafeStuff());
            return;
        }else{
            mystuff.modifySomeStuff();
        }
    }
}
The UnsafeStuff objects themselves "could be shared" with other threads, in the sense that if you passed some other thread instead of Thread.currentThread() to the map's get method at runtime, you'd get objects belonging to other threads. But you are choosing not to. This is "usage that is confined to a thread". In other words, the runtime conditions are such that the objects are in effect never shared between different threads.
On the other hand, in the example below the object is automatically confined to a thread, and so to say, the "object itself" is confined to the thread. This is in the sense that it is impossible to obtain reference from other threads no matter what the runtime condition is:
class SomeClass {
    void calledByMultipleThreads(){
        UnsafeStuff mystuff = new UnsafeStuff();
        mystuff.modifySomeStuff();
        System.out.println(mystuff.toString());
    }
}
Here, the UnsafeStuff is allocated within the method and goes out of scope when the method returns. In other words, the Java spec ensures statically that the object is always confined to one thread. So it is not the runtime condition or the way you use it that ensures the confinement, but rather the Java spec.
In fact, because the reference never escapes the method, a modern JIT compiler may use escape analysis to avoid a heap allocation for such objects, which it cannot do in the first example (I haven't personally verified this).
Put another way, in the first example the JVM can't be sure that the object is confined to a thread just by looking inside calledByMultipleThreads() (who knows what other methods are messing with SomeClass.map). In the latter example, it can.
Edit: But what if I still want to
share the object with another thread?
Let's say that after thread A finishes
with object O, thread B wants to
access O. In this case, can O still be
confined to B after A is done with it?
I don't think it is called "confined" in this case. When you do this, you are just ensuring that an object is not accessed concurrently. This is how EJB concurrency works. You still have to "safely publish" the shared object in question to the threads.

So when an object is confined to a thread, no other thread can have access to it?
That's what thread confinement means - the object can only EVER be accessed by one thread.
Is that what it means to be confined to a thread?
See above.
How does one keep an object confined to a thread?
The general principle is to not put the reference somewhere that would allow another thread to see it. It is a little bit complicated to enumerate a set of rules that will ensure this, but (for instance) if
you create a new object, and
you never assign the object's reference to an instance or class variable, and
you never call a method that does this for the reference,
then the object will be thread confined.

I guess that's what the book wants to say. It is like creating an object inside the run method and not passing the reference to any other instance.
Simple example:
public String s;
public void run() {
    StringBuilder sb = new StringBuilder();
    sb.append("Hello ").append("world");
    s = sb.toString();
}
The StringBuilder instance is used in a thread-safe way because it is confined to the thread that executes this run method.

One way is "stack confinement" in which the object is a local variable confined to the thread's stack, so no other thread can access it. In the method below, the list is a local variable and doesn't escape from the method. The list doesn't have to be threadsafe because it is confined to the executing thread's stack. No other thread can modify it.
public String foo(Item i, Item j){
    List<Item> list = new ArrayList<Item>();
    list.add(i);
    list.add(j);
    return list.toString();
}
Another way of confining an object to a thread is through the use of a ThreadLocal variable which allows each thread to have its own copy. In the example below, each thread will have its own DateFormat object and so you don't need to worry about the fact that DateFormat is not thread-safe because it won't be accessed by multiple threads.
private static final ThreadLocal<DateFormat> df
    = new ThreadLocal<DateFormat>(){
        @Override
        protected DateFormat initialValue() {
            return new SimpleDateFormat("yyyyMMdd");
        }
    };
Further Reading

See: http://codeidol.com/java/java-concurrency/Sharing-Objects/Thread-Confinement/
A more formal means of maintaining
thread confinement is ThreadLocal,
which allows you to associate a
per-thread value with a value-holding
object. ThreadLocal provides get and
set accessor methods that maintain a
separate copy of the value for each
thread that uses it, so a get returns
the most recent value passed to set
from the currently executing thread.
It holds one copy of the object per thread. Thread A can't access thread B's copy and break its invariants unless you deliberately make that possible (for example, by assigning the ThreadLocal value to a static variable or exposing it through other methods).

That's exactly what it means. The object itself is accessed by only one thread, and is thus thread-safe. ThreadLocal objects hold values that are each bound to a single thread.

It means that only code running in one thread accesses the object.
When this is the case, the object doesn't need to be "thread safe".
