pure thread-safe Java class

pure thread-safe Java class - java

I am recently reading the book "Java Concurrency in Practice". One example of "safe publish" it gives is to initialize a private int field n during construction, and a later assertion on that field n == "expect value" through a public method could still be failed if it is called from another thread. This makes me feel worried in that, assuming all private fields are initialized only once, do we still have to mark them as volatile or wrap them into ThreadLocal or even use an AtomicReference to get a pure thread safe java class, since these private fields, though not visible outside, definitely could be referenced by the method(s) called by other threads.
EDIT: Just to be clear - the assertion is failed because the calling thread sees a stale value of n, even though it has been set during construction. This is a clearly a memory-visibility issue. Problem is whether synchronizing on n is worthy of the overhead after all, since it is only initialized once, and as a private field, author can make sure it won't be changed again.

This specific case is properly documented in the JSR 133: Java Memory Model and Thread
Specification. It even has a dedicated code sample page 14 section 3.5 Final Fields that exactly match your question.
To summarize:
A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object’s final fields.
There is no guarantee for non final fields
It means that you have to make sure that an happens-before occurs between your object creation in a thread and its usage in another thread. You can use synchronized, volatile or any other mean to enforce an happens-before.
Since you say in another comment that the field is only set during construction, I would make it... final. Also, such shared objects between threads could suggest some design smell; I would review my design to make sure that I am not creating an overly complex, tightly coupled, hard to debug system.

If the fields are never used outside the class, wrapping their usages with synchronized blocks or inside synchronized functions, two threads won't concurrently modify these fields.
volatile keyword is just a part of thread safety. It only makes the value of a field never be cached, always read from memory. Take this example.
private int myPrivateField = 0;
void someFunction() {
while(myPrivateField ==0) {
}
}
void otherFunction() {
myPrivateField = 1;
}
If someFunction() is called from one thread and it's running for a while,
when you call otherFunction() the value of myPrivateField will not be
"updated" inside someFunction, it was cached to 0 as an otimization.
Making myPrivateField as volatile, the value will always be the one
in memory.
For the example, there won't be much difference for the functions be
synchronized, but without synchronization, you can read a value in an
inconsistent state.

Only final fields are guaranteed to be visible after constructor. Any other field requires some visibility mechanism, such as synchronized or volatile.
This is not as much of a burden as it seems: if the field is not final, then it can be changed by another thread while you're reading it. If teh field can be changed, then the last assigned value must be propagated from the writer thread to the reader thread, whether the writer thread called the constructor or a setter.
If the change in this field is not related to any other field in the class, then the field should be volatile. If the field is related to other fields in the class, then use synchronized or other, more modern locking primitives.

Not answering the entire question, but it should be pointed out that using ThreadLocal is exactly the wrong thing to use if you want to ensure visibility of updated values in all threads. Consider the following code:
class Test {
private static final ThreadLocal<Integer> value = new ThreadLocal<>();
public static void main(String[] args) throws InterruptedException {
System.out.println("From main Thread, value is " + value.get());
value.set(42);
System.out.println("Value has been changed");
Thread t = new Thread() {
public void run() {
System.out.println("From other Thread, value is " + value.get());
}
};
t.start();
t.join();
System.out.println("From main Thread, value is " + value.get());
}
}
This will output the following:
From main Thread, value is null
Value has been changed
From other Thread, value is null
From main Thread, value is 42
i.e. the other thread doesn't see the updated value. This is because changes to the value of a ThreadLocal is, by definition, localized to the thread which changes it.
My personal preference would be to use AtomicReference, since this avoids the risk of forgetting to synchronize externally; it also allows things like atomic compare-and-set, which you don't get with a volatile variable. However, this may not be a requirement for your particular application.

Related

Do i need a lock at all if only 1 thread updates the value? [duplicate]

public class Test{
private MyObj myobj = new MyObj(); //it is not volatile
public class Updater extends Thred{
myobje = getNewObjFromDb() ; //not am setting new object
}
public MyObj getData(){
//getting stale date is fine for
return myobj;
}
}
Updated regularly updates myobj
Other classes fetch data using getData
IS this code thread safe without using volatile keyword?
I think yes. Can someone confirm?

No, this is not thread safe. (What makes you think it is?)
If you are updating a variable in one thread and reading it from another, you must establish a happens-before relationship between the write and the subsequent read.
In short, this basically means making both the read and write synchronized (on the same monitor), or making the reference volatile.
Without that, there are no guarantees that the reading thread will see the update - and it wouldn't even be as simple as "well, it would either see the old value or the new value". Your reader threads could see some very odd behaviour with the data corruption that would ensue. Look at how lack of synchronization can cause infinite loops, for example (the comments to that article, especially Brian Goetz', are well worth reading):
The moral of the story: whenever mutable data is shared across threads, if you don’t use synchronization properly (which means using a common lock to guard every access to the shared variables, read or write), your program is broken, and broken in ways you probably can’t even enumerate.

No, it isn't.
Without volatile, calling getData() from a different thread may return a stale cached value.
volatile forces assignments from one thread to be visible on all other threads immediately.
Note that if the object itself is not immutable, you are likely to have other problems.

You may get a stale reference. You may not get an invalid reference.
The reference you get is the value of the reference to an object that the variable points to or pointed to or will point to.
Note that there are no guarantees how much stale the reference may be, but it's still a reference to some object and that object still exists. In other words, writing a reference is atomic (nothing can happen during the write) but not synchronized (it is subject to instruction reordering, thread-local cache et al.).
If you declare the reference as volatile, you create a synchronization point around the variable. Simply speaking, that means that all cache of the accessing thread is flushed (writes are written and reads are forgotten).
The only types that don't get atomic reads/writes are long and double because they are larger than 32-bits on 32-bit machines.

If MyObj is immutable (all fields are final), you don't need volatile.

The big problem with this sort of code is the lazy initialization. Without volatile or synchronized keywords, you could assign a new value to myobj that had not been fully initialized. The Java memory model allows for part of an object construction to be executed after the object constructor has returned. This re-ordering of memory operations is why the memory-barrier is so critical in multi-threaded situations.
Without a memory-barrier limitation, there is no happens-before guarantee so you do not know if the MyObj has been fully constructed. This means that another thread could be using a partially initialized object with unexpected results.
Here are some more details around constructor synchronization:
Constructor synchronization in Java

Volatile would work for boolean variables but not for references. Myobj seems to perform like a cached object it could work with an AtomicReference. Since your code extracts the value from the DB I'll let the code stay as is and add the AtomicReference to it.
import java.util.concurrent.atomic.AtomicReference;
public class AtomicReferenceTest {
private AtomicReference<MyObj> myobj = new AtomicReference<MyObj>();
public class Updater extends Thread {
public void run() {
MyObj newMyobj = getNewObjFromDb();
updateMyObj(newMyobj);
}
public void updateMyObj(MyObj newMyobj) {
myobj.compareAndSet(myobj.get(), newMyobj);
}
}
public MyObj getData() {
return myobj.get();
}
}
class MyObj {
}

How to make my code thread-safe when my shared variable can change anytime?

Here is a question that has been asked many times, I have double-checked numerous issues that have been raised formerly but none gave me an answer element so I thought I would put it here.
The question is about making my code thread-safe in java knowing that there is only one shared variable but it can change anytime and actually I have the feeling that the code I am optimizing has not been thought for a multi-threading environment, so I might have to think it over...
Basically, I have one class which can be shared between, say, 5 threads. This class has a private property 'myProperty' which can take 5 different values (one for each thread). The problem is that, once it's instantiated by the constructor, that value should not be changed anymore for the rest of the thread's life.
I am pretty well aware of some techniques used to turn most of pieces of code "thead-safe" including locks, the "synchronized" keyword, volatile variables and atomic types but I have the feeling that these won't help in the current situation as they do not prevent the variable from being modified.
Here is the code :
// The thread that calls for the class containing the shared variable //
public class myThread implements Runnable {
#Autowired
private Shared myProperty;
//some code
}
// The class containing the shared variable //
public class Shared {
private String operator;
private Lock lock = new ReentrantLock();
public void inititiate(){
this.lock.lock()
try{
this.operator.initiate() // Gets a different value depending on the calling thread
} finally {
this.lock.unlock();
}
}
// some code
}
As it happens, the above code only guarantees that two threads won't change the variable at the same time, but the latter will still change. A "naive" workaround would consist in creating a table (operatorList) for instance (or a list, a map, etc. ) associating an operator with its calling thread's ID, this way each thread would just have to access its operator using its id in the table but doing this would make us change all the thread classes which access the shared variable and there are many. Any idea as to how I could store the different operator string values in an exclusive manner for each calling thread with minimal changes (without using magic) ?

I'm not 100% sure I understood your question correctly, but I'll give it a shot anyway. Correct me if I'm wrong.
A "naive" workaround would consist in creating a table (operatorList)
for instance (or a list, a map, etc. ) associating an operator with
its calling thread's ID, this way each thread would just have to
access its operator using its id in the table but doing this would
make us change all the thread classes which access the shared variable
and there are many.
There's already something similar in Java - the ThreadLocal class?
You can create a thread-local copy of any object:
private static final ThreadLocal<MyObject> operator =
new ThreadLocal<MyObject>() {
#Override
protected MyObject initialValue() {
// return thread-local copy of the "MyObject"
}
};
Later in your code, when a specific thread needs to get its own local copy, all it needs to do is: operator.get(). In reality, the implementation of ThreadLocal is similar to what you've described - a Map of ThreadLocal values for each Thread. Only the Map is not static, and is actually tied to the specific thread. This way, when a thread dies, it takes its ThreadLocal variables with it.

I'm not sure if I totally understand the situation, but if you want to ensure that each thread uses a thread-specific instance for a variable, the solution is use a variable of type ThreadLocal<T>.

How a thread can see stale reference of safely initialized object

I have been trying to figure out that how immutable objects which are safely published could be observed with stale reference.
public final class Helper {
private final int n;
public Helper(int n) {
this.n = n;
}
}
class Foo {
private Helper helper;
public Helper getHelper() {
return helper;
}
public void setHelper(int num) {
helper = new Helper(num);
}
}
So far I could understand that Helper is immutable and can be safely published. A reading thread either reads null or fully initialized Helper object as it won't be available until fully constructed. The solution is to put volatile in Foo class which I don't understand.

The fact that you are publishing a reference to an immutable object is irrelevant here.
If you are reading the value of a reference from multiple threads, you need to ensure that the write happens before a read if you care about all threads using the most up-to-date value.
Happens before is a precisely-defined term in the language spec, specifically the part about the Java Memory Model, which allows threads to make optimisations for example by not always updating things in main memory (which is slow), instead holding them in their local cache (which is much faster, but can lead to threads holding different values for the "same" variable). Happens-before is a relation that helps you to reason about how multiple threads interact when using these optimisations.
Unless you actually create a happens-before relationship, there is no guarantee that you will see the most recent value. In the code you have shown, there is no such relationship between writes and reads of helper, so your threads are not guaranteed to see "new" values of helper. They might, but they likely won't.
The easiest way to make sure that the write happens before the read would be to make the helper member variable final: the writes to values of final fields are guaranteed to happen before the end of the constructor, so all threads always see the correct value of the field (provided this wasn't leaked in the constructor).
Making it final isn't an option here, apparently, because you have a setter. So you have to employ some other mechanism.
Taking the code at face value, the simplest option would be to use a (final) AtomicInteger instead of the Helper class: writes to AtomicInteger are guaranteed to happen before subsequent reads. But I guess your actual helper class is probably more complicated.
So, you have to create that happens-before relationship yourself. Three mechanisms for this are:
Using AtomicReference<Helper>: this has similar semantics to AtomicInteger, but allows you to store a reference-typed value. (Thanks for pointing this out, #Thilo).
Making the field volatile: this guarantees visibility of the most recently-written value, because it causes writes to flush to main memory (as opposed to reading from a thread's cache), and reads to read from main memory. It effectively stops the JVM making this particular optimization.
Accessing the field in a synchronized block. The easiest thing to do would be to make the getter and setter methods synchronized. Significantly, you should not synchronize on helper, since this field is being changed.

Cite from Volatile vs Static in Java
This means that if two threads update a variable of the same Object concurrently, and the variable is not declared volatile, there could be a case in which one of the thread has in cache an old value.
Given your code, the following can happen:
Thread 1 calls getHelper() and gets null
Thread 2 calls getHelper() and gets null
Thread 1 calls setHelper(42)
Thread 2 calls setHelper(24)
And in this case your trouble starts regarding which Helper object will be used in which thread. The keyword volatile will at least solve the caching problem.

The variable helper is being read by multiple threads simultaneously. At the least, you have to make it volatile or the compiler will begin caching it in registers local to threads and any updates to the variable may not reflect in the main memory. Using volatile, when a thread starts reading a shared variable, it will clear its cache and fetch a fresh value from the global memory. When it finishes reading it, it will flush the contents of its cache into the main memory so that other threads may get the updated value.

Thread safety of final field

Let's say I have a JavaBean User that's updated from another thread like this:
public class A {
private final User user;
public A(User user) {
this.user = user;
}
public void aMethod() {
Thread thread = new Thread(new Runnable() {
#Override
public void run() {
...a long running task..
user.setSomething(something);
}
});
t.start();
t.join();
}
public void anotherMethod() {
GUIHandler.showOnGuiSomehow(user);
}
}
Is this code thread safe? I mean, when the thread that created A instance and called A.aMethod reads user fields, does it see user in the fresh state? How to do it in appropriate thread safe manner?
Note that I can't modify the user class and I don't know if it's thread safe itself.

Is this code thread safe? ... does it see user in the fresh state?
Not especially - the fact that user is final in your code makes almost no difference to thread safety other than the fact that it cannot be replaced.
The bit that should change is the instance variable that is set by setSomething. It should be marked as volatile.
class User {
// Marked `volatile` to ensure all writes are visible to other threads.
volatile String something;
public void setSomething(String something) {
this.something = something;
}
}
If however (as you suggest) you do not have access to the User class, you must then perform a synchronization that creates a memory barrier. In its simplest form you could surround your access to the user with a synchronized access.
synchronized (user) {
user.setSomething(something);
}
Added :- It turns out (see here) that this can actually be done like this:
volatile int barrier = 0;
...
user.setSomething(something);
// Forces **all** cached variable to be flushed.
barrier += 1;

marking field as final just means that reference cannot be changed. It means nothing about thread safity of class User. If methods of this class that access fields are synchronized (or use other synchronization technique) it is thread safe. Otherwise it is not.

final only makes the reference not re-assignable, but if the reference points to a mutable class, you can still alter the state inside that object, which causes thead-unsafe.
Your code is only thread safe if the User class is immutable, I.e. all properties of User cannot be altered outside the object, all references in the class point to other immutable class.
If it is not case, then you have to properly synchronize its methods to make it thread safe.

Note that I can't modify the user class and I don't know if it's thread safe itself.
You have to synchronize your access when accessing the User object.
You can for example use the User object to synchronize, so just wrap every access on the user object with something like:
synchronized(user) {
// access some method of the user object
}
That assumes that the user object is only accessed in your threads asynchronously. Also keep the synchronized blocks short.
You could also build a threadsafe wrapper around the user object. I would suggest that if you have a lot of different calls, the code gets cleaner and better to read that way.
good luck!

Concerning threading, finalfields are just guaranteed to be consistent in case of constructor escape, since the JSR-133 about Memory Barrier mechanism:
The values for an object's final fields are set in its constructor.
Assuming the object is constructed "correctly", once an object is
constructed, the values assigned to the final fields in the
constructor will be visible to all other threads without
synchronization. In addition, the visible values for any other object
or array referenced by those final fields will be at least as
up-to-date as the final fields. What does it mean for an object to be
properly constructed? It simply means that no reference to the object
being constructed is allowed to "escape" during construction. (See
Safe Construction Techniques for examples.) In other words, do not
place a reference to the object being constructed anywhere where
another thread might be able to see it; do not assign it to a static
field, do not register it as a listener with any other object, and so
on. These tasks should be done after the constructor completes, not in
the constructor.
However, nothing ensures automatic thread-safety about any final fields in the remaining object's life (meaning after wrapping class's constructor execution).. Indeed, immutability in Java is a pure misnomer:
Now, in common parlance, immutability means "does not change".
Immutability doesn't mean "does not change" in Java. It means "is
transitively reachable from a final field, has not changed since the
final field was set, and a reference to the object containing the
final field did not escape the constructor".

Yes, this is safe. See
Java Language Specification (Java 8) Chapter 17.4.4:
The final action in a thread T1 synchronizes-with any action in another thread T2 that detects that T1 has terminated.
T2 may accomplish this by calling T1.isAlive() or T1.join().
Put this together with 17.4.5. Happens-before Order:
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second. [..] If an action x synchronizes-with a following action y, then we also have hb(x, y).
So after you call t.join(); in your code you will see the updated changes. Since "the thread that created A instance and called A.aMethod" can impossibly read the value after calling aMethod and before t.join is called (because it is busy with method aMethod), this is safe.

How can I use the volatile keyword in Java correctly?

Say I have two threads and an object. One thread assigns the object:
public void assign(MyObject o) {
myObject = o;
}
Another thread uses the object:
public void use() {
myObject.use();
}
Does the variable myObject have to be declared as volatile? I am trying to understand when to use volatile and when not, and this is puzzling me. Is it possible that the second thread keeps a reference to an old object in its local memory cache? If not, why not?
Thanks a lot.

I am trying to understand when to use
volatile and when not
You should mostly avoid using it. Use an AtomicReference instead (or another atomic class where appropriate). The memory effects are the same and the intent is much clearer.
I highly suggest reading the excellent Java Concurrency in Practice for a better understanding.

Leaving the complicated technical details behind, you can see volatile less or more as a synchronized modifier for variables. When you'd like to synchronize access to methods or blocks, then you'd usually like to use the synchronized modifier as follows:
public synchronized void doSomething() {}
If you'd like to "synchronize" access to variables, then you'd like to use the volatile modifier:
private volatile SomeObject variable;
Behind the scenes they do different things, but the effect is the same: the changes are immediately visible for the next accessing thread.
In your specific case, I don't think that the volatile modifier has any value. The volatile does not guarantee in any way that the thread assigning the object will run before the thread using the object. It can be as good the other way round. You probably just want to do a nullcheck in use() method first.
Update: also see this article:
Access to the variable acts as though it is enclosed in a synchronized block, synchronized on itself. We say "acts as though" in the second point, because to the programmer at least (and probably in most JVM implementations) there is no actual lock object involved.

Declaring a volatile Java variable means:
The value of this variable will never be cached thread-locally
Access to the variable acts as though it is enclosed in a synchronized block
The typical and most common use of volatile is :
public class StoppableThread extends Thread {
private volatile boolean stop = false;
public void run() {
while (!stop) {
// do work
}
}
public void stopWork() {
stop = true;
}
}

You can use volatile in this case. You will require volatile, synchronization around the access to the variable or some similar mechanism (like AtomicReference) to guarantee that changes made on the assignment thread are actually visible to the reading thread.

I have spent quite a lot of time trying to understanding the volatile keyword.
I think #aleroot has given the best and simplest example in the world.
This is in turn my explanation for dummies (like me :-)):
Scenario1: Assuming the stop is not declared as volatile then
a given thread does and 'thinks' the following:
stopWork() is called: I have to set the stop to true
Great, I did it in my local stack now I have to update the main heap of JVM.
Oops, JVM tells me to give a way in CPU to another thread, I have to stop for a while...
OK, I am back. Now I can update the main heap with my value. Updating ...
Scenario2: Now let the stop be declared as volatile:
stopWork() is called: I have to set the stop to true
Great, I did it in my local stack now I have to update the main heap of JVM.
Sorry guys, I have to do (2) NOW - I am told it is volatile. I have to occupy CPU a bit longer...
Updating the main heap ...
OK, I am done. Now I can yield.
No synchronization, just a simple idea...
Why not to declare all variables volatile just in case? Because of Scenario2/Step3. It is a bit inefficient but still better than regular synchronization.

There are some confusing comments here: to clarify, your code is incorrect as it stands, assuming two different threads call assign() and use().
In the absence of volatile, or another happens-before relationship (for example, synchronization on a common lock) any write to myObject in assign() is not guaranteed to be seen by the thread calling use() -- not immediately, not in a timely fashion, and indeed not ever.
Yes, volatile is one way of correcting this (assuming this is incorrect behaviour -- there are plausible situations where you don't care about this!).
You are exactly correct that the 'use' thread can see any 'cached' value of myObject, including the one it was assigned at construction time and any intermediate value (again in the absence of other happens-before points).

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.