Java 'final' instance variable - visibility and propagation of variable's internal state

While reading the documentation related to Threads and Locks, a sentence describing the final keyword caught my attention:
Correspondingly, compilers are allowed to keep the value of a final field cached in a register and not reload it from memory in situations where a non-final field would have to be reloaded.
Does this mean that if I declare a final Object object = ... as an instance variable and then access it (modify its inner state, object.state) from anonymous inner classes (multiple instances of Runnable), the value of object.state could actually be read from / written to a CPU cache, and so could be out of sync across those Runnable instances?
So if I want to be sure that the value of object.state is properly propagated across all the threads, do I have to declare the object as volatile Object object instead?
Thank you.
Edit: I have edited my original question. I now know I misunderstood the documentation, so the answer to my last question is NO - volatile/final Object object has no effect on object.state; it depends on how object.state is declared, initialized and/or accessed.
Thanks to @Burak Serdar for the answer!

When you declare a final Object object=..., the final value is the reference to the object, not the internal state of the object. It means that nothing can modify object; it does not say that nothing can modify, say, object.value. So the variable object can be cached, which does not mean the internal state of object can be cached.
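A minimal sketch of that distinction (SharedState and Example are illustrative names, not from the question):

class SharedState {
    int value;                              // plain, mutable field: not covered by final
}

class Example {
    private final SharedState object = new SharedState();  // final pins only the reference

    Runnable writer() {
        return () -> object.value = 42;     // allowed: mutates the referenced object
    }

    Runnable reader() {
        // object is always the same instance, but without volatile or a lock
        // around value, this read may still observe a stale 0
        return () -> System.out.println(object.value);
    }
}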

Final will not help you with threading issues the way you hope, sorry.
Volatile may help in some situations, but you may need a lock.
From the Oracle tutorials:
- Reads and writes are atomic for reference variables and for most primitive variables (all types except long and double).
- Reads and writes are atomic for all variables declared volatile (including long and double variables).
That means your object will not necessarily stay up to date between separate reads and writes, but each individual read or write will be consistent.
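As an illustration of that limit, a sketch of a compound operation that is still racy even though each individual read and write is atomic (Counter is an illustrative name):

class Counter {
    private volatile int count;   // each single read and each single write is atomic

    void increment() {
        count++;                  // but this is read + add + write: two threads can
                                  // interleave here and lose an update
    }
}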

Related

Can using volatile with AtomicInteger guarantee thread safety? [duplicate]

Suppose I have
private volatile AtomicInteger atomInt = new AtomicInteger(3);
and in methods my usage is atomInt.incrementAndGet().
Since I am using AtomicInteger, it will avoid "thread interference". And since I am also using volatile, it will guarantee a consistent view of the variable across all threads. Does that mean I have complete thread safety, or are there still chances of "memory consistency issues"?
I got confused by the use of "reduces" in the tutorial; it suggests to me that there are still chances, but I cannot think of one:
Using volatile variables reduces the risk of memory consistency
errors, because any write to a volatile variable establishes a
happens-before relationship with subsequent reads of that same
variable.
And then I am using volatile, so it will guarantee the consistent view of the variable across all threads.
Thread-safety is already guaranteed by atomic variables. volatile is redundant if you won't reassign the variable. You can replace volatile with final here:
private final AtomicInteger atomInt = new AtomicInteger(3);
Does it mean that I have got complete thread safety or still there are chances of "memory consistency issues"?
At this moment, it's absolutely thread-safe. No "memory consistency issues" can happen with the variable. But using proper thread-safe components doesn't mean that the whole class/program is thread-safe. Problems can still arise if the interactions between them are incorrect.
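A sketch of how correct components can still compose into an incorrect whole (TicketCounter and its methods are illustrative, not from the question):

import java.util.concurrent.atomic.AtomicInteger;

class TicketCounter {
    private final AtomicInteger ticketsLeft = new AtomicInteger(3);

    // Broken composition: get() and decrementAndGet() are each atomic, but the
    // check-then-act pair is not; two threads can both pass the check and
    // oversell the last ticket.
    boolean sellBroken() {
        if (ticketsLeft.get() > 0) {
            ticketsLeft.decrementAndGet();
            return true;
        }
        return false;
    }

    // One correct alternative: make the check and the update a single atomic step.
    boolean sell() {
        while (true) {
            int current = ticketsLeft.get();
            if (current == 0) {
                return false;
            }
            if (ticketsLeft.compareAndSet(current, current - 1)) {
                return true;
            }
        }
    }
}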
Using volatile variables reduces the risk of memory consistency errors ...
volatile variables can only guarantee visibility. They don't guarantee atomicity.
As Brian Goetz writes (emphasis mine):
volatile variables are convenient, but they have limitations. The most common use for volatile variables is as a completion, interruption, or status flag. Volatile variables can be used for other kinds of state information, but more care is required when attempting this. For example, the semantics of volatile are not strong enough to make the increment operation (count++) atomic, unless you can guarantee that the variable is written only from a single thread.
You can use volatile variables only when all the following criteria are met:
Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
The variable does not participate in invariants with other state variables;
Locking is not required for any other reason while the variable is being accessed.
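The canonical case that meets all three criteria is a completion/shutdown flag; a minimal sketch:

class Worker implements Runnable {
    // Writes do not depend on the current value, the flag is part of no invariant
    // with other state, and no locking is needed around its accesses.
    private volatile boolean shutdownRequested;

    void shutdown() {
        shutdownRequested = true;       // written by one thread...
    }

    @Override
    public void run() {
        while (!shutdownRequested) {    // ...reliably observed by the worker thread
            // do one unit of work
        }
    }
}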
From the docs of the java.util.concurrent.atomic package:
get has the memory effects of reading a volatile variable.
set has the memory effects of writing (assigning) a volatile variable.
Volatile does mean that changes to the variable will be visible. But in this case you shouldn’t be changing the reference held by the variable.
It seems very odd that you’d want to make a reference to an Atomic object volatile. The whole point of the AtomicInteger class is to provide a way to access and change an integer value safely. The only reason to make a variable volatile is that you intend to overwrite its value. Why overwrite the reference to the AtomicInteger when you can use its instance methods to update its value?
That’s why you are getting advice to make this variable final instead of volatile. Making the variable final nails down the reference so it can’t change, while making sure the reference contained by that variable is visible. The AtomicInteger manages its own state in a thread-safe way, so you shouldn’t have to overwrite the reference to it.
So it’s not exactly correct to say volatile is redundant here. But it is doing something that typically shouldn’t have to be done. Use volatile when you have to change the value contained in the variable. Use final when you shouldn’t be changing the value contained in the variable.

Java final keyword semantics with respect to cache

What is the behavior of Java final keyword with respect of caches?
Quote from the JSR-133 FAQ:
The values for an object's final fields are set in its constructor.
Assuming the object is constructed "correctly", once an object is
constructed, the values assigned to the final fields in the
constructor will be visible to all other threads without
synchronization. In addition, the visible values for any other object
or array referenced by those final fields will be at least as
up-to-date as the final fields.
I don't understand what it refers to when it says "as up-to-date as the final fields":
In addition, the visible values for any other object or array
referenced by those final fields will be at least as up-to-date as the
final fields.
My guess is, for example:
public class CC {
    private final Mutable mutable; // final field
    private Other other;           // non-final field

    public CC(Mutable m, Other o) {
        mutable = m;
        other = o;
    }
}
When the constructor of CC returns, besides the pointer value of mutable, all values in the object graph rooted at m that exist in the local processor cache will be flushed to main memory, and at the same time the corresponding cache lines in other processors' local caches will be marked Invalid.
Is that the case? What does it look like in assembly? How do they actually implement it?
Is that the case?
The actual guarantee is that any thread that can see an instance of CC created using that constructor is guaranteed to see the mutable reference and also the state of the Mutable object's fields as of the time that the constructor completed.
It does not guarantee that the state of all values in the closure of the Mutable instance will be visible. However, any writes (in the closure or not) made by the thread that executed the constructor prior to the constructor completing will be visible. (By "happens-before" analysis.)
Note that the behavior is specified in terms what one thread is guaranteed to see, not in terms of cache flushing / invalidation. The latter is a way of implementing the behavior that the specification requires. There may be other ways.
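A sketch of what the guarantee does and does not cover, reusing the question's CC and Mutable names (the value field and read() method are illustrative):

class Mutable {
    int value;
}

class CC {
    private final Mutable mutable;

    CC(Mutable m) {
        m.value = 7;      // write made before the constructor completes
        mutable = m;      // assignment to the final field
    }

    int read() {
        // Any thread that sees this CC instance (constructed without letting
        // 'this' escape) is guaranteed to see mutable != null and value == 7.
        // Writes to mutable.value made after construction are NOT covered by
        // the final-field guarantee.
        return mutable.value;
    }
}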
What does it look like in assembly?
That will be version / platform / etc specific. There is a way to get the JIT compiler to dump out the compiled code, if you want to investigate what the native code looks like for your hardware.
How to see JIT-compiled code in JVM?
How do they actually implement it?
See above.

Can a volatile variable that is never assigned null ever contain null?

Can in the following conceptual Java example:
public class X implements Runnable {
    public volatile Object x = new Object();

    @Override
    public void run() {
        for (;;) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                return;
            }
            x = new Object();
        }
    }
}
x ever be read as null from another thread?
Bonus: Do I need to declare it volatile (I do not really care about that value, it suffices that sometime in the future it will be the newly assigned value and never is null)
Technically, yes it can. That is the main reason for the original ConcurrentHashMap's readUnderLock. The javadoc even explains how:
Reads value field of an entry under lock. Called if value field ever appears to be null. This is possible only if a compiler happens to reorder a HashEntry initialization with its table assignment, which is legal under memory model but is not known to ever occur.
Since HashEntry's value is volatile, this type of reordering is legal during construction.
Moral of the story is that all non-final initializations can race with object construction.
Edit:
@Nathan Hughes asked a valid question:
@John: in the OP's example, wouldn't the construction have happened before the thread the Runnable is passed into started? It would seem like that would impose a happens-before barrier subsequent to the field's initialization.
Doug Lea had a couple of comments on this topic; the entire thread can be read here. He answered the comment:
But the issue is whether assignment of the new C instance to some other memory must occur after the volatile stores.
With the answer
Sorry for mis-remembering why I had treated this issue as basically settled:
Unless a JVM always pre-zeros memory (which is usually not a good option), then
even if not explicitly initialized, volatile fields must be zeroed
in the constructor body, with a release fence before publication.
And so even though there are cases in which the JMM does not
strictly require mechanics preventing publication reordering
in constructors of classes with volatile fields, the only good
implementation choices for JVMs are either to use non-volatile writes
with a trailing release fence, or to perform each volatile write
with full fencing. Either way, there is no reordering with publication.
Unfortunately, programmers cannot rely on a spec to guarantee
it, at least until the JMM is revised.
And finished with:
Programmers do not expect that even though final fields are specifically
publication-safe, volatile fields are not always so.
For various implementation reasons, JVMs arrange that
volatile fields are publication safe anyway, at least in
cases we know about.
Actually updating the JMM/JLS to mandate this is not easy
(no small tweak that I know applies). But now is a good time
to be considering a full revision for JDK9.
In the mean time, it would make sense to further test
and validate JVMs as meeting this likely future spec.
This depends on how the X instance is published.
Suppose the X instance is published unsafely, e.g. through a non-volatile field:
private X instance;
...
void someMethod() {
    instance = new X();
}
Another thread accessing the instance field is allowed to see a reference value referring to an uninitialized X object (i.e. where its constructor hasn't run yet). In such a case, its field x would have a value of null.
The above example translates to
temporaryReferenceOnStack = new memory for X // a reference to the instance
temporaryReferenceOnStack.<init> // call constructor
instance = temporaryReferenceOnStack;
But the language allows the following reordering
temporaryReferenceOnStack = new memory for X // a reference to the instance
instance = temporaryReferenceOnStack;
temporaryReferenceOnStack.<init> // call constructor
or directly
instance = new memory for X // a reference to the instance
instance.<init> // call constructor
In such a case, a thread is allowed to see the value of instance before the constructor is invoked to initialize the referenced object.
Now, how likely is this to happen in current JVMs? Eh, I couldn't come up with an MCVE.
Bonus: Do I need to declare it volatile (I do not really care about
that value, it suffices that sometime in the future it will be the
newly assigned value and never is null)
Publish the enclosing object safely. Or use a final AtomicReference field which you set.
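A sketch of the second option, a final AtomicReference field that you set (Container is an illustrative name):

import java.util.concurrent.atomic.AtomicReference;

class Container {
    // The final field is safely published even if Container itself is published
    // unsafely, and the AtomicReference gives volatile semantics to every later
    // set/get of the payload.
    private final AtomicReference<Object> x = new AtomicReference<>(new Object());

    void refresh() {
        x.set(new Object());
    }

    Object current() {
        return x.get();   // never null once the constructor has finished
    }
}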
No. The Java memory model guarantees that you will never see x as null. x must always be the initial value it was assigned, or some subsequent value.
This actually works with any variable, not just volatile. What you are asking about is called "out of thin air" values. Cf. Java Concurrency in Practice, which talks about this concept at some length.
The other part of your question, "Do I need to declare x as volatile": given the context, yes, it should be either volatile or final. Either one provides safe publication for the object referenced by x. Cf. Safe Publication. Obviously, x can't be changed later if it's final.

How do final fields prevent other threads from seeing partially constructed objects?

I was looking into creating an immutable datatype that has final fields (including an array that is constructed and filled prior to being assigned to the final member field). I noticed that the JVM seems to be specified to guarantee that any other thread that gets a reference to this object will see the initialized fields and array values (assuming no pointer to this is published within the constructor; see What is an "incompletely constructed object"? and How do JVM's implicit memory barriers behave when chaining constructors?).
I am curious how this is achieved without synchronizing every access to this object, or otherwise paying some significant performance penalty. According to my understanding, the JVM can achieve this by doing the following:
Issue a write-fence at the end of the constructor
Publish the reference to the new object only after the write-fence
Issue a read-fence any time you refer to a final field of an object
I can't think of a simpler or cheaper way of eliminating the risk of other threads seeing uninitialized final fields (or recursive references through final fields).
This seems like it could impose a severe performance penalty due to all of the read-fences in the other threads reading the object, but eliminating the read-fences introduces the possibility that the object reference is seen in another processor before it issues a read-fence or otherwise sees the updates to the memory locations corresponding to the newly initialized final fields.
Does anyone know how this works? And whether this introduces a significant performance penalty?
See the "Memory Barriers" section in this writeup.
A StoreStore barrier is required after final fields are set and before the object reference is assigned to another variable. This is the key piece of info you're asking about.
According to the "Reordering" section there, a store of a final field can not be reordered with respect to a store of a reference to the object containing the final field.
Additionally, it states that in v.afield = 1; x.finalField = v; ... ; sharedRef = x;, neither of the first two can be reordered with respect to the third; which ensures that stores to the fields of an object that is stored as a final field are themselves guaranteed to be visible to other threads before a reference to the object containing the final field is stored.
Together, this means that all stores to final fields must be visible to all threads before a reference to the object containing the field is stored.
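As a sketch in Java of the v / x.finalField / sharedRef sequence quoted above (class names are illustrative; the comment only marks where the conceptual barrier sits, there is no explicit barrier API in this code):

class V {
    int afield;
}

class X {
    final V finalField;

    X(V v) {
        finalField = v;   // store to the final field
        // conceptual StoreStore barrier: emitted by the JVM at the end of the
        // constructor, before a reference to this X may be published
    }
}

class Publisher {
    static X sharedRef;   // plain field used for publication

    static void publish() {
        V v = new V();
        v.afield = 1;           // store reachable through the final field
        sharedRef = new X(v);   // publication: may not be reordered before the
                                // stores above or the final-field store
    }
}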

Why can an Object member variable not be both final and volatile in Java?

If in a class I have a ConcurrentHashMap instance that will be modified and read by multiple threads I might define like this:
public class MyClass {
    private volatile ConcurrentHashMap<String, String> myMap = new ConcurrentHashMap<String, String>();
    ...
}
adding final to the myMap field results in an error saying I can only use final or volatile. Why can it not be both?
volatile only has relevance to modifications of the variable itself, not the object it refers to. It makes no sense to have a final volatile field because final fields cannot be modified. Just declare the field final and it should be fine.
It's because of the Java Memory Model (JMM).
Essentially, when you declare an object field as final you need to initialize it in the object's constructor, and then the final field won't change its value. The JMM promises that after the constructor has finished, any thread will see the same (correct) value of the final field. So you won't need to use explicit synchronization, such as synchronized or a Lock, to allow all threads to see the correct value of the final field.
When you declare an object's field as volatile, the field's value can change, but every read of the value from any thread will still see the latest value written to it.
So, final and volatile serve the same purpose (visibility of an object field's value), but the first is specifically used for a variable that may only be assigned once and the second is used for a variable that can be changed many times.
References:
http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.12.4
http://docs.oracle.com/javase/specs/jls/se7/html/jls-8.html#jls-8.3.1.4
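To make that contrast concrete, a minimal sketch (Config and its members are illustrative names):

class Config {
    // final: must be assigned exactly once, by the end of the constructor, and is
    // then visible to all threads without further synchronization.
    private final String name;

    // volatile: may be reassigned many times; every read sees the latest write
    // made by any thread.
    private volatile String status = "NEW";

    Config(String name) {
        this.name = name;
    }

    void setStatus(String status) {
        this.status = status;
    }

    String describe() {
        return name + ": " + status;
    }
}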
Because volatile and final are two extreme ends in Java
volatile means the variable is bound to changes
final means the value of the variable will never change whatsoever
volatile is used for variables whose value may change in certain cases; otherwise there is no need for volatile. final means that the variable may not change, so there is no need for volatile.
Your concurrency concerns are important, but making the map volatile will not solve the problem; for handling the concurrency issues, you already use ConcurrentHashMap.
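A sketch of the usual pattern, applying this advice to the question's MyClass (the record/lookup method names are illustrative):

import java.util.concurrent.ConcurrentHashMap;

public class MyClass {
    // final: the field always refers to the same map, and that reference is
    // safely published once the constructor completes.
    private final ConcurrentHashMap<String, String> myMap = new ConcurrentHashMap<>();

    void record(String key, String value) {
        myMap.putIfAbsent(key, value);   // the map handles concurrent access itself
    }

    String lookup(String key) {
        return myMap.get(key);
    }
}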
A volatile field gives you guarantees as to what happens when you change it (not about the object it might be a reference to).
A final field cannot be changed (though what the field references can be changed).
It makes no sense to have both.
The volatile modifier guarantees that all reads and writes go straight to main memory, i.e. as if each access to the variable were inside a synchronized block. This is irrelevant for a final variable that cannot be changed.
Because it doesn't make any sense. volatile affects the object reference value, not the object's fields.
In your situation (you have a concurrent map) you should make the field final.
In a multithreaded environment, different threads will read a variable from main memory and add it to the CPU cache. This may result in two different threads making changes to the same variable while ignoring each other's results.
We use the keyword volatile to indicate that the variable will be written to and read from main memory. Thus whenever a thread wants to read or write the variable, it will be done against main memory, essentially making the variable safe in a multithreaded environment.
When we use the final keyword, we indicate that the variable will not change. As you can see, if a variable is unchangeable, it doesn't matter that multiple threads will use it. No thread can change the variable, so even if the variable is saved to CPU caches at different times and threads use it at different times, it's still OK, because the variable can only be read.
