How do final fields prevent other threads from seeing partially constructed objects?

I was looking into creating an immutable datatype that has final fields (including an array that is constructed and filled prior to being assigned to the final member field), and noticed that it seems that the JVM is specified to guarantee that any other thread that gets a reference to this object will see the initialized fields and array values (assuming no pointers to this are published within the constructor, see What is an "incompletely constructed object"? and How do JVM's implicit memory barriers behave when chaining constructors?).
I am curious how this is achieved without synchronizing every access to this object, or otherwise paying some significant performance penalty. According to my understanding, the JVM can achieve this by doing the following:
Issue a write-fence at the end of the constructor
Publish the reference to the new object only after the write-fence
Issue a read-fence any time you refer to a final field of an object
I can't think of a simpler or cheaper way of eliminating the risk of other threads seeing uninitialized final fields (or recursive references through final fields).
This seems like it could impose a severe performance penalty due to all of the read-fences in the other threads reading the object, but eliminating the read-fences introduces the possibility that the object reference is seen in another processor before it issues a read-fence or otherwise sees the updates to the memory locations corresponding to the newly initialized final fields.
Does anyone know how this works? And whether this introduces a significant performance penalty?

See the "Memory Barriers" section in this writeup.
A StoreStore barrier is required after final fields are set and before the object reference is assigned to another variable. This is the key piece of info you're asking about.
According to the "Reordering" section there, a store of a final field can not be reordered with respect to a store of a reference to the object containing the final field.
Additionally, it states that in v.afield = 1; x.finalField = v; ... ; sharedRef = x;, neither of the first two can be reordered with respect to the third, which ensures that stores to the fields of an object that is stored in a final field are themselves guaranteed to be visible to other threads before a reference to the object containing the final field is stored.
Together, this means that all stores to final fields must be visible to all threads before a reference to the object containing the field is stored.
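As a minimal sketch of that guarantee (the Holder and Publisher class names below are hypothetical, and the barrier comment describes one possible implementation rather than mandated machine code):

class Holder {
    final int[] data;              // final field

    Holder() {
        int[] a = new int[8];
        a[0] = 42;                 // store to the array reachable through the final field
        this.data = a;             // store to the final field itself
        // conceptually: a StoreStore barrier is emitted here, at the end of the constructor
    }
}

class Publisher {
    static Holder shared;          // plain, non-volatile field

    static void publish() {
        shared = new Holder();     // the stores above cannot be reordered past this publication
    }

    static void read() {
        Holder h = shared;
        if (h != null) {
            System.out.println(h.data[0]); // guaranteed to print 42, with no synchronization,
                                           // because data is final and this did not escape the constructor
        }
    }
}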

Related

Java 'final' instance variable - visibility and propagation of variable's internal state

While reading the documentation related to Threads and Locks, a sentence describing the final keyword caught my attention:
Correspondingly, compilers are allowed to keep the value of a final field cached in a register and not reload it from memory in situations where a non-final field would have to be reloaded.
Does it mean that if I declare a final Object object = ... as an instance variable and then access it (modifying its inner state, object.state) from anonymous inner classes (multiple instances of Runnable), the value of object.state could actually be read/written from/into a CPU cache, and that value of object.state could be out of sync across those Runnable instances?
So if I want to be sure that the value of object.state is properly propagated across all the threads, do I have to declare the object as volatile Object object instead?
Thank you.
Edit: I have edited my original question, and now I know I misunderstood the documentation, so the answer to my last question is NO: declaring the field volatile or final has no effect on object.state; it depends on how object.state itself is declared, initialized and/or accessed.
Thanks to @Burak Serdar for the answer!
When you declare a final Object object = ..., the final value is the reference to the object, not the internal state of the object. It means that nothing can modify object; it does not say that nothing can modify, say, object.value. So the variable object can be cached, which does not mean the internal state of object can be cached.
Final will not help you with threading issues the way you hope, sorry.
Volatile may help in some situations, but you may need a lock.
From the Oracle tutorials:
- Reads and writes are atomic for reference variables and for most primitive variables (all types except long and double).
- Reads and writes are atomic for all variables declared volatile (including long and double variables).
That means your object will not necessarily be updated immediately for other threads, but each individual read or write of the reference is consistent: you never observe a half-written value.
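To make the distinction concrete, here is a minimal sketch (the State class and its value field are hypothetical): final freezes only the reference stored in the field, while visibility of the referent's mutable state is governed by how that state itself is declared and accessed.

class State {
    int value;                    // plain mutable field: no visibility guarantee across threads
    // volatile int value;        // declaring it volatile would guarantee visibility of later writes
}

class Worker {
    private final State state = new State();   // final pins the reference, not the referent's state

    Runnable writer() {
        return () -> state.value = 42;          // this write may not be seen promptly by other threads...
    }

    Runnable reader() {
        return () -> System.out.println(state.value); // ...unless value is volatile or access is synchronized
    }
}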

Java final keyword semantics with respect to cache

What is the behavior of the Java final keyword with respect to caches?
Quote from the jsr133-faq:
The values for an object's final fields are set in its constructor.
Assuming the object is constructed "correctly", once an object is
constructed, the values assigned to the final fields in the
constructor will be visible to all other threads without
synchronization. In addition, the visible values for any other object
or array referenced by those final fields will be at least as
up-to-date as the final fields.
I don't understand what it refers to when it says "as up-to-date as the final fields":
In addition, the visible values for any other object or array
referenced by those final fields will be at least as up-to-date as the
final fields.
My guess is, for example:
public class CC {
    private final Mutable mutable; // final field
    private Other other;           // non-final field

    public CC(Mutable m, Other o) {
        mutable = m;
        other = o;
    }
}
When the constructor of CC returns, besides the pointer value of mutable, all values on the object graph rooted at m, if they exist in the local processor's cache, will be flushed to main memory, and at the same time the corresponding cache lines in other processors' local caches will be marked Invalid.
Is that the case? What does it look like in assembly? How do they actually implement it?
Is that the case?
The actual guarantee is that any thread that can see an instance of CC created using that constructor is guaranteed to see the mutable reference and also the state of the Mutable object's fields as of the time that the constructor completed.
It does not guarantee that the state of all values in the closure of the Mutable instance will be visible. However, any writes (in the closure or not) made by the thread that executed the constructor prior to the constructor completing will be visible. (By "happens-before" analysis.)
Note that the behavior is specified in terms what one thread is guaranteed to see, not in terms of cache flushing / invalidation. The latter is a way of implementing the behavior that the specification requires. There may be other ways.
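A hedged sketch of that distinction, reusing a simplified version of the CC and Mutable classes from the question (the state field is assumed for illustration):

class Mutable {
    int state;
}

class CC {
    private final Mutable mutable;

    public CC(Mutable m) {
        m.state = 1;          // written before the constructor completes:
        this.mutable = m;     // guaranteed visible to any thread that sees this CC instance
    }
}

// After a CC instance is published without synchronization, a reader is guaranteed
// to observe cc.mutable.state as at least 1 (the value as of constructor completion).
// A later write by the constructing thread, e.g. cc.mutable.state = 2, is not covered
// by final field semantics and needs its own synchronization to be visible.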
What does it look like in assembly?
That will be version / platform / etc specific. There is a way to get the JIT compiler to dump out the compiled code, if you want to investigate what the native code looks like for your hardware.
How to see JIT-compiled code in JVM?
How do they actually implement it?
See above.

Can a volatile variable that is never assigned null to ever contain null?

Can in the following conceptual Java example:
public class X implements Runnable {
    public volatile Object x = new Object();

    @Override
    public void run() {
        for (;;) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                return;
            }
            x = new Object();
        }
    }
}
x ever be read as null from another thread?
Bonus: Do I need to declare it volatile (I do not really care about that value, it suffices that sometime in the future it will be the newly assigned value and never is null)
Technically, yes it can. That is the main reason for the original ConcurrentHashMap's readUnderLock. The javadoc even explains how:
Reads value field of an entry under lock. Called if value field ever appears to be null. This is possible only if a compiler happens to reorder a HashEntry initialization with its table assignment, which is legal under memory model but is not known to ever occur.
Since HashEntry's value is volatile, this type of reordering is legal on construction.
Moral of the story is that all non-final initializations can race with object construction.
Edit:
@Nathan Hughes asked a valid question:
@John: in the OP's example wouldn't the construction have happened before the thread the runnable is passed into started? It would seem like that would impose a happens-before barrier subsequent to the field's initialization.
Doug Lea had a couple comments on this topic, the entire thread can be read here. He answered the comment:
But the issue is whether assignment of the new C instance to some other memory must occur after the volatile stores.
With the answer
Sorry for mis-remembering why I had treated this issue as basically settled:
Unless a JVM always pre-zeros memory (which is usually not a good option), then
even if not explicitly initialized, volatile fields must be zeroed
in the constructor body, with a release fence before publication.
And so even though there are cases in which the JMM does not
strictly require mechanics preventing publication reordering
in constructors of classes with volatile fields, the only good
implementation choices for JVMs are either to use non-volatile writes
with a trailing release fence, or to perform each volatile write
with full fencing. Either way, there is no reordering with publication.
Unfortunately, programmers cannot rely on a spec to guarantee
it, at least until the JMM is revised.
And finished with:
Programmers do not expect that even though final fields are specifically
publication-safe, volatile fields are not always so.
For various implementation reasons, JVMs arrange that
volatile fields are publication safe anyway, at least in
cases we know about.
Actually updating the JMM/JLS to mandate this is not easy
(no small tweak that I know applies). But now is a good time
to be considering a full revision for JDK9.
In the mean time, it would make sense to further test
and validate JVMs as meeting this likely future spec.
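A hedged sketch of the scenario being discussed (class and field names are hypothetical; the comments describe the implementation choices Doug Lea mentions, not behavior the current JMM text strictly requires):

class C {
    volatile Object v = new Object();   // volatile field initialized in the constructor
}

class Publisher {
    static C shared;                    // plain field: unsafe publication

    static void publish() {
        // Conceptually, the constructor plus publication compiles to something like:
        //   c = allocate C
        //   c.v = new Object()          // volatile store, or a plain store...
        //   [release fence]             // ...followed by a trailing release fence
        //   shared = c                  // publication
        // With either choice, the store to v is not reordered past the publication,
        // so a reader that sees shared != null also sees v != null.
        shared = new C();
    }
}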
This depends on how the X instance is published.
Suppose the X instance is published unsafely, e.g. through a non-volatile field
private X instance;
...
void someMethod() {
    instance = new X();
}
Another thread accessing the instance field is allowed to see a reference value referring to an uninitialized X object (i.e. one whose constructor hasn't run yet). In such a case, its field x would have a value of null.
The above example translates to
temporaryReferenceOnStack = new memory for X // a reference to the instance
temporaryReferenceOnStack.<init> // call constructor
instance = temporaryReferenceOnStack;
But the language allows the following reordering
temporaryReferenceOnStack = new memory for X // a reference to the instance
instance = temporaryReferenceOnStack;
temporaryReferenceOnStack.<init> // call constructor
or directly
instance = new memory for X // a reference to the instance
instance.<init> // call constructor
In such a case, a thread is allowed to see the value of instance before the constructor is invoked to initialize the referenced object.
Now, how likely is this to happen in current JVMs? Eh, I couldn't come up with an MCVE.
Bonus: Do I need to declare it volatile (I do not really care about
that value, it suffices that sometime in the future it will be the
newly assigned value and never is null)
Publish the enclosing object safely. Or use a final AtomicReference field which you set.
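A minimal sketch of those two options, assuming the X class from the question (the SafeHolder and Y wrapper classes and their field names are hypothetical):

import java.util.concurrent.atomic.AtomicReference;

// Option 1: publish the enclosing object safely, e.g. through a volatile field.
class SafeHolder {
    private volatile X instance;              // the volatile write is a safe publication

    void init() { instance = new X(); }
    X get()     { return instance; }
}

// Option 2: keep the value in a final AtomicReference that you set.
class Y {
    private final AtomicReference<Object> ref = new AtomicReference<>(new Object());

    void update() { ref.set(new Object()); }  // readers of get() never observe null
    Object get()  { return ref.get(); }
}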
No. The Java memory model guarantees that you will never see x as null. x must always be the initial value it was assigned, or some subsequent value.
This actually works with any variable, not just volatile. What you are asking about is called "out of thin air" values. Cf. Java Concurrency in Practice, which talks about this concept at some length.
As for the other part of your question, "Do I need to declare x as volatile?": given the context, yes, it should be either volatile or final. Either one provides safe publication for the object referenced by x. Cf. Safe Publication. Obviously, x can't be changed later if it's final.

Local References in ConcurrentHashMap

In ConcurrentHashMap, segments is marked final (and thus will never change), but the method ensureSegment creates a method-local copy, ss, of segments upon which to operate.
Does anybody know the purpose of this? What benefit do we get?
Update:
I searched Google and found a page that explains ConcurrentHashMap in JDK 7, The Concurrency Of ConcurrentHashMap; below are excerpts:
Local References
Even though segments is marked final (and thus will never change), Doug Lea prudently creates a method-local copy, ss, of segments upon which to operate. Such defensive programming allows a programmer to not worry about otherwise-volatile instance member references changing during execution of a method (i.e. inconsistent reads). Of course, this is simply a new reference and does not prevent your method from seeing changes to the referent.
Can anyone explain the bold text?
There is no semantic difference between accessing a final field and accessing a local variable holding a copy of the final field’s value. However, it is an established pattern to copy fields into local variables in performance critical code.
Even in cases where it does not make a difference (that depends on the state of the HotSpot optimizer), it will save at least some bytes in the method’s code.
Each access to an instance field, be it final or not (the only exception being compile-time constants), will get compiled as two instructions, first pushing the this reference onto the operand stack via aload_0, then performing a getfield operation, which has a two byte index to the constant pool entry describing the field. In other words, each field access needs four bytes in the method’s code, whereas reading a local variable needs only one byte if the variable is one of the method’s first four (counting this as a local variable), which is the case here (see aload_n).
So storing a field's value in a local variable when it is going to be accessed multiple times is a good habit: it protects against changes of a mutable variable, avoids the cost of a volatile read, and still doesn't hurt even when it is unnecessary, as in the case of a final field, since it even produces more compact bytecode. And in simple interpreted execution, i.e. before the optimizer kicks in, the single local variable access might indeed be faster than the detour via the this instance.
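As a hedged illustration of the pattern (the class and fields below are hypothetical, not the ConcurrentHashMap code itself):

class Counters {
    private volatile int[] buckets = new int[16];   // may be replaced by another thread

    int sum() {
        int[] b = this.buckets;    // one volatile read; b cannot change under us afterwards
        int total = 0;
        for (int v : b) {
            total += v;            // consistent view: every iteration uses the same array
        }
        return total;
    }
}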
The array is marked final, but the array elements are not (and cannot be). So each array element can be set or replaced, for example using:
this.segments[i] = ...
ensureSegment in Java 7 uses a different method to set the array element, using sun.misc.Unsafe, for improved performance when concurrently calling the method. This is not what a "regular" developer should do.
A local variable
final Segment<K,V>[] ss = this.segments;
is typically used to ensure (well - to increase that probability) that ss is in a CPU register, and not re-read from memory, for the duration of this method. I guess that would not be needed in this case, as the compiler can infer that. Using ss does make the following lines slightly shorter, maybe that's why it is used.

Cost of using final fields

We know that making fields final is usually a good idea as we gain thread-safety and immutability which makes the code easier to reason about. I'm curious if there's an associated performance cost.
The Java Memory Model guarantees this final Field Semantics:
A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields.
This means that for a class like this
class X {
    X(int a) {
        this.a = a;
    }

    final int a;
    static X instance;
}
whenever Thread 1 creates an instance like this
X.instance = new X(43);
while (true) doSomethingEventuallyEvictingCache();
and Thread 2 sees it
while (X.instance == null) {
    doSomethingEventuallyEvictingCache();
}
System.out.println(X.instance.a);
it must print 43. Without the final modifier, the JIT or the CPU could reorder the stores (first store X.instance and then set a=43) and Thread 2 could see the default-initialized value and print 0 instead.
When JIT sees final it obviously refrains from reordering. But it also has to force the CPU to obey the order. Is there an associated performance penalty?
Is there an associated performance penalty?
If you take a look at the source code of the JIT compiler, you will find the following comment regarding final member variables in the file src/share/vm/opto/parse1.cpp:
This method (which must be a constructor by the rules of Java) wrote a final. The effects of all initializations must be committed to memory before any code after the constructor publishes the reference to the newly constructed object. Rather than wait for the publication, we simply block the writes here. Rather than put a barrier on only those writes which are required to complete, we force all writes to complete.
The compiler emits additional instructions if there are final member variables. Most likely, these additional instructions cause a performance penalty, but it is unclear whether this impact is significant for any application.
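If you want to measure it yourself, a hedged sketch of a JMH microbenchmark (class names are hypothetical; results will vary by JVM version and hardware) could look like this:

import org.openjdk.jmh.annotations.Benchmark;

public class FinalFieldBench {

    static class WithFinal {
        final int a;
        WithFinal(int a) { this.a = a; }
    }

    static class WithoutFinal {
        int a;
        WithoutFinal(int a) { this.a = a; }
    }

    @Benchmark
    public WithFinal allocFinal() {
        return new WithFinal(43);     // constructor may end with a release/store barrier
    }

    @Benchmark
    public WithoutFinal allocPlain() {
        return new WithoutFinal(43);  // no final-field barrier required
    }
}

Returning the newly created objects keeps the JIT from eliminating the allocations as dead code.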
