Is it true that Java volatile accesses cannot be reordered?

Note
By saying that a memory access can (or cannot) be reordered, I mean that it can be reordered with respect to any other memory access, either by the compiler when emitting bytecode, by the JIT when emitting machine code, or by the CPU when executing out of order (possibly requiring barriers to prevent this).
I often read that accesses to volatile variables cannot be reordered because of the Happens-Before relationship (HBR).
However, I found that an HBR exists between every two consecutive (in program order) actions of a given thread, and yet those can be reordered.
Also, a volatile access has an HBR only with accesses to the same variable/field.
What I think makes volatile accesses non-reorderable is this:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read [of any thread]
of that field.
If there are other threads, a reordering of the variables will become visible, as in this simple example:
volatile int a, b;
Thread 1 Thread 2
a = 1; while (b != 2);
b = 2; print(a); //a must be 1
So it is not the HBR itself that prevents the reordering, but the fact that volatile extends this relationship to other threads; the presence of other threads is what prevents the reordering.
If the compiler could prove that a reordering of a volatile variable would not change the program's semantics, it could reorder it even if there is an HBR.
If a volatile variable is never accessed by other threads, then its accesses could be reordered:
volatile int a, b, c;
Thread 1 Thread 2
a = 1; while (b != 2);
b = 2; print(a); //a must be 1
c = 3; //c never accessed by Thread 2
I think c=3 could very well be reordered before a=1; this quote from the spec confirms it:
It should be noted that the presence of a happens-before relationship between
two actions does not necessarily imply that they have to take place in that order
in an implementation. If the reordering produces results consistent with a legal
execution, it is not illegal.
So I made these simple java programs
public class vtest1 {
    public static volatile int DO_ACTION, CHOOSE_ACTION;

    public static void main(String[] args) {
        CHOOSE_ACTION = 34;
        DO_ACTION = 1;
    }
}
public class vtest2 {
    public static volatile int DO_ACTION, CHOOSE_ACTION;

    public static void main(String[] args) {
        (new Thread() {
            public void run() {
                while (DO_ACTION != 1);
                System.out.println(CHOOSE_ACTION);
            }
        }).start();
        CHOOSE_ACTION = 34;
        DO_ACTION = 1;
    }
}
In both cases, both fields are marked volatile and accessed with putstatic.
Since this is all the information the JIT has¹, the machine code would be identical, and thus the vtest1 accesses will not be optimized².
My question
Are volatile accesses really never reordered by specification, or could they be³ but this is never done in practice?
If volatile accesses can never be reordered, what parts of the spec say so? And would this mean that all volatile accesses are executed and seen in program order by the CPUs?
¹Or can the JIT know that a field will never be accessed by another thread? If yes, how?
²Memory barriers will be present, for example.
³For example, if no other threads are involved.

What the JLS says (from JLS-8.3.1.4. volatile Fields) is, in part, that
The Java programming language provides a second mechanism, volatile fields, that is more convenient than locking for some purposes.
A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).
Which means the access may be reordered, but the results of any reordering must eventually be consistent (when accessed by another thread) with the original order. A field in a single-threaded application wouldn't need locking (from volatile or synchronization).

The Java memory model provides sequential consistency (SC) for correctly synchronized programs. In simple terms, SC means that if every possible execution of a program can be explained by an execution in which all memory actions are executed in some sequential order, and this order is consistent with the program order (PO) of each thread, then the program is consistent with these sequential executions; it is sequentially consistent (hence the name).
What this effectively means is that the JIT/CPU/memory subsystem can reorder volatile writes and reads as much as it wants, as long as there exists a sequential execution that could also explain the outcome of the actual execution. So the actual execution isn't that important.
If we look at the following example:
volatile int a, b, c;
Thread 1 Thread 2
a = 1; while (c != 1);
b = 1; print(b);
c = 1;
There is a happens-before relation between a=1 and b=1 (PO), a happens-before relation between b=1 and c=1 (PO), a happens-before relation between c=1 and the read c!=1 (volatile variable rule), and a happens-before relation between the read c!=1 and print(b) (PO).
Since the happens-before relation is transitive, there is a happens-before relation between a=1 and print(b). So in that sense, it can't be reordered. However, if nobody can prove that a reordering happened, it can still be reordered.
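For illustration, here is a runnable rendering of the example above (a minimal sketch; the class name and the use of static fields are mine, not part of the original example):

public class SeqConsistencyDemo {
    static volatile int a, b, c;

    public static void main(String[] args) throws InterruptedException {
        Thread t2 = new Thread(() -> {
            while (c != 1); // spin until Thread 1's volatile write to c is visible
            System.out.println(b); // must print 1: hb(b=1, c=1) and hb(write of c, read of c)
        });
        t2.start();
        a = 1;
        b = 1;
        c = 1;
        t2.join();
    }
}

Whatever reorderings actually happen under the hood, every outcome must be explainable by some sequential execution, and in every such execution b is already 1 by the time c is 1.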

I'm going to be using notation from JLS §17.4.5.
In your second example (if you'll excuse my loose notation) you have
Thread 1 ordering:
hb(a = 1, b = 2)
hb(b = 2, c = 3)
Volatile guarantees:
hb(b = 2, b != 2)
hb(a = 1, access a for print)
Thread 2 ordering:
hb(while(b != 2);, print(a))
and we have (emphasis mine)
More specifically, if two actions share a happens-before relationship,
they do not necessarily have to appear to have happened in that order
to any code with which they do not share a happens-before
relationship. Writes in one thread that are in a data race with reads
in another thread may, for example, appear to occur out of order to
those reads.
There is no happens-before relationship between c=3 and Thread 2. The implementation is free to reorder c=3 to its heart's content.

From 17.4. Memory Model of JLS
The memory model describes possible behaviors of a program. An implementation is free to produce any code it likes, as long as all resulting executions of a program produce a result that can be predicted by the memory model.
This provides a great deal of freedom for the implementor to perform a myriad of code transformations, including the reordering of actions and removal of unnecessary synchronization.

Related

False sharing and Java memory model

I am using the example described here https://jenkov.com/tutorials/java-concurrency/false-sharing.html where we have the following class:
public class Counter {
    public volatile long count1 = 0;
    public volatile long count2 = 0;
}
and two threads are each updating the variables count1 and count2 at the same time. Because count1 and count2 are in the same cache line, each time a thread updates one of the variables, the other thread needs to refresh the cache line.
My question is regarding the volatile keyword. The way volatile is usually presented is that when thread A updates a volatile variable x, and another thread B then reads x, thread B is guaranteed to see the updated value of x. But here volatile is used on two different variables. So would another property of volatile be that when thread A updates a volatile variable, and another thread B then reads any other (also volatile?) variable from the same cache line, thread B needs to refresh the whole cache line?
Both variables need to be volatile. Caches on modern CPUs are always coherent. So it doesn't matter if a cache line is falsely shared or not since the cache ensures coherence. It is the compiler that can still mess things up.
If there is a data race, then the JMM allows a read to see any write it is in a data race with (*). So imagine the following example:
class FalseSharing {
    int a;
    volatile int b;
}
The FalseSharing instance is shared between multiple threads and one thread does the following:
while (shared.a == 0) {
    print(shared.b); // <== triggers the false sharing
}
And another thread at some point executes shared.a = 1. Because there is a data race on 'a', the compiler can rewrite the code to:
if (shared.a == 0) {
    while (true) {
        print(shared.b);
    }
}
This is legal, since the JMM mandates that the possible values for a read of 'a' are {0,1}, and the JVM is allowed to produce executions that yield a subset; in this case {0}.
After this code transformation, it is clear that it doesn't matter whether 'a' is piggybacking on a falsely shared cache line.
(*) Happens-before consistency is the JMM's way of preventing undefined behavior in the case of a data race. It states that a read needs to see either the most recent write that precedes it in the happens-before order, or a write it is in a data race with.
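As a side note on mitigation (my sketch, not part of the answer above): the usual way to avoid false sharing is to pad the hot fields so they land on different cache lines, or, on newer JDKs, to use the jdk.internal.vm.annotation.Contended annotation (which needs -XX:-RestrictContended for application code). A hand-padded variant of the Counter class from the question might look like this, with the caveat that the JVM is free to reorder fields, so manual padding is only best-effort:

public class PaddedCounter {
    public volatile long count1 = 0;
    // 7 longs (56 bytes) of padding; together with count1 and the object
    // header this usually pushes count2 onto a different 64-byte cache line.
    long p1, p2, p3, p4, p5, p6, p7;
    public volatile long count2 = 0;
}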

What does "subsequent read" mean in the context of volatile variables?

Java memory visibility documentation says that:
A write to a volatile field happens-before every subsequent read of that same field.
I'm confused about what subsequent means in the context of multithreading. Does this sentence imply some global clock for all processors and cores? So, for example, I assign a value to a variable in cycle c1 in some thread, and then a second thread is able to see this value in the subsequent cycle c1 + 1?
It sounds to me like it's saying that it provides lockless acquire/release memory-ordering semantics between threads. See Jeff Preshing's article explaining the concept (mostly for C++, but the main point of the article is language neutral, about the concept of lock-free acquire/release synchronization.)
In fact, Java volatile provides sequential consistency, not just acquire/release; there's no actual locking, though. (See Jeff Preshing's article for an explanation of why the naming matches what you'd do with a lock.)
If a reader sees the value you wrote, then it knows that everything in the producer thread before that write has also already happened.
This ordering guarantee is only useful in combination with other guarantees about ordering within a single thread.
e.g.
int data[100];
volatile bool data_ready = false;
Producer:
data[0..99] = stuff;
// release store keeps previous ops above this line
data_ready = true;
Consumer:
while(!data_ready){} // spin until we see the write
// acquire-load keeps later ops below this line
int tmp = data[99]; // gets the value from the producer
If data_ready was not volatile, reading it wouldn't establish a happens-before relationship between two threads.
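Since the question is about Java, here is a minimal Java rendering of the pattern above (a sketch; the class and field names are mine):

public class AcquireRelease {
    static int[] data = new int[100];
    static volatile boolean dataReady = false;

    public static void main(String[] args) throws InterruptedException {
        Thread consumer = new Thread(() -> {
            while (!dataReady); // spin until we see the volatile write
            // hb(plain writes to data, dataReady = true) and
            // hb(dataReady = true, this read), so data[99] is guaranteed to be 99.
            System.out.println(data[99]);
        });
        consumer.start();
        for (int i = 0; i < data.length; i++) data[i] = i; // plain (non-volatile) writes
        dataReady = true; // volatile write publishes the plain writes above
        consumer.join();
    }
}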
You don't have to have a spinloop, you could be reading a sequence number, or an array index from a volatile int, and then reading data[i].
I don't know Java well. I think volatile actually gives you sequential consistency, not just release/acquire. A sequentially consistent store isn't allowed to reorder with later loads, so on typical hardware it needs an expensive memory barrier to make sure the local core's store buffer is flushed before any later loads are allowed to execute.
Volatile Vs Atomic explains more about the ordering volatile gives you.
Java volatile is just an ordering keyword; it's not equivalent to C11 _Atomic or C++11 std::atomic<T>, which also give you atomic RMW operations. In Java, volatile_var++ is not an atomic increment; it's a separate load and store, like volatile_var = volatile_var + 1. In Java, you need a class like AtomicInteger to get an atomic RMW.
And note that C/C++ volatile doesn't imply atomicity or ordering at all; it only tells the compiler to assume that the value can be modified asynchronously. This is only a small part of what you need to write lockless code for anything except the simplest cases.
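To make the volatile-increment pitfall concrete, here is a hedged sketch (the class name is mine) contrasting volatile_var++ with AtomicInteger:

import java.util.concurrent.atomic.AtomicInteger;

public class IncrementDemo {
    static volatile int v = 0;                          // volatile: visibility, not atomicity
    static final AtomicInteger a = new AtomicInteger(); // atomic read-modify-write

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                v++;                 // NOT atomic: a separate volatile load and store
                a.incrementAndGet(); // atomic RMW
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("volatile: " + v);       // typically less than 200000
        System.out.println("atomic:   " + a.get()); // always 200000
    }
}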
It means that once a certain thread writes to a volatile field, all other threads will observe (on their next read) that written value; this does not protect you against races, though.
Threads have their own caches, and those caches will be invalidated and updated with the newly written value via the cache coherency protocol.
EDIT
Subsequent means whatever happens after the write itself. Since you don't know the exact cycle/timing when that will happen, you usually say that when some other thread observes the write, it will observe all the actions done before that write; thus a volatile write establishes the happens-before guarantees.
Sort of like in an example:
// Actions done in Thread A
int a = 2;
volatile int b = 3;

// Actions done in Thread B
if (b == 3) { // observe the volatile write
    // Thread B is guaranteed to see a = 2 here
}
You could also loop (spin wait) until you see 3 for example.
Peter's answer gives the rationale behind the design of the Java memory model.
In this answer I'm attempting to give an explanation using only the concepts defined in the JLS.
In Java, every thread is composed of a set of actions.
Some of these actions have the potential to be observable by other threads (e.g. writing a shared variable); these are called synchronization actions.
The order in which the actions of a thread are written in the source code is called the program order.
An order defines what is before and what is after (or better, not before).
Within a thread, each action has a happens-before relationship (denoted by <) with the next (in program order) action.
This relationship is important, yet hard to understand, because it's very fundamental: it guarantees that if A < B then
the "effects" of A are visible to B.
This is indeed what we expect when writing the code of a function.
Consider
Thread 1 Thread 2
A0 A'0
A1 A'1
A2 A'2
A3 A'3
Then by the program order we know A0 < A1 < A2 < A3 and that A'0 < A'1 < A'2 < A'3.
We don't know how to order all the actions.
It could be A0 < A'0 < A'1 < A'2 < A1 < A2 < A3 < A'3 or the sequence with the primes swapped.
However, in every such sequence the individual actions of each thread must be ordered according to that thread's program order.
The two program orders are not sufficient to order every action; they are partial orders, as opposed to the total order we are looking for.
The total order that puts the actions in a row according to the measurable time (like a clock) at which they happened is called the execution order.
It is the order in which the actions actually happened (it is only required that the actions appear to have happened in this order, but that's just an optimization detail).
Up until now, the actions are not ordered inter-thread (between two different threads).
The synchronization actions serve this purpose.
Each synchronization action synchronizes-with at least one other synchronization action (they usually come in pairs, like a write and a read of a volatile variable, or a lock and an unlock of a mutex).
The synchronizes-with relationship is the happens-before between threads (the former implies the latter); it is exposed as a different concept because 1) it is, slightly, and 2) happens-before is enforced naturally by the hardware while synchronizes-with may require software intervention.
Happens-before is derived from the program order, synchronizes-with from the synchronization order (denoted by <<).
The synchronization order is defined in terms of two properties: 1) it is a total order 2) it is consistent with each thread's
program order.
Let's add some synchronization action to our threads:
Thread 1 Thread 2
A0 A'0
S1 A'1
A1 S'1
A2 S'2
S2 A'3
The program orders are trivial.
What is the synchronization order?
We are looking for something that, by 1), includes all of S1, S2, S'1 and S'2 and that, by 2), has S1 < S2 and S'1 < S'2.
Possible outcomes:
S1 < S2 < S'1 < S'2
S1 < S'1 < S'2 < S2
S'1 < S1 < S2 < S'2
All are synchronization orders, there is not one synchronization order but many, the question of above is wrong, it
should be "What are the synchronization orders?".
If S1 and S'1 are such that S1 << S'1, then we are restricting the possible outcomes to the ones where S1 < S'1, so the outcome S'1 < S1 < S2 < S'2 above is now forbidden.
If S2 << S'1, then the only possible outcome is S1 < S2 < S'1 < S'2; when there is only a single outcome, I believe we have sequential consistency (the converse is not true).
Note that A << B doesn't mean that there is a mechanism in the code to force an execution order where A < B.
Synchronization actions are constrained by the synchronization order; they do not impose any materialization of it.
Some synchronization actions (e.g. locks) impose a particular execution order (and thereby a synchronization order), but some don't (e.g. reads/writes of volatiles).
It is the execution order that creates the synchronization order; this is completely orthogonal to the synchronizes-with relationship.
Long story short, the "subsequent" adjective refers to any synchronization order, that is, any valid (according to each thread's program order) total order that encompasses all the synchronization actions.
The JLS then continues by defining when a data race happens (when two conflicting accesses are not ordered by happens-before) and what it means to be happens-before consistent.
Those are out of scope here.
I'm confused about what subsequent means in the context of multithreading. Does this sentence imply some global clock for all processors and cores...?
Subsequent means (according to the dictionary) coming after in time. There certainly is a global clock across all CPUs in a computer (think X GHz), and the document is trying to say that if thread-1 did something at clock tick 1 and then thread-2 does something on another CPU at clock tick 2, thread-2's actions are considered subsequent.
A write to a volatile field happens-before every subsequent read of that same field.
The key phrase that could be added to this sentence to make it more clear is "in another thread". It might make more sense to understand it as:
A write to a volatile field happens-before every subsequent read of that same field in another thread.
What this is saying is that if a read of a volatile field happens in Thread-2 after (in time) the write in Thread-1, then Thread-2 is guaranteed to see the updated value. Further up in the documentation you point to is this section (emphasis mine):
... The results of a write by one thread are guaranteed to be visible to a read by another thread only if the write operation happens-before the read operation. The synchronized and volatile constructs, as well as the Thread.start() and Thread.join() methods, can form happens-before relationships. In particular.
Notice the highlighted phrase. The Java compiler is free to reorder instructions in any one thread's execution for optimization purposes, as long as the reordering doesn't violate the definition of the language – this is called execution order and is critically different from program order.
Let's look at the following example with variables a and b that are non-volatile ints initialized to 0, with no synchronized clauses. What is shown is program order and the times at which the threads encounter the lines of code.
Time   Thread-1       Thread-2
1      a = 1;
2      b = 2;
3                     x = a;
4                     y = b;
5      c = a + b;     z = x + y;
If Thread-1 adds a + b at Time 5, it is guaranteed to be 3. However, if Thread-2 adds x + y at Time 5, it might get 0, 1, 2, or 3, depending on race conditions. Why? Because the compiler might have reordered the instructions in Thread-1 to set a after b for efficiency reasons. Also, Thread-1 may not have appropriately published the values of a and b, so Thread-2 might get out-of-date values. Even if Thread-1 gets context-switched out or crosses a write memory barrier and a and b are published, Thread-2 needs to cross a read barrier to update any cached values of a and b.
If a and b were marked volatile, then the write to a must happen-before (in terms of visibility guarantees) the subsequent read of a at Time 3, and the write to b must happen-before the subsequent read of b at Time 4. Both threads would get 3.
We use the volatile and synchronized keywords in Java to ensure happens-before guarantees. A write memory barrier is crossed when assigning a volatile or exiting a synchronized block, and a read barrier is crossed when reading a volatile or entering a synchronized block. The Java compiler cannot reorder write instructions past these memory barriers, so the order of updates is assured. These keywords control instruction reordering and ensure proper memory synchronization.
NOTE: volatile is unnecessary in a single-threaded application because program order assures the reads and writes will be consistent. A single-threaded application might see any value of (non-volatile) a and b at times 3 and 4 but it always sees 3 at Time 5 because of language guarantees. So although use of volatile changes the reordering behavior in a single-threaded application, it is only required when you share data between threads.
This is more a definition of what will not happen rather than what will happen.
Essentially it is saying that once a write to an atomic variable has happened there cannot be any other thread that, on reading the variable, will read a stale value.
Consider the following situation.
Thread A is continuously incrementing an atomic value a.
Thread B occasionally reads A.a and exposes that value as a
non-atomic b variable.
Thread C occasionally reads both A.a and B.b.
Given that a is atomic it is possible to reason that from the point of view of C, b may occasionally be less than a but will never be greater than a.
If a was not atomic no such guarantee could be given. Under certain caching situations it would be quite possible for C to see b progress beyond a at any time.
This is a simplistic demonstration of how the Java memory model allows you to reason about what can and cannot happen in a multi-threaded environment. In real life the potential race conditions between reading and writing to data structures can be much more complex but the reasoning process is the same.

How does the synchronized keyword work internally?

I read the program and answer below in a blog.
int x = 0;
boolean bExit = false;
Thread 1 (not synchronized)
x = 1;
bExit = true;
Thread 2 (not synchronized)
if (bExit == true)
    System.out.println("x=" + x);
Is it possible for Thread 2 to print "x=0"?
Ans: Yes. (Reason: "Every thread has their own copy of variables.")
How do you fix it?
Ans: Make both threads synchronize on a common mutex, or make both variables volatile.
My doubt is: if we make the two variables volatile, then the two threads will share the variables through main memory. This makes sense, but in the case of synchronization, how is the problem resolved, given that both threads have their own copies of the variables?
Please help me.
This is actually more complicated than it seems. There are several arcane things at work.
Caching
Saying "Every thread has their own copy of variables" is not exactly correct. Every thread may have their own copy of variables, and they may or may not flush these variables into the shared memory and/or read them from there, so the whole thing is non-deterministic. Moreover, the very term flushing is really implementation-dependent. There are strict terms such as memory consistency, happens-before order, and synchronization order.
Reordering
This one is even more arcane. This
x = 1;
bExit = true;
does not even guarantee that Thread 1 will first write 1 to x and then true to bExit. In fact, it does not even guarantee that any of these will happen at all. The compiler may optimize away some values if they are not used later. The compiler and CPU are also allowed to reorder instructions any way they want, provided that the outcome is indistinguishable from what would happen if everything was really in program order. That is, indistinguishable for the current thread! Nobody cares about other threads until...
Synchronization comes in
Synchronization does not only mean exclusive access to resources. It is also not just about preventing threads from interfering with each other. It's also about memory barriers. It can be roughly described as each synchronization block having invisible instructions at the entry and exit, the first one saying "read everything from the shared memory to be as up-to-date as possible" and the last one saying "now flush whatever you've been doing there to the shared memory". I say "roughly" because, again, the whole thing is an implementation detail. Memory barriers also restrict reordering: actions may still be reordered, but the results that appear in the shared memory after exiting the synchronized block must be identical to what would happen if everything was indeed in program order.
All that works, of course, only if both blocks use the same locking object.
The whole thing is described in detail in Chapter 17 of the JLS. In particular, what's important is the so-called "happens-before order". If you ever see in the documentation that "this happens-before that", it means that everything the first thread does before "this" will be visible to whoever does "that". This may not even require any locking. Concurrent collections are a good example: one thread puts something there, another one reads it, and that magically guarantees that the second thread will see everything the first thread did before putting that object into the collection, even if those actions had nothing to do with the collection itself!
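A hedged sketch of that guarantee (the names and the choice of ConcurrentLinkedQueue are mine, just for illustration): the plain write to the payload happens-before its retrieval from the collection, even though the payload field itself is not volatile.

import java.util.concurrent.ConcurrentLinkedQueue;

public class QueueHandoff {
    static final ConcurrentLinkedQueue<int[]> queue = new ConcurrentLinkedQueue<>();

    public static void main(String[] args) {
        new Thread(() -> {
            int[] payload = new int[1];
            payload[0] = 42;      // plain write, no synchronization of its own
            queue.offer(payload); // publishing via the concurrent collection
        }).start();
        new Thread(() -> {
            int[] p;
            while ((p = queue.poll()) == null); // spin until the hand-off arrives
            System.out.println(p[0]);           // guaranteed to print 42
        }).start();
    }
}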
Volatile variables
One last warning: you better give up on the idea that making variables volatile will solve things. In this case maybe making bExit volatile will suffice, but there are so many troubles that using volatiles can lead to that I'm not even willing to go into that. But one thing is for sure: using synchronized has much stronger effect than using volatile, and that goes for memory effects too. What's worse, volatile semantics changed in some Java version so there may exist some versions that still use the old semantics which was even more obscure and confusing, whereas synchronized always worked well provided you understand what it is and how to use it.
Pretty much the only reason to use volatile is performance because synchronized may cause lock contention and other troubles. Read Java Concurrency in Practice to figure all that out.
Q & A
1) You wrote "now flush whatever you've been doing there to the shared memory" about synchronized blocks. But will we see only the variables that we access in the synchronized block, or all the changes made by the thread calling synchronized (even on the variables not accessed in the synchronized block)?
Short answer: it will "flush" all variables that were updated during the synchronized block or before entering the synchronized block. And again, because flushing is an implementation detail, you don't even know whether it will actually flush something or do something entirely different (or doesn't do anything at all because the implementation and the specific situation already somehow guarantee that it will work).
Variables that weren't accessed inside the synchronized block obviously won't change during the execution of the block. However, if you change some of those variables before entering the synchronized block, for example, then you have a happens-before relationship between those changes and whatever happens in the synchronized block (the first bullet in 17.4.5). If some other thread enters another synchronized block using the same lock object, then it synchronizes-with the first thread's exit from the synchronized block, which means that you have another happens-before relationship here. So in this case the second thread will see the variables that the first thread updated prior to entering its synchronized block.
If the second thread tries to read those variables without synchronizing on the same lock, then it is not guaranteed to see the updates. But then again, it isn't guaranteed to see the updates made inside the synchronized block as well. But this is because of the lack of the memory-read barrier in the second thread, not because the first one didn't "flush" its variables (memory-write barrier).
2) In the chapter you posted (of the JLS) it is written that: "A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field." Doesn't this mean that when the variable is volatile you will see only changes to it (because it says the write happens-before the read, not happens-before every operation between them)? I mean, doesn't this mean that in the example given in the description of the problem, we could see bExit = true but x = 0 in the second thread if only bExit is volatile? I ask because I found this question here: http://java67.blogspot.bg/2012/09/top-10-tricky-java-interview-questions-answers.html and it is written that if bExit is volatile the program is OK. So will the registers flush only bExit's value, or both bExit's and x's values?
By the same reasoning as in Q1, if you do bExit = true after x = 1, then there is an in-thread happens-before relationship because of the program order. Now, since volatile writes happen-before volatile reads, it is guaranteed that the second thread will see whatever the first thread updated prior to writing true to bExit. Note that this behavior is only guaranteed since Java 1.5 or so, so older or buggy implementations may or may not support this. I have seen bits in the standard Oracle implementation that use this feature (the java.util.concurrent collections), so you can at least assume that it works there.
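Applied to the example from the question, making only bExit volatile is enough (a minimal sketch of the reasoning; the structure is mine):

public class BExitFix {
    static int x = 0;                      // plain field
    static volatile boolean bExit = false; // volatile flag

    static void writer() {  // Thread 1
        x = 1;              // hb(x = 1, bExit = true): program order
        bExit = true;       // volatile write
    }

    static void reader() {  // Thread 2
        if (bExit) {        // volatile read: hb(write of bExit, this read)
            // By transitivity, hb(x = 1, this line): x must be 1 here.
            System.out.println("x=" + x);
        }
    }
}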
3) Why does the monitor matter for memory visibility when using synchronized blocks? I mean, when exiting a synchronized block, aren't all variables (those accessed in this block, or all variables in the thread; this is related to the first question) flushed from registers to main memory or broadcast to all CPU caches? Why does the object of synchronization matter? I just cannot imagine what the relations between the object of synchronization and memory are and how they are made. I know that we should use the same monitor to see these changes, but I don't understand how the memory that should be visible is mapped to objects. Sorry for the long questions, but these are really interesting questions for me and they are related to the original question (I would have posted these as separate questions just for this example).
Ha, this one is really interesting. I don't know. Probably it flushes anyway, but the Java specification is written with high abstraction in mind, so maybe it allows for some really weird hardware where partial flushes or other kinds of memory barriers are possible. Suppose you have a two-CPU machine with two cores on each CPU. Each CPU has some local cache for every core and also a common cache. A really smart VM may want to schedule two threads on one CPU and two threads on the other. Each pair of threads uses its own monitor, and the VM detects that variables modified by these two threads are not used in any other threads, so it only flushes them as far as the CPU-local cache.
See also this question about the same issue.
4) I thought that everything before a volatile write will be up to date when we read it (moreover, in Java a volatile read is a memory barrier), but the documentation doesn't say this.
It does:
17.4.5.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
If hb(x, y) and hb(y, z), then hb(x, z).
A write to a volatile field (§8.3.1.4) happens-before every subsequent
read of that field.
If x = 1 comes before bExit = true in program order, then we have happens-before between them. If some other thread reads bExit after that, then we have happens-before between write and read. And because of the transitivity, we also have happens-before between x = 1 and read of bExit by the second thread.
5) Also, if we have a volatile Person p, do we have some dependency when we use p.age = 20 and print(p.age), or do we have a memory barrier in this case (assume age is not volatile)? I think not.
You are correct. Since age is not volatile, there is no memory barrier, and that's one of the trickiest things. Here is a fragment from CopyOnWriteArrayList, for example:
Object[] elements = getArray();
E oldValue = get(elements, index);
if (oldValue != element) {
    int len = elements.length;
    Object[] newElements = Arrays.copyOf(elements, len);
    newElements[index] = element;
    setArray(newElements);
} else {
    // Not quite a no-op; ensures volatile write semantics
    setArray(elements);
}
Here, getArray and setArray are a trivial getter and setter for the array field. But since the code changes elements of the array, it is necessary to write the reference to the array back to where it came from in order for the changes to the elements of the array to become visible. Note that this is done even if the element being replaced is the same element that was there in the first place! It is precisely because some fields of that element may have been changed by the calling thread, and it's necessary to propagate these changes to future readers.
6) And is there any happens-before between two subsequent reads of a volatile field? I mean, will the second read see all changes from the thread which read this field before it (of course, we will have changes only if volatile influences visibility of all changes before it, which I am a little confused about whether it is true or not)?
No, there is no relationship between volatile reads. Of course, if one thread performs a volatile write and then two other threads perform volatile reads, they are guaranteed to see everything at least as up to date as it was before the volatile write, but there is no guarantee of whether one thread will see more up-to-date values than the other. Moreover, there is not even a strict definition of one volatile read happening before another! It is wrong to think of everything happening on a single global timeline. It is more like parallel universes with independent timelines that sometimes sync their clocks by performing synchronization and exchanging data with memory barriers.
It depends on the implementation, which decides whether threads will keep a copy of the variables in their own memory. In the case of class-level variables, threads have shared access; in the case of local variables, threads keep their own copies. I will provide two examples which show this; please have a look at them.
And in your example, if I understood it correctly, your code should look something like this:
package com.practice.multithreading;

public class LocalStaticVariableInThread {
    static int x = 0;
    static boolean bExit = false;

    public static void main(String[] args) {
        Thread t1 = new Thread(run1);
        Thread t2 = new Thread(run2);
        t1.start();
        t2.start();
    }

    static Runnable run1 = () -> {
        x = 1;
        bExit = true;
    };

    static Runnable run2 = () -> {
        if (bExit == true)
            System.out.println("x=" + x);
    };
}
Output:
x=1
I always get this output. It is because the threads share the variables, and when one is changed by one thread, the other thread can see the change. But in real-life scenarios we can never say which thread will start first; since here the threads are not doing much work, we see the expected result.
Now take this example:
Here, because the i variable used in the for-loops is a shared field rather than a local variable, the threads don't keep their own copies of it, and you won't see the desired output: the count value will not be 2000 every time, even though the count increment is synchronized.
package com.practice.multithreading;

public class RaceCondition2Fixed {
    private int count;
    int i;

    /* Making it synchronized forces the thread to acquire an intrinsic lock on the
       method, and another thread cannot access it until this lock is released after
       the method is completed. */
    public synchronized void increment() {
        count++;
    }

    public static void main(String[] args) {
        RaceCondition2Fixed rc = new RaceCondition2Fixed();
        rc.doWork();
    }

    private void doWork() {
        Thread t1 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (i = 0; i < 1000; i++) {
                    increment();
                }
            }
        });
        Thread t2 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (i = 0; i < 1000; i++) {
                    increment();
                }
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        /* If we don't use join, then count will likely be printed as 0, because when
           we call t1.start() and t2.start() the threads start updating count in
           separate threads, and meanwhile the main thread prints the value.
           So we need to wait for the threads to complete. */
        System.out.println(Thread.currentThread().getName() + " Count is : " + count);
    }
}
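A minimal fix (my suggestion, not part of the answer above) is to make the loop counter local to each run(), so each thread iterates over its own counter; with increment() synchronized and the join() calls in place, the printed count is then reliably 2000:

Thread t1 = new Thread(new Runnable() {
    @Override
    public void run() {
        for (int i = 0; i < 1000; i++) { // local i: each thread gets its own counter
            increment();
        }
    }
});
// ... and the same change in t2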

Behavior of memory barrier in Java

After reading more blogs/articles etc., I am now really confused about the behavior of loads/stores before/after a memory barrier.
Following are two quotes from Doug Lea in one of his clarification articles about the JMM, which are both very straightforward:
Anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f.
Note that it is important for both threads to access the same volatile variable in order to properly set up the happens-before relationship. It is not the case that everything visible to thread A when it writes volatile field f becomes visible to thread B after it reads volatile field g.
But then, when I looked into another blog about memory barriers, I got these:
A store barrier, “sfence” instruction on x86, forces all store instructions prior to the barrier to happen before the barrier and have the store buffers flushed to cache for the CPU on which it is issued.
A load barrier, “lfence” instruction on x86, forces all load instructions after the barrier to happen after the barrier and then wait on the load buffer to drain for that CPU.
To me, Doug Lea's clarification is stricter than the other one: basically, it means that if the load barrier and store barrier are on different monitors, data consistency will not be guaranteed. But the latter one means that even if the barriers are on different monitors, data consistency will be guaranteed. I am not sure if I am understanding these two correctly, and I am also not sure which of them is correct.
Considering the following codes:
public class MemoryBarrier {
    volatile int i = 1, j = 2;
    int x;

    public void write() {
        x = 14; // W01
        i = 3;  // W02
    }

    public void read1() {
        if (i == 3) {     // R11
            if (x == 14)  // R12
                System.out.println("Foo");
            else
                System.out.println("Bar");
        }
    }

    public void read2() {
        if (j == 2) {     // R21
            if (x == 14)  // R22
                System.out.println("Foo");
            else
                System.out.println("Bar");
        }
    }
}
Let's say we have one writer thread TW1 that first calls MemoryBarrier's write() method, and then two reader threads TR1 and TR2 that call MemoryBarrier's read1() and read2() methods. Consider this program running on a CPU which does not preserve ordering (x86 DOES preserve ordering in such cases, so this does not apply there); according to the memory model, there will be a StoreStore barrier (let's say SB1) between W01/W02, as well as two LoadLoad barriers between R11/R12 and R21/R22 (let's say RB1 and RB2).
Since SB1 and RB1 are on the same monitor i, the thread TR1 which calls read1 should always see 14 for x, and "Foo" is always printed.
SB1 and RB2 are on different monitors; if Doug Lea is correct, thread TR2 will not be guaranteed to see 14 for x, which means "Bar" may occasionally be printed. But if memory barriers work the way Martin Thompson described in the blog, the store barrier will push all data to main memory and the load barrier will pull all data from main memory to the cache/buffer, and then TR2 will also be guaranteed to see 14 for x.
I am not sure which one is correct, or whether both of them are and what Martin Thompson described is just for the x86 architecture. The JMM does not guarantee that the change to x is visible to TR2, but the x86 implementation does.
Thanks~
Doug Lea is right. You can find the relevant part in section §17.4.4 of the Java Language Specification:
§17.4.4 Synchronization Order
[..] A write to a volatile variable v (§8.3.1.4) synchronizes-with all subsequent reads of v by any thread (where "subsequent" is defined according to the synchronization order). [..]
The memory model of the concrete machine doesn't matter, because the semantics of the Java programming language are defined in terms of an abstract machine, independent of the concrete machine. It's the responsibility of the Java runtime environment to execute the code in such a way that it complies with the guarantees given by the Java Language Specification.
Regarding the actual question:
If there is no further synchronization, the method read2 can print "Bar", because read2 can be executed before write.
If there is an additional synchronization with a CountDownLatch to make sure that read2 is executed after write, then method read2 will never print "Bar", because the synchronization with CountDownLatch removes the data race on x.
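A sketch of that additional synchronization (the class and variable names are mine; it reuses the MemoryBarrier class from the question):

import java.util.concurrent.CountDownLatch;

public class LatchDemo {
    public static void main(String[] args) throws InterruptedException {
        MemoryBarrier mb = new MemoryBarrier();
        CountDownLatch written = new CountDownLatch(1);
        Thread tw1 = new Thread(() -> {
            mb.write();
            written.countDown(); // happens-before a successful await()
        });
        Thread tr2 = new Thread(() -> {
            try {
                written.await(); // read2 now runs after write
            } catch (InterruptedException e) {
                return;
            }
            mb.read2(); // no data race on x anymore: prints "Foo"
        });
        tw1.start(); tr2.start();
        tw1.join(); tr2.join();
    }
}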
Independent volatile variables:
Does it make sense, that a write to a volatile variable does not synchronize-with a read of any other volatile variable?
Yes, it makes sense. If two threads need to interact with each other, they usually have to use the same volatile variable in order to exchange information. On the other hand, if a thread uses a volatile variable without a need for interacting with all other threads, we don't want to pay the cost for a memory barrier.
It is actually important in practice. Let's make an example. The following class uses a volatile member variable:
class Int {
    public volatile int value;
    public Int(int value) { this.value = value; }
}
Imagine this class is used only locally within a method. The JIT compiler can easily detect that the object is only used within this method (escape analysis).
public int deepThought() {
    return new Int(42).value;
}
With the above rule, the JIT compiler can remove all effects of the volatile reads and writes, because the volatile variable cannot be accessed from any other thread.
This optimization actually exists in the Java JIT compiler:
src/share/vm/opto/memnode.cpp
As far as I understood, the question is actually about volatile reads/writes and their happens-before guarantees. Speaking of that part, I have only one thing to add to nosid's answer:
Volatile writes cannot be moved before normal writes, and volatile reads cannot be moved after normal reads. That's why the read1() and read2() results will be as nosid wrote.
Speaking about barriers: the definition sounds fine to me, but the thing that probably confused you is that these are the mechanisms (call them what you like) used to implement the behavior described in the JMM in HotSpot. When using Java, you should rely on JMM guarantees, not implementation details.

Java memory barriers

I'm reading the JSR-133 Cookbook and have the following question about memory barriers. The book contains an example of inserted memory barriers, but only writes and reads of local variables are used. Suppose I have the following variables:
int a;
volatile int b;
And the code
b=a;
Do I understand correctly that this one line would produce the following instructions
load a
LoadStore membar
store b
The underlying behavior of the JVM is guaranteed only with respect to the volatile variable. It may be possible for two separate threads to have access to different values for variable 'a' even after a thread completes evaluation of the b = a; statement. The JVM only guarantees that access to the volatile variable is serialized and has Happens-Before semantics. What this means is that the result of executing b = a; on two different threads (in the face of a "volatile" value for 'a' (ha ha)) is indeterminate, because the JVM only says that the store to 'b' is serialized; it puts no guarantee on which thread has precedence.
More precisely what this means is that the JVM treats variable 'b' as having its own lock; allowing only one thread to read or write 'b' at a time; and this lock only protects access to 'b' and nothing else.
Now, this means different things under different JVMs and how this lock is actually implemented on different machine architectures may result in vastly different runtime behavior for your application. The only guarantee you should trust is what the Java reference manual says, "A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable." For further review see Dennis Byrne's excellent article for some examples of how different JVM implementations deal with this issue.
Happens-Before semantics are not very interesting in the provided example because an integer primitive doesn't provide much opportunity for the kind of instruction reordering that volatile was intended (in part) to remedy. A better example is this:
private AnObjectWithAComplicatedConstructor _sampleA;
private volatile AnObjectWithAComplicatedConstructor _sampleB;

public AnObjectWithAComplicatedConstructor getSampleA() {
    if (_sampleA == null) {
        _sampleA = new AnObjectWithAComplicatedConstructor();
    }
    return _sampleA;
}

public AnObjectWithAComplicatedConstructor getSampleB() {
    if (_sampleB == null) {
        _sampleB = new AnObjectWithAComplicatedConstructor();
    }
    return _sampleB;
}
In this example the field '_sampleA' has a serious problem: in a multithreaded situation, it is very possible that '_sampleA' is in the process of being initialized in one thread at the same time another thread attempts to use it, leading to all sorts of sporadic and very, very difficult-to-duplicate bugs. To see this, consider thread X executing the 'new' bytecode instruction in getSampleA() and then storing the (yet-to-be-initialized) result in field '_sampleA'. Thread X is now paused by the JVM, and thread Y starts executing getSampleA() and sees that '_sampleA' is not null; the uninitialized value is then returned, and thread Y starts calling methods on the resulting instance, causing all sorts of problems which will, of course, only appear in production, at odd hours, and under heavy service loads.
The worst case for field '_sampleB' is that multiple threads may initialize individual instances, all but one of which will eventually be discarded. Code like this should be wrapped in a "synchronized" block, but the volatile keyword will do the trick, because it requires that the value finally stored in '_sampleB' has Happens-Before semantics, which means that the stuff to the right of the equals sign is guaranteed to be complete when the stuff on the left-hand side of the equals sign is assigned.
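For reference, the standard way to combine both is the double-checked locking idiom, which is safe under the Java 5 memory model as long as the field is volatile (a sketch; the local variable is the usual refinement to avoid a second volatile read on the fast path):

private volatile AnObjectWithAComplicatedConstructor _sampleB;

public AnObjectWithAComplicatedConstructor getSampleB() {
    AnObjectWithAComplicatedConstructor result = _sampleB; // one volatile read
    if (result == null) {
        synchronized (this) {
            result = _sampleB; // re-check under the lock
            if (result == null) {
                _sampleB = result = new AnObjectWithAComplicatedConstructor();
            }
        }
    }
    return result; // never a partially constructed object
}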
