Do I need the volatile keyword? (Java)

Do I only need to mark a field volatile if multiple threads are reading it at the same time?
What about the scenario where Thread A changes the value of a field, and Thread B evaluates it after Thread A is guaranteed to have completed?
In my scenario, is there a happens-before relationship enforced (without the volatile keyword)?

You need the volatile keyword or some other synchronization mechanism to enforce the "happens before" relationship that guarantees visibility of the variable in threads other than the thread in which it was written. Without such synchronization, everything is treated as happening "at the same time", even if it doesn't by wall clock time.
In your particular example, one thing that may happen without synchronization is that the value written by thread A is never flushed from cache to main memory, and thread B executes on another processor and never sees the value written by thread A.
When you are dealing with threads, wall clock time means nothing. You must synchronize properly if you want data to pass between threads properly. There's no shortcut to proper synchronization that won't cause you headaches later on.
In the case of your original question, some ways proper synchronization can be achieved are by using the volatile keyword, by using synchronized blocks, or by having the thread that is reading the value of the variable join() the thread in which the variable is written.
Edit: In response to your comment, a Future has internal synchronization such that calling get() on a Future establishes a "happens before" relationship when the call returns, so that also achieves proper synchronization.
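As a rough illustration of the Future case, here is a minimal sketch. The class name FutureVisibilityDemo and the field result are made up for this example; the field is deliberately not volatile, and the sketch relies only on the documented guarantee that actions taken by the task happen-before a successful Future.get() in another thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureVisibilityDemo {
    // Hypothetical shared field, deliberately NOT volatile:
    // visibility is supposed to come from Future.get().
    static int result;

    static int computeAndRead() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // The task writes the plain field on the executor's worker thread.
            Future<?> done = executor.submit(() -> { result = 42; });
            // Everything the task did happens-before get() returns here.
            done.get();
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(computeAndRead()); // prints 42
    }
}
```

Reading result before calling get() would have no such guarantee.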

No, you don't need volatile...
is there a happens-before relationship enforced (without the volatile keyword)?
...but your code needs to do something to establish "happens-before."
There will be a happens-before relationship if (and only if) your code does something that the "Java Language Specification" (JLS) says will establish "happens-before."
What about the scenario where Thread A changes the value of a field, and Thread B evaluates it after Thread A is guaranteed to have completed?
Depends on what you mean by "guaranteed." If "guaranteed" means, "established happens-before," then your code will work as expected.
One way you can guarantee it is for thread B to call threadA.join(). The JLS guarantees that if thread B calls threadA.join(), then everything thread A did must "happen before" the join() call returns.
You do not need any of the shared variables to be volatile if thread B only accesses them after joining thread A.
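A minimal sketch of the join() case, with hypothetical names (JoinVisibilityDemo, shared). The field is a plain int with no volatile and no locking; the only synchronization is the join() call.

```java
class JoinVisibilityDemo {
    // Hypothetical shared field: plain, not volatile.
    static int shared;

    static int writeThenJoin() {
        Thread threadA = new Thread(() -> shared = 7);
        threadA.start();
        try {
            // All actions of threadA happen-before join() returns.
            threadA.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Reading after join(), so the write by threadA is visible.
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(writeThenJoin()); // prints 7
    }
}
```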

You can choose one of the available options to achieve the same purpose.
You can use volatile to force all threads to get the latest value of the variable from main memory.
You can use synchronization to guard critical data.
You can use the Lock API.
You can use atomic variables.
Refer to this documentation page on high level concurrency constructs.
Have a look at related SE questions:
Avoid synchronized(this) in Java?
What is the difference between atomic / volatile / synchronized?

Related

Understanding atomic-ness, visibility and reordering of synchronized blocks vs volatile variables in Java

I am trying to understand the volatile keyword from the book Java Concurrency in Practice. It compares the synchronized keyword with volatile variables in three aspects: atomicity, visibility and reordering. I have some doubts about these, which I discuss one by one below:
1) Visibility: `synchronized` vs `volatile`
Book says following with respect to visibility of synchronized:
Everything thread A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock.
It says following with respect to visibility of volatile variables:
Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
The visibility effects of volatile variables extend beyond the value of the
volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable. So from a memory visibility perspective, writing a volatile variable is like exiting a synchronized block and reading a volatile variable is like entering a synchronized block.
Q1. I feel second paragraph above (of volatile) corresponds to what book said about synchronized. But is there synchronized-equivalent to volatile's first paragraph? In other words, does using synchronized ensures any / some variables not getting cached in processor caches and registers?
Note that book also says following about visibility for synchronized:
Locking is not just about mutual exclusion; it is also about memory visibility.
2) Reordering: `synchronized` vs `volatile`
Book says following about volatile in the context of reordering:
When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations.
Q2. Book does not say anything about reordering in the context of synchronized. Can someone explain what can be said of reordering in the context of synchronized?
3) Atomicity
Book says following about atomicity of synchronized and volatile.
the semantics of volatile are not strong enough to make the increment operation (count++) atomic, unless you can guarantee that the variable is written only from a single thread.
Locking can guarantee both visibility and atomicity; volatile variables can
only guarantee visibility.
Q3. I guess this means two threads can see volatile int a together, both will increment it and then save it. But only one last read will have effect, thus making whole "read-increment-save" non atomic. Am I correct with this interpretation on non-atomic-ness of volatile?
Q4. Does all locking-equivalent are comparable and have same visibility, ordering and atomicity property: synchronized blocks, atomic variables, locks?
PS: This question is related to, and is a completely revamped version of, this question I asked some days back. Since it's a full revamp, I haven't deleted the older one. I wrote this question in a more focused and structured way. I will delete the older one once I get an answer to this one.

The key difference between 'synchronized' and 'volatile', is that 'synchronized' can make threads pause, whereas volatile can't.
'caches and registers' isn't a thing. The book says that because in practice that's usually how things are implemented, and it makes it easier (or perhaps not, given these questions) to understand the how & why of the JMM (java memory model).
The JMM doesn't name them, however. All it says is that the VM is free to give each thread its own local copy of any variable, or not, to be synchronized at some arbitrary time with some or all other threads, or not... unless there is a happens-before relationship anywhere, in which case the VM must ensure that at the point of execution between the two threads where a happens before relationship has been established, they observe all variables in the same state.
In practice that would presumably mean to flush caches. Or not; it might mean the other thread overwrites its local copy.
The VM is free to implement this stuff however it wants, and differently on every architecture out there. As long as the VM sticks to the guarantees that the JMM makes, it's a good implementation, and as a consequence, your software must work given solely those guarantees and no other assumptions; because what works on your machine might not work on another if you rely on assumptions that aren't guaranteed by the JMM.
Reordering
Reordering is also not in the VM spec, at all. What IS in the VM spec are the following two concepts:
Within the confines of a single thread, all you can observe from inside it is consistent with an ordered view. That is, if you write 'x = 5; y = 10;' it is not possible to observe, from within the same thread, y being 10 but x being its old value. Regardless of synchronized or volatile. So, anytime it can reorder things without that being observable, then the VM is free to. Will it? Up to the VM. Some do, some don't.
When observing effects caused by other threads, and you have not established a happens-before relationship, you may see some, all, or none of these effects, in any order. Really, anything can happen here. In practice, then: Do NOT attempt to observe effects caused by other threads without establishing a happens-before, because the results are arbitrary and untestable.
Happens-before relationships are established by all sorts of things; synchronized blocks obviously do it (if your thread is frozen trying to acquire a lock, and it then runs, any synchronized blocks on that object that finished 'happened before', and anything they did you can now observe, with the guarantee that what you observe is consistent with those things having run in order, and where all data they wrote you can see, as in, you won't get an older 'cache' or whatnot). Volatile accesses do as well.
Atomicity
Yes, your interpretation of why x++ is not atomic even if x is volatile, is correct.
I'm not sure what your Q4 is trying to ask.
In general, if you want to atomically increment an integer, or do any of many other concurrent-esque operations, look at the java.util.concurrent package and its subpackages. They contain efficient and useful implementations of various concepts. AtomicInteger (in java.util.concurrent.atomic), for example, can be used to atomically increment something, in a way that is visible from other threads, while still being quite efficient (for example, if your CPU supports Compare-And-Set (CAS) operations, AtomicInteger will use it; not something you can do from general java without resorting to Unsafe).
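For instance, a small sketch of an atomic counter shared by two threads (the class and method names are invented for this example). With a volatile int and count++ the final value could come out lower; with incrementAndGet() no increment is lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    static int countWithTwoThreads(int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(10_000)); // prints 20000
    }
}
```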

Just to complement rzwitserloot's excellent answer:
A1. You can think of it like so: synchronized guarantees that all cached changes will become visible to other threads that enter a synchronized block (flushed from the cache) once the first thread exits the synchronized block and before the other enters.
A2. Operations executed by a thread T1 within a synchronized block appear to some other thread T2 as not reordered if and only if T2 synchronizes on the same guard.
A3. I'm not sure what you understand by that. It may happen that when incrementing both threads will first perform a read of the variable a which will yield some value v, then both threads will locally increase their local copy of the value v producing v' = v + 1, then both threads will write v' to a. Thus finally the value of a could be v + 1 instead of v + 2.
A4. Basically yes, although in a synchronized block you can perform atomically many operations, while atomic variables allow you to only do a certain single operation like an atomic increment. Moreover, the difference is that when using the synchronized block incorrectly, i.e. by reading variables outside of a synchronized block which are modified by another thread within a synchronized block, you can observe them not-atomically and reordered. Something which is impossible with atomic variables. Locking is exactly the same as synchronized.
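To illustrate the point in A4 about doing several operations atomically, here is a sketch with invented names (SynchronizedPairDemo, transfer, total). Two fields are updated together under one lock, and a reader that takes the same lock never sees the pair half-updated. A single atomic variable could not protect both fields at once.

```java
class SynchronizedPairDemo {
    // Both fields are guarded by the monitor of 'this'.
    private int from = 100;
    private int to = 0;

    // Two writes that must be seen together or not at all.
    synchronized void transfer(int amount) {
        from -= amount;
        to += amount;
    }

    // Reads both fields under the same lock as transfer().
    synchronized int total() {
        return from + to;
    }

    static boolean invariantHeld() {
        SynchronizedPairDemo pair = new SynchronizedPairDemo();
        Thread mover = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                pair.transfer(1);
            }
        });
        mover.start();
        boolean ok = true;
        while (mover.isAlive()) {
            ok &= pair.total() == 100; // never observes a half-done transfer
        }
        try {
            mover.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ok && pair.total() == 100;
    }

    public static void main(String[] args) {
        System.out.println(invariantHeld()); // prints true
    }
}
```

If total() were not synchronized, the reader could observe the fields between the two writes, which is the incorrect usage A4 warns about.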

Q1. I feel second paragraph above (of volatile) corresponds to what book said about synchronized.
Sure. volatile access can be regarded as synchronization lite.
But is there
synchronized-equivalent to volatile's first paragraph? In other words,
does using synchronized ensures any / some variables not getting
cached in processor caches and registers?
The book has confused you by mixing levels. volatile access has nothing directly to do with processor caches or registers, and in fact the book is surely incorrect about the caches. Volatility and synchronization are about the inter-thread visibility of certain actions, especially of writes to shared variables. How the semantics are implemented is largely a separate concern.
In any case, no, synchronization does not put any constraints on storage of variables. Everything to do with synchronized semantics happens at the boundaries of synchronized regions. This is why all accesses to a given variable from a set of concurrently running threads must be synchronized on the same object in order for a program to be properly synchronized with respect to that variable.
Book says following about volatile in the context of reordering:
When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it
should not be reordered with other memory operations.
Q2. Book does not say anything about reordering in the context of synchronized. Can someone explain what can be said of reordering in
the context of synchronized?
But that already does say something (not everything) about synchronized access. You need to understand that a "memory operation" in this sense is a read or write of a shared variable, or acquiring or releasing any object's monitor. Entry to a synchronized region involves acquiring a monitor, so already the book says, correctly, that volatile accesses will not be reordered across the boundaries of a synchronized region.
More generally, reads of shared variables will not be reordered relative to the beginning of a synchronized region, and writes will not be reordered relative to the end of one.
Q3. I guess this means two threads can see volatile int a together, both will increment it and then save it. But only one last
read will have effect, thus making whole "read-increment-save" non
atomic. Am I correct with this interpretation on non-atomic-ness of
volatile?
Yes. The autoincrement operator performs both a read and a write of the variable to which it is applied. If that variable is volatile then volatile semantics apply to those individually, so other operations on the same variable can happen between if there is no other protection.
Q4. Does all locking-equivalent are comparable and have same visibility, ordering and atomicity property: synchronized blocks,
atomic variables, locks?
Huh? This sub-question is much too broad. You're reading a whole book about it. Generally speaking, though, no, these mechanisms have some characteristics in common and some that differ. All have effects on the visibility of memory operations and their ordering, but they are not identical. "Atomicity" is a function of the other two.

Is visibility guaranteed for sequential thread runs for non volatile var?

I have read below references:
http://tutorials.jenkov.com/java-concurrency/volatile.html
https://www.geeksforgeeks.org/volatile-keyword-in-java/
https://docs.oracle.com/javase/tutorial/essential/concurrency/atomic.html
But, I am still not clear on what is the expected behavior in below case:
I have a thread pool in which thread is reused.
I have a non-volatile var which is accessed by different threads from that thread pool.
The threads run sequentially. (Yes, they could be run in one thread. But this is just for this case, so don't ask why not use one thread.)
And the part I am not clear is that. Do I still need volatile to make sure the change made in previous thread is visible to the next thread execution.
For example, does Java flush the thread-local cache to memory after each execution?
And does the thread reload the local cache before each execution?

Java Memory Model design can answer your question:
In the Java Memory Model a volatile field has a store barrier inserted after a write to it and a load barrier inserted before a read of it. Qualified final fields of a class have a store barrier inserted after their initialisation to ensure these fields are visible once the constructor completes when a reference to the object is available.
see https://dzone.com/articles/memory-barriersfences
In other words: when a thread tries to read from a volatile var, the JMM forces all CPUs owning this memory area to write back to memory from the local CPU's cache. Then the CPU loads its value into the local cache if necessary.
And the part I am not clear is that. Do I still need volatile to make sure the change made in previous thread is visible to the next thread execution.
Yes and no. The volatile keyword only makes the value of a variable visible to other threads. If only one thread at a time may read and write the variable, you need synchronization. If you only need visibility, the volatile keyword is enough.
If you do not use the volatile keyword and share the value between threads, the value a thread sees might not be up to date, or might even be corrupted. For a non-volatile long, the JLS (§17.7) allows a 64-bit write to be performed as two separate 32-bit writes, which are not atomic together, so another thread can read the value between these two writes.

It doesn't matter whether the threads are running sequentially or in parallel: if you don't use the volatile keyword (or some other mechanism that establishes a happens-before relationship), there is no visibility guarantee. That means there is no guarantee that the other thread will see the latest value, as the thread may read the value from a register.
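For the thread pool scenario in the question, how the tasks are sequenced matters. The sketch below (hypothetical names, plain non-volatile field) assumes the second task is submitted only after the first one's Future.get() has returned. The java.util.concurrent documentation states that a task's actions happen-before a successful get(), and that actions prior to submitting a task happen-before that task starts, so the chain covers the plain field.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SequentialPoolTasksDemo {
    // Hypothetical shared field: plain, not volatile.
    static int counter;

    static int runTwoTasksInSequence() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // First task writes the field on some pool thread.
            Future<?> first = pool.submit(() -> { counter = 1; });
            first.get(); // the write happens-before get() returns

            // Submitted after get(): submission happens-before the task runs,
            // so the second task is guaranteed to see counter == 1.
            Future<Integer> second = pool.submit(() -> counter + 1);
            return second.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTwoTasksInSequence()); // prints 2
    }
}
```

If the tasks were only sequential by wall clock time, with nothing like get() linking them, this guarantee would not apply.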

Preferring synchronized to volatile

I've read this answer in the end of which the following's written:
Anything that you can with volatile can be done with synchronized, but
not vice versa.
It's not clear. JLS 8.3.1.4 defines volatile fields as follows:
A field may be declared volatile, in which case the Java Memory Model
ensures that all threads see a consistent value for the variable
(§17.4).
So, the volatile fields are about memory visibility. Also, as far as I got from the answer I cited, reading and writing to volatile fields are synched.
Synchronization, in turn, guarantees that only one thread at a time has access to a synched block. As I understand it, it has nothing to do with memory visibility. What did I miss?

In fact synchronization is also related to memory visibility, as the JVM adds a memory barrier at the exit of the synchronized block. This ensures that the results of writes by the thread in the synchronized block are guaranteed to be visible to reads by other threads once the first thread has exited the synchronized block.
Note: following @PaŭloEbermann's comment, if the other threads do not go through a read memory barrier (by entering a synchronized block, for example), their local cache will not be invalidated and therefore they might read an old value.
The exit of a synchronized block is a happens-before in this doc : http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html#MemoryVisibility
Look for these extracts:
The results of a write by one thread are guaranteed to be visible to a
read by another thread only if the write operation happens-before the
read operation.
and
An unlock (synchronized block or method exit) of a monitor
happens-before every subsequent lock (synchronized block or method
entry) of that same monitor. And because the happens-before relation
is transitive, all actions of a thread prior to unlocking
happen-before all actions subsequent to any thread locking that
monitor.
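As a small sketch of the quoted rule (names invented for illustration): the writer and the reader both go through synchronized methods on the same object, so the unlock at the end of set() happens-before any later lock taken by get(). The field itself is not volatile.

```java
class SynchronizedBoxDemo {
    private int value; // guarded by the monitor of 'this'

    synchronized void set(int newValue) {
        value = newValue;
    }

    synchronized int get() {
        return value;
    }

    static int writeAndPoll() {
        SynchronizedBoxDemo box = new SynchronizedBoxDemo();
        Thread writer = new Thread(() -> box.set(5));
        writer.start();
        int seen;
        // Each get() locks the same monitor, so once the writer has
        // unlocked, the next get() must observe the new value.
        while ((seen = box.get()) == 0) {
            Thread.yield();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(writeAndPoll()); // prints 5
    }
}
```

If get() read the field without synchronized, the loop would have no such guarantee.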

Synchronized and volatile are different, but usually both of them are used to solve same common problem.
Synchronized is to make sure that only one thread will access the shared resource at a given point of time.
Those shared resources are often declared volatile because, if a thread has changed the shared resource's value, it has to be updated in the other threads too. Without volatile, the runtime may optimize the code by reading the value from the cache. So what volatile does is this: whenever any thread accesses a volatile variable, it won't read the value from the cache; instead it actually gets it from main memory and uses that.
Was going through log4j code and this is what I found.
/**
* Config should be consistent across threads.
*/
protected volatile PrivateConfig config;

If multiple threads write to a shared volatile variable and they also need to use a previous value of it, it can create a race condition. So at this point you need to use synchronization.
... if two threads are both reading and writing to a shared variable, then using the volatile keyword for that is not enough. You need to use a synchronized in that case to guarantee that the reading and writing of the variable is atomic. Reading or writing a volatile variable does not block threads reading or writing. For this to happen you must use the synchronized keyword around critical sections.
For a detailed tutorial about volatile, see 'volatile' is not always enough.

That's wrong. Synchronization has to do with memory visibility. Every thread has its own cache. If you acquire a lock, the cache is refreshed. If you release a lock, the cache is flushed to main memory.
If you read a volatile field there is also a refresh; if you write a volatile field there is a flush.

Java volatile and synchronized

I know that the volatile keyword refreshes all the invisible data, i.e. if some thread reads a volatile variable, all potentially invisible variables/references (not only the variable being read) will be okay (visible, i.e. with correct values) after this read. Right? But what about synchronized? Is it the same? If we read 3 variables in a synchronized block, for example, will all other variables be visible too?
What will happen if one thread changes the value of some variable (for example, sets the variable "age" from 2 to 33) from a non-synchronized block and the thread dies after this? The value could be written in the thread stack, but main thread maybe will not see this change, the background thread will die and the new value is gone and can not be retrieved?
And the last question: if we have 2 background threads and we know that our main thread will be notified (in some way) just before each of them dies, and our thread will wait both of them to finish their work and will continue after that, how we can assure that all variables changes(which are made by the background threads) will be visible to the main thread? Can we just put a synchronized block after the background threads finish? We don't want to use synchronized blocks every time we access the variables changed by the background threads after those threads are dead (because it's overhead), but we need to have their right values. And it seems unnatural to read some fake volatile variable or to use a fake synchronized block (if it refreshes all data) just to refresh all data.
I hope that my questions are explained well.
Thanks in advance.

Reading the value of a volatile variable creates a happens-before relationship between the writing from one thread and the reading of another.
See http://jeremymanson.blogspot.co.uk/2008/11/what-volatile-means-in-java.html:
The Java volatile modifier is an example of a special mechanism to
guarantee that communication happens between threads. When one thread
writes to a volatile variable, and another thread sees that write, the
first thread is telling the second about all of the contents of memory
up until it performed the write to that volatile variable.
Synchronized blocks create a happens-before relationship as well. See http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/package-summary.html
An unlock (synchronized block or method exit) of a monitor
happens-before every subsequent lock (synchronized block or method
entry) of that same monitor. And because the happens-before relation
is transitive, all actions of a thread prior to unlocking
happen-before all actions subsequent to any thread locking that
monitor.
Both have the same effect on visibility.
If a value is written without any kind of synchronization then there is no guarantee that other threads will ever see this value. If you want to share values across threads, you should be adding some kind of synchronization or locking, or using volatile.
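A minimal sketch of the volatile case described above, using made-up names (VolatilePublishDemo, payload, ready). Only the flag is volatile; the payload is a plain field that is written before the flag and read after it, so it piggybacks on the happens-before edge created by the volatile write and read.

```java
class VolatilePublishDemo {
    static int payload;            // plain field, published via the flag
    static volatile boolean ready; // volatile flag

    static int publishAndRead() {
        ready = false;
        payload = 0;
        Thread writer = new Thread(() -> {
            payload = 99; // ordinary write...
            ready = true; // ...made visible by the volatile write after it
        });
        writer.start();
        // Spin until the volatile read observes the writer's volatile write.
        while (!ready) {
            Thread.yield();
        }
        // The write to payload happens-before the read of ready == true.
        return payload;
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 99
    }
}
```

Without volatile on ready, the loop might never terminate, and even if it did, the value of payload would not be guaranteed.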

All your questions are answered in the java.util.concurrent package documentation.
But what about synchronized?
Here's what the documentation says:
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor. And because the happens-before relation is transitive, all actions of a thread prior to unlocking happen-before all actions subsequent to any thread locking that monitor.
The value could be written in the thread stack, but main thread maybe will not see this change, the background thread will die and the new value is gone and can not be retrieved?
If the value is written in the thread stack, then we're talking about a local variable. Local variables are not accessible anywhere except in the method declaring that variable. So if the thread dies, of course the stack doesn't exist and the local variable doesn't exist either. If you're talking about a field of an object, that's stored on the heap. If no other thread has a reference to this object, and if the reference is unreachable from any static variable, then it will be garbage collected.
and our thread will wait both of them to finish their work and will continue after that, how we can assure that all variables changes(which are made by the background threads) will be visible to the main thread
The documentation says:
All actions in a thread happen-before any other thread successfully returns from a join on that thread.
So, since the main thread waits for the background threads to die, it uses join(), and every action made by the background threads will be visible by the main thread after join() returns.

Fully covering the whole topic from the level your question seems to imply will require more than a StackOverflow.com answer, so I recommend looking for a good book on multi threading programming.
volatile guarantees that the read and write accesses to the qualified variable are totally ordered with respect to other accesses to the same volatile variable [1].
It does this by preventing the volatile read and write accesses from being reordered with previous or future instructions, and by enforcing that all the side effects before the access of the writing thread are visible to a reading thread.
This means that volatile variables are read and written as you see in your code, as if the instructions were executed one at a time, beginning the next one only when all the side effects of the previous one are completed and visible to every other thread.
To better understand what this means and why this is necessary, look at this question of mine on the difference between Memory Barriers and lock prefixed instruction.
Note that volatile in Java is much much much stronger than volatile in C or C++. It guarantees more than the usual treatment of read/write accesses as side effects for optimization purposes. This means that it doesn't simply imply that the variable is read from memory every time; a Java volatile is a memory barrier.
The synchronized block simply guarantees exclusive (i.e. one thread at a time) execution of a block of code.
It doesn't imply that all the threads see the memory accesses in the same order; a thread could see first the write to a protected shared data structure and then the write to the lock!
[1] In some circumstances, where a full memory barrier is emitted, this may enforce that writes and reads to all volatile variables made by a thread T will be visible to other threads in T's program order. Note that this does not suffice for synchronization, as there is still no relationship between inter-thread accesses.
No.
Shared vars are not on the stack of any thread; a copy of their value can be, but the variables exist independently of any thread.
When a thread dies gracefully, the write is done and can be retrieved (beware of memory ordering again).
If the thread dies forcefully, it could be interrupted anywhere. However the Java assignment is actually implemented, if the thread stops before the write to the shared var (by direct assignment or by copying a local value on the stack), then the write never happened.
You don't need synchronized, just volatile as the main thread only reads and the background threads only write (different vars).

Does empty synchronized(this){} have any meaning to memory visibility between threads?

I read this in an upvoted comment on StackOverflow:
But if you want to be safe, you can add simple synchronized(this) {}
at the end of your @PostConstruct [method]
[note that variables were NOT volatile]
I was thinking that happens-before is forced only if both write and read is executed in synchronized block or at least read is volatile.
Is the quoted sentence correct? Does an empty synchronized(this) {} block flush all variables changed in current method to "general visible" memory?
Please consider some scenarios:
what if second thread never calls lock on this? (suppose that second thread reads in other methods). Remember that question is about: flush changes to other threads, not give other threads a way (synchronized) to poll changes made by original thread. Also no-synchronization in other methods is very likely in Spring @PostConstruct context - as original comment says.
is memory visibility of changes forced only in second and subsequent calls by another thread? (remember that this synchronized block is a last call in our method) - this would mark this way of synchronization as very bad practice (stale values in first call)

Much of what's written about this on SO, including many of the answers/comments in this thread, is, sadly, wrong.
The key rule in the Java Memory Model that applies here is: an unlock operation on a given monitor happens-before a subsequent lock operation on that same monitor. If only one thread ever acquires the lock, it has no meaning. If the VM can prove that the lock object is thread-confined, it can elide any fences it might otherwise emit.
The quote you highlight assumes that releasing a lock acts as a full fence. And sometimes that might be true, but you can't count on it. So your skeptical questions are well-founded.
See Java Concurrency in Practice, Ch 16 for more on the Java Memory Model.

All writes that occur prior to a monitor exit are visible to all threads after a monitor enter.
A synchronized(this){} can be turned into bytecode like
monitorenter
monitorexit
So if you have a bunch of writes prior to the synchronized(this){} they would have occurred before the monitorexit.
This brings us to the next point of my first sentence.
visible to all threads after a monitor enter
So now, in order for a thread to ensure the writes occurred it must execute the same synchronization, i.e. synchronized(this){}. This will issue at the very least a monitorenter and establish your happens-before ordering.
So to answer your question
Does an empty synchronized(this) {} block flush all variables changed
in current method to "general visible" memory?
Yes, as long as you maintain the same synchronization when you want to read those non-volatile variables.
To address your other questions
what if second thread never calls lock on this? (suppose that second
thread reads in other methods). Remember that question is about: flush
changes to other threads, not give other threads a way (synchronized)
to poll changes made by original thread. Also no-synchronization in
other methods is very likely in Spring @PostConstruct context
Well in this case using synchronized(this) without any other context is relatively useless. There is no happens-before relationship and it's in theory just as useful as not including it.
is memory visibility of changes forced only in second and subsequent
calls by another thread? (remember that this synchronized block is a
last call in our method) - this would mark this way of synchronization
as very bad practice (stale values in first call)
Memory visibility is forced by the first thread calling synchronized(this), in that it will write directly to memory. Now, this doesn't necessarily mean each thread needs to read directly from memory. They can still read from their own processor caches. Having a thread call synchronized(this) ensures it pulls the value of the field(s) from memory and retrieves the most up-to-date value.
