Why is this Java class not thread-safe?
class TestClass {
private int x;
int get() {
return x;
}
void set(int x) {
this.x = x;
}
}
I read that the synchronized keyword is needed to make it thread-safe. After all, aren't the operations done inside it atomic?
Although the assignment itself is an atomic operation, due to different hardware and compiler implementations, different threads may see different values of the member x. I.e., a modification by one thread may be invisible to the other thread, because of some kind of caching. This is usually called a thread visibility problem.
You can synchronize your code properly either by synchronizing on a monitor (using the synchronized keyword or the java.util.concurrent locks), or by declaring x to be volatile.
With multiple processors, some values may be cached by a processor and may not reflect the changes made by other threads/processors to the same objects. Actually, a JVM may be implemented to work this way even with a single processor.
Synchronized methods are explicitly required by the language specification to act as a memory barrier and to force a reread of all instance variables from memory.
Because your code is not synchronized, one thread may set the value, but the other thread will return the value still cached by that thread.
Please read 'Memory and Locks' chapter of Java Language Specification.
Because the field 'x' is not declared volatile there is no requirement for the JVM to ensure that 'x' is visible to all other threads. I.e. if one thread is constantly reading the value of 'x' and another thread is writing it, it is possible that the reading thread will never "see" the value change.
A synchronized keyword is not required, but will work as it will create the necessary memory barrier/cache flush to ensure 'x' is visible, but using the volatile keyword in this case will be more efficient.
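As a rough illustration of the volatile option, here is a minimal sketch. The class name VolatileCell and the small demo in main are made up for this example; the only real change from the class in the question is the volatile modifier on the field.

```java
// Hypothetical name; same shape as TestClass, with the field declared volatile.
class VolatileCell {
    // A write to a volatile field happens-before every subsequent read of it.
    private volatile int x;

    int get() {
        return x;
    }

    void set(int x) {
        this.x = x;
    }

    public static void main(String[] args) throws InterruptedException {
        VolatileCell cell = new VolatileCell();
        Thread writer = new Thread(() -> cell.set(42));
        writer.start();
        writer.join(); // wait until the writer thread has finished
        System.out.println(cell.get());
    }
}
```

This only covers plain reads and writes. A compound action such as cell.set(cell.get() + 1) would still need a lock or an atomic class.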
When two methods modify or access a non-volatile variable without synchronization, the class is not thread-safe. If you want to have just one method, you can try:
synchronized int getAndSet(int x, boolean set) {
if (set) this.x = x;
return this.x; // param x is for set
}
In the book "Java Concurrency in Practice", under the section, 3.1.1 State data, there is a code
@NotThreadSafe
public class MutableInteger {
private int value;
public int get() { return value; }
public void set(int value) { this.value = value; }
}
which is not thread-safe, because:
if one thread calls set, other threads calling get may or may not see
that update.
whereas using synchronized keyword on both set and get methods makes it "correct". How?
@ThreadSafe
public class SynchronizedInteger {
@GuardedBy("this") private int value;
public synchronized int get() { return value; }
public synchronized void set(int value) { this.value = value; }
}
Here too, if value is 0 and thread A calls set(2) while thread B calls get(), B may get 0 and then A sets it to 2, which is what the previous code was already doing. So what benefit do we get from synchronizing the code?
Maybe I am missing something, so please guide me. Thank you.
The issue you fix this way is not the case where one thread executes set immediately after another thread executes get. In that case get will still return the "old" value (technically correct at the time, but soon to be wrong).
The issue the synchronization fixes is that even if thread B wrote before thread A read, A could read an old value due to caching (most likely CPU caches, but this depends on the JVM implementation). A non-synchronized read from a non-volatile variable can use a cached value. In other words: the synchronized creates a read-barrier, which means "you have to re-read this value, even if you already have it in your CPU cache".
Note that for this specific case, simply adding volatile to value would have the same effect, but for more complex access patterns synchronized (or its equivalent in newer APIs, Lock) is necessary.
When you use a synchronized method, you get exclusive access to the object that is at risk of a race condition. In this case you get exclusive access to value.
This is achieved because a synchronized method acquires the object's intrinsic lock (its monitor) on entry and releases it on exit.
Synchronized in Java
Java Doc - Here you can find a good example.
From Java Doc:
If count is an instance of SynchronizedCounter, then making these methods synchronized has two effects:
First,
it is not possible for two invocations of synchronized methods on the
same object to interleave.
When one thread is executing a synchronized method for an object, all
other threads that invoke synchronized methods for the same object
block (suspend execution) until the first thread is done with the
object.
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads.
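To see both effects together, here is a small sketch in the spirit of the tutorial's SynchronizedCounter. The runIncrements helper, the thread count and the loop size are assumptions added for the demo and are not part of the quoted documentation.

```java
class SynchronizedCounter {
    private int c = 0;

    public synchronized void increment() {
        c++; // read-modify-write, safe only because the lock on this is held
    }

    public synchronized int value() {
        return c; // same lock as increment(), so the latest write is visible
    }

    // Hypothetical helper: each of `threads` threads increments `perThread` times.
    static int runIncrements(int threads, int perThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        // No increment is lost, so this prints threads * perThread.
        System.out.println(runIncrements(4, 10_000));
    }
}
```

If synchronized were removed from increment(), some increments could be lost and the printed total could come out lower.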
It's all about the "Happens Before Relationship", as termed by the official Java documentation.
In your case of two synchronised getter & setter methods reading & writing the same instance variable respectively, it depends on the sequence of operations, ie, whether the getter or setter was called first.
This relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement.
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
Synchronisation is one of the ways to achieve this consistency. Another one, in your particular case, would be to make the variable volatile.
From the official Java docs:
Using volatile variables reduces the risk of memory consistency
errors, because any write to a volatile variable establishes a
happens-before relationship with subsequent reads of that same
variable. This means that changes to a volatile variable are always
visible to other threads.
Brian Goetz in his article from https://www.ibm.com/developerworks/java/library/j-jtp06197/
uses the example pasted below as a cheap read-write lock. My question is that if the int variable value is not declared volatile then would it make a difference? My understanding is that since the writes to value are done within a synchronized block so latest value will be visible to other threads any way and therefore declaring it volatile is redundant. Please clarify?
@ThreadSafe
public class CheesyCounter {
// Employs the cheap read-write lock trick
// All mutative operations MUST be done with the 'this' lock held
@GuardedBy("this") private volatile int value;
public int getValue() { return value; }
public synchronized int increment() {
return value++;
}
}
My understanding is that since the writes to value are done within a synchronized block so latest value will be visible to other threads any way
This is incorrect. Generally, there is no guarantee that other threads "see" changes to variables as soon as the change is made. A thread may see a stale value for a changed variable because, e.g., the thread reads the value from a register instead of main memory.
A volatile variable establishes "happens-before" semantics. The JLS, section 17.4.5, states:
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
The JLS, Section 8.3.1.4:
A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).
The reason that the field must be volatile is that even though the read is atomic, it needs to ensure that the value is current -- that any value previously written by another thread is visible. The read being atomic is not enough; volatile is still necessary to ensure consistency of the value.
public synchronized int increment()
This synchronized prevents you from skipping an increment if two threads or more were trying to increment at the same time (because ++ is not atomic).
private volatile int value
This prevents you from seeing an outdated value in one thread which was already incremented in another thread. (Note that we could also have made getValue synchronized to achieve this)
Thanks very much for the answers guys. Found this on oracle website as well now: "Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object."
https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html
private double value;
public synchronized void setValue(double value) {
this.value = value;
}
public double getValue() {
return this.value;
}
In the above example is there any point in making the getter synchronized?
I think it's best to cite Java Concurrency in Practice here:
It is a common mistake to assume that synchronization needs to be used only when writing to shared variables; this is simply not true.
For each mutable state variable that may be accessed by more than one
thread, all accesses to that variable must be performed with the same
lock held. In this case, we say that the variable is guarded by that
lock.
In the absence of synchronization, the compiler, processor, and runtime can do some downright weird things to the order in which operations appear to execute. Attempts to reason about the order in which memory actions "must" happen in insufficiently synchronized multithreaded programs will almost certainly be incorrect.
Normally, you don't have to be so careful with primitives, so if this were an int or a boolean it might be that:
When a thread reads a variable without synchronization, it may see a
stale value, but at least it sees a value that was actually placed
there by some thread rather than some random value.
This, however, is not true for 64-bit operations, for instance on long or double if they are not declared volatile:
The Java Memory Model requires fetch and
store operations to be atomic, but for nonvolatile long and double
variables, the JVM is permitted to treat a 64-bit read or write as two
separate 32-bit operations. If the reads and writes occur in different
threads, it is therefore possible to read a nonvolatile long and get
back the high 32 bits of one value and the low 32 bits of another.
Thus, even if you don't care about stale values, it is not safe to use
shared mutable long and double variables in multithreaded programs
unless they are declared volatile or guarded by a lock.
Let me show you by example what is a legal way for a JIT to compile your code. You write:
while (myBean.getValue() > 1.0) {
// perform some action
Thread.sleep(1);
}
JIT compiles:
if (myBean.getValue() > 1.0)
while (true) {
// perform some action
Thread.sleep(1);
}
In just slightly different scenarios even the Java compiler could produce similar bytecode (it would only have to eliminate the possibility of dynamic dispatch to a different getValue). This is a textbook example of hoisting.
Why is this legal? The compiler has the right to assume that the result of myBean.getValue() can never change while executing above code. Without synchronized it is allowed to ignore any actions by other threads.
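For contrast, here is a sketch of the same loop with a volatile field, where the read may not be hoisted out of the loop. HoistingDemo, the nested MyBean and the loopExitsWithin helper are hypothetical names, and the busy-wait replaces the sleep only to keep the example short.

```java
class HoistingDemo {
    // Hypothetical bean; the field is volatile, so each getValue() call re-reads it.
    static class MyBean {
        private volatile double value = 2.0;

        double getValue() {
            return value;
        }

        void setValue(double value) {
            this.value = value;
        }
    }

    // Returns true if the worker loop noticed the update and exited in time.
    static boolean loopExitsWithin(long timeoutMillis) {
        MyBean myBean = new MyBean();
        Thread worker = new Thread(() -> {
            while (myBean.getValue() > 1.0) {
                Thread.onSpinWait(); // busy-wait hint, available since Java 9
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if the loop never ends
        worker.start();
        myBean.setValue(0.5); // volatile write, visible to the worker's next read
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(loopExitsWithin(5_000));
    }
}
```

Without volatile (and without synchronized accessors) the worker would be allowed to spin forever, although whether it actually does depends on the JIT.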
The point of synchronizing the getter is to prevent another thread from updating the value while a thread is reading it, and so to avoid acting on a stale value.
Here the get method acquires the intrinsic lock on "this", so any other thread that tries to update the value through the setter has to wait until the thread performing the get releases that lock.
This is why it is recommended to use the same lock for every operation on a piece of mutable state.
Making the field volatile will work here as there are no compound statements.
It is important to note that synchronized methods use the intrinsic lock of "this". So when get and set are both synchronized, any thread entering either method has to acquire the lock on this.
Special care is needed with non-atomic 64-bit operations. An excerpt from Java Concurrency in Practice helps to understand the situation:
"The Java Memory Model requires fetch and store operations to be atomic, but for non-volatile long and double variables, the JVM is permitted to treat a 64 bit read or write as two separate 32 bit operations. If the reads and writes occur in different threads, it is therefore possible to read a non-volatile long and get back the high 32 bits of one value and the low 32 bits of another. Thus, even if you don't care about stale values, it is not safe to use shared mutable long and double variables in multi-threaded programs unless they are declared volatile or guarded by a lock."
Maybe for someone this code looks awful, but it works very well.
private Double value;
public void setValue(Double value){
    updateValue(value, true);
}
public Double getValue(){
    // The argument is ignored when set is false.
    return updateValue(null, false);
}
private Double updateValue(Double value, boolean set){
    synchronized(MyClass.class){
        if(set)
            this.value = value;
        // Return the field (not the parameter) while the lock is still held.
        return this.value;
    }
}
Sorry this is such a long question.
I've been doing lots of research lately into multi-threading as I slowly implement it into a personal project. However, probably due to an abundance of slightly incorrect examples, the use of synchronized blocks and volatility in certain situations is still a bit unclear to me.
My core question is this: Are changes to references and primitives automatically volatile (that is, performed on the main memory and not a cache) when a thread is inside a synchronized block, or does the read also have to be synchronized for it to work properly?
If so: what is the purpose of synchronizing a simple getter method? (See Example 1.) Also, are ALL changes sent to main memory as long as the thread has synchronized on anything? E.g. if it is sent off to do loads of work all over the place inside a very high-level sync, will every single change then be made to main memory, and nothing ever to cache, until it's unlocked again?
If not: does the change have to be explicitly inside a synchronized block, or can Java actually pick up on, for example, uses of the Lock object? (See Example 3.)
In either case: does the synchronized object need to be related to the reference/primitive being changed in any way (e.g. the immediate object that contains it)? Can I write by syncing on one object and read with another if it's otherwise safe? (See Example 2.)
(please note for the following examples that I know that synchronized methods and synchronized(this) are frowned upon and why, but discussion about that is beyond the scope of my question)
Example 1:
class Counter{
int count = 0;
public synchronized void increment(){
count++;
}
public int getCount(){
return count;
}
}
In this example, increment() needs to be synchronized since ++ is not an atomic operation. As such, two threads incrementing at the same time may result in an overall increase of 1 to the count. The count primitive needs to be atomic (e.g. not long/double/reference), and it is, so that's fine.
Does getCount() need to be synchronized here, and why exactly? The explanation I have heard the most is that I will have no guarantee whether the count returned will be the pre- or post-increment. However, this seems like the explanation for something slightly different that's found itself in the wrong place. I mean, if I were to synchronize getCount(), then I still see no guarantee: it's now down to not knowing the locking order, instead of not knowing whether the actual read happens to be before/after the actual write.
Example 2:
Is the following example thread-safe, if you assume that, through trickery not shown here, none of these methods will ever be called at the same time? Will count increment in the expected way if it's done using a random method each time, and then be read properly, or does the lock have to be the same object? (By the way, I fully realise how ridiculous this example is, but I'm more interested in theory than practice.)
class Counter{
private final Object lock1 = new Object();
private final Object lock2 = new Object();
private final Object lock3 = new Object();
int count = 0;
public void increment1(){
synchronized(lock1){
count++;
}
}
public void increment2(){
synchronized(lock2){
count++;
}
}
public int getCount(){
synchronized(lock3){
return count;
}
}
}
Example 3:
Is the happens-before relationship simply a java concept, or is it an actual thing built into the JVM? Even though I can guarantee a conceptual happens-before relationship for this next example, is java smart enough to pick it up if its a built in thing? I am assuming it is not, but is this example actually threadsafe? If its threadsafe, what about if getCount() did no locking?
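import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class Counter{
    // Lock is an interface, so a concrete implementation is needed.
    private final Lock lock = new ReentrantLock();
    int count = 0;
    public void increment(){
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }
    public int getCount(){
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }
}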
Yes, the read has to be synchronized as well. This page says:
The results of a write by one thread are guaranteed to be visible to a
read by another thread only if the write operation happens-before the
read operation.
[...]
An unlock (synchronized block or method exit) of a monitor
happens-before every subsequent lock (synchronized block or method
entry) of that same monitor
The same page says:
Actions prior to "releasing" synchronizer methods such as Lock.unlock,
Semaphore.release, and CountDownLatch.countDown happen-before actions
subsequent to a successful "acquiring" method such as Lock.lock
So locks offer the same visibility guarantees as synchronized blocks.
Whether you use synchronized blocks or locks, the visibility is only guaranteed if the reader thread uses the same monitor or lock as the writer thread.
Your Example 1 is incorrect: the getter must be synchronized as well if you want to see the latest value of the count.
Your example 2 is incorrect because it uses different locks to guard the same count.
Your example 3 is OK. If the getter did not lock, you could see an older value of the count. The happens-before is something that is guaranteed by the JVM. The JVM has to respect the rules specified, by flushing caches to the main memory for example.
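For what it's worth, Example 2 becomes correct once every access goes through one and the same lock. The following is only a sketch: the class name SingleLockCounter and the runBoth helper are made up for the demo.

```java
class SingleLockCounter {
    // One lock guards every read and write of count.
    private final Object lock = new Object();
    private int count = 0;

    public void increment1() {
        synchronized (lock) {
            count++;
        }
    }

    public void increment2() {
        synchronized (lock) {
            count++;
        }
    }

    public int getCount() {
        synchronized (lock) {
            return count;
        }
    }

    // Hypothetical helper: one thread uses increment1, the other increment2.
    static int runBoth(int perThread) {
        SingleLockCounter counter = new SingleLockCounter();
        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment1();
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment2();
            }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.getCount();
    }

    public static void main(String[] args) {
        System.out.println(runBoth(10_000)); // 2 * perThread, since no update is lost
    }
}
```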
Try to view it in terms of two distinct, simple operations:
Locking (mutual exclusion),
Memory barrier (cache sync, instruction reordering barrier).
Entering a synchronized block entails both locking and a memory barrier; leaving the synchronized block entails unlocking plus a memory barrier; reading/writing a volatile field entails a memory barrier only. Thinking in these terms, I think you can clarify for yourself all the questions above.
As for Example 1, the reading thread will not have any kind of memory barrier. It's not just between seeing the value before/after read, it's about never observing any change to the var after a thread is started.
Example 2 is the most interesting issue you raise. You are indeed given no guarantees by the JLS in this case. In practice you won't be given any ordering guarantees (it's as if the locking aspect wasn't there at all), but you'll still have the benefit of the memory barriers so you will observe changes, unlike the first example. Basically, this is exactly the same as removing synchronized and tagging the int as volatile (apart from the runtime costs of acquiring locks).
Regarding Example 3, by "just a Java thing" I feel you have generics with erasure in mind, something that only the static code checking is aware of. This is not like that -- both locks and memory barriers are pure runtime artifacts. In fact, the compiler can't reason about them at all.
Is a volatile int in Java thread-safe? That is, can it be safely read from and written to without locking?
Yes, you can read from it and write to it safely - but you can't do anything compound such as incrementing it safely, as that's a read/modify/write cycle. There's also the matter of how it interacts with access to other variables.
The precise nature of volatile is frankly confusing (see the memory model section of the JLS for more details) - I would personally generally use AtomicInteger instead, as a simpler way of making sure I get it right.
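A quick sketch of the AtomicInteger route, where the increment is a single atomic operation and no lock is needed. The class name AtomicCounterDemo and the countWithThreads helper are hypothetical; AtomicInteger.incrementAndGet and get are the standard java.util.concurrent.atomic methods.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    // Hypothetical helper: each of `threads` threads increments `perThread` times.
    static int countWithThreads(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000));
    }
}
```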
[...] as in being able to be safely read from and written to without locking?
Yes, a read will always result in the value of the last write, (and both reads and writes are atomic operations).
A volatile read / write introduces a so called happens-before relation in the execution.
From the Java Language Specification Chapter 17: Threads and Locks
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
In other words, when dealing with volatile variables you don't have to explicitly synchronize (introduce a happens-before relation) using synchronized keyword in order to ensure that the thread gets the latest value written to the variable.
As Jon Skeet points out though, the use of volatile variables is limited, and you should in general consider using classes from the java.util.concurrent package instead.
Access to a volatile int in Java is thread-safe. By access I mean a single operation on it, like volatile_var = 10 or int temp = volatile_var (basically a plain write or read). The volatile keyword in Java ensures two things:
When reading, you always get the value in main memory. Generally, for optimization purposes, the JVM uses registers or, in more general terms, local memory for storing and accessing variables. So in a multi-threaded environment each thread may see a different copy of the variable. Making it volatile ensures that a write to the variable is flushed to main memory and that a read of it also comes from main memory, so every thread sees the right copy of the variable.
Access to the volatile is automatically synchronized. So the JVM ensures an ordering of reads and writes to the variable.
However, Jon Skeet rightly mentions that with non-atomic operations (volatile_var = volatile_var + 1) different threads may get unexpected results.
1) If two threads are both reading and writing to a shared variable, then using the volatile keyword for that is not enough. You need to use a synchronized in that case to guarantee that the reading and writing of the variable is atomic. Reading or writing a volatile variable does not block threads reading or writing. For this to happen you must use the synchronized keyword around critical sections.
2) As an alternative to a synchronized block you could also use one of the many atomic data types found in the java.util.concurrent package. For instance, the AtomicLong or AtomicReference or one of the others.
It's thread safe if you have one writer thread and multiple reader threads.
class Foo {
private volatile Helper helper = null;
public Helper getHelper() {
if (helper == null) {
synchronized(this) {
if (helper == null)
helper = new Helper();
}
}
return helper;
}
}
Note: if Helper is immutable, the volatile keyword is not needed, and the singleton will work properly.
A counter that is incremented by multiple threads (a read-modify-write operation) will not give the correct answer. This is an example of a race condition.
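If a single application-wide Helper is what is wanted (an assumption, since the snippet above keeps one per Foo instance), a different technique, the initialization-on-demand holder idiom, avoids both volatile and explicit locking. The names LazyFoo and HelperHolder and the empty Helper class are placeholders for this sketch.

```java
class LazyFoo {
    // Placeholder for the real Helper type.
    static class Helper {
    }

    // The JVM initializes HelperHolder lazily, on first access, and class
    // initialization is thread-safe, so INSTANCE is created exactly once.
    private static class HelperHolder {
        static final Helper INSTANCE = new Helper();
    }

    static Helper getHelper() {
        return HelperHolder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println(getHelper() == getHelper()); // same instance both times
    }
}
```

This only fits static singletons. For a lazily created per-instance field, the double-checked locking version with volatile remains the usual choice.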
public class Counter{
    private volatile int i;
    public int increment(){
        // ++ is a read-modify-write, so it is not atomic even on a volatile field
        return ++i;
    }
}
NOTE : Here volatile will not help.
Not always.
It's not thread safe if multiple threads are writing and reading the variable. It's thread safe if you have one writer thread and multiple reader threads.
If you are looking for thread safety, use the AtomicXXX classes.
A small toolkit of classes that support lock-free thread-safe programming on single variables.
In essence, the classes in this package extend the notion of volatile values, fields, and array elements to those that also provide an atomic conditional update operation of the form:
boolean compareAndSet(expectedValue, updateValue);
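A typical way to use compareAndSet is a retry loop: read the current value, compute the new one, and try to install it only if nobody changed it in the meantime. The sketch below keeps a running maximum; CasMax and updateMax are hypothetical names, while AtomicInteger.get and compareAndSet are the real API.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasMax {
    // Hypothetical helper: raises `max` to `candidate` if candidate is larger.
    static void updateMax(AtomicInteger max, int candidate) {
        while (true) {
            int current = max.get();
            if (candidate <= current) {
                return; // nothing to do, the stored value is already large enough
            }
            if (max.compareAndSet(current, candidate)) {
                return; // our update won
            }
            // Another thread changed the value between get and compareAndSet; retry.
        }
    }

    public static void main(String[] args) {
        AtomicInteger max = new AtomicInteger(0);
        updateMax(max, 5);
        updateMax(max, 3);
        System.out.println(max.get()); // prints 5
    }
}
```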
Refer to @teto's answer in the post below:
Volatile boolean vs AtomicBoolean
If a volatile variable does not depend on any other volatile variable, it is thread-safe for read operations. For writes, volatile does not guarantee thread safety.
Assume you have a volatile variable i whose value depends on another volatile variable j. Thread-1 reads j, increments it, and is about to write it back to main memory from the CPU cache. If Thread-2 reads i before Thread-1 has actually updated j in main memory, the value of i will be based on the old value of j, which is incorrect. This is also called a dirty read.
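One way to avoid this, sketched under the assumption that i must always equal 2 * j (an invariant invented purely for the example), is to guard both variables with a single lock so that readers never see a half-finished update. DependentPair and its methods are hypothetical names.

```java
class DependentPair {
    private final Object lock = new Object();
    private int j = 0;
    private int i = 0; // assumed invariant for this sketch: i == 2 * j

    void incrementJ() {
        synchronized (lock) {
            j++;
            i = 2 * j; // both fields change inside one critical section
        }
    }

    // True if the pair is consistent at the moment of the read.
    boolean isConsistent() {
        synchronized (lock) {
            return i == 2 * j;
        }
    }

    // Hypothetical helper: counts inconsistent reads seen while a writer runs.
    static int countInconsistentReads(int updates) {
        DependentPair pair = new DependentPair();
        Thread writer = new Thread(() -> {
            for (int k = 0; k < updates; k++) {
                pair.incrementJ();
            }
        });
        writer.start();
        int bad = 0;
        while (writer.isAlive()) {
            if (!pair.isConsistent()) {
                bad++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println(countInconsistentReads(100_000)); // prints 0
    }
}
```

Making i and j individually volatile would not give this guarantee, because a reader could run between the two writes.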