How is stale data avoided using the synchronized keyword? - java

In the book "Java Concurrency in Practice", under section 3.1.1, Stale Data, there is this code:
@NotThreadSafe
public class MutableInteger {
    private int value;

    public int get() { return value; }
    public void set(int value) { this.value = value; }
}
which is not thread safe, because:
if one thread calls set, other threads calling get may or may not see
that update.
whereas using the synchronized keyword on both the set and get methods makes it "correct". How?
@ThreadSafe
public class SynchronizedInteger {
    @GuardedBy("this") private int value;

    public synchronized int get() { return value; }
    public synchronized void set(int value) { this.value = value; }
}
Here too, if value is 0 and Thread A has called set(2) while Thread B has called get(), B may get the value 0 and then A will set it to 2... which is what the previous code was already doing. So what benefit do we get from synchronizing the code?
Maybe I am missing something, but please guide me. Thank you.

The issue you fix this way is not that thread B executes the set immediately after A executes a get; that get will still return the "old" (well, technically correct at the time, but soon to be wrong) value.
The issue the synchronization fixes is that even if thread B wrote before thread A read, A could read an old value due to caching (most likely CPU caches, but this depends on the JVM implementation). A non-synchronized read of a non-volatile variable can use a cached value. In other words: the synchronized block creates a read barrier, which means "you have to re-read this value, even if you already have it in your CPU cache".
Note that for this specific case, simply adding volatile to value would have the same effect, but for more complex access patterns synchronized (or its equivalent in the newer APIs, Lock) is necessary.
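To make that concrete, here is a minimal sketch of the volatile alternative mentioned above (the class name VolatileInteger is illustrative, not from the book):
public class VolatileInteger {
    // volatile: a write to value by one thread happens-before every
    // subsequent read of value by any other thread, so get() never
    // returns a stale cached copy.
    private volatile int value;

    public int get() { return value; }
    public void set(int value) { this.value = value; }
}
This only helps for independent reads and writes; a compound action such as value++ would still need synchronized or an atomic class.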

When you use a synchronized method you get exclusive access to the object that is at risk of a race condition. In this case you get exclusive access to value.
This is achieved because a synchronized method acquires the object's intrinsic lock (monitor).
Synchronized in Java
Java Doc - Here you can find a good example.
From Java Doc:
If count is an instance of SynchronizedCounter, then making these methods synchronized has two effects:
First, it is not possible for two invocations of synchronized methods on the same object to interleave. When one thread is executing a synchronized method for an object, all other threads that invoke synchronized methods for the same object block (suspend execution) until the first thread is done with the object.
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads.

It's all about the "Happens Before Relationship", as termed by the official Java documentation.
In your case of two synchronised getter and setter methods reading and writing the same instance variable respectively, it depends on the sequence of operations, i.e., whether the getter or the setter was called first.
This relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement.
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
Synchronisation is one of the ways to achieve this consistency. Another, in your particular case, would be to make the variable volatile.
From the official Java docs:
Using volatile variables reduces the risk of memory consistency errors, because any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads.

Related

How a thread can see stale reference of safely initialized object

I have been trying to figure out how immutable objects which are safely published could still be observed through a stale reference.
public final class Helper {
    private final int n;

    public Helper(int n) {
        this.n = n;
    }
}

class Foo {
    private Helper helper;

    public Helper getHelper() {
        return helper;
    }

    public void setHelper(int num) {
        helper = new Helper(num);
    }
}
So far I understand that Helper is immutable and can be safely published. A reading thread either reads null or a fully initialized Helper object, since the reference is not assigned until the object is fully constructed. The suggested solution is to make the field in Foo volatile, which is what I don't understand.
The fact that you are publishing a reference to an immutable object is irrelevant here.
If you are reading the value of a reference from multiple threads, you need to ensure that the write happens before a read if you care about all threads using the most up-to-date value.
Happens before is a precisely-defined term in the language spec, specifically the part about the Java Memory Model, which allows threads to make optimisations for example by not always updating things in main memory (which is slow), instead holding them in their local cache (which is much faster, but can lead to threads holding different values for the "same" variable). Happens-before is a relation that helps you to reason about how multiple threads interact when using these optimisations.
Unless you actually create a happens-before relationship, there is no guarantee that you will see the most recent value. In the code you have shown, there is no such relationship between writes and reads of helper, so your threads are not guaranteed to see "new" values of helper. They might, but they likely won't.
The easiest way to make sure that the write happens before the read would be to make the helper member variable final: the writes to values of final fields are guaranteed to happen before the end of the constructor, so all threads always see the correct value of the field (provided this wasn't leaked in the constructor).
Making it final isn't an option here, apparently, because you have a setter. So you have to employ some other mechanism.
Taking the code at face value, the simplest option would be to use a (final) AtomicInteger instead of the Helper class: writes to AtomicInteger are guaranteed to happen before subsequent reads. But I guess your actual helper class is probably more complicated.
So, you have to create that happens-before relationship yourself. Three mechanisms for this are (a sketch of one follows the list):
Using AtomicReference<Helper>: this has similar semantics to AtomicInteger, but allows you to store a reference-typed value. (Thanks for pointing this out, @Thilo.)
Making the field volatile: this guarantees visibility of the most recently-written value, because it causes writes to flush to main memory (as opposed to reading from a thread's cache), and reads to read from main memory. It effectively stops the JVM making this particular optimization.
Accessing the field in a synchronized block. The easiest thing to do would be to make the getter and setter methods synchronized. Significantly, you should not synchronize on helper, since this field is being changed.
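As an illustration only, here is a sketch of the volatile option applied to the original Foo; the AtomicReference variant would instead hold a final AtomicReference<Helper> and call its get()/set() methods:
class Foo {
    // volatile: the write in setHelper happens-before any subsequent read
    // in getHelper, so a reader sees either null or the most recently
    // published Helper, never a stale cached reference.
    private volatile Helper helper;

    public Helper getHelper() {
        return helper;
    }

    public void setHelper(int num) {
        helper = new Helper(num);
    }
}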
Quoted from Volatile vs Static in Java:
This means that if two threads update a variable of the same Object concurrently, and the variable is not declared volatile, there could be a case in which one of the thread has in cache an old value.
Given your code, the following can happen:
Thread 1 calls getHelper() and gets null
Thread 2 calls getHelper() and gets null
Thread 1 calls setHelper(42)
Thread 2 calls setHelper(24)
And in this case your trouble starts regarding which Helper object will be used in which thread. The keyword volatile will at least solve the caching problem.
The variable helper is being read by multiple threads simultaneously. At the least, you have to make it volatile, or the compiler may start caching it in registers local to a thread and updates to the variable may not be reflected in main memory. With volatile, when a thread reads the shared variable it fetches a fresh value from main memory instead of reusing a locally cached one, and when it writes the variable the new value is flushed to main memory so that other threads can see the update.

redundant volatile in cheap read-write lock?

Brian Goetz in his article from https://www.ibm.com/developerworks/java/library/j-jtp06197/
uses the example pasted below as a cheap read-write lock. My question is: if the int variable value were not declared volatile, would it make a difference? My understanding is that since writes to value are done within a synchronized block, the latest value will be visible to other threads anyway, and therefore declaring it volatile is redundant. Please clarify.
@ThreadSafe
public class CheesyCounter {
    // Employs the cheap read-write lock trick
    // All mutative operations MUST be done with the 'this' lock held
    @GuardedBy("this") private volatile int value;

    public int getValue() { return value; }

    public synchronized int increment() {
        return value++;
    }
}
My understanding is that since the writes to value are done within a synchronized block so latest value will be visible to other threads any way
This is incorrect. Generally, there is no guarantee that other threads "see" changes to variables as soon as the change is made. A thread may see a stale value for a changed variable because, for example, the thread reads a value that is still sitting in a register or CPU cache instead of main memory.
A volatile variable establishes "happens-before" semantics. The JLS, section 17.4.5, states:
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
The JLS, Section 8.3.1.4:
A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).
The reason that the field must be volatile is that even though the read is atomic, it needs to ensure that the value is current -- that any value previously written by another thread is visible. The read being atomic is not enough; volatile is still necessary to ensure consistency of the value.
public synchronized int increment()
This synchronized prevents you from losing an increment if two or more threads try to increment at the same time (because ++ is not atomic).
private volatile int value
This prevents you from seeing an outdated value in one thread which was already incremented in another thread. (Note that we could also have made getValue synchronized to achieve this)
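For comparison, a sketch of the fully synchronized alternative hinted at above (the class name is illustrative): no volatile at all, because the unlock at the end of increment() happens-before the lock taken in getValue(). The cost is that readers now contend for the lock, which is exactly what the cheap read-write lock trick avoids.
@ThreadSafe
public class PlainSynchronizedCounter {
    // No volatile needed: every access holds the 'this' lock.
    @GuardedBy("this") private int value;

    public synchronized int getValue() {
        return value;
    }

    public synchronized int increment() {
        return value++;
    }
}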
Thanks very much for the answers guys. Found this on oracle website as well now: "Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object."
https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html

volatile synchronized combination for performance

When synchronization is used there is a performance impact. Can volatile be used in combination with synchronized to reduce the performance overhead? For example, an instance of Counter will be shared among many threads and each thread can access Counter's public methods. In the code below, volatile is used for the getter and synchronized is used for the increment (the write).
public class Counter
{
    private volatile int count;

    public Counter()
    {
        count = 0;
    }

    public int getCount()
    {
        return count;
    }

    public synchronized void increment()
    {
        ++count;
    }
}
Please let me know in which scenario this might break.
Yes, you definitely can. In fact, if you look at the source code of AtomicInteger, it's essentially what they do. AtomicInteger.get simply returns value, which is a volatile int (link). The only real difference between what you've done and what they do is that they use a CAS for the increment instead of synchronization. On modern hardware, a CAS can eliminate the need for mutual exclusion; on older hardware, the JVM will put some sort of mutex around the increment.
Volatile reads are about as fast as non-volatile ones, so the reads will be quite fast.
Not only that, but volatile fields are guaranteed not to tear: see JLS 17.7, which specifies that reads and writes of volatile long and double values are always atomic. So your code would work with a long just as well as an int.
As Diego Frehner points out, you might not see the result of an increment if you get the value "right as" the increment happens -- you'll either see the before or the after. Of course, if get were synchronized you'd have exactly the same behavior from the read thread -- you'd either see the before-increment or post-increment value. So it's really the same either way. In other words, it doesn't make sense to say that you won't see the value as it's happening -- unless you meant word tearing, which (a) you won't get and (b) you would never want.
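For reference, a sketch of what the AtomicInteger-based version described above could look like (the class name AtomicCounter is illustrative):
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private final AtomicInteger count = new AtomicInteger();

    // Plain volatile read under the hood.
    public int getCount() {
        return count.get();
    }

    // CAS loop instead of a synchronized block.
    public int increment() {
        return count.incrementAndGet();
    }
}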
1. I have personally used this mechanism of volatile combined with synchronized.
2. You can use synchronized alone and you will always get a consistent result, but using volatile alone will not always yield the same result.
3. This is because the volatile keyword is not a synchronization primitive. It merely prevents caching of the value in a thread; it does not prevent two threads from reading the same value, modifying it, and writing it back concurrently (see the sketch after this list).
4. volatile gives threads concurrent access without a lock, whereas synchronized allows only one thread at a time to access this and all the other synchronized methods of the class.
5. Using both volatile and synchronized therefore does the following:
volatile - makes the changed value visible to other threads and prevents caching,
synchronized - ensures that only one thread at a time executes the synchronized methods of the class.
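A minimal sketch of the lost-update problem described in point 3, using a hypothetical VolatileOnlyCounter with no synchronization at all:
class VolatileOnlyCounter {
    private volatile int count;

    public void increment() {
        // count++ is read-modify-write: two threads can both read the same
        // value, both add 1, and both write back the same result, so one
        // increment is lost despite the volatile.
        count++;
    }

    public int getCount() {
        return count;
    }
}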
You will not always get the most up-to-date count when calling getCount(). An AtomicInteger could be appropriate for you.
There wouldn't be a performance gain from using both. volatile guarantees that reads and writes of the variable are consistently visible across threads executing in parallel, by preventing stale cached values. synchronized, when applied to a method (as in your example), only allows a single thread to enter that method at a time and blocks others until execution is complete.

Java: How exactly do synchronized operations relate to volatility?

Sorry this is such a long question.
I've been doing lots of research lately into multi-threading as I slowly implement it into a personal project. However, probably due to an abundance of slightly incorrect examples, the use of synchronized blocks and volatility in certain situations is still a bit unclear to me.
My core question is this: Are changes to references and primitives automatically volatile (that is, performed on the main memory and not a cache) when a thread is inside a synchronized block, or does the read also have to be synchronized for it to work properly?
If so: what is the purpose of synchronizing a simple getter method? (see Example 1) Also, are ALL changes sent to main memory as long as the thread has synchronized on anything? E.g. if it is sent off to do loads of work all over the place inside a very high-level sync, will every single change then be made to main memory, and nothing ever to cache, until it's unlocked again?
If not: does the change have to be explicitly inside a synchronized block, or can Java actually pick up on, for example, uses of a Lock object? (see Example 3)
Either way: does the synchronized object need to be related to the reference/primitive being changed in any way (e.g. the immediate object that contains it)? Can I write by syncing on one object and read with another if it's otherwise safe? (see Example 2)
(please note for the following examples that I know that synchronized methods and synchronized(this) are frowned upon and why, but discussion about that is beyond the scope of my question)
Example 1:
class Counter {
    int count = 0;

    public synchronized void increment() {
        count++;
    }

    public int getCount() {
        return count;
    }
}
In this example, increment() needs to be synchronized since ++ is not an atomic operation. As such, two threads incrementing at the same time may result in an overall increase of only 1 to the count. Reads and writes of the count primitive itself are atomic (it is not a long or double), so that part is fine.
Does getCount() need to be synchronized here, and why exactly? The explanation I have heard most often is that I will have no guarantee whether the count returned will be the pre- or post-increment value. However, this seems like the explanation for something slightly different that has found itself in the wrong place. I mean, if I were to synchronize getCount(), I still see no such guarantee: it now comes down to not knowing the locking order instead of not knowing whether the actual read happens before or after the actual write.
Example 2:
Is the following example thread safe, if you assume that through trickery not shown here none of these methods will ever be called at the same time? Will count increment in the expected way if it is incremented through a randomly chosen one of these methods each time, and then be read properly, or does the lock have to be the same object? (By the way, I fully realise how ridiculous this example is, but I'm more interested in theory than practice.)
class Counter {
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();
    private final Object lock3 = new Object();
    int count = 0;

    public void increment1() {
        synchronized (lock1) {
            count++;
        }
    }

    public void increment2() {
        synchronized (lock2) {
            count++;
        }
    }

    public int getCount() {
        synchronized (lock3) {
            return count;
        }
    }
}
Example 3:
Is the happens-before relationship simply a Java concept, or is it an actual thing built into the JVM? Even though I can guarantee a conceptual happens-before relationship for this next example, is Java smart enough to pick it up if it is a built-in thing? I am assuming it is not, but is this example actually thread safe? If it is thread safe, what about if getCount() did no locking?
class Counter {
    // Lock is an interface; a ReentrantLock is used here so the example compiles
    private final Lock lock = new ReentrantLock();
    int count = 0;

    public void increment() {
        lock.lock();
        count++;
        lock.unlock();
    }

    public int getCount() {
        lock.lock();
        int count = this.count;
        lock.unlock();
        return count;
    }
}
Yes, the read has to be synchronized as well. This page says:
The results of a write by one thread are guaranteed to be visible to a read by another thread only if the write operation happens-before the read operation.
[...]
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor.
The same page says:
Actions prior to "releasing" synchronizer methods such as Lock.unlock, Semaphore.release, and CountDownLatch.countDown happen-before actions subsequent to a successful "acquiring" method such as Lock.lock.
So locks offer the same visibility guarantees as synchronized blocks.
Whether you use synchronized blocks or locks, the visibility is only guaranteed if the reader thread uses the same monitor or lock as the writer thread.
Your Example 1 is incorrect: the getter must be synchronized as well if you want to see the latest value of the count.
Your example 2 is incorrect because it uses different locks to guard the same count.
Your example 3 is OK. If the getter did not lock, you could see an older value of the count. The happens-before is something that is guaranteed by the JVM. The JVM has to respect the rules specified, by flushing caches to the main memory for example.
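A sketch of Example 1 with the getter synchronized, as suggested above: both methods now lock on this, so the unlock at the end of increment() happens-before the lock taken in getCount().
class Counter {
    int count = 0;

    public synchronized void increment() {
        count++;
    }

    // Synchronized read: the latest count written under the same lock
    // is guaranteed to be visible here.
    public synchronized int getCount() {
        return count;
    }
}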
Try to view it in terms of two distinct, simple operations:
Locking (mutual exclusion),
Memory barrier (cache sync, instruction reordering barrier).
Entering a synchronized block entails both locking and a memory barrier; leaving the synchronized block entails unlocking plus a memory barrier; reading or writing a volatile field entails a memory barrier only. Thinking in these terms, I think you can clarify all the questions above for yourself.
As for Example 1, the reading thread will not have any kind of memory barrier. It's not just a question of seeing the value before or after the read: without a barrier the reader may never observe any change to the variable made after the thread was started.
Example 2. is the most interesting issue you raise. You are indeed given no guarantees by the JLS in this case. In practice you won't be given any ordering guarantees (it's as if the locking aspect wasn't there at all), but you'll still have the benefit of the memory barriers so you will observe changes, unlike the first example. Basically, this is exactly the same as removing synchronized and tagging the int as volatile (apart from the runtime costs of acquiring locks).
Regarding Example 3, by "just a Java thing" I feel you have generics with erasure in mind, something that only the static code checking is aware of. This is not like that -- both locks and memory barriers are pure runtime artifacts. In fact, the compiler can't reason about them at all.

Thread safety in Java class

Why is this Java class not thread safe?
class TestClass {
    private int x;

    int get() {
        return x;
    }

    void set(int x) {
        this.x = x;
    }
}
I read that the synchronized keyword is needed to make it thread safe. After all, aren't the operations done inside it atomic?
Although the assignment itself is an atomic operation, due to different hardware and compiler implementations, different threads may see different values of the member x. I.e., a modification by one thread may be invisible to the other thread, because of some kind of caching. This is usually called a thread visibility problem.
You can synchronize your code properly either by synchronizing on a monitor (using the synchronized keyword or the java.util.concurrent locks), or by declaring x to be volatile.
With multiple processors, some values may be cached by the processor and may not reflect the changes made by other threads/processors for the same objects. Actually, JVM may be implemented to work this way even with a single processor.
Synchronized methods are explicitly required by the language specification to act as a memory barrier and to re-read instance variables from memory.
Because your code is not synchronized, one thread may set the value, but the other thread will return the value still cached by that thread.
Please read the 'Threads and Locks' chapter of the Java Language Specification.
Because the field 'x' is not declared volatile there is no requirement for the JVM to ensure that 'x' is visible to all other threads. I.e. if one thread is constantly reading the value of 'x' and another thread is writing it, it is possible that the reading thread will never "see" the value change.
The synchronized keyword is not required; it will work, since it creates the necessary memory barrier/cache flush to ensure 'x' is visible, but using the volatile keyword in this case would be more efficient.
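A sketch of that volatile fix applied to TestClass; with only a single read and a single write per method, volatile alone is enough here:
class TestClass {
    // volatile: set() happens-before any subsequent get(), so readers
    // never see a stale cached value of x.
    private volatile int x;

    int get() {
        return x;
    }

    void set(int x) {
        this.x = x;
    }
}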
When you have two methods modifying/accessing a non-volatile variable, it is never thread safe. If you want to have just one method, you can try:
synchronized int getAndSet(int x, boolean set) {
    if (set) this.x = x; // param x is only used when set is true
    return this.x;
}
