Java multi thread behavior without synchronization

Java multi thread behavior without synchronization - java

I faced the following question during an interview:
Lets assume a simple class
public class Example{
private int a;
public void update(){
a = some new value;
}
public int getA(){
return a;
}
}
Now there are 2 threads (T1 and T2) which read and update the a value in the following sequence:
T2 (call update() and the value was set to 1)
T1 (call getA())
T2 (call update() and the value was set to 2)
T1 (call getA())
Is it possible for the last call getA() of thread T1 to return the value 1? If yes under what circumstances?

The last call to to T1 could return 0, 1, or 2. It doesn't really make sense to ask "under what circumstances." Under the circumstance of running this code, basically. The code isn't written for concurrency, so there's no guarantee.
In to guarantee that a a write to a variable by one thread is visible to a read of that variable by another thread, there needs to be a synchronization point between the threads. Otherwise, the JVM is allowed to optimize things in such a way that changes are only visible to the thread that makes them. For example, the writing thread's current notion of the value can be cached on the processor and written to main memory later or never. When another thread reads main memory for the value, it finds the initial value (0), a stale update (1), or the latest update (2).
The easiest fix in this case would be to declare a as a volatile variable. You'd still need some mechanism to ensure that T2 writes before T1 reads, but only in a weak, wall-clock sense.

Related

Making a POJO Thread Safe

Here's the class:
#NotThreadSafe
public class MutableInteger {
private int value;
public int get() { return value;}
public void set(int value) { this.value = value;}
}
Here's the post-condition that I have come up with: The value returned by get() is equal to the value value set by set() or 0.
It is easy to see that the above post-condition will not always hold true. Take two threads A and B for instance. Suppose A sets value to 5 and B sets it to 8 after that. Doing a get() in thread A would return 8. It should have returned 5. It is a simple race condition situation.
How can I make this class thread safe? In the book Java: Concurrency in Practice, the authors have guarded value and both the methods on the same object. I fail to understand how this helps at all with the race condition. First of all, set() is not a compound action. So, why do we need to synchronise it? And even after that, the race condition does not disappear. As soon as the lock is released once a thread exits from the set() method, another thread can aquire the lock and set a new value. Doing a get() in the initial thread will return the new value, breaching the post-condition.
(I understand the the author is guarding the get()) for visibility stuff. But I am not sure how it eliminates the race condition.

First of all, set() is not a compound action. So, why do we need to synchronise it?
You're not synchronising the set() in its own right, you're synchronising both the get() and set() methods against the same object (assuming you make both these methods synchronised.)
If you didn't do this, and the value variable isn't marked as volatile, then you can't guarantee that a thread will see the correct value because of per-thread caching. (Thread a could update it to 5, then thread b could still potentially see 8 even after thread a has updated it. That's what's meant by lack of thread safety in this context.)
You're correct that all reference assignments are atomic, so there's no worry about a corrupt reference in this scenario.
And even after that, the race condition does not disappear. As soon as the lock is released once a thread exits from the set() method, another thread can aquire the lock and set a new value.
New threads setting new values (or new code setting new values) isn't an issue at all in terms of thread safety - that's designed, and expected. The issue is if the results are inconsistent, or specifically if multiple threads are able to concurrently view the object in an inconsistent state.

how synchronized keyword works internally

I read the below program and answer in a blog.
int x = 0;
boolean bExit = false;
Thread 1 (not synchronized)
x = 1;
bExit = true;
Thread 2 (not synchronized)
if (bExit == true)
System.out.println("x=" + x);
is it possible for Thread 2 to print “x=0”?
Ans : Yes ( reason : Every thread has their own copy of variables. )
how do you fix it?
Ans: By using make both threads synchronized on a common mutex or make both variable volatile.
My doubt is : If we are making the 2 variable as volatile then the 2 threads will share the variables from the main memory. This make a sense, but in case of synchronization how it will be resolved as both the thread have their own copy of variables.
Please help me.

This is actually more complicated than it seems. There are several arcane things at work.
Caching
Saying "Every thread has their own copy of variables" is not exactly correct. Every thread may have their own copy of variables, and they may or may not flush these variables into the shared memory and/or read them from there, so the whole thing is non-deterministic. Moreover, the very term flushing is really implementation-dependent. There are strict terms such as memory consistency, happens-before order, and synchronization order.
Reordering
This one is even more arcane. This
x = 1;
bExit = true;
does not even guarantee that Thread 1 will first write 1 to x and then true to bExit. In fact, it does not even guarantee that any of these will happen at all. The compiler may optimize away some values if they are not used later. The compiler and CPU are also allowed to reorder instructions any way they want, provided that the outcome is indistinguishable from what would happen if everything was really in program order. That is, indistinguishable for the current thread! Nobody cares about other threads until...
Synchronization comes in
Synchronization does not only mean exclusive access to resources. It is also not just about preventing threads from interfering with each other. It's also about memory barriers. It can be roughly described as each synchronization block having invisible instructions at the entry and exit, the first one saying "read everything from the shared memory to be as up-to-date as possible" and the last one saying "now flush whatever you've been doing there to the shared memory". I say "roughly" because, again, the whole thing is an implementation detail. Memory barriers also restrict reordering: actions may still be reordered, but the results that appear in the shared memory after exiting the synchronized block must be identical to what would happen if everything was indeed in program order.
All that only works, of course, only if both blocks use the same locking object.
The whole thing is described in details in Chapter 17 of the JLS. In particular, what's important is the so-called "happens-before order". If you ever see in the documentation that "this happens-before that", it means that everything the first thread does before "this" will be visible to whoever does "that". This may even not require any locking. Concurrent collections are a good example: one thread puts there something, another one reads that, and that magically guarantees that the second thread will see everything the first thread did before putting that object into the collection, even if those actions had nothing to do with the collection itself!
Volatile variables
One last warning: you better give up on the idea that making variables volatile will solve things. In this case maybe making bExit volatile will suffice, but there are so many troubles that using volatiles can lead to that I'm not even willing to go into that. But one thing is for sure: using synchronized has much stronger effect than using volatile, and that goes for memory effects too. What's worse, volatile semantics changed in some Java version so there may exist some versions that still use the old semantics which was even more obscure and confusing, whereas synchronized always worked well provided you understand what it is and how to use it.
Pretty much the only reason to use volatile is performance because synchronized may cause lock contention and other troubles. Read Java Concurrency in Practice to figure all that out.
Q & A
1) You wrote "now flush whatever you've been doing there to the shared
memory" about synchronized blocks. But we will see only the variables
that we access in the synchronize block or all the changes that the
thread call synchronize made (even on the variables not accessed in the
synchronized block)?
Short answer: it will "flush" all variables that were updated during the synchronized block or before entering the synchronized block. And again, because flushing is an implementation detail, you don't even know whether it will actually flush something or do something entirely different (or doesn't do anything at all because the implementation and the specific situation already somehow guarantee that it will work).
Variables that wasn't accessed inside the synchronized block obviously won't change during the execution of the block. However, if you change some of those variables before entering the synchronized block, for example, then you have a happens-before relationship between those changes and whatever happens in the synchronized block (the first bullet in 17.4.5). If some other thread enters another synchronized block using the same lock object then it synchronizes-with the first thread exiting the synchronized block, which means that you have another happens-before relationship here. So in this case the second thread will see the variables that the first thread updated prior to entering the synchronized block.
If the second thread tries to read those variables without synchronizing on the same lock, then it is not guaranteed to see the updates. But then again, it isn't guaranteed to see the updates made inside the synchronized block as well. But this is because of the lack of the memory-read barrier in the second thread, not because the first one didn't "flush" its variables (memory-write barrier).
2) In this chapter you post (of JLS) it is written that: "A write to a
volatile field (§8.3.1.4) happens-before every subsequent read of that
field." Doesn't this mean that when the variable is volatile you will
see only changes of it (because it is written write happens-before
read, not happens-before every operation between them!). I mean
doesn't this mean that in the example, given in the description of the
problem, we can see bExit = true, but x = 0 in the second thread if
only bExit is volatile? I ask, because I find this question here: http://java67.blogspot.bg/2012/09/top-10-tricky-java-interview-questions-answers.html
and it is written that if bExit is volatile the program is OK. So the
registers will flush only bExits value only or bExits and x values?
By the same reasoning as in Q1, if you do bExit = true after x = 1, then there is an in-thread happens-before relationship because of the program order. Now since volatile writes happen-before volatile reads, it is guaranteed that the second thread will see whatever the first thread updated prior to writing true to bExit. Note that this behavior is only since Java 1.5 or so, so older or buggy implementations may or may not support this. I have seen bits in the standard Oracle implementation that use this feature (java.concurrent collections), so you can at least assume that it works there.
3) Why monitor matters when using synchronized blocks about memory
visibility? I mean when try to exit synchronized block aren't all
variables (which we accessed in this block or all variables in the
thread - this is related to the first question) flushed from registers
to main memory or broadcasted to all CPU caches? Why object of
synchronization matters? I just cannot imagine what are relations and
how they are made (between object of synchronization and memory).
I know that we should use the same monitor to see this changes, but I
don't understand how memory that should be visible is mapped to
objects. Sorry, for the long questions, but these are really
interesting questions for me and it is related to the question (I
would post questions exactly for this primer).
Ha, this one is really interesting. I don't know. Probably it flushes anyway, but Java specification is written with high abstraction in mind, so maybe it allows for some really weird hardware where partial flushes or other kinds of memory barriers are possible. Suppose you have a two-CPU machine with 2 cores on each CPU. Each CPU has some local cache for every core and also a common cache. A really smart VM may want to schedule two threads on one CPU and two threads on another one. Each pair of the threads uses its own monitor, and VM detects that variables modified by these two threads are not used in any other threads, so it only flushes them as far as the CPU-local cache.
See also this question about the same issue.
4) I thought that everything before writing a volatile will be up to
date when we read it (moreover when we use volatile a read that in
Java it is memory barrier), but the documentation don't say this.
It does:
17.4.5.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
If hb(x, y) and hb(y, z), then hb(x, z).
A write to a volatile field (§8.3.1.4) happens-before every subsequent
read of that field.
If x = 1 comes before bExit = true in program order, then we have happens-before between them. If some other thread reads bExit after that, then we have happens-before between write and read. And because of the transitivity, we also have happens-before between x = 1 and read of bExit by the second thread.
5) Also, if we have volatile Person p does we have some dependency
when we use p.age = 20 and print(p.age) or have we memory barrier in
this case(assume age is not volatile) ? - I think - No
You are correct. Since age is not volatile, then there is no memory barrier, and that's one of the trickiest things. Here is a fragment from CopyOnWriteArrayList, for example:
Object[] elements = getArray();
E oldValue = get(elements, index);
if (oldValue != element) {
int len = elements.length;
Object[] newElements = Arrays.copyOf(elements, len);
newElements[index] = element;
setArray(newElements);
} else {
// Not quite a no-op; ensures volatile write semantics
setArray(elements);
Here, getArray and setArray are trivial setter and getter for the array field. But since the code changes elements of the array, it is necessary to write the reference to the array back to where it came from in order for the changes to the elements of the array to become visible. Note that it is done even if the element being replaced is the same element that was there in the first place! It is precisely because some fields of that element may have changed by the calling thread, and it's necessary to propagate these changes to future readers.
6) And is there any happens before 2 subsequent reads of volatile
field? I mean does the second read will see all changes from thread
which reads this field before it(of course we will have changes only
if volatile influence visibility of all changes before it - which I am
a little confused whether it is true or not)?
No, there is no relationship between volatile reads. Of course, if one thread performs a volatile write and then two other thread perform volatile reads, they are guaranteed to see everything at least up to date as it was before the volatile write, but there is no guarantee of whether one thread will see more up-to-date values than the other. Moreover, there is not even strict definition of one volatile read happening before another! It is wrong to think of everything happening on a single global timeline. It is more like parallel universes with independent timelines that sometimes sync their clocks by performing synchronization and exchanging data with memory barriers.

It depends on the implementation which decides if threads will keep a copy of the variables in their own memory. In case of class level variables threads have a shared access and in case of local variables threads will keep a copy of it. I will provide two examples which shows this fact , please have a look at it.
And in your example if I understood it correctly your code should look something like this--
package com.practice.multithreading;
public class LocalStaticVariableInThread {
static int x=0;
static boolean bExit = false;
public static void main(String[] args) {
Thread t1=new Thread(run1);
Thread t2=new Thread(run2);
t1.start();
t2.start();
}
static Runnable run1=()->{
x = 1;
bExit = true;
};
static Runnable run2=()->{
if (bExit == true)
System.out.println("x=" + x);
};
}
Output
x=1
I am getting this output always. It is because the threads share the variable and the when it is changed by one thread other thread can see it. But in real life scenarios we can never say which thread will start first, since here the threads are not doing anything we can see the expected result.
Now take this example--
Here if you make the i variable inside the for-loop` as static variable then threads won t keep a copy of it and you won t see desired outputs, i.e. the count value will not be 2000 every time even if u have synchronized the count increment.
package com.practice.multithreading;
public class RaceCondition2Fixed {
private int count;
int i;
/*making it synchronized forces the thread to acquire an intrinsic lock on the method, and another thread
cannot access it until this lock is released after the method is completed. */
public synchronized void increment() {
count++;
}
public static void main(String[] args) {
RaceCondition2Fixed rc= new RaceCondition2Fixed();
rc.doWork();
}
private void doWork() {
Thread t1 = new Thread(new Runnable() {
#Override
public void run() {
for ( i = 0; i < 1000; i++) {
increment();
}
}
});
Thread t2 = new Thread(new Runnable() {
#Override
public void run() {
for ( i = 0; i < 1000; i++) {
increment();
}
}
});
t1.start();
t2.start();
try {
t1.join();
t2.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
/*if we don t use join then count will be 0. Because when we call t1.start() and t2.start()
the threads will start updating count in the spearate threads, meanwhile the main thread will
print the value as 0. So. we need to wait for the threads to complete. */
System.out.println(Thread.currentThread().getName()+" Count is : "+count);
}
}

How synchronized Block In Java works? Variable reference or memory is blocked?

I have a situation and I need some advice about synchronized block in Java. I have a Class Test below:
Class Test{
private A a;
public void doSomething1(String input){
synchronized (a) {
result = a.process(input);
}
}
public void doSomething2(String input){
synchronized (a) {
result = a.process(input);
}
}
public void doSomething3(String input){
result = a.process(input);
}
}
What I want is when multi threads call methods doSomeThing1() or doSomeThing2(), object "a" will be used and shared among multi threads (it have to be) and it only processes one input at a time (waiting until others thread set object "a" free) and when doSomeThing3 is called, the input is processed immediately.
My question is will the method doSomeThing3() be impacted my method doSomeThing1() and doSomeThing2()? Will it have to wait if doSomeThing1() and doSomeThing2() are using object "a"?

A method is never impacted by anything that your threads do. What gets impacted is data, and the answer to your question depends entirely on what data are updated (if any) inside the a.process() call.
You asked "Variable reference or memory is blocked?"
First of all, "variable" and "memory" are the same thing. Variables, and fields and objects are all higher level abstractions that are built on top of the lower-level idea of "memory".
Second of all, No. Locking an object does not prevent other threads from accessing or modifying the object or, from accessing or modifying anything else.
Locking an object does two things: It prevents other threads from locking the same object at the same time, and it makes certain guarantees about the visibility of memory updates. The simple explanation is, if thread X updates some variables and then releases a lock, thread Y will be guaranteed to see the updates only after it has acquired the same lock.
What that means for your example is, if thread X calls doSomething1() and modifies the object a; and then thread Y later calls doSomething3(), thread Y is not guaranteed to see the the updates. It might see the a object in its original state, it might see it in the fully updated state, or it might see it in some invalid half-way state. The reason why is because, even though thread X locked the object, modified it, and then released the lock; thread Y never locked the same object.

In your code, doSomething3() can proceed in parallel with doSomething1() or doSomething2(), so in that sense it does what you want. However, depending on exactly what a.process() does, this may cause a race condition and corrupt data. Note that even if doSomething3() is called, any calls to doSomething1() or doSomething2() that have started will continue; they won't be put in abeyance while doSomething3() is processed.

compareAndSet memory effects of unsuccessful operations

Java exposes the CAS operation through its atomic classes, e.g.
boolean compareAndSet(expected,update)
The JavaDocs specifies the memory effects of a compareAndSet operation as follows:
compareAndSet and all other read-and-update operations
such as getAndIncrement have the memory effects of both
reading and writing volatile variables.
This definitely holds for successful compareAndSet invocations. But do the memory effects also hold if compareAndSet returns false?
I would say that an unsuccessful compareAndSet corresponds to a volatile read (as the current value of the atomic instance has to be accessed in this case), but I do not see why an CAS should perform special memory barrier instructions in the unsuccessful case.
The question actually is, whether an unsuccessful CAS also establishes a happens-before relationship. Consider the following program:
public class Atomics {
private static AtomicInteger ai = new AtomicInteger(5);
private static int x = 0;
public static void main(String[] args) {
new Thread(() -> {
while (x == 0) {
ai.compareAndSet(0, 0); // returns false
}
}, "T1").start();
new Thread(() -> {
x = 1;
ai.compareAndSet(0, 0); // returns false
}, "T2").start();
}
}
Will thread T2 (and the program) definitely terminate?

The problem with establishing a happens-before relationship using volatile reads and writes is that such a relationship exists only for a write and a subsequent read. If one thread T1 writes to a shared volatile variable and another thread T2 reads from the same variable, there can’t be a happens-before relationship if T2 read the variable before T1 wrote to it. If all that determines whether T1 writes before T2 reads is the thread scheduling, we don’t have any guarantees.
A practical way of dealing with it without additional synchronization is to evaluate the actual value, T2 has read. If this value makes it apparent that T1 has already written the new value, we have a valid happens-before relationship. This is how it works when using a volatile boolean fooIsInitialized flag or a volatile int currentPhase counter. It’s obvious that this can’t work if the value written is the same as the old value or if the new value was never actually written.
The problem with your example program is that it speculates about the thread scheduling. It assumes that T2 eventually performs the cas action and that there will be a subsequent iteration in T1 in which the next cas will create a happens-before relationship. But this is not guaranteed. It might not be intuitively understandable but without synchronization all iterations of T1 could happen before the actions of T2, even if the loop is infinite. It is even a valid thread scheduling behavior, letting T1 running forever consuming 100% CPU time before ever assigning CPU time to T2 as preemptive thread switching between threads of equal priority is not guaranteed.
But even if the underlying system does assign CPU time to T2, which will eventually perform the actions, there is no need for the JVM to make this apparent to T1 as it’s not observable to T1 that T2 has ever ran. It’s unlikely to ever spot this in a real life implementation but the answer still is that there is no guaranty. Things change when there is a chain of actions that will make it observable to T1 that T2 ran (i.e. changed its status) but of course, that chain of actions would make the cas obsolete.

I like Holger's answer, and due to the technical information would accept that, but I will write one answer with the same result but a different perspective. Is it possible that this program can run for ever? Yes, consider the possible compiler re-ordering.
new Thread(() -> {
if(x == 0){
while (true) {
ai.compareAndSet(0, 0); // returns false
}
}
}, "T1").start();
Is this a possible re-order? Yes, it is. There are no rules that says a hoist here cannot occur despite a subsequent volatile store. The only rules here is that the read can not be executed below the volatile store.
Edit: I realize the question is regarding memory effects, not compiler ordering. I'll leave this answer as it may be useful but doesn't really answer the question.

Guarding the initialization of a non-volatile field with a lock?

For educational purposes I'm writing a simple version of AtomicLong, where an internal variable is guarded by ReentrantReadWriteLock.
Here is a simplified example:
public class PlainSimpleAtomicLong {
private long value;
private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
public PlainSimpleAtomicLong(long initialValue) {
this.value = initialValue;
}
public long get() {
long result;
rwLock.readLock().lock();
result = value;
rwLock.readLock().unlock();
return result;
}
// incrementAndGet, decrementAndGet, etc. are guarded by rwLock.writeLock()
}
My question: since "value" is non-volatile, is it possible for other threads to observe incorrect initial value via PlainSimpleAtomicLong.get()?
E.g. thread T1 creates L = new PlainSimpleAtomicLong(42) and shares reference with a thread T2. Is T2 guaranteed to observe L.get() as 42?
If not, would wrapping this.value = initialValue; into a write lock/unlock make a difference?

Chapter 17 reasons about concurrent code in terms of happens before relationships. In your example, if you take two random threads then there is no happens-before relationship between this.value = initialValue; and result = value;.
So if you have something like:
T1.start();
T2.start();
...
T1: L = new PlainSimpleAtomicLong(42);
T2: long value = L.get();
The only happens-before (hb) relationships you have (apart from program order in each thread) is: 1 & 2 hb 3,4,5.
But 4 and 5 are not ordered. If however T1 called L.get() before T2 called L.get() (from a wall clock perspective) then you would have a hb relationship between unlock() in T1 and lock() in T2.
As already commented, I don't think your proposed code could break on any combination of JVM/hardware but it could break on a theoretical implementation of the JMM.
As for your suggestion to wrap the constructor in a lock/unlock, I don't think it would be enough because, in theory at least, T1 could release a valid reference (non null) to L before running the body of the constructor. So the risk would be that T2 could acquire the lock before T1 has acquired it in the constructor. There again, this is an interleaving that is probably impossible on current JVMs/hardware.
So to conclude, if you want theoretical thread safety, I don't think you can do without a volatile long value, which is how AtomicLong is implemented. volatile would guarantee that the field is initialised before the object is published. Note finally that the issues I mention here are not due to your object being unsafe (see #BrettOkken answer) but are based on a scenario where the object is not safely published across threads.

Assuming that you do not allow a reference to the instance to escape your constructor (your example looks fine), then a second thread can never see the object with any value of "value" other than what it was constructed with because all accesses are protected by a monitor (the read write lock) which was final in the constructor.
https://www.ibm.com/developerworks/library/j-jtp0618/
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Lock.html

I think that for initial values , than both threads would see the same values (since they can have the object only after the constructor is finished).
But
If you change the value in 1 thread , then other thread may not see the same value if you don't use volatile
If you want to implement set, wrapping set with lock/unlock will not solve the problem - this is good when need atomic operation (like increment).
I
It doesn't mean that it would work the way you want since you don't control the context switch. For example if 2 threads call set, with values 4 & 8 , since you don't know when the context switch occurs , you don't know who will gain the lock first.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.