Guarding the initialization of a non-volatile field with a lock?


For educational purposes I'm writing a simple version of AtomicLong, where an internal variable is guarded by ReentrantReadWriteLock.
Here is a simplified example:
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class PlainSimpleAtomicLong {
    private long value;
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    public PlainSimpleAtomicLong(long initialValue) {
        this.value = initialValue;
    }

    public long get() {
        long result;
        rwLock.readLock().lock();
        try {
            result = value;
        } finally {
            // Release the read lock even if the read fails.
            rwLock.readLock().unlock();
        }
        return result;
    }
    // incrementAndGet, decrementAndGet, etc. are guarded by rwLock.writeLock()
}
My question: since "value" is non-volatile, is it possible for other threads to observe incorrect initial value via PlainSimpleAtomicLong.get()?
E.g. thread T1 creates L = new PlainSimpleAtomicLong(42) and shares reference with a thread T2. Is T2 guaranteed to observe L.get() as 42?
If not, would wrapping this.value = initialValue; into a write lock/unlock make a difference?

Chapter 17 of the Java Language Specification reasons about concurrent code in terms of happens-before relationships. In your example, if you take two arbitrary threads, there is no happens-before relationship between this.value = initialValue; and result = value;.
So if you have something like:
T1.start();
T2.start();
...
T1: L = new PlainSimpleAtomicLong(42);
T2: long value = L.get();
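The only happens-before (hb) relationships you have (apart from program order in each thread) are the ones created by start(): each start() call happens-before every action in the thread it starts.
But the constructor call in T1 and the L.get() call in T2 are not ordered with respect to each other. If, however, T1 called L.get() before T2 called L.get() (from a wall-clock perspective), then you would have a hb relationship between unlock() in T1 and lock() in T2.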
As already commented, I don't think your proposed code could break on any combination of JVM/hardware but it could break on a theoretical implementation of the JMM.
As for your suggestion to wrap the constructor in a lock/unlock, I don't think it would be enough because, in theory at least, T1 could release a valid reference (non null) to L before running the body of the constructor. So the risk would be that T2 could acquire the lock before T1 has acquired it in the constructor. There again, this is an interleaving that is probably impossible on current JVMs/hardware.
So to conclude, if you want theoretical thread safety, I don't think you can do without a volatile long value, which is how AtomicLong is implemented. volatile would guarantee that the field is initialised before the object is published. Note finally that the issues I mention here are not due to your object being unsafe (see @BrettOkken's answer) but are based on a scenario where the object is not safely published across threads.
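A minimal sketch of the safe-publication point, under stated assumptions: the names LockedLong, SafePublicationSketch and publishAndRead are made up for illustration, and the reference is published through a volatile static field, which is only one possible way to publish safely (it is not the approach suggested above, which makes the value field itself volatile). A write to a volatile reference happens-before any read that observes it, so the reader should see the value written by the constructor.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for PlainSimpleAtomicLong; the field stays non-volatile.
final class LockedLong {
    private long value;
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    LockedLong(long initialValue) {
        this.value = initialValue;
    }

    long get() {
        rwLock.readLock().lock();
        try {
            return value;
        } finally {
            rwLock.readLock().unlock();
        }
    }
}

class SafePublicationSketch {
    // Assumption: the reference is shared through a volatile field.
    private static volatile LockedLong shared;

    static long publishAndRead() {
        shared = null;
        long[] seen = new long[1];
        Thread reader = new Thread(() -> {
            LockedLong l;
            // Spin until the writer publishes the reference.
            while ((l = shared) == null) {
                Thread.yield();
            }
            seen[0] = l.get();
        });
        reader.start();
        shared = new LockedLong(42); // volatile write publishes the constructed object
        try {
            reader.join();           // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If the reference were instead handed over through a plain (non-volatile, non-final) field with no other synchronization, the theoretical concern described above would apply again.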

Assuming that you do not allow a reference to the instance to escape your constructor (your example looks fine), then a second thread can never see the object with any value of "value" other than what it was constructed with because all accesses are protected by a monitor (the read write lock) which was final in the constructor.
https://www.ibm.com/developerworks/library/j-jtp0618/
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Lock.html

I think that for the initial value, both threads would see the same value (since they can obtain the object only after the constructor has finished).
But if you change the value in one thread, the other thread may not see the new value if you don't use volatile.
If you want to implement set, wrapping set with lock/unlock will not solve the problem; locking is useful when you need an atomic operation (like increment).
It also doesn't mean that it would work the way you want, since you don't control the context switch. For example, if 2 threads call set with the values 4 and 8, you don't know when the context switch occurs, so you don't know which one will gain the lock first.

Related

Should locks in multi threading always remain immutable?

I am trying the code snippet below, which gives the expected output. But I am curious to know how many lock objects I am creating: 1 or 2?
package com.practice;

class A {
    String lock = "";

    A(String lock) {
        this.lock = lock;
    }

    A() { }

    public void printMe(int x) {
        int counter = x;
        synchronized (lock) {
            System.out.println(++counter);
        }
    }
}

public class MultiThreadingPractice {
    public static void main(String[] args) {
        A a = new A("1");
        Thread t1 = new Thread(() -> {
            a.printMe(1);
        });
        a.lock = new String();
        Thread t2 = new Thread(() -> {
            a.printMe(1);
        });
        t1.start();
        t2.start();
    }
}

Should locks in multi threading always remain immutable? Yes.
You're using a String (any Object would do) as a lock. When you assign a new value (new String) to the lock, that means you have more than one lock instance around. It's ok as long as all threads are synchronizing on the same lock instance, but there is nothing overt in your class A code to ensure that is the case.
In your actual use in the example, you're safe. Since no thread gets started until after you've finished setting the lock to the third and last instance, nothing will try to sync on the lock until it's stable. (3 instances: the first is the initialization to the empty String; the second is the supplied constructor argument "1"; and the third is the explicit assignment of a different empty String.) So, though this code "works", it only works by what I refer to as "coincidence", i.e., it's not thread-safe by design.
But let's assume a case where you start each thread immediately after you construct it. This means you'd be reassigning the lock member after t1 was running but before t2 was even created.
After a while both threads will be synchronizing on the new lock instance, but for a period around the point at which you switched the lock, thread t1 could be and probably is in the synchronized(lock) { ... } clause using the old lock instance. And around that time, thread t2 could execute and attempt to synchronize on the new lock instance.
In short, you've created a timing window (race hazard) in the mechanism that you intend to use to eliminate timing windows.
You could arrange a further level of synchronization that allows you to replace the lock, but I am unable to imagine any straightforward situation where that would be necessary, useful, or sensible. Much better to allocate one lock before any contention can occur and then stick to that one lock.
P.S. "How many locks am I creating?" 3. Though the first two are never used.
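A minimal sketch of the "allocate one lock and stick to it" advice, assuming a hypothetical class FixedLockCounter that is not part of the question's code. The lock field is final, so every thread synchronizes on the same instance for the lifetime of the object.

```java
class FixedLockCounter {
    // One lock, allocated once and never reassigned.
    private final Object lock = new Object();
    private int counter;

    void increment() {
        synchronized (lock) {
            counter++;
        }
    }

    int get() {
        synchronized (lock) {
            return counter;
        }
    }

    // Runs two threads that each increment 10_000 times and returns the total.
    static int runTwoThreads() {
        FixedLockCounter c = new FixedLockCounter();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                c.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c.get();
    }
}
```

Because the lock reference cannot change, there is no window in which two threads hold different lock instances while touching the same counter.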

curious to know how many lock objects I am creating 1 or 2?
This line in your program creates a single String instance when the program is loaded, and every A instance starts out with a reference to that same String.
String lock="";
This line creates a second String instance, because new String(...) always creates a new object, regardless of whether or not some other String with the same value has been interned.
a.lock = new String();

Should locks in multi threading always remain immutable? No.
The typical way to synchronize is to use synchronized methods, not separate locks. This approach is less error-prone and therefore preferred. It is equivalent to using the current object as the lock:
synchronized(this) {
...
}
and since this object is not usually immutable, we can conclude that using mutable objects as locks is a common practice.
What you are doing in your code, when you change the reference to the lock object (yes, you are changing the reference and not the lock object itself), is a very bad approach. It provokes programming errors.

Making a POJO Thread Safe

Here's the class:
@NotThreadSafe
public class MutableInteger {
private int value;
public int get() { return value;}
public void set(int value) { this.value = value;}
}
Here's the post-condition that I have come up with: The value returned by get() is equal to the value set by set(), or 0.
It is easy to see that the above post-condition will not always hold true. Take two threads A and B for instance. Suppose A sets value to 5 and B sets it to 8 after that. Doing a get() in thread A would return 8. It should have returned 5. It is a simple race condition situation.
How can I make this class thread safe? In the book Java Concurrency in Practice, the authors have guarded value and both the methods on the same object. I fail to understand how this helps at all with the race condition. First of all, set() is not a compound action. So, why do we need to synchronise it? And even after that, the race condition does not disappear. As soon as the lock is released once a thread exits from the set() method, another thread can acquire the lock and set a new value. Doing a get() in the initial thread will return the new value, breaching the post-condition.
(I understand that the authors are guarding get() for visibility reasons. But I am not sure how that eliminates the race condition.)

First of all, set() is not a compound action. So, why do we need to synchronise it?
You're not synchronising the set() in its own right, you're synchronising both the get() and set() methods against the same object (assuming you make both these methods synchronised.)
If you didn't do this, and the value variable isn't marked as volatile, then you can't guarantee that a thread will see the correct value because of per-thread caching. (Thread a could update it to 5, then thread b could still potentially see 8 even after thread a has updated it. That's what's meant by lack of thread safety in this context.)
You're correct that a write to an int is atomic (as are all reference assignments), so there's no worry about a corrupt value in this scenario.
And even after that, the race condition does not disappear. As soon as the lock is released once a thread exits from the set() method, another thread can acquire the lock and set a new value.
New threads setting new values (or new code setting new values) isn't an issue at all in terms of thread safety - that's designed, and expected. The issue is if the results are inconsistent, or specifically if multiple threads are able to concurrently view the object in an inconsistent state.

What happens if a volatile variable is written from 2 threads?

Consider the snippet from Java Concurrency in Practice-
@ThreadSafe
public class SynchronizedInteger {
    @GuardedBy("this") private int value;

    public synchronized int getValue() {
        return value;
    }

    public synchronized void setValue(int value) {
        this.value = value;
    }
}
An extract from the same book-
A good way to think about volatile variables is to imagine that they
behave roughly like the SynchronizedInteger class in Listing above,
replacing reads and writes of the volatile variable with calls to get
and set. Yet accessing a volatile variable performs no locking and so
cannot cause the executing thread to block, making volatile variables
a lighter-weight synchronization mechanism than synchronized.
A special case of thread confinement applies to volatile variables. It is safe to perform read-modify-write operations on shared volatile variables as long as you ensure that the volatile variable is only written from a single thread.
So, if you make the instance variable in the above class as volatile and then remove the synchronized keyword, after that suppose there are 3 threads
Thread A & Thread B are writing to the same volatile variable.
Thread C reads the volatile variable.
Since the volatile variable is now written from 2 threads, why is it unsafe to perform read-modify-write operations on this shared volatile variable?

The volatile keyword ensures that a write to the variable by one thread will be seen by other threads.
It does not ensure that a non-atomic operation on the variable completes without another thread interfering before it finishes.
To enforce that, you need the synchronized keyword.

It's because read-modify-write operations on volatile variables are not atomic. v++ is actually something like:
r1 = v;
r2 = r1 + 1;
v = r2;
So if you have two threads performing this operation once each, it could possibly result in the variable being incremented only once, as they both read the old value. That's an example of why it's not safe.
In your example it would be not safe if you removed synchronized, made the field volatile and had two threads calling setValue after some conditional logic based on the return of getValue - the value could have been modified by the other thread.
If you want atomic operations look at the java.util.concurrent.atomic package.
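As a hedged illustration of the java.util.concurrent.atomic suggestion, here is a small sketch that replaces the three-step v++ with AtomicInteger.incrementAndGet(). The class name AtomicIncrementSketch and the method incrementFromTwoThreads are hypothetical, chosen only for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicIncrementSketch {
    // Two threads each increment the shared counter perThread times.
    static int incrementFromTwoThreads(int perThread) {
        AtomicInteger v = new AtomicInteger();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                v.incrementAndGet(); // read, add 1, and write as one atomic step
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return v.get();
    }
}
```

With a plain volatile int and v++ in place of incrementAndGet(), the same loop could lose updates, because both threads may read the same old value before either writes.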

If you write a volatile variable from multiple threads without using any synchronization constructs, read-modify-write updates can be lost and you risk data inconsistency.
Use volatile variables without synchronization when there is a single writer thread and multiple reader threads.
volatile makes sure that reads observe the most recent write to the variable (conceptually, the value is fetched from main memory instead of a per-thread cache). It's safe to use in the case of a single writer and multiple readers.
Use Atomic variables or synchronization or Lock API to update and read variables from multiple threads.
Refer to related SE question:
What is meant by "thread-safe" code?

If two threads are writing without reading the variable first, there is no problem; it is safe. The problem arises if a thread first reads, then modifies, and then writes. If a second thread reads at the same time, it sees the same old value as the first thread and modifies that; when it writes, it simply overwrites the first thread's update. BOOM.
val i = 1
-> Thread 1 reads 1 -> Thread 2 reads 1 -> Thread 1 computes 1 * 1.2 = 1.2 -> Thread 2 computes 1 * 1.3 = 1.3 -> Thread 1 writes 1.2 back -> Thread 2 coolly overwrites it with 1.3 instead of computing 1.2 * 1.3

Java multi thread behavior without synchronization

I faced the following question during an interview:
Lets assume a simple class
public class Example{
private int a;
public void update(){
a = some new value;
}
public int getA(){
return a;
}
}
Now there are 2 threads (T1 and T2) which read and update the a value in the following sequence:
T2 (call update() and the value was set to 1)
T1 (call getA())
T2 (call update() and the value was set to 2)
T1 (call getA())
Is it possible for the last call getA() of thread T1 to return the value 1? If yes under what circumstances?

The last call to getA() by T1 could return 0, 1, or 2. It doesn't really make sense to ask "under what circumstances." Under the circumstance of running this code, basically. The code isn't written for concurrency, so there's no guarantee.
In order to guarantee that a write to a variable by one thread is visible to a read of that variable by another thread, there needs to be a synchronization point between the threads. Otherwise, the JVM is allowed to optimize things in such a way that changes are only visible to the thread that makes them. For example, the writing thread's current notion of the value can be cached on the processor and written to main memory later or never. When another thread reads main memory for the value, it finds the initial value (0), a stale update (1), or the latest update (2).
The easiest fix in this case would be to declare a as a volatile variable. You'd still need some mechanism to ensure that T2 writes before T1 reads, but only in a weak, wall-clock sense.
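A minimal sketch of the volatile fix, under stated assumptions: the class name VolatileExample and the method writeThenRead are made up, and update takes the new value as a parameter, which the interview snippet does not show. The reading thread polls until it observes the latest write, which stands in for the "wall-clock" ordering mentioned above.

```java
class VolatileExample {
    // volatile: a read sees the most recent write made by any thread.
    private volatile int a;

    void update(int newValue) {
        a = newValue;
    }

    int getA() {
        return a;
    }

    // T2 writes 1 and then 2; the calling thread (playing T1) reads.
    static int writeThenRead() {
        VolatileExample e = new VolatileExample();
        Thread t2 = new Thread(() -> {
            e.update(1);
            e.update(2);
        });
        t2.start();
        int seen;
        // Poll until the latest write is observed; this loop relies on
        // the field being volatile to be sure it eventually exits.
        while ((seen = e.getA()) != 2) {
            Thread.yield();
        }
        return seen;
    }
}
```

Once the reader has seen 2, a later getA() cannot go back to 1 or 0 unless another write happens in between.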

What is the memory visibility of variables accessed in static singletons in Java?

I've seen this type of code a lot in projects, where the application wants a global data holder, so they use a static singleton that any thread can access.
public class GlobalData {
    // Data-related code. This could be anything; I've used a simple String.
    //
    private String someData;
    public String getData() { return someData; }
    public void setData(String data) { someData = data; }

    // Singleton code
    //
    private static GlobalData INSTANCE;
    private GlobalData() {}

    // Must be static so that it can be called as GlobalData.getInstance().
    public static synchronized GlobalData getInstance() {
        if (INSTANCE == null) INSTANCE = new GlobalData();
        return INSTANCE;
    }
}
I hope it's easy to see what's going on. One can call GlobalData.getInstance().getData() at any time on any thread. If two threads call setData() with different values, even if you can't guarantee which one "wins", I'm not worried about that.
But thread-safety isn't my concern here. What I'm worried about is memory visibility. Whenever there's a memory barrier in Java, the cached memory is synched between the corresponding threads. A memory barrier happens when passing through synchronizations, accessing volatile variables, etc.
Imagine the following scenario happening in chronological order:
// Thread 1
GlobalData d = GlobalData.getInstance();
d.setData("one");
// Thread 2
GlobalData d = GlobalData.getInstance();
d.setData("two");
// Thread 1
String value = d.getData();
Isn't it possible that the last value of value in thread 1 can still be "one"? The reason being, thread 2 never called any synchronized methods after calling d.setData("two") so there was never a memory barrier? Note that the memory-barrier in this case happens every time getInstance() is called because it's synchronized.

You are absolutely correct.
There is no guarantee that writes in one thread will be visible in another.
To provide this guarantee you would need to use the volatile keyword:
private volatile String someData;
Incidentally, you can leverage the Java classloader to provide thread-safe lazy initialization of your singleton. This avoids the synchronized keyword and therefore saves you some locking.
It is worth noting that the current accepted best practice is to use an enum for storing singleton data in Java.
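Both suggestions can be sketched as follows. This is an illustration under assumptions, not the code the answer originally linked to: GlobalDataHolder uses a nested holder class (commonly called the initialization-on-demand holder idiom) and GlobalDataEnum is the enum variant. Both class names are hypothetical, and someData is declared volatile in each, as recommended above.

```java
// Lazy, thread-safe singleton via a nested holder class.
class GlobalDataHolder {
    private volatile String someData;

    private GlobalDataHolder() {}

    String getData() { return someData; }
    void setData(String data) { someData = data; }

    // Holder is initialized on the first call to getInstance();
    // the JVM performs class initialization in a thread-safe way.
    private static final class Holder {
        static final GlobalDataHolder INSTANCE = new GlobalDataHolder();
    }

    static GlobalDataHolder getInstance() {
        return Holder.INSTANCE;
    }
}

// Enum-based singleton: the single constant is the instance.
enum GlobalDataEnum {
    INSTANCE;

    private volatile String someData;

    String getData() { return someData; }
    void setData(String data) { someData = data; }
}
```

Usage would look like GlobalDataHolder.getInstance().setData("one") or GlobalDataEnum.INSTANCE.setData("one"); neither needs a synchronized accessor.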

Correct, it is possible that thread 1 still sees the value as being "one" since no memory synchronization event occurred and there is no happens before relationship between thread 1 and thread 2 (see section 17.4.5 of the JLS).
If someData was volatile then thread 1 would see the value as "two" (assuming thread 2 completed before thread 1 fetched the value).
Lastly, and off topic, the implementation of the singleton is slightly less than ideal since it synchronizes on every access. It is generally better to use an enum to implement a singleton or at the very least, assign the instance in a static initializer so no call to the constructor is required in the getInstance method.

Isn't it possible that the last value of value in thread 1 can still be "one"?
Yes it is. The java memory model is based on happens before (hb) relationships. In your case, you only have getInstance exit happens-before subsequent getInstance entry, due to the synchronized keyword.
So if we take your example (assuming the thread interleaving is in that order):
// Thread 1
GlobalData d = GlobalData.getInstance(); //S1
d.setData("one");
// Thread 2
GlobalData d = GlobalData.getInstance(); //S2
d.setData("two");
// Thread 1
String value = d.getData();
You have S1 hb S2. If you called d.getData() from Thread2 after S2, you would see "one". But the last read of d is not guaranteed to see "two".
