Making a POJO Thread Safe - java

Here's the class:
@NotThreadSafe
public class MutableInteger {
    private int value;
    public int get() { return value; }
    public void set(int value) { this.value = value; }
}
Here's the post-condition that I have come up with: The value returned by get() is equal to the value set by set(), or 0.
It is easy to see that the above post-condition will not always hold true. Take two threads A and B for instance. Suppose A sets value to 5 and B sets it to 8 after that. Doing a get() in thread A would return 8. It should have returned 5. It is a simple race condition situation.
How can I make this class thread safe? In the book Java Concurrency in Practice, the authors have guarded value and both the methods on the same object. I fail to understand how this helps at all with the race condition. First of all, set() is not a compound action. So, why do we need to synchronise it? And even after that, the race condition does not disappear. As soon as the lock is released once a thread exits from the set() method, another thread can acquire the lock and set a new value. Doing a get() in the initial thread will return the new value, breaching the post-condition.
(I understand that the authors guard get() for visibility reasons, but I am not sure how that eliminates the race condition.)

First of all, set() is not a compound action. So, why do we need to synchronise it?
You're not synchronising the set() in its own right, you're synchronising both the get() and set() methods against the same object (assuming you make both these methods synchronised.)
If you didn't do this, and the value variable isn't marked as volatile, then you can't guarantee that a thread will see the correct value because of per-thread caching. (Thread a could update it to 5, then thread b could still potentially see 8 even after thread a has updated it. That's what's meant by lack of thread safety in this context.)
You're correct that the assignment itself is atomic (writes to an int, like reference assignments, are atomic in Java), so there's no worry about a corrupt, half-written value in this scenario.
And even after that, the race condition does not disappear. As soon as the lock is released once a thread exits from the set() method, another thread can acquire the lock and set a new value.
New threads setting new values (or new code setting new values) isn't an issue at all in terms of thread safety - that's designed, and expected. The issue is if the results are inconsistent, or specifically if multiple threads are able to concurrently view the object in an inconsistent state.
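The point about later writes being expected can be sketched roughly as follows. GuardedInteger and LastWriterWinsDemo are hypothetical names used only for illustration; the class simply mirrors the synchronized get()/set() pair discussed above. Two threads write 5 and 8, and a reader that runs after both have finished sees one of those two values, whichever write happened last.

```java
// Hypothetical sketch: both accessors synchronize on the same monitor ("this").
class GuardedInteger {
    private int value; // guarded by "this"

    public synchronized int get() { return value; }
    public synchronized void set(int value) { this.value = value; }
}

class LastWriterWinsDemo {
    // Runs two writers and returns what a reader sees once both have finished.
    static int run() {
        GuardedInteger shared = new GuardedInteger();
        Thread a = new Thread(() -> shared.set(5));
        Thread b = new Thread(() -> shared.set(8));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Either order of the two writes is legal, so the result is 5 or 8.
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println("final value: " + run());
    }
}
```

Which of the two values wins depends on scheduling. That is ordinary last-writer-wins behaviour rather than a thread-safety defect: the object is never observed in a state that no thread wrote.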

Related

How stale data is avoided using synchronized keyword?

In the book "Java Concurrency in Practice", in section 3.1.1, "Stale data", there is this code:
@NotThreadSafe
public class MutableInteger {
    private int value;
    public int get() { return value; }
    public void set(int value) { this.value = value; }
}
which is not thread safe, because:
if one thread calls set, other threads calling get may or may not see
that update.
whereas using the synchronized keyword on both set and get makes it "correct". How?
@ThreadSafe
public class SynchronizedInteger {
    @GuardedBy("this") private int value;
    public synchronized int get() { return value; }
    public synchronized void set(int value) { this.value = value; }
}
Here too, if value is 0 and Thread A has called set(2) while Thread B has called get(), B may get the value 0 and then A will set it to 2, which is what the previous code was already doing. So what benefit did we get from synchronizing the code?
Maybe I am missing something, but please guide me. Thank you.
The issue you fix this way is not the case where thread B executes the set immediately after thread A executes a get; that get will still return the "old" value (technically correct at the time, but soon to be outdated).
The issue the synchronization fixes is that even if thread B wrote before thread A read, A could read an old value due to caching (most likely CPU caches, but this depends on the JVM implementation). A non-synchronized read from a non-volatile variable can use a cached value. In other words: the synchronized creates a read-barrier, which means "you have to re-read this value, even if you already have it in your CPU cache".
Note that for this specific case, simply adding volatile to value would have the same effect, but for more complex access patterns synchronized (or its equivalent in the newer APIs, Lock) is necessary.
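A minimal sketch of the volatile alternative mentioned above, assuming the class only ever needs plain reads and writes. VolatileInteger and unsafeIncrement are hypothetical names, not taken from the book.

```java
// Hypothetical variant: volatile gives the visibility guarantee without a lock.
class VolatileInteger {
    private volatile int value;

    // A volatile read sees the most recent volatile write to the field.
    public int get() { return value; }

    // A single volatile write; no lock is needed for this simple access pattern.
    public void set(int value) { this.value = value; }

    // NOT atomic: read, add and write are three separate steps even on a
    // volatile field, so concurrent callers can lose updates.
    public void unsafeIncrement() { value++; }
}
```

The unsafeIncrement method marks where this approach stops being enough: a read-modify-write is one of the "more complex access patterns" for which a lock (or an atomic class) is the safer choice.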
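When you use a synchronized method you get exclusive access, with respect to other synchronized methods on the same object, to the state that is at risk of a race condition. In this case you get exclusive access to value.
This works because a synchronized method acquires the object's intrinsic lock (its monitor), which behaves much like a binary semaphore.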
Synchronized in Java
Java Doc - Here you can find a good example.
From Java Doc:
If count is an instance of SynchronizedCounter, then making these methods synchronized has two effects:
First,
it is not possible for two invocations of synchronized methods on the
same object to interleave.
When one thread is executing a synchronized method for an object, all
other threads that invoke synchronized methods for the same object
block (suspend execution) until the first thread is done with the
object.
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads.
It's all about the "Happens Before Relationship", as termed by the official Java documentation.
In your case of two synchronised getter & setter methods reading & writing the same instance variable respectively, it depends on the sequence of operations, ie, whether the getter or setter was called first.
This relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement.
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
Synchronisation is one of the ways to achieve this consistency. Another, in your particular case, would be to declare the variable volatile.
From the official Java docs:
Using volatile variables reduces the risk of memory consistency
errors, because any write to a volatile variable establishes a
happens-before relationship with subsequent reads of that same
variable. This means that changes to a volatile variable are always
visible to other threads.
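The happens-before guarantee quoted above can be illustrated with a small sketch. HappensBeforeDemo and its members are hypothetical names; the pattern is a plain field published through a volatile flag, so the write to data happens-before any read that follows a successful read of ready.

```java
class HappensBeforeDemo {
    private int data;               // plain field, no synchronization of its own
    private volatile boolean ready; // volatile flag used to publish "data"

    void writer() {
        data = 42;    // 1. ordinary write
        ready = true; // 2. volatile write: publishes everything written before it
    }

    int reader() {
        while (!ready) {    // 3. volatile read: loops until the flag is seen
            Thread.yield();
        }
        return data;        // 4. ordered after step 1, so this reads 42
    }

    // Starts a writer thread and returns what the calling thread reads.
    static int run() {
        HappensBeforeDemo demo = new HappensBeforeDemo();
        Thread t = new Thread(demo::writer);
        t.start();
        int seen = demo.reader();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

If ready were not volatile, the loop in reader() would have no guarantee of ever observing the update, and even if it did, the read of data could be stale.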

How synchronized Block In Java works? Variable reference or memory is blocked?

I have a situation and I need some advice about synchronized block in Java. I have a Class Test below:
class Test {
    private A a;

    public void doSomething1(String input) {
        synchronized (a) {
            result = a.process(input);
        }
    }

    public void doSomething2(String input) {
        synchronized (a) {
            result = a.process(input);
        }
    }

    public void doSomething3(String input) {
        result = a.process(input);
    }
}
What I want is this: when multiple threads call doSomething1() or doSomething2(), object "a" is shared among them (it has to be) and processes only one input at a time (each thread waits until the others have released object "a"), and when doSomething3() is called, the input is processed immediately.
My question is: will the method doSomething3() be impacted by the methods doSomething1() and doSomething2()? Will it have to wait if doSomething1() and doSomething2() are using object "a"?
A method is never impacted by anything that your threads do. What gets impacted is data, and the answer to your question depends entirely on what data are updated (if any) inside the a.process() call.
You asked "Variable reference or memory is blocked?"
First of all, "variable" and "memory" are the same thing. Variables, and fields and objects are all higher level abstractions that are built on top of the lower-level idea of "memory".
Second of all, No. Locking an object does not prevent other threads from accessing or modifying the object or, from accessing or modifying anything else.
Locking an object does two things: It prevents other threads from locking the same object at the same time, and it makes certain guarantees about the visibility of memory updates. The simple explanation is, if thread X updates some variables and then releases a lock, thread Y will be guaranteed to see the updates only after it has acquired the same lock.
What that means for your example is, if thread X calls doSomething1() and modifies the object a, and then thread Y later calls doSomething3(), thread Y is not guaranteed to see the updates. It might see the a object in its original state, it might see it in the fully updated state, or it might see it in some invalid half-way state. The reason is that, even though thread X locked the object, modified it, and then released the lock, thread Y never locked the same object.
In your code, doSomething3() can proceed in parallel with doSomething1() or doSomething2(), so in that sense it does what you want. However, depending on exactly what a.process() does, this may cause a race condition and corrupt data. Note that even if doSomething3() is called, any calls to doSomething1() or doSomething2() that have started will continue; they won't be put in abeyance while doSomething3() is processed.
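A rough sketch of the point that locking an object does not block code that never takes the lock. LockScopeDemo, lockedWork and unlockedWork are hypothetical names standing in for doSomething1() and doSomething3(); the latches exist only to make the ordering deterministic.

```java
import java.util.concurrent.CountDownLatch;

class LockScopeDemo {
    private final Object a = new Object(); // stands in for the shared object "a"

    // Holds the monitor of "a" until the "release" latch is counted down.
    void lockedWork(CountDownLatch entered, CountDownLatch release) throws InterruptedException {
        synchronized (a) {
            entered.countDown();
            release.await();
        }
    }

    // Never locks "a", so it does not wait for lockedWork().
    String unlockedWork() {
        return "done";
    }

    // Returns true if unlockedWork() completed while another thread held the lock.
    static boolean run() {
        LockScopeDemo demo = new LockScopeDemo();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            try {
                demo.lockedWork(entered, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        holder.start();
        try {
            entered.await();                      // holder now owns the monitor of "a"
            boolean held = holder.isAlive();      // still inside the synchronized block
            String result = demo.unlockedWork();  // returns although the lock is held
            release.countDown();                  // let the holder leave the block
            holder.join();
            return held && "done".equals(result);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The sketch only shows that the unsynchronized call is not blocked. Whether that is safe still depends on what the real a.process() reads and writes, as the answer above points out.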

Guarding the initialization of a non-volatile field with a lock?

For educational purposes I'm writing a simple version of AtomicLong, where an internal variable is guarded by ReentrantReadWriteLock.
Here is a simplified example:
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class PlainSimpleAtomicLong {
    private long value;
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    public PlainSimpleAtomicLong(long initialValue) {
        this.value = initialValue;
    }

    public long get() {
        long result;
        rwLock.readLock().lock();
        try {
            result = value;
        } finally {
            // always release the read lock, even if the read throws
            rwLock.readLock().unlock();
        }
        return result;
    }
    // incrementAndGet, decrementAndGet, etc. are guarded by rwLock.writeLock()
}
My question: since "value" is non-volatile, is it possible for other threads to observe incorrect initial value via PlainSimpleAtomicLong.get()?
E.g. thread T1 creates L = new PlainSimpleAtomicLong(42) and shares reference with a thread T2. Is T2 guaranteed to observe L.get() as 42?
If not, would wrapping this.value = initialValue; into a write lock/unlock make a difference?
Chapter 17 of the Java Language Specification reasons about concurrent code in terms of happens-before relationships. In your example, if you take two arbitrary threads, there is no happens-before relationship between this.value = initialValue; and result = value;.
So if you have something like:
T1.start();
T2.start();
...
T1: L = new PlainSimpleAtomicLong(42);
T2: long value = L.get();
Numbering the lines of that listing 1 to 5 from top to bottom, the only happens-before (hb) relationships you have (apart from program order in each thread) are: 1 & 2 hb 3, 4, 5.
But 4 and 5 (the construction in T1 and the get() in T2) are not ordered. If, however, T1 called L.get() before T2 called L.get() (from a wall clock perspective), then you would have a hb relationship between unlock() in T1 and lock() in T2.
As already commented, I don't think your proposed code could break on any combination of JVM/hardware but it could break on a theoretical implementation of the JMM.
As for your suggestion to wrap the constructor in a lock/unlock, I don't think it would be enough because, in theory at least, T1 could release a valid reference (non null) to L before running the body of the constructor. So the risk would be that T2 could acquire the lock before T1 has acquired it in the constructor. There again, this is an interleaving that is probably impossible on current JVMs/hardware.
So to conclude, if you want theoretical thread safety, I don't think you can do without a volatile long value, which is how AtomicLong is implemented. volatile would guarantee that the field is initialised before the object is published. Note finally that the issues I mention here are not due to your object being unsafe (see @BrettOkken's answer) but are based on a scenario where the object is not safely published across threads.
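A hedged sketch of the two ideas in this answer: a volatile field as suggested, and safe publication of the reference. SketchAtomicLong and SafePublicationDemo are hypothetical names. The demo constructs the object before calling Thread.start(), which relies on the rule that a call to start() happens-before every action in the started thread, so the reader does not depend on any assumption about how the reference reached it.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical variant of the class from the question, with a volatile field.
class SketchAtomicLong {
    private volatile long value;
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    SketchAtomicLong(long initialValue) {
        this.value = initialValue;
    }

    // A volatile read is enough for a plain get().
    long get() {
        return value;
    }

    // The write lock makes the read-modify-write atomic among writers.
    long incrementAndGet() {
        rwLock.writeLock().lock();
        try {
            return ++value;
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}

class SafePublicationDemo {
    // Constructs in the calling thread, reads in a second thread, returns what it saw.
    static long run() {
        SketchAtomicLong shared = new SketchAtomicLong(42); // fully constructed first
        long[] seen = new long[1];
        Thread t2 = new Thread(() -> seen[0] = shared.get());
        t2.start(); // start() happens-before everything t2 does
        try {
            t2.join(); // join() makes t2's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Publishing through a data race (for example a plain static field read by an already running thread) is the scenario the answer warns about. This sketch deliberately avoids it rather than demonstrating it.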
Assuming that you do not allow a reference to the instance to escape your constructor (your example looks fine), then a second thread can never see the object with any value of "value" other than what it was constructed with because all accesses are protected by a monitor (the read write lock) which was final in the constructor.
https://www.ibm.com/developerworks/library/j-jtp0618/
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Lock.html
I think that for the initial value, both threads would see the same value (since they can only obtain the object after the constructor has finished).
But if you change the value in one thread, the other thread may not see the new value unless you use volatile.
If you want to implement set, wrapping it in lock/unlock will not solve the problem; locking is useful when you need an atomic operation (like increment).
Even then, it doesn't mean that it would work the way you want, since you don't control the context switch. For example, if two threads call set with the values 4 and 8, you don't know when the context switch occurs, so you don't know which thread will acquire the lock first.

Multithreads: lock on get and set

I know that in a program that works with multiple threads it's necessary to synchronize the methods because it's possible to have problems like race conditions.
But I cannot understand why we need to synchronize also the methods that need just to read a shared variable.
Look at this example:
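import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class ConcurrentIntegerArray {
    // The snippet as posted omits these declarations; a ReentrantLock is assumed here.
    private final int[] arr;
    private final Lock lock = new ReentrantLock();

    public ConcurrentIntegerArray(final int size) {
        arr = new int[size];
    }

    public void set(final int index, final int value) {
        lock.lock();
        try {
            arr[index] = value;
        } finally {
            lock.unlock();
        }
    }

    public int get(final int index) {
        lock.lock();
        try {
            return arr[index];
        } finally {
            lock.unlock();
        }
    }
}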
They put a lock on both the get and the set method. For the set method I understand why. For example, suppose Thread1 wants to put the number 5 at index=3, and some milliseconds later Thread2 has to put the number 6 at index=3. Can it happen that my array still holds a 5 at index=3 instead of a 6 (if I don't synchronize the set method)? This could happen because Thread1 may be context-switched out, so Thread2 enters the same method and writes its value, and afterwards Thread1 assigns the value 5 to the same position. So instead of a 6 I have a 5.
But I don't understand why we also need to synchronize the get method (see the example). I'm asking because get only needs to read from memory, not write to it. So why do we need synchronization on the get method as well? Can someone give me a very simple example?
Both methods need to be synchronized. Without synchronization on the get method, this sequence is possible:
get is called, but the old value isn't returned yet.
Another thread calls set and updates the value.
The first thread that called get now examines the now-returned value and sees what is now an outdated value.
Synchronization would disallow this scenario by guaranteeing that another thread can't just call set and invalidate the get value before it even returns. It would force a thread that calls set to wait for the thread that calls get to finish.
If you do not lock in the get method, then a thread might keep a local copy of the array and never refresh it from main memory. So it's possible that a get never sees a value that was updated by a set call. The lock forces visibility.
Each thread may maintain its own cached copy of the value. synchronized ensures that coherency is maintained between threads. Without it, one can never be sure whether another thread has modified the value. Alternatively, for a single variable, one can declare it volatile and get the same memory visibility effects as synchronized (note that declaring an array field volatile covers only the array reference, not its elements).
The locking action also guarantees memory visibility. From the Lock doc:
All Lock implementations must enforce the same memory synchronization semantics as provided by the built-in monitor lock, [...]:
A successful lock operation has the same memory synchronization effects as a successful Lock action.
A successful unlock operation has the same memory synchronization effects as a successful Unlock action.
Without acquiring the lock, due to memory consistency errors, there's no reason a call to get needs to see the most updated value. Modern processors are very fast, access to DRAM is comparatively very slow, so processors store values they are working on in a local cache. In concurrent programming this means one thread might write to a variable in memory but a subsequent read from a different thread gets a stale value because it is read from the cache.
The locking guarantees that the value is actually read from memory and not from the cache.

AtomicInteger vs synchronized getters/setters

Is this class thread-safe?
Is it possible to see inconsistent values? Lets say initially a's value is 80. Thread 1 calls setA(100) and enters the function but did not yet call a.set(100) and Thread 2 concurrently calls getA(). Is it possible for Thread 2 to see 80?
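import java.util.concurrent.atomic.AtomicInteger;

public class A {
    // initialised to 80, the starting value described above
    private final AtomicInteger a = new AtomicInteger(80);

    public int getA() {
        return a.get();
    }

    public void setA(int newVal) {
        a.set(newVal);
    }
}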
I know that synchronizing it will guarantee thread 2 sees 100, but not sure with AtomicInteger.
Is this class thread-safe?
Yes it is.
Thread 1 calls setA(100) and enters the function but did not yet call a.set(100) and Thread 2 concurrently calls getA(). Is it possible for Thread 2 to see 80?
Yes. Until the memory barrier code that synchronizes the volatile field inside of AtomicInteger completes, the race condition can show 80 or 100.
Thread 1 could even enter the AtomicInteger.set method and be before the inner field assignment and still the 80 may be returned by get AtomicInteger.get method.
There are no guarantees about when the values will be updated in other threads. What is guaranteed is when the get() completes, you get the most recent synchronized value and when the set() completes, all other threads will see the updates.
There is no guarantee as to the timing of getter and setter calls in different threads.
As @Gray noted, there is a possibility for a race condition here.
Calling get and then set is not an atomic operation. The Atomic* classes offer a lock-free atomic conditional update operation, compareAndSet - you should use that one for thread safety.
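A minimal sketch of the compareAndSet retry loop this answer refers to. CasDemo and addWithCas are hypothetical names; the loop re-reads the current value and retries whenever another thread changed it between the get() and the compareAndSet().

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasDemo {
    // Atomically adds delta to target using a compare-and-set retry loop.
    static int addWithCas(AtomicInteger target, int delta) {
        while (true) {
            int current = target.get();
            int next = current + delta;
            // Succeeds only if no other thread changed the value since get().
            if (target.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Four threads each add 1 a thousand times; returns the final total.
    static int run() {
        AtomicInteger a = new AtomicInteger(0);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    addWithCas(a, 1);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return a.get();
    }
}
```

With a separate get() followed by set(current + 1), some of the 4000 increments could be lost. With the retry loop every increment is applied exactly once.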
