Memory Consistency - happens-before relationship in Java [duplicate] - java

While reading the Java docs on memory consistency errors, I found two points describing actions that create a happens-before relationship:
When a statement invokes Thread.start(), every statement that has a
happens-before relationship with that statement also has a
happens-before relationship with every statement executed by the new
thread. The effects of the code that led up to the creation of the
new thread are visible to the new thread.
When a thread terminates and causes a Thread.join() in another thread
to return, then all the statements executed by the terminated
thread have a happens-before relationship with all the statements
following the successful join. The effects of the code in the thread
are now visible to the thread that performed the join.
I am not able to understand their meaning. It would be great if someone could explain them with a simple example.

Modern CPUs don't always write data to memory in the order it was updated. For example, if you run the pseudo code below (assuming, for simplicity, that variables are always stored to memory):
a = 1
b = a + 1
...the CPU may very well write b to memory before it writes a to memory. This isn't really a problem as long as you run things in a single thread, since the thread running the code above will never see the old value of either variable once the assignments have been made.
Multithreading is another matter; you'd think the following code would let another thread pick up the result of your heavy computation:
a = heavy_computation()
b = DONE
...the other thread doing...
repeat while b != DONE
nothing
result = a
The problem though is that the done flag may be set in memory before the result is stored to memory, so the other thread may pick up the value of memory address a before the computation result is written to memory.
The same problem would - if Thread.start and Thread.join didn't have a "happens before" guarantee - give you problems with code like;
a = 1
Thread.start newthread
...
newthread:
do_computation(a)
...since a may not have a value stored to memory when the thread starts.
Since you almost always want the new thread to be able to use data you initialized before starting it, Thread.start has a "happens before" guarantee, that is, data that has been updated before calling Thread.start is guaranteed to be available to the new thread. The same thing goes for Thread.join where data written by the new thread is guaranteed to be visible to the thread that joins it after termination.
It just makes threading much easier.
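If you do need the flag-polling pattern from the pseudo code above rather than join(), declaring the flag volatile is the standard Java way to get the same visibility guarantee. A minimal sketch, with a hypothetical heavyComputation() standing in for the real work:

class FlagExample {
    static int a;
    static volatile boolean done;                   // volatile write/read creates the happens-before edge

    static int heavyComputation() { return 42; }    // hypothetical stand-in for the real work

    public static void main(String[] args) {
        new Thread(() -> {
            a = heavyComputation();
            done = true;                            // published only after a has been written
        }).start();

        while (!done) { /* busy-wait; fine for a sketch, not for production code */ }
        System.out.println(a);                      // guaranteed to see the computed value, not a stale 0
    }
}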

Consider this:
static int x = 0;

public static void main(String[] args) {
    x = 1;
    Thread t = new Thread() {
        public void run() {
            int y = x;
        }
    };
    t.start();
}
The main thread has changed the field x. The Java memory model does not guarantee that this change will be visible to other threads if they are not synchronized with the main thread. But thread t will see this change, because the main thread called t.start(), and the JLS guarantees that calling t.start() makes the change to x visible in t.run(), so y is guaranteed to be assigned 1.
The same applies to Thread.join().

Thread visibility problems may occur in code that isn't properly synchronized according to the Java memory model. Due to compiler and hardware optimizations, writes by one thread aren't always visible to reads of another thread. The Java Memory Model is a formal model that makes the rules of "properly synchronized" precise, so that programmers can avoid thread visibility problems.
Happens-before is a relation defined in that model, and it refers to specific executions. A write W that is proven to happen-before a read R is guaranteed to be visible to that read, assuming that there's no other interfering write (i.e. one with no happens-before relation with the read, or one happening between them according to that relation).
The simplest kind of happens-before relation happens between actions in the same thread. A write W to V in thread P happens-before a read R of V in the same thread, assuming that W comes before R according to the program order.
The text you are referring to states that thread.start() and thread.join() also guarantee a happens-before relationship. Any action that happens-before thread.start() also happens-before any action within that thread. Similarly, actions within the thread happen-before any actions that appear after thread.join().
What's the practical meaning of that? If, for example, you start a thread and wait for it to terminate in a non-safe manner (e.g. sleeping for a long time, or testing some non-synchronized flag), then when you try to read the data modifications made by the thread, you may see them only partially, thus risking data inconsistencies. The join() method acts as a barrier that guarantees that any piece of data published by the thread is visible completely and consistently to the other thread.

According to the Oracle documentation, the happens-before relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement.
package happen.before;

public class HappenBeforeRelationship {

    private static int counter = 0;

    private static void threadPrintMessage(String msg) {
        System.out.printf("[Thread %s] %s\n", Thread.currentThread().getName(), msg);
    }

    public static void main(String[] args) {
        threadPrintMessage("Increase counter: " + ++counter);
        Thread t = new Thread(new CounterRunnable());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            threadPrintMessage("Counter is interrupted");
        }
        threadPrintMessage("Finish count: " + counter);
    }

    private static class CounterRunnable implements Runnable {
        @Override
        public void run() {
            threadPrintMessage("start count: " + counter);
            counter++;
            threadPrintMessage("stop count: " + counter);
        }
    }
}
Output will be:
[Thread main] Increase counter: 1
[Thread Thread-0] start count: 1
[Thread Thread-0] stop count: 2
[Thread main] Finish count: 2
Looking at the output, the line [Thread Thread-0] start count: 1 shows that all changes to counter made before the invocation of Thread.start() are visible in the thread's body.
And the line [Thread main] Finish count: 2 indicates that all changes made in the thread's body are visible to the main thread that calls Thread.join().
Hope this makes it clear.

Related

Quartz Job call synchronized method before completion of first one [duplicate]

I have some questions regarding the usage and significance of the synchronized keyword.
What is the significance of the synchronized keyword?
When should methods be synchronized?
What does it mean programmatically and logically?
The synchronized keyword is all about different threads reading and writing to the same variables, objects and resources. This is not a trivial topic in Java, but here is a quote from Sun:
synchronized methods enable a simple
strategy for preventing thread
interference and memory consistency
errors: if an object is visible to
more than one thread, all reads or
writes to that object's variables are
done through synchronized methods.
In a very, very small nutshell: When you have two threads that are reading and writing to the same 'resource', say a variable named foo, you need to ensure that these threads access the variable in an atomic way. Without the synchronized keyword, your thread 1 may not see the change thread 2 made to foo, or worse, it may only be half changed. This would not be what you logically expect.
Again, this is a non-trivial topic in Java. To learn more, explore topics here on SO and the Interwebs about:
Concurrency
Java Memory Model
Keep exploring these topics until the name "Brian Goetz" becomes permanently associated with the term "concurrency" in your brain.
Well, I think we've had enough theoretical explanation, so consider this code:
public class SOP {
    public static void print(String s) {
        System.out.println(s + "\n");
    }
}

public class TestThread extends Thread {
    String name;
    TheDemo theDemo;

    public TestThread(String name, TheDemo theDemo) {
        this.theDemo = theDemo;
        this.name = name;
        start();
    }

    @Override
    public void run() {
        theDemo.test(name);
    }
}

public class TheDemo {
    public synchronized void test(String name) {
        for (int i = 0; i < 10; i++) {
            SOP.print(name + " :: " + i);
            try {
                Thread.sleep(500);
            } catch (Exception e) {
                SOP.print(e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        TheDemo theDemo = new TheDemo();
        new TestThread("THREAD 1", theDemo);
        new TestThread("THREAD 2", theDemo);
        new TestThread("THREAD 3", theDemo);
    }
}
Note: synchronized blocks the next thread's call to method test() as long as the previous thread's execution is not finished. Threads can access this method one at a time. Without synchronized all threads can access this method simultaneously.
When a thread calls the synchronized method test of the object (here the object is an instance of the TheDemo class), it acquires the lock of that object; no other thread can call ANY synchronized method of the same object until the thread that acquired the lock releases it.
Something similar happens when any static synchronized method of the class is called. The thread acquires the lock associated with the class (in this case any non-static synchronized method of an instance of that class can still be called by any thread, because that object-level lock is still available). No other thread will be able to call any static synchronized method of the class until the class-level lock is released by the thread that currently holds it (see the sketch after the output listings below).
Output with synchronized
THREAD 1 :: 0
THREAD 1 :: 1
THREAD 1 :: 2
THREAD 1 :: 3
THREAD 1 :: 4
THREAD 1 :: 5
THREAD 1 :: 6
THREAD 1 :: 7
THREAD 1 :: 8
THREAD 1 :: 9
THREAD 3 :: 0
THREAD 3 :: 1
THREAD 3 :: 2
THREAD 3 :: 3
THREAD 3 :: 4
THREAD 3 :: 5
THREAD 3 :: 6
THREAD 3 :: 7
THREAD 3 :: 8
THREAD 3 :: 9
THREAD 2 :: 0
THREAD 2 :: 1
THREAD 2 :: 2
THREAD 2 :: 3
THREAD 2 :: 4
THREAD 2 :: 5
THREAD 2 :: 6
THREAD 2 :: 7
THREAD 2 :: 8
THREAD 2 :: 9
Output without synchronized
THREAD 1 :: 0
THREAD 2 :: 0
THREAD 3 :: 0
THREAD 1 :: 1
THREAD 2 :: 1
THREAD 3 :: 1
THREAD 1 :: 2
THREAD 2 :: 2
THREAD 3 :: 2
THREAD 1 :: 3
THREAD 2 :: 3
THREAD 3 :: 3
THREAD 1 :: 4
THREAD 2 :: 4
THREAD 3 :: 4
THREAD 1 :: 5
THREAD 2 :: 5
THREAD 3 :: 5
THREAD 1 :: 6
THREAD 2 :: 6
THREAD 3 :: 6
THREAD 1 :: 7
THREAD 2 :: 7
THREAD 3 :: 7
THREAD 1 :: 8
THREAD 2 :: 8
THREAD 3 :: 8
THREAD 1 :: 9
THREAD 2 :: 9
THREAD 3 :: 9
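To illustrate the class-level vs object-level lock distinction noted above, here is a hedged sketch (the class and method names are made up): the static method locks LockDemo.class while the instance method locks this, so the two calls do not block each other.

class LockDemo {
    // Locks the Class object LockDemo.class
    public static synchronized void staticWork() {
        System.out.println(Thread.currentThread().getName() + " holds the class-level lock");
    }

    // Locks the instance (this); independent of the class-level lock
    public synchronized void instanceWork() {
        System.out.println(Thread.currentThread().getName() + " holds the object-level lock");
    }

    public static void main(String[] args) {
        LockDemo demo = new LockDemo();
        new Thread(LockDemo::staticWork, "static-caller").start();
        new Thread(demo::instanceWork, "instance-caller").start();   // not blocked by the class-level lock
    }
}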
The synchronized keyword prevents concurrent access to a block of code or object by multiple threads. All the methods of Hashtable are synchronized, so only one thread can execute any of them at a time.
When using non-synchronized constructs like HashMap, you must build thread-safety features in your code to prevent consistency errors.
synchronized means that in a multithreaded environment, an object having synchronized method(s)/block(s) does not let two threads access the synchronized method(s)/block(s) of code at the same time. This means that one thread can't read while another thread updates it.
The second thread will instead wait until the first thread completes its execution. The overhead is speed, but the advantage is guaranteed consistency of data.
If your application is single-threaded, though, synchronized blocks provide no benefit.
The synchronized keyword causes a thread to obtain a lock when entering the method, so that only one thread can execute the method at the same time (for the given object instance, unless it is a static method).
This is frequently called making the class thread-safe, but I would say this is a euphemism. While it is true that synchronization protects the internal state of the Vector from getting corrupted, this does not usually help the user of Vector much.
Consider this:
if (vector.isEmpty()) {
    vector.add(data);
}
Even though the methods involved are synchronized, because they are being locked and unlocked individually, two unfortunately timed threads can create a vector with two elements.
So in effect, you have to synchronize in your application code as well.
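A minimal sketch of that application-level locking, synchronizing on the vector itself so the check and the add happen under one lock (vector and data are the variables from the snippet above):

synchronized (vector) {
    if (vector.isEmpty()) {
        vector.add(data);
    }
}

This works because Vector's own methods also lock on the Vector instance, so no other thread can slip in between the isEmpty() check and the add().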
Because method-level synchronization is a) expensive when you don't need it and b) insufficient when you need synchronization, there are now un-synchronized replacements (ArrayList in the case of Vector).
More recently, the concurrency package has been released, with a number of clever utilities that take care of multi-threading issues.
Overview
Synchronized keyword in Java has to do with thread-safety, that is, when multiple threads read or write the same variable.
This can happen directly (by accessing the same variable) or indirectly (by using a class that uses another class that accesses the same variable).
The synchronized keyword is used to define a block of code where multiple threads can access the same variable in a safe way.
Deeper
Syntax-wise, the synchronized keyword takes an Object as its parameter (called a lock object), which is then followed by a { block of code }.
When execution encounters this keyword, the current thread tries to "lock/acquire/own" (take your pick) the lock object and execute the associated block of code after the lock has been acquired.
Any writes to variables inside the synchronized code block are guaranteed to be visible to every other thread that similarly executes code inside a synchronized code block using the same lock object.
Only one thread at a time can hold the lock, during which time all other threads trying to acquire the same lock object will wait (pause their execution). The lock will be released when execution exits the synchronized code block.
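As a minimal sketch of that syntax (the class, field, and method names here are purely illustrative), a private lock object guarding a counter:

class Counter {
    private final Object lock = new Object();   // the lock object
    private int count = 0;

    void increment() {
        synchronized (lock) {                   // acquire the lock, run the block, release it on exit
            count++;
        }
    }
}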
Synchronized methods:
Adding the synchronized keyword to a method definition is equivalent to wrapping the entire method body in a synchronized code block with the lock object being this (for instance methods) or ClassInQuestion.getClass() (for class methods).
- An instance method is a method that does not have the static keyword.
- A class method is a method that has the static keyword.
Technical
Without synchronization, it is not guaranteed in which order the reads and writes happen, possibly leaving the variable with garbage.
(For example a variable could end up with half of the bits written by one thread and half of the bits written by another thread, leaving the variable in a state that neither of the threads tried to write, but a combined mess of both.)
It is not enough to complete a write operation in a thread before (wall-clock time) another thread reads it, because hardware could have cached the value of the variable, and the reading thread would see the cached value instead of what was written to it.
Conclusion
Thus in Java's case, you have to follow the Java Memory Model to ensure that threading errors do not happen.
In other words: use synchronization, atomic operations, or classes that use them for you under the hood.
Sources
http://docs.oracle.com/javase/specs/jls/se8/html/index.html
Java® Language Specification, 2015-02-13
Think of it as a kind of turnstile like you might find at a football ground. There are parallel streams of people wanting to get in, but at the turnstile they are 'synchronised'. Only one person at a time can get through. All those wanting to get through will do so, but they may have to wait until they can go through.
What is the synchronized keyword?
Threads communicate primarily by sharing access to fields and the objects reference fields refer to. This form of communication is extremely efficient, but makes two kinds of errors possible: thread interference and memory consistency errors. The tool needed to prevent these errors is synchronization.
Synchronized blocks or methods prevents thread interference and make sure that data is consistent. At any point of time, only one thread can access a synchronized block or method (critical section) by acquiring a lock. Other thread(s) will wait for release of lock to access critical section.
When are methods synchronized?
Methods are synchronized when you add the synchronized keyword to the method declaration. You can also synchronize a particular block of code within a method.
What does it mean programmatically and logically?
It means that only one thread can access the critical section at a time, by acquiring a lock. Until this thread releases the lock, all other thread(s) will have to wait to acquire it. They cannot enter the critical section without acquiring the lock.
This can't be done by magic. It's the programmer's responsibility to identify the critical section(s) in the application and guard them accordingly. Java provides the framework to guard your application, but where and which sections have to be guarded is the responsibility of the programmer.
More details from java documentation page
Intrinsic Locks and Synchronization:
Synchronization is built around an internal entity known as the intrinsic lock or monitor lock. Intrinsic locks play a role in both aspects of synchronization: enforcing exclusive access to an object's state and establishing happens-before relationships that are essential to visibility.
Every object has an intrinsic lock associated with it. By convention, a thread that needs exclusive and consistent access to an object's fields has to acquire the object's intrinsic lock before accessing them, and then release the intrinsic lock when it's done with them.
A thread is said to own the intrinsic lock between the time it has acquired the lock and released the lock. As long as a thread owns an intrinsic lock, no other thread can acquire the same lock. The other thread will block when it attempts to acquire the lock.
When a thread releases an intrinsic lock, a happens-before relationship is established between that action and any subsequent acquisition of the same lock.
Making methods synchronized has two effects:
First, it is not possible for two invocations of synchronized methods on the same object to interleave.
When one thread is executing a synchronized method for an object, all other threads that invoke synchronized methods for the same object block (suspend execution) until the first thread is done with the object.
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object.
This guarantees that changes to the state of the object are visible to all threads.
Look for other alternatives to synchronization in :
Avoid synchronized(this) in Java?
Synchronized normal method equivalent to synchronized statement (using this):
class A {
    public synchronized void methodA() {
        // all function code
    }

    // equivalent to

    public void methodA() {
        synchronized (this) {
            // all function code
        }
    }
}
Synchronized static method equivalent to synchronized statement (using the class):
class A {
    public static synchronized void methodA() {
        // all function code
    }

    // equivalent to

    public static void methodA() {
        synchronized (A.class) {
            // all function code
        }
    }
}
Synchronized statement (using a variable):
class A {
    private Object lock1 = new Object();

    public void methodA() {
        synchronized (lock1) {
            // all function code
        }
    }
}
For synchronized, we have both synchronized methods and synchronized statements. However, a synchronized method is just a special case of a synchronized statement, so we only need to understand synchronized statements.
=> Basically, we will have
synchronized (object or class) {   // the object/class provides the intrinsic lock
    // code
}
Here are 2 things that help in understanding synchronized:
Every object/class has an intrinsic lock associated with it.
When a thread invokes a synchronized statement, it automatically acquires the intrinsic lock for that synchronized statement's object and releases it when the block exits. As long as a thread owns an intrinsic lock, NO other thread can acquire the SAME lock => thread safe.
=>
When a thread A invokes synchronized(this){ // code 1 }, all blocks of code (inside the class) that use synchronized(this) and all synchronized normal methods (inside the class) are locked, because they use the SAME lock. They will execute only after thread A releases the lock ("// code 1" finished).
The behavior is similar for synchronized(a variable){ // code 1 } or synchronized(SomeClass.class).
SAME LOCK => blocking (it does not depend on which method or which statement, only on which lock).
Use synchronized method or synchronized statements?
I prefer synchronized statements because they are more extendable. For example, in the future you may only need to synchronize a part of a method. Or, say you have 2 synchronized methods that have no relevance to each other; nevertheless, when a thread runs one method, it will block the other method too (this can be prevented by using synchronized(a variable)).
However, applying a synchronized method is simple and the code looks simpler. For some classes, there is only 1 synchronized method, or all synchronized methods in the class are relevant to each other => we can use synchronized methods to make the code shorter and easier to understand.
Note
(This is not so much about synchronized itself; it is the difference between object and class, or non-static and static.)
When you use a synchronized normal method, or synchronized(this), or synchronized(non-static variable), synchronization is based on each object instance.
When you use a synchronized static method, or synchronized(class), or synchronized(static variable), synchronization is based on the class.
Reference
https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html
https://docs.oracle.com/javase/tutorial/essential/concurrency/locksync.html
Hope it helps.
Here is an explanation from The Java Tutorials.
Consider the following code:
public class SynchronizedCounter {
    private int c = 0;

    public synchronized void increment() {
        c++;
    }

    public synchronized void decrement() {
        c--;
    }

    public synchronized int value() {
        return c;
    }
}
If count is an instance of SynchronizedCounter, then making these methods synchronized has two effects:
First, it is not possible for two invocations of synchronized methods on the same object to interleave. When one thread is executing a synchronized method for an object, all other threads that invoke synchronized methods for the same object block (suspend execution) until the first thread is done with the object.
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads.
To my understanding, synchronized basically means that the compiler writes a monitorenter and monitorexit around your method. As such it may be thread-safe depending on how it is used (what I mean is, you can write an object with synchronized methods that isn't thread-safe, depending on what your class does).
What the other answers are missing is one important aspect: memory barriers. Thread synchronization basically consists of two parts: serialization and visibility. I advise everyone to google for "jvm memory barrier", as it is a non-trivial and extremely important topic (if you modify shared data accessed by multiple threads). Having done that, I advise looking at java.util.concurrent package's classes that help to avoid using explicit synchronization, which in turn helps keeping programs simple and efficient, maybe even preventing deadlocks.
One such example is ConcurrentLinkedDeque. Together with the command pattern it allows you to create highly efficient worker threads by stuffing the commands into the concurrent queue -- no explicit synchronization needed, no deadlocks possible, no explicit sleep() necessary; just poll the queue (or, with a blocking variant such as LinkedBlockingDeque, block on take()).
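A minimal sketch of that worker pattern, assuming a blocking deque (LinkedBlockingDeque) is acceptable so the worker can block on take(); the class and method names are made up for illustration:

import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

class CommandWorker {
    private final BlockingDeque<Runnable> commands = new LinkedBlockingDeque<>();

    void start() {
        Thread worker = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Runnable command = commands.take();   // blocks until a command is available
                    command.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();       // restore interrupt status and exit
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    void submit(Runnable command) {
        commands.add(command);   // writes made before add() are visible to the worker after take()
    }
}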
In short: "memory synchronization" happens implicitly when you start a thread, a thread ends, you read a volatile variable, you unlock a monitor (leave a synchronized block/function) etc. This "synchronization" affects (in a sense "flushes") all writes done before that particular action. In the case of the aforementioned ConcurrentLinkedDeque, the documentation "says":
Memory consistency effects: As with other concurrent collections,
actions in a thread prior to placing an object into a
ConcurrentLinkedDeque happen-before actions subsequent to the access
or removal of that element from the ConcurrentLinkedDeque in another
thread.
This implicit behavior is a somewhat pernicious aspect, because most Java programmers without much experience will just take a lot for granted because of it -- and then suddenly stumble over this topic when Java isn't doing what it is "supposed" to do in production, where there is a different workload. And concurrency issues are pretty hard to test for.
synchronized simply means that, for multiple threads associated with a single object, dirty reads and writes can be prevented if a synchronized block is used on that particular object. To give you more clarity, let's take an example:
class MyRunnable implements Runnable {
    int var = 10;

    @Override
    public void run() {
        call();
    }

    public void call() {
        synchronized (this) {
            for (int i = 0; i < 4; i++) {
                var++;
                System.out.println("Current Thread " + Thread.currentThread().getName() + " var value " + var);
            }
        }
    }
}

public class MutlipleThreadsRunnable {
    public static void main(String[] args) {
        MyRunnable runnable1 = new MyRunnable();
        MyRunnable runnable2 = new MyRunnable();
        Thread t1 = new Thread(runnable1);
        t1.setName("Thread -1");
        Thread t2 = new Thread(runnable2);
        t2.setName("Thread -2");
        Thread t3 = new Thread(runnable1);
        t3.setName("Thread -3");
        t1.start();
        t2.start();
        t3.start();
    }
}
We've created two MyRunnable objects: runnable1 is shared by thread 1 and thread 3, and runnable2 is used by thread 2 only.
Now when t1 and t3 start without synchronized being used, the output below suggests that both threads 1 and 3 are simultaneously affecting the same var value, whereas for thread 2, var has its own memory.
Without the synchronized keyword
Current Thread Thread -1 var value 11
Current Thread Thread -2 var value 11
Current Thread Thread -2 var value 12
Current Thread Thread -2 var value 13
Current Thread Thread -2 var value 14
Current Thread Thread -1 var value 12
Current Thread Thread -3 var value 13
Current Thread Thread -3 var value 15
Current Thread Thread -1 var value 14
Current Thread Thread -1 var value 17
Current Thread Thread -3 var value 16
Current Thread Thread -3 var value 18
Using synchronized, thread 3 waits for thread 1 to complete in all scenarios. Two locks are acquired: one on runnable1, shared by thread 1 and thread 3, and another on runnable2, used by thread 2 only.
Current Thread Thread -1 var value 11
Current Thread Thread -2 var value 11
Current Thread Thread -1 var value 12
Current Thread Thread -2 var value 12
Current Thread Thread -1 var value 13
Current Thread Thread -2 var value 13
Current Thread Thread -1 var value 14
Current Thread Thread -2 var value 14
Current Thread Thread -3 var value 15
Current Thread Thread -3 var value 16
Current Thread Thread -3 var value 17
Current Thread Thread -3 var value 18
In Java, to prevent multiple threads from manipulating a shared variable inconsistently, we use the synchronized keyword. Let's understand it with the help of the following example:
In the example I have defined two threads and named them increment and decrement. The increment thread increases the value of the shared variable (counter) by the same amount the decrement thread decreases it, i.e. it is increased 5000 times (which results in 0 + 5000 = 5000) and decreased 5000 times (which results in 5000 - 5000 = 0).
Program without synchronized keyword:
class SynchronizationDemo {
    public static void main(String[] args) {
        Buffer buffer = new Buffer();
        MyThread incThread = new MyThread(buffer, "increment");
        MyThread decThread = new MyThread(buffer, "decrement");
        incThread.start();
        decThread.start();
        try {
            incThread.join();
            decThread.join();
        } catch (InterruptedException e) { }
        System.out.println("Final counter: " + buffer.getCounter());
    }
}

class Buffer {
    private int counter = 0;
    public void inc() { counter++; }
    public void dec() { counter--; }
    public int getCounter() { return counter; }
}

class MyThread extends Thread {
    private String name;
    private Buffer buffer;

    public MyThread(Buffer aBuffer, String aName) {
        buffer = aBuffer;
        name = aName;
    }

    public void run() {
        for (int i = 0; i < 5000; i++) {
            if (name.equals("increment"))
                buffer.inc();
            else
                buffer.dec();
        }
    }
}
If we run the above program, we expect the final value of the counter to be the value we started with, since incrementing and decrementing the counter by the same amount should cancel out, right? Let's see the output:
As you can see, no matter how many times we run the program we get a different result, the reason being that the threads manipulated the counter at the same time. If we could let one thread finish incrementing the shared variable before the second one decrements it (or vice versa), we would get the right result. That is exactly what can be done with the synchronized keyword, by just adding it to the inc and dec methods of Buffer like this:
Program with synchronized keyword:
// rest of the code

class Buffer {
    private int counter = 0;

    // added synchronized keyword to let only one thread
    // (be it the inc or the dec thread) manipulate data at a time
    public synchronized void inc() { counter++; }
    public synchronized void dec() { counter--; }
    public int getCounter() { return counter; }
}

// rest of the code
and the output:
no matter how many times we run it, we get the same output of 0
Java synchronized
volatile => synchronized
A synchronized block in Java is a monitor in multithreading. A synchronized block on the same object/class can be executed by only a single thread at a time; all others wait. It can help with race condition situations.
Java 5 extended synchronized by supporting happens-before:
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor.
The next step is java.util.concurrent
synchronized simply means that no two threads can access the block/method simultaneously. When we say any block/method of a class is synchronized, it means only one thread can access it at a time. Internally, the thread that tries to access it first takes a lock on that object, and as long as this lock is not available no other thread can access any of the synchronized methods/blocks of that instance of the class.
Note that another thread can still access a method of the same object which is not declared synchronized. A thread can release the lock by calling
Object.wait()
synchronized is a keyword in Java which is used to establish a happens-before relationship in a multithreaded environment, to avoid memory inconsistency and thread interference errors.

Trying to solve a race condition without using any library in Java

I searched "java race condition" and saw a lot of articles, but none of them is what I am looking for.
I am trying to solve the race condition without using locks, synchronization, Thread.sleep, or anything else. My code is here:
public class Test {
    static public int amount = 0;
    static public boolean x = false;

    public static void main(String[] args) {
        Thread a = new myThread1();
        Thread b = new myThread2();
        b.start();
        a.start();
    }
}

class myThread1 extends Thread {
    public void run() {
        for (int i = 0; i < 1000000; i++) {
            if (i % 100000 == 0) {
                System.out.println(i);
            }
        }
        while (true) {
            Test.x = true;
        }
    }
}

class myThread2 extends Thread {
    public void run() {
        System.out.println("Thread 2: waiting...");
        while (!Test.x) {
        }
        System.out.println("Thread 2: finish waiting!");
    }
}
I expect the output should be:
Thread 2: waiting...
0
100000
200000
300000
400000
500000
600000
700000
800000
900000
Thread 2: finish waiting!
(Terminated normally)
But it actually is:
Thread 2: waiting...
0
100000
200000
300000
400000
500000
600000
700000
800000
900000
(And the program won't terminate)
After I added a statement to myThread2, changed
while (!Test.x) {
}
to
while (!Test.x) {
System.out.println(".");
}
The program terminates normally and the output is what I expected (except for those "." lines).
I know that when 2 threads execute concurrently, the CPU may arbitrarily switch to another thread before fetching the next machine-code instruction.
I thought it would be fine if one thread reads a variable while another thread writes to it. And I really don't get why the program will not terminate normally. I also tried adding a Thread.sleep statement inside the while loop of myThread1, but the program still does not terminate.
This question has puzzled me for a few weeks; I hope someone can help me.
Try declaring x as volatile:
static public volatile boolean x = false;
Test.x isn't volatile and thus might not be synchronized between threads.
How the print command in the second loop affects the overall behavior can't be predicted, but apparently in this case it causes x to be synchronized.
In general: if you ignore all the thread-related features of Java, you can't produce code that has well-defined behavior under concurrency. The minimum would be to mark variables that are used by different threads as volatile and to synchronize pieces of code that must not run concurrently.
The shared variable x is being read and written from multiple threads without any synchronisation and hence only bad things can happen.
When you have the following,
while (!Test.x) {
}
The compiler might optimise this into an infinite loop since x (the non volatile variable) is not being changed inside the while loop, and this would prevent the program from terminating.
Adding a print statement adds visibility, since println uses a synchronized block protecting System.out; this leads to crossing a memory barrier and getting a fresh copy of Test.x.
You CAN NOT synchronise shared mutable state without using synchronisation constructs.
Much better would be a lock object that you wait on in Thread2 and notify from Thread1. You are currently busy-waiting in Thread2 and consuming a lot of CPU resources.
Dummy code example:
main() {
    Object lock = new Object();
    Thread2 t2 = new Thread2(lock);
    t2.start();
    Thread1 t1 = new Thread1(lock);
    t1.start();
    ...
}

class Thread1 extends Thread {
    Object lock = null;

    public Thread1(Object lock) {
        this.lock = lock;
        ...
    }

    public void run() {
        ...
        synchronized (lock) {
            lock.notifyAll();
        }
    }
} // end class Thread1

// similar for Thread2
class Thread2 extends Thread {
    ... // lock field and constructor as in Thread1

    public void run() {
        System.out.println("Thread 2: waiting...");
        synchronized (lock) {
            try {
                lock.wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        System.out.println("Thread 2: finish waiting!");
    }
    ....
}
This construct does not burn CPU cycles doing nothing in "Thread2". You can create any number of "Thread2" instances and have them wait until "Thread1" is finished. You just have to start all "Thread2" instances before "Thread1"; otherwise "Thread1" may finish (and notify) before the "Thread2" instances have started waiting.
What you are really asking is, "Why does my program work as expected when I add a call to println()?"
Actions performed by one thread aren't generally required to be visible to other threads. The JVM is free to treat each thread as if it's operating in its own, private universe, which is often faster than trying to keep all other threads updated with events in that private universe.
If you have a need for some threads to stay up-to-date with some actions of another thread, you must "synchronize-with" those threads. If you don't, there's no guarantee threads will ever observe the actions of another.
Solving a race condition without a memory barrier is a nonsensical question. There's no answer, and no reason to look for one. Declare x to be a volatile field!
When you call System.out.println(), you are invoking a synchronized method, which, like volatile, acts as a memory barrier to synchronize with other threads. It appears to be sufficient in this case, but in general, even this is not enough to guarantee your program will work as expected. To guarantee the desired behavior, the first thread should acquire and release the same lock, System.out, after setting x to true.
Update:
Eric asks, "I am curious how volatile work, what has it done behind. I thought that everything can be created by addition, subtraction, compare, jumping, and assignment."
Volatile writes work by ensuring that values are written to a location that is accessible to all reading threads, like main memory, instead of something like a processor register or a data cache line.
Volatile reads work by ensuring that values are read from that shared location, instead of, for example, using a value cached in a register.
When Java byte codes are executed, they are translated to native instructions specific to the executing processor. The instructions necessary to make volatile work will vary, but the specification of the Java platform require that whatever the implementation, certain guarantees about visibility are met.

How the synchronized keyword works internally

I read the below program and answer in a blog.
int x = 0;
boolean bExit = false;

Thread 1 (not synchronized):
x = 1;
bExit = true;

Thread 2 (not synchronized):
if (bExit == true)
    System.out.println("x=" + x);
Is it possible for Thread 2 to print "x=0"?
Ans: Yes (reason: every thread has its own copy of the variables).
How do you fix it?
Ans: By making both threads synchronize on a common mutex, or by making both variables volatile.
My doubt is: if we make the 2 variables volatile, then the 2 threads will share the variables through main memory. That makes sense, but in the case of synchronization, how is it resolved, given that both threads have their own copy of the variables?
Please help me.
This is actually more complicated than it seems. There are several arcane things at work.
Caching
Saying "Every thread has their own copy of variables" is not exactly correct. Every thread may have their own copy of variables, and they may or may not flush these variables into the shared memory and/or read them from there, so the whole thing is non-deterministic. Moreover, the very term flushing is really implementation-dependent. There are strict terms such as memory consistency, happens-before order, and synchronization order.
Reordering
This one is even more arcane. This
x = 1;
bExit = true;
does not even guarantee that Thread 1 will first write 1 to x and then true to bExit. In fact, it does not even guarantee that any of these will happen at all. The compiler may optimize away some values if they are not used later. The compiler and CPU are also allowed to reorder instructions any way they want, provided that the outcome is indistinguishable from what would happen if everything was really in program order. That is, indistinguishable for the current thread! Nobody cares about other threads until...
Synchronization comes in
Synchronization does not only mean exclusive access to resources. It is also not just about preventing threads from interfering with each other. It's also about memory barriers. It can be roughly described as each synchronization block having invisible instructions at the entry and exit, the first one saying "read everything from the shared memory to be as up-to-date as possible" and the last one saying "now flush whatever you've been doing there to the shared memory". I say "roughly" because, again, the whole thing is an implementation detail. Memory barriers also restrict reordering: actions may still be reordered, but the results that appear in the shared memory after exiting the synchronized block must be identical to what would happen if everything was indeed in program order.
All that only works, of course, only if both blocks use the same locking object.
The whole thing is described in details in Chapter 17 of the JLS. In particular, what's important is the so-called "happens-before order". If you ever see in the documentation that "this happens-before that", it means that everything the first thread does before "this" will be visible to whoever does "that". This may even not require any locking. Concurrent collections are a good example: one thread puts there something, another one reads that, and that magically guarantees that the second thread will see everything the first thread did before putting that object into the collection, even if those actions had nothing to do with the collection itself!
Volatile variables
One last warning: you better give up on the idea that making variables volatile will solve things. In this case maybe making bExit volatile will suffice, but there are so many troubles that using volatiles can lead to that I'm not even willing to go into that. But one thing is for sure: using synchronized has much stronger effect than using volatile, and that goes for memory effects too. What's worse, volatile semantics changed in some Java version so there may exist some versions that still use the old semantics which was even more obscure and confusing, whereas synchronized always worked well provided you understand what it is and how to use it.
Pretty much the only reason to use volatile is performance because synchronized may cause lock contention and other troubles. Read Java Concurrency in Practice to figure all that out.
Q & A
1) You wrote "now flush whatever you've been doing there to the shared
memory" about synchronized blocks. But we will see only the variables
that we access in the synchronize block or all the changes that the
thread call synchronize made (even on the variables not accessed in the
synchronized block)?
Short answer: it will "flush" all variables that were updated during the synchronized block or before entering the synchronized block. And again, because flushing is an implementation detail, you don't even know whether it will actually flush something or do something entirely different (or doesn't do anything at all because the implementation and the specific situation already somehow guarantee that it will work).
Variables that weren't accessed inside the synchronized block obviously won't change during the execution of the block. However, if you change some of those variables before entering the synchronized block, for example, then you have a happens-before relationship between those changes and whatever happens in the synchronized block (the first bullet in 17.4.5). If some other thread enters another synchronized block using the same lock object, then it synchronizes-with the first thread exiting the synchronized block, which means that you have another happens-before relationship here. So in this case the second thread will see the variables that the first thread updated prior to entering the synchronized block.
If the second thread tries to read those variables without synchronizing on the same lock, then it is not guaranteed to see the updates. But then again, it isn't guaranteed to see the updates made inside the synchronized block as well. But this is because of the lack of the memory-read barrier in the second thread, not because the first one didn't "flush" its variables (memory-write barrier).
2) In this chapter you post (of JLS) it is written that: "A write to a
volatile field (§8.3.1.4) happens-before every subsequent read of that
field." Doesn't this mean that when the variable is volatile you will
see only changes of it (because it is written write happens-before
read, not happens-before every operation between them!). I mean
doesn't this mean that in the example, given in the description of the
problem, we can see bExit = true, but x = 0 in the second thread if
only bExit is volatile? I ask, because I find this question here: http://java67.blogspot.bg/2012/09/top-10-tricky-java-interview-questions-answers.html
and it is written that if bExit is volatile the program is OK. So the
registers will flush only bExits value only or bExits and x values?
By the same reasoning as in Q1, if you do bExit = true after x = 1, then there is an in-thread happens-before relationship because of the program order. Now since volatile writes happen-before volatile reads, it is guaranteed that the second thread will see whatever the first thread updated prior to writing true to bExit. Note that this behavior is only since Java 1.5 or so, so older or buggy implementations may or may not support this. I have seen bits in the standard Oracle implementation that use this feature (java.concurrent collections), so you can at least assume that it works there.
3) Why monitor matters when using synchronized blocks about memory
visibility? I mean when try to exit synchronized block aren't all
variables (which we accessed in this block or all variables in the
thread - this is related to the first question) flushed from registers
to main memory or broadcasted to all CPU caches? Why object of
synchronization matters? I just cannot imagine what are relations and
how they are made (between object of synchronization and memory).
I know that we should use the same monitor to see this changes, but I
don't understand how memory that should be visible is mapped to
objects. Sorry, for the long questions, but these are really
interesting questions for me and it is related to the question (I
would post questions exactly for this primer).
Ha, this one is really interesting. I don't know. Probably it flushes anyway, but Java specification is written with high abstraction in mind, so maybe it allows for some really weird hardware where partial flushes or other kinds of memory barriers are possible. Suppose you have a two-CPU machine with 2 cores on each CPU. Each CPU has some local cache for every core and also a common cache. A really smart VM may want to schedule two threads on one CPU and two threads on another one. Each pair of the threads uses its own monitor, and VM detects that variables modified by these two threads are not used in any other threads, so it only flushes them as far as the CPU-local cache.
See also this question about the same issue.
4) I thought that everything before writing a volatile will be up to
date when we read it (moreover when we use volatile a read that in
Java it is memory barrier), but the documentation don't say this.
It does:
17.4.5.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
If hb(x, y) and hb(y, z), then hb(x, z).
A write to a volatile field (§8.3.1.4) happens-before every subsequent
read of that field.
If x = 1 comes before bExit = true in program order, then we have happens-before between them. If some other thread reads bExit after that, then we have happens-before between write and read. And because of the transitivity, we also have happens-before between x = 1 and read of bExit by the second thread.
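To tie this back to the original example, a minimal sketch assuming we simply declare bExit volatile (everything else stays as in the question):

class Example {
    static int x = 0;
    static volatile boolean bExit = false;   // the volatile write/read supplies the happens-before edge

    static void writer() {                   // Thread 1
        x = 1;                               // happens-before the volatile write below (program order)
        bExit = true;                        // volatile write
    }

    static void reader() {                   // Thread 2
        if (bExit)                           // volatile read; if it observes true...
            System.out.println("x=" + x);    // ...then x is guaranteed to be seen as 1
    }
}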
5) Also, if we have volatile Person p does we have some dependency
when we use p.age = 20 and print(p.age) or have we memory barrier in
this case(assume age is not volatile) ? - I think - No
You are correct. Since age is not volatile, then there is no memory barrier, and that's one of the trickiest things. Here is a fragment from CopyOnWriteArrayList, for example:
Object[] elements = getArray();
E oldValue = get(elements, index);

if (oldValue != element) {
    int len = elements.length;
    Object[] newElements = Arrays.copyOf(elements, len);
    newElements[index] = element;
    setArray(newElements);
} else {
    // Not quite a no-op; ensures volatile write semantics
    setArray(elements);
}
6) And is there any happens before 2 subsequent reads of volatile
field? I mean does the second read will see all changes from thread
which reads this field before it(of course we will have changes only
if volatile influence visibility of all changes before it - which I am
a little confused whether it is true or not)?
No, there is no relationship between volatile reads. Of course, if one thread performs a volatile write and then two other threads perform volatile reads, they are guaranteed to see everything at least as up to date as it was before the volatile write, but there is no guarantee of whether one thread will see more up-to-date values than the other. Moreover, there is not even a strict definition of one volatile read happening before another! It is wrong to think of everything happening on a single global timeline. It is more like parallel universes with independent timelines that sometimes sync their clocks by performing synchronization and exchanging data with memory barriers.
It depends on the implementation whether threads keep a copy of the variables in their own memory. In the case of class-level variables threads have shared access, and in the case of local variables each thread keeps its own copy. I will provide two examples which show this; please have a look at them.
And in your example, if I understood it correctly, your code should look something like this:
package com.practice.multithreading;

public class LocalStaticVariableInThread {

    static int x = 0;
    static boolean bExit = false;

    public static void main(String[] args) {
        Thread t1 = new Thread(run1);
        Thread t2 = new Thread(run2);
        t1.start();
        t2.start();
    }

    static Runnable run1 = () -> {
        x = 1;
        bExit = true;
    };

    static Runnable run2 = () -> {
        if (bExit == true)
            System.out.println("x=" + x);
    };
}
Output
x=1
I am always getting this output. It is because the threads share the variables, and when one is changed by one thread the other thread can see it. But in real-life scenarios we can never say which thread will start first; here, since the threads are doing little else, we see the expected result.
Now take this example:
Here, if you make the i variable used inside the for-loop a shared field instead of a local variable, the threads won't keep their own copy of it and you won't see the desired output, i.e. the count value will not be 2000 every time, even though the count increment is synchronized.
package com.practice.multithreading;

public class RaceCondition2Fixed {

    private int count;
    int i;

    /* Making it synchronized forces the thread to acquire an intrinsic lock on the method,
       and another thread cannot access it until this lock is released after the method completes. */
    public synchronized void increment() {
        count++;
    }

    public static void main(String[] args) {
        RaceCondition2Fixed rc = new RaceCondition2Fixed();
        rc.doWork();
    }

    private void doWork() {
        Thread t1 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (i = 0; i < 1000; i++) {
                    increment();
                }
            }
        });

        Thread t2 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (i = 0; i < 1000; i++) {
                    increment();
                }
            }
        });

        t1.start();
        t2.start();

        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        /* If we don't use join, the count may be printed as 0: after t1.start() and t2.start()
           the threads update count in separate threads while the main thread prints the value
           immediately. So we need to wait for the threads to complete. */
        System.out.println(Thread.currentThread().getName() + " Count is : " + count);
    }
}

Java multithreading: concurrency of two or more threads?

Well,
I'm trying to understand this case. When I create two threads sharing the same instance of Runnable, why is the output in this order?
Hello from Thread t 0
Hello from Thread u 1
Hello from Thread t 2
Hello from Thread t 4
Hello from Thread u 3 <----| this is not in order
Hello from Thread u 6
Hello from Thread t 5 <----| this one too
Hello from Thread t 8
Hello from Thread t 9
Hello from Thread t 10
I'll show you the code of the two threads:
public class MyThreads {
    public static void main(String[] args) {
        HelloRunnerShared r = new HelloRunnerShared();
        Thread t = new Thread(r, "Thread t");
        Thread u = new Thread(r, "Thread u");
        t.start();
        u.start();
    }
}
And concluding, the final question: when I run these threads I understand they're not running in order, but why does a thread end up printing a number out of order?
This is the code for the runnable:
class HelloRunnerShared implements Runnable {
    int i = 0;

    public void run() {
        String name = Thread.currentThread().getName();
        while (i < 300) {
            System.out.println("Hello from " + name + " " + i++);
        }
    }
}
I thought they would be processed in an interleaved fashion. It's just an assumption!
Thanks!
Why do you think threads should be executing in a particular order? It's a nondeterministic phenomenon -- whichever is scheduled first, runs first.
Use ExecutorService.invokeAll if you want things to run in a fixed order, regardless of how they are scheduled.
There are several things going on:
The OS scheduler can switch between threads any time it wants. There's no fairness requirement, the scheduler may favor one thread over another (for instance, it could be trying to minimize the amount of context-switching).
The only locking going on is on the PrintStream used by the println method, which keeps the threads from writing to the console simultaneously. Which thread acquires the lock on the PrintStream when depends on the OS scheduler. The locks used are the intrinsic ones used with the synchronized keyword, they are not fair. The scheduler can give the lock to the same thread that took it last time.
++ is not an atomic operation. The two threads can get in each other's way updating i. You could use AtomicInteger instead of an int (see the sketch after this list).
Access to i is not protected by a lock or any other means of enforcing a happens-before boundary, so updates to it may or may not be visible to other threads. Just because one thread updates i doesn't automatically mean the other thread will see the updated value right away, or at all (how forgiving the JVM is about this depends on the implementation). In the absence of happens-before boundaries the JVM can make optimizations like reordering bytecodes or performing aggressive caching.
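For instance, a hedged sketch of the shared runnable rewritten with AtomicInteger (only the runnable is shown; the rest of the program would stay the same):

import java.util.concurrent.atomic.AtomicInteger;

class HelloRunnerShared implements Runnable {
    private final AtomicInteger i = new AtomicInteger(0);

    public void run() {
        String name = Thread.currentThread().getName();
        int current;
        // getAndIncrement() is atomic, so no increments are lost and each value is printed exactly once
        while ((current = i.getAndIncrement()) < 300) {
            System.out.println("Hello from " + name + " " + current);
        }
    }
}

Note that the println calls from the two threads can still interleave in any order; the AtomicInteger only guarantees that the counter itself is updated atomically.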

Java Thread: explain why the newest value of a variable is not always printed

I have this piece of code using Java Thread:
public class ThreadExample implements Runnable {
    public Thread thread;
    static int i = 0;

    ThreadExample() {
        thread = new Thread(this);
        thread.start();
    }

    public static void main(String[] args) {
        ThreadExample example = new ThreadExample();
        for (int n = 0; n < 1000; n++) {
            System.out.println("main thread " + i);
            i++;
        }
    }

    public void run() {
        for (int index = 0; index < 1000; index++) {
            System.out.println("Sub thread " + i);
            i++;
        }
    }
}
And when run, the result is:
main thread 0
Sub thread 0
Sub thread 2
Sub thread 3
Sub thread 4
Sub thread 5
Sub thread 6
Sub thread 7
Sub thread 8
Sub thread 9
Sub thread 10
Sub thread 11
Sub thread 12
main thread 1
....
I know that threads don't run in a fixed order. But the thing I don't understand is: why does the main thread print 1 (it prints variable i) when variable i has already reached 12? (Because the sub thread has already printed up to 12.)
Thanks :)
Most likely, the explanation is that there's a large delay between when the text is prepared to be printed and when it can actually be printed. The main thread prepared the statement "main thread 1" and then had to wait until the sub thread released the lock on System.out, and it acquired the lock, in order to actually print it.
If you make i volatile, and it still happens, then this is pretty much the only explanation.
This happens because the threads won't see each other's modifications without some synchronization or the volatile keyword.
See http://docs.oracle.com/javase/specs/jls/se7/html/jls-8.html#jls-8.3.1.4
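A minimal sketch of that change is just declaring the shared counter volatile, so each read observes the most recent write by any thread (note that volatile alone still does not make i++ atomic):

static volatile int i = 0;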
The result is likely due to the way the lock acquisition (which occurs under the covers for the System.out.println call) isn't inherently fair. Unfair implementations, or at least partially unfair ones, usually don't guarantee that the threads will acquire the lock in the same order they wait for it.
The result is that when two threads are competing for the lock repeatedly in a tight loop (as in your example), you will usually get a run of acquisitions by one thread, then a run of acquisitions by the other, and so on. Unfair locks are often simpler to implement and generally improve performance (throughput) over the perfectly fair alternative for highly contended locks, since they don't suffer from the lock convoy effect. The downside, evidently, is that no guarantees are made about the order in which particular waiting threads will acquire the lock, and in theory some waiting threads can be starved out indefinitely (in practice this may not occur, or the lock may be designed with a fallback to fair behavior when some thread has been waiting an unusually long time).
Given unfair locks, the pattern above is natural. The loop in your 2nd thread got the lock, and shortly after, the first thread read 1 and began waiting - the String to output had already been concatenated with the leading text at that point, so the value 1 was "baked in". The main thread has to wait through several lock/unlock pairs on the other thread before it gets a chance to run, and then it prints the old value.
An alternate explanation is that, due to the lack of volatile, the interpreter or JIT isn't even reading the shared value of i anymore, but has hoisted the variable into a register - this is allowed under the JMM and JVM spec, since there are no intervening methods which might modify i.
As far as I know this is because of the buffering of System.out.
Try writing to a file or calling System.out.flush() after each println.
And try to increment i via a synchronized method:
public synchronized void increment() {
    i++;
}
