Difference in thread behaviour while accessing variable - java

I was going through the tutorial https://www.youtube.com/watch?v=SC2jXxOPe5E to understand how volatile variables work and came across a strange behavior.
For the following code snippet
public class VolatileDemo {
    static boolean running = false;

    public static void main(String a[]) throws InterruptedException {
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                while (!running) {
                }
                System.out.print("Started");
                while (running) {
                }
                System.out.print("Stopped");
            }
        });
        t.start();
        Thread.sleep(1000);
        running = true;
        System.out.print("Starting ");
        Thread.sleep(1000);
        running = false;
        System.out.print("Stopping");
    }
}
The output is: Starting Stopping (which is understandable from the video)
But for the following code snippet
public class VolatileDemo {
    static boolean running = false;

    public static void main(String a[]) throws InterruptedException {
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                while (!running) {
                    System.out.print("Flag " + running);
                }
                System.out.print(" Started");
                while (running) {
                    System.out.print(" Flag " + running);
                }
                System.out.print(" Stopped");
            }
        });
        t.start();
        Thread.sleep(1000);
        running = true;
        System.out.print(" Starting");
        Thread.sleep(1000);
        running = false;
        System.out.print(" Stopping");
    }
}
The output is: Flag false Starting Started Flag true Stopping Stopped (repeated Flag prints omitted)
My concern here is: why was the thread able to read the updated value of 'running' in case 2?
Edit: The difference between the two snippets is the addition of the statement below in the latter case:
System.out.print("Flag " + running);

I think it's important to understand what the purpose of volatile is.
On a multi-processor system with multiple levels of cache, an update to a variable can take some time to reach main memory, and hence other threads, depending on latency and the hardware design. The code is basically meant to demonstrate this happening: the running variable is changed, and the output should show some delay between when it was changed and when the thread actually starts. Adding the volatile keyword should reduce this delay, as it forces the write-through to main memory to happen immediately instead of whenever the cache decides to do it.
Please note that volatile doesn't make code thread safe; it just tells the JVM that the variable needs to be written directly to main memory, bypassing any delayed-writing scheme the caching hardware might otherwise use. It also means that the variable is read from main memory, so that stale data is not used. It is for reducing the latency between a variable being updated in one thread and being seen as updated in another. This isn't something you'll need often, and you should use it sparingly, as bypassing the cache has performance ramifications for your code.
When you added extra instructions to the thread's code, you effectively reduced the rate at which it can poll the running variable quite significantly. I'd say it's likely that the time between running being changed and being updated in main memory is very small, much shorter than the time it takes to write to the console (which takes longer than you'd think). So it is very likely that you won't see what you expect, except on the rare occasion that the boolean evaluation in the while loop happens at exactly the right moment.
Unlike deadlock, this particular property of multi-processing is quite hard to demonstrate on a single machine. It's much more likely to happen on a NUMA system architecture (or a cluster) where the cache to memory update latency can be much larger. On a single system the time between a variable being updated in cache and written to main memory is very small.
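For reference, a minimal sketch (not part of the original answer) of the first snippet with the flag declared volatile; the reader thread is then guaranteed to see both updates, so it prints Started and Stopped as intended:
public class VolatileDemo {
    // volatile: writes by the main thread happen-before subsequent reads in the spinning thread
    static volatile boolean running = false;

    public static void main(String[] a) throws InterruptedException {
        Thread t = new Thread(() -> {
            while (!running) {
            }
            System.out.print("Started ");
            while (running) {
            }
            System.out.print("Stopped ");
        });
        t.start();
        Thread.sleep(1000);
        running = true;
        System.out.print("Starting ");
        Thread.sleep(1000);
        running = false;
        System.out.print("Stopping ");
        t.join(); // wait for the spinning thread to finish before main exits
    }
}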

Related

Is Java caching entire objects or only parts of objects? (Visibility Issues)

I was trying to intentionally create visibility issues with threads and I got unexpected results:
public class DownloadStatus {
    private int totalBytes;
    private boolean isDone;

    public void increment() {
        totalBytes++;
    }

    public int getTotalBytes() {
        return totalBytes;
    }

    public boolean isDone() {
        return isDone;
    }

    public void done() {
        isDone = true;
    }
}

public class DownloadFileTask implements Runnable {
    DownloadStatus status;

    public DownloadFileTask(DownloadStatus status) {
        this.status = status;
    }

    @Override
    public void run() {
        System.out.println("start download");
        for (int i = 0; i < 10_000; i++) { // "download" a 10,000-byte file each time you run
            status.increment(); // each byte downloaded - update the status
        }
        System.out.println("download ended with: " + status.getTotalBytes()); // **NOTE THIS LINE**
        status.done();
    }
}

// creating threads, one to download, another to wait for the download to be done.
public static void main(String[] args) {
    DownloadStatus status = new DownloadStatus();
    Thread t1 = new Thread(new DownloadFileTask(status));
    Thread t2 = new Thread(() -> {
        while (!status.isDone()) {}
        System.out.println("DONE!!");
    });
    t1.start();
    t2.start();
}
So, running this would create a visibility problem: the second thread wouldn't see the updated value, since it had cached it before it got written back by the first thread. This causes an endless while loop, with the second thread constantly checking its cached isDone() (at least that's how I think it works).
The thing I don't get is why this visibility problem stops happening when I comment out the line from the second code block that calls status.getTotalBytes().
From my understanding, both threads start by caching the status object as-is, so the second thread should constantly check its cached value (and essentially not see the new value updated by the first thread).
Why is this line calling a method in the status object causing this visibility issue? (and more interestingly - why not calling it fixes it?)
What you call a "visibility problem" is actually a data race.
A single thread sees the effects of its operations in the order they are written. That is if you update a variable and then read it, you'll always see the updated value within that thread.
The effects of a thread's execution may be different when viewed from another thread. This is mainly related to the language and the underlying hardware architecture. The compiler may reorder instructions, delay memory writes while keeping values in registers, or the values may be kept in a cache before written to the main memory. Without an explicit memory barrier, the value in the main memory would not be updated. That's what you call the "visibility problem".
It is likely that there is a memory barrier in System.out.println. So when you execute that line, all updates up to that point will be committed to main memory, and the other threads can see it. Note that without explicit synchronization there is still no guarantee that the other threads will see it, because those threads may re-use the value they read for that variable before. There is nothing in the program that tells the compiler/runtime that the values may be changed by other threads.
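A minimal sketch of the usual fix (my addition, not from the answer): declaring isDone volatile establishes the happens-before edge the waiting thread needs, so its loop is guaranteed to observe the update:
public class DownloadStatus {
    private int totalBytes;
    // volatile: the write in done() happens-before every later read in isDone(),
    // so the waiting thread's while-loop is guaranteed to see the change
    private volatile boolean isDone;

    public void increment() {
        totalBytes++; // still not atomic; see the note about increment below
    }

    public int getTotalBytes() {
        return totalBytes;
    }

    public boolean isDone() {
        return isDone;
    }

    public void done() {
        isDone = true;
    }
}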
This is a race condition between the two threads; it has nothing to do with the status.getTotalBytes() statement in your code. It is the scheduler that decides which thread will run, and it is by chance that you are not getting stuck in the infinite loop after commenting out the println statement. The main problem in your code is that incrementing and setting the status should be an atomic operation, so replace the definition of the run method as below. Secondly, increment is also not an atomic operation, so you can get unpredictable results if there is no proper synchronization.
@Override
public void run() {
    System.out.println("start download");
    incrementAndSetStatus();
}

public synchronized void incrementAndSetStatus() {
    for (int i = 0; i < 10_000; i++) { // "download" a 10,000-byte file each time you run
        status.increment(); // each byte downloaded - update the status
    }
    System.out.println("download ended with: " + status.getTotalBytes()); // **NOTE THIS LINE**
    status.done();
}
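Since increment itself is not atomic, one common alternative (my sketch, not part of the answer) is to back totalBytes with an AtomicInteger, which gives both atomic increments and cross-thread visibility:
import java.util.concurrent.atomic.AtomicInteger;

public class DownloadStatus {
    // AtomicInteger: incrementAndGet() is atomic and its result is visible to all threads
    private final AtomicInteger totalBytes = new AtomicInteger();
    private volatile boolean isDone;

    public void increment() {
        totalBytes.incrementAndGet();
    }

    public int getTotalBytes() {
        return totalBytes.get();
    }

    public boolean isDone() {
        return isDone;
    }

    public void done() {
        isDone = true;
    }
}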

Unexpected thread behavior. Visibility

I have the following code:
public static boolean turn = true;

public static void main(String[] args) {
    Runnable r1 = new Runnable() {
        public void run() {
            while (true) {
                while (turn) {
                    System.out.print("a");
                    turn = false;
                }
            }
        }
    };
    Runnable r2 = new Runnable() {
        public void run() {
            while (true) {
                while (!turn) {
                    System.out.print("b");
                    turn = true;
                }
            }
        }
    };
    Thread t1 = new Thread(r1);
    Thread t2 = new Thread(r2);
    t1.start();
    t2.start();
}
In class we've learned about "Visibility" problems that may occur when using un-synchronized code.
I understand that, in order to save time, the compiler may decide to keep turn in a CPU cache (or a register) for the loop, meaning that the thread will not be aware that the value of turn was changed in RAM because it doesn't re-read it.
From what I understand, I would expected the code to run like this:
T1 will see turn as true -> enter loop and print -> change turn to false -> gets stuck
T2 will think turn hasn't changed -> will get stuck
I would expect that if T1 starts before T2, only 'a' will be printed and both threads will run in an infinite loop without printing anything else.
However, when I run the code, sometimes I get a few "ababa...." before both threads get stuck.
What am I missing ?
EDIT:
The following code does what I expect: the thread runs in an infinite loop:
public class Test extends Thread {
    boolean keepRunning = true;

    public void run() {
        long count = 0;
        while (keepRunning) {
            count++;
        }
        System.out.println("Thread terminated." + count);
    }

    public static void main(String[] args) throws InterruptedException {
        Test t = new Test();
        t.start();
        Thread.sleep(1000);
        t.keepRunning = false;
        System.out.println("keepRunning set to false.");
    }
}
How are they different from each other ?
When I run the code, sometimes I get a few "ababa...." before both threads get stuck.
I suspect that what is happening is that the behavior is changing when the code is JIT compiled. Before JIT compilation the writes are visible because the interpreter is doing write-throughs. After JIT compilation, either the cache flushes or the reads have been optimized away ... because the memory model allows this.
What am I missing ?
What you are missing is that you are expecting unspecified behavior to be consistent. It doesn't have to be. After all, it is unspecified! (This is true, even if my proposed explanation above is incorrect.)
The fact that turn isn't volatile doesn't mean that your code WILL break, just that it MIGHT break. For all we know, the thread could see false or true at any given moment. Caches could just be randomly flushed for no reason in particular, the thread could retain its cache, etc etc.
It could be because your code is experiencing side effects from System.out.print, which internally calls a synchronized method:
private void write(String s) {
    try {
        synchronized (this) {
(Source - DocJar)
The memory effects of synchronized could be flushing the cache and therefore impact your code.
As @Stephen C said, it could also be the JIT, which might hoist the boolean check because it assumes that the value can't change due to another thread.
So all three of the possibilities mentioned so far could be contributing factors to how your code behaves. Visibility is a factor, not a determiner.
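For comparison, a minimal sketch (my addition, not part of the answers): declaring turn volatile makes every write immediately visible to the other thread, so the two runnables keep handing control back and forth and the output alternates "abab..." indefinitely instead of stalling unpredictably:
public class AlternateDemo {
    // volatile: each write to turn is immediately visible to the other thread
    public static volatile boolean turn = true;

    public static void main(String[] args) {
        Runnable r1 = () -> {
            while (true) {
                while (turn) {
                    System.out.print("a");
                    turn = false; // hand control to r2
                }
            }
        };
        Runnable r2 = () -> {
            while (true) {
                while (!turn) {
                    System.out.print("b");
                    turn = true; // hand control back to r1
                }
            }
        };
        new Thread(r1).start();
        new Thread(r2).start();
    }
}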

Java threads counter "issue"?

I was trying to see the impact of thread priority, and when the println in the run method stays commented out, both threads end at the same time. I don't understand this behavior; can you explain? Thank you.
Main.class
public class Main {
    public static void main(String[] args) {
        Test t1 = new Test("Thread #1");
        Test t2 = new Test("Thread #2");
        t1.thread.setPriority(10);
        t2.thread.setPriority(1);
        t1.thread.start();
        t2.thread.start();
        try {
            t1.thread.join();
            t2.thread.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println(t1.thread.getName() + ": " + t1.count);
        System.out.println(t2.thread.getName() + ": " + t2.count);
        System.out.println("End of main thread.");
    }
}
Test.class
public class Test implements Runnable {
    public Thread thread;
    static boolean stop = false;
    int count = 0;

    public Test(String name) {
        thread = new Thread(this, name);
    }

    @Override
    public void run() {
        for (int i = 0; i < 10000000 && stop == false; i++) {
            count = i;
            //System.out.println(count + " " + thread.getName());
        }
        stop = true;
        System.out.println("End of " + thread.getName());
    }
}
Without println:
End of Thread #1
End of Thread #2
Thread #1: 9999999
Thread #2: 9999999
End of main thread.

With println:
End of Thread #1
End of Thread #2
Thread #1: 9999999
Thread #2: 3265646
End of main thread.
Your two threads access a shared mutable variable without proper synchronization. In this case, there is no guarantee about when (or whether at all) one thread will learn about a change made by another thread. In your case, the change made by one thread is not noticed by the other at all. Note that while for a primitive data type like boolean, not reading the up-to-date value is the worst thing that can happen, for non-primitive data types even worse problems, i.e. inconsistent results, could occur.
Inserting a print statement has the side effect of synchronizing the threads, because the PrintStream performs internal synchronization. Since there is no guarantee that System.out will contain such a synchronizing print stream implementation, this is an implementation-specific side effect.
If you change the declaration of stop to
static volatile boolean stop = false;
the threads will re-read the value from the shared heap in each iteration, reacting immediately to the change, at the cost of reduced overall performance.
Note that there are still no guarantees that this code works as you expect, as there is no guarantee either that the thread priority has any effect or that the threads run in parallel at all. Thread scheduling is implementation- and environment-dependent behavior. E.g. you might find that it is not the thread with the highest priority that finishes its loop first, but simply the thread that happened to be started first.
To clarify: the only purpose of thread/process "priority," in any language environment on any operating system, is to suggest to the OS "which of these two 'ought to be, I think, run first'," if both of them happen to be instantaneously "runnable" and a choice must be made to run only one of them.
(In my experience, the best example of this in-practice is the Unix/Linux nice command, which voluntarily reduces the execution-priority of a command by a noticeable amount.) CPU-intensive workloads which perform little I/O can actually benefit from being given a reduced priority.
As other answerers have already stressed, it is impossible to predict "what will actually happen," and priority can never be used to alter this premise. You must explicitly use appropriate synchronization-primitives to assure that your code executes properly in all situations.
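A minimal sketch of the class with the suggested change applied (everything else kept from the question); with volatile, whichever thread finishes first is now likely to stop the other one early:
public class Test implements Runnable {
    public Thread thread;
    // volatile: the write of stop = true by the finishing thread is seen
    // by the other thread on its next loop-condition check
    static volatile boolean stop = false;
    int count = 0;

    public Test(String name) {
        thread = new Thread(this, name);
    }

    @Override
    public void run() {
        for (int i = 0; i < 10000000 && !stop; i++) {
            count = i;
        }
        stop = true;
        System.out.println("End of " + thread.getName());
    }
}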

Strange behavior of a Java thread associated with System.out [duplicate]

This question already has an answer here:
Loop doesn't see value changed by other thread without a print statement
(1 answer)
Closed 7 years ago.
I have a simple TestThreadClientMode class to test a race condition. I made two attempts:
When I ran the following code with System.out.println(count); commented out in the second thread, the output was:
OS: Windows 8.1
flag done set true
...
and the second thread stayed alive forever, because it never saw the change to the done flag, which was set to true by the main thread.
When I uncommented System.out.println(count); the output was:
OS: Windows 8.1
0
...
190785
190786
flag done set true
Done! Thread-0 true
And the program stopped after 1 second.
How did System.out.println(count); make the second thread see the change in done?
Code
public class TestThreadClientMode {
    private static boolean done;

    public static void main(String[] args) throws InterruptedException {
        new Thread(new Runnable() {
            public void run() {
                int count = 0;
                while (!done) {
                    count++;
                    //System.out.println(count);
                }
                System.out.println("Done! " + Thread.currentThread().getName() + " " + done);
            }
        }).start();
        System.out.println("OS: " + System.getProperty("os.name"));
        Thread.sleep(1000);
        done = true;
        System.out.println("flag done set true ");
    }
}
This is a brilliant example of a memory consistency error. Simply put, the variable is updated, but the first thread does not always see the change. This issue can be solved by making the done variable volatile, declaring it like so:
private static volatile boolean done;
In this case, changes to the variable are visible to all threads and the program always terminates after one second.
Update: It appears that using System.out.println does indeed solve the memory consistency issue. This is because the print function makes use of an underlying stream, which implements synchronization. Synchronization establishes a happens-before relationship, as described in the tutorial I linked, which has the same effect as the volatile variable. (Details from this answer. Also credit to @Chris K for pointing out the side effect of the stream operation.)
How did System.out.println(count); make the second thread see the change in done?
You are witnessing a side effect of println; your program is suffering from a concurrent race condition. When coordinating data between CPUs it is important to tell the Java program that you want to share the data between the CPUs, otherwise the CPUs are free to delay communication with each other.
There are a few ways to do this in Java. The main two are the keywords 'volatile' and 'synchronized' which both insert what hardware guys call 'memory barriers' into your code. Without inserting 'memory barriers' into the code, the behaviour of your concurrent program is not defined. That is, we do not know when 'done' will become visible to the other CPU, and it is thus a race condition.
Here is the implementation of System.out.println; notice the use of synchronized. The synchronized keyword is responsible for placing memory barriers in the generated assembly, which has the side effect of making the variable 'done' visible to the other CPU.
public void println(boolean x) {
    synchronized (this) {
        print(x);
        newLine();
    }
}
The correct fix for your program, is to place a read memory barrier when reading done and a write memory barrier on writing to it. Typically this is done by reading or writing to 'done' from within a synchronized block. In this case, marking the variable done as volatile will have the same net effect. You can also use an AtomicBoolean instead of boolean for the variable.
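A minimal sketch of the AtomicBoolean variant mentioned above (my addition, keeping the class name from the question):
import java.util.concurrent.atomic.AtomicBoolean;

public class TestThreadClientMode {
    // AtomicBoolean gives volatile-style visibility plus atomic updates
    private static final AtomicBoolean done = new AtomicBoolean(false);

    public static void main(String[] args) throws InterruptedException {
        new Thread(() -> {
            int count = 0;
            while (!done.get()) { // get() always observes the latest set()
                count++;
            }
            System.out.println("Done! " + Thread.currentThread().getName() + " " + done.get());
        }).start();

        Thread.sleep(1000);
        done.set(true); // immediately visible to the spinning thread
        System.out.println("flag done set true");
    }
}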
The println() implementation contains an explicit memory barrier:
public void println(String x) {
    synchronized (this) {
        print(x);
        newLine();
    }
}
This causes the invoking thread to refresh all variables.
The following code will have the same behavior as yours:
public void run() {
    int count = 0;
    while (!done) {
        count++;
        synchronized (this) {
        }
    }
    System.out.println("Done! " + Thread.currentThread().getName() + " " + done);
}
In fact, any object can be used as the monitor; the following will also work:
synchronized ("".intern()) {
}
Another way to create an explicit memory barrier is to use volatile, so the following will also work:
new Thread() {
    private volatile int explicitMemoryBarrier;

    public void run() {
        int count = 0;
        while (!done) {
            count++;
            explicitMemoryBarrier = 0;
        }
        System.out.println("Done! " + Thread.currentThread().getName() + " " + done);
    }
}.start();

Making Java Volatile to work

I wrote an example program to understand how volatile works. In the example below, the program works fine even without volatile. Could someone help me understand how the program works fine without volatile?
public class VolatileExp {
    private /*volatile*/ boolean statusFlag = false;

    private void changeState() {
        try {
            int counter = 0;
            while (!statusFlag) {
                System.err.println("counter: " + counter++);
                //Thread.sleep(100);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String args[]) {
        final VolatileExp hello = new VolatileExp();
        Thread t1 = new Thread(new Runnable() {
            @Override
            public void run() {
                hello.changeState();
            }
        });
        Thread t2 = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(2000);
                    hello.statusFlag = true;
                    System.err.println("setting the status flag ");
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
        t1.start();
        t2.start();
    }
}
There are several reasons why you can’t observe missing updates for your non-volatile variable.
As pointed out by others in the comments, you can't rely on the failure happening. In this particular example, your program runs for too short a time, so the optimizer won't make any effort here. Running your program with the -server option will change that.
Further, you are executing a System.err.println(…); statement within the loop which is internally synchronized. Hence, the heap variables will be re-read in every iteration unless the optimizer decides to enlarge the synchronized code block to cover the entire loop (which is rather unlikely as this would imply holding a lock forever). So after the heap value changed, sooner or later, the first thread will eventually read the changed flag.
Since the second thread also invokes System.err.println(…); after changing the flag, it will be forced to actually write the updated value to the heap, so both threads are implicitly synchronized on System.err. But even without doing the printout, the second thread would eventually write the value to the heap, as the thread ends afterwards.
So you have a program that works on most systems due to side effects but is still broken. Note that, in theory, the first thread running in a loop consuming 100% CPU time could prevent the second thread from ever running and thus from ever setting the termination flag. However, most of today's systems will preemptively switch between threads.
Even if it worked every time, relying on it would be very dangerous, as it is not easy to see the side effects on which it relies. Simple changes, like removing the print statement in the first thread and running with the -server option (or on any other JVM performing similar optimizations), could turn the program from accidentally working into likely breaking.
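For completeness, a minimal sketch (my addition) of the variant the question is aiming at: with the flag declared volatile and the print removed from the loop, so that no accidental synchronization hides the effect, the loop is still guaranteed to terminate once the second thread sets the flag:
public class VolatileExp {
    // volatile: the write in t2 happens-before every later read in t1's loop
    private volatile boolean statusFlag = false;

    private void changeState() {
        long counter = 0;
        while (!statusFlag) { // no println here, so no hidden memory-barrier side effect
            counter++;
        }
        System.err.println("loop ended after " + counter + " iterations");
    }

    public static void main(String[] args) {
        final VolatileExp hello = new VolatileExp();
        Thread t1 = new Thread(hello::changeState);
        Thread t2 = new Thread(() -> {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            hello.statusFlag = true;
            System.err.println("setting the status flag");
        });
        t1.start();
        t2.start();
    }
}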
