Strange behavior of a Java thread associated with System.out [duplicate]

This question already has an answer here:
Loop doesn't see value changed by other thread without a print statement
(1 answer)
Closed 7 years ago.
I have a simple TestThreadClientMode class to test a race condition. I made two attempts:
When I ran the following code with System.out.println(count); commented out in the second thread, the output was:
OS: Windows 8.1
flag done set true
...
and the second thread stayed alive forever, because it never saw the change to the done flag, which the main thread had set to true.
When I uncommented System.out.println(count); the output was:
OS: Windows 8.1
0
...
190785
190786
flag done set true
Done! Thread-0 true
And the program stopped after 1 second.
How did System.out.println(count); make the second thread see the change in done?
Code
public class TestThreadClientMode {
private static boolean done;
public static void main(String[] args) throws InterruptedException {
new Thread(new Runnable() {
public void run() {
int count = 0;
while (!done) {
count ++;
//System.out.println(count);
}
System.out.println("Done! " + Thread.currentThread().getName() + " " + done);
}
}).start();
System.out.println("OS: " + System.getProperty("os.name"));
Thread.sleep(1000);
done = true;
System.out.println("flag done set true ");
}
}

This is a classic example of a memory consistency error. Simply put, the variable is updated, but the second thread is not guaranteed to see the change. You can solve this by declaring the done variable volatile:
private static volatile boolean done;
In this case, changes to the variable are visible to all threads and the program always terminates after one second.
Update: It appears that using System.out.println does indeed make the change visible in this program. The print function writes to an underlying stream, which synchronizes internally. Synchronization establishes a happens-before relationship as described in the tutorial I linked, which has the same effect as the volatile variable. (Details from this answer. Also credit to @Chris K for pointing out the side effect of the stream operation.)

How did System.out.println(count); make the second thread see the change in done?
You are witnessing a side effect of println; your program is suffering from a concurrent race condition. When coordinating data between CPUs it is important to tell the Java program that you want to share the data between the CPUs, otherwise the CPUs are free to delay communication with each other.
There are a few ways to do this in Java. The main two are the keywords 'volatile' and 'synchronized' which both insert what hardware guys call 'memory barriers' into your code. Without inserting 'memory barriers' into the code, the behaviour of your concurrent program is not defined. That is, we do not know when 'done' will become visible to the other CPU, and it is thus a race condition.
Here is the implementation of System.out.println; notice the use of synchronized. The synchronized keyword places memory barriers in the generated machine code, which has the side effect of making the variable 'done' visible to the other CPU.
public void println(boolean x) {
synchronized (this) {
print(x);
newLine();
}
}
The correct fix for your program is to place a read memory barrier when reading done and a write memory barrier when writing to it. Typically this is done by reading and writing 'done' from within a synchronized block. In this case, marking the variable done as volatile will have the same net effect. You can also use an AtomicBoolean instead of boolean for the variable.
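As a minimal sketch of the AtomicBoolean option mentioned above (the class name AtomicFlagDemo and the helper workerStops are made up for illustration, and the five-second timeout is an arbitrary assumption):

```java
import java.util.concurrent.atomic.AtomicBoolean;

class AtomicFlagDemo {
    // AtomicBoolean.get() and set() have the memory semantics of a volatile read and write.
    private static final AtomicBoolean done = new AtomicBoolean(false);

    // Starts a spinning worker, sets the flag after a short delay, and reports
    // whether the worker noticed the change and terminated within five seconds.
    static boolean workerStops() {
        done.set(false);
        Thread worker = new Thread(() -> {
            long count = 0;
            while (!done.get()) {
                count++; // no println is needed for visibility
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if the worker never stops
        worker.start();
        try {
            Thread.sleep(100);
            done.set(true);
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStops());
    }
}
```

The loop body contains no printing or locking; the worker's exit relies only on the visibility guarantee of the atomic flag.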

The println() implementation contains an explicit memory barrier:
public void println(String x) {
synchronized (this) {
print(x);
newLine();
}
}
This causes the invoking thread to refresh all variables.
The following code will have the same behavior as yours:
public void run() {
int count = 0;
while (!done) {
count++;
synchronized (this) {
}
}
System.out.println("Done! " + Thread.currentThread().getName() + " " + done);
}
In fact, any object can be used as the monitor, so the following will also work:
synchronized ("".intern()) {
}
Another way to create an explicit memory barrier is to use volatile, so the following will also work:
new Thread() {
private volatile int explicitMemoryBarrier;
public void run() {
int count = 0;
while (!done) {
count++;
explicitMemoryBarrier = 0;
}
System.out.println("Done! " + Thread.currentThread().getName() + " " + done);
}
}.start();

Related

can synchronized guarantee variables outside synchronous code block visible between threads?

I have a question about the visibility of variables between threads (see below). The while loop can't stop after I comment out synchronized (this){}, but when I uncomment it, the while loop stops normally, which suggests that synchronized (this){} can make shared variables visible between threads.
I know the JMM's happens-before principle is used to guarantee that shared variables are visible to each other, but I don't know which happens-before rule the code above satisfies. Or can synchronized guarantee that variables outside the synchronized block are visible between threads?
@Slf4j(topic = "threadVisible")
public class ThreadVisible {
int i = 0;
public void go() {
new Thread(()-> {
while (true) {
synchronized (this){}
if (i != 0) break;
}
}).start();
}
public static void main(String[] args) throws InterruptedException {
ThreadVisible t = new ThreadVisible();
t.go();
Thread.sleep(3000);
t.i = 10; //
log.info("end");
}
}

There is no happens-before relationship in your code between t.i = 10 and if (i != 0), even with the synchronized statement. To create one, you need to synchronize the assignment t.i = 10 too, on the same monitor.
The machine (JVM + OS + CPU) on which you run this probably does more than required when calling synchronized and effectively synchronizes everything. On a different machine you could experience an infinite loop.
Actually you could replace synchronized (this){} by System.out.println("OK") and you would probably get the same behaviour, just because println is synchronized.
Also empty synchronized statements are rarely what you need, although it makes no difference in your case because of the while loop.
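A minimal sketch of what synchronizing both sides could look like. The class name SynchronizedVisibility, the lock field, and the helper waiterSeesWrite are assumptions made for illustration; they are not from the question's code.

```java
class SynchronizedVisibility {
    private final Object lock = new Object();
    private int i = 0; // only read or written while holding lock

    int read() {
        synchronized (lock) {
            return i;
        }
    }

    void write(int value) {
        synchronized (lock) {
            i = value;
        }
    }

    // Returns true if the waiting thread observed the write and exited within five seconds.
    static boolean waiterSeesWrite() {
        SynchronizedVisibility shared = new SynchronizedVisibility();
        Thread waiter = new Thread(() -> {
            while (shared.read() == 0) {
                // spin; each read acquires the same lock that the writer releases
            }
        });
        waiter.setDaemon(true); // do not keep the JVM alive if the waiter never stops
        waiter.start();
        try {
            Thread.sleep(100);
            shared.write(10);
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter stopped: " + waiterSeesWrite());
    }
}
```

Because the writer releases the lock that the reader later acquires, the write happens-before the read that follows it, which is the relationship the empty synchronized block in the question does not provide.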

Is Java caching entire objects or only parts of objects? (Visibility Issues)

I was trying to intentionally create visibility issues with threads and I got unexpected results:
public class DownloadStatus {
private int totalBytes;
private boolean isDone;
public void increment() {
totalBytes++;
}
public int getTotalBytes() {
return totalBytes;
}
public boolean isDone() {
return isDone;
}
public void done() {
isDone = true;
}
}
public class DownloadFileTask implements Runnable {
DownloadStatus status;
public DownloadFileTask(DownloadStatus status) {
this.status = status;
}
@Override
public void run() {
System.out.println("start download");
for (int i = 0; i < 10_000; i++) { //"download" a 10,000 bytes file each time you run
status.increment(); //each byte downloaded - update the status
}
System.out.println("download ended with: " + status.getTotalBytes()); //**NOTE THIS LINE**
status.done();
}
}
//creating threads, one to download, another to wait for the download to be done.
public static void main(String[] args) {
DownloadStatus status = new DownloadStatus();
Thread t1 = new Thread(new DownloadFileTask(status));
Thread t2 = new Thread(() -> {
while (!status.isDone()) {}
System.out.println("DONE!!");
});
t1.start();
t2.start();
}
So, running this would create a visibility problem - the second thread wouldn't see the updated value since it had cached it before it got written back by the first thread - this causes an endless (while) loop, the second thread is constantly checking the cached isDone(). (at least that's how I think it works).
The thing I don't get is why this visibility problem stops happening when I comment out the line from the second code block that calls status.getTotalBytes().
From my understanding both threads start by caching the status object as-is, so the second thread should constantly check his cached value (and essentially not see the new value updated by the first thread).
Why is this line calling a method in the status object causing this visibility issue? (and more interestingly - why not calling it fixes it?)

What you call a "visibility problem" is actually a data race.
A single thread sees the effects of its operations in the order they are written. That is if you update a variable and then read it, you'll always see the updated value within that thread.
The effects of a thread's execution may be different when viewed from another thread. This is mainly related to the language and the underlying hardware architecture. The compiler may reorder instructions, delay memory writes while keeping values in registers, or the values may be kept in a cache before written to the main memory. Without an explicit memory barrier, the value in the main memory would not be updated. That's what you call the "visibility problem".
It is likely that there is a memory barrier in System.out.println. So when you execute that line, all updates up to that point will be committed to main memory, and the other threads can see them. Note that without explicit synchronization, there is still no guarantee that the other threads will see them, because those threads may reuse the value they previously read for that variable. There is nothing in the program that tells the compiler or runtime that the values may be changed by other threads.
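One way to tell the runtime that the flag is shared is to declare it volatile. The sketch below assumes that only the download thread ever calls increment(); the wrapper class DownloadStatusDemo and the helper download are hypothetical names added for illustration.

```java
class DownloadStatusDemo {
    static class DownloadStatus {
        private int totalBytes;          // written only by the download thread (assumption)
        private volatile boolean isDone; // the volatile write also publishes totalBytes

        void increment() { totalBytes++; }
        int getTotalBytes() { return totalBytes; }
        boolean isDone() { return isDone; }
        void done() { isDone = true; }
    }

    // Runs one downloader and one waiter; returns the byte count the waiter saw
    // after observing isDone, or -1 if the waiter did not finish within five seconds.
    static int download(int bytes) {
        DownloadStatus status = new DownloadStatus();
        int[] seen = {-1};
        Thread downloader = new Thread(() -> {
            for (int i = 0; i < bytes; i++) {
                status.increment();
            }
            status.done(); // volatile write, after all increments
        });
        Thread waiter = new Thread(() -> {
            while (!status.isDone()) {
                // spin on the volatile flag
            }
            seen[0] = status.getTotalBytes(); // ordered after the volatile read
        });
        waiter.setDaemon(true); // do not keep the JVM alive if the waiter never stops
        waiter.start();
        downloader.start();
        try {
            downloader.join();
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return waiter.isAlive() ? -1 : seen[0];
    }

    public static void main(String[] args) {
        System.out.println("waiter saw " + download(10_000) + " bytes");
    }
}
```

If several threads called increment(), totalBytes++ would still be a data race, and an AtomicInteger or a synchronized method would be needed instead.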

This is a race condition between two threads. It has nothing to do with the status.getTotalBytes() statement in your code. The scheduler decides which thread runs, and it is by chance that you do not get stuck in the infinite loop after commenting out the println statement. The main problem in your code is that incrementing and setting the status should be a single atomic operation, so replace the definition of the run method as shown below. Second, increment is itself not an atomic operation. You can get unpredictable results if there is no proper synchronization.
@Override
public void run() {
System.out.println("start download");
incrementAndSetStatus();
}
public synchronized void incrementAndSetStatus(){
for (int i = 0; i < 10_000; i++) { //"download" a 10,000 bytes file each time you run
status.increment(); //each byte downloaded - update the status
}
System.out.println("download ended with: " + status.getTotalBytes()); //**NOTE THIS LINE**
status.done();
}

Difference in thread behaviour while accessing variable

I was going through the tutorial https://www.youtube.com/watch?v=SC2jXxOPe5E to understand how volatile variable works and came across a strange behavior.
For the following code snippet
public class VolatileDemo {
static boolean running = false;
public static void main(String a[]) throws InterruptedException {
Thread t = new Thread(new Runnable() {
@Override
public void run() {
while (!running) {
}
System.out.print("Started");
while (running) {
}
System.out.print("Stopped");
}
});
t.start();
Thread.sleep(1000);
running = true;
System.out.print("Starting ");
Thread.sleep(1000);
running = false;
System.out.print("Stopping");
}
}
The output is: Starting Stopping (which is what the video explains)
But for the following code snippet
public class VolatileDemo {
static boolean running = false;
public static void main(String a[]) throws InterruptedException {
Thread t = new Thread(new Runnable() {
@Override
public void run() {
while (!running) {
System.out.print("Flag " + running);
}
System.out.print(" Started");
while (running) {
System.out.print(" Flag " + running);
}
System.out.print(" Stopped");
}
});
t.start();
Thread.sleep(1000);
running = true;
System.out.print(" Starting");
Thread.sleep(1000);
running = false;
System.out.print(" Stopping");
}
}
The output is: Flag: false Starting Started Flag: true Stopping Stopped (ignore the exact output)
My concern here is why the thread was able to read the updated value of 'running' in case 2.
Edit: The difference between the two snippets is the addition of the statement below in the latter case:
System.out.print("Flag " + running);

I think it's important to understand what the purpose of volatile is.
On a multi-processor system with multiple levels of cache, an update to a variable can take some time to reach main memory, and hence other threads, depending on latency and the hardware design. The code is basically supposed to demonstrate this: the running variable is changed, and the output should show some delay between when it was changed and when the thread actually starts. Adding the volatile keyword should reduce this delay, as it forces the write through to main memory to happen immediately instead of when the cache decides to do it.
Please note that volatile doesn't make code thread safe, it just tells the JVM that the variable needs to be written directly to main memory bypassing any delayed writing scheme the caching hardware might otherwise do. It also means that the variable is read from main memory so that stale data is not used. It is for reducing the latency between a variable being updated in one thread and seen to be updated in another. This isn't something you'll need often, and should use sparingly as bypassing the cache will have performance ramifications for your code.
When you've added extra instructions to the code of the thread you've effectively significantly reduced the rate at which it can poll the running variable. I'd say it's likely that the time between running being changed and being updated in main memory is very small, much quicker than the time it takes to output on the console (which takes longer than you'd think). So it is very likely that you won't see what you expect except on the rare occasion that the boolean evaluation in the while loop happens at the exact correct moment.
Unlike deadlock, this particular property of multi-processing is quite hard to demonstrate on a single machine. It's much more likely to happen on a NUMA system architecture (or a cluster) where the cache to memory update latency can be much larger. On a single system the time between a variable being updated in cache and written to main memory is very small.

Java threads counter "issue"?

I was testing the impact of thread priority. When the println in the run method stays commented out, both threads end at the same time, and I don't understand this behavior. Can you explain? Thank you.
Main.class
public class Main {
public static void main(String[] args) {
Test t1 = new Test("Thread #1");
Test t2 = new Test("Thread #2");
t1.thread.setPriority(10);
t2.thread.setPriority(1);
t1.thread.start();
t2.thread.start();
try {
t1.thread.join();
t2.thread.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println(t1.thread.getName() + ": " + t1.count);
System.out.println(t2.thread.getName() + ": " + t2.count);
System.out.println("End of main thread.");
}
}
Test.class
public class Test implements Runnable{
public Thread thread;
static boolean stop = false;
int count = 0;
public Test(String name){
thread = new Thread(this, name);
}
@Override
public void run(){
for(int i = 0; i < 10000000 && stop == false; i++){
count = i;
//System.out.println(count + " " + thread.getName());
}
stop = true;
System.out.println("End of " + thread.getName());
}
}
without println          with println
End of Thread #1         End of Thread #1
End of Thread #2         End of Thread #2
Thread #1: 9999999       Thread #1: 9999999
Thread #2: 9999999       Thread #2: 3265646
End of main thread.      End of main thread.

Your two threads access a shared mutable variable without proper synchronization. In this case, there is no guarantee about when (or whether at all) a thread will learn about a change made by another thread. In your case, the change made by one thread is not noticed by the other at all. Note that while for a primitive data type like boolean, not reading the up-to-date value is the worst thing that can happen, for non-primitive data types even worse problems, i.e. inconsistent results, could occur.
Inserting a print statement has the side effect of synchronizing the threads, because the PrintStream performs an internal synchronization. Since there is no guarantee that System.out will contain such a synchronizing print stream implementation, this is an implementation-specific side effect.
If you change the declaration of stop to
static volatile boolean stop = false;
the threads will re-read the value from the shared heap in each iteration, reacting immediately on the change, at the cost of reduced overall performance.
Note that there are still no guarantees that this code works as you expect, as there is no guarantee that the thread priority has any effect, nor that the threads run in parallel at all. Thread scheduling is implementation- and environment-dependent behavior. For example, you might find that it is not the thread with the highest priority that finishes its loop first, but just the thread that happened to be started first.

To clarify: the only purpose of thread/process "priority," in any language environment on any operating system, is to suggest to the OS "which of these two 'ought to be, I think, run first'," if both of them happen to be instantaneously "runnable" and a choice must be made to run only one of them.
(In my experience, the best example of this in-practice is the Unix/Linux nice command, which voluntarily reduces the execution-priority of a command by a noticeable amount.) CPU-intensive workloads which perform little I/O can actually benefit from being given a reduced priority.
As other answerers have already stressed, it is impossible to predict "what will actually happen," and priority can never be used to alter this premise. You must explicitly use appropriate synchronization-primitives to assure that your code executes properly in all situations.

Understanding Multi-Threading in Java

I am learning multithreading in Java. The problem statement is: suppose there is a data structure that can contain millions of Integers, and I want to search for a key in it. I want to use 2 threads so that if either thread finds the key, it should set a shared boolean variable as false, and both threads should stop further processing.
Here is what I am trying:
public class Test implements Runnable{
private List<Integer> list;
private Boolean value;
private int key = 27;
public Test(List<Integer> list,boolean value) {
this.list=list;
this.value=value;
}
@Override
public void run() {
synchronized (value) {
if(value){
Thread.currentThread().interrupt();
}
for(int i=0;i<list.size();i++){
if(list.get(i)==key){
System.out.println("Found by: "+Thread.currentThread().getName());
value = true;
Thread.currentThread().interrupt();
}
System.out.println(Thread.currentThread().getName() +": "+ list.get(i));
}
}
}
}
And main class is:
public class MainClass {
public static void main(String[] args) {
List<Integer> list = new ArrayList<Integer>(101);
for(int i=0;i<=100;i++){
list.add(i);
}
Boolean value=false;
Thread t1 = new Thread(new Test(list.subList(0, 49),value));
t1.setName("Thread 1");
Thread t2 = new Thread(new Test(list.subList(50, 99),value));
t2.setName("Thread 2");
t1.start();
t2.start();
}
}
What I am expecting:
Both threads will run in an arbitrary order, and when either thread encounters 27, both threads will be interrupted. So thread 1 should not be able to process all of its inputs, and neither should thread 2.
But, what is happening:
Both threads are completing the loop and thread 2 is always starting after Thread 1 completes.
Please highlight the mistakes, I am still learning threading.
My next practice question will be: Access one by one any shared resource

You are wrapping your whole block of code in a block synchronized on the object value. This means that once execution arrives at the synchronized block, the first thread will hold the monitor of the object value, and any subsequent threads will block until the monitor is released.
Note how the whole block:
synchronized (value){
if(value){
Thread.currentThread().interrupt();
}
for(int i=0; i < list.size(); i++){
if(list.get(i) == key){
System.out.println("Found by: "+Thread.currentThread().getName());
value = true;
Thread.currentThread().interrupt();
}
System.out.println(Thread.currentThread().getName() +": "+ list.get(i));
}
}
is wrapped within a synchronized block meaning that only one thread can run that block at once, contrary to your objective.
In this context, I believe you are misunderstanding the principles behind synchronization and "sharing variables". To clarify:
static - is the variable modifier used to make a variable global across objects (i.e. a class variable) such that each object shares the same static variable.
volatile - is the variable modifier used to make writes to a variable visible to other threads. Note that you can still access a variable without this modifier from different threads (this is however dangerous and can lead to race conditions). Threads have no effect on the scope of variables (unless you use a ThreadLocal).
I would just like to add that you can't put volatile everywhere and expect code to be thread-safe. I suggest you read Oracle's guide on synchronization for a more in-depth review of how to establish thread-safety.
In your case, I would remove the synchronization block and declare the shared boolean as a:
private static volatile Boolean value;
Additionally, the task you are trying to perform right now is something a Fork/Join pool is built for. I suggest reading this part of Oracle's java tutorials to see how a Fork/Join pool is used in a divide-and-conquer approach.

By wrapping the main logic of your thread in a synchronized block, execution of the code in that block becomes mutually exclusive. Thread 1 will enter the block, acquiring a lock on "value" and run the entire loop before returning the lock and allowing Thread 2 to run.
If you were to wrap only the checking and setting of the flag "value", then both threads should run the code concurrently.
EDIT: As other people have discussed, making "value" a static volatile boolean within the Test class, and not using the synchronized block at all, would also work. This is because access to volatile variables occurs as if it were in a synchronized block.
Reference: https://docs.oracle.com/javase/tutorial/essential/concurrency/locksync.html

You should not obtain a lock on the found flag - that will just make sure only one thread can run. Instead make the flag static so it is shared and volatile so it cannot be cached.
Also, you should check the flag more often.
private List<Integer> list;
private int key = 27;
private static volatile boolean found;
public Test(List<Integer> list, boolean value) {
this.list = list;
this.found = value;
}
@Override
public void run() {
for (int i = 0; i < list.size(); i++) {
// Has the other thread found it?
if (found) {
Thread.currentThread().interrupt();
}
if (list.get(i) == key) {
System.out.println("Found by: " + Thread.currentThread().getName());
// I found it!
found = true;
Thread.currentThread().interrupt();
}
System.out.println(Thread.currentThread().getName() + ": " + list.get(i));
}
}
BTW: Both of your threads start at 0 and walk up the array - I presume you do this in this code as a demonstration and you either have them work from opposite ends or they walk at random.
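Note that Thread.currentThread().interrupt() only sets the thread's interrupt flag; it does not leave the loop by itself. Below is a minimal sketch in which each thread returns as soon as the flag is set. The names ParallelSearch, Searcher, and search are hypothetical, and the list is split at its midpoint because the end index of subList is exclusive.

```java
import java.util.ArrayList;
import java.util.List;

class ParallelSearch {
    // Shared by all searchers: static so there is one copy, volatile so writes are visible.
    private static volatile boolean found;

    static class Searcher implements Runnable {
        private final List<Integer> part;
        private final int key;

        Searcher(List<Integer> part, int key) {
            this.part = part;
            this.key = key;
        }

        @Override
        public void run() {
            for (int value : part) {
                if (found) {
                    return; // the other thread found the key, stop early
                }
                if (value == key) {
                    found = true;
                    return;
                }
            }
        }
    }

    // Splits the list into two halves, searches them in parallel, and reports the result.
    static boolean search(List<Integer> list, int key) {
        found = false;
        int mid = list.size() / 2;
        Thread t1 = new Thread(new Searcher(list.subList(0, mid), key), "Thread 1");
        Thread t2 = new Thread(new Searcher(list.subList(mid, list.size()), key), "Thread 2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return found;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i <= 100; i++) {
            list.add(i);
        }
        System.out.println("found 27: " + search(list, 27));
    }
}
```

The static flag means only one search can run at a time; a per-search AtomicBoolean passed to both searchers would avoid that limitation.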

Make boolean value static so both threads can access and edit the same variable. You then don't need to pass it in. Then as soon as one thread changes it to true, the second thread will also stop since it is using the same value.
