I have a global variable
volatile int i = 0;
and two threads. Each does the following:
i++;
System.out.print(i);
I receive the following combinations: 12, 21, and 22.
I understand why I don't get 11 (volatile disallows the caching of i) and I also understand 12 and 22.
What I don't understand is how it is possible to get 21?
The only way I can see to get this combination is that the thread that prints later had to be the first to increment i from 0 to 1 and then cache i == 1. Then the other thread incremented i from 1 to 2 and printed 2. Then the first thread printed its cached i == 1. But I thought volatile disallowed caching.
Edit: After running the code 10,000 times I got 11 once. Adding volatile to i does not change the possible combinations at all.
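For reference, here is a minimal runnable sketch of the experiment described above (the class name and thread setup are my own reconstruction, not code from the original post):

    class VolatileIncrement {
        static volatile int i = 0;   // shared, volatile as in the question

        public static void main(String[] args) {
            Runnable task = () -> {
                i++;                    // read-modify-write: not atomic, even though i is volatile
                System.out.print(i);    // prints whatever value i holds at the moment of the call
            };
            new Thread(task).start();
            new Thread(task).start();
        }
    }

Running this repeatedly can produce 12, 21, 22 and, rarely, 11, matching the observations above.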
markspace is right: volatile forbids caching i, but i++ is not atomic. This means that i still gets sort of "cached" in a register during the increment:
r1 = i
//if i changes here r1 does not change
r1 = r1 + 1
i = r1
This is the reason why 11 is still possible. 21 occurs because the prints are not synchronized with the increments (see Karol Dowbecki's answer).
Your code does not guarantee which thread will call System.out first.
The increments and reads of i happen in a well-defined order thanks to the volatile keyword, but the prints don't.
Unfortunately, ++ is not an atomic operation. Although volatile prevents caching, the JVM is still permitted to perform the read, the increment, and the write as separate operations. Thus, the concept you are trying to implement just isn't workable: you need to use synchronized for its mutual exclusion, or use something like AtomicInteger, which does provide an atomic increment operation.
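A minimal sketch of the AtomicInteger route, assuming the same two-thread setup as in the question (the class name is mine):

    import java.util.concurrent.atomic.AtomicInteger;

    class AtomicIncrement {
        static final AtomicInteger i = new AtomicInteger(0);

        public static void main(String[] args) {
            Runnable task = () -> {
                int v = i.incrementAndGet();   // atomic read-modify-write: each thread gets a distinct value
                System.out.print(v);           // print order is still unspecified, so 12 or 21, but never 11
            };
            new Thread(task).start();
            new Thread(task).start();
        }
    }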
The only possible way...is that the thread that prints later had to be the first to increment i from 0 to 1 and then cached i==1...
You are forgetting about what System.out.print(i); does: That statement calls the System.out object's print(...) method with whatever value was stored in i at the moment when the call was started.
So here's one scenario that could happen:
Thread A
increments i (i now equals 1)
Starts to call `print(1)` (note: that's the digit 1, not the variable i)
gets bogged down somewhere deep in the guts...
Thread B
increments i (i=2)
Calls `print(2)`
Gets lucky, and the call runs to completion.
Thread A
finishes its `print(1)` call.
Neither thread is caching the i variable. But, the System.out.print(...) function doesn't know anything about your volatile int i. It only knows about the value (1 or 2) that was passed to it.
Related
I am trying to learn concurrency in Java, but whatever I do, two threads run serially, not in parallel, so I am not able to reproduce the common concurrency issues explained in tutorials (like thread interference and memory consistency errors). Sample code:
public class Synchronization {
    static int v;

    public static void main(String[] args) {
        Runnable r0 = () -> {
            for (int i = 0; i < 10; i++) {
                Synchronization.v++;
                System.out.println(v);
            }
        };
        Runnable r1 = () -> {
            for (int i = 0; i < 10; i++) {
                Synchronization.v--;
                System.out.println(v);
            }
        };
        Thread t0 = new Thread(r0);
        Thread t1 = new Thread(r1);
        t0.start();
        t1.start();
    }
}
This always gives me a result starting from 1 and ending with 0 (whatever the loop length is). For example, the code above gives me the following every time:
1
2
3
4
5
6
7
8
9
10
9
8
7
6
5
4
3
2
1
0
Sometimes the second thread starts first and the results are the same but negative, so it is still running serially.
Tried in both IntelliJ and Eclipse with identical results. The CPU has 2 cores, if it matters.
UPDATE: it finally became reproducible with huge loops (starting from 1_000_000), though still not every time and only with a small final discrepancy. It also seems that making the operations in the loops "heavier", like printing the thread name, makes it more reproducible. Manually adding a sleep to the threads also works, but it makes the experiment less clean, so to say. The reason doesn't seem to be that the first loop finishes before the second starts, because I see both loops printing to the console while continuing to operate and still giving me 0 at the end. The reason seems more like a thread race for the same variable. I will dig deeper into that, thanks.
It seems like the first-started thread just never gives the second a chance in the thread race to take the variable, or the second one just never has time to even start (I couldn't say for sure), so the second will almost* always be waiting until the first loop has finished.
Some heavy operation inside the loop will mix the results, for example (see the placement sketch below):
TimeUnit.MILLISECONDS.sleep(100);
*it is not always true, but you were just lucky in your tests
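For placement, here is a sketch of how that sleep could be slotted into r0 from the Synchronization class in the question (r1 would be changed the same way; the try/catch is required because the Runnable lambda cannot throw InterruptedException):

    // requires: import java.util.concurrent.TimeUnit;
    Runnable r0 = () -> {
        for (int i = 0; i < 10; i++) {
            Synchronization.v++;
            System.out.println(v);
            try {
                TimeUnit.MILLISECONDS.sleep(100);   // artificial "heavy" work so the scheduler interleaves the threads
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    };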
Starting a thread is a heavyweight operation, meaning that it takes some time to perform. Due to that fact, by the time you start the second thread, the first one has already finished.
The reason it sometimes comes out in "reverse order" is down to how the thread scheduler works. Per the spec there are no guarantees about thread execution order; with that in mind, we know it is possible for the second thread to run first (and finish).
Increase the iteration count to something meaningful, like 10000, and see what happens then.
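A sketch of that larger experiment, using the 1_000_000 iterations the questioner found reproducible, with both threads joined before the final value is read (the class name and the joins are my additions; the join gives the main thread a happens-before edge so it sees the workers' writes):

    public class RaceDemo {
        static int v;

        public static void main(String[] args) throws InterruptedException {
            Thread t0 = new Thread(() -> { for (int i = 0; i < 1_000_000; i++) v++; });
            Thread t1 = new Thread(() -> { for (int i = 0; i < 1_000_000; i++) v--; });
            t0.start();
            t1.start();
            t0.join();
            t1.join();
            System.out.println(v);   // usually nonzero: the unsynchronized ++/-- lose updates
        }
    }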
This is called lucky timing, as per Brian Goetz (author of Java Concurrency in Practice). Since there is no synchronization on the static variable v, it is clear that this class is not thread-safe.
I've got this little problem with threads.
int x = 0;

void add() {
    x = x + 1;
}
If we run this in multiple threads, say 4 threads, is the final value of x always 4, or could it be 1, 2, 3, or 4?
Thanks
PS
let's say the atomic operations making up the add are like this:
LOAD A x
ADD A 1
STORE x A
Then the final result will be 4. Am I right, or what have I got wrong?
This is a classic example of a data race.
Now, let's take a closer look at what add() does:
add()
{
    x = x + 1;
}
This translates to:
Give me the most recent value of X and store it in my private workspace
Add 1 to that value that is stored in my private workspace
Copy what I have in my workspace to the memory that I copied from (that is globally accessible).
Now, before we go any further, you need to know about something called a context switch, which is the process by which your operating system divides your processor's time among different threads and processes. This process usually gives each thread a finite amount of processor time (on Windows it is about 40 milliseconds) and then interrupts that work, copies everything the processor has in its registers (thus preserving its state), and switches to the next task. This is called round-robin task scheduling.
You have no control over when your processing is going to be interrupted and control transferred to another thread.
Now imagine you have two threads doing the same:
1. Give me the most recent value of X and store it in my private workspace
2. Add 1 to that value that is stored in my private workspace
3. Copy what I have in my workspace to the memory that I copied from (that is globally accessible).
and X is equal to 1 before any of them runs.
The first thread might execute the first instruction and store in its private workspace the value of X that was most recent at the time it read it: 1. Then a context switch occurs and the operating system interrupts your thread and gives control to the next task in the queue, which happens to be the second thread. The second thread also reads the value of X, which is equal to 1.
Thread number two manages to run to completion - it adds 1 to the value it "downloaded" and "uploads" the calculated value.
The operating system forces a context switch again.
Now the first thread continues execution at the point where it was interrupted. It will still think that the most recent value is 1; it will increment that value by one and save the result of its computation to that memory area. And this is how data races occur: you expect the final result to be 3, but it is 2.
There are many ways to avoid this problem, such as locks/mutexes, compare-and-swap, or atomic operations.
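For example, a minimal sketch of the lock-based fix, with the counter wrapped in a small class (the class name is mine):

    class Counter {
        private int x = 0;

        // synchronized makes the whole read-increment-write one critical section
        synchronized void add() {
            x = x + 1;
        }

        synchronized int get() {
            return x;
        }
    }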
Your code is broken at two levels:
No happens-before relationship imposed between actions of threads;
Atomicity of get-and-increment not enforced.
To solve 1, you can add the volatile modifier. This will still leave the operation non-atomic. To ensure atomicity, you would (preferably) use an AtomicInteger, or synchronized (which involves locking and is not preferred).
As it stands, the result can be any number from 0 to 4 if read from a thread that was not involved in incrementing.
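A sketch of the fixed version along those lines, using an AtomicInteger for atomicity and Thread.join so the reading thread has a happens-before relationship with the increments (the names are mine):

    import java.util.concurrent.atomic.AtomicInteger;

    public class FourIncrements {
        static final AtomicInteger x = new AtomicInteger(0);

        public static void main(String[] args) throws InterruptedException {
            Thread[] threads = new Thread[4];
            for (int i = 0; i < 4; i++) {
                threads[i] = new Thread(() -> x.incrementAndGet());   // atomic: no lost updates
                threads[i].start();
            }
            for (Thread t : threads) {
                t.join();   // happens-before: main now sees all four increments
            }
            System.out.println(x.get());   // always prints 4
        }
    }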
Multi-threaded applications are concurrent (this is the whole point).
t1: LOAD A1 x
t2: LOAD A2 x
t3: LOAD A3 x
t4: LOAD A4 x
t1: ADD A1 1
t2: ADD A2 1
t3: ADD A3 1
t4: ADD A4 1
t1: STORE x A1
t2: STORE x A2
t3: STORE x A3
t4: STORE x A4
A1, A2, A3, A4 are local registers.
With this interleaving the result is 1, but it could also be 2, 3, or 4. And if yet another thread reads x, it could see the old value 0 due to visibility issues.
In an effort to learn about synchronization via Java, I'm just messing around with some simple things like creating a counter shared between threads.
The problem I've run into is that I can't figure out how to print the counter sequentially 100% of the time.
int counterValue = this.counter.incrementAndGet();
System.out.println(this.threadName + ": " + counterValue);
The above increments the AtomicInteger counter, gets the new value, and prints it to the console identified by the thread name that is responsible for that update. The problem occurs when it appears that the incrementAndGet() method is causing the JVM to context switch to another thread for its updates before printing the current thread's updated value. This means that the value gets incremented but not printed until the thread returns to the executing state. This is obvious when looking at this example output:
Thread 3: 4034
Thread 3: 4035
Thread 3: 4036
Thread 1: 3944
Thread 1: 4037
Thread 1: 4039
Thread 1: 4040
Thread 2: 3863
Thread 1: 4041
Thread 1: 4043
You can see that when execution returns to Thread 1, it prints its value and continues updating. The same is evident with Thread 2.
I have a feeling that I'm missing something very obvious.
The problem occurs when it appears that the incrementAndGet() method is causing the JVM to context switch to another thread for its updates before printing the current thread's updated value
This is a race condition that can often happen in these situations. Although the AtomicInteger counter is being incremented properly, there is nothing to stop Thread 2 from being swapped out after the increment happens and before the println is called.
int counterValue = this.counter.incrementAndGet();
// there is nothing stopping a context switch here
System.out.println(this.threadName + ": " + counterValue);
If you want to print the "counter sequentially 100% of the time" you are going to have to synchronize on a lock around both the increment and the println call. Of course if you do that then the AtomicInteger is wasted.
synchronized (counter) {
    System.out.println(this.threadName + ": " + counter.incrementAndGet());
}
If you edit your question to explain why you need the output to be sequential maybe there is a better solution that doesn't have this race condition.
You need to synchronize the whole construct for that:
synchronized (this) {
    int counterValue = this.counter.incrementAndGet();
    System.out.println(this.threadName + ": " + counterValue);
}
In this case, though, you don't have to use an AtomicInteger. A plain int would work (counter++).
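A sketch of that plain-int variant as a complete worker, assuming the per-thread threadName field from the question (the class name, loop bound, and static field layout are my assumptions):

    class SequentialPrinter implements Runnable {
        // shared by all worker threads
        static final Object lock = new Object();
        static int counter = 0;

        private final String threadName;

        SequentialPrinter(String threadName) {
            this.threadName = threadName;
        }

        @Override
        public void run() {
            for (int i = 0; i < 1000; i++) {
                synchronized (lock) {
                    counter++;   // the lock covers the whole read-modify-write, so a plain int suffices
                    System.out.println(threadName + ": " + counter);   // printing inside the same critical section keeps the output sequential
                }
            }
        }
    }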
To print sequentially, the incrementAndGet and the println must both be inside a critical region: a piece of code only one thread may enter at a time, the others being blocked. This can be realised with a binary semaphore, for instance, or with Java's synchronized.
You could turn things on their head and have one thread increment the counter and print it. The other threads would then, inside a critical region, take the counter value one after another. That would be more efficient, as critical regions should remain small and should preferably do no I/O.
I've been reading about non-blocking approaches for some time. Here is a piece of code for a so-called lock-free counter.
public class CasCounter {
    private SimulatedCAS value;

    public int getValue() {
        return value.get();
    }

    public int increment() {
        int v;
        do {
            v = value.get();
        } while (v != value.compareAndSwap(v, v + 1));
        return v + 1;
    }
}
I was just wondering about this loop:
do {
    v = value.get();
} while (v != value.compareAndSwap(v, v + 1));
People say:
So it tries again, and again, until all other threads trying to change the value have done so. This is lock free as no lock is used, but not blocking free as it may have to try again (which is rare) more than once (very rare).
My question is:
How can they be so sure about that? As for me, I can't see any reason why this loop can't be infinite, unless the JVM has some special mechanism to prevent it.
The loop can be infinite (since it can result in starvation for your thread), but the likelihood of that happening is very small. In order to get starvation, you need some other thread to succeed in changing the value you want to update between your read and your store, and for that to happen repeatedly.
It would be possible to write code that triggers starvation, but for real programs it would be unlikely to happen.
Compare-and-swap is usually used when you don't expect write conflicts very often. Say there is a 50% chance of a "miss" when you update: then there is a 25% chance that you will miss in two loops and less than a 0.1% chance that no update would succeed in 10 loops. For real-world examples, a 50% miss rate is very high (basically doing nothing other than updating); as the miss rate is reduced to, say, 1%, the risk of not succeeding in two tries is only 0.01%, and in three tries 0.0001%.
The usage is similar to the following problem:
Set a variable a to 0 and have two threads updating it with a = a + 1 a million times each, concurrently. At the end, a could have any value between 1000000 (every other update was lost to an overwrite) and 2000000 (no update was overwritten).
The closer to 2000000 you get, the more likely the CAS usage is to work, since that means the CAS would quite often see the expected value and be able to set the new value.
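For comparison, here is roughly the same retry loop written against the real java.util.concurrent.atomic API, where compareAndSet returns a boolean rather than the old value (a sketch, not the code from the book):

    import java.util.concurrent.atomic.AtomicInteger;

    public class CasCounter2 {
        private final AtomicInteger value = new AtomicInteger(0);

        public int getValue() {
            return value.get();
        }

        public int increment() {
            int v;
            do {
                v = value.get();                          // read the current value
            } while (!value.compareAndSet(v, v + 1));     // retry only if another thread changed it in between
            return v + 1;
        }
    }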
Edit: I think I have a satisfactory answer now. The bit that confused me was the v != compareAndSwap part. In the actual code, CAS returns true if the value is equal to the compared expression. So even if the first thread is interrupted between the get and the CAS, the second thread will succeed with its swap and exit the method, and the first thread will then be able to complete its CAS on the retry.
Of course, it is possible that if two threads call this method an infinite number of times, one of them will never get the chance to complete its CAS, especially if it has a lower priority, but this is one of the risks of unfair locking (the probability is very low, however). As I've said, a queue mechanism would be able to solve this problem.
Sorry for the initial wrong assumptions.
This is a section from Core Java 8th edition Page 757
CAUTION:
public void flipDone() {
    done = !done;
}
// not atomic
I don't understand why it's not atomic. Can anyone tell me why? Thanks.
The flipDone method is executed by the computer in three distinct steps:
Read the value of memory location labeled done into ALU
Flip the value (i.e true -> false or false -> true)
Write the new value into the memory location done
In Java, a piece of code can potentially be invoked in multiple threads. These threads should be thought of as executing the code concurrently.
Say, memory location labeled done contains the value false initially. Consider two threads calling flipDone, resulting in the following sequence of steps:
Thread 1                               Thread 2
-----------------------------------------------------------------------
Read value of done, false
Invert that value to true
Write the new value, true, to done
                                       Read value of done, now true
                                       Invert that value to false
                                       Write the new value, false, to done
The flipDone method was called twice. done went from false to true and then back again to false - as one would expect. But since the threads execute concurrently, this is not the only ordering of steps. Consider this ordering instead:
Thread 1                               Thread 2
-----------------------------------------------------------------------
Read value of done, false
Invert that value to true              Read value of done, false
Write the new value, true, to done     Invert that value to true
                                       Write the new value, true, to done
While the first thread is inverting the value it read, the second thread, concurrently, is reading the value. Similarly, while the first thread is writing the value to memory, the second thread is inverting the value it read. When Thread 2 finishes, the value of done will be true. Here, although flipDone was called twice, done was flipped only once! One of the updates seems to have been lost. This is the problem that the book is trying to warn you about.
There are three steps here:
Read the value of the boolean done
Invert the value read in step 1
Assign that value to the boolean done
There's nothing to stop another thread from pre-empting in the middle of all this.
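If an atomic flip is needed, the simplest option (my sketch, not code from the book) is to make the whole read-negate-write a critical section by synchronizing the method from the question:

    private boolean done;

    public synchronized void flipDone() {
        done = !done;   // the lock prevents another thread from interleaving between the read and the write
    }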
Because two threads may be calling the flipDone() method at the same time, the state of the done variable is indeterminate.
When you execute
done = !done;
What actually happens is:
1. Get the value of done
2. Apply the not operator
3. Assign it to done
If two threads execute the first step together, they are going to have the same value of done, so instead of changing it two times, they will change it only once.
For instance, if done was initially true, after flipping it twice you'd expect it to still be true; but if the two threads execute step 1 together, it will end up false.
It does not execute as a single indivisible operation; instead, it is a sequence of three discrete operations: fetch the current value of done, negate the value, and write the new value back to done. It is a read-modify-write operation in which the resulting state is derived from the previous state.
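A lock-free alternative (again my own sketch, not from the book) is a compare-and-set loop on an AtomicBoolean, mirroring the CAS counter discussed earlier in this thread:

    import java.util.concurrent.atomic.AtomicBoolean;

    class Flipper {
        private final AtomicBoolean done = new AtomicBoolean(false);

        void flipDone() {
            boolean current;
            do {
                current = done.get();                          // fetch the current value
            } while (!done.compareAndSet(current, !current));  // negate and write atomically; retry if another thread got in first
        }
    }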