Using AtomicInteger as a static shared counter - java

In an effort to learn about synchronization in Java, I'm just messing around with simple things like creating a counter shared between threads.
The problem I've run into is that I can't figure out how to print the counter sequentially 100% of the time.
int counterValue = this.counter.incrementAndGet();
System.out.println(this.threadName + ": " + counterValue);
The above increments the AtomicInteger counter, gets the new value, and prints it to the console, tagged with the name of the thread responsible for that update. The problem is that incrementAndGet() appears to cause the JVM to context-switch to another thread before the current thread's updated value is printed. That means the value gets incremented but not printed until the thread returns to the running state. This is obvious when looking at this example output:
Thread 3: 4034
Thread 3: 4035
Thread 3: 4036
Thread 1: 3944
Thread 1: 4037
Thread 1: 4039
Thread 1: 4040
Thread 2: 3863
Thread 1: 4041
Thread 1: 4043
You can see that when execution returns to Thread 1, it prints its value and continues updating. The same is evident with Thread 2.
I have a feeling that I'm missing something very obvious.

The problem is that incrementAndGet() appears to cause the JVM to context-switch to another thread before the current thread's updated value is printed
This is a race condition that can often happen in these situations. Although the AtomicInteger counter is incremented atomically, there is nothing to stop Thread 2 from being swapped out after the increment happens and before the println is called.
int counterValue = this.counter.incrementAndGet();
// there is nothing stopping a context switch here
System.out.println(this.threadName + ": " + counterValue);
If you want to print the "counter sequentially 100% of the time" you are going to have to synchronize on a lock around both the increment and the println call. Of course if you do that then the AtomicInteger is wasted.
synchronized (counter) {
System.out.println(this.threadName + ": " + counter.incrementAndGet());
}
If you edit your question to explain why you need the output to be sequential maybe there is a better solution that doesn't have this race condition.
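To illustrate the synchronized approach end to end, here is a minimal runnable sketch; the List stands in for the console so the ordering can be verified after the threads finish (the class and variable names are made up for the example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SequentialCounter {
    private static final AtomicInteger counter = new AtomicInteger();
    private static final List<Integer> printed = new ArrayList<>(); // stands in for the console

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                // increment and "print" under one lock, so no context switch
                // can separate the two steps
                synchronized (counter) {
                    printed.add(counter.incrementAndGet());
                }
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // the recorded values are strictly sequential: 1, 2, 3, ..., 2000
        for (int i = 0; i < printed.size(); i++) {
            if (printed.get(i) != i + 1) throw new AssertionError("out of order at " + i);
        }
        System.out.println("sequential: " + printed.size() + " values");
    }
}
```

Note that without the synchronized block the list would still end at 2000, but the recorded order could show exactly the gaps seen in the question's output.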

You need to synchronize the whole construction for that:
synchronized(this) {
int counterValue = this.counter.incrementAndGet();
System.out.println(this.threadName + ": " + counterValue);
}
In this case, though, you don't have to use AtomicInteger. Plain int would work (counter++).

To print sequentially, the incrementAndGet() and the println must both be inside a critical region: a piece of code only one thread may enter at a time, the others being blocked. This can be realised with a binary semaphore, for instance, or with Java's synchronized.
You could also turn things on their head and have a single thread increment the counter and do the printing, while the other threads each take the next counter value, one after another, inside a small critical region. That would be more efficient, as critical regions should stay small and preferably do no I/O.
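A sketch of that inversion, assuming a BlockingQueue as the hand-off (the names and counts are illustrative): worker threads produce counter values inside a tiny critical region that does no I/O, and a single dedicated thread does all the printing.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class SinglePrinter {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicInteger counter = new AtomicInteger();
        int workers = 3, perWorker = 100;

        // the only thread that touches the (slow) output
        Thread printer = new Thread(() -> {
            try {
                for (int i = 0; i < workers * perWorker; i++) {
                    queue.take(); // values arrive in increment order
                }
                System.out.println("printed " + (workers * perWorker) + " values");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        printer.start();

        Thread[] ts = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            ts[w] = new Thread(() -> {
                for (int i = 0; i < perWorker; i++) {
                    // tiny critical region: increment and enqueue, no I/O
                    synchronized (counter) {
                        queue.add(counter.incrementAndGet());
                    }
                }
            });
            ts[w].start();
        }
        for (Thread t : ts) t.join();
        printer.join();
    }
}
```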

Why is starvation caused by notifyAll()?

Here is my code,
class Shared {
    private static int index = 0;

    public synchronized void printThread() {
        try {
            while (true) {
                Thread.sleep(1000);
                System.out.println(Thread.currentThread().getName() + ": " + index++);
                notifyAll();
                // notify();
                wait();
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

class Example13 implements Runnable {
    private Shared shared = new Shared();

    @Override
    public void run() {
        shared.printThread();
    }
}

public class tetest {
    public static void main(String[] args) {
        Example13 r = new Example13();
        Thread t1 = new Thread(r, "Thread 1");
        Thread t2 = new Thread(r, "Thread 2");
        Thread t3 = new Thread(r, "Thread 3");
        Thread t4 = new Thread(r, "Thread 4");
        Thread t5 = new Thread(r, "Thread 5");
        t1.start();
        t2.start();
        t3.start();
        t4.start();
        t5.start();
    }
}
and the result is here
Thread 1: 0
Thread 5: 1
Thread 4: 2
Thread 3: 3
Thread 2: 4
Thread 3: 5
Thread 2: 6
Thread 3: 7
Thread 2: 8
Thread 3: 9
The question is: why are only two of the threads working? I'm confused; I thought notify() randomly wakes up one of the waiting threads, but apparently it doesn't.
Is this starvation? If so, why is it happening? I tried both notify() and notifyAll(), and got the same results for both.
Can anyone help my toasted brain?
This isn't 'starvation'. Your 5 threads are all doing nothing. They all want to 'wake up' - notify() will wake up an arbitrary one. It is not reliably random, either: the JMM does not ascribe an order to this, so one of them will wake up, but you can't rely on it being random (do not use this to generate random numbers), nor can you rely on any specific ordering behaviour.
It's not starvation (it's not: Oh no! Thread 2 and 3 are doing all the work and 4 and 5 are just hanging out doing nothing! That's bad - the system could be more efficient!) - because it doesn't matter which thread 'does the work'. A CPU core is a CPU core, it cares not which thread it ends up running.
Starvation is a different principle. Imagine that instead of Thread.sleep (which means the threads aren't waiting for anything specific, other than some time to elapse), the threads want to print the result of some expensive-ish math operation. If you just let 2 threads each say 'Hello!', then the implementation of System.out is allowed to produce:
HelHellloo!!
So to prevent this, you use locks to create a 'talking stick' of sorts: A thread only gets to print if it has the talking stick. Each of 5 threads will all perform, in a loop, the following operation:
Do an expensive math operation.
Acquire the talking stick.
Print the result of the operation.
Release the talking stick.
Loop back to the top.
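The loop above can be sketched as follows (a minimal illustration; expensiveMath and the thread names are made up for the example):

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TalkingStick {
    private static final Lock stick = new ReentrantLock(); // the "talking stick"

    public static void main(String[] args) throws InterruptedException {
        Runnable worker = () -> {
            for (int i = 0; i < 5; i++) {
                double result = expensiveMath(i); // 1. compute outside the lock
                stick.lock();                     // 2. acquire the talking stick
                try {
                    System.out.println(Thread.currentThread().getName() + ": " + result); // 3. print
                } finally {
                    stick.unlock();               // 4. release the stick
                }
            }                                     // 5. loop back to the top
        };
        Thread a = new Thread(worker, "A"), b = new Thread(worker, "B");
        a.start(); b.start();
        a.join(); b.join();
        System.out.println("done");
    }

    private static double expensiveMath(int n) {
        return Math.pow(n, 3) + Math.sqrt(n); // stands in for real work
    }
}
```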
Now imagine that, despite the fact that the math operation is quite expensive, for whatever reason you have an excruciatingly slow terminal, and the 'print the result of the operation' job takes a really long time to finish.
Now you can run into starvation. Imagine this scenario:
Threads 1-5 all do their expensive math simultaneously.
Arbitrarily, thread 4 ends up nabbing the talking stick.
The other 4 threads soon too want the talking stick but they have to wait; t4 has it. They do nothing now. Twiddling their thumbs (they could be calculating, but they are not!)
After the excruciatingly long time, 4 is done and releases the stick. 1, 2, 3, and 5 dogpile on like it's the All Blacks, and 2 so happens to win the scrum and crawls out of the pile with the stick. 1, 3, and 5 gnash their teeth and go back yet again to waiting for the stick, still not doing any work. Whilst 2 is busy spending the really long time printing results, 4 goes back to the top of the loop and calculates another result. It ends up doing this faster than 2 manages to print, so 4 ends up wanting the talking stick again before 2 is done.
2 is finally done and 1, 3, 4, and 5 all scrum into a pile again. 4 happens to get the stick - java makes absolutely no guarantees about fairness, any one of them can get it, there is also no guarantee of randomness or lack thereof. A JVM is not broken if 4 is destined to win this fight.
Repeat ad nauseam. 2 and 4 keep trading the stick back and forth. 1, 3, and 5 never get to talk.
The above is, as per the JMM, valid - a JVM is not broken if it behaves like this (it would be a tad weird). Any bugs filed about this behaviour would be denied: The lock isn't so-called "fair". Java has fair locks if you want them - in the java.util.concurrent package. Fair locks incur some slight extra bookkeeping cost, the assumption made by the synchronized and wait/notify system is that you don't want to pay this extra cost.
A better solution to the above scenario might be to add a 6th thread that JUST prints, with the 5 threads JUST filling a buffer; at least then the 'print' part is left to a single thread, and that might be faster. But mostly, the bottleneck in this getup is simply the printing - the code gains zero benefit from being multicore. Just having ONE thread do ONE math calculation, print it, do another, and so forth would be better. Or possibly 2 threads: whilst one is printing, the other calculates a number; there's no point in having more than that, since even a single thread can calculate faster than the results can be printed. So in some ways this is just what the situation honestly requires: this hypothetical scenario still prints as fast as it can. And who says the printing needs to be 'fair' in the first place? It's not intrinsic to the problem description that fairness is a requirement. Maybe all the various calculations are equally useful, so it doesn't matter that one thread gets to print more than the others. Say they're bitcoin miners, generating a random number and checking whether it results in a hash with the requisite 7 zeroes at the end, or whatever bitcoin is up to now - who cares that one thread gets more time than another? A 'fair' system is no more likely to successfully mine a block.
Thus, 'fairness' is something you need to explicitly determine you actually need. If you do, AND starvation is an issue, use a fair lock: new ReentrantLock(true) is all you need (that boolean parameter is the fairness parameter - true means you want fairness).
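A minimal sketch of a fair lock in use (ReentrantLock and its boolean fairness constructor are real java.util.concurrent API; the class, thread names, and counts are made up for the example):

```java
import java.util.concurrent.locks.ReentrantLock;

public class FairLockDemo {
    public static void main(String[] args) throws InterruptedException {
        // true = fair: the longest-waiting thread acquires the lock next
        ReentrantLock fair = new ReentrantLock(true);
        System.out.println("isFair: " + fair.isFair());

        int[] turns = new int[2];
        Runnable r = () -> {
            int id = Thread.currentThread().getName().equals("t0") ? 0 : 1;
            for (int i = 0; i < 1000; i++) {
                fair.lock();
                try {
                    turns[id]++; // each acquisition counted per thread
                } finally {
                    fair.unlock();
                }
            }
        };
        Thread t0 = new Thread(r, "t0"), t1 = new Thread(r, "t1");
        t0.start(); t1.start();
        t0.join(); t1.join();
        System.out.println("t0: " + turns[0] + ", t1: " + turns[1]);
    }
}
```

With the fair lock, waiting threads are granted the lock in arrival order, at the cost of some extra bookkeeping per acquisition.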

Java parallel volatile i++

I have a global variable
volatile int i = 0;
and two threads. Each does the following:
i++;
System.out.print(i);
I receive the following combinations. 12, 21 and 22.
I understand why I don't get 11 (volatile disallows the caching of i) and I also understand 12 and 22.
What I don't understand is how it is possible to get 21?
The only possible way how you can get this combination is that the thread that prints later had to be the first to increment i from 0 to 1 and then cached i==1. Then the other thread incremented i from 1 to 2 and then printed 2. Then the first thread prints the cached i==1. But I thought that volatile disallow caching.
Edit: After running the code 10,000 times I got 11 once. Adding volatile to i does not change the possible combinations at all.
markspace is right: volatile forbids caching i, but i++ is not atomic. This means that i still gets sort of "cached" in a register during the increment:
r1 = i
//if i changes here r1 does not change
r1 = r1 + 1
i = r1
This is the reason why 11 is still possible. 21 is caused because PrintStreams are not synchronized (see Karol Dowbecki's answer)
Your code does not guarantee which thread will call System.out first.
The increments and reads of i happen in order due to the volatile keyword, but the prints don't.
Unfortunately ++ is not an atomic operation. Despite volatile not allowing caching, it is permitted for the JVM to read, increment, and then write as separate operations. Thus, the concept you are trying to implement just isn't workable. You need to use synchronized for its mutex, or use something like AtomicInteger which does provide an atomic increment operation.
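To make the difference concrete, here is a sketch that runs a volatile int and an AtomicInteger side by side (the class name and iteration counts are made up for the example): the atomic counter always lands on the exact total, while the volatile one can lose updates.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicVsVolatile {
    static volatile int plain = 0;                           // visible, but ++ is not atomic
    static final AtomicInteger atomic = new AtomicInteger(); // atomic read-modify-write

    public static void main(String[] args) throws InterruptedException {
        Runnable r = () -> {
            for (int i = 0; i < 100_000; i++) {
                plain++;                  // three steps: read, add, write - can lose updates
                atomic.incrementAndGet(); // one indivisible step - never loses an update
            }
        };
        Thread a = new Thread(r), b = new Thread(r);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println("atomic: " + atomic.get()); // always exactly 200000
        // plain can only fall short of the atomic total, never exceed it
        System.out.println("plain <= atomic: " + (plain <= atomic.get()));
    }
}
```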
The only possible way...is that the thread that prints later had to be the first to increment i from 0 to 1 and then cached i==1...
You are forgetting about what System.out.print(i); does: That statement calls the System.out object's print(...) method with whatever value was stored in i at the moment when the call was started.
So here's one scenario that could happen:
Thread A
increments i (i now equals 1)
Starts to call `print(1)` //Notice! that's the digit 1, not the letter i.
gets bogged down somewhere deep in the guts...
Thread B
increments i (i=2)
Calls `print(2)`
Gets lucky, and the call runs to completion.
Thread A
finishes its `print(1)` call.
Neither thread is caching the i variable. But, the System.out.print(...) function doesn't know anything about your volatile int i. It only knows about the value (1 or 2) that was passed to it.

Threads run in serial not parallel

I am trying to learn concurrency in Java, but whatever I do, 2 threads run in serial, not parallel, so I am not able to replicate common concurrency issues explained in tutorials (like thread interference and memory consistency errors). Sample code:
public class Synchronization {
    static int v;

    public static void main(String[] args) {
        Runnable r0 = () -> {
            for (int i = 0; i < 10; i++) {
                Synchronization.v++;
                System.out.println(v);
            }
        };
        Runnable r1 = () -> {
            for (int i = 0; i < 10; i++) {
                Synchronization.v--;
                System.out.println(v);
            }
        };
        Thread t0 = new Thread(r0);
        Thread t1 = new Thread(r1);
        t0.start();
        t1.start();
    }
}
This always give me a result starting from 1 and ending with 0 (whatever the loop length is). For example, the code above gives me every time:
1
2
3
4
5
6
7
8
9
10
9
8
7
6
5
4
3
2
1
0
Sometimes, the second thread starts first and the results are the same but negative, so it is still running in serial.
Tried in both Intellij and Eclipse with identical results. CPU has 2 cores if it matters.
UPDATE: it finally became reproducible with huge loops (starting from 1_000_000), though still not every time and only with a small final discrepancy. Making the operations in the loops "heavier", like printing the thread name, also seems to make it more reproducible. Manually adding a sleep to a thread works too, but it makes the experiment less clean, so to say. The reason doesn't seem to be that the first loop finishes before the second starts, because I see both loops printing to the console while continuing to run, and I still get 0 at the end. It looks more like a thread race for the same variable. I will dig deeper into that, thanks.
It seems the first-started thread just never gives the second a chance in the thread race to take the variable, or the second simply never has time to start (hard to say for sure), so the second will almost* always be waiting until the first loop has finished.
Some heavy operation will mix the result:
TimeUnit.MILLISECONDS.sleep(100);
*it is not always true, but you were lucky in your tests
Starting a thread is a heavyweight operation, meaning it takes some time to perform. Due to that fact, by the time you start the second thread, the first is already finished.
The reason the output is sometimes in "reverse order" is how the thread scheduler works. Per the specs, there are no guarantees about thread execution order - with that in mind, we know it is possible for the second thread to run first (and finish).
Increase the iteration count to something meaningful, like 10000, and see what happens then.
This is called lucky timing, as per Brian Goetz (author of Java Concurrency in Practice). Since there is no synchronization on the static variable v, it is clear that this class is not thread-safe.
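One way to make the interference more reproducible is to use a start gate so both loops genuinely overlap rather than running back to back (a sketch; CountDownLatch is real java.util.concurrent API, the class name and counts are illustrative):

```java
import java.util.concurrent.CountDownLatch;

public class RaceDemo {
    static int v; // shared and unsynchronized on purpose

    public static void main(String[] args) throws InterruptedException {
        final int N = 1_000_000;
        CountDownLatch start = new CountDownLatch(1); // start gate both threads wait on
        Runnable inc = () -> { await(start); for (int i = 0; i < N; i++) v++; };
        Runnable dec = () -> { await(start); for (int i = 0; i < N; i++) v--; };
        Thread t0 = new Thread(inc), t1 = new Thread(dec);
        t0.start();
        t1.start();
        start.countDown(); // release both at once so the loops truly overlap
        t0.join();
        t1.join();
        // with no synchronization the result is usually NOT 0, but it is
        // always bounded by the iteration count in either direction
        System.out.println("in range: " + (v >= -N && v <= N));
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```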

Thread concurrency issue even within one single command?

I am slightly surprised by what I get if I compile and run the following (horrible non-synchronized) Java SE program.
public class ThreadRace {
    // this is the main class.
    public static void main(String[] args) {
        TestRunnable tr = new TestRunnable(); // tr is a Runnable.
        Thread one = new Thread(tr, "thread_one");
        Thread two = new Thread(tr, "thread_two");
        one.start();
        two.start(); // starting two threads, both with associated object tr.
    }
}

class TestRunnable implements Runnable {
    int counter = 0; // Both threads can see this counter.

    public void run() {
        for (int x = 0; x < 1000; x++) {
            counter++;
        }
        // We can't get here until we've added one to counter 1000 times.
        // Can we??
        System.out.println("This is thread " +
            Thread.currentThread().getName() + " and the counter is " + counter);
    }
}
If I run "java ThreadRace" at the command line, then here is my interpretation
of what happens. Two new threads are created and started. The threads have
the same Runnable object instance tr, and so they see the same tr.counter.
Both new threads add one to this counter 1000 times, and then print the value
of the counter.
If I run this lots and lots of times, then usually I get output of the form
This is thread thread_one and the counter is 1000
This is thread thread_two and the counter is 2000
and occasionally I get output of the form
This is thread thread_one and the counter is 1204
This is thread thread_two and the counter is 2000
Note that what happened in this latter case was that thread_one finished
adding one to the counter 1000 times, but thread_two had started adding
one already, before thread_one printed out the value of the counter.
In particular, this output is still comprehensible to me.
However, very occasionally I get something like
This is thread thread_one and the counter is 1723
This is thread thread_two and the counter is 1723
As far as I can see, this "cannot happen". The only way the System.out.println() line
can be reached in either thread, is if the thread has finished counting to 1000.
So I am not bothered if one of the threads reports the counter as being some
random number between 1000 and 2000, but I cannot see how both threads can
get as far as their System.out.println() line (implying both for loops have finished,
surely?) and counter not be 2000 by the time the second statement is printed.
Is what is happening that both threads somehow attempt to do counter++ at exactly
the same time, and one overwrites the other? That is, a thread can even be
interrupted even in the middle of executing a single statement?
The "++" operator is not atomic -- it doesn't happen in one uninterruptible cycle. Think of it like this:
1. Fetch the old value
2. Add one to it
3. Store the new value back
So imagine that you get this sequence:
Thread A: Step 1
Thread B: Step 1
Thread A: Step 2
Thread B: Step 2
Thread A: Step 3
Thread B: Step 3
Both threads think they've incremented the variable, but its value has only increased by one! The second "store back" operation effectively cancels out the result of the first.
Now, truth is, when you add in multiple levels of cache, far weirder things can actually happen; but this is an easy explanation to understand. You can fix these kinds of issues by synchronizing access to the variable: either the whole run() method, or the inside of the loop using a synchronized block. As Jon suggests, you could also use some of the fancier tools in java.util.concurrent.atomic.
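Here is a sketch of the synchronized-block fix applied to code shaped like the question's (the class name and lock object are illustrative; the final count is checked after both threads are joined):

```java
public class SyncCounter implements Runnable {
    private int counter = 0;
    private final Object lock = new Object();

    @Override
    public void run() {
        for (int x = 0; x < 1000; x++) {
            synchronized (lock) { // makes read-add-write one indivisible step
                counter++;
            }
        }
        synchronized (lock) { // also guard the read for the print
            System.out.println(Thread.currentThread().getName() + ": " + counter);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SyncCounter sc = new SyncCounter();
        Thread one = new Thread(sc), two = new Thread(sc);
        one.start(); two.start();
        one.join(); two.join();
        System.out.println("final: " + sc.counter); // no lost updates: always 2000
    }
}
```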
It absolutely can happen. counter++ isn't an atomic operation. Consider it as:
int tmp = counter;
tmp++;
counter = tmp;
Now imagine two threads executing that code at the same about time:
Both read the counter
Both increment their local copy (0 to 1)
Both write 1 into counter
This sort of thing is precisely why java.util.concurrent.atomic exists. Change your code to:
import java.util.concurrent.atomic.AtomicInteger;

class TestRunnable implements Runnable {
    private final AtomicInteger counter = new AtomicInteger();

    public void run() {
        for (int x = 0; x < 1000; x++) {
            counter.incrementAndGet();
        }
        System.out.println("This is thread " +
            Thread.currentThread().getName() + " and the counter is " + counter.get());
    }
}
That code is safe.

Starvation in non-blocking approaches

I've been reading about non-blocking approaches for some time. Here is a piece of code for so called lock-free counter.
public class CasCounter {
    private SimulatedCAS value;

    public int getValue() {
        return value.get();
    }

    public int increment() {
        int v;
        do {
            v = value.get();
        } while (v != value.compareAndSwap(v, v + 1));
        return v + 1;
    }
}
I was just wondering about this loop:
do {
    v = value.get();
} while (v != value.compareAndSwap(v, v + 1));
People say:
So it tries again, and again, until all other threads trying to change the value have done so. This is lock free as no lock is used, but not blocking free as it may have to try again (which is rare) more than once (very rare).
My question is:
How can they be so sure about that? As for me I can't see any reason why this loop can't be infinite, unless JVM has some special mechanisms to solve this.
The loop can be infinite (since it can produce starvation for your thread), but the likelihood of that happening is very small. For you to get starvation, some other thread has to succeed in changing the value you want to update between your read and your store, and that has to happen repeatedly.
It would be possible to write code to trigger starvation, but for real programs it would be unlikely to happen.
Compare-and-swap is usually used when you don't expect write conflicts very often. Say there is a 50% chance of a "miss" when you update: then there is a 25% chance that you miss twice in a row, and less than a 0.1% chance that no update succeeds within 10 loops. For real-world examples, a 50% miss rate is very high (basically doing nothing other than updating); as the miss rate is reduced to, say, 1%, the risk of not succeeding in two tries is only 0.01%, and in 3 tries 0.0001%.
The usage is similar to the following problem
Set a variable a to 0 and have two threads updating it with a = a+1 a million times each concurrently. At the end a could have any answer between 1000000 (every other update was lost due to overwrite) and 2000000 (no update was overwritten).
The closer to 2000000 you get the more likely the CAS usage is to work since that mean that quite often the CAS would see the expected value and be able to set with the new value.
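The same retry loop can be written against the real AtomicInteger API (note that AtomicInteger.compareAndSet returns a boolean, unlike the book's SimulatedCAS, which returns the old value). In this sketch, two threads each perform a million increments through the CAS loop and no update is ever lost; the class name and counts are illustrative:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasLoopDemo {
    private static final AtomicInteger a = new AtomicInteger();

    // the same retry loop as the lock-free counter, built on AtomicInteger's CAS
    private static int increment() {
        int v;
        do {
            v = a.get();
        } while (!a.compareAndSet(v, v + 1)); // retry until no other thread slipped in
        return v + 1;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable r = () -> {
            for (int i = 0; i < 1_000_000; i++) increment();
        };
        Thread t0 = new Thread(r), t1 = new Thread(r);
        t0.start(); t1.start();
        t0.join(); t1.join();
        System.out.println("a = " + a.get()); // every CAS eventually succeeded
    }
}
```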
Edit: I think I have a satisfactory answer now. The bit that confused me was the 'v != compareAndSwap'. In the actual code, CAS returns true if the value is equal to the compared expression. Thus, even if the first thread is interrupted between the get and the CAS, the second thread will succeed in the swap and exit the method, so the first thread will then be able to do its CAS.
Of course, it is possible that if two threads call this method an infinite number of times, one of them will not get the chance to run the CAS at all, especially if it has a lower priority, but this is one of the risks of unfair locking (the probability is very low however). As I've said, a queue mechanism would be able to solve this problem.
Sorry for the initial wrong assumptions.
