Volatile vs static variable in a multithreaded environment - java

I'm currently learning multithreading and I've found something interesting that I can't explain. To the best of my knowledge, if two threads access a static variable, each can keep its own copy in its cache. An update made by Thread1 to the static variable in its local cache won't be reflected in the copy in Thread2's cache.
For this reason my isFound static variable in Cracker.java should be static and volatile, but it doesn't seem to matter, because all threads stop immediately when this exit condition is set to true. Can someone explain this to me?
HashDecryptor.java
public class HashDecryptor {
private List<Thread> threads = new ArrayList<>();
// some other fields
public HashDecryptor() {
createThreads();
}
private void createThreads() {
long max = (long) (Math.pow(26, numberOfChars));
int n = numberOfThreads;
for (int i = 0; i < n; ++i) {
if (i == 0) {
threads.add(new Thread(new Cracker(hashToDecrypt, (max * i / n), (max * (i + 1) / n))));
} else {
threads.add(new Thread(new Cracker(hashToDecrypt, (max * i / n) + 1, (max * (i + 1) / n))));
}
}
}
public void startDecryting() {
for (Thread t : threads) {
t.start();
}
}
}
Cracker.java
public class Cracker implements Runnable {
// Some other fields
private static boolean isFound;
public Cracker(String hashToDecrypt, long start, long end) {
this.hashToDecrypt = hashToDecrypt;
this.start = start;
this.end = end;
}
@Override
public void run() {
decrypt();
}
public void decrypt() {
LocalTime startTime = LocalTime.now();
long counter = start;
while (!isFound && counter <= end) {
if (match(counter)) {
isFound = true;
printData(generatePassword(counter), startTime);
}
counter++;
}
}
}

Static variables: these are used in the context of objects, where an update made through one object is reflected in all other objects of the same class. They say nothing about threads: an update made by one thread to a static variable is not guaranteed to be reflected immediately in all other threads (in their local caches).
If two threads (say t1 and t2) access the same object and update a variable that is declared static, t1 and t2 can each make their own local copy of that object (including its static variables) in their respective caches, so an update made by t1 to the static variable in its local cache won't be reflected in the static variable in t2's cache.
Volatile variables: if two threads (say t1 and t2) access the same object and update a variable that is declared volatile, t1 and t2 can each cache the object locally, except for the variable declared volatile. The volatile variable has only one main copy, which is updated by the different threads, and an update made by one thread to the volatile variable is immediately reflected in the other thread.

For this reason my isFound static variable in Cracker.java should be static and volatile, but it doesn't matter, because all Threads immediately stop when this exit condition is set to true. Can someone explain this to me?
There are a number of ways that you can get incidental synchronization that might account for this. First of all, your application may be contending for CPU resources with other applications running on the hardware and the application may get swapped out. Maybe you have more threads than you have CPUs. Both of these may cause flushing of dirty memory to core memory when the threads get swapped out.
Another likely scenario is that your threads are crossing other memory barriers, for example by calling synchronized methods or accessing other volatile fields. I wonder about this statement in particular, because some of the input/output stream classes synchronize internally.
printData(generatePassword(counter), startTime);
You might try to remove the printing of the data to see if your application behavior changes.
I tell you it works fine, and I did verify it with sysouts. That's the strange thing about this, and that's why I asked this question :)
Perfect example. System.out is a PrintStream, which synchronizes internally, so calling println() there will cause your thread to cross both a read and a write memory barrier, which updates your static field. It's important to note that any memory barrier affects all of the cached memory. Crossing any read memory barrier forces all cached memory to be updated from central memory. Crossing any write memory barrier forces all local dirty memory to be written to central memory.
The problem comes when you remove the System.out calls, or when your application stops calling the synchronized class: then the static variable is no longer reliably updated. So you can't rely on it, but it does happen.
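To avoid depending on the incidental barriers described above, the flag itself can be declared volatile. The following is only a minimal sketch: the class name StopFlagDemo, the method runWorkers and the busy loop are made up for illustration and stand in for the Cracker threads from the question.

```java
import java.util.ArrayList;
import java.util.List;

class StopFlagDemo {
    // volatile: a write by one thread is guaranteed to become visible to the others
    private static volatile boolean isFound = false;

    // Starts the workers, sets the flag, and returns how many workers terminated.
    static int runWorkers(int workerCount) {
        isFound = false;
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Thread t = new Thread(() -> {
                long counter = 0;
                // No println or other synchronization inside the loop.
                while (!isFound) {
                    counter++;
                }
            });
            t.setDaemon(true); // do not keep the JVM alive if something goes wrong
            workers.add(t);
            t.start();
        }
        isFound = true; // volatile write, seen by the workers' subsequent reads
        int stopped = 0;
        for (Thread t : workers) {
            try {
                t.join(5000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (!t.isAlive()) {
                stopped++;
            }
        }
        return stopped;
    }

    public static void main(String[] args) {
        System.out.println("workers stopped: " + runWorkers(4));
    }
}
```

With the volatile modifier removed, the same loop is allowed to spin forever; whether it actually does depends on the JVM and the hardware.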

Related

Why does wait(100) cause synchronized method to fail in multi threaded?

I am working from an article on Baeldung.com. Unfortunately, the article does not explain why this code is not thread-safe.
My goal is to understand how to create a thread safe method with the synchronized keyword.
My actual result is: The count value is 1.
package NotSoThreadSafe;
public class CounterNotSoThreadSafe {
private int count = 0;
public int getCount() { return count; }
// synchronized specifies that the method can only be accessed by 1 thread at a time.
public synchronized void increment() throws InterruptedException { int temp = count; wait(100); count = temp + 1; }
}
My expected result is: The count value should be 10 because of:
I created 10 threads in a pool.
I executed Counter.increment() 10 times.
I make sure I only test after the CountDownLatch reached 0.
Therefore, it should be 10. However, if you release the synchronized lock using Object.wait(100), the method is no longer thread-safe.
package NotSoThreadSafe;
import org.junit.jupiter.api.Test;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import static org.junit.jupiter.api.Assertions.assertEquals;
class CounterNotSoThreadSafeTest {
@Test
void incrementConcurrency() throws InterruptedException {
int numberOfThreads = 10;
ExecutorService service = Executors.newFixedThreadPool(numberOfThreads);
CountDownLatch latch = new CountDownLatch(numberOfThreads);
CounterNotSoThreadSafe counter = new CounterNotSoThreadSafe();
for (int i = 0; i < numberOfThreads; i++) {
service.execute(() -> {
try { counter.increment(); } catch (InterruptedException e) { e.printStackTrace(); }
latch.countDown();
});
}
latch.await();
assertEquals(numberOfThreads, counter.getCount());
}
}
This code has both of the classical concurrency problems: a race condition (a semantic problem) and a data race (a memory model related problem).
Object.wait() releases the object's monitor, so another thread can enter the synchronized block/method while the current one is waiting. The author's intention was obviously to make the method atomic, but Object.wait() breaks the atomicity. As a result, if we call .increment() from, say, 10 threads simultaneously and each thread calls the method 100_000 times, we almost always get count < 10 * 100_000, which is not what we want. This is a race condition, a logical/semantic problem. We can rephrase the code: since we release the monitor (which is equivalent to exiting the synchronized block), the code behaves like two separate synchronized parts:
public void increment() {
int temp = incrementPart1();
incrementPart2(temp);
}
private synchronized int incrementPart1() {
int temp = count;
return temp;
}
private synchronized void incrementPart2(int temp) {
count = temp + 1;
}
and, therefore, our increment does not increment the counter atomically. Now, let's assume that the 1st thread calls incrementPart1, then the 2nd one calls incrementPart1, then the 2nd one calls incrementPart2, and finally the 1st one calls incrementPart2. We made 2 calls to increment(), but the result is 1, not 2.
Another problem is a data race. The Java Memory Model (JMM) is described in the Java Language Specification (JLS). The JMM introduces a happens-before (HB) order between actions such as volatile memory writes/reads, object monitor operations, etc. https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5 HB gives us guarantees that a value written by one thread will be visible to another one. The rules for obtaining these guarantees are also known as safe publication rules. The most common/useful ones are:
Publish the value/reference via a volatile field (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5), or as the consequence of this rule, via the AtomicX classes
Publish the value/reference through a properly locked field (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5)
Use the static initializer to do the initializing stores
(http://docs.oracle.com/javase/specs/jls/se11/html/jls-12.html#jls-12.4)
Initialize the value/reference into a final field, which leads to the freeze action (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.5).
So, to have the counter correctly (as JMM has defined) visible, we must make it volatile
private volatile int count = 0;
or do the read over the same object monitor's synchronization
public synchronized int getCount() { return count; }
I'd say that in practice, on Intel processors, you read the correct value without any of these additional efforts, with just a simple plain read, because of the TSO (Total Store Ordering) they implement. But on a more relaxed architecture, like ARM, you get the problem. Follow the JMM formally to be sure your code is really thread-safe and doesn't contain any data races.
Why is int temp = count; wait(100); count = temp + 1; not thread-safe? One possible flow:
The first thread reads count (0), saves it in temp for later, and waits, allowing the second thread to run (lock released);
the second thread reads count (also 0), saves it in temp, and waits, eventually allowing the first thread to continue;
the first thread increments the value from temp and stores it in count (1);
but the second thread still holds the old value of count (0) in temp. Eventually it will run and store temp+1 (1) into count, overwriting the new value instead of incrementing it.
(Very simplified, considering just 2 threads.)
In short: wait() releases the lock, allowing other synchronized methods to run.
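For comparison, here is a sketch of a counter that does keep the read-modify-write inside one uninterrupted critical section. The names SafeCounter and runIncrements are hypothetical; the pool and latch setup mirrors the test from the question.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SafeCounter {
    private int count = 0;

    // The whole read-modify-write stays inside one critical section:
    // there is no wait() in between that would release the monitor.
    public synchronized void increment() {
        count = count + 1;
    }

    // Reading under the same monitor provides the happens-before edge
    // needed to see the latest value.
    public synchronized int getCount() {
        return count;
    }

    // Submits numberOfThreads increments to a pool and returns the final count.
    static int runIncrements(int numberOfThreads) {
        ExecutorService service = Executors.newFixedThreadPool(numberOfThreads);
        CountDownLatch latch = new CountDownLatch(numberOfThreads);
        SafeCounter counter = new SafeCounter();
        for (int i = 0; i < numberOfThreads; i++) {
            service.execute(() -> {
                counter.increment();
                latch.countDown();
            });
        }
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            service.shutdown();
        }
        return counter.getCount();
    }

    public static void main(String[] args) {
        System.out.println(runIncrements(10)); // prints 10
    }
}
```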

When does a thread-locally cached variable become consistent with "main memory" after being updated?

I am totally puzzled with the two samples.
public class VTest {
private static /*volatile*/ boolean leap = true;
public static void main(String[] args) throws InterruptedException {
Thread t2 = new Thread(new Runnable() {
@Override
public void run() {
while (leap) {
}
}
});
t2.start();
Thread.sleep(3000);
leap = false;
}
}
In this case, t2 is not able to stop, as leap was stored locally so that t2 can't access the leap updated in main thread.
public class VTest2 {
private static int m = 0;
public static void main(String[] args) throws InterruptedException {
Thread t2 = new Thread(new Runnable() {
@Override
public void run() {
for (int i = 0; i < 10000; ++i) ++m;
}
});
t2.start();
for (int i = 0; i < 10000; ++i) ++m;
Thread.sleep(3000);
System.out.println(m);
}
}
But in this case m is always 20000. Why isn't it 10000?
Any answer will be appreciated.
It's not really a matter of "when". Because of the way that m is declared, the two threads have no reason to believe that they need to consider the value in main memory.
Consider that ++m is not an atomic operation, but is rather:
A read
An increment
A write
Because the thread doesn't know it needs to read from or flush to main memory, there is no guarantee as to how it is executed:
Perhaps it reads from main memory each time, and flushes to main memory each time
Perhaps it reads from main memory just once, and doesn't flush to main memory when it writes
Perhaps it reads from/writes to main memory on some iterations of the loop
(...many other ways)
So, essentially, the answer is that there is no guarantee that the value is read from or written to main memory, ever.
If you declare m as volatile, that gives you some guarantees: that m is definitely read from main memory, and definitely flushed to main memory. However, because ++m isn't atomic, there is no guarantee that you get 20000 at the end (it's possible it could be 2, at worst), because the work of the two threads can intersperse (e.g. both threads read the same value of m, increment it, and both write back the same value m+1).
To do this correctly, you need to ensure that:
++m is executed atomically
The value is guaranteed to be visible.
The easiest way of doing this would be to use an AtomicInteger instead; however, you could mutually synchronize the increments:
synchronized (VTest2.class) {
++m;
}
You then also need to synchronize the final read, in order to ensure you are definitely seeing the last value written by t2:
synchronized (VTest2.class) {
System.out.println(m);
}
In this case, t2 is not able to stop, as leap was stored locally so that t2 can't access the leap updated in main thread.
That's not really the case: the leap variable was not stored "locally" by the thread. It's still a shared static variable. However, because it is not marked as volatile, and there is no synchronization happening whatsoever, the JVM (the JIT in particular) is free to do optimization to avoid loading it. I believe in this case it is removing the check on the variable.
Note: The second code incrementing m is not thread-safe: try increasing the loop to millions to test that, it will almost never match the expected sum.
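A sketch of the AtomicInteger variant mentioned above, under the assumption that two threads each perform the same number of increments. The class name AtomicVTest2 and the method countWithTwoThreads are invented for this example; join() replaces the Thread.sleep(3000) from the question so that the final read is guaranteed to happen after t2 has finished.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicVTest2 {
    // Two threads each increment the shared counter perThread times.
    static int countWithTwoThreads(int perThread) {
        AtomicInteger m = new AtomicInteger();
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < perThread; ++i) {
                m.incrementAndGet(); // atomic read-modify-write
            }
        });
        t2.start();
        for (int i = 0; i < perThread; ++i) {
            m.incrementAndGet();
        }
        try {
            t2.join(); // everything t2 did happens-before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return m.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(10000)); // prints 20000
    }
}
```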

Are java getters thread-safe?

Is it okay to synchronize all methods which mutate the state of an object, but not synchronize anything which is atomic? In this case, just returning a field?
Consider:
public class A
{
private int a = 5;
private static final Object lock = new Object();
public void incrementA()
{
synchronized(lock)
{
a += 1;
}
}
public int getA()
{
return a;
}
}
I've heard people argue that it's possible for getA() and incrementA() to be called at roughly the same time and have getA() return the wrong thing. However, it seems that in the case where they're called at the same time, you can get the wrong thing even if the getter is synchronized. In fact the "right thing" doesn't even seem defined if these are called concurrently. The big thing for me is that the state remains consistent.
I've also heard talk about JIT optimizations. Given an instance of the above class and the following code(the code would be depending on a to be set in another thread):
while(myA.getA() < 10)
{
//incrementA is not called here
}
it is apparently a legal JIT optimization to change this to:
int temp = myA.getA();
while(temp < 10)
{
//incrementA is not called here
}
which can obviously result in an infinite loop.
Why is this a legal optimization? Would this be illegal if a was volatile?
Update
I did a little bit of testing into this.
public class Test
{
private int a = 5;
private static final Object lock = new Object();
public void incrementA()
{
synchronized(lock)
{
a += 1;
}
}
public int getA()
{
return a;
}
public static void main(String[] args)
{
final Test myA = new Test();
Thread t = new Thread(new Runnable(){
public void run() {
while(true)
{
try {
Thread.sleep(100);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
myA.incrementA();
}
}});
t.start();
while(myA.getA() < 15)
{
System.out.println(myA.getA());
}
}
}
Using several different sleep times, this worked even when a is not volatile. This of course isn't conclusive, it still may be legal. Does anyone have some examples that could trigger such JIT behaviour?
Is it okay to synchronize all methods which mutate the state of an object, but not synchronize anything which is atomic? In this case, just returning a field?
Depends on the particulars. It is important to realize that synchronization does two important things. It is not just about atomicity but it is also required because of memory synchronization. If one thread updates the a field, then other threads may not see the update because of memory caching on the local processor. Making the int a field be volatile solves this problem. Making both the get and the set method be synchronized will as well but it is more expensive.
If you want to be able to change and read a from multiple threads, the best mechanism is to use an AtomicInteger.
private AtomicInteger a = new AtomicInteger(5);
public void setA(int a) {
// no need to synchronize because of the magic of the `AtomicInteger` code
this.a.set(a);
}
public int getA() {
// AtomicInteger also takes care of the memory synchronization
return a.get();
}
I've heard people argue that it's possible for getA() and setA() to be called at roughly the same time and have getA() return the wrong thing.
This is true but you can get the wrong value if getA() is called after setA() as well. A bad cache value can stick forever.
which can obviously result in an infinite loop. Why is this a legal optimization?
It is a legal optimization because threads running asynchronously with their own memory cache is one of the important reasons why you see performance improvements with them. If all memory accesses were synchronized with main memory, the per-CPU memory caches would not be used and threaded programs would run a lot slower.
Would this be illegal if a was volatile?
It is not legal if there is some way for a to be altered – by another thread possibly. If a was final then the JIT could make that optimization. If a was volatile or the get method marked as synchronized then it would certainly not be a legal optimization.
It's not thread safe because that getter does not ensure that a thread will see the latest value, as the value may be stale. Having the getter be synchronized ensures that any thread calling the getter will see the latest value instead of a possible stale one.
You basically have two options:
1) Make your int volatile
2) Use an atomic type like AtomicInteger
Using a normal int without synchronization is not thread-safe at all.
Your best solution is to use an AtomicInteger, they were basically designed for exactly this use case.
If this is more of a theoretical "could this be done question", I think something like the following would be safe (but still not perform as well as an AtomicInteger):
public class A
{
private volatile int a = 5;
private static final Object lock = new Object();
public void incrementA()
{
synchronized(lock)
{
final int tmp = a + 1;
a = tmp;
}
}
public int getA()
{
return a;
}
}
The short answer is your example will be thread-safe, if
the variable is declared as volatile, or
the getter is declared as synchronized.
The reason that your example class A is not thread-safe is that one can create a program using it that doesn't have a "well-formed execution" (see JLS 17.4.7).
For instance, consider
// in thread #1 (myA is an instance of A shared by both threads)
int a1 = myA.getA();
Thread.sleep(...);
int a2 = myA.getA();
if (a1 == a2) {
System.out.println("no increment");
}
// in thread #2
myA.incrementA();
in the scenario that the increment happens during the sleep.
For this execution to be well-formed, there must be a "happens before" (HB) chain between the assignment to a in incrementA called by thread #2, and the subsequent read of a in getA called by thread #1.
If the two threads synchronize using the same lock object, then there is a HB between one thread releasing the lock and a second thread acquiring the lock. So we get this:
thread #2 acquires lock --HB-->
thread #2 reads a --HB-->
thread #2 writes a --HB-->
thread #2 releases lock --HB-->
thread #1 acquires lock --HB-->
thread #1 reads a
If two threads share a volatile variable, there is a HB between any write and any subsequent read (without an intervening write). So we typically get this:
thread #2 acquires lock --HB-->
thread #2 reads a --HB-->
thread #2 writes a --HB-->
thread #1 reads a
Note that incrementA needs to be synchronized to avoid race conditions with other threads calling incrementA.
If neither of the above is true, we get this:
thread #2 acquires lock --HB-->
thread #2 reads a --HB-->
thread #2 writes a // No HB!!
thread #1 reads a
Since there is no HB between the write by thread #2 and the subsequent read by thread #1, the JLS does not guarantee that the latter will see the value written by the former.
Note that this is a simplified version of the rules. For the complete version, you need to read all of JLS Chapter 17.
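The first case (both threads synchronizing on the same lock) can be sketched as follows. The class LockedA and the method waitForTen are hypothetical, and the lock is an instance field here rather than the static one used in the question. Because the getter acquires the same lock as incrementA, each read in the polling loop is ordered after the writer's preceding unlock, so the loop is guaranteed to observe the increments.

```java
class LockedA {
    private int a = 5;
    private final Object lock = new Object();

    public void incrementA() {
        synchronized (lock) {
            a += 1;
        }
    }

    // Same lock as incrementA: unlock in the writer happens-before lock in the reader.
    public int getA() {
        synchronized (lock) {
            return a;
        }
    }

    // A writer thread increments five times (5 -> 10); the caller polls until it sees 10.
    static int waitForTen() {
        LockedA shared = new LockedA();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                shared.incrementA();
            }
        });
        writer.start();
        while (shared.getA() < 10) {
            Thread.yield(); // polling; the synchronized read cannot be hoisted out of the loop
        }
        return shared.getA();
    }

    public static void main(String[] args) {
        System.out.println(waitForTen()); // prints 10
    }
}
```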

Regarding volatile variable usage

I am new to volatile variables, but I was reading an article which states: "2) Volatile variable can be used as an alternative way of achieving synchronization in Java in some cases, like visibility. With a volatile variable it is guaranteed that all reader threads will see the updated value of the volatile variable once the write operation has completed; without the volatile keyword, different reader threads may see different values."
Could you please show this with a small Java program, so that it is also clear to me technically?
My understanding so far is:
Volatile means each thread accessing the variable will have its own private copy, which is the same as the original one. But if the thread changes that private copy, the change will not be reflected in the original one.
public class Test1 {
volatile int i=0,j=0;
public void add1()
{
i++;
j++;
}
public void printing(){
System.out.println("i=="+i+ "j=="+j);
}
public static void main(String[] args) {
Test1 t1=new Test1();
Test1 t2=new Test1();
t1.add1();//for t1 ,i=1,j=1
t2.printing();//for t2 value of i and j is still,i=0,j=0
t1.printing();//prints the value of i and j for t1,i.e i=1,j=1
t2.add1();////for t2 value of i and j is changed to i=1;j=1
t2.printing();//prints the value of i and j for t2i=1;j=1
}
}
I request you guys could you please show a small program of volatile functionality, so technically also it is clear to me
A volatile variable, as you have read, guarantees visibility but doesn't guarantee atomicity, which is another important aspect of thread safety. I will try to explain with an example.
public class Counter {
private volatile int counter;
public int increment() {
System.out.println("Counter:"+counter); // reading always gives the correct value
return counter++; // atomicity isn't guaranteed, this will eventually lead to skew/error in the expected value of counter.
}
public int decrement() {
System.out.println("Counter:"+counter);
return counter--; // same problem as increment(): the read-modify-write is not atomic
}
}
In the example, you can see that a read always gives the current value of counter at that instant. However, compound operations (such as evaluating a condition and then acting on it, or reading a value and writing back a result based on it) are not atomic, so thread safety is not guaranteed for them.
You can refer to this answer for additional details.
Volatile means each Thread Access the variable will have its own
private copy which is same as original one.But if the Thread is going
to change that private copy,then original one will not get reflected.
I am not sure I understand you correctly, but volatile fields imply they are read and written from the main memory accessible to all threads - there are no thread specific copies (caching) of the variable.
From JLS,
A field may be declared volatile, in which case the Java Memory Model
ensures that all threads see a consistent value for the variable
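Since the question asks for a small program, here is a minimal sketch of the visibility guarantee. All names (VolatileVisibilityDemo, payload, ready, publishAndRead) are made up for this example. The main thread writes a plain field and then sets a volatile flag; the reader thread spins on the flag and then reads the plain field. The volatile write/read pair creates the happens-before edge, so the reader is guaranteed to see the payload written before the flag.

```java
class VolatileVisibilityDemo {
    private static int payload = 0;                 // plain, non-volatile field
    private static volatile boolean ready = false;  // volatile flag used for publication

    // Publishes 42 through the volatile flag and returns what the reader thread saw.
    static int publishAndRead() {
        payload = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {      // volatile read on every iteration
                Thread.yield();
            }
            seen[0] = payload;    // ordered after the volatile read, so it sees 42
        });
        reader.start();
        payload = 42;   // plain write ...
        ready = true;   // ... made visible by the volatile write that follows it
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```

If ready were not volatile, the reader would be allowed to loop forever, or to leave the loop and still read a stale payload.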

Java - using AtomicInteger vs Static int

While using multiple threads, I have learnt to use static variables whenever I want a counter that will be accessed by multiple threads.
Example:
static int count=0; Then later in the program I use it as count++;.
Today I came across something called AtomicInteger. I also learned that it is thread-safe and that one of its methods, getAndIncrement(), can be used to achieve the same effect.
Could anyone help me understand the difference between using a static AtomicInteger and a static int count?
- AtomicInteger is used to perform atomic operations on an integer; it's an alternative when you don't want to use the synchronized keyword.
- Using volatile on a field that is updated non-atomically will give inconsistent results.
volatile int count;
public void inc(){
count++; // read, add, write: three steps, not atomic
}
- static makes a variable shared by all instances of the class, but it will still produce inconsistent results in a multithreaded environment.
So try these when you are in a multithreaded environment:
1. It's always better to follow Brian's Rule:
When ever we write a variable which is next to be read by another
thread, or when we are reading a variable which is written just by
another thread, it needs to be synchronized. The shared fields must be
made private, making the read and write methods/atomic statements
synchronized.
2. Second option is using the Atomic Classes, like AtomicInteger, AtomicLong, AtomicReference, etc.
I agree with @Kumar's answer.
Volatile is not sufficient - it has some implications for the memory order, but does not ensure atomicity of ++.
The really difficult thing about multi-threaded programming is that problems may not show up in any reasonable amount of testing. I wrote a program to demonstrate the issue, but it has threads that do nothing but increment counters. Even so, the counts are within about 1% of the right answer. In a real program, in which the threads have other work to do, there may be a very low probability of two threads doing the ++ close enough to simultaneously to show the problem. Multi-thread correctness cannot be tested in, it has to be designed in.
This program does the same counting task using a simple static int, a volatile int, and an AtomicInteger. Only the AtomicInteger consistently gets the right answer. A typical output on a multiprocessor with 4 dual-threaded cores is:
count: 1981788 volatileCount: 1982139 atomicCount: 2000000 Expected count: 2000000
Here's the source code:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
public class Test {
private static int COUNTS_PER_THREAD = 1000000;
private static int THREADS = 2;
private static int count = 0;
private static volatile int volatileCount = 0;
private static AtomicInteger atomicCount = new AtomicInteger();
public static void main(String[] args) throws InterruptedException {
List<Thread> threads = new ArrayList<Thread>(THREADS);
for (int i = 0; i < THREADS; i++) {
threads.add(new Thread(new Counter()));
}
for (Thread t : threads) {
t.start();
}
for (Thread t : threads) {
t.join();
}
System.out.println("count: " + count + " volatileCount: " + volatileCount + " atomicCount: "
+ atomicCount + " Expected count: "
+ (THREADS * COUNTS_PER_THREAD));
}
private static class Counter implements Runnable {
@Override
public void run() {
for (int i = 0; i < COUNTS_PER_THREAD; i++) {
count++;
volatileCount++;
atomicCount.incrementAndGet();
}
}
}
}
"static" makes the variable class-level. That means that if you define "static int count" in a class, no matter how many instances of the class you create, all instances use the same "count". AtomicInteger, on the other hand, is a normal class; it just adds thread-safe atomic operations.
With AtomicInteger, incrementAndGet() is guaranteed to be atomic.
If you use count++ to get the previous value, it is not guaranteed to be atomic.
Something that I missed from your question, and that was stated by another answer: static has nothing to do with threading.
A static int counter would give you inconsistent results in a multithreaded environment unless you make the increment block synchronized. Making the counter volatile is not enough on its own, because count++ is still not atomic.
The atomic classes give you lock-free, thread-safe programming on single variables.
More details are in the java.util.concurrent.atomic package documentation.
I think there is no guarantee that count++ sees the newest value. count++ must read the value of count. Another thread may have written a new value to count but kept it in its thread-local cache, i.e. not flushed it to main memory. Also, your thread that reads count has no guarantee of reading from main memory, i.e. of refreshing from main memory. synchronized guarantees that.
AtomicInteger makes get-and-increment an atomic operation. It can be thought of as a sequence in a database. It provides utility methods to increment and decrement the value and to add a delta.
A static int can cause issues if you read the counter, do some processing, and then update it. AtomicInteger handles simple increments easily, but a separate get() followed by set() is still not atomic, so updating the counter based on processing results needs more care (for example, compareAndSet in a retry loop).
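For updates that depend on the current value, AtomicInteger offers updateAndGet (Java 8 and later), which retries a compare-and-set internally. The sketch below is an illustration under assumptions: the class BoundedCounterDemo, the method runBounded and the "never exceed a limit" rule are invented for this example and are not part of the question.

```java
import java.util.concurrent.atomic.AtomicInteger;

class BoundedCounterDemo {
    // Two threads each attempt `attempts` increments, but the counter must never exceed `limit`.
    static int runBounded(int attempts, int limit) {
        AtomicInteger counter = new AtomicInteger();
        Runnable task = () -> {
            for (int i = 0; i < attempts; i++) {
                // The new value depends on the current one. updateAndGet re-applies
                // the function if another thread changed the value in between.
                counter.updateAndGet(v -> v < limit ? v + 1 : v);
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runBounded(1000, 1500)); // prints 1500
    }
}
```

The function passed to updateAndGet should be free of side effects, because it may be applied more than once when threads contend.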
