I was asked by an interviewer whether there is any danger in not using volatile if we know for sure that threads will never interfere.
e.g. we have:
int i = 10;
// Thread 1
i++;
// await some time and switch to Thread 2
getI();
I don't use any synchronization.
Is there any danger of the second thread receiving an outdated value of i?
Without volatile, synchronized, or a read/write barrier, there is no guarantee that another thread will ever see a change you made, no matter how long you wait. In particular, a boolean field's value can be inlined into the code so that no read is actually performed. In theory, int values could be inlined too if the JVM detects that the field is not changed by the thread (I don't believe it actually does this, though).
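To make the boolean-flag case concrete, here is a minimal sketch (class and method names are mine). With volatile, the write to the flag is guaranteed to become visible and the loop stops; without it, the JIT is allowed to hoist the read out of the loop and the worker could spin forever.

```java
// Hypothetical sketch: a stop flag shared between two threads.
public class StopFlag {
    // Remove 'volatile' here and the worker may never observe the write.
    private static volatile boolean running = true;

    static boolean demo() throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) { /* spin until the flag is cleared */ }
        });
        worker.start();
        Thread.sleep(50);      // let the worker enter the loop
        running = false;       // volatile write: guaranteed visible to the worker
        worker.join(2000);
        return !worker.isAlive();  // true: the worker saw the write and exited
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("stopped=" + demo());
    }
}
```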
we know for sure that threads will never interfere.
This is not something you can know, unless the reading thread is not running when you perform the update. When a thread starts, it will see any changes that occurred before it started.
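The start rule just mentioned can be sketched like this (class and method names are mine): a write made before Thread.start() is visible to the started thread even though the field is not volatile, because start() establishes a happens-before edge.

```java
// Sketch of the start()/join() visibility rules from the JLS.
public class StartVisibility {
    static int plain = 0;  // deliberately NOT volatile

    static int demo() throws InterruptedException {
        plain = 42;  // this write happens-before the start() call below
        final int[] seen = new int[1];
        Thread t = new Thread(() -> seen[0] = plain);  // guaranteed to see 42
        t.start();
        t.join();    // join() makes the thread's write to seen[0] visible here
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());  // 42
    }
}
```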
You might receive outdated values, yes.
The reason in short is:
Every thread in Java has its own little 'cache'. For performance reasons, threads keep a copy of the 'master' data in their own memory, so you effectively have a main memory and a thread-local one. The volatile keyword forces the thread to access main memory rather than its local copy.
Refer to this for more information: http://www.ibm.com/developerworks/library/j-5things15/
If an interviewer asked me that question, I would answer in terms of the Java Language Specification. There is no "cache" or "main memory" in the JLS. The JLS talks about fields (a.k.a. instance variables and class variables), and it makes very specific guarantees about when an update to a field that happens in thread A will become visible in thread B. Implementation details like "cache" and "memory barriers" can vary from one platform to another, but a program that is correct with respect to the JLS should (in theory) be correct on any Java platform.
I have read the references below:
http://tutorials.jenkov.com/java-concurrency/volatile.html
https://www.geeksforgeeks.org/volatile-keyword-in-java/
https://docs.oracle.com/javase/tutorial/essential/concurrency/atomic.html
But I am still not clear on the expected behavior in the following case:
I have a thread pool in which threads are reused.
I have a non-volatile variable which is accessed by different threads from that thread pool.
The threads run sequentially. (Yes, they could all run on one thread, but just take the case as given, so don't ask why not use one thread.)
The part I am not clear on is this: do I still need volatile to make sure a change made by the previous thread is visible to the next thread's execution?
Does Java flush the thread-local cache to memory after each execution?
And does the thread reload its local cache before each execution?
The Java Memory Model's design can answer your question:
In the Java Memory Model a volatile field has a store barrier inserted after a write to it and a load barrier inserted before a read of it. Qualified final fields of a class have a store barrier inserted after their initialisation to ensure these fields are visible once the constructor completes when a reference to the object is available.
see https://dzone.com/articles/memory-barriersfences
In other words: when a thread tries to read from a volatile variable, the JMM forces all CPUs owning this memory area to write their cached copies back to main memory. Then the CPU loads the value into its local cache if necessary.
And the part I am not clear is that. Do I still need volatile to make sure the change made in previous thread is visible to the next thread execution.
Yes and no. The volatile keyword only makes the value of a variable visible to other threads. If you need only one thread reading and writing at a time, you need synchronization. If you only need visibility, the volatile keyword is enough.
If you do not use the volatile keyword and share the value between threads, it might not be up to date or might even be corrupted. For the long type (and double), the JLS permits a non-volatile write to be performed as two separate 32-bit writes, which are not atomic: another thread could read the value in between those two writes.
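The thread-pool scenario from the question can be sketched like this (class and field names are mine): the tasks are forced to run strictly one after another by waiting on each Future, and they share a volatile field so each task sees the previous task's write.

```java
// Sketch: sequential tasks on a pool sharing a volatile counter.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolVisibility {
    static volatile int shared = 0;

    static int demo() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 5; i++) {
            // get() waits for each task, so the increments never overlap
            pool.submit(() -> shared++).get();
        }
        pool.shutdown();
        return shared;  // 5: each task observed the previous task's value
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Note that Future.get() itself establishes a happens-before edge per the java.util.concurrent memory-consistency guarantees, so in this strictly sequenced setup the volatile is an extra belt-and-braces measure.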
It doesn't matter whether the threads run sequentially or in parallel: if you don't use the volatile keyword, there is no visibility guarantee. That means there is no guarantee that the other thread will see the latest value, as the thread may read it from a register.
I declared an instance variable as volatile. Say two threads are created by two processors on a multi-core machine, and each thread updates the variable. To ensure
immediate visibility, I believe declaring the variable as volatile is the right choice here, so that an update made by one thread happens in main memory and is visible to the other thread.
Right?
The intention here is to understand the concept in terms of multi-core processors.
I am assuming you are considering using volatile vs. not using any special provisions for concurrency (such as synchronized or AtomicReference).
It is irrelevant whether you are running single-core or multi-core: sharing data between threads is never safe without volatile. There are many more things the runtime is allowed to do without it; basically, it can pretend the accessing thread is the only thread running on the JVM. The thread can read the value once and store it on the call stack forever; a loop that reads the value but never writes it may be transformed so that the value is read only once at the outset and never reconsidered, and so on.
So the message is simple: use volatile—but that's not necessarily all you need to take care of in concurrent code.
It doesn't matter whether it's done by different processors or not. Even when you don't have multiple processors, you can still run into concurrency problems, because context switches may happen at any time.
If a field is not volatile, it may still be in one thread's cache while its context is switched out and the other thread's context switches in. In that case, the thread that just took over the (single) processor will not see that the field has changed.
Since these things can happen even with one processor, they are bound to happen with more than one processor, so indeed, you need to protect your shared data.
Whether volatile is the right choice or not depends on what type it is and what kind of change you are trying to protect from. But again, that has nothing to do with the number of processors.
If the field is a reference type, then volatile only ensures the visibility of new assignments to the field. It doesn't protect against changes in the object it points to - for that you need to synchronize.
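The reference-type point can be sketched like this (names are mine): volatile covers the reference itself, not the object it points to, so mutating the referenced object still needs its own synchronization, or you can publish a fresh immutable copy through a volatile write.

```java
// Sketch: volatile protects the reference, not the referenced object.
import java.util.ArrayList;
import java.util.List;

public class VolatileRef {
    // Reassigning 'names' is visible to all threads...
    static volatile List<String> names = new ArrayList<>();

    static void unsafeAdd(String s) {
        names.add(s);  // ...but this mutation is NOT protected by volatile
    }

    static void safeReplace(String s) {
        List<String> copy = new ArrayList<>(names);
        copy.add(s);
        names = copy;  // safe publication: a volatile write of a new list
    }
}
```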
From docs:
Using volatile variables reduces the risk of memory consistency errors
But does this mean that sometimes volatile variables don't work correctly?
It seems strange how such a thing could be usable - in my opinion, code that sometimes works and sometimes doesn't is very bad. I tried to Google but didn't find an example of a memory consistency error with volatile. Could you please propose one?
The issue is not so much that volatile works unreliably. It always works the way it is supposed to work. The problem is that the way it is supposed to work is sometimes not adequate for concurrency control. If you use volatile in the wrong situation, you can still get memory consistency errors.
A volatile variable will always have any writes propagated to all threads. However, suppose you need to increment the variable among various threads. Doing this(*):
volatile int mCounter;
// later, in some code that might be executed simultaneously on multiple threads:
mCounter++;
There is a chance that counter increments will be missed. This is because the value of mCounter needs to be first read by each thread before a new value can be written. In between those two steps, another thread may have changed the value of mCounter. In situations like this, you would need to rely on synchronized blocks rather than volatile to ensure data integrity.
For more info on volatile vs. synchronized, I recommend the article Managing volatility by Brian Goetz
(*) I realize that the above would be better implemented with AtomicInteger; it's a contrived example to illustrate a point.
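The footnote's suggestion can be sketched as follows (class and method names are mine): AtomicInteger turns the read-modify-write into a single atomic step, so no increments are lost even with several threads racing.

```java
// Sketch: the lost-update problem above, fixed with AtomicInteger.
import java.util.concurrent.atomic.AtomicInteger;

public class SafeCounter {
    static final AtomicInteger mCounter = new AtomicInteger();

    static int demo() throws InterruptedException {
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    mCounter.incrementAndGet();  // atomic read-modify-write
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return mCounter.get();  // always 40000, unlike volatile mCounter++
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```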
Volatile does the following:
- It prevents threads from caching the value locally.
- It makes sure that threads holding copies of the object's fields reconcile them with the master copy in main memory.
- It makes sure the data is written directly to main memory and read back from main memory.
## But a case where volatile fails:
- Making a non-atomic statement volatile.
Eg:
int count = 0;
count++; // Increment operator is Not Atomic in java
## Better option:
1. It's always better to follow Brian Goetz's rule:
Whenever we write a variable that is next read by another thread, or read a variable that was last written by another thread, it must be synchronized.
The shared fields should be made private, with the read and write methods/atomic statements synchronized.
2. The second option is to use the atomic classes, like AtomicInteger, AtomicLong, AtomicReference, etc.
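Option 1 above can be sketched like this (class and method names are mine): the shared field is private, and every read and write goes through a synchronized method, so updates are atomic and each reader sees the latest write.

```java
// Sketch: private shared field with synchronized accessors.
public class SyncCounter {
    private int count = 0;

    public synchronized void increment() { count++; }
    public synchronized int get() { return count; }

    static int demo() throws InterruptedException {
        SyncCounter c = new SyncCounter();
        Runnable work = () -> { for (int i = 0; i < 1000; i++) c.increment(); };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return c.get();  // always 2000: no lost updates, full visibility
    }
}
```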
## See this link; I have asked a similar question:
Why Volatile is behaving weirdly
This is a very conceptual question.
Let's say I have 2 separate threads. Thread A continually gets the time and stores it as a variable, and thread B continually gets the time from thread A's variable and does something with it.
The moment thread B accesses the variable in thread A, does thread A stop running until the operation is finished?
To expand: what if you had 3 threads, where Thread A gets the current time and sets it as a variable in Thread B, and Thread C reads that variable?
If Thread A is in the middle of assigning the variable at the moment Thread C comes to read it, does Thread C stop running until A is finished?
Thanks for the great answers, but now I have one more question. If they would interfere, what is the preferred solution so that multiple threads don't compete when communicating? (Conceptually) what would you do so these threads could share the variable's value while each remaining as fast as possible?
Memory and threads are two entirely different things. Even a variable that is part of a class representing a thread is just memory like any other, which any thread can access.
However, what complicates this and does in fact cause a slowdown is processor caches: in order for two threads running on different CPUs to access the same piece of memory and "see" each other's changes, the CPU caches have to be synchronized, which can completely negate the (massive) speed advantages of these caches if it happens a lot.
Note that because of caches, thread B may actually see an outdated value of the variable for an arbitrarily long time unless both access the variable within a synchronized block, or the variable is declared with the keyword volatile.
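A minimal sketch of the two-thread time example (class and method names are mine): thread A publishes the time through a volatile long, and a reader thread picks it up without ever blocking the writer.

```java
// Sketch: publishing a timestamp through a volatile long.
public class TimePublisher {
    // volatile also makes the 64-bit write atomic (no word tearing)
    static volatile long lastTime;

    static boolean demo() throws InterruptedException {
        Thread writer = new Thread(() -> lastTime = System.currentTimeMillis());
        writer.start();
        writer.join();
        final boolean[] sawIt = new boolean[1];
        Thread reader = new Thread(() -> sawIt[0] = lastTime > 0);
        reader.start();
        reader.join();
        return sawIt[0];  // true: the reader saw the volatile write
    }
}
```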
No, normally not, unless you declare the getter and the setter synchronized. If, for instance, the setter is being called from one thread and another thread wants to access the getter, it has to wait for the first thread to finish its task.
I hope I understood the question right.
It won't make thread B slow thread A down.
But:
1. Thread A may slow thread B down. Depending on the system architecture (cores and caches), every write by thread A will purge the cache line from thread B's CPU. B's next read will be more expensive.
2. If you use a lock to protect the data structure, both threads have to obtain it. Then, obviously, thread A will slow down.
3. If you don't use a lock, and your data type isn't atomic, you may read corrupt data. For example, if the time changes from 0x0000ffff to 0x00010000, you may read 0x0001ffff. Normally, an integer, which is aligned on 4 bytes, is atomic, so if your time is time_t you're probably OK. But the details are platform dependent.
It depends on the OS and the number of CPUs, but basically yes: when one thread is working, the other waits. It doesn't matter much whether thread B uses a variable used by thread A, as they share the same memory.
Yes, they slow down.
That's because of the concurrency checks you coded into the application: since you have a lock, semaphore, monitor, or whatever, to regulate access to the stored variable, each thread will possibly wait to gain exclusive access to read/write the variable.
And even if you don't have concurrency checks, I am pretty sure that memory doesn't allow simultaneous reads or a simultaneous read and write; since this situation might happen, your threads will be forced to slow down a little nevertheless.
My teacher in an upper level Java class on threading said something that I wasn't sure of.
He stated that the following code would not necessarily update the ready variable. According to him, the two threads don't necessarily share the static variable, specifically when each thread (the main thread versus ReaderThread) is running on its own processor and therefore doesn't share the same registers/cache/etc., so one CPU won't update the other's view.
Essentially, he said it is possible that ready is updated in the main thread, but NOT in the ReaderThread, so that ReaderThread will loop infinitely.
He also claimed it was possible for the program to print 0 or 42. I understand how 42 could be printed, but not 0. He mentioned this would be the case when the number variable is set to the default value.
I thought perhaps it is not guaranteed that the static variable is updated between the threads, but this strikes me as very odd for Java. Does making ready volatile correct this problem?
He showed this code:
public class NoVisibility {
    private static boolean ready;
    private static int number;

    private static class ReaderThread extends Thread {
        public void run() {
            while (!ready) Thread.yield();
            System.out.println(number);
        }
    }

    public static void main(String[] args) {
        new ReaderThread().start();
        number = 42;
        ready = true;
    }
}
There isn't anything special about static variables when it comes to visibility. If they are accessible any thread can get at them, so you're more likely to see concurrency problems because they're more exposed.
There is a visibility issue imposed by the JVM's memory model. Here's an article talking about the memory model and how writes become visible to threads. You can't count on changes one thread makes becoming visible to other threads in a timely manner (actually the JVM has no obligation to make those changes visible to you at all, in any time frame), unless you establish a happens-before relationship.
Here's a quote from that link (supplied in the comment by Jed Wesley-Smith):
Chapter 17 of the Java Language Specification defines the happens-before relation on memory operations such as reads and writes of shared variables. The results of a write by one thread are guaranteed to be visible to a read by another thread only if the write operation happens-before the read operation. The synchronized and volatile constructs, as well as the Thread.start() and Thread.join() methods, can form happens-before relationships. In particular:
Each action in a thread happens-before every action in that thread that comes later in the program's order.
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor. And because the happens-before relation is transitive, all actions of a thread prior to unlocking happen-before all actions subsequent to any thread locking that monitor.
A write to a volatile field happens-before every subsequent read of that same field. Writes and reads of volatile fields have similar memory consistency effects as entering and exiting monitors, but do not entail mutual exclusion locking.
A call to start on a thread happens-before any action in the started thread.
All actions in a thread happen-before any other thread successfully returns from a join on that thread.
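Applying the volatile rule above to the question's code (renamed here to avoid confusion with the original), making ready volatile also publishes number: the write to number is ordered before the volatile write of ready, and the reader's volatile read of ready then guarantees it observes 42, never 0.

```java
// Sketch: the question's example, fixed by making 'ready' volatile.
public class VisibilityFixed {
    private static volatile boolean ready;
    private static int number;  // need not be volatile itself

    static int demo() throws InterruptedException {
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) Thread.yield();  // volatile read
            seen[0] = number;               // guaranteed to observe 42
        });
        reader.start();
        number = 42;   // ordered before the volatile write below
        ready = true;  // volatile write: publishes number as well
        reader.join();
        return seen[0];
    }
}
```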
He was talking about visibility, and what he said shouldn't be taken too literally.
Static variables are indeed shared between threads, but the changes made in one thread may not be visible to another thread immediately, making it seem like there are two copies of the variable.
This article presents a view that is consistent with how he presented the info:
http://jeremymanson.blogspot.com/2008/11/what-volatile-means-in-java.html
First, you have to understand a little something about the Java memory model. I've struggled a bit over the years to explain it briefly and well. As of today, the best way I can think of to describe it is if you imagine it this way:
Each thread in Java takes place in a separate memory space (this is clearly untrue, so bear with me on this one).
You need to use special mechanisms to guarantee that communication happens between these threads, as you would on a message passing system.
Memory writes that happen in one thread can "leak through" and be seen by another thread, but this is by no means guaranteed. Without explicit communication, you can't guarantee which writes get seen by other threads, or even the order in which they get seen.
...
But again, this is simply a mental model to think about threading and volatile, not literally how the JVM works.
Basically it's true, but actually the problem is more complex. Visibility of shared data can be affected not only by CPU caches, but also by out-of-order execution of instructions.
Therefore Java defines a Memory Model, that states under which circumstances threads can see consistent state of the shared data.
In your particular case, adding volatile guarantees visibility.
They are "shared" of course in the sense that they both refer to the same variable, but they don't necessarily see each other's updates. This is true for any variable, not just static.
And in theory, writes made by another thread can appear to be in a different order, unless the variables are declared volatile or the writes are explicitly synchronized.
Within a single classloader, static fields are always shared. To explicitly scope data to threads, you'd want to use a facility like ThreadLocal.
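A minimal ThreadLocal sketch for the scoping point above (class and method names are mine): each thread gets its own copy of the value, so nothing is shared and no volatile is needed.

```java
// Sketch: ThreadLocal gives each thread a private copy of the value.
public class PerThread {
    static final ThreadLocal<Integer> local = ThreadLocal.withInitial(() -> 0);

    static int demo() throws InterruptedException {
        local.set(1);  // the calling thread's private copy
        final int[] other = new int[1];
        Thread t = new Thread(() -> other[0] = local.get());  // fresh copy: 0
        t.start();
        t.join();
        return other[0];
    }
}
```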
When you declare a static variable of a primitive type, Java assigns it a default value:
public static int i;
When you define the variable like this, the default value of i is 0.
That's why there is a possibility of getting 0.
Then the main thread updates the value of the boolean ready to true. Since ready is a static variable, the main thread and the other thread reference the same memory address, so the change to ready is seen, and the secondary thread exits the while loop and prints the value.
When printing, the initialized value of number is 0. If the reader thread passed the while loop before the main thread updated the number variable, then there is a possibility of printing 0.
#dontocsata
You can go back to your teacher and school him a little :)
A few notes from the real world, regardless of what you see or are told.
Please NOTE, the words below concern this particular case, with the code in the exact order shown.
The following 2 variables will reside on the same cache line under virtually any known architecture.
private static boolean ready;
private static int number;
Thread exit (of the main thread) is guaranteed to happen, and exiting is guaranteed to cause a memory fence, due to the thread-group thread removal (and many other issues). (It's a synchronized call, and I see no way to implement it without the sync part, since the ThreadGroup must terminate as well if no daemon threads are left, etc.)
The started ReaderThread is going to keep the process alive, since it is not a daemon thread!
Thus ready and number will be flushed together (or number before, if a context switch occurs), and there is no real reason for reordering in this case; at least I can't think of one.
You would need something truly weird to see anything but 42. Again, I presume both static variables will be in the same cache line. I just can't imagine a cache line 4 bytes long, or a JVM that will not allocate them in a contiguous area (cache line).