I am new to Java, I am currently learning about volatile. Say I have the following code:
public class Test
{
private static boolean b = false;
public static void main(String[] args) throws Exception
{
new Thread(new Runnable()
{
public void run()
{
while(true)
{
b = true;
}
}
}).start();
// Give time for thread to start
Thread.sleep(2000);
System.out.println(b);
}
}
Output:
true
This code has two threads (the main thread and another thread). Why is the other thread able to modify the value of b, shouldn't b be volatile in order for this to happen?
The volatile keyword guarantees that changes are visible amongst multiple threads, but you're interpreting that to mean that opposite is also true; that the absence of the volatile keyword guarantees isolation between threads, and there's no such guarantee.
Also, while your code example is multi-threaded, it isn't necessarily concurrent. It could be that the values were cached per-thread, but there was enough time for the JVM to propagate the change before you printed the result.
You are right that with volatile, you can ensure/guarantee that your 2 threads will see the appropriate value from main memory at all times, and never a thread-specific cached version of it.
Without volatile, you lose that guarantee. And each thread is working with its own cached version of the value.
However, there is nothing preventing the 2 threads from resynchronizing their memory if and when they feel like it, and eventually viewing the same value (maybe). It's just that you can't guarantee that it will happen, and you most certainly cannot guarantee when it will happen. But it can happen at some indeterminate point in time.
The point is that your code may work sometimes, and sometimes not. But even if every time you run it on your personal computer, is seems like it's reading the variable properly, it's very likely that this same code will break on a different machine. So you are taking big risks.
Related
Volatile is supposed to make the Threads read the values from RAM disabling thread cache, and without volatile caching will be enabled making a thread unaware of the variable change made by another thread but this does not work for the below code.
Why does this happen and code works the same with and without volatile keyword there?
public class Racing{
private boolean won = false; //without volatile keyword
public void race() throws InterruptedException{
Thread one = new Thread(()->{
System.out.println("Player-1 is racing...");
while(!won){
won=true;
}
System.out.println("Player-1 has won...");
});
Thread two=new Thread(()->{
System.out.println("Player-2 is racing...");
while(!won){
System.out.println("Player-2 Still Racing...");
}
});
one.start();
//Thread.sleep(2000);
two.start();
}
public static void main(String k[]) {
Racing racing=new Racing();
try{
racing.race();
}
catch(InterruptedException ie){}
}
Why does this behave the same with and without volatile ?
Volatile is supposed to make the threads read the values from RAM
disabling thread cache
No, this is not accurate. It depends on the architecture where the code is running. The Java language standard itself does not state anything about how the volatile should or not be implemented.
From Myths Programmers Believe about CPU Caches can read:
As a computer engineer who has spent half a decade working with caches
at Intel and Sun, I’ve learnt a thing or two about cache-coherency.
(...)
For another, if volatile variables were truly written/read from main-memory > every single time, they would be horrendously slow – main-memory references are > 200x slower than L1 cache references. In reality, volatile-reads (in Java) can > often be just as cheap as a L1 cache reference, putting to rest the notion that volatile forces reads/writes all the way to main memory. If you’ve been avoiding the use of volatiles because of performance concerns, you might have been a victim of the above misconceptions.
Unfortunately, there still are several articles online propagating this inaccuracy (i.e., that volatile forces variables to be read from main memory).
Accordingly to the language standard (§17.4):
A field may be declared volatile, in which case the Java Memory Model
ensures that all threads see a consistent value for the variable
So informally, all threads will have a view of the most updated value of that variable. There is nothing about how the hardware should enforce such constrain.
Why does this happen and code works same with and without volatile
Well (in your case) without the volatile is undefined behavior, meaning you might or not see the most updated value of the flag won, consequently, theoretically the race condition is still there. However, because you have added the following statement
System.out.println("Player-2 Still Racing...");
in:
Thread two = new Thread(()->{
System.out.println("Player-2 is racing...");
while(!won){
System.out.println("Player-2 Still Racing...");
}
});
two things will happen, you will avoid the Spin on field problem, and second if one looks at the System.out.println code:
public void println(String x) {
synchronized (this) {
print(x);
newLine();
}
}
one can see that there is a synchronized being called, which will increase the likelihood that the threads will be reading the most updated value of the field flag (before the called to the println method). However, even that might change based on the JVM implementation.
Without volatile, there is no guarantee that another thread will see updates written to a variable. That does not mean that another thread will not see those updates if the value is not volatile. Other threads may eventually see the modified value.
In your example, you are using System.out.printlns, which contain memory barriers. That means once the println works, all variables updated before that point are visible to all the threads. The program might work differently if you do not print anything.
I've encountered this code in a book. It states NoVisibility could loop forever because the value of ready might never become
visible to the reader thread.
I'm confused by this statement. In order for the loop to run forever, ready must always be false, which is the default value. This means it must fail at executing ready = true; because the reader thread will always read the ready variable from memory. the assignment happens in CPU and it must have some problem in flushing the data back to Main Memory. I think I need some explanation on a situation how it can fail, or I may have missed some other part.
public class NoVisibility {
private static boolean ready;
private static int number;
private static class ReaderThread extends Thread {
public void run() {
while (!ready)
Thread.yield();
System.out.println(number);
}
}
public static void main(String[] args) {
new ReaderThread().start();
number = 42;
ready = true;
}
}
Your understanding is flawed. You are assuming that Java will behave intuitively here. In fact, it may not. And, indeed, the Java Language specification allows non-intuitive behavior if you don't follow the rules.
To be more specific, in your example it is NOT GUARANTEED that the second thread will see the results of the first thread's assignment to ready1. This is due to such things as:
The compiler caching the value of ready in a register in the first or second thread.
The compiler not including instructions to force the write to be flushed from one core's memory cache to main memory, or similar.
If you want a guarantee that the second thread will see the result of the write then either reads and writes of ready by the two threads must be (properly) synchronized, or the ready variable must be declared to be volatile.
So ...
This means it must fail at executing ready = true; because the reader thread will always read the ready variable from memory.
is incorrect. The "because" is not guaranteed by the Java language specification in this example.
Yes. It is nonintuitive. Relying on your intuition based on your understanding of single-threaded programs is not reliable. If you want to want to understand what is and is not guaranteed, please study the specification of the "Java Memory Model" in Section 17.4 of the JLS.
In short, the book is correct.
1 - It might see the results immediately, or after a short or long delay. Or it might never see them. And the behavior is liable to vary from one system to the next, and with versions of the Java platform. So your program that (by luck) works all of the time on one system may not always work on another system.
The value of ready may be updated but the other thread may never know about it. There you need volatile variables! A thread assumes that the variable is only used by this and only thread. So, it reads its value from the stack that it created.
private static volatile boolean ready;
What volatile does is that it says to your program to ready from the memory, not from the stack.
Actually what jvm does is it translates:
while(flag){...}
To:
if(flag){
while(true){
}
The stack is created when the thread is created. It collectes the values of the variables in order to use them later.
This is what I have understand, correct me if I am wrong!
I have a class with a getter getInt() and a setter setInt() on a certain field, say field
Integer Int;
of an object of a class, say SomeClass.
The setInt() here is synchronized-- getInt() isn't.
I am updating the value of Int from within multiple threads.
Each thread is getting the value Int, and setting it appropriately.
The threads aren't sharing any other resources in any way.
The code executed in each thread is as follows.
public void update(SomeClass c) {
while (<condition-1>) // the conditions here and the calculation of
// k below dont have anything to do
// with the members of c
if (<condition-2>) {
// calculate k here
synchronized (c) {
c.setInt(c.getInt()+k);
// System.out.println("in "+this.toString());
}
}
}
The run() method is just invoking the above method on the members updated from within the constructor by the params passed to it:
public void run() { update(c); }
When I run this on large sequences, the threads aren't interleaving much-- i see one thread executing for long without any other thread running in between.
There must be a better way of doing this.
I can't change the internals of SomeClass, or of the class invoking the threads.
How can this be done better?
TIA.
//=====================================
EDIT:
I'm not after manipulating the execution sequence of the threads. They all have the same priority. It`s just that what i see in the outcome is suggesting that the threads aren't sharing the execution time evenly-- one of them, once takes over, executing on. However, I can't see why this code should be doing this.
It`s just that what i see in the outcome is suggesting that the threads aren't sharing the execution time evenly
Well, this is exactly what you don't want if you are after efficiency. Yanking a thread from being executed and scheduling another thread is generally very costly. Therefore it's actually advantageous to do one of them, once takes over, executing on. Of course, when this is overdone you could see higher throughput but longer response time. In theory. In practice, JVMs thread scheduling is well tuned for almost all purposes, and you don't want to try changing it in almost all situations. As a rule of thumb, if you are interested in response times in millisecond order, you probably want to stay away messing with it.
tl;dr: It's not being inefficient, you probably want to leave it as it is.
EDIT:
Having said that, using an AtomicInteger may help in performance, and is in my opinion less error prone than using a lock (synchronized keyword). You need to be hitting that variable really very hard in order to get a measurable benefit though.
The JDK provides a nice solution for multi threaded int access, AtomicInteger:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicInteger.html
As Enno Shioji has pointed out, letting one thread proceed might be the most efficient way to execute your code in some scenarios.
It depends on how much cost the thread synchronization imposes in relation to the other work of your code (which we don’t know). If you have a loop like:
while (<condition-1>)
if (<condition-2>) {
// calculate k here
synchronized (c) {
c.setInt(c.getInt()+k);
}
}
and the test for condition-1 and condition-2 and the calculation of k is rather cheap compared to the synchronization cost, the Hotspot optimizer might decide to reduce the overhead by transforming the code to something like this:
synchronized (c) {
while (<condition-1>)
if (<condition-2>) {
// calculate k here
c.setInt(c.getInt()+k);
}
}
(or a rather more complicated structure by performing loop unrolling and span the synchronized block over multiple iterations). The bottom line is that the optimized code might block other threads longer but let the one owning the lock finish faster resulting in an overall faster execution.
This does not mean that a single-threaded execution was the fastest way to handle your problem. It also doesn’t mean that using an AtomicInteger here would be the best option to solve the problem. It would create a higher CPU load and possibly a small acceleration but it doesn’t solve your real mistake:
It is completely unnecessary to update c within the loop at a high frequency. After all, your threads do not depend on seeing updates to c timely. It even looks like they are not using it at all. So the correct fix would be to move the update out of the loop:
int kTotal=0;
while (<condition-1>)
if (<condition-2>) {
// calculate k here
kTotal += k;
}
synchronized (c) {
c.setInt(c.getInt()+kTotal);
}
Now, all threads can run in parallel (assuming the code you haven’t posted here doesn’t contain inter-thread dependencies) and the synchronization cost is reduced to a minimum. You could still change it to an AtomicInteger as well but that’s not that important anymore.
Answering to this
i see one thread executing for long without any other thread running in between.
There must be a better way of doing this.
You can not control how threads will be executed. JVM does this for you, and does not like you to interfere in its work.
Still you can look at yield as your option, but that also does not ensure same thread will not be picked again.
The java.lang.Thread.yield() method causes the currently executing thread object to temporarily pause and allow other threads to execute.
I've found it better to use wait() and notify() than yield. Check out this example (seen from a book)-
class Q {
int n;
boolean valueSet = false;
synchronized int get() {
if(!valueSet)
wait(); //handle InterruptedException
//
valueSet = false;
notify();//if thread waiting in put, now notified
}
synchronized void put(int n) {
if(valueSet)
wait(); //handle InterruptedException
//
valueSet = true;
//if thread in get waiting then that is resumed now
notify();
}
}
or you could try using sleep() and join the threads in the end in main() but that isn't a foolproof way
You are having public void update(SomeClass c) method in your code and this method is an instance method in which you are passing the object as parameter.
synchronized(c) in your code is doing nothing. Let me show you with some example,
So if you will make different objects of this class and then try to make them different threads like,
class A extends Thread{
public void update(SomeClass c){}
public void run(){
update(c)
}
public static void main(String args[]){
A t1 = new A();
A t2 = new A();
t1.start();
t2.start();
}
}
Then both of these t1 & t2 will have their own copies of update method and the reference variable c which you are making synchronized will also be different for both the threads. t1 calls its own update() method and t2 calls its own update() method. So synchronization won't work.
Synchronization will work when you have something common for both the threads.
Something like,
class A extends Thread{
static SomeClass c;
public void update(){
synchronized(c){
}
}
public void run(){
update(c)
}
public static void main(String args[]){
A t1 = new A();
A t2 = new A();
t1.start();
t2.start();
}
}
This way the actual concept of synchronization will be applied.
I met the following Java class on the internet:
public class Lock1 implements Runnable {
int b=100;
public synchronized void m1() throws Exception {
b=1000;
Thread.sleep(50);
System.out.println("b="+b);
}
public synchronized void m2() throws Exception {
Thread.sleep(30);
//System.out.println("m2");
b=2000;
}
public void run() {
try {m1();}
catch(Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args) throws Exception {
Lock1 tt=new Lock1();
Thread t = new Thread(tt);
t.start();
tt.m2();
System.out.println(tt.b);
}
}
Tried running this a lot of times, the result is almost always:
1000
b=1000
In my original guess, I thought the first line should be "2000", since tt.m2() is just a method invocation(not a thread), the main method should continue with its execution and get the resulting "b" as the one has been assigned value 2000 in method m2.
The second try that I did is to uncomment out the
System.out.println("m2")
in m2 method.Suprisingly, the result will be nearly always:
m2
2000
b=1000
Why adding a statement in the m2 method, will cause the output value of tt.b to be changed?
Sorry I am quite confused here about the difference between threads and method invocation, hope experts can help out!
Synchronization in the Java sense combines several things. In this case these points are interesting:
mutual exclusion
memory barriers for readers
memory barriers for writers
After entering a synchronized block (or method) you have got two guarantees: You have the lock (mutual exclusion) and that the JVM and the compiler will discard any cache for the for the synchronization object. This means an access to this.b will fetch the actual value for 'b' from the RAM and not from any cache but only once. Then it will work with the cached copy again.
Leaving a synchronized block in turn guarantees that the CPU flushes all dirty (i.e. written) caches to the memory.
The point in your stuff is: System.out.println(tt.b); is in no way synchronized which means the access to it has not crossed a defined memory barrier. So although the other thread has written a new value for b and flushed it to the RAM the main thread has no idea, that it should read b from RAM and not from its own cache.
The solution is:
synchronized(tt){
System.out.println(tt.b);
}
This meets the golden rule, that if something is synchronized then every access to it should be synchronized and not only half of the accesses.
And regarding your added System.out: There are three things:
First: It is slow (compared to some memory fiddling). This means that in the meantime the CPU or the JVM might decide for themselves, that a new look to tt might be appropriate
Second: It is big (compared to some memory fiddling). This means that the touched code alone might evict tt from the caches.
Third: It is synchronized internally. This means that you crossed some memory barriers (which might have nothing to do with your tt - who knows). But these might also have some effect.
This is the lead rule of multithreading debugging: Adding System.out in order to catch errors will, according to Murphy, actually hide the problem.
I guess this is JVM implementation specific.
Basically, each thread has its's own copy (view) of the object variables and the way they are synced back and forth is not determined.
The most likely cause is that System.out.println is slow. The cause of the "unexpected" results is because of a race condition between the delay (Thread.sleep) and the overhead of opening the output stream (System.out.println).
So I've been reading on concurrency and have some questions on the way (guide I followed - though I'm not sure if its the best source):
Processes vs. Threads: Is the difference basically that a process is the program as a whole while a thread can be a (small) part of a program?
I am not exactly sure why there is a interrupted() method and a InterruptedException. Why should the interrupted() method even be used? It just seems to me that Java just adds an extra layer of indirection.
For synchronization (and specifically about the one in that link), how does adding the synchronize keyword even fix the problem? I mean, if Thread A gives back its incremented c and Thread B gives back the decremented c and store it to some other variable, I am not exactly sure how the problem is solved. I mean this may be answering my own question, but is it supposed to be assumed that after one of the threads return an answer, terminate? And if that is the case, why would adding synchronize make a difference?
I read (from some random PDF) that if you have two Threads start() subsequently, you cannot guarantee that the first thread will occur before the second thread. How would you guarantee it, though?
In synchronization statements, I am not completely sure whats the point of adding synchronized within the method. What is wrong with leaving it out? Is it because one expects both to mutate separately, but to be obtained together? Why not just have the two non-synchronized?
Is volatile just a keyword for variables and is synonymous with synchronized?
In the deadlock problem, how does synchronize even help the situation? What makes this situation different from starting two threads that change a variable?
Moreover, where is the "wait"/lock for the other person to bowBack? I would have thought that bow() was blocked, not bowBack().
I'll stop here because I think if I went any further without these questions answered, I will not be able to understand the later lessons.
Answers:
Yes, a process is an operating system process that has an address space, a thread is a unit of execution, and there can be multiple units of execution in a process.
The interrupt() method and InterruptedException are generally used to wake up threads that are waiting to either have them do something or terminate.
Synchronizing is a form of mutual exclusion or locking, something very standard and required in computer programming. Google these terms and read up on that and you will have your answer.
True, this cannot be guaranteed, you would have to have some mechanism, involving synchronization that the threads used to make sure they ran in the desired order. This would be specific to the code in the threads.
See answer to #3
Volatile is a way to make sure that a particular variable can be properly shared between different threads. It is necessary on multi-processor machines (which almost everyone has these days) to make sure the value of the variable is consistent between the processors. It is effectively a way to synchronize a single value.
Read about deadlocking in more general terms to understand this. Once you first understand mutual exclusion and locking you will be able to understand how deadlocks can happen.
I have not read the materials that you read, so I don't understand this one. Sorry.
I find that the examples used to explain synchronization and volatility are contrived and difficult to understand the purpose of. Here are my preferred examples:
Synchronized:
private Value value;
public void setValue(Value v) {
value = v;
}
public void doSomething() {
if(value != null) {
doFirstThing();
int val = value.getInt(); // Will throw NullPointerException if another
// thread calls setValue(null);
doSecondThing(val);
}
}
The above code is perfectly correct if run in a single-threaded environment. However with even 2 threads there is the possibility that value will be changed in between the check and when it is used. This is because the method doSomething() is not atomic.
To address this, use synchronization:
private Value value;
private Object lock = new Object();
public void setValue(Value v) {
synchronized(lock) {
value = v;
}
}
public void doSomething() {
synchronized(lock) { // Prevents setValue being called by another thread.
if(value != null) {
doFirstThing();
int val = value.getInt(); // Cannot throw NullPointerException.
doSecondThing(val);
}
}
}
Volatile:
private boolean running = true;
// Called by Thread 1.
public void run() {
while(running) {
doSomething();
}
}
// Called by Thread 2.
public void stop() {
running = false;
}
To explain this requires knowledge of the Java Memory Model. It is worth reading about in depth, but the short version for this example is that Threads have their own copies of variables which are only sync'd to main memory on a synchronized block and when a volatile variable is reached. The Java compiler (specifically the JIT) is allowed to optimise the code into this:
public void run() {
while(true) { // Will never end
doSomething();
}
}
To prevent this optimisation you can set a variable to be volatile, which forces the thread to access main memory every time it reads the variable. Note that this is unnecessary if you are using synchronized statements as both keywords cause a sync to main memory.
I haven't addressed your questions directly as Francis did so. I hope these examples can give you an idea of the concepts in a better way than the examples you saw in the Oracle tutorial.