I understand that the code section below is problematic because the new value of isAlive set in the kill() method might not be visible to the thread running run().
public class MyClass extends Thread {
    private boolean isAlive;

    public void run() {
        while (isAlive) {
            // ...
        }
    }

    public void kill() {
        isAlive = false;
    }
}
The typical fix is to declare the isAlive variable as volatile.
My question is: are there any other ways to achieve this without using volatile? Does Java provide other mechanisms for it?
EDIT: Synchronizing the method is also not an option.
There is no good reason to go for a different option than volatile here. Volatile is needed to provide the appropriate happens-before edge between writing and reading the flag; otherwise you have a data race on your hands, and as a consequence the write to the flag might never be seen. For example, the compiler could hoist the read of the variable out of the loop.
There are cheaper alternatives that provide more relaxed ordering guarantees than the sequential consistency volatile provides, e.g. acquire/release or opaque mode (check out the Atomic classes and VarHandle). But these should only be used in very rare situations where the stronger ordering constraints hurt performance due to limited compiler optimizations and fences at the hardware level.
Long story short: make the variable volatile, because it is a simple and very fast solution.
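If you really do want the relaxed route mentioned above, a minimal sketch using AtomicBoolean's opaque mode (Java 9+; the class name is just illustrative) could look like this, but again, plain volatile is the sensible default:

import java.util.concurrent.atomic.AtomicBoolean;

public class OpaqueFlagWorker extends Thread {
    // Opaque access guarantees the write eventually becomes visible to the
    // reading thread, with weaker ordering than volatile's sequential consistency.
    private final AtomicBoolean running = new AtomicBoolean(true);

    @Override
    public void run() {
        while (running.getOpaque()) {
            // ... do the work ...
        }
    }

    public void kill() {
        running.setOpaque(false);
    }
}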
There are three options:
Make the shared variable volatile. (This is the simplest way.)
Use synchronized, either in the form of synchronized methods or synchronized blocks. Note that you need to do both the reads and the writes of the shared variable while holding the (same) mutex.
Use one of the classes in java.util.concurrent that has a "synchronizing effect"1. Or, more precisely, one that you can use to get a happens-before relationship between the update and the subsequent read of the isAlive variable. This will be documented in the respective classes' javadocs.
If you don't use one of those options, it is not guaranteed2 that the thread that calls run() will see the isAlive variable change from true to false.
If you want to understand the deep technical reasons why this is so, read Chapter 17.4 of the Java Language Specification where it specifies the Java Memory Model. (It will explain what happens before means in this context.)
1 - One of the Lock classes would be an obvious choice.
2 - That is to say ... your code may not work 100% reliably on all platforms. This is the kind of problem where "try it and see" or even extensive testing cannot show conclusively that your code is correct.
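For illustration, here is a rough sketch of option 3 using one of the Lock classes from footnote 1 (the class and method names are mine, not from the question). Both the read and the write happen while holding the same lock, which gives the required happens-before edge:

import java.util.concurrent.locks.ReentrantLock;

public class LockGuardedWorker extends Thread {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean running = true; // guarded by "lock"

    @Override
    public void run() {
        while (isRunning()) {
            // ... do the work ...
        }
    }

    private boolean isRunning() {
        lock.lock();
        try {
            return running;
        } finally {
            lock.unlock();
        }
    }

    public void kill() {
        lock.lock();
        try {
            running = false;
        } finally {
            lock.unlock();
        }
    }
}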
The wait/notify mechanism is embedded deep in the heart of the Java language. Object, the superclass of all classes, has five methods that form the core of the mechanism: notify(), notifyAll(), wait(), wait(long), and wait(long, int). All classes in Java inherit from Object, and none of these methods can be overridden in a subclass, as they are all declared final.
Here is an example that may help you understand the concept:
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

public class NotifyAndWait {
    public List<String> list;

    public NotifyAndWait() {
        list = Collections.synchronizedList(new LinkedList<String>());
    }

    public String removeItem() throws InterruptedException {
        synchronized (list) {
            // Wait until there is something to remove; wait() releases the lock.
            while (list.isEmpty()) {
                list.wait();
            }
            return list.remove(0);
        }
    }

    public void addItem(String item) {
        synchronized (list) {
            list.add(item);
            // After adding, notify all waiting threads that the list has changed.
            list.notifyAll();
        }
    }

    public static void main(String... args) throws Exception {
        final NotifyAndWait obj = new NotifyAndWait();

        Runnable runA = new Runnable() {
            public void run() {
                try {
                    String item = obj.removeItem();
                    System.out.println("Removed: " + item);
                } catch (Exception e) {
                    // ignored for this example
                }
            }
        };

        Runnable runB = new Runnable() {
            public void run() {
                obj.addItem("Hello");
            }
        };

        Thread t1 = new Thread(runA, "T1");
        t1.start();
        Thread.sleep(500);

        Thread t2 = new Thread(runB, "T2");
        t2.start();
        Thread.sleep(1000);
    }
}
As far as I know, polling a boolean in a while loop as a "kill" control is a perfectly reasonable thing to do. Brian Goetz, in "Java Concurrency in Practice", has a code example that is very similar to yours, on page 137 (section 7.1 Task Cancellation).
He states that making the boolean volatile gives the pattern greater reliability. He has a good description of the mechanism on page 38.
When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
I use volatile booleans and loose coupling as my main method of communication across threads. AFAIK, volatile has a smaller cost than synchronized and is the most recommended pattern for this situation.
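Applied to the class from the question, the pattern is roughly this (I've assumed the flag is initialised to true, which the original snippet leaves out):

public class MyClass extends Thread {
    private volatile boolean isAlive = true;

    @Override
    public void run() {
        while (isAlive) {
            // ... do the work ...
        }
    }

    public void kill() {
        isAlive = false; // visible to the running thread because the field is volatile
    }
}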
I often use the following pattern to create a cancellable thread:
public class CounterLoop implements Runnable {

    private volatile AtomicBoolean cancelPending = new AtomicBoolean(false);

    @Override
    public void run() {
        while (!cancelPending.get()) {
            // count
        }
    }

    public void cancel() {
        cancelPending.set(true);
    }
}
But I'm not sure that cancelPending MUST be an AtomicBoolean. Can we just use a normal boolean in this case?
Using both volatile and AtomicBoolean is unnecessary. If you declare the cancelPending variable as final, as follows:
private final AtomicBoolean cancelPending = new AtomicBoolean(false);
the JLS semantics for final fields mean that synchronization (or volatile) will not be needed. All threads will see the correct value for the cancelPending reference. JLS 17.5 states:
"An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields."
... but there are no such guarantees for normal fields; i.e. not final and not volatile.
You could also just declare cancelPending as a volatile boolean ... since you don't appear to be using the test-and-set capability of AtomicBoolean.
However, if you used a non-volatile boolean you would need to use synchronized to ensure that all threads see an up-to-date copy of the cancelPending flag.
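A minimal sketch of that volatile variant, assuming you only ever flip the flag and never need test-and-set:

public class CounterLoop implements Runnable {
    private volatile boolean cancelPending = false;

    @Override
    public void run() {
        while (!cancelPending) {
            // count
        }
    }

    public void cancel() {
        cancelPending = true;
    }
}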
You can use a volatile boolean instead with no issues.
Note that this only applies in cases much like this where the boolean is only being changed to a specific value (true in this case). If the boolean might be changed to either true or false at any time then you may need an AtomicBoolean to detect and act on race conditions.
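For example, if several threads may race to cancel and exactly one of them should run some one-time cleanup, compareAndSet lets you detect the winner (the releaseResources() method here is hypothetical):

import java.util.concurrent.atomic.AtomicBoolean;

public class CancellableTask {
    private final AtomicBoolean cancelPending = new AtomicBoolean(false);

    public void cancel() {
        // Atomically flips false -> true; only the first caller observes success.
        if (cancelPending.compareAndSet(false, true)) {
            releaseResources(); // hypothetical cleanup, performed exactly once
        }
    }

    private void releaseResources() {
        // ...
    }
}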
However - the pattern you describe has an innate smell. By looping on a boolean (volatile or not) you are likely to find yourself trying to insert some sort of sleep mechanism or having to interrupt your thread.
A much cleaner route is to split up the process into finer steps. I recently posted an answer here covering the options of pausing threads that may be of interest.
No, you cannot. If you change the boolean value from another thread without proper synchronization, that change can be invisible to other threads. You can use a volatile boolean in your case to make any modification visible to all threads.
Yes, you can. You can either use a non-volatile AtomicBoolean (relying on its built-in thread safety), or use any other volatile variable.
According to the Java Memory Model (JMM), both options result in a properly synchronized program, where the read and write of the cancelPending variable can't produce a data race.
Using a volatile boolean variable in this context is safe, though some may consider it bad practice. Consult this thread to see why.
Your solution of using an Atomic* variable seems the best option, even though the synchronization may introduce unnecessary overhead in comparison to a volatile variable.
You can also use a critical section:

private final Object lock = new Object();
private boolean cancelPending; // must also be written while holding the same lock

@Override
public void run() {
    synchronized (lock) {
        if (cancelPending) {
            return;
        }
    }
    // ... do the work ...
}
or a synchronized method:

private boolean shouldStop; // guarded by "this"

public synchronized boolean shouldStop() {
    return shouldStop;
}

public synchronized void setStop(boolean stop) {
    shouldStop = stop;
}
For example, I have a class with 2 counters (in a multi-threaded environment):
public class MyClass {
    private int counter1;
    private int counter2;

    public synchronized void increment1() {
        counter1++;
    }

    public synchronized void increment2() {
        counter2++;
    }
}
There are 2 increment operations not related to each other, but I use the same object (this) as the lock.
Is it true that if clients simultaneously call increment1() and increment2(), then the increment2() invocation will be blocked until increment1() releases the this monitor?
If it's true, does it mean that I need to provide different monitor locks for each operation (for performance reasons)?
Is it true that if clients simultaneously call increment1() and increment2(), then the increment2() invocation will be blocked until increment1() releases the this monitor?
If they're called on the same instance, then yes.
If it's true, does it mean that I need to provide different monitor locks for each operation (for performance reasons)?
Only you can know that. We don't know your performance requirements. Is this actually a problem in your real code? Are your real operations long-lasting? Do they occur very frequently? Have you performed any diagnostics to estimate the impact of this? Have you profiled your application to find out how much time is being spent waiting for the monitor at all, let alone when it's unnecessary?
I would actually suggest not synchronizing on this for entirely different reasons. It's already hard enough to reason about threading when you do control everything - but when you don't know everything which can acquire a monitor, you're on a hiding to nothing. When you synchronize on this, it means that any other code which has a reference to your object can also synchronize on the same monitor. For example, a client could use:
synchronized (myClass) {
    // Do something entirely different
}
This can lead to deadlocks, performance issues, all kinds of things.
If you use a private final field in your class instead, with an object created just to be a monitor, then you know that the only code acquiring that monitor will be your code.
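A rough sketch of that suggestion, applied to the counters from the question:

public class MyClass {
    // Only code inside this class can ever synchronize on this object.
    private final Object lock = new Object();
    private int counter1;
    private int counter2;

    public void increment1() {
        synchronized (lock) {
            counter1++;
        }
    }

    public void increment2() {
        synchronized (lock) {
            counter2++;
        }
    }
}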
1) Yes, it's true that increment1() blocks increment2() and vice versa, because they both implicitly synchronize on this.
2) If you need better performance, consider the lock-free java.util.concurrent.atomic.AtomicInteger class:
private AtomicInteger counter1 = new AtomicInteger();
private AtomicInteger counter2 = new AtomicInteger();

public void increment1() {
    counter1.getAndIncrement();
}

public void increment2() {
    counter2.getAndIncrement();
}
If you synchronize on the method, as you did here, you lock on the whole object (the this monitor), so two threads accessing different variables of the same object would still block each other.
If you want to synchronize only one counter at a time, so that two threads won't block each other while accessing different variables, put the two counters in two synchronized blocks and use different objects as the locks of those blocks, for example as sketched below.
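Something along these lines (the lock field names are illustrative):

public class MyClass {
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();
    private int counter1;
    private int counter2;

    public void increment1() {
        synchronized (lock1) { // does not block threads inside increment2()
            counter1++;
        }
    }

    public void increment2() {
        synchronized (lock2) {
            counter2++;
        }
    }
}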
You are right that it will be a performance bottleneck if you use the same object. You can use a different lock for each counter, or use java.util.concurrent.atomic.AtomicInteger for a concurrent counter.
Like:
public class Counter {
    private AtomicInteger count = new AtomicInteger(0);

    public void incrementCount() {
        count.incrementAndGet();
    }

    public int getCount() {
        return count.get();
    }
}
Yes, the given code is identical to the following:
public void increment1() {
    synchronized (this) {
        counter1++;
    }
}

public void increment2() {
    synchronized (this) {
        counter2++;
    }
}
which means that only one of the two methods can be executed at any given time. You should either provide different locks (and locking on this is a bad idea to begin with) or use some other solution. The latter is what you actually want here: AtomicInteger.
Yes, if multiple threads try to call methods on your object they will wait trying to get the lock (although the order in which they get it isn't guaranteed). As with everything, there is no reason to optimise until you know this is the bottleneck in your code.
If you need the performance benefits that can be had from being able to call both operations in parallel, then yes, you do need to provide different monitor objects for the different operations.
However, there is something to be said about premature optimization: make sure you actually need this before making your program more complex to accommodate it.
Should we declare private fields as volatile if the instances are used in multiple threads?
In Effective Java, there is an example where the code doesn't work without volatile:
import java.util.concurrent.TimeUnit;

// Broken! - How long would you expect this program to run?
public class StopThread {
    private static boolean stopRequested; // works, if volatile is here

    public static void main(String[] args) throws InterruptedException {
        Thread backgroundThread = new Thread(new Runnable() {
            public void run() {
                int i = 0;
                while (!stopRequested)
                    i++;
            }
        });
        backgroundThread.start();
        TimeUnit.SECONDS.sleep(1);
        stopRequested = true;
    }
}
The explanation says that
while (!stopRequested)
    i++;
is optimized to something like this:
if (!stopRequested)
    while (true)
        i++;
so further modifications of stopRequested aren't seen by the background thread, so it loops forever. (BTW, that code terminates without volatile on JRE7.)
Now consider this class:
public class Bean {
    private boolean field = true;

    public boolean getField() {
        return field;
    }

    public void setField(boolean value) {
        field = value;
    }
}
and a thread as follows:
public class Worker implements Runnable {
    private Bean b;

    public Worker(Bean b) {
        this.b = b;
    }

    @Override
    public void run() {
        while (b.getField()) {
            System.err.println("Waiting...");
            try { Thread.sleep(1000); }
            catch (InterruptedException ie) { return; }
        }
    }
}
The above code works as expected without using volatile:
public class VolatileTest {
    public static void main(String[] args) throws Exception {
        Bean b = new Bean();
        Thread t = new Thread(new Worker(b));
        t.start();
        Thread.sleep(3000);
        b.setField(false); // stops the child thread
        System.err.println("Waiting for the child thread to quit");
        t.join();
        // If the code gets here, the child thread has stopped,
        // and it really does with JRE 7 and 6, with both -server and -client.
    }
}
I think because of the public setter, the compiler/JVM should never optimize the code which calls getField(), but this article says that there is some "Volatile Bean" pattern (Pattern #4), which should be applied to create mutable thread-safe classes. Update: maybe that article applies for IBM JVM only?
The question is: which part of JLS explicitly or implicitly says that private primitive fields with public getters/setters must be declared as volatile (or they don't have to)?
Sorry for the long question; I tried to explain the problem in detail. Let me know if something is unclear. Thanks.
The question is: which part of JLS explicitly or implicitly says that private primitive fields with public getters/setters must be declared as volatile (or they don't have to)?
The JLS memory model doesn't care about getters/setters. They're no-ops from the memory model perspective - you could as well be accessing public fields. Wrapping the boolean behind a method call doesn't affect its memory visibility. Your latter example works purely by luck.
Should we declare private fields as volatile if the instances are used in multiple threads?
If a class (bean) is to be used in a multithreaded environment, you must somehow take that into account. Making private fields volatile is one approach: it ensures that each thread is guaranteed to see the latest value of that field, rather than some stale, cached or optimized-away value. But it doesn't solve the problem of atomicity.
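For instance, even with volatile, a read-modify-write such as counter++ is still three separate steps, so two threads can interleave and lose an update. A small sketch of that gap (the class is illustrative, not from the question):

public class VolatileIsNotAtomic {
    private volatile int counter = 0;

    public void increment() {
        // Not atomic: read, add, write. Two threads can both read the same value
        // and one increment is lost, even though counter is volatile.
        counter++;
    }

    public int get() {
        return counter;
    }
}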
The article you linked to applies to any JVM that adheres to the JVM specification (which the JLS leans on). You will get varying results depending on the JVM vendor, version, flags, computer and OS, the number of times you run the program (HotSpot optimizations often kick in after the 10000th run), etc., so you really must understand the spec and carefully adhere to the rules in order to create reliable programs. Experimenting is a poor way to find out how things work in this case, because the JVM can behave in any way it wants as long as it falls within the spec, and most JVMs contain loads of dynamic optimizations.
Before I answer your question I want to address
BTW, that code terminates without volatile on JRE7
This can change if you deploy the same application with different runtime arguments. Hoisting isn't necessarily applied by every JVM, so the code can work on one and not on another.
To answer your question there is nothing preventing the Java compiler from executing your latter example like so
@Override
public void run() {
    if (b.getField()) {
        while (true) {
            System.err.println("Waiting...");
            try { Thread.sleep(1000); }
            catch (InterruptedException ie) { return; }
        }
    }
}
It is still sequentially consistent and thus maintains Java's guarantees - you can read specifically 17.4.3:
Among all the inter-thread actions performed by each thread t, the program order of t is a total order that reflects the order in which these actions would be performed according to the intra-thread semantics of t.
A set of actions is sequentially consistent if all actions occur in a total order (the execution order) that is consistent with program order, and furthermore, each read r of a variable v sees the value written by the write w to v such that:
In other words: so long as the thread itself sees the read and write of the field in an order consistent with program order, regardless of compiler/memory reordering, the execution is still considered sequentially consistent.
No, that code is just as incorrect. Nothing in the JLS says a field must be declared as volatile. However, if you want your code to work correctly in a multi-threaded environment, then you have to obey the visibility rules. volatile and synchronized are two of the major facilities for correctly making data visible across threads.
As for your example, the difficulty of writing multi-threaded code is that many forms of incorrect code work fine in testing. Just because a multi-threaded test "succeeds" in testing does not mean it is correct code.
For the specific JLS reference, see the Happens Before section (and the rest of the page).
Note, as a general rule of thumb, if you think you have come up with a clever new way to get around "standard" thread-safe idioms, you are most likely wrong.
So I've been reading about concurrency and have some questions along the way (the guide I followed - though I'm not sure if it's the best source):
Processes vs. Threads: Is the difference basically that a process is the program as a whole while a thread can be a (small) part of a program?
I am not exactly sure why there is an interrupted() method and an InterruptedException. Why should the interrupted() method even be used? It just seems to me that Java adds an extra layer of indirection.
For synchronization (and specifically the one in that link), how does adding the synchronized keyword even fix the problem? I mean, if Thread A gives back its incremented c and Thread B gives back the decremented c and stores it in some other variable, I am not exactly sure how the problem is solved. Is it supposed to be assumed that after one of the threads returns an answer, it terminates? And if that is the case, why would adding synchronized make a difference?
I read (from some random PDF) that if you start() two threads one after the other, you cannot guarantee that the first thread will run before the second thread. How would you guarantee it, though?
In synchronization statements, I am not completely sure what the point of adding synchronized within the method is. What is wrong with leaving it out? Is it because one expects both to mutate separately, but to be obtained together? Why not just leave the two non-synchronized?
Is volatile just a keyword for variables, and is it synonymous with synchronized?
In the deadlock problem, how does synchronize even help the situation? What makes this situation different from starting two threads that change a variable?
Moreover, where is the "wait"/lock for the other person to bowBack? I would have thought that bow() was blocked, not bowBack().
I'll stop here because I think if I went any further without these questions answered, I will not be able to understand the later lessons.
Answers:
Yes, a process is an operating system process that has an address space, a thread is a unit of execution, and there can be multiple units of execution in a process.
The interrupt() method and InterruptedException are generally used to wake up threads that are waiting to either have them do something or terminate.
Synchronizing is a form of mutual exclusion or locking, something very standard and required in computer programming. Google these terms and read up on that and you will have your answer.
True, this cannot be guaranteed, you would have to have some mechanism, involving synchronization that the threads used to make sure they ran in the desired order. This would be specific to the code in the threads.
See answer to #3
Volatile is a way to make sure that a particular variable can be properly shared between different threads. It is necessary on multi-processor machines (which almost everyone has these days) to make sure the value of the variable is consistent between the processors. It is effectively a way to synchronize a single value.
Read about deadlocking in more general terms to understand this. Once you first understand mutual exclusion and locking you will be able to understand how deadlocks can happen.
I have not read the materials that you read, so I don't understand this one. Sorry.
I find that the examples used to explain synchronization and volatility are contrived and difficult to understand the purpose of. Here are my preferred examples:
Synchronized:
private Value value;

public void setValue(Value v) {
    value = v;
}

public void doSomething() {
    if (value != null) {
        doFirstThing();
        int val = value.getInt(); // Will throw NullPointerException if another
                                  // thread calls setValue(null)
        doSecondThing(val);
    }
}
The above code is perfectly correct if run in a single-threaded environment. However with even 2 threads there is the possibility that value will be changed in between the check and when it is used. This is because the method doSomething() is not atomic.
To address this, use synchronization:
private Value value;
private Object lock = new Object();

public void setValue(Value v) {
    synchronized (lock) {
        value = v;
    }
}

public void doSomething() {
    synchronized (lock) { // Prevents setValue being called by another thread.
        if (value != null) {
            doFirstThing();
            int val = value.getInt(); // Cannot throw NullPointerException.
            doSecondThing(val);
        }
    }
}
Volatile:
private boolean running = true;

// Called by Thread 1.
public void run() {
    while (running) {
        doSomething();
    }
}

// Called by Thread 2.
public void stop() {
    running = false;
}
To explain this requires knowledge of the Java Memory Model. It is worth reading about in depth, but the short version for this example is that threads may keep their own copies of variables, which are only guaranteed to be reconciled with main memory at a synchronized block or when a volatile variable is accessed. The Java compiler (specifically the JIT) is therefore allowed to optimise the code into this:
public void run() {
    while (true) { // Will never end
        doSomething();
    }
}
To prevent this optimisation you can declare the variable volatile, which forces the thread to go back to main memory every time it reads the variable. Note that this is unnecessary if you are using synchronized statements, as both keywords cause a sync with main memory.
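Concretely, the fix for the example above is just the declaration (a one-line sketch):

private volatile boolean running = true; // reads now always see the latest write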
I haven't addressed your questions directly, as Francis did that. I hope these examples give you a better idea of the concepts than the examples you saw in the Oracle tutorial.
I am referring to pages 261-262 of Joshua Bloch's Effective Java:
import java.util.concurrent.TimeUnit;

// Properly synchronized cooperative thread termination
public class StopThread {
    private static boolean stopRequested;

    private static synchronized void requestStop() {
        stopRequested = true;
    }

    private static synchronized boolean stopRequested() {
        return stopRequested;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread backgroundThread = new Thread(new Runnable() {
            public void run() {
                int i = 0;
                while (!stopRequested())
                    i++;
            }
        });
        backgroundThread.start();
        TimeUnit.SECONDS.sleep(1);
        requestStop();
    }
}
Note that both the write method (requestStop) and the read method (stopRequested) are synchronized. It is not sufficient to synchronize only the write method! In fact, synchronization has no effect unless both read and write operations are synchronized.
Joshua's example is synchronized on this. However, my doubt is: must synchronized act on the same object? Say, if I change the code to
private static void requestStop() {
    synchronized (other_static_final_object_monitor) {
        stopRequested = true;
    }
}

private static synchronized boolean stopRequested() {
    return stopRequested;
}
will this still be able to avoid the liveness failure?
That is, we know that grabbing the monitor of the same object around both the read and the write avoids the liveness failure (according to Joshua Bloch's example). But what about grabbing monitors of different objects for the read and the write?
I don't believe it's guaranteed, although I wouldn't be surprised if it actually was okay in all existing implementations. The Java Language Specification, section 17.4.4 states this:
An unlock action on monitor m synchronizes-with all subsequent lock actions on m (where subsequent is defined according to the synchronization order).
I believe that all the safety of reading/writing shared variables within locks stems from that bullet point in the spec - and that only specifies anything about a lock and an unlock action on a single monitor.
EDIT: Even if this did work for a single variable, you wouldn't want to use it for multiple variables. If you update multiple variables while holding a monitor and only read from them when holding a monitor, you can ensure that you always read a consistent set of data: nothing's going to write to variable Y before you've read that but after you've read variable X. If you use different monitors for reading and writing, that consistency goes away: the values could be changed at any time while you're reading them.
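A small sketch of that consistency point, with illustrative field names: both fields are written and read under the same monitor, so a reader can never see y from one update paired with x from another:

public class Pair {
    private final Object lock = new Object();
    private int x;
    private int y;

    public void update(int newX, int newY) {
        synchronized (lock) {
            x = newX;
            y = newY;
        }
    }

    public int[] read() {
        synchronized (lock) {
            // Both values come from the same update; no torn read is possible.
            return new int[] { x, y };
        }
    }
}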
Possibly, but there are no guarantees, and it could be highly platform dependent. In your case there is no real test for liveness, so if the value arrives a few milliseconds late your application will appear to work correctly anyway. The application would eventually stop even without any synchronization, and you might not see the difference.
The problem with memory consistency errors is that I have seen examples where something is updated correctly in a test a billion times and then fails when a different program is running on the system. This is why guaranteed behaviour is more interesting.
According to the Java Language Specification,
"We say that a read r of a variable v is allowed to observe a write w to v if, in the happens-before partial order of the execution trace:
r is not ordered before w (i.e., it is not the case that hb(r, w)), and
there is no intervening write w' to v (i.e., no write w' to v such that hb(w, w') and hb(w', r)).
Informally, a read r is allowed to see the result of a write w if there is no happens-before ordering to prevent that read."
This means that unless there is some explicit synchronization action that causes multiple threads to interleave their actions in some predictable way (i.e. there's a good happens-before relationship defined on their actions), then a thread is allowed to see pretty much any value of a variable at any point where it was written to.
If you synchronize on multiple different objects, there is no happens-before relationship connecting the reader and the writer. This means that the reading thread can keep seeing whatever value it wants for the stopRequested variable, which could either be the first value forever, or the new value as soon as its updated, or something delightfully in-between the two.
Theoretically it's wrong. Per the language spec (3rd edition), the background thread may not see the update.
Practically it'll work. The VM just isn't smart enough to optimize to such a degree. (In older versions of Java, whose threading spec was worded differently, it is possible that your suggestion was correct even in theory.)
In any case, don't do it.
If you use a different monitor, there is no synchronization. No other code is requesting the monitor of this or other_static_final_object_monitor.
Using a static object to synchronize is only useful if you want to synchronize across classes and within methods.
Also, NEVER use a String as a lock/monitor. Always use something like this:
static final Object LOCK = new Object();