java: `volatile` private fields with getters and setters

Should we declare private fields as volatile if the instances are used by multiple threads?
In Effective Java, there is an example where the code doesn't work without volatile:
import java.util.concurrent.TimeUnit;
// Broken! - How long would you expect this program to run?
public class StopThread {
private static boolean stopRequested; // works, if volatile is here
public static void main(String[] args) throws InterruptedException {
Thread backgroundThread = new Thread(new Runnable() {
public void run() {
int i = 0;
while (!stopRequested)
i++;
}
});
backgroundThread.start();
TimeUnit.SECONDS.sleep(1);
stopRequested = true;
}
}
The explanation says that
while(!stopRequested)
i++;
is optimized to something like this:
if(!stopRequested)
while(true)
i++;
so further modifications of stopRequested aren't seen by the background thread, so it loops forever. (BTW, that code terminates without volatile on JRE7.)
Now consider this class:
public class Bean {
private boolean field = true;
public boolean getField() {
return field;
}
public void setField(boolean value) {
field = value;
}
}
and a thread as follows:
public class Worker implements Runnable {
private Bean b;
public Worker(Bean b) {
this.b = b;
}
@Override
public void run() {
while(b.getField()) {
System.err.println("Waiting...");
try { Thread.sleep(1000); }
catch(InterruptedException ie) { return; }
}
}
}
The above code works as expected without using volatiles:
public class VolatileTest {
public static void main(String [] args) throws Exception {
Bean b = new Bean();
Thread t = new Thread(new Worker(b));
t.start();
Thread.sleep(3000);
b.setField(false); // stops the child thread
System.err.println("Waiting for the child thread to quit");
t.join();
// if the code gets here, the child thread has stopped
// and it really does get here, with JRE7 and with JRE6 using -server or -client
}
}
I think that, because of the public setter, the compiler/JVM should never optimize the code which calls getField(), but this article says that there is a "Volatile Bean" pattern (Pattern #4), which should be applied to create mutable thread-safe classes. Update: maybe that article applies to the IBM JVM only?
The question is: which part of JLS explicitly or implicitly says that private primitive fields with public getters/setters must be declared as volatile (or they don't have to)?
Sorry for a long question, I tried to explain the problem in details. Let me know if something is not clear. Thanks.

The question is: which part of JLS explicitly or implicitly says that private primitive fields with public getters/setters must be declared as volatile (or they don't have to)?
The JLS memory model doesn't care about getters/setters. They're no-ops from the memory model perspective - you could as well be accessing public fields. Wrapping the boolean behind a method call doesn't affect its memory visibility. Your latter example works purely by luck.
Should we declare private fields as volatile if the instances are used by multiple threads?
If a class (bean) is to be used in a multithreaded environment, you must take that into account somehow. Making private fields volatile is one approach: it guarantees that each thread sees the latest value written to that field rather than a cached or optimized-away stale value. But it doesn't solve the problem of atomicity.
The article you linked to applies to any JVM that adheres to the JVM specification (which the JLS leans on). You will get varying results depending on the JVM vendor, version, flags, computer and OS, how many times the code has run (HotSpot's JIT optimizations often kick in only after a method or loop has executed around 10,000 times) etc., so you really must understand the spec and carefully adhere to its rules in order to create reliable programs. Experimenting is a poor way to find out how things work in this case, because the JVM can behave in any way it wants as long as it stays within the spec, and most JVMs contain loads of dynamic optimizations.
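The remark that volatile gives visibility but not atomicity can be illustrated with a small sketch. The class and method names below (VolatileVsAtomic, run) are made up for this illustration and do not come from the question. Two threads increment a volatile int and an AtomicInteger the same number of times; the volatile counter may lose updates because ++ is a read-modify-write sequence, while the atomic counter cannot.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the original question.
class VolatileVsAtomic {
    // Every read sees the latest write, but ++ is still three steps
    // (read, add, write), so concurrent increments can overwrite each other.
    static volatile int volatileCounter;
    static final AtomicInteger atomicCounter = new AtomicInteger();

    // Runs two threads that each increment both counters perThread times.
    // Returns {volatileCounter, atomicCounter} after both threads have finished.
    static int[] run(int perThread) {
        volatileCounter = 0;
        atomicCounter.set(0);
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                volatileCounter++;               // visible, but not atomic
                atomicCounter.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { volatileCounter, atomicCounter.get() };
    }

    public static void main(String[] args) {
        int[] result = run(100_000);
        System.out.println("volatile counter: " + result[0] + " (may be below 200000)");
        System.out.println("atomic counter:   " + result[1]);
    }
}
```

How often the volatile counter actually falls short depends on the machine and JVM, which is exactly why a passing test run proves little here.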

Before I answer your question I want to address
BTW, that code terminates without volatile on JRE7
This can change if you deploy the same application with different runtime arguments. Hoisting isn't necessarily applied by default by every JVM, so the code can work on one and not on another.
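For comparison, here is a sketch of the StopThread example with the flag declared volatile, which is the variant that is guaranteed to terminate regardless of runtime arguments. The helper runAndStop and the class name StopThreadFixed are assumptions added for illustration; the original example only has main.

```java
import java.util.concurrent.TimeUnit;

// Sketch based on the StopThread example from the question.
class StopThreadFixed {
    // volatile creates a happens-before edge between the write below
    // and every later read in the loop, so the read cannot be hoisted.
    private static volatile boolean stopRequested;

    // Hypothetical helper: starts the worker, requests a stop after
    // delayMillis, and reports whether the worker actually terminated.
    static boolean runAndStop(long delayMillis) {
        stopRequested = false;
        Thread backgroundThread = new Thread(() -> {
            long i = 0;
            while (!stopRequested) {
                i++;
            }
        });
        backgroundThread.start();
        try {
            TimeUnit.MILLISECONDS.sleep(delayMillis);
            stopRequested = true;
            backgroundThread.join(5_000); // wait up to five seconds
        } catch (InterruptedException e) {
            stopRequested = true;         // still let the worker finish
            Thread.currentThread().interrupt();
        }
        return !backgroundThread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("terminated: " + runAndStop(1_000));
    }
}
```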
To answer your question: there is nothing preventing the Java compiler from executing your latter example like so:
@Override
public void run() {
if(b.getField()){
while(true) {
System.err.println("Waiting...");
try { Thread.sleep(1000); }
catch(InterruptedException ie) { return; }
}
}
}
It is still sequentially consistent and thus maintains Java's guarantees - you can read specifically 17.4.3:
Among all the inter-thread actions performed by each thread t, the
program order of t is a total order that reflects the order in which
these actions would be performed according to the intra-thread
semantics of t.
A set of actions is sequentially consistent if all actions occur in a
total order (the execution order) that is consistent with program
order, and furthermore, each read r of a variable v sees the value
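written by the write w to v such that:
w comes before r in the execution order, and
there is no other write w' such that w comes before w' and w' comes before r in the execution order.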
In other words: as long as a thread sees the reads and writes of a field in the same order, regardless of compiler or memory reordering, the execution is considered sequentially consistent.

No, that code is just as incorrect. Nothing in the JLS says a field must be declared as volatile. However, if you want your code to work correctly in a multi-threaded environment, then you have to obey the visibility rules. volatile and synchronized are two of the major facilities for correctly making data visible across threads.
As for your example, the difficulty of writing multi-threaded code is that many forms of incorrect code work fine in testing. Just because a multi-threaded test "succeeds" in testing does not mean it is correct code.
For the specific JLS reference, see the Happens Before section (and the rest of the page).
Note, as a general rule of thumb, if you think you have come up with a clever new way to get around "standard" thread-safe idioms, you are most likely wrong.
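As a sketch of the synchronized route mentioned above, the Bean from the question can guard its field with the object's own monitor. The names SynchronizedBean, SynchronizedBeanDemo and workerStops are hypothetical. Releasing the monitor in setField happens-before any later acquisition of the same monitor in getField, so the worker is guaranteed to see the update.

```java
// Sketch: the question's Bean, with both accessors synchronized on "this".
class SynchronizedBean {
    private boolean field = true; // guarded by "this"

    public synchronized boolean getField() {
        return field;
    }

    public synchronized void setField(boolean value) {
        field = value;
    }
}

// Hypothetical demo harness, not part of the original question.
class SynchronizedBeanDemo {
    // Returns true if the worker observed the update and terminated.
    static boolean workerStops() {
        SynchronizedBean bean = new SynchronizedBean();
        Thread worker = new Thread(() -> {
            while (bean.getField()) {
                Thread.yield(); // placeholder for real work
            }
        });
        worker.start();
        bean.setField(false);
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStops());
    }
}
```

Note that both the read and the write must go through the same lock; synchronizing only the setter would not establish the happens-before edge.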

Related

Using a boolean to coordinate two threads in Java

I understand the code section below is problematic because the new value of isAlive set in the kill method might not be visible in the thread.
public class MyClass extends Thread {
private boolean isAlive;
public void run() {
while(isAlive) {
....
}
}
public void kill() {
isAlive = false;
}
}
The typical fix is to declare the isAlive variable as volatile.
My question is: are there any other ways to achieve this without using volatile? Does Java provide other mechanisms for this?
EDIT: Synchronizing the method is also not an option.

There is no good reason to go for a different option than volatile. Volatile is needed to provide the appropriate happens-before edge between writing and reading; otherwise you have a data-race on your hands and as a consequence the write to the flag might never be seen. E.g. the compiler could hoist the read of the variable out of a loop.
There are cheaper alternatives that provide more relaxed ordering guarantees than the sequential consistency that volatile provides, e.g. acquire/release or opaque (check out the Atomic classes and VarHandle). But these should only be used in the very rare situations where the ordering constraints of volatile reduce performance because they limit compiler optimizations and require fences at the hardware level.
Long story short: make the variable volatile because it is a simple and very fast solution.
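Purely to show what the relaxed alternative looks like, here is a minimal sketch using the acquire/release accessors on AtomicBoolean (available since Java 9). The class name RelaxedFlagWorker and the helper startAndKill are made up for this illustration, and for a simple stop flag plain volatile remains the sensible default.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical worker; requires Java 9 or later for getAcquire/setRelease.
class RelaxedFlagWorker extends Thread {
    private final AtomicBoolean alive = new AtomicBoolean(true);

    @Override
    public void run() {
        // Acquire read: weaker than a volatile read, but the write
        // made by kill() is still guaranteed to become visible.
        while (alive.getAcquire()) {
            Thread.onSpinWait(); // hint to the runtime that this is a spin loop
        }
    }

    public void kill() {
        alive.setRelease(false); // release write, pairs with getAcquire above
    }

    // Hypothetical helper: returns true if the worker terminated after kill().
    static boolean startAndKill() {
        RelaxedFlagWorker worker = new RelaxedFlagWorker();
        worker.start();
        worker.kill();
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + startAndKill());
    }
}
```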

There are three options:
1. Make the shared variable volatile. (This is the simplest way.)
2. Use synchronized, either in the form of synchronized methods or synchronized blocks. Note that you need to do both reads and writes of the shared variable while holding the (same) mutex.
3. Use one of the classes in java.util.concurrent that has a "synchronizing effect" [1]. Or more precisely, one that you can use to get a happens-before relationship between the update and the subsequent read of the isAlive variable. This will be documented in the respective classes' javadocs.
If you don't use one of those options, it is not guaranteed [2] that the thread that calls run() will see the isAlive variable change from true to false.
If you want to understand the deep technical reasons why this is so, read Chapter 17.4 of the Java Language Specification, where it specifies the Java Memory Model. (It will explain what happens-before means in this context.)
[1] - One of the Lock classes would be an obvious choice.
[2] - That is to say ... your code may not work 100% reliably on all platforms. This is the kind of problem where "try it and see" or even extensive testing cannot show conclusively that your code is correct.
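The third option could look roughly like the sketch below, which guards the flag with a ReentrantLock. The names LockGuardedWorker, keepRunning and startAndKill are assumptions for this illustration (the flag is called alive rather than isAlive because Thread already declares a final isAlive() method). Both the read and the write take the same lock, which is what provides the happens-before relationship.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical worker that guards its stop flag with an explicit lock.
class LockGuardedWorker extends Thread {
    private final Lock lock = new ReentrantLock();
    private boolean alive = true; // guarded by lock

    private boolean keepRunning() {
        lock.lock();
        try {
            return alive;
        } finally {
            lock.unlock();
        }
    }

    public void kill() {
        lock.lock();
        try {
            alive = false;
        } finally {
            lock.unlock();
        }
    }

    @Override
    public void run() {
        while (keepRunning()) {
            Thread.yield(); // placeholder for real work
        }
    }

    // Hypothetical helper: returns true if the worker terminated after kill().
    static boolean startAndKill() {
        LockGuardedWorker worker = new LockGuardedWorker();
        worker.start();
        worker.kill();
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + startAndKill());
    }
}
```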

The wait/notify mechanism is embedded deep in the heart of the Java language. Object, the superclass of all classes, has five methods that form the core of the mechanism: notify(), notifyAll(), wait(), wait(long), and wait(long, int). All classes in Java inherit them from Object, and none of them can be overridden in a subclass because they are all declared final.
Here is an example that may help you understand the concept:
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

public class NotifyAndWait {
    private final List<String> list;

    public NotifyAndWait() {
        list = Collections.synchronizedList(new LinkedList<String>());
    }

    public String removeItem() throws InterruptedException {
        synchronized (list) {
            // Wait in a loop to guard against spurious wakeups.
            while (list.isEmpty()) {
                list.wait();
            }
            // Remove while still holding the lock, so that no other
            // thread can empty the list between the check and the removal.
            return list.remove(0);
        }
    }

    public void addItem(String item) {
        synchronized (list) {
            list.add(item);
            // After adding, notify all waiting threads that the list has changed.
            list.notifyAll();
        }
    }

    public static void main(String... args) throws Exception {
        final NotifyAndWait obj = new NotifyAndWait();
        Runnable runA = new Runnable() {
            public void run() {
                try {
                    String item = obj.removeItem();
                    System.out.println("Removed: " + item);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        Runnable runB = new Runnable() {
            public void run() {
                obj.addItem("Hello");
            }
        };
        Thread t1 = new Thread(runA, "T1");
        t1.start();
        Thread.sleep(500);
        Thread t2 = new Thread(runB, "T2");
        t2.start();
        Thread.sleep(1000);
    }
}

As far as I know, polling a boolean in a while loop as a "kill" control is a perfectly reasonable thing to do. Brian Goetz, in "Java Concurrency in Practice", has a code example that is very similar to yours, on page 137 (section 7.1 Task Cancellation).
He states that making the boolean volatile gives the pattern greater reliability. He has a good description of the mechanism on page 38.
When a field is declared volatile, the compiler and runtime are put on
notice that this variable is shared and that operations on it should
not be reordered with other memory operations. Volatile variables are
not cached in registers or in caches where they are hidden from other
processors, so a read of a volatile variable always returns the most
recent write by any thread.
I use volatile booleans and loose coupling as my main method of communication that must occur across threads. AFAIK, volatile has a smaller cost than synchronized and is the most recommended pattern for this situation.

Java multithread visibility?

I have read in a book that, because of compiler optimization, code execution might be reordered in a way that causes the ReaderThread to loop forever. How is that possible?
public class NoVisibility {
private static boolean ready;
private static int number;
private static class ReaderThread extends Thread {
public void run() {
while (!ready)
Thread.yield();
System.out.println(number);
}
}
public static void main(String[] args) {
new ReaderThread().start();
number = 42;
ready = true;
}
}

How is that possible?
Code reordering is possible (in general) because the Java Language Specification (JLS) says it is possible. However, reordering is (probably) not going to be the problem here. Rather, an infinite loop is likely to be due to hardware memory cache behavior.
In this case, there is nothing in the JLS that requires the writes to the variables made by the main method to be visible to the child thread. The technical explanation is that there is no happens-before chain linking the writes to the (subsequent) reads. Without the crucial happens-before chain, visibility is not guaranteed.
Note that, whether there is actually an infinite loop here will depend on all sorts of factors. The point is that it is a possibility given the way the example code is written.
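To make the happens-before point concrete, here is a sketch of the same program with ready declared volatile. The class name SafeVisibility and the helper publishAndRead are made up for this illustration. The write to number comes before the volatile write to ready in program order, and the volatile read of ready that sees true comes before the read of number, so the reader is guaranteed to see 42.

```java
// Sketch: NoVisibility with the missing happens-before chain added.
class SafeVisibility {
    private static volatile boolean ready; // volatile write also publishes number
    private static int number;

    // Hypothetical helper: returns the value of number seen by the reader thread.
    static int publishAndRead() {
        ready = false;
        number = 0;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield();
            }
            seen[0] = number; // ordered after the volatile read of ready
        });
        reader.start();
        number = 42;   // ordinary write, ordered before ...
        ready = true;  // ... this volatile write
        try {
            reader.join(); // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead());
    }
}
```

Only ready needs to be volatile here; number piggybacks on the ordering that the volatile write and read establish.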

Maybe it is caused by the CPU cache. The ReaderThread may see an outdated value of the "ready" variable, so it may never break out of the while loop. Such an outdated value is called stale data.

Are the effects of not using volatile keyword platform specific?

On Ubuntu 13.04 x64 with OpenJDK 1.7, using or not using the volatile keyword has no effect, in the sense that the program executes as expected without volatile.
I want to know the exact reason for this, and why it doesn't fail every time as it does on Windows with Oracle's JVM.
I know what volatile guarantees and when it should be used. This is not the question.
example:
public class VolatileTest {
private static boolean test;
public static void main(String... args) {
Thread a = new Thread(new Runnable() {
@Override
public void run() {
try {
Thread.sleep(3000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
test = true;
}
});
Thread b = new Thread(new Runnable() {
@Override
public void run() {
while(!test) {
try {
Thread.sleep(500);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.out.println(test);
}
}
});
a.start();
b.start();
}}

Volatile is not platform-specific in its effects. It complies with the specification set out in the Java Memory Model. Its implementation in the JVM obviously differs between platforms, for example in how it enforces memory barriers.
Can you provide sample code? Your test is probably misleading.

Are the effects of not using volatile keyword platform specific?
The Java Memory Model describes how a program must behave when it is properly synchronized, but does not say much about how a program may behave when it is not (apart from the causality requirement, for example). So in theory, it is platform specific.
In practice, it is platform and JVM specific. Typically, x86 architectures have a fairly strong memory model, and removing the volatile keyword will often not break your program (i.e. the code still behaves as if the variable were volatile).
Some (generally older) processors are even sequentially consistent, which means that your program will behave as if everything were synchronized.
On the other hand, on processors with weaker memory models such as ARM, it is easier to observe the effects of a broken multi-threaded program.
Similarly, some JVMs are more aggressive in their optimisations and will reorder instructions, hoist variables etc. differently. With a given JVM, the parameters you use can have an effect too. For example, on HotSpot, if you disable JIT compilation, you will probably "fix" some thread safety issues.
See also this post on memory models.

A field may be declared volatile, in which case the Java Memory Model ensures
that all threads see a consistent value for the variable. -- JLS 8.3.1.4
Volatile guarantees a consistent view of the variable. Native implementations of volatile prohibit the CPU/core from keeping the variable in its registers for the computations performed by the thread, so that all threads running on other CPUs/cores have a consistent view of that variable.
You cannot conclude that volatile makes no difference by running a few test cases. These kinds of issues may show up in only 1 out of thousands of tests.

Declaring a field volatile ensures that all threads see the current value of the field. Leaving a field non-volatile does not mean other threads will never see a modification made by another thread; it only means they are not guaranteed to.
Modifying your example a little:
public class VolatileTest {
private static boolean test = false;
public static void main(String... args) {
Thread a = new Thread(new Runnable() {
@Override
public void run() {
boolean localTest = test;
try {
Thread.sleep(3000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.out.println("changing local test");
test = !localTest;
}
});
Thread b = new Thread(new Runnable() {
@Override
public void run() {
boolean localTest = test;
while(!localTest) {
localTest = test;
}
}
});
a.start();
b.start();
}}
This version reads the field test a great many times, so the JVM ends up caching the value of test for thread b. Making thread b sleep means test is read far less often, so the JVM doesn't try to "optimize" the loop (by caching the value of test for thread b). If test were declared volatile, the JVM would not be allowed to cache its value at all.
The test was run on Ubuntu Lucid 64 bits, with Oracle's Java 1.7.0.

The example code you have provided does not exercise the loop enough to get JIT optimised. It will run in the interpreter, and as such your code will (most likely) not be subject to any compiler optimisation tricks that will cause it to fail. The CPU itself is also unlikely to pull tricks advanced enough to thwart this code, especially considering the relatively huge timescales involved here with those sleeps.
However, this is an implementation detail of your particular JVM. Your program is racy even if the bug will never manifest on your particular combination of hardware, JVM and operating system.

For performance optimisation, a compiler and JVM implementation may reorder code and also use registers or caches visible to just one thread. Obviously, this behaviour varies between implementations.
That means that an updated value of a variable may be visible to only one thread, while all other threads read a stale value.
If you want to safely access a variable from multiple threads, guaranteeing consistent reads of the most up-to-date value, you must use volatile. This behaviour provided by volatile is not platform-specific.
It is possible that you could fluke it under certain scenarios/implementations and see consistent up-to-date values without volatile, but that's pot luck and no way to program.

Am I properly implementing Java's synchronized construct?

I'm kind of confused about how to implement synchronized blocks in Java.
Here is an example situation:
public class SlotWheel extends Thread implements ActionListener
{
private int currentTick; // This instance variable is modified in two places
private synchronized void resetCurrentTicks()
{
currentTick = 0;
}
private synchronized void incrementCurrentTicks()
{
++currentTick;
}
public void actionPerformed(ActionEvent e)
{
resetCurrentTicks();
}
}
While the program is running, it's possible that a user clicks a button which invokes actionPerformed which then calls resetCurrentTicks. At the same time, the running thread is calling incrementCurrentTicks on each loop iteration.
Because I'm still new to Java and programming, I'm not sure if my implementation is protecting currentTick from becoming corrupted.
I have this feeling that my implementation would only work if incrementCurrentTicks were to be called in actionPerformed and in the running thread, but because I'm manipulating currentTick from different methods, my implementation is wrong.

Looks ok.
See http://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html
It is not possible for two invocations of synchronized methods on the same object to interleave
Of course you should consider whether it is the GUI thread trying to mess with the ticks or not. In your simple case it's probably ok, but in a more complex case you might want to push the "work" out of the GUI thread.

Your instincts are correct. It is difficult to synchronize access to class properties consistently across multiple methods. Rather than attempting to do so, I would recommend you take a look at java.util.concurrent.atomic.AtomicInteger. It provides you with safe, concurrent access to the underlying property without writing and testing a lot of boilerplate code.
Incorporating it into your code, you would end up with something like this:
public class SlotWheel extends Thread implements ActionListener {
private AtomicInteger currentTick = new AtomicInteger();
private void resetCurrentTicks() {
currentTick.set(0);
}
private void incrementCurrentTicks() {
currentTick.incrementAndGet();
}
public void actionPerformed(ActionEvent e)
{
resetCurrentTicks();
}
}

First off, Java guarantees that "scalar" values -- integers, chars, floats, etc. -- are atomic in themselves, in that you cannot simultaneously modify such a value and get a mixture of the two sources. (The exception is non-volatile long and double, whose writes JLS 17.7 allows to be split into two 32-bit halves.) You're guaranteed to get one value or the other of two "simultaneous" modifications. You can, however, get an inconsistent result from, e.g., x++, since two threads may attempt to simultaneously increment x and possibly only one increment might occur. (OTOH, two threads simultaneously performing x = 7; will obviously not interfere with each other -- simultaneous access does not cause an explosion or anything.)
Next, understand that the synchronized keyword is used in two slightly different contexts -- as a method modifier and as a block modifier. There is some modest difference between the two.
When used as a block modifier you say synchronized(object_ref) {some block}. In this case the synchronized statement gets a lock on the object identified by object_ref, and all other synchronized statements that might simultaneously attempt to execute referencing the same object will be held off while the current statement finishes its block.
When you use it as a method modifier, the function is the same except that, for a non-static method, the "this" object is the one that is locked, and the entire method is "protected" by the lock.
(For a static method, on the other hand, the Class object is locked -- a slightly special case, equivalent to synchronized(ClassName.class){some block} as a synchronized block.)
It's important to understand that for two synchronized blocks or methods to be prevented from simultaneously executing they must be referencing, as their synchronizing object, the same object, not simply one of the same class.
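A small sketch of that equivalence, with hypothetical names (TickCounter, countWithTwoThreads): one thread increments through a synchronized method, the other through a synchronized (this) block. Because both use the same monitor, no increment is lost, even though ticks++ on its own is not atomic.

```java
// Hypothetical counter showing that a synchronized method and a
// synchronized (this) block exclude each other on the same object.
class TickCounter {
    private int ticks; // guarded by "this"

    public synchronized void increment() {
        ticks++;
    }

    public void incrementWithBlock() {
        synchronized (this) { // same monitor as the synchronized methods
            ticks++;
        }
    }

    public synchronized int get() {
        return ticks;
    }

    // Hypothetical helper: two threads increment one shared counter
    // perThread times each; returns the final count.
    static int countWithTwoThreads(int perThread) {
        TickCounter counter = new TickCounter();
        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                counter.incrementWithBlock();
            }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(100_000));
    }
}
```

If the two threads used two different TickCounter instances, or one of the methods locked a different object, the mutual exclusion would disappear.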

You are correct, in that it is not safe. However, you can simply synchronize on any Object in scope, and remove "synchronized" from the method definition
public class MyThread {
private Object lock = new Object();
private int counter;
protected void threadMethod() {
synchronized (lock) {
counter++;
}
}
public void otherReset() {
synchronized (lock) {
counter = 0;
}
}
}

Questions on Concurrency from Java Guide

So I've been reading on concurrency and have some questions along the way (guide I followed - though I'm not sure if it's the best source):
1. Processes vs. Threads: Is the difference basically that a process is the program as a whole while a thread can be a (small) part of a program?
2. I am not exactly sure why there is an interrupted() method and an InterruptedException. Why should the interrupted() method even be used? It just seems to me that Java adds an extra layer of indirection.
3. For synchronization (and specifically about the one in that link), how does adding the synchronized keyword even fix the problem? I mean, if Thread A gives back its incremented c and Thread B gives back the decremented c and stores it to some other variable, I am not exactly sure how the problem is solved. I mean, this may be answering my own question, but is it supposed to be assumed that after one of the threads returns an answer, it terminates? And if that is the case, why would adding synchronized make a difference?
4. I read (from some random PDF) that if you have two Threads start() subsequently, you cannot guarantee that the first thread will occur before the second thread. How would you guarantee it, though?
5. In synchronization statements, I am not completely sure what the point is of adding synchronized within the method. What is wrong with leaving it out? Is it because one expects both to mutate separately, but to be obtained together? Why not just have the two non-synchronized?
6. Is volatile just a keyword for variables and is it synonymous with synchronized?
7. In the deadlock problem, how does synchronized even help the situation? What makes this situation different from starting two threads that change a variable?
8. Moreover, where is the "wait"/lock for the other person to bowBack? I would have thought that bow() was blocked, not bowBack().
I'll stop here because I think if I went any further without these questions answered, I will not be able to understand the later lessons.

Answers:
1. Yes, a process is an operating system process that has an address space, a thread is a unit of execution, and there can be multiple units of execution in a process.
2. The interrupt() method and InterruptedException are generally used to wake up threads that are waiting to either have them do something or terminate.
3. Synchronizing is a form of mutual exclusion or locking, something very standard and required in computer programming. Google these terms and read up on that and you will have your answer.
4. True, this cannot be guaranteed, you would have to have some mechanism, involving synchronization that the threads used to make sure they ran in the desired order. This would be specific to the code in the threads.
5. See answer to #3
6. Volatile is a way to make sure that a particular variable can be properly shared between different threads. It is necessary on multi-processor machines (which almost everyone has these days) to make sure the value of the variable is consistent between the processors. It is effectively a way to synchronize a single value.
7. Read about deadlocking in more general terms to understand this. Once you first understand mutual exclusion and locking you will be able to understand how deadlocks can happen.
8. I have not read the materials that you read, so I don't understand this one. Sorry.
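On the interruption question, a short sketch may help show why both an exception and a flag exist. The class and method names (InterruptDemo, interruptSleepingThread, interruptBusyThread) are made up for this illustration. A thread blocked in sleep() learns about the interrupt through InterruptedException, while a thread that never blocks has to poll the flag itself, for example with Thread.interrupted().

```java
// Hypothetical demo of the two ways a thread can notice an interrupt.
class InterruptDemo {
    // A blocked thread is woken up with an InterruptedException.
    static String interruptSleepingThread() {
        String[] result = { "not interrupted" };
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                result[0] = "InterruptedException";
            }
        });
        sleeper.start();
        sleeper.interrupt();
        joinQuietly(sleeper);
        return result[0];
    }

    // A busy thread never blocks, so it must check the flag itself.
    static String interruptBusyThread() {
        String[] result = { "not interrupted" };
        Thread busy = new Thread(() -> {
            while (!Thread.interrupted()) {
                // non-blocking work would go here
            }
            result[0] = "interrupted flag";
        });
        busy.start();
        busy.interrupt();
        joinQuietly(busy);
        return result[0];
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join(); // also makes the worker's write to result visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(interruptSleepingThread());
        System.out.println(interruptBusyThread());
    }
}
```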

I find that the examples used to explain synchronization and volatility are contrived and difficult to understand the purpose of. Here are my preferred examples:
Synchronized:
private Value value;
public void setValue(Value v) {
value = v;
}
public void doSomething() {
if(value != null) {
doFirstThing();
int val = value.getInt(); // Will throw NullPointerException if another
// thread calls setValue(null);
doSecondThing(val);
}
}
The above code is perfectly correct if run in a single-threaded environment. However with even 2 threads there is the possibility that value will be changed in between the check and when it is used. This is because the method doSomething() is not atomic.
To address this, use synchronization:
private Value value;
private Object lock = new Object();
public void setValue(Value v) {
synchronized(lock) {
value = v;
}
}
public void doSomething() {
synchronized(lock) { // Prevents setValue being called by another thread.
if(value != null) {
doFirstThing();
int val = value.getInt(); // Cannot throw NullPointerException.
doSecondThing(val);
}
}
}
Volatile:
private boolean running = true;
// Called by Thread 1.
public void run() {
while(running) {
doSomething();
}
}
// Called by Thread 2.
public void stop() {
running = false;
}
To explain this requires knowledge of the Java Memory Model. It is worth reading about in depth, but the short version for this example is that threads may work with their own copies of variables, which are only guaranteed to be synchronised with main memory at a synchronized block or when a volatile variable is accessed. The Java compiler (specifically the JIT) is allowed to optimise the code into this:
public void run() {
while(true) { // Will never end
doSomething();
}
}
To prevent this optimisation you can declare the variable volatile, which forces the thread to access main memory every time it reads the variable. Note that this is unnecessary if every access to the variable already happens inside synchronized statements on the same lock, as both keywords cause a sync to main memory.
I haven't addressed your questions directly as Francis did so. I hope these examples can give you an idea of the concepts in a better way than the examples you saw in the Oracle tutorial.
