Those who have developed professional, multi-threaded Java Spring applications can probably testify that the use of the volatile keyword (and of other threading controls, for that matter) is almost non-existent, despite the potentially disastrous consequences of missing it when needed.
Let me provide an example of very common code:
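import org.springframework.stereotype.Service;
@Service
public class FeatureFlagHolder {
    // Not volatile: writes by one thread are not guaranteed to become visible to others
    private boolean featureFlagActivated = false;
    public void activateFeatureFlag() {
        featureFlagActivated = true;
    }
    // similar code to de-activate
    public boolean isFeatureFlagActivated() {
        return featureFlagActivated;
    }
}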
Suppose the threads changing and reading the state of featureFlagActivated are different. The JVM would, AFAIK, be allowed to let the reading thread cache the value and never refresh it. In practice, I've never seen that happen. Actually, I've never even seen the boolean not being updated immediately on a read.
Why is that?
At the most basic level it has to be said that a lack of volatile doesn't guarantee that it will fail. It just means that the JVM is allowed to do optimizations that could lead to failure. But whether those optimizations happen and whether they then lead to failure is influenced by many different factors. Therefore it's often very hard to actually detect these problems, until they become catastrophic.
For starters, I'd like to summarize the conditions that frequently coincide when it does go wrong.
the non-volatile variable is usually read in a tight loop
the non-volatile variable is changed rarely, but when it changes it's "important" in some sense.
the amount of code executed inside that loop is small (roughly small enough to be fully inlined by an aggressive compiler)
the tight loop over-running has a very visible effect (for example it leads to an exception and not just silently doing unnecessary work).
Note that not all of those are necessary, but they tend to be true when I actually observe the issue.
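The conditions above can be put together in a small experiment. This is only a sketch under assumptions: SpinLoopSketch and spinnerExits are made-up names, and whether the plain-field spinner actually gets stuck depends on the JVM, the JIT and the hardware, so no particular outcome is promised for it. Only the volatile variant is guaranteed to terminate.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; a sketch, not a benchmark.
class SpinLoopSketch {
    static boolean plainStop;             // no cross-thread visibility guarantee
    static volatile boolean volatileStop; // a write is guaranteed to become visible

    // Starts a daemon spinner, sets the chosen flag after a short delay and
    // reports whether the spinner left its loop within the timeout.
    static boolean spinnerExits(boolean useVolatile, long timeoutMillis) {
        plainStop = false;
        volatileStop = false;
        Thread spinner = new Thread(() -> {
            long iterations = 0;
            if (useVolatile) {
                while (!volatileStop) { iterations++; }
            } else {
                // Tiny loop body: the JIT is allowed to read plainStop once
                // and never look at it again.
                while (!plainStop) { iterations++; }
            }
        });
        spinner.setDaemon(true); // a stuck spinner must not keep the JVM alive
        spinner.start();
        try {
            TimeUnit.MILLISECONDS.sleep(100); // give the loop time to get hot
            if (useVolatile) {
                volatileStop = true;
            } else {
                plainStop = true;
            }
            spinner.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !spinner.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("volatile flag, spinner exited: " + spinnerExits(true, 2000));
        System.out.println("plain flag, spinner exited: " + spinnerExits(false, 2000));
    }
}
```

If the second line reports true on your machine, that proves nothing about correctness; it only means the optimization did not happen this time.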
My personal interpretation (plus some reading on the topic) led me to these rules of thumb:
if reading the wrong value won't be noticed, then you simply won't notice if the volatile is missing. If the only bad thing that happens is that you run through a loop a couple of times unnecessarily, then chances are you will never realize that it happens.
when the reads of the non-volatile variable happen with enough "distance" between them (where distance is measured by other read accesses to other parts of memory) then it can often behave as if it was volatile, simply because it drops out of the cache
any kind of synchronization on anything inside the loops tends to have the effect of invalidating some caches at least and thus causes the variable to act as if it was volatile.
These three alone make it rather hard to actually spot the problem except in very extreme cases (i.e. when executing once too many causes a big crash in your system).
In your specific example, I assume that the feature flag is not something that will be toggled multiple times per second. It's more likely that it's set once per process and then stays untouched.
For example, if you have multiple incoming requests in the same second and halfway through the second you toggle the feature flag it can happen that some of the requests that happen after the toggling will still use the old value, due to having it cached from earlier.
Will you notice? Unlikely. It'll be extremely hard to distinguish "this request came in just before the change" from "this request came in just after the change and wrongly used the old value". If 6 out of 10 requests use the old value instead of the correct 5 out of 10, there's a good chance no one will ever notice.
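For completeness, the usual fix for the holder from the question is a one-word change. The sketch below is plain Java with the Spring annotation left out so that it stands alone; the class name VolatileFeatureFlagHolder and the deactivate method are my own additions, not code from the question.

```java
// Hypothetical plain-Java variant of the holder from the question.
class VolatileFeatureFlagHolder {
    // volatile: a write by one thread is guaranteed to be seen by later reads in other threads
    private volatile boolean featureFlagActivated = false;

    void activateFeatureFlag() {
        featureFlagActivated = true;
    }

    void deactivateFeatureFlag() {
        featureFlagActivated = false;
    }

    boolean isFeatureFlagActivated() {
        return featureFlagActivated;
    }
}
```

This is enough here because each method performs a single read or a single write of the flag; nothing reads the flag and then updates it based on what it saw.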
Related
Is there ever any harm in making a variable "volatile" in Java if it doesn't actually need to be marked volatile? ... or is it just "unnecessary" as I have often read.
As someone dabbling in multi-threading, but not a master computer scientist, I'm currently going with "if in doubt, make it volatile."
The obvious impact is some small performance impact because the compiler is forbidden from using certain optimizations. However the worse impact is the false sense of security. Just because a variable is volatile does not mean everything done with it is now threadsafe UNLESS all operations upon it are atomic (otherwise there could be a disconnect between the observation and the mutation of that variable).
Proper synchronization blocks are still needed. Your approach is inherently flawed. Sorry but it’s not that simple to get thread safety.
The third problem is it renders the true purpose of your code more obscure. If all variables are marked volatile then how is the reader to know which ones truly rely on that property and which ones don’t? Such obscurity creates a hidden cost in code maintenance burden, also known as “technical debt.”
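To make the "disconnect between the observation and the mutation" concrete, here is a minimal sketch. OneTimeInitializer and its method names are hypothetical. The first method uses a volatile field but is still racy, because the check and the write are two separate steps; the second uses AtomicBoolean.compareAndSet, which does both in one atomic operation.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example class.
class OneTimeInitializer {
    private volatile boolean startedRacy;
    private final AtomicBoolean started = new AtomicBoolean(false);

    // Still broken despite volatile: two threads can both read false
    // before either of them writes true, so both "win".
    boolean tryStartRacy() {
        if (!startedRacy) {
            startedRacy = true;
            return true;
        }
        return false;
    }

    // Check and update happen atomically: exactly one caller gets true.
    boolean tryStart() {
        return started.compareAndSet(false, true);
    }
}
```

A synchronized method around the check-then-set would work just as well; the point is only that volatile by itself does not turn two steps into one.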
It can have performance implications, but that's it.
There may be another danger: that you lull yourself into thinking that since everything is volatile, your code is thread-safe. That's not the case, and there's no substitute for actually understanding the threading implications of your code. If you do that, you won't need to mark things volatile "just in case."
But to your immediate question: no, you won't ever take functionally correct code and break it by making a variable volatile.
If multiple threads try to update the same member variable, it is called a race condition. But I was more interested in knowing how the JVM handles it internally if we don't handle it in our code by making it synchronised or something else? Will it hang my program? How will the JVM react to it? I thought the JVM would temporarily create a sync block for this situation, but I'm not sure what exactly would be happening.
If any of you have some insight, it would be good to know.
The precise term is a data race, which is a specialization of the general concept of a race condition. The term data race is an official, precisely specified concept, which means that it arises from a formal analysis of the code.
The only way to get the real picture is to go and study the Memory Model chapter of the Java Language Specification, but this is a simplified view: whenever you have a data race, there is almost no guarantee as to the outcome and a reading thread may see any value which has ever been written to the variable. Therein also lies the only guarantee: the thread will not observe an "out-of-thin-air" value, i.e. one that was never written. Well, unless you're dealing with non-volatile longs or doubles; then you may see torn writes.
Maybe I'm missing something but what is there to handle? There is still a thread that will get there first. Depending on which thread that is, that thread will just update/read some variable and proceed to the next instruction. It can't magically construct a sync block, it doesn't really know what you want to do. So in other words what happens will depend on the outcome of the 'race'.
Note I'm not heavily into the lower level stuff so perhaps I don't fully understand the depth of your question.
Java provides synchronized and volatile to deal with these situations. Using them properly can be frustratingly difficult, but keep in mind that Java is only exposing the complexity of modern CPU and memory architectures. The alternatives would be to always err on the side of caution, effectively synchronizing everything which would kill performance; or ignore the problem and offer no thread safety whatsoever. And fortunately, Java provides excellent high-level constructs in the java.util.concurrent package, so you can often avoid dealing with the low-level stuff.
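As a small illustration of those high-level constructs, the following sketch counts hits from several pool threads without a single synchronized or volatile in user code. HitCounter, demo and the key "home" are invented for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class.
class HitCounter {
    private final ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

    // merge performs the read-modify-write for a key atomically
    void record(String key) {
        hits.merge(key, 1, Integer::sum);
    }

    int count(String key) {
        return hits.getOrDefault(key, 0);
    }

    // Runs the given number of tasks on a small pool and returns the final count.
    static int demo(int tasks) {
        HitCounter counter = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < tasks; t++) {
            pool.execute(() -> counter.record("home"));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.count("home");
    }
}
```

The thread-safety lives inside ConcurrentHashMap and the executor, which is usually a better place for it than hand-written locking.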
In short, the JVM assumes that code is free of data races when translating it into machine code. That is, if code is not correctly synchronized, the Java Language Specification provides only limited guarantees about the behavior of that code.
Most modern hardware likewise assumes that code is free of data races when executing it. That is, if code is not correctly synchronized, the hardware makes only limited guarantees about the result of its execution.
In particular, the Java Language Specification guarantees the following only in the absence of a data race:
visibility: reading a field yields the value last assigned to it (it is unclear which write was last, and writes of long or double variables need not be atomic)
ordering: if a write is visible, so are any writes preceding it. For instance, if one thread executes:
x = new FancyObject();
another thread can read x only after the constructor of FancyObject has executed completely.
In the presence of a data race, these guarantees are null and void. It is possible for a reading thread to never see a write. It is also possible to see the write of x, without seeing the effect of the constructor that logically preceded the write of x. It is very unlikely that the program is correct if such basic assumptions can not be made.
A data race will however not compromise the integrity of the Java Virtual Machine. In particular, the JVM will not crash or halt, and still guarantee memory safety (i.e. prevent memory corruption) and certain semantics of final fields.
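A minimal sketch of how the x = new FancyObject() example can be published without a data race: the fields are final and the shared reference is volatile. Either measure on its own already prevents a reader from seeing a half-constructed object; the volatile reference additionally guarantees that the reader eventually sees the new reference at all. The fields width and height and the class FancyObjectPublisher are assumptions made for illustration.

```java
// Hypothetical immutable version of FancyObject.
final class FancyObject {
    private final int width;   // final fields get special treatment in the memory model
    private final int height;

    FancyObject(int width, int height) {
        this.width = width;
        this.height = height;
    }

    int area() {
        return width * height;
    }
}

// Hypothetical holder for the shared reference x.
class FancyObjectPublisher {
    // volatile: the write of x happens-before any later read that sees it
    private volatile FancyObject x;

    void publish(int width, int height) {
        x = new FancyObject(width, height);
    }

    FancyObject current() {
        return x; // either null or a fully constructed object
    }
}
```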
The JVM will handle the situation just fine (ie it will not hang or complain), but you may not get a result that you like!
When multiple threads are involved, java becomes fiendishly complicated and even code that looks obviously correct can turn out to be horribly broken. As an example:
public class IntCounter {
    private int i;
    public IntCounter(int i){
        this.i = i;
    }
    public void incrementInt(){
        i++;
    }
    public int getInt(){
        return i;
    }
}
is flawed in many ways.
First, let's say that i is currently 0 and thread A and thread B both call incrementInt() at about the same time. There is a danger that they will both see that i is 0, then both increment it to 1 and then save the result. So at the end of the two calls, i is only 1, not 2!
That's the race condition problem with the code, but there are other problems concerning memory visibility. When thread A changes a shared variable, there is no guarantee (without synchronization) that thread B will ever see the changes!
So thread A could increment i 100 times, and an hour later, thread B, calling getInt(), might see i as 0, or 100 or anywhere in between!
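One possible repair, sketched here with AtomicInteger (making both methods synchronized would be another): the increment becomes a single atomic step and reads are guaranteed to see completed increments. AtomicIntCounter and the helper countWith are hypothetical names introduced for this sketch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical thread-safe variant of IntCounter.
class AtomicIntCounter {
    private final AtomicInteger i;

    AtomicIntCounter(int initial) {
        this.i = new AtomicInteger(initial);
    }

    void incrementInt() {
        i.incrementAndGet(); // atomic read-modify-write, no lost updates
    }

    int getInt() {
        return i.get(); // sees every increment that has completed
    }

    // Increments one shared counter from several threads and returns the total.
    static int countWith(int threads, int perThread) {
        AtomicIntCounter counter = new AtomicIntCounter(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    counter.incrementInt();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.getInt();
    }
}
```

Running the same helper against the original IntCounter may or may not lose updates on a given run, which is exactly what makes the bug hard to find.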
The only sane thing to do if you are delving into Java concurrency is to read Java Concurrency in Practice by Brian Goetz et al. (OK, there are probably other good ways to learn about it, but this is a great book, co-written by Joshua Bloch, Doug Lea and others.)
Suppose our code has 2 threads (A and B) that hold a reference to the same instance of this class somewhere:
public class MyValueHolder {
    private int value = 1;
    // ... getter and setter
}
When Thread A does myValueHolder.setValue(7), there is no guarantee that Thread B will ever read that value: myValueHolder.getValue() could - in theory - keep returning 1 forever.
In practice however, the hardware will clear the second level cache sooner or later, so Thread B will read 7 sooner or later (usually sooner).
Is there any way to make the JVM emulate that worst case scenario for which it keeps returning 1 forever for Thread B? That would be very useful to test our multi-threaded code with our existing tests under those circumstances.
jcstress maintainer here. There are multiple ways to answer that question.
1. The easiest solution would be wrapping the getter in a loop and letting the JIT hoist it. This is allowed for non-volatile field reads, and simulates the visibility failure with a compiler optimization.
2. A more sophisticated trick involves getting a debug build of OpenJDK and using -XX:+StressLCM -XX:+StressGCM, effectively fuzzing the instruction scheduling. Chances are the load in question will float somewhere you can detect with the regular tests your product has.
3. I am not sure if there is practical hardware that holds the written value opaque to cache coherency for long enough, but it is somewhat easy to build the testcase with jcstress. You have to keep in mind that the optimization in (1) can also happen, so we need to employ a trick to prevent that. I think something like this should work.
It would be great to have a Java compiler that would intentionally perform as many weird (but allowed) transformations as possible to be able to break thread-unsafe code more easily, like Csmith for C. Unfortunately, such a compiler does not exist (as far as I know).
In the meantime, you can try the jcstress library* and exercise your code on several architectures, if possible with weaker memory models (i.e. not x86) to try and break your code:
The Java Concurrency Stress tests (jcstress) is an experimental harness and a suite of tests to aid research in the correctness of concurrency support in the JVM, class libraries, and hardware.
But in the end, unfortunately, the only way to prove that a piece of code is 100% correct is code inspection (and I don't know of a static code analysis tool able to detect all race conditions).
*I have not used it and I am unclear which of jcstress and the java-concurrency-torture library is more up to date (I would suspect jcstress).
Not on a real machine. Sadly, testing multi-threaded code will remain difficult.
As you say, the hardware will clear the second level cache and the JVM has no control over that. The JLS only specifies what must happen, and this is a case where B might never see the updated value of value.
The only way to force this to happen on a real machine is to alter the code in a way that voids your testing strategy, i.e. you end up testing different code.
However, you might be able to run this on a simulator that simulates hardware that doesn't clear the second level cache. Sounds like a lot of effort though!
I think you are referring to the principle called "false sharing" where different CPUs must synchronize their caches or else face the possibility that data such as you describe could become mismatched. There is a very good article on false sharing on Intel's website. Intel describes some useful tools in their article for diagnosing this problem. This is a relevant quote:
The primary means of avoiding false sharing is through code
inspection. Instances where threads access global or dynamically
allocated shared data structures are potential sources of false
sharing. Note that false sharing can be obscured by the fact that
threads may be accessing completely different global variables that
happen to be relatively close together in memory. Thread-local storage
or local variables can be ruled out as sources of false sharing.
Although methods described in the article are not what you have asked for (forcing worst-case behavior from the JVM), as already stated this isn't really possible. The methods described in this article are the best way I know to try to diagnose and avoid false sharing.
There are other resources addressing this problem around the web. For example, this article has a suggestion for a way to avoid false sharing in Java. I have not tried this method, so I cannot vouch for it, but I think the author's idea is sound. You might consider trying out his suggestion.
I have previously suggested a worst case behaving JVM for test purposes on the memory model list but the idea didn't seem popular.
So how do you get "worst case JVM behaviour" with existing technology, i.e. how can I test the scenario in the question and get it to fail EVERY time? You could try to find the setup with the weakest memory model possible, but that's unlikely to be perfect.
What I have often considered is using a distributed JVM, something similar to how I believe Terracotta works under the cover, so that your application runs on multiple JVMs (remote or local) and threads of the same application run in different instances. In this setup, communication between threads in different JVMs takes place only at memory barriers, e.g. at the synchronized keywords that are missing in the bugged code (so it conforms to the Java Memory Model). The application is configured rather than changed, i.e. you say which class or thread runs where. No code change is required for your tests, just configuration. Any well ordered Java application should run out of the box, but this setup would be very intolerant of a badly ordered application (normally a problem, now an asset, since the memory model exhibits very weak but legal behavior). In the example above, after loading the code onto a cluster, if the two threads run on different nodes, setValue has no effect visible to the other thread unless the code is changed to use synchronized, volatile etc., in which case it works as intended.
Now your test for the example above (configured correctly) would fail every time without correct "happens before ordering", which is potentially very useful for tests. The flaw in the plan is that for complete coverage you would potentially need a node per application thread (on the same machine or spread across a cluster), or multiple test runs. If you have 1000s of threads that could be prohibitive, though hopefully they would be pooled and scaled down for E2E test scenarios, or you could run it in a cloud. If nothing else, this kind of setup might be useful in demonstrating the issue.
inter thread communication across JVMs
The example you have given is described as Incorrectly Synchronized in http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4. I think this is always incorrect and will lead to bugs sooner or later. Most of the time, later :-).
To find such incorrectly synchronized code blocks, I use the following algorithm:
Record the threads for all field modifications using instrumentation. If a field is modified by more than one thread without synchronization, I have found a data race.
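A heavily simplified sketch of that bookkeeping is shown below. It is not how vmlens is implemented; FieldWriteTracker, recordWrite and suspects are made-up names, the recordWrite calls would have to be injected by instrumentation (here they are made by hand), and the sketch does not check whether a lock was held, so correctly synchronized fields written by several threads would be flagged too.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker: remembers which threads wrote which field.
class FieldWriteTracker {
    private final Map<String, Set<Thread>> writers = new ConcurrentHashMap<>();

    // To be called at every field write (by instrumentation in a real tool).
    void recordWrite(String fieldName) {
        writers.computeIfAbsent(fieldName, k -> ConcurrentHashMap.newKeySet())
               .add(Thread.currentThread());
    }

    // Fields written by more than one thread are candidates for a data race.
    Set<String> suspects() {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<Thread>> entry : writers.entrySet()) {
            if (entry.getValue().size() > 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```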
I implemented this algorithm inside http://vmlens.com, which is a tool to find data races inside java programs.
Here's a simple way: just comment out the code for setValue. You can uncomment it after testing. Since in many cases like this a mechanism is needed to fake failures, it would be a good idea to build a general mechanism for all such cases.
all:
Here is the famous article:
The "Double-Checked Locking is Broken" Declaration
It declares that the pattern doesn't work in Java. It further says, close to the end, that newer JVMs can make the pattern work by using volatile.
However, in another article: Memory Barriers and JVM Concurrency
It says the keyword "synchronized" generates full memory-barrier fences. So who is right? Does the pattern actually work in Java?
There are essentially 3 ways to fix double-checked locking:
ensure that the variable is declared volatile (works from Java 5 onwards);
just don't bother with it in the first place: just use synchronization and don't try to mess around with fancy bug-prone-- and probably pointless-- means of "avoiding" it;
let the classloader do the synchronization for you.
I've posted example code here.
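Independently of the linked code, a minimal sketch of the first and third fix might look like this. The class names are made up for illustration; the second class is the pattern commonly called the initialization-on-demand holder idiom.

```java
// Fix 1: double-checked locking with a volatile field (valid from Java 5 onwards).
class VolatileDclSingleton {
    private static volatile VolatileDclSingleton instance;

    private VolatileDclSingleton() {}

    static VolatileDclSingleton getInstance() {
        VolatileDclSingleton local = instance; // single volatile read on the fast path
        if (local == null) {
            synchronized (VolatileDclSingleton.class) {
                local = instance;              // re-check under the lock
                if (local == null) {
                    local = new VolatileDclSingleton();
                    instance = local;          // volatile write publishes the finished object
                }
            }
        }
        return local;
    }
}

// Fix 3: let the classloader do the synchronization.
class HolderSingleton {
    private HolderSingleton() {}

    // Holder is initialized, under the JVM's class-initialization lock,
    // the first time getInstance() touches it.
    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```

The holder variant needs no volatile and no explicit lock, which is why it is usually the shorter and safer of the two.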
BUT: Double-checked locking is really an outdated paradigm, if indeed it was ever useful in Java. As I see things, it was essentially carried over into Java by C programmers who didn't fully appreciate that the JVM effectively has a more efficient (and correct!) way of dealing with the issue built into the classloader and that optimisations to synchronization are generally best made at the JVM level.
I've seen a lot of people clutter their code with this "pattern". I don't think I've ever seen any actual data showing that it has any benefit.
Plus: if you do have a large application that is hitting synchronization issues, then one of the whole raisons d'être of Java is that it has rich concurrency libraries. Look at how you can re-work your application to use them... if profiling data proves it to be necessary.
It depends on what version of java you are using.
This has been fixed in java 5 and forward.
Check http://en.wikipedia.org/wiki/Double-checked_locking#Usage_in_Java
They're both right, and DCL works fine in Java from 5 on.
If you are expecting your program to produce the exact same output every time given the exact same input, and you are using DCL, you may want to seriously rethink what you are doing. An awful lot can depend on who gets to the lock first--you're rolling a lot of dice. Not good for an accounting app.
If your program involves balls bouncing off walls and each other, DCL may make a lot of sense. It does work. Synchronizing has to be a bit slower than non-synchronizing even without contention, so why do it if a simple if can prevent it? And if 100 threads pile up on a synch statement when the needed object already exists, that has to be a lot slower.
The fact that the keyword "synchronized" generates full memory-barrier fences does not mean DCL works properly. Let's take the following code as an example:
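private static Runnable instance; // shared field, not volatile
public static Runnable getInstance()
{
    if (null == instance) //1
    {
        synchronized (Runnable.class)
        {
            if (null == instance)
            {
                // Runnable is an interface, so an anonymous class stands in for the real type
                instance = new Runnable() { public void run() {} }; //2
            }
        }
    }
    return instance;
}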
We know that the JVM goes through several steps when it constructs an object. We focus on 2 important steps here:
First, the JVM allocates the memory for the object. At this point the member variables of the object still have their default values. Second, the JVM calls the constructor, which assigns the user-specified values to the member variables.
That means thread A may see a partially constructed instance at code 1 while another thread is still executing code 2. Although "synchronized" generates full memory-barrier fences, there is no happens-before guarantee between code 1 and code 2. The fences take effect for the synchronized code block, and code 1 is outside the synchronized code block.
I'm new to the Java threading and have only recently started to read up on the Memory Model. From my understanding the Java Memory Model as it stands allows the compiler to make optimizations.
This can complicate multi-threaded code and synchronization, but my question is about something much simpler. Take this example: since the two statements are not dependent on each other, is it possible that the compiler could change the ordering of the code in the try statement and therefore break the check?
boolean success = false;
try {
    MyClass.someFunction();
    success = true;
}
catch (Exception e) {
    // success stays false
}
if (success) {
    System.out.println("Success!");
}
else {
    System.out.println("Fail!");
}
No. The Java compiler is required to do what you'd expect: print Success! if MyClass.someFunction() returns normally and Fail! if it raises an exception. This comes from the semantics of Java in general, and is independent of any compiler optimizations your compiler might make (except insofar as it might constrain what optimizations are legal).
The reason is that the Java language specification says that within a single thread, the program must behave exactly as though statements were executed in order, top to bottom. The compiler is free to rewrite your program in all kinds of unintuitive ways in order to generate bytecode, but it must preserve the illusion that within a single thread, it's running exactly the statements you typed in your source code, in the correct order.
The Java language specification could extend that to multi-threaded contexts as well, and say that every thread must always see a state of the world consistent with all of your threads executing exactly the source code you typed in. However, (a) that would make it very difficult to write a correct compiler optimization, and many otherwise useful optimizations would be illegal; and (b) it wouldn't help programmers very much since it wouldn't eliminate the need for proper synchronization anyway; it'd mostly just turn broken programs into less-obviously broken programs.
Instead, the Java memory model defines precise rules for when memory modifications in one thread are visible to other threads, and lets the compiler do what it wants otherwise. This is a good compromise, since it gives programmers a set of rules they can use to make sure their multithreaded programs are correct while still giving compiler writers leeway to implement good optimizations.
But the important point for your question is: the compiler can do whatever it wants behind the scenes, but it's not allowed to change the meaning of your program. In a single-threaded context, the meaning of your program is well defined and would not permit the compiler to do the bad things you were thinking it could do.
As Java cannot understand what MyClass.someFunction() does, it cannot safely reorder this statement. In fact, most data dependency checkers are completely incapable of moving outside of a function boundary due to side-effects.
Threads are a special case, and not unique to Java: data ends up in registers and won't be refetched from memory unless needed. Java's solution is the volatile keyword (similar to volatile in other languages); transient is unrelated and only affects serialization.