Can volatile be eliminated by Java compiler optimizations - java

Can the optimizations performed by the Java compiler (version 5 or later) remove the "volatile" declaration of a variable?
More precisely, can a volatile variable be turned into a non-volatile variable in any of the following cases:
if there is no multithreading, i.e. if an application never uses more than one thread?
if a volatile variable is written by one thread but never accessed by any other thread?
if a volatile variable is read by several threads but never modified (read only, no writes)?

The volatile keyword requires that certain guarantees are satisfied when reading and writing the variable. It doesn't really make sense to talk about "removing the declaration"—no, that can't happen, because the only way that makes sense would be if the compiler ignored the declaration in your source code.
But if you are running the code in a context (e.g., single-threaded) where you couldn't tell that the runtime is actively working to meet those requirements, it is permissible for the runtime to skip that extra work.
Of your examples, the only case that might be determined at compile-time is a variable that is only written, and never read. In that case, the compiler could skip the writes (if a variable is written, and no one is around to read it, does it make a sound?). However, the Java Memory Model still makes some guarantees about happens-before relationships around writing a volatile variable, and those still have to be upheld, so it wouldn't make sense to optimize that away at compile-time.
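To make the happens-before point concrete, here is a minimal sketch. The class and method names (VolatileHandoff, write, awaitAndRead, demo) are made up for illustration and are not from the question. It shows why a volatile write cannot simply be dropped: the plain write before it is published by it.

```java
// Hypothetical demo class; all names here are illustrative assumptions.
class VolatileHandoff {
    private int payload;             // plain field, written before the volatile write
    private volatile boolean ready;  // volatile flag that publishes the payload

    void write(int value) {
        payload = value;  // ordinary write
        ready = true;     // volatile write: the write above happens-before any read that sees true
    }

    int awaitAndRead() {
        while (!ready) {     // volatile read on every iteration
            Thread.yield();  // busy-wait used only to keep the sketch short
        }
        return payload;      // sees the value written before ready was set
    }

    static int demo() {
        VolatileHandoff handoff = new VolatileHandoff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = handoff.awaitAndRead());
        reader.start();
        handoff.write(42);
        try {
            reader.join();  // join also establishes happens-before, so seen[0] is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints 42
    }
}
```

If ready were a plain boolean, the reader would have no guarantee of ever leaving the loop, or of seeing payload as 42 once it did.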

Related

Unnecessarily using volatile keyword -- is that dangerous?

Is there ever any harm in making a variable "volatile" in Java if it doesn't actually need to be marked volatile? ... or is it just "unnecessary" as I have often read.
As someone dabbling in multi-threading, but not a master computer scientist, I'm currently going with "if in doubt, make it volatile."
The obvious impact is some small performance impact because the compiler is forbidden from using certain optimizations. However the worse impact is the false sense of security. Just because a variable is volatile does not mean everything done with it is now threadsafe UNLESS all operations upon it are atomic (otherwise there could be a disconnect between the observation and the mutation of that variable).
Proper synchronization blocks are still needed. Your approach is inherently flawed. Sorry but it’s not that simple to get thread safety.
The third problem is it renders the true purpose of your code more obscure. If all variables are marked volatile then how is the reader to know which ones truly rely on that property and which ones don’t? Such obscurity creates a hidden cost in code maintenance burden, also known as “technical debt.”
It can have performance implications, but that's it.
There may be another danger: that you lull yourself into thinking that since everything is volatile, your code is thread-safe. That's not the case, and there's no substitute for actually understanding the threading implications of your code. If you do that, you won't need to mark things volatile "just in case."
But to your immediate question: no, you won't ever take functionally correct code and break it by making a variable volatile.
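The "false sense of security" point can be sketched like this. CounterComparison and its members are hypothetical names chosen for the example. It compares an increment on a volatile int (a read, an add and a write) with AtomicInteger.incrementAndGet().

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative assumptions.
class CounterComparison {
    static volatile int volatileCount = 0;
    static final AtomicInteger atomicCount = new AtomicInteger();

    static void run(int threads, int perThread) {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    volatileCount++;                // read, add, write: visible but not atomic
                    atomicCount.incrementAndGet();  // one atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("atomic:   " + atomicCount.get());  // always 400000
        System.out.println("volatile: " + volatileCount);      // may be lower: lost updates
    }
}
```

The atomic counter always ends at threads * perThread. The volatile one can come out lower on any given run, because two threads may read the same value and both write back value + 1.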

Does JVM guarantee to cache not volatile variable?

Does the JVM guarantee that non-volatile variables are cached?
Can a programmer depend on the JVM to always cache non-volatile variables locally for each thread?
Or may the JVM do this or not, so that a programmer should not depend on it?
Thanks for the answers in advance.
No. The JVM doesn't guarantee "caching" of non-volatile fields. What JVM implementations guarantee is how volatile fields behave. Caching of fields is unspecified and can vary from one JVM implementation to another, so you shouldn't rely on it (even if you find out, by some means, that some data is being cached by a thread).
The Java Language Specification is pretty clear about volatile:
The Java programming language provides a second mechanism, volatile fields, that is more convenient than locking for some purposes.
A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).
That's it. You got a special keyword defining this special semantic. So, when you think the other way round: without that special keyword, you can't rely on any special semantics. Then you get what the Java Memory Model has to offer; but nothing more.
And to be fully correct - there is of course Unsafe, allowing you to tamper with memory in unsafe ways with very special semantics.
The recommended pattern if you need a snapshot of a field is to copy it to a local variable. This is commonly used when writing code that makes heavy use of atomics and read-modify-conditional-write loops.
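A minimal sketch of that snapshot pattern inside a read-modify-conditional-write loop. MaxTracker, offer and current are hypothetical names; the only library calls used are AtomicInteger.get() and AtomicInteger.compareAndSet().

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; names are illustrative assumptions.
class MaxTracker {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    void offer(int candidate) {
        while (true) {
            int snapshot = max.get();         // copy the shared value into a local, once
            if (candidate <= snapshot) {
                return;                       // decide using the snapshot only
            }
            if (max.compareAndSet(snapshot, candidate)) {
                return;                       // write succeeded: nobody changed max in between
            }
            // Another thread updated max after our read: loop and take a fresh snapshot.
        }
    }

    int current() {
        return max.get();
    }

    public static void main(String[] args) {
        MaxTracker tracker = new MaxTracker();
        tracker.offer(3);
        tracker.offer(7);
        tracker.offer(5);
        System.out.println(tracker.current());  // prints 7
    }
}
```

Reading the field once into snapshot means the comparison and the compareAndSet both refer to the same observed value, whatever other threads do in the meantime.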

Java equivalent to Thread.MemoryBarrier

In Java, how can I explicitly trigger a full memory fence/barrier, equal to the invocation of
System.Threading.Thread.MemoryBarrier();
in C#?
I know that since Java 5 reads and writes to volatile variables have been causing a full memory fence, but maybe there is an (efficient) way without volatile.
Compared to MemoryBarrier(), Java's happens-before is a much sharper tool, leaving more leeway for aggressive optimization while maintaining thread safety.
A sharper tool, as you would expect, also requires more care to use properly, and that is how the semantics of volatile variable access could be described. You must write to a volatile variable on the write site and read from the same volatile on each reading site. By implication you can have any number of independent, localized "memory barriers", one per volatile variable, and each guards only the state reachable from that variable.
The full idiom is usually referred to as "safe publication" (although this is a more general term) and implies populating an immutable object graph which will be shared between threads, then writing a reference to it to a volatile variable.
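A rough sketch of that idiom, assuming a small settings object as the shared state. Settings, SettingsHolder, publish and lookup are names invented for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable value object; names are illustrative assumptions.
final class Settings {
    private final Map<String, String> values;

    Settings(Map<String, String> source) {
        // Defensive copy plus unmodifiable view: the graph cannot change after construction.
        this.values = Collections.unmodifiableMap(new HashMap<>(source));
    }

    String get(String key) {
        return values.get(key);
    }
}

class SettingsHolder {
    // The volatile reference is the single "barrier" guarding everything reachable from it.
    private volatile Settings current = new Settings(Collections.emptyMap());

    void publish(Map<String, String> newValues) {
        Settings fresh = new Settings(newValues);  // populate the object graph first
        current = fresh;                           // then publish it with one volatile write
    }

    String lookup(String key) {
        Settings snapshot = current;  // one volatile read; the rest is plain reads
        return snapshot.get(key);
    }

    public static void main(String[] args) {
        SettingsHolder holder = new SettingsHolder();
        holder.publish(Collections.singletonMap("mode", "fast"));
        System.out.println(holder.lookup("mode"));  // prints fast
    }
}
```

Readers never see a half-built Settings, because the object is complete before the volatile write and is never modified afterwards.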
Java 8, via JEP 171, added another possibility: three fences were added to sun.misc.Unsafe, namely fullFence, loadFence and storeFence.
There is no direct equivalent. Use a volatile field or higher-level constructs.
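For completeness, and as an assumption beyond the answers above (which predate it): from Java 9 on, java.lang.invoke.VarHandle has public static fence methods, including VarHandle.fullFence(), which is probably the closest thing to Thread.MemoryBarrier() in a supported API. The sketch below only shows how the call is written. The class Fences and its members are hypothetical, and a single-threaded run cannot demonstrate any ordering effect.

```java
import java.lang.invoke.VarHandle;

// Hypothetical demo class; requires Java 9 or later.
class Fences {
    static int plainValue;  // deliberately not volatile

    static int storeThenFence(int value) {
        plainValue = value;
        // Loads and stores before the fence are not reordered with loads and stores after it.
        VarHandle.fullFence();
        return plainValue;
    }

    public static void main(String[] args) {
        System.out.println(storeThenFence(7));  // prints 7
    }
}
```

Unless you are writing low-level concurrent data structures, a volatile field or a class from java.util.concurrent is usually easier to reason about than explicit fences.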

Purpose/advantages of volatile

What exactly is a situation where you would make use of the volatile keyword? And more importantly: How does the program benefit from doing so?
From what I've read and know already: volatile should be used for variables that are accessed by different threads, because they are slightly faster to read than non-volatile ones. If so, shouldn't there be a keyword to enforce the opposite?
Or are they actually synchronized between all threads? How are normal variables not?
I have a lot of multithreading code and I want to optimize it a bit. Of course I don't hope for huge performance enhancement (I don't have any problems with it atm anyway), but I'm always trying to make my code better. And I'm slightly confused with this keyword.
When a multithreaded program is running, and there is some shared variable which isn't declared as volatile, what these threads do is create a local copy of the variable, and work on the local copy instead. So the changes on the variable aren't reflected. This local copy is created because cached memory access is much faster compared to accessing variables from main memory.
When you declare a variable as volatile, it tells the program NOT to create any local copy of the variable and use the variable directly from the main memory.
By declaring a variable as volatile, we are telling the system that its value can change unexpectedly from anywhere, so always use the value which is kept in the main memory and always make changes to the value of the variable in the main memory and not create any local copies of the variable.
Note that volatile is not a substitute for synchronization, and when a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
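The visibility guarantee is easiest to see with a stop flag. This is a minimal sketch; StoppableWorker, stop and startAndStop are invented names for the example.

```java
// Hypothetical demo class; names are illustrative assumptions.
class StoppableWorker implements Runnable {
    private volatile boolean running = true;  // written by one thread, read by another

    @Override
    public void run() {
        long spins = 0;
        while (running) {  // volatile read: the loop sees the update made by stop()
            spins++;       // stand-in for real work
        }
    }

    void stop() {
        running = false;   // volatile write, visible to the worker thread
    }

    static boolean startAndStop() {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.stop();
        try {
            thread.join(5_000);  // wait up to five seconds for the loop to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(startAndStop());  // prints true
    }
}
```

Without volatile on running, the JIT compiler would be allowed to hoist the read out of the loop, and the worker might spin forever.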
Volatile makes accessing the variable slower, because every thread actually reads the value from memory each time and thus gets the newest value.
This is useful when accessing the variable from different threads.
Use a profiler to tune code, and read "Tips optimizing Java code".
The volatile keyword means that the compiler will force a new read of the variable every time it is referenced. This is useful when that variable is something other than standard memory. Take for instance an embedded system where you're reading a hardware register or interface which appears as a memory location to the processor. External system changes which change the value of that register will not be read correctly if the processor is using a cached value that was read earlier. Using volatile forces a new read and keeps everything synchronized.
Here's a good Stack Overflow explanation, and here's a good wiki article:
In computer programming, particularly in the C, C++, C#, and Java programming languages, a variable or object declared with the volatile keyword usually has special properties related to optimization and/or threading. Generally speaking, the volatile keyword is intended to prevent the compiler from applying certain optimizations which it might have otherwise applied because ordinarily it is assumed variables cannot change value "on their own."
In short, it guarantees that all threads access the same copy of some data. Any change made in one thread is immediately noticeable in another thread.
volatile concerns memory visibility. The value of the volatile variable becomes visible to all readers after a write operation completes on it. Kind of like turning off caching.
Here is a good stack overflow response: Do you ever use the volatile keyword in Java?
Concerning your specific questions: no, they are not synchronized. You still need to use locking to accomplish that. Normal variables are neither synchronized nor volatile.
To optimize threaded code, it's probably worth reading up on granularity and on optimistic and pessimistic locking.

Java Memory Model and boolean for success

I'm new to the Java threading and have only recently started to read up on the Memory Model. From my understanding the Java Memory Model as it stands allows the compiler to make optimizations.
This can complicate multi-threaded code and synchronization but my question is for something much simpler. Take this example since the two statements are not dependent on each other, is it possible that the compiler could change the ordering of the code in the try statement and therefore break the check?
    boolean success = false;
    try {
        MyClass.someFunction();
        success = true;
    } catch (Exception e) {
        // deliberately ignored; success stays false
    }
    if (success) {
        System.out.println("Success!");
    } else {
        System.out.println("Fail!");
    }
No. The Java compiler is required to do what you'd expect: print Success! if MyClass.someFunction() returns normally and Fail! if it raises an exception. This comes from the semantics of Java in general, and is independent of any compiler optimizations your compiler might make (except insofar as it might constrain what optimizations are legal).
The reason is that the Java language specification says that within a single thread, the program must behave exactly as though statements were executed in order, top to bottom. The compiler is free to rewrite your program in all kinds of unintuitive ways in order to generate bytecode, but it must preserve the illusion that within a single thread, it's running exactly the statements you typed in your source code, in the correct order.
The Java language specification could extend that to multi-threaded contexts as well, and say that every thread must always see a state of the world consistent with all of your threads executing exactly the source code you typed in. However, (a) that would make it very difficult to write a correct compiler optimization, and many otherwise useful optimizations would be illegal; and (b) it wouldn't help programmers very much since it wouldn't eliminate the need for proper synchronization anyway; it'd mostly just turn broken programs into less-obviously broken programs.
Instead, the Java memory model defines precise rules for when memory modifications in one thread are visible to other threads, and lets the compiler do what it wants otherwise. This is a good compromise, since it gives programmers a set of rules they can use to make sure their multithreaded programs are correct while still giving compiler writers leeway to implement good optimizations.
But the important point for your question is: the compiler can do whatever it wants behind the scenes, but it's not allowed to change the meaning of your program. In a single-threaded context, the meaning of your program is well defined and would not permit the compiler to do the bad things you were thinking it could do.
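A runnable variant of the question's snippet, as a sketch. SuccessFlagDemo and attempt are hypothetical names, and a Runnable stands in for MyClass.someFunction(), which the question does not show.

```java
// Hypothetical demo class; the Runnable stands in for MyClass.someFunction().
class SuccessFlagDemo {
    static String attempt(Runnable action) {
        boolean success = false;
        try {
            action.run();
            success = true;  // only reached if action.run() returned normally
        } catch (Exception e) {
            // swallowed on purpose, mirroring the question
        }
        return success ? "Success!" : "Fail!";
    }

    public static void main(String[] args) {
        System.out.println(attempt(() -> { }));  // prints Success!
        System.out.println(attempt(() -> {
            throw new IllegalStateException("boom");
        }));                                     // prints Fail!
    }
}
```

Whatever reordering happens under the hood, the assignment to success can never appear to take effect before the call has returned normally, so within this one thread the two outcomes are the only possible ones.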
As Java cannot understand what MyClass.someFunction() does, it cannot safely reorder this statement. In fact, most data dependency checkers are completely incapable of moving outside of a function boundary due to side-effects.
Threads are a special case, and not unique to Java: data ends up in registers and won't be refetched from memory unless needed. Java's solution is the volatile keyword (similar to volatile in other languages).
