In Java, how can I explicitly trigger a full memory fence/barrier, equal to the invocation of
System.Threading.Thread.MemoryBarrier();
in C#?
I know that since Java 5 reads and writes to volatile variables have been causing a full memory fence, but maybe there is an (efficient) way without volatile.
Compared to MemoryBarrier(), Java's happens-before is a much sharper tool, leaving more leeway for aggressive optimization while maintaining thread safety.
A sharper tool, as you would expect, also requires more care to use properly, and that is how the semantics of volatile variable access could be described. You must write to a volatile variable on the write site and read from the same volatile on each reading site. By implication you can have any number of independent, localized "memory barriers", one per a volatile variable, and each guards only the state reachable from that variable.
The full idiom is usually referred to as "safe publication" (although this is a more general term) and implies populating an immutable object graph which will be shared between threads, then writing a reference to it to a volatile variable.
Java 8, via JEP 108 added another possibility. Access to three fences have been to the Java API, fullFence, loadFence and storeFence.
There are no direct equivalent. Use volatile field or more high level things.
Related
Can the optimizations performed by the Java compiler (version 5 or later) remove the "volatile" declaration of a variable?
More precisely, can a volatile variable be turned into a non-volatile variable in any of the following cases:
if there is no multithreading, i.e. if an application never uses more than one thread?
if a volatile variable is written by one thread but never accessed by any other thread?
if a volatile variable is read by several threads but never modified (read only, no writes)?
The volatile keyword requires that certain guarantees are satisfied when reading and writing the variable. It doesn't really make sense to talk about "removing the declaration"—no, that can't happen, because the only way that makes sense would be if the compiler ignored the declaration in your source code.
But if you are running the code in a context (e.g., single-threaded) where you couldn't tell that the runtime is actively working to meet those requirements, it is permissible for the runtime to skip that extra work.
Of your examples, the only case that might be determined at compile-time is a variable that is only written, and never read. In that case, the compiler could skip writes (if a variable is written, and no one is around to read it, does it make a sound?), but the the Java Memory Model still makes some guarantees about happens-before relationships around writing a volatile variable, and those still have to be upheld, so it wouldn't make sense to optimize that away at compile-time.
Does JVM guarantee to cache not volatile variable ?
Can a programer depend upon on JVM to always cache non-volatile variables locally for each thread.
Or JVM may or may not do this, thus a programer should not depend upon JVM for this.
Thanks for the answers in advance.
No. The JVM doesn't guarantee "caching" of non-volatile fields. What implementations of JVM guarantee is how volatile fields should behave. Caching of fields is non-standard (unspecified) and can vary from JVM to JVM implementation. So, you shouldn't really rely on it (even if find out, by some way that some data is being cached by a thread)
The java language spec is pretty clear about volatile:
The Java programming language provides a second mechanism, volatile fields, that is more convenient than locking for some purposes.
A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).
That's it. You got a special keyword defining this special semantic. So, when you think the other way round: without that special keyword, you can't rely on any special semantics. Then you get what the Java Memory Model has to offer; but nothing more.
And to be fully correct - there is of course Unsafe, allowing you to tamper with memory in unsafe ways with very special semantics.
The recommended pattern if you need a snapshot of a field is to copy it to a local variable. This is commonly used when writing code that makes heavy use of atomics and read-modify-conditional-write loops.
What exactly is a situation where you would make use of the volatile keyword? And more importantly: How does the program benefit from doing so?
From what I've read and know already: volatile should be used for variables that are accessed by different threads, because they are slightly faster to read than non-volatile ones. If so, shouldn't there be a keyword to enforce the opposite?
Or are they actually synchronized between all threads? How are normal variables not?
I have a lot of multithreading code and I want to optimize it a bit. Of course I don't hope for huge performance enhancement (I don't have any problems with it atm anyway), but I'm always trying to make my code better. And I'm slightly confused with this keyword.
When a multithreaded program is running, and there is some shared variable which isn't declared as volatile, what these threads do is create a local copy of the variable, and work on the local copy instead. So the changes on the variable aren't reflected. This local copy is created because cached memory access is much faster compared to accessing variables from main memory.
When you declare a variable as volatile, it tells the program NOT to create any local copy of the variable and use the variable directly from the main memory.
By declaring a variable as volatile, we are telling the system that its value can change unexpectedly from anywhere, so always use the value which is kept in the main memory and always make changes to the value of the variable in the main memory and not create any local copies of the variable.
Note that volatile is not a substitute for synchronization, and when a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
Volatile make accessing the variables slower by having every thread actually access the value each time from memory thus getting the newest value.
This is useful when accessing the variable from different threads.
Use a profiler to tune code and read Tips optimizing Java code
The volatile keyword means that the compiler will force a new read of the variable every time it is referenced. This is useful when that variable is something other than standard memory. Take for instance an embedded system where you're reading a hardware register or interface which appears as a memory location to the processor. External system changes which change the value of that register will not be read correctly if the processor is using a cached value that was read earlier. Using volatile forces a new read and keeps everything synchronized.
Heres a good stack overflow explanation
and Heres a good wiki article
In computer programming, particularly in the C, C++, C#, and Java programming languages, a variable or object declared with the volatile keyword usually has special properties related to optimization and/or threading. Generally speaking, the volatile keyword is intended to prevent the compiler from applying certain optimizations which it might have otherwise applied because ordinarily it is assumed variables cannot change value "on their own."
**^wiki
In short it guarantees that a given thread access the same copy of some data. Any changes in one thread would immediately be noticeable within another thread
volatile concerns memory visibility. The value of the volatile variable becomes visible to all readers after a write operation completes on it. Kind of like turning off caching.
Here is a good stack overflow response: Do you ever use the volatile keyword in Java?
Concerning specific questions, no they are not synchronized. You still need to use locking to accomplish that. Normal variables are neither synchronized or volatile.
To optimize threaded code its probably worth reading up on granularity, optimistic and pessimistic locking.
When objects are locked in languages like C++ and Java where actually on a low level scale) is this performed? I don't think it's anything to do with the CPU/cache or RAM. My best guestimate is that this occurs somewhere in the OS? Would it be within the same part of the OS which performs context switching?
I am referring to locking objects, synchronizing on method signatures (Java) etc.
It could be that the answer depends on which particular locking mechanism?
Locking involves a synchronisation primitive, typically a mutex. While naively speaking a mutex is just a boolean flag that says "locked" or "unlocked", the devil is in the detail: The mutex value has to be read, compared and set atomically, so that multiple threads trying for the same mutex don't corrupt its state.
But apart from that, instructions have to be ordered properly so that the effects of a read and write of the mutex variable are visible to the program in the correct order and that no thread inadvertently enters the critical section when it shouldn't because it failed to see the lock update in time.
There are two aspects to memory access ordering: One is done by the compiler, which may choose to reorder statements if that's deemed more efficient. This is relatively trivial to prevent, since the compiler knows when it must be careful. The far more difficult phenomenon is that the CPU itself, internally, may choose to reorder instructions, and it must be prevented from doing so when a mutex variable is being accessed for the purpose of locking. This requires hardware support (e.g. a "lock bit" which causes a pipeline flush and a bus lock).
Finally, if you have multiple physical CPUs, each CPU will have its own cache, and it becomes important that state updates are propagated to all CPU caches before any executing instructions make further progress. This again requires dedicated hardware support.
As you can see, synchronisation is a (potentially) expensive business that really gets in the way of concurrent processing. That, however, is simply the price you pay for having one single block of memory on which multiple independent context perform work.
There is no concept of object locking in C++. You will typically implement your own on top of OS-specific functions or use synchronization primitives provided by libraries (e.g. boost::scoped_lock). If you have access to C++11, you can use the locks provided by the threading library which has a similar interface to boost, take a look.
In Java the same is done for you by the JVM.
The java.lang.Object has a monitor built into it. That's what is used to lock for the synchronized keyword. JDK 6 added a concurrency packages that give you more fine-grained choices.
This has a nice explanation:
http://www.artima.com/insidejvm/ed2/threadsynch.html
I haven't written C++ in a long time, so I can't speak to how to do it in that language. It wasn't supported by the language when I last wrote it. I believe it was all 3rd party libraries or custom code.
It does depend on the particular locking mechanism, typically a semaphore, but you cannot be sure, since it is implementation dependent.
All architectures I know of use an atomic Compare And Swap to implement their synchronization primitives. See, for example, AbstractQueuedSynchronizer, which was used in some JDK versions to implement Semiphore and ReentrantLock.
I know that writing to a volatile variable flushes it from the memory of all the cpus, however I want to know if reads to a volatile variable are as fast as normal reads?
Can volatile variables ever be placed in the cpu cache or is it always fetched from the main memory?
You should really check out this article: http://brooker.co.za/blog/2012/09/10/volatile.html. The blog article argues volatile reads can be a lot slower (also for x86) than non-volatile reads on x86.
Test 1 is a parallel read and write to a non-volatile variable. There
is no visibility mechanism and the results of the reads are
potentially stale.
Test 2 is a parallel read and write to a volatile variable. This does not address the OP's question specifically. However worth noting that a contended volatile can be very slow.
Test 3 is a read to a volatile in a tight loop. Demonstrated is that the semantics of what it means to be volatile indicate that the value can change with each loop iteration. Thus the JVM can not optimize the read and hoist it out of the loop. In Test 1, it is likely the value was read and stored once, thus there is no actual "read" occurring.
Credit to Marc Booker for running these tests.
The answer is somewhat architecture dependent. On an x86, there is no additional overhead associated with volatile reads specifically, though there are implications for other optimizations.
JMM cookbook from Doug Lea, see architecture table near the bottom.
To clarify: There is not any additional overhead associated with the read itself. Memory barriers are used to ensure proper ordering. JSR-133 classifies four barriers "LoadLoad, LoadStore, StoreLoad, and StoreStore". Depending on the architecture, some of these barriers correspond to a "no-op", meaning no action is taken, others require a fence. There is no implicit cost associated with the Load itself, though one may be incurred if a fence is in place. In the case of the x86, only a StoreLoad barrier results in a fence.
As pointed out in a blog post, the fact that the variable is volatile means there are assumptions about the nature of the variable that can no longer be made and some compiler optimizations would not be applied to a volatile.
Volatile is not something that should be used glibly, but it should also not be feared. There are plenty of cases where a volatile will suffice in place of more heavy handed locking.
It is architecture dependent. What volatile does is tell the compiler not to optimise that variable away. It forces most operations to treat the variable's state as an unknown. Because it is volatile, it could be changed by another thread or some other hardware operation. So, reads will need to re-read the variable and operations will be of the read-modify-write kind.
This kind of variable is used for device drivers and also for synchronisation with in-memory mutexes/semaphores.
Volatile reads cannot be as quick, especially on multi-core CPUs (but also only single-core).
The executing core has to fetch from the actual memory address to make sure it gets the current value - the variable indeed cannot be cached.
As opposed to one other answer here, volatile variables are not used just for device drivers! They are sometimes essential for writing high performance multi-threaded code!
volatile implies that the compiler cannot optimize the variable by placing its value in a CPU register. It must be accessed from main memory. It may, however, be placed in a CPU cache. The cache will guaranty consistency between any other CPUs/cores in the system. If the memory is mapped to IO, then things are a little more complicated. If it was designed as such, the hardware will prevent that address space from being cached and all accesses to that memory will go to the hardware. If there isn't such a design, the hardware designers may require extra CPU instructions to insure that the read/write goes through the caches, etc.
Typically, the 'volatile' keyword is only used for device drivers in operating systems.