Volatile addresses visibility and ordering issues, while the atomic toolkit additionally provides atomicity of compound operations.
Volatile relies on the happens-before relation; the Atomic* classes use compare-and-swap.
Why was there a need to introduce a new layer of abstraction like the atomic toolkit, instead of enhancing the volatile keyword itself? Are there specific cases that can only be solved by the atomic toolkit?
Actually, if you take a closer look at the Atomic* implementations, you'll see that all of them hold their value in a volatile field.
IMHO the atomics are already an extension of the volatile mechanism, providing a convenient way to do atomic CAS operations.
There is also a benefit to hiding the CAS implementation: for example, the HotSpot JVM makes heavy use of intrinsics to squeeze out performance.
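To illustrate the point, here is a deliberately simplified sketch (not the actual JDK source; the real classes historically went through Unsafe) of what an Atomic*-style class boils down to: a volatile field for visibility and ordering, plus a CAS primitive over it, expressed here with VarHandle (Java 9+):

    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.VarHandle;

    // Simplified sketch, not the real JDK implementation.
    class SimpleAtomicInt {
        private volatile int value;              // volatile gives visibility/ordering
        private static final VarHandle VALUE;
        static {
            try {
                VALUE = MethodHandles.lookup()
                        .findVarHandle(SimpleAtomicInt.class, "value", int.class);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        int get() { return value; }

        boolean compareAndSet(int expected, int next) {
            // Atomicity comes from the hardware CAS the JIT emits as an intrinsic.
            return VALUE.compareAndSet(this, expected, next);
        }

        int incrementAndGet() {
            for (;;) {                           // classic CAS retry loop
                int current = value;
                if (VALUE.compareAndSet(this, current, current + 1)) {
                    return current + 1;
                }
            }
        }
    }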
Changing an existing language construct like volatile would most likely break existing applications, so this is typically not an option, in particular not for the Java language. Creating a new library does not affect existing applications and is thus completely backwards compatible.
In addition, the classes in the atomic package offer advanced operations like compareAndSet that you cannot simply bolt onto the volatile keyword.
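As a small illustration (class and field names are made up), compare a plain volatile counter with an AtomicInteger: volatile alone cannot make a read-modify-write atomic, whereas the atomic class can:

    import java.util.concurrent.atomic.AtomicInteger;

    class Counters {
        private volatile int volatileCount = 0;
        private final AtomicInteger atomicCount = new AtomicInteger();

        void lossyIncrement() {
            // Read-modify-write on a volatile is NOT atomic: two threads can read
            // the same value and one of the two updates gets lost.
            volatileCount++;
        }

        void safeIncrement() {
            // The atomic class combines read, add and write into one atomic step
            // (a CAS loop under the hood).
            atomicCount.incrementAndGet();
        }
    }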
Does the JVM guarantee to cache non-volatile variables?
Can a programmer depend on the JVM to always cache non-volatile variables locally for each thread?
Or may the JVM do this or not, so that a programmer should not depend on the JVM for it?
Thanks for the answers in advance.
No. The JVM doesn't guarantee "caching" of non-volatile fields. What JVM implementations guarantee is how volatile fields behave. Caching of fields is non-standard (unspecified) and can vary from one JVM implementation to another. So you shouldn't rely on it (even if you find out, by some means, that some data is being cached by a thread).
The java language spec is pretty clear about volatile:
The Java programming language provides a second mechanism, volatile fields, that is more convenient than locking for some purposes.
A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).
That's it. You got a special keyword defining this special semantic. So, when you think the other way round: without that special keyword, you can't rely on any special semantics. Then you get what the Java Memory Model has to offer; but nothing more.
And to be fully correct - there is of course Unsafe, allowing you to tamper with memory in unsafe ways with very special semantics.
The recommended pattern if you need a snapshot of a field is to copy it to a local variable. This is commonly used when writing code that makes heavy use of atomics and read-modify-conditional-write loops.
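A hedged sketch of that pattern (class and method names are illustrative): the loop copies the current value into a local variable, decides based on that snapshot, and only then attempts a compareAndSet, retrying if another thread got in between:

    import java.util.concurrent.atomic.AtomicLong;

    class BoundedCounter {
        private final AtomicLong count = new AtomicLong();

        // Increments the counter but never beyond max; returns true on success.
        boolean incrementIfBelow(long max) {
            for (;;) {
                long snapshot = count.get();     // local copy: a stable view for this iteration
                if (snapshot >= max) {
                    return false;                // decide based on the snapshot, not a re-read
                }
                if (count.compareAndSet(snapshot, snapshot + 1)) {
                    return true;                 // snapshot was still current; write applied
                }
                // Another thread changed the value in between; retry with a fresh snapshot.
            }
        }
    }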
I am pretty new to Haskell and would like to know: synchronisation is used to prevent corruption when multithreading in Java, so how is this done in Haskell? I've only found unhelpful or overly complicated answers on Google.
Your question is a bit ambiguous since one may use multithreading for either concurrency or parallelism, which are distinct problems with distinct solutions.
In both cases, you'll need to make sure your programs are compiled with SMP support and ran using multiple RTS threads: see the GHC manual's section about concurrency.
Concurrency
As others have pointed out, synchronization will be a non-problem in the vast majority of your code, since you'll mostly be dealing with pure functions. This is true in any language if you keep mutable state, and the libraries that rely on it, under armed guard and religiously avoid mutable state unless it is properly wrapped behind a pure API. Concurrency is an area where Haskell shines because its semantics require purity. Types are used to describe impure operations instead, making it dead easy to spot code where some sort of synchronization might be needed.
Typically, your application's state will be backed by a transactional database which will handle synchronization and persistence for you. You will not need any additional synchronization at all if your concurrent application does not have additional state.
In other cases, Haskell has a handy Software Transactional Memory implementation. It allows you to write and compose code in an imperative-looking style, without explicit locking, while getting atomicity and guarantees against deadlocks. It is the foolproof(tm) way to write concurrent code.
Lastly, there are some low-level primitives available in base: plain old mutable references with IORef, semaphores, and MVars which can be used as if they were variables protected by a mutex.
There are also channels in base, but beware: they are unbounded!
Parallelism
This is also an area where Haskell shines because of its non-strict semantics. Non-strictness allows you to write code that expresses your logic in a straightforward manner while not getting committed to a specific evaluation order.
As a consequence, you can describe a parallel evaluation strategy separately from the business logic. Writing parallel code is then just a matter of placing the right annotation in the right spot.
Here is an example that was/is used in production at Bdellium:
map outputParticipant parts `using` parListChunk 10 rdeepseq
^^^^^ business logic ^^^^^^ ^^^^ eval. strategy ^^^^
The code can be understood as follows: Parallel workers will fully evaluate the results of mapping the outputParticipant function to individual items in the parts list, distributing the work in chunks of 10 elements.
This answer pertains to functional languages in general: no synchronisation is needed. Functions in functional programming have no side effects: they accept a value and return a value, and there is no mutable state. Such functions are inherently thread safe.
In Java, how can I explicitly trigger a full memory fence/barrier, equal to the invocation of
System.Threading.Thread.MemoryBarrier();
in C#?
I know that since Java 5, reads and writes to volatile variables cause a full memory fence, but maybe there is an (efficient) way to do this without volatile.
Compared to MemoryBarrier(), Java's happens-before is a much sharper tool, leaving more leeway for aggressive optimization while maintaining thread safety.
A sharper tool, as you would expect, also requires more care to use properly, and that is how the semantics of volatile variable access could be described. You must write to a volatile variable on the write site and read from the same volatile on each reading site. By implication you can have any number of independent, localized "memory barriers", one per a volatile variable, and each guards only the state reachable from that variable.
The full idiom is usually referred to as "safe publication" (although this is a more general term) and implies populating an immutable object graph which will be shared between threads, then writing a reference to it to a volatile variable.
Java 8, via JEP 171 (Fence Intrinsics), added another possibility: access to three fences has been added, fullFence, loadFence and storeFence, exposed through sun.misc.Unsafe.
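For illustration, a minimal sketch of how those fences can be reached, assuming you are willing to go through the unsupported sun.misc.Unsafe API (the wrapper class here is made up):

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Hedged sketch: relies on an internal, unsupported API and may need extra
    // flags or break on future JDKs.
    public final class Fences {
        private static final Unsafe U;
        static {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                U = (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        // Closest analogue to C#'s Thread.MemoryBarrier(): a full two-way fence.
        public static void fullFence()  { U.fullFence();  }
        public static void loadFence()  { U.loadFence();  }
        public static void storeFence() { U.storeFence(); }
    }

From Java 9 on, java.lang.invoke.VarHandle exposes equivalent static methods (VarHandle.fullFence(), acquireFence(), releaseFence()) without touching Unsafe.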
There is no direct equivalent. Use a volatile field or higher-level constructs.
I've come across these two in some multi-threaded code and was wondering whether there is any difference between the two.
I mean, does the use of an AtomicBoolean, rather than a SynchronizedBoolean, make a significant difference in performance?
And does it affect the correctness of the computation?
AtomicBoolean is part of the standard java.util.concurrent package. SynchronizedBoolean is part of a set of utilities created by Doug Lea (author of much of the Java concurrency packages). Performance-wise, you should expect AtomicBoolean to perform better: it uses a volatile boolean, whereas SynchronizedBoolean uses locking.
However in practice for most applications you won't notice much difference.
The real difference (and what should guide your choice) is in the semantics the two classes offer. AtomicBoolean provides just simple set/get/compareAndSet operations. The SynchronizedBoolean offers atomic boolean operations and exposes its internal lock to allow you to execute Runnables within the context of its value.
Doug Lea has offered this source free to the community. I have found an extension to SynchronizedBoolean, WaitableBoolean particularly useful since it allows you to execute a Runnable within the lock whenever a particular state change occurs.
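As a small usage sketch (the class name is made up), compareAndSet on AtomicBoolean is what gives you an atomic check-then-act, for example a guard that runs initialization exactly once across threads:

    import java.util.concurrent.atomic.AtomicBoolean;

    class OneTimeInit {
        private final AtomicBoolean initialized = new AtomicBoolean(false);

        void initIfNeeded() {
            // Only the thread that wins the CAS (false -> true) performs the setup.
            if (initialized.compareAndSet(false, true)) {
                // ... expensive one-time setup ...
            }
        }
    }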
When objects are locked in languages like C++ and Java, where (actually, on a low-level scale) is this performed? I don't think it has anything to do with the CPU/cache or RAM. My best guesstimate is that this occurs somewhere in the OS? Would it be within the same part of the OS that performs context switching?
I am referring to locking objects, synchronizing on method signatures (Java) etc.
It could be that the answer depends on the particular locking mechanism?
Locking involves a synchronisation primitive, typically a mutex. While naively speaking a mutex is just a boolean flag that says "locked" or "unlocked", the devil is in the detail: The mutex value has to be read, compared and set atomically, so that multiple threads trying for the same mutex don't corrupt its state.
But apart from that, instructions have to be ordered properly so that the effects of a read and write of the mutex variable are visible to the program in the correct order and that no thread inadvertently enters the critical section when it shouldn't because it failed to see the lock update in time.
There are two aspects to memory access ordering: One is done by the compiler, which may choose to reorder statements if that's deemed more efficient. This is relatively trivial to prevent, since the compiler knows when it must be careful. The far more difficult phenomenon is that the CPU itself, internally, may choose to reorder instructions, and it must be prevented from doing so when a mutex variable is being accessed for the purpose of locking. This requires hardware support (e.g. a "lock bit" which causes a pipeline flush and a bus lock).
Finally, if you have multiple physical CPUs, each CPU will have its own cache, and it becomes important that state updates are propagated to all CPU caches before any executing instructions make further progress. This again requires dedicated hardware support.
As you can see, synchronisation is a (potentially) expensive business that really gets in the way of concurrent processing. That, however, is simply the price you pay for having one single block of memory on which multiple independent contexts perform work.
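To make the CAS-on-a-flag idea concrete, here is a deliberately naive spinlock sketch in Java (illustrative only; real mutexes add queuing, blocking and fairness, typically with OS support):

    import java.util.concurrent.atomic.AtomicBoolean;

    final class SpinLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);

        void lock() {
            // Atomically flip false -> true; spin while another thread holds the lock.
            while (!locked.compareAndSet(false, true)) {
                Thread.onSpinWait();   // busy-wait hint to the CPU (Java 9+)
            }
        }

        void unlock() {
            // The volatile write in set() publishes the critical section's effects.
            locked.set(false);
        }
    }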
There is no concept of object locking in C++. You will typically implement your own on top of OS-specific functions or use synchronization primitives provided by libraries (e.g. boost::scoped_lock). If you have access to C++11, you can use the locks provided by the threading library, which has a similar interface to Boost; take a look.
In Java the same is done for you by the JVM.
The java.lang.Object class has a monitor built into it. That's what is used for locking by the synchronized keyword. JDK 5 added the java.util.concurrent packages, which give you more fine-grained choices.
This has a nice explanation:
http://www.artima.com/insidejvm/ed2/threadsynch.html
I haven't written C++ in a long time, so I can't speak to how to do it in that language. It wasn't supported by the language when I last wrote it. I believe it was all 3rd party libraries or custom code.
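For completeness, a minimal Java sketch contrasting the monitor built into every Object (what synchronized uses) with an explicit lock from the concurrency packages (class and field names are made up):

    import java.util.concurrent.locks.ReentrantLock;

    class Counter {
        private final Object monitor = new Object();             // intrinsic monitor lock
        private final ReentrantLock lock = new ReentrantLock();  // explicit j.u.c. lock
        private long a, b;

        void incrementWithMonitor() {
            synchronized (monitor) {   // uses the monitor built into every Object
                a++;
            }
        }

        void incrementWithLock() {
            lock.lock();               // finer-grained control: tryLock, timeouts, fairness
            try {
                b++;
            } finally {
                lock.unlock();
            }
        }
    }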
It does depend on the particular locking mechanism, typically a semaphore, but you cannot be sure, since it is implementation dependent.
All architectures I know of use an atomic compare-and-swap to implement their synchronization primitives. See, for example, AbstractQueuedSynchronizer, which was used in some JDK versions to implement Semaphore and ReentrantLock.