Mutex and Semaphore in Java

Are mutexes and semaphores useful classes in Java, bearing in mind that the synchronized facility is also provided? Or have they just been provided for the sake of completeness, compared with C++? I cannot find much coverage of mutexes and semaphores.

In multithreaded applications where data is shared or tasks are synchronized, the lowest-level construct for accomplishing this is the simple lock, or mutex. On top of that you build more complicated objects and algorithms, like the semaphore.
Over the years, Java has provided objects and algorithms for many of the most common multithreaded design patterns. Take a look at the java.util.concurrent package for a complete list.
synchronized, IMHO, replaces most use cases of the humble mutex, and it cuts down on lines of code in the process. But there are still use cases for semaphores. As you can see in the java.util.concurrent package, the Semaphore class was added in Java 1.5, so I think (and I think the writers of Java also think) that there is still a need for the Semaphore object. But please take a look at the various classes provided in java.util.concurrent. Just reading through it provides insight into common multithreaded patterns that you may not have known about beforehand. Using a more succinct concurrent class to solve a multithreaded problem, versus implementing more complicated code using lower-level constructs (like a simple mutex or a synchronized block), can lead to fewer lines of code and a more elegant solution.
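As a minimal sketch (the class and method names here are made up for illustration) of the kind of problem Semaphore solves that a plain mutex or synchronized block does not: bounding the number of threads that may use a resource at the same time, rather than allowing only one.

```java
import java.util.concurrent.Semaphore;

// Hypothetical example: at most 3 threads may hold the resource at once.
class ConnectionLimiter {
    static final Semaphore permits = new Semaphore(3);

    static void useResource() throws InterruptedException {
        permits.acquire();            // blocks until a permit is free
        try {
            // ... work with the limited resource ...
        } finally {
            permits.release();        // always return the permit
        }
    }
}
```

With a mutex the count would be fixed at one; the semaphore's permit count generalizes it to N.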


How does StampedLock queue lock requests?

I am investigating locking a cache based on Java8's StampedLock (javadoc here) but I can't find a convincing implementation on the net to follow, despite reading articles like StampedLock Idioms.
I don't feel very positive about Java's multi-threading and concurrency offerings after being shocked that ReentrantReadWriteLock doesn't allow upgrading of a read lock to a write lock, followed by the difficulty homing in on a reputable alternative solution.
My issue is that there's no definitive statement to allay my fears that StampedLock will block write requests indefinitely while there are read requests queued.
Looking at the documentation, there are 2 comments which raise my suspicions.
From the Javadoc:
The scheduling policy of StampedLock does not consistently prefer
readers over writers or vice versa. All "try" methods are best-effort
and do not necessarily conform to any scheduling or fairness policy.
From the source code:
* These rules apply to threads actually queued. All tryLock forms
* opportunistically try to acquire locks regardless of preference
* rules, and so may "barge" their way in. Randomized spinning is
* used in the acquire methods to reduce (increasingly expensive)
* context switching while also ....
So it hints at a queue for read and write locks, but I'd need to read and digest the whole 1,500 lines of source code to nail it down.
I assume it must be there, because I found a good benchmarking article showing that StampedLock is the way to go for many reads / few writes. However, I'm still concerned by the lack of coverage online.
Fundamentally, I expected an implementation I could plug and play by following the javadoc, but instead I'm left rooting around the net wondering why there isn't an example anywhere of a looped StampedLock#tryOptimisticRead() - even the code from the benchmark article doesn't do that.
Is Java concurrency this difficult or have I missed something obvious?
"Is Java concurrency this difficult or have I missed something obvious?"
It is a matter of opinion1 whether Java concurrency is more difficult than (say) C++ or Python concurrency.
But yes, concurrency is difficult2 in any language that allows different threads to directly update shared data structures. Languages that (only) support CSP-like concurrency are easier to understand and reason about.
Reference:
https://en.wikipedia.org/wiki/Communicating_sequential_processes
To your point about fairness, it is true that most forms of locking in Java do not guarantee fairness. And indeed many things to do with thread scheduling are (deliberately) loosely specified. But it is not difficult to write code that avoids these problems ... once you understand the libraries and how to use them.
On your specific issue about StampedLock behavior:
My issue is that there's no definitive statement to allay my fears that StampedLock will block write requests indefinitely while there are read requests queued.
There is no such statement3 because such behavior is possible. It follows from a careful reading of the StampedLock API documentation. For example, it says:
"The scheduling policy of StampedLock does not consistently prefer readers over writers or vice versa."
In short, there is nothing there that guarantees that an untimed writeLock will eventually acquire the lock.
If you need to categorically avoid the scenario of readers causing writer starvation, then don't use writeLock and readLock. You could use tryOptimisticRead instead of readLock. Or you could design and implement a different synchronization mechanism.
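For reference, the looped-validation idiom the question asks about looks roughly like this; it follows the example given in the StampedLock javadoc. tryOptimisticRead never queues or blocks, so optimistic readers cannot starve a writer:

```java
import java.util.concurrent.locks.StampedLock;

class Point {
    private final StampedLock sl = new StampedLock();
    private double x, y;

    void move(double dx, double dy) {
        long stamp = sl.writeLock();       // exclusive lock for the update
        try {
            x += dx;
            y += dy;
        } finally {
            sl.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = sl.tryOptimisticRead();   // non-blocking "read"
        double curX = x, curY = y;             // read state optimistically
        if (!sl.validate(stamp)) {             // a writer intervened: retry
            stamp = sl.readLock();             // fall back to a real read lock
            try {
                curX = x;
                curY = y;
            } finally {
                sl.unlockRead(stamp);
            }
        }
        return Math.sqrt(curX * curX + curY * curY);
    }
}
```

The key point is that the optimistic read costs nothing if validation succeeds, and degrades to an ordinary read lock only when a write actually occurred.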
Finally, you seem to be implying that StampedLock should provide a way to deal directly with your scenario, and / or that the documentation should specifically explain to non-expert users how to deal with it. I draw people's attention to this:
"StampedLocks are designed for use as internal utilities in the development of thread-safe components.".
The fact that you are having difficulty finding pertinent examples is not the fault of the javadocs. If anything, it supports the inference that this API is for experts ...
1 - My opinion is that Java's concurrency support is at least easier to reason about than most other languages of its ilk. The Java Memory Model (Chapter 17.4 of the JLS) is well specified, and "Java Concurrency In Practice" by Goetz et al does a good job of explaining the ins and outs of concurrent programming.
2 - .... for most programmers.
3 - If this is not definitive enough for you, write yourself an example where there is a (simulated) indefinitely large sequence of read requests and multiple reader threads. Run it and see if the writer threads get stalled until the read requests are all drained.

Are java Lock implementations using synchronized code? [duplicate]

Is there a difference between 'ReentrantLock' and 'synchronized' on how it's implemented on CPU level?
Or do they use the same 'CAS' approach?
If we are talking about ReentrantLock vs synchronized (also known as "intrinsic lock") then it's a good idea to look at Lock documentation:
All Lock implementations must enforce the same memory synchronization semantics as provided by the built-in monitor lock:
A successful lock operation acts like a successful monitorEnter action
A successful unlock operation acts like a successful monitorExit action
So, in general, consider synchronized just an easy-to-use and concise approach to locking. You can achieve exactly the same synchronization effects with ReentrantLock at the cost of a bit more code (but it offers more options and flexibility).
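As a sketch of that equivalence (class and field names are made up), the two increments below have the same memory semantics; the ReentrantLock version just needs the explicit try/finally:

```java
import java.util.concurrent.locks.ReentrantLock;

class Counters {
    private final ReentrantLock lock = new ReentrantLock();
    private int intrinsicCount;   // guarded by the intrinsic lock (this)
    private int explicitCount;    // guarded by 'lock'

    // Intrinsic lock: the JVM emits monitorenter/monitorexit for us.
    synchronized void incrementIntrinsic() {
        intrinsicCount++;
    }

    // Same synchronization effect with ReentrantLock, a bit more code.
    void incrementExplicit() {
        lock.lock();
        try {
            explicitCount++;
        } finally {
            lock.unlock();        // must sit in finally, unlike synchronized
        }
    }

    synchronized int intrinsicCount() { return intrinsicCount; }

    int explicitCount() {
        lock.lock();
        try { return explicitCount; } finally { lock.unlock(); }
    }
}
```

The extra flexibility of ReentrantLock (tryLock, timed waits, interruptible acquisition, fairness) is what you buy with that extra code.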
Some time ago, ReentrantLock was way faster under certain conditions (high contention, for example), but now Java uses different optimization techniques (like lock coarsening and adaptive locking) to make the performance differences barely visible in many typical scenarios.
A great deal of work has also gone into optimizing the intrinsic lock in low-contention cases (e.g. biased locking). The authors of the Java platform like the synchronized keyword and the intrinsic-locking approach; they want programmers not to fear using this handy tool (and to avoid possible bugs). That's why synchronized optimizations and busting the "synchronization is slow" myth were such a big deal for Sun and Oracle.
"CPU-part" of the question:
synchronized uses a locking mechanism that is built into the JVM, via the MONITORENTER / MONITOREXIT bytecode instructions. So the underlying implementation is JVM-specific (which is why it is called an intrinsic lock) and, AFAIK, usually (subject to change) uses a pretty conservative strategy: once a lock is "inflated" after threads collide while acquiring it, synchronized switches to OS-based locking ("fat locking") instead of fast CAS ("thin locking") and is reluctant to use CAS again soon (even once the contention is gone).
ReentrantLock's implementation is based on AbstractQueuedSynchronizer and is coded in pure Java (using CAS instructions and thread descheduling, which were introduced in Java 5), so it is more stable across platforms, offers more flexibility, and tries the fast CAS approach for acquiring the lock every time (falling back to OS-level locking if that fails).
So the main difference between these lock implementations, in terms of performance, is the lock-acquisition strategy (which may not even differ in a specific JVM implementation or situation).
And there is no general answer as to which locking is better; it is also subject to change over time and across platforms. You should look at the specific problem and its nature to pick the most suitable solution (as usual in Java).
PS: you're pretty curious, so I highly recommend looking at the HotSpot sources to go deeper (and to find the exact implementation for a specific platform version). It may really help. A starting point is somewhere here: http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/runtime/synchronizer.cpp
The ReentrantLock class, which implements Lock, has the same concurrency and memory semantics as synchronized, but also adds features like lock polling, timed lock waits, and interruptible lock waits. Additionally, it offers far better performance under heavy contention.
Source
The answer above is an extract from Brian Goetz's article. You should read the entire article; it helped me understand the differences between the two.

Java uses synchronisation... What does Haskell use?

So I am pretty new to Haskell and would like to know: if synchronisation is used to prevent corruption when multithreading in Java, how is this done in Haskell? I've only found useless or overly complicated responses on Google.
Your question is a bit ambiguous since one may use multithreading for either concurrency or parallelism, which are distinct problems with distinct solutions.
In both cases, you'll need to make sure your programs are compiled with SMP support and run using multiple RTS threads: see the GHC manual's section about concurrency.
Concurrency
As others have pointed out, synchronization will be a non-problem in the vast majority of your code, since you'll mostly be dealing with pure functions. This is true in any language, if you religiously avoid mutable state unless it is properly wrapped behind a pure API. Concurrency is an area where Haskell shines because its semantics require purity. Types are used to describe impure operations instead, making it dead easy to spot code where some sort of synchronization might be needed.
Typically, your application's state will be backed by a transactional database which will handle synchronization and persistence for you. You will not need any additional synchronization at all if your concurrent application does not have additional state.
In other cases, Haskell has a handy Software Transactional Memory implementation. It allows you to write and compose code in an imperative-looking style, without explicit locking, while getting atomicity and guarantees against deadlocks. It is the foolproof(tm) way to write concurrent code.
Lastly, there are some low-level primitives available in base: plain old mutable references with IORef, semaphores, and MVars which can be used as if they were variables protected by a mutex.
There are also channels in base, but beware: they are unbounded!
Parallelism
This is also an area where Haskell shines because of its non-strict semantics. Non-strictness allows you to write code that expresses your logic in a straightforward manner while not getting committed to a specific evaluation order.
As a consequence, you can describe a parallel evaluation strategy separately from the business logic. Writing parallel code is then just a matter of placing the right annotation in the right spot.
Here is an example that was/is used in production at Bdellium:
map outputParticipant parts `using` parListChunk 10 rdeepseq
^^^^^ business logic ^^^^^^ ^^^^ eval. strategy ^^^^
The code can be understood as follows: Parallel workers will fully evaluate the results of mapping the outputParticipant function to individual items in the parts list, distributing the work in chunks of 10 elements.
This answer pertains to functional languages in general: no synchronisation is needed. Functions in functional programming have no side effects; they accept a value and return a value, and there is no mutable state. Such functions are inherently thread-safe.

Atomic Boolean vs SynchronizedBoolean in Java

I've come across these two in some multi-threaded code and was wondering if there is any difference between them.
I mean, does the use of an AtomicBoolean, rather than a SynchronizedBoolean, make a significant difference in performance?
And does it affect the correctness of the computation?
AtomicBoolean is a part of the standard java concurrent package. SynchronizedBoolean is part of a set of utilities created by Doug Lea (author of much of the java concurrent packages). Performance-wise, you should expect AtomicBoolean to perform better -- it uses a volatile boolean whereas SynchronizedBoolean uses a ReadWriteLock.
However in practice for most applications you won't notice much difference.
The real difference (and what should guide your choice) is in the semantics the two classes offer. AtomicBoolean provides just simple set/get/compareAndSet operations. The SynchronizedBoolean offers atomic boolean operations and exposes its internal lock to allow you to execute Runnables within the context of its value.
Doug Lea has offered this source free to the community. I have found an extension of SynchronizedBoolean, WaitableBoolean, particularly useful, since it allows you to execute a Runnable within the lock whenever a particular state change occurs.

Should I always make my java-code thread-safe, or for performance-reasons do it only when needed?

If I create classes that are currently used in only a single thread, should I make them thread-safe even though I don't need that at the moment? It could happen that I later use the class in multiple threads, and at that time I could get race conditions and might have a hard time finding them if I hadn't made the class thread-safe in the first place. Or should I leave the class non-thread-safe, for better performance? But premature optimization is evil.
Asked differently: should I make my classes thread-safe when needed (if used in multiple threads, otherwise not), or should I optimize the issue when needed (if I see that synchronization eats up an important part of the processing time)?
If I choose one of the two ways, are there methods to reduce the disadvantages? Or is there a third possibility that I should use?
EDIT: Let me give the reason this question came to mind. At our company we have written a very simple user management that writes its data into property files. I used it in a web app, and after some work on it I got strange errors: the user management forgot about properties of users (including name and password) and roles. That was very annoying, but not consistently reproducible, so I think it was a race condition. Since I synchronized all methods reading from and writing to disk, the problem disappeared. So I thought that all the hassle could probably have been avoided if we had written the class with synchronization in the first place.
EDIT 2: As I look over the tips of Pragmatic Programmer, I saw tip #41: Always Design for Concurrency. This doesn't say that all code should be thread-safe, but it says the design should have the concurrency in mind.
I used to try to make everything thread-safe - then I realised that the very meaning of "thread-safe" depends on the usage. You often just can't predict that usage, and the caller will have to take action anyway to use it in a thread-safe way.
These days I write almost everything assuming single threading, and put threading knowledge in the select few places where it matters.
Having said that, I do also (where appropriate) create immutable types, which are naturally amenable to multi-threading - as well as being easier to reason about in general.
Start from the data. Decide which data is explicitly shared and protect it. If at all possible, encapsulate the locking with the data. Use pre-existing thread-safe concurrent collections.
Whenever possible, use immutable objects. Make attributes final, set their values in the constructors. If you need to "change" the data consider returning a new instance. Immutable objects don't need locking.
For objects that are not shared or thread-confined, do not spend time making them thread-safe.
Document the expectations in the code. The JCIP annotations are the best pre-defined choice available.
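As a sketch of the immutable-object advice above (the Range class is made up for illustration): final fields set once in the constructor, and "mutation" returns a new instance, so instances can be shared between threads without any locking.

```java
// Minimal immutable value class: no locking needed, ever.
final class Range {
    private final int low;
    private final int high;

    Range(int low, int high) {
        if (low > high) throw new IllegalArgumentException("low > high");
        this.low = low;
        this.high = high;
    }

    int low()  { return low; }
    int high() { return high; }

    // "Changing" the data yields a fresh instance; the original is untouched.
    Range withHigh(int newHigh) { return new Range(low, newHigh); }
}
```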
Follow the principle of "as simple as possible, but no simpler." Absent a requirement, you should not make your classes thread-safe. Doing so would be speculative, and likely unnecessary. Thread-safe programming adds much more complexity to your classes, and will likely make them less performant due to synchronization tasks.
Unless explicitly stated that an object is thread-safe, the expectation is that it is not.
I personally would only design classes that are "thread-safe" when needed - on the principle of optimise only when needed. Sun seem to have gone the same way with the example of single threaded collections classes.
However there are some good principles that will help you either way if you decide to change:
Most important: THINK BEFORE YOU SYNCHRONIZE. I had a colleague once who used to synchronize stuff "just in case - after all synchronized must be better, right?" This is WRONG, and was a cause of multiple deadlock bugs.
If your Objects can be immutable, make them immutable. This will not only help with threading, will help them be safely used in sets, as keys for Maps etc
Keep your Objects as simple as possible. Each one should ideally only do one job. If you ever find you might want to synchronise access to half the members, then you possibly should split the Object in two.
Learn java.util.concurrent and use it whenever possible. Their code will be better, faster and safer than yours (or mine) in 99% of cases.
Read Concurrent Programming in Java, it's great!
Just as a side remark: synchronization != thread-safety. Even if you do not modify data concurrently, you might still read it concurrently. So keep the Java Memory Model in mind, where synchronization means making data reliably visible to all threads, not only protecting it from concurrent modification.
And yes, in my opinion thread-safety has to be built in right from the beginning, and it depends on the application logic whether you need to handle concurrency. Never assume anything, and even if your tests seem fine, race conditions are sleeping dogs.
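A minimal sketch of that visibility point (the class name is hypothetical): volatile, like synchronized, guarantees that a write made in one thread is visible to reads in another, which a plain field does not.

```java
// Without 'volatile' (or synchronization), the Java Memory Model allows
// a reader thread to never observe the writer's update.
class VisibilityDemo {
    private volatile boolean done;   // volatile ensures cross-thread visibility

    void writer() { done = true; }

    boolean reader() { return done; }

    public static void main(String[] args) throws InterruptedException {
        VisibilityDemo d = new VisibilityDemo();
        Thread t = new Thread(d::writer);
        t.start();
        t.join();                    // join() also establishes happens-before
        System.out.println(d.reader());
    }
}
```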
I found the JCIP annotations very useful for declaring which classes are thread-safe. My team annotates our classes as @ThreadSafe, @NotThreadSafe or @Immutable. This is much clearer than having to read Javadoc, and FindBugs helps us find violations of the @Immutable and @GuardedBy contracts too.
You should absolutely know which segments of your code will be multi-threaded and which won't.
Without being able to concentrate the area of multithreadedness into a small, controllable section, you will not succeed. The parts of your app that are multi-threaded need to be gone over carefully, fully analyzed, understood and adapted for a multi-threaded environment.
The rest does not and therefore making it thread-safe would be a waste.
For instance, with the Swing GUI, Sun simply decided that none of it would be multi-threaded.
Oh, and if someone uses your classes, it's up to them to ensure that, if the usage is in a threaded section, it is made thread-safe.
Sun initially came out with thread-safe collections (only). The problem is, thread-safe cannot be made un-thread-safe (for performance purposes). So they later came out with un-thread-safe versions plus wrappers to make them thread-safe. In most cases, the wrappers are unnecessary - assume that unless you are creating the threads yourself, your class does not have to be thread-safe - but DOCUMENT it in the javadocs.
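The wrapper approach described above, sketched with the standard Collections.synchronizedList (the helper methods here are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WrapperDemo {
    static List<String> makeSafeList() {
        List<String> plain = new ArrayList<>();       // not thread-safe on its own
        return Collections.synchronizedList(plain);   // every method now locks
    }

    static int countUnderLock(List<String> safe) {
        // Iteration is a compound action: per the Collections javadoc it
        // still needs a manual synchronized block on the wrapper itself.
        synchronized (safe) {
            int n = 0;
            for (String ignored : safe) n++;
            return n;
        }
    }
}
```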
If I create classes, that are used at the moment only in a single thread, should I make them thread-safe
It is not necessary for a class used by a thread to itself be thread-safe for the program as a whole to be thread-safe. You can safely share objects of non-thread-safe classes between threads if they are protected by appropriate synchronization. So there is no need to make a class itself thread-safe until that need becomes apparent.
However, multi-threading is a fundamental (architectural) choice in a program. It is not really something to add as an afterthought. So you should know right from the start which classes need to be thread-safe.
Here's my personal approach:
Make objects and data structure immutable wherever you can. That is good practice in general, and is automatically thread safe. Problem solved.
If you have to make an object mutable then normally don't bother trying to make it thread safe. The reasoning for this is simple: when you have mutable state then locking / control cannot be safely handled by a single class. Even if you synchronize all the methods, this doesn't guarantee thread safety. And if you add synchronisation to an object that only ever gets used in a single-threaded context, then you've just added unnecessary overhead. So you might as well leave it up to the caller / user to implement whatever locking system is necessary.
If you provide a higher level public API then implement whatever locking is required to make your API thread safe. For higher level functionality the overhead of thread safety is pretty trivial, and your users will definitely thank you. An API with complicated concurrency semantics that the users need to work around is not a good API!
This approach has served me well over time: you may need to make the occasional exception but on average it's a very good place to start!
If you want to follow what Sun did in the Java API, you can take a look at the collection classes. Many common collection classes are not thread-safe, but have thread-safe counterparts. According to Jon Skeet (see comments), many of the Java classes were originally thread-safe, but they were not benefiting developers, so some classes now have two versions - one being thread-safe and the other not thread-safe.
My advice is to not make the code thread-safe until you have to, as there is some overhead involved with thread-safety. I guess this falls into the same category as optimization - don't do it before you have to.
Design separately the classes to use from multiple threads and document other ones to be used from only single thread.
Single threaded ones are much easier to work with.
Separating the multithreaded logic helps to make the synchronization correct.
"Always" is a very dangerous word in software development... choices like this are "always" situational.
To avoid race conditions, lock on only one object. Read descriptions of race conditions carefully and you will discover that cross-locks (deadlock, really - "race condition" is a misnomer here, since the race comes to a halt) are always a consequence of two or more threads trying to lock on two or more objects.
Make all methods synchronized and do testing - for any real-world app that actually has to deal with these issues, synchronization is a small cost.
Just keep your burger-flippin' resume current.
