I am learning concurrency in Java, and came to know about semaphores, which can be used for synchronisation without busy waiting.
Now, I'm wondering if Java synchronised method/statements and locks (e.g. re-entrant lock) are busy waiting mechanisms?
If not, how do other thread(s) get notified, do they implement semaphores under the hood?
synchronised method()
synchronised(object){}
reeantrant.lock()
how do other thread(s)get notified,
Virtually all practical JVMs let the operating system do the real work. workstations, servers, and mobile devices all run preemptive, multi-tasking operating systems these days, and it's the operating system that provides the primitive functions upon which threads, mutexes, semaphores, etc. are all built.
The most important thing that the OS can do that programs can't do for themselves is called a context switch. That's where the OS suspends one thread, saves its state in a special "context record," and then restores the context of some other thread, allowing the second thread to start using the same CPU that the first thread previously was using.
Whenever a thread waits to lock a mutex, waits for I/O, for some_object.notify(), or for pretty much anything else, the OS "switches out" its context, places it in a queue of things that are awaiting that event, and we say that the thread is blocked on the event. When the event happens (e.g., when somebody releases a mutex) the OS moves the thread's context into a queue of threads that are RUNNABLE, and eventually, when a CPU becomes available, it "switches in" the context, and the thread gets to run again.
do they implement semaphores under the hood?
Semaphore is a very old idea: It originally was meant to be a low-level primitive operation--one that could be implemented in an arcane way, without any hardware support--upon which other things like mutual exclusion, queues, barriers, etc. could be built.
These days, mutexes are the lowest level--implemented using special hardware instructions--and a semaphore is a higher level object that is built on top of a mutex. One of the main reasons we still have Semaphore is just that so much of the existing literature talks about semaphores, and so much existing code still uses semaphores.
The way Lock interface with class Reentrant(true) lock works is that it uses BlockingQueue to store Threads that want to acquire the lock. In that way thread that 'came first, go out first'-FIFO. All clear about that.
But where do 'unfair locks' go, or ReentrantLock(false). What is their internal implementation? How does OS decide which thread now to pick? And most importantly are now these threads also stored in a queue or where? (they must be somewhere)
The class ReentrantLock does not use a BlockingQueue. It uses a non-public subclass of AbstractQueuedSynchronizer behind the scenes.
The AbstractQueuedSynchronizer class, as its documentation states, maintains “a first-in-first-out (FIFO) wait queue”. This data structure is the same for fair and unfair locks. The unfairness doesn’t imply that the lock would change the order of enqueued waiting threads, as there would be no advantage in doing that.
The key difference is that an unfair lock allows a lock attempt to succeed immediately when the lock just has been released, even when there are other threads waiting for the lock for a longer time. In that scenario, the queue is not even involved for the overtaking thread. This is more efficient than adding the current thread to the queue and putting it into the wait state while removing the longest waiting thread from the queue and changing its state to “runnable”.
When the lock is not available by the time, a thread tries to acquire it, the thread will be added to the queue and at this point, there is no difference between fair and unfair locks for it (except that other threads may overtake it without getting enqueued). Since the order has not been specified for an unfair lock, it could use a LIFO data structure behind the scenes, but it’s obviously simpler to have just one implementation code for both.
For synchronized, on the other hand, which does not support fair acquisition, there are some JVM implementations using a LIFO structure. This may change from one version to another (or even with the same, as a side effect of some JVM options or environmental aspects).
Another interesting point in this regard, is that the parameterless tryLock() of the ReentrantLock implementation will be unfair, even when the lock is otherwise in fair mode. This demonstrates that being unfair is not a property of the waiting queue here, but the treatment of the arriving thread that makes a new lock attempt.
Even when this lock has been set to use a fair ordering policy, a call to tryLock() will immediately acquire the lock if it is available, whether or not other threads are currently waiting for the lock. This "barging" behavior can be useful in certain circumstances, even though it breaks fairness.
ReentrantLock API doc says:
The constructor for this class accepts an optional fairness parameter. When set true, under contention, locks favor granting access to the longest-waiting thread.
Note however, that fairness of locks does not guarantee fairness of thread scheduling. Thus, one of many threads using a fair lock may obtain it multiple times in succession while other active threads are not progressing and not currently holding the lock.
I am not able to understand points 2:
If one thread obtain lock multiple times in succession, then as per point 1, other threads will wait for longer and that does mean they will get the lock next time. Then how this does not affect (fairness of) thread scheduling? Thus, I feel fair lock is nothing but longest waiting time first thread scheduling.
I think they're just trying to separate the fairness logic side from the scheduling logic. Threads may be concurrent, but that doesn't mean they try to access Locks simultaneously. Thread priority requests are only 'hints' to the OS, and are never guaranteed the way they may be expected.
So, just because you have threads A and B, which may request a lock, which may even have identical behavior, one thread may execute, acquire the lock, release, re-acquire, before the other locks even request it:
A: Request Lock -> Release Lock -> Request Lock Again (Succeeds)
B: Request Lock (Denied)...
----------------------- Time --------------------------------->
Thread scheduling logic is decoupled from the Lock logic.
There are other scheduling issues too, the burden of which often falls on the software designer, see Starvation and Livelock
For example, I have a method called increase.
public synchronized void increase() {
count++;
}
Two threads (T_1 and T_2) both execute this method. We know count++ is a compound operation that consists of read, modify and write. If T_1 gets the lock firstly, execute read, can T_1 be interrupted at this point (Although T_2 can't do anything besides waiting for the lock to be released)?
From Concurrency in Go by Katherine Cox-Buday, it says:
When something is considered atomic, or to have the property of atomicity, this means that within the context that it is operating, it is indivisible, or uninterruptible.
I think it means that atomicity is truly uninterruptible.
But from one answer to java - What does "atomic" mean in programming? - Stack Overflow, it says:
"Atomic operation" means an operation that appears to be instantaneous from the perspective of all other threads. You don't need to worry about a partly complete operation when the guarantee applies.
I think it means that atomicity is virtually uninterruptible (it can be interrupted, but from the perspective of all other threads, it seems it can't be interrupted.).
So, which one is right?
A it was pointed out there is now way for your program to understand the difference. But threads scheduler on OS level is free to suspend a thread and resume another thread.
For example if inside synchronized method you are doing some blocking I/O (like reading user input or reading from socket) scheduler can detect it and pause a thread which holds lock but is blocked on I/O and try to resume any other thread (maybe even the one which is also waiting for the same lock).
java.util.concurrent API provides a class called as Lock, which would basically serialize the control in order to access the critical resource. It gives method such as park() and unpark().
We can do similar things if we can use synchronized keyword and using wait() and notify() notifyAll() methods.
I am wondering which one of these is better in practice and why?
If you're simply locking an object, I'd prefer to use synchronized
Example:
Lock.acquire();
doSomethingNifty(); // Throws a NPE!
Lock.release(); // Oh noes, we never release the lock!
You have to explicitly do try{} finally{} everywhere.
Whereas with synchronized, it's super clear and impossible to get wrong:
synchronized(myObject) {
doSomethingNifty();
}
That said, Locks may be more useful for more complicated things where you can't acquire and release in such a clean manner. I would honestly prefer to avoid using bare Locks in the first place, and just go with a more sophisticated concurrency control such as a CyclicBarrier or a LinkedBlockingQueue, if they meet your needs.
I've never had a reason to use wait() or notify() but there may be some good ones.
I am wondering which one of these is better in practice and why?
I've found that Lock and Condition (and other new concurrent classes) are just more tools for the toolbox. I could do most everything I needed with my old claw hammer (the synchronized keyword), but it was awkward to use in some situations. Several of those awkward situations became much simpler once I added more tools to my toolbox: a rubber mallet, a ball-peen hammer, a prybar, and some nail punches. However, my old claw hammer still sees its share of use.
I don't think one is really "better" than the other, but rather each is a better fit for different problems. In a nutshell, the simple model and scope-oriented nature of synchronized helps protect me from bugs in my code, but those same advantages are sometimes hindrances in more complex scenarios. Its these more complex scenarios that the concurrent package was created to help address. But using this higher level constructs requires more explicit and careful management in the code.
===
I think the JavaDoc does a good job of describing the distinction between Lock and synchronized (the emphasis is mine):
Lock implementations provide more extensive locking operations than can be obtained using synchronized methods and statements. They allow more flexible structuring, may have quite different properties, and may support multiple associated Condition objects.
...
The use of synchronized methods or statements provides access to the implicit monitor lock associated with every object, but forces all lock acquisition and release to occur in a block-structured way: when multiple locks are acquired they must be released in the opposite order, and all locks must be released in the same lexical scope in which they were acquired.
While the scoping mechanism for synchronized methods and statements makes it much easier to program with monitor locks, and helps avoid many common programming errors involving locks, there are occasions where you need to work with locks in a more flexible way. For example, **some algorithms* for traversing concurrently accessed data structures require the use of "hand-over-hand" or "chain locking": you acquire the lock of node A, then node B, then release A and acquire C, then release B and acquire D and so on. Implementations of the Lock interface enable the use of such techniques by allowing a lock to be acquired and released in different scopes, and allowing multiple locks to be acquired and released in any order.
With this increased flexibility comes additional responsibility. The absence of block-structured locking removes the automatic release of locks that occurs with synchronized methods and statements. In most cases, the following idiom should be used:
...
When locking and unlocking occur in different scopes, care must be taken to ensure that all code that is executed while the lock is held is protected by try-finally or try-catch to ensure that the lock is released when necessary.
Lock implementations provide additional functionality over the use of synchronized methods and statements by providing a non-blocking attempt to acquire a lock (tryLock()), an attempt to acquire the lock that can be interrupted (lockInterruptibly(), and an attempt to acquire the lock that can timeout (tryLock(long, TimeUnit)).
...
You can achieve everything the utilities in java.util.concurrent do with the low-level primitives like synchronized, volatile, or wait / notify
However, concurrency is tricky, and most people get at least some parts of it wrong, making their code either incorrect or inefficient (or both).
The concurrent API provides a higher-level approach, which is easier (and as such safer) to use. In a nutshell, you should not need to use synchronized, volatile, wait, notify directly anymore.
The Lock class itself is on the lower-level side of this toolbox, you may not even need to use that directly either (you can use Queues and Semaphore and stuff, etc, most of the time).
There are 4 main factors into why you would want to use synchronized or java.util.concurrent.Lock.
Note: Synchronized locking is what I mean when I say intrinsic locking.
When Java 5 came out with
ReentrantLocks, they proved to have
quite a noticeble throughput
difference then intrinsic locking.
If youre looking for faster locking
mechanism and are running 1.5
consider j.u.c.ReentrantLock. Java
6's intrinsic locking is now
comparable.
j.u.c.Lock has different mechanisms
for locking. Lock interruptable -
attempt to lock until the locking
thread is interrupted; timed lock -
attempt to lock for a certain amount
of time and give up if you do not
succeed; tryLock - attempt to lock,
if some other thread is holding the
lock give up. This all is included
aside from the simple lock.
Intrinsic locking only offers simple
locking
Style. If both 1 and 2 do not fall
into categories of what you are
concerned with most people,
including myself, would find the
intrinsic locking semenatics easier
to read and less verbose then
j.u.c.Lock locking.
Multiple Conditions. An object you
lock on can only be notified and
waited for a single case. Lock's
newCondition method allows for a
single Lock to have mutliple reasons
to await or signal. I have yet to
actually need this functionality in
practice, but is a nice feature for
those who need it.
I would like to add some more things on top of Bert F answer.
Locks support various methods for finer grained lock control, which are more expressive than implicit monitors (synchronized locks)
A Lock provides exclusive access to a shared resource: only one thread at a time can acquire the lock and all access to the shared resource requires that the lock be acquired first. However, some locks may allow concurrent access to a shared resource, such as the read lock of a ReadWriteLock.
Advantages of Lock over Synchronization from documentation page
The use of synchronized methods or statements provides access to the implicit monitor lock associated with every object, but forces all lock acquisition and release to occur in a block-structured way
Lock implementations provide additional functionality over the use of synchronized methods and statements by providing a non-blocking attempt to acquire a lock (tryLock()), an attempt to acquire the lock that can be interrupted (lockInterruptibly(), and an attempt to acquire the lock that can timeout (tryLock(long, TimeUnit)).
A Lock class can also provide behavior and semantics that is quite different from that of the implicit monitor lock, such as guaranteed ordering, non-reentrant usage, or deadlock detection
ReentrantLock: In simple terms as per my understanding, ReentrantLock allows an object to re-enter from one critical section to other critical section . Since you already have lock to enter one critical section, you can other critical section on same object by using current lock.
ReentrantLock key features as per this article
Ability to lock interruptibly.
Ability to timeout while waiting for lock.
Power to create fair lock.
API to get list of waiting thread for lock.
Flexibility to try for lock without blocking.
You can use ReentrantReadWriteLock.ReadLock, ReentrantReadWriteLock.WriteLock to further acquire control on granular locking on read and write operations.
Apart from these three ReentrantLocks, java 8 provides one more Lock
StampedLock:
Java 8 ships with a new kind of lock called StampedLock which also support read and write locks just like in the example above. In contrast to ReadWriteLock the locking methods of a StampedLock return a stamp represented by a long value.
You can use these stamps to either release a lock or to check if the lock is still valid. Additionally stamped locks support another lock mode called optimistic locking.
Have a look at this article on usage of different type of ReentrantLock and StampedLock locks.
The main difference is fairness, in other words are requests handled FIFO or can there be barging? Method level synchronization ensures fair or FIFO allocation of the lock. Using
synchronized(foo) {
}
or
lock.acquire(); .....lock.release();
does not assure fairness.
If you have lots of contention for the lock you can easily encounter barging where newer requests get the lock and older requests get stuck. I've seen cases where 200 threads arrive in short order for a lock and the 2nd one to arrive got processed last. This is ok for some applications but for others it's deadly.
See Brian Goetz's "Java Concurrency In Practice" book, section 13.3 for a full discussion of this topic.
Major difference between lock and synchronized:
with locks, you can release and acquire the locks in any order.
with synchronized, you can release the locks only in the order it was acquired.
Brian Goetz's "Java Concurrency In Practice" book, section 13.3:
"...Like the default ReentrantLock, intrinsic locking offers no deterministic fairness guarantees, but the
statistical fairness guarantees of most locking implementations are good enough for almost all situations..."
Lock makes programmers' life easier. Here are a few situations that can be achieved easily with lock.
Lock in one method, and release the lock in another method.
If You have two threads working on two different pieces of code, however, in the first thread has a pre-requisite on a certain piece of code in the second thread (while some other threads also working on the same piece of code in the second thread simultaneously). A shared lock can solve this problem quite easily.
Implementing monitors. For example, a simple queue where the put and get methods are executed from many other threads. However, you do not want multiple put (or get) methods running simultaneously, neither the put and get method running simultaneously. A private lock makes your life a lot easier to achieve this.
While, the lock, and conditions build on the synchronized mechanism. Therefore, can certainly be able to achieve the same functionality that you can achieve using the lock. However, solving complex scenarios with synchronized may make your life difficult and can deviate you from solving the actual problem.
Lock and synchronize block both serves the same purpose but it depends on the usage. Consider the below part
void randomFunction(){
.
.
.
synchronize(this){
//do some functionality
}
.
.
.
synchronize(this)
{
// do some functionality
}
} // end of randomFunction
In the above case , if a thread enters the synchronize block, the other block is also locked. If there are multiple such synchronize block on the same object, all the blocks are locked. In such situations , java.util.concurrent.Lock can be used to prevent unwanted locking of blocks