What is the latency of a BlockingQueue's take() method? - java

I'd like to understand how take() works and whether it's a suitable method for quickly consuming elements that are pushed onto a queue.
Note that, for the sake of understanding how it works, I'm not considering the observer pattern here: I know that I could use that pattern to "react quickly" to events, but that's not what my question is about.
For example, if I have a BlockingQueue (mostly empty) and a thread "stuck" waiting for an element to be pushed onto that queue so that it can be consumed, what would be a good way to minimize the time spent (i.e. reduce the latency) between the moment an element is pushed onto the queue and the moment it is consumed?
For example, what's the difference between a thread doing this:
while (true) {
    elem = queue.peek();
    if (elem == null) {
        Thread.sleep(25); // prevents busy-looping
    } else {
        ... // do something here
    }
}
and another one doing this:
while (true) {
    elem = queue.take();
    ... // do something with elem here
}
(I take it that, to simplify things, we can ignore exception handling here!?)
What goes on under the hood when you call take() and the queue is empty? The JVM somehow has to put the thread to sleep, because it can't just busy-loop, constantly checking whether there's something on the queue. Does take() use some CAS operation under the hood? And if so, what determines how often take() performs that CAS operation?
What happens when something suddenly makes it onto the queue? How is the thread blocked on take() "notified" that it should act promptly?
Lastly, is it "common" to have one thread "stuck" on take() on a BlockingQueue for the lifetime of the application?
It's all one big question about how the blocking take() works, and I take it that answering my various sub-questions (at least the ones that make sense) will help me understand all this better.

Internally, take waits on the notEmpty condition, which is signaled in the insert method; in other words, the waiting thread goes to sleep, and wakes up on insert. This should be fast.
Some blocking queues, e.g. ArrayBlockingQueue and SynchronousQueue, have a constructor that accepts the queue's fairness property; passing in true ensures that the longest-waiting thread is served first, which prevents a thread from being starved on take; otherwise starvation is a possibility. (This parameter specifies whether the underlying ReentrantLock is fair.)
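For illustration, here is a minimal sketch of constructing fair queues (the class name FairQueues and the capacity of 1024 are made up for the example):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;

public class FairQueues {
    public static void main(String[] args) {
        // The second constructor argument is the fairness flag: the underlying
        // ReentrantLock is fair, so blocked takers are served roughly in FIFO order.
        BlockingQueue<String> fairBounded = new ArrayBlockingQueue<>(1024, true);
        BlockingQueue<String> fairHandOff = new SynchronousQueue<>(true);
        System.out.println(fairBounded.remainingCapacity()); // 1024
        System.out.println(fairHandOff.isEmpty());           // true: a SynchronousQueue holds no elements
    }
}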

Well, here's the implementation of LinkedBlockingQueue<E>.take():
public E take() throws InterruptedException {
    E x;
    int c = -1;
    final AtomicInteger count = this.count;
    final ReentrantLock takeLock = this.takeLock;
    takeLock.lockInterruptibly();
    try {
        while (count.get() == 0) {
            notEmpty.await();
        }
        x = dequeue();
        c = count.getAndDecrement();
        if (c > 1)
            notEmpty.signal();
    } finally {
        takeLock.unlock();
    }
    if (c == capacity)
        signalNotFull();
    return x;
}
When the queue is empty, notEmpty.await() is called, which:
Causes the current thread to wait until it is signalled or interrupted.
The lock associated with this Condition is atomically released and the current thread becomes disabled for thread scheduling purposes and lies dormant until one of four things happens:
Some other thread invokes the signal method for this Condition and the current thread happens to be chosen as the thread to be awakened; or
Some other thread invokes the signalAll method for this Condition; or
Some other thread interrupts the current thread, and interruption of thread suspension is supported; or
A "spurious wakeup" occurs.
When another thread puts something in the queue, it calls signal, which wakes up one of the threads waiting to consume items from this queue. This should work faster than your peek/sleep loop.
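For completeness, here is a simplified sketch of the corresponding put() path inside the same class (paraphrased, not verbatim JDK source; details vary by version). The producer enqueues under putLock and, when the queue goes from empty to non-empty, signals the notEmpty condition that take() is awaiting:

public void put(E e) throws InterruptedException {
    if (e == null) throw new NullPointerException();
    int c = -1;
    final ReentrantLock putLock = this.putLock;
    final AtomicInteger count = this.count;
    putLock.lockInterruptibly();
    try {
        while (count.get() == capacity) {
            notFull.await();                 // queue full: the producer sleeps
        }
        enqueue(new Node<E>(e));
        c = count.getAndIncrement();
        if (c + 1 < capacity)
            notFull.signal();                // still room: wake another producer
    } finally {
        putLock.unlock();
    }
    if (c == 0)
        signalNotEmpty();                    // queue was empty: wake a blocked take()
}

private void signalNotEmpty() {
    final ReentrantLock takeLock = this.takeLock;
    takeLock.lock();
    try {
        notEmpty.signal();
    } finally {
        takeLock.unlock();
    }
}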

You can assume that take() will be notified that it can wake as soon as your OS can pass such a signal between threads. Note: your OS will be involved in the worst case. Typically this takes 1 - 10 microseconds, and in rare cases 100 or even 1000 microseconds. Note: Thread.sleep will wait for a minimum of about 1000 microseconds, and 25 milliseconds is 25,000 microseconds, so I would hope the difference is obvious to you.
The only real way of avoiding rare but long context switches is to busy-wait on a CPU the thread has been pinned to with CPU affinity (this dedicates a core to your thread). If your application is that latency-sensitive, a simpler solution is to not pass the work between threads at all. ;)
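If you do go down the busy-waiting route, a sketch might look like the following (SpinningConsumer is a made-up name; pinning the thread to a core would still need an OS tool such as taskset or a third-party affinity library):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SpinningConsumer implements Runnable {
    private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            Runnable task = tasks.poll();   // non-blocking: never parks the thread
            if (task != null) {
                task.run();
            } else {
                Thread.onSpinWait();        // Java 9+: hint that we are busy-spinning
            }
        }
    }

    public void submit(Runnable task) {
        tasks.add(task);
    }

    public void stop() {
        running = false;
    }
}

This trades one fully burnt CPU core for the lowest possible hand-off latency, so it only makes sense when that latency really matters.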

Since two threads are involved, the peek/sleep approach with a hypothetical micro/nano-sleep implementation would not differ too much from take(), since they both involve passing information from one thread to the next via main memory (using volatile writes/reads and a healthy amount of CAS), unless the JVM finds other ways to do inter-thread synchronization. You can try to implement a benchmark using two BlockingQueues and two threads, each of which acts as producer for one queue and consumer for the other, and move a token back and forth, taking it from one queue and offering it to the next. Then you can see how fast they can produce/consume and compare that to peek/sleep. I guess performance depends a lot on the amount of work spent on each token (in this case zero, so we measure pure overhead) and on the distance from CPU to memory. In my experience, single-socket CPUs come out way ahead of multi-socket machines.
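A minimal sketch of such a ping-pong benchmark (QueuePingPong and the round count are arbitrary choices for the example):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueuePingPong {
    public static void main(String[] args) throws InterruptedException {
        final BlockingQueue<Long> ping = new LinkedBlockingQueue<>();
        final BlockingQueue<Long> pong = new LinkedBlockingQueue<>();
        final int rounds = 1_000_000;

        Thread echo = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) {
                    pong.put(ping.take());   // bounce the token straight back
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        echo.start();

        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            ping.put(0L);                    // hand the token over...
            pong.take();                     // ...and wait for it to come back
        }
        long elapsed = System.nanoTime() - start;
        echo.join();
        System.out.printf("%.0f ns per round trip (two hand-offs)%n", (double) elapsed / rounds);
    }
}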

The difference is that the first thread sleeps for up to 25ms too long, whereas the second thread doesn't waste any time at all.

Related

How efficient are BlockingQueues / what's their effect on CPU time?

I am making an online game in Java and I ran into a particular issue while trying to find the most efficient way to send clients spawn-entity NPC packets. I understand how to send them, but I wanted to do it off the main game loop, since it requires looping through a map of NPCs (I also made sure it's thread safe). To do this I thought a BlockingQueue was my best option, so I created a new thread, set it to daemon, and then passed in a runnable object. Then, whenever I need to send one of these packets, I use the insertElement() method to add it to the queue. Here is how it looks:
public class NpcAsyncRunnable implements Runnable {

    private final BlockingQueue<NpcObject> blockingQueue;

    public NpcAsyncRunnable() {
        blockingQueue = new LinkedBlockingQueue<>();
    }

    @Override
    public void run() {
        while (true) {
            try {
                final NpcObject obj = blockingQueue.take();
                // Run my algorithm here
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public void insertElement(final NpcObject obj) {
        blockingQueue.add(obj);
    }
}
Now my question is: how efficient is this? I am running the thread the whole time in an infinite loop because I always want it to be checking for another inserted element. However, my concern is that if I have too many async threads listening, it would start to clog up the CPU. I ask this because I know a CPU core can only run one thread of execution at a time, but with hyperthreading (AMD has the same thing under a different name) it can jump between executing multiple threads when one needs to fetch something from memory. But does this infinite loop, without making it sleep, mean it will always be checking whether the queue has a new entry? My worry is that I will make a CPU core waste all its resources infinitely looping over this one thread waiting for another insertion.
Does the CPU instead automatically give other threads small slices of time to execute, or do I need to include sleep statements so that this thread does not use far more resources than required? How much CPU time will this use just idling?
...does this infinite loop without making it sleep mean...?
blockingQueue.take() does sleep until there's something in the queue to be taken. The Javadoc for the take method says, "Retrieves and removes the head of this queue, waiting if necessary until an element becomes available."
"Waiting" means it sleeps. Any time you are forced to write catch (InterruptedException...), it's because you called something that sleeps.
How does it know when something is added if it's sleeping? It has to be running in order to check whether something has been added to the queue, right?
No. It doesn't need to run. It doesn't need to "check." A BlockingQueue effectively* uses object.wait() to make a thread "sleep," and it uses object.notify() to wake it up again. When one thread in a Java program calls o.wait() for any Object o, the wait() call will not return** until some other thread calls o.notify() for the same Object o.
wait() and notify() are thin wrappers for operating-system-specific calls that do approximately the same thing. All the magic happens in the OS. In a nutshell:
The OS suspends the thread that calls o.wait(), and it adds the thread's saved execution context to a queue associated with the object o.
When some other thread calls o.notify(), the OS takes the saved execution context at the head of the queue (if there is one***), and moves it to the "ready-to-run" queue.
Some time later, the OS scheduler will find the saved thread context at the head of the "ready-to-run" queue, and it will restore the context on one of the system's CPUs.
At that point, the o.wait() call will return, and the thread that waited can then proceed to deal with whatever it was waiting for (e.g., an NpcObject in your case).
* I don't know whether any particular class that implements BlockingQueue actually uses object.wait() and object.notify(), but even if they don't use those methods, then they almost certainly use the same operating system calls that underlie wait() and notify().
** Almost true, but there's something called "spurious wakeup." Correctly using o.wait() and o.notify() is tricky. I strongly recommend that you work through the tutorial if you want to try it yourself.
*** o.notify() does absolutely nothing at all if no other thread is already waiting at the moment when it is called. Beginners who don't understand this often ask, "Why did wait() never return?" It didn't return because the thread that wait()ed was too late. Again, I urge you to work through the tutorial if you want to learn how to avoid that particular bug.
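To make the suspend/wake cycle above concrete, here is a minimal wait/notify sketch with made-up names; the consuming thread is parked inside lock.wait() exactly as described, and notify() moves it back to the ready-to-run queue (the while loop guards against spurious wakeups):

public class WaitNotifyDemo {
    private final Object lock = new Object();
    private String work;                // guarded by lock

    public String awaitWork() throws InterruptedException {
        synchronized (lock) {
            while (work == null) {      // re-check after every wakeup
                lock.wait();            // the OS suspends the thread here
            }
            String w = work;
            work = null;
            return w;
        }
    }

    public void handOff(String w) {
        synchronized (lock) {
            work = w;
            lock.notify();              // wake one thread waiting on lock
        }
    }
}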

ConcurrentLinkedQueue and poll()

I'm going to use ConcurrentLinkedQueue and Java as an example to a more general question. Let me first explain the question with regards to ConcurrentLinkedQueue. Consider:
ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();

while (true) {
    Integer item = queue.poll();
    if (item != null) {
        // do some stuff
    }
}
ConcurrentLinkedQueue::poll does not block, so if I were to run this code (and only this code) in its own thread, it would constantly perform a redundant operation. Compare this to using something like LinkedBlockingQueue::take, which blocks until something is available. How much of a difference does it make?
I realize that the question is very vague and specific to the language and data-structure's implementation. But the questions generalizes to something like this:
How resource consuming is it to run a forever loop that does some small, repetitive operation (like queue.poll())?
Because the operation is small but repetitive, each iteration finishes faster but the loop also runs at a higher frequency, which makes me think that it's worse.
As you have an infinite number of operations to execute, scheduled by the while (true) loop, you would keep one CPU core constantly at 100% utilisation with the polling implementation, which is not a good idea.
On the other hand, take blocks the thread until an item is available. The thread can be put in the background while other threads execute on the CPU. A blocked thread does not consume resources; it is woken up when an item is available and only then scheduled to run on the CPU.
The context switch involved in scheduling a background thread might give you a slightly slower reaction time (this depends on the operating system's implementation of thread scheduling and interrupt handling), but overall it is far better than keeping the CPU at a constant 100% utilisation.
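If you want something in between, one hedged middle ground is the timed poll(timeout, unit) that BlockingQueue offers: it blocks like take() but wakes up periodically so you can check a shutdown flag. In this sketch, TimedPollConsumer and its running flag are hypothetical:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class TimedPollConsumer implements Runnable {
    private final BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    private volatile boolean running = true;    // hypothetical shutdown flag

    @Override
    public void run() {
        try {
            while (running) {
                Integer item = queue.poll(100, TimeUnit.MILLISECONDS); // blocks, but at most 100 ms
                if (item != null) {
                    // do some stuff
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag and exit
        }
    }

    public void submit(int item) {
        queue.add(item);
    }

    public void shutdown() {
        running = false;
    }
}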

Access to shared resource, lock unlock or wait notify

Scenario:
Multiple threads reading from different sources.
A single access point to a shared queue (see the class RiderSynchronized below, which tries to write to it).
For every line a reader reads, it tries to insert it into the shared queue through the method RiderSynchronized provides.
When the shared queue is full, I have to run a batch on a prepared statement to insert into Oracle. Meanwhile, all access to the shared queue must be denied.
Code:
public class RiderSynchronized {

    private ArrayDeque<JSONRecord> queue = new ArrayDeque<>();
    private OracleDAO oracleDao;
    private long capacity;

    public RiderSynchronized(OracleDAO oracleDao, long capacity) {
        this.oracleDao = oracleDao;
        this.capacity = capacity;
    }

    public synchronized boolean addRecord(JSONRecord record) {
        boolean success = false;
        try {
            while (queue.size() >= capacity) {
                wait();
            }
            queue.add(record);
            if (queue.size() < capacity) {
                success = true;
                notify(); //notify single Thread
            } else {
                JSONRecord currentRecord = null;
                while ((currentRecord = queue.poll()) != null) {
                    oracleDao.insertRowParsedIntoBatch(currentRecord);
                }
                oracleDao.runBatch();
                success = true;
                notifyAll(); //it could be all Reading Threads are waiting. Notify all
            }
        } catch (Exception e) {
            success = false;
        }
        return success;
    }
}
I have to admit I'm a little worried about a few things.
1) Can reader threads just call addRecord directly? Will they wait on their own? Or do I have to implement some other check before calling the addRecord method?
2) When queue.size < capacity, I decided to notify just one thread, because IMHO at this point no threads should be in the waiting state. Am I wrong? Should I notify all?
2b) Same question for the "else" branch. Is it good practice to notifyAll? At this point, could it be that all threads are waiting?
3) Finally, I'm a little hesitant to re-write everything using the Lock and Condition classes. Would that be a better decision? Or is it OK how I'm handling this scenario?
1) Can reader threads just call addRecord directly? Will they wait on their own? Or do I have to implement some other check before calling the addRecord method?
The problem with your current code is that if for some reason notifyAll is not called by the only thread that theoretically should be able to go into the else block then your threads will wait forever.
The potential risks in your code are:
oracleDao.insertRowParsedIntoBatch(currentRecord)
oracleDao.runBatch()
With your current code, if one of those methods throws an exception, notifyAll will never be called, so your threads will wait forever. You should at least consider calling notifyAll in a finally block to make sure that it is called whatever happens.
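As a sketch of that suggestion (assuming the same fields as the question's class, and keeping the exception handling deliberately close to the original), the flush stays inside the try while the wake-up moves to a finally block:

public synchronized boolean addRecord(JSONRecord record) {
    boolean success = false;
    try {
        while (queue.size() >= capacity) {
            wait();
        }
        queue.add(record);
        if (queue.size() >= capacity) {
            // the queue just became full: flush it to Oracle before anyone else adds
            JSONRecord currentRecord;
            while ((currentRecord = queue.poll()) != null) {
                oracleDao.insertRowParsedIntoBatch(currentRecord);
            }
            oracleDao.runBatch();
        }
        success = true;
    } catch (Exception e) {
        success = false;
    } finally {
        notifyAll(); // wake any waiters even if the DAO threw; harmless when nobody waits
    }
    return success;
}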
2) When queue.size < capacity, I decided to notify just one thread, because IMHO at this point no threads should be in the waiting state. Am I wrong? Should I notify all?
Your threads can only wait when queue.size() >= capacity, so for me the notify is not even needed: the condition (queue.size() < capacity) is not one any thread is waiting for.
2b) Same question for the "else" branch. Is it good practice to notifyAll? At this point, could it be that all threads are waiting?
Item 69 from Effective Java:
A related issue is whether you should use notify or notifyAll to wake waiting threads. (Recall that notify wakes a single waiting thread, assuming such a thread exists, and notifyAll wakes all waiting threads.) It is often said that you should always use notifyAll. This is reasonable, conservative advice. It will always yield correct results because it guarantees that you’ll wake the threads that need to be awakened. You may wake some other threads, too, but this won’t affect the correctness of your program. These threads will check the condition for which they’re waiting and, finding it false, will continue waiting. As an optimization, you may choose to invoke notify instead of notifyAll if all threads that could be in the wait-set are waiting for the same condition and only one thread at a time can benefit from the condition becoming true. Even if these conditions appear true, there may be cause to use notifyAll in place of notify. Just as placing the wait invocation in a loop protects against accidental or malicious notifications on a publicly accessible object, using notifyAll in place of notify protects against accidental or malicious waits by an unrelated thread. Such waits could otherwise “swallow” a critical notification, leaving its intended recipient waiting indefinitely.
3) Finally, I'm a little hesitant to re-write everything using the Lock and Condition classes. Would that be a better decision? Or is it OK how I'm handling this scenario?
Lock and Condition are interesting if you need features that are not available with intrinsic locks, for example tryLock() or the ability to wake only the threads waiting on a given condition. In your case that doesn't seem necessary, so you can keep it as it is.
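For reference, a tiny tryLock sketch (hypothetical class and method names): instead of blocking indefinitely, the caller waits a bounded time for the lock and backs off if it is busy:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TryLockExample {
    private final ReentrantLock lock = new ReentrantLock();

    public boolean updateIfFree(Runnable update) throws InterruptedException {
        if (lock.tryLock(50, TimeUnit.MILLISECONDS)) { // wait at most 50 ms for the lock
            try {
                update.run();
                return true;
            } finally {
                lock.unlock();
            }
        }
        return false; // the lock was busy: the caller can retry later
    }
}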
I'll try to answer your questions one by one.
1) If I understand you correctly, the answer is yes: the threads will wait and you don't need to do anything else.
2) You don't need to notify anyone in the queue.size < capacity case; there are no waiting threads at that point.
3) Yes, it is OK to notifyAll. If more threads than the capacity are waiting, the rest of them will return to the waiting state quickly.
4) That is an opinion-based question. In your scenario you won't get any benefit from rewriting.

why does my notify method not work properly?

I am improving my concurrent program by pausing threads that are doing the same thing, so that they wait for one of them to finish. However, it does not wake up the waiting threads properly. Here is the code.
// to store graphs: if a thread finds that the graph it is going to compute is already in the entry, it waits;
// otherwise it computes, then notifies all other threads waiting on it.
Map<Graph, Object> entry = new ConcurrentHashMap<Graph, Object>();

public Result recursiveMethod(Graph g) {
    if (entry.get(g) != null) { // if the graph is in the entry, wait
        synchronized (entry.get(g)) {
            entry.get(g).wait();
        }
        // wakes up, and directly returns the result
        return result;
    }
    synchronized (entry) {
        if (entry.get(g) == null) // if the graph is not in the entry, continue to compute
            entry.put(g, new Object());
    }
    // compute the graph, which recursively calls this method itself...
    calculate here...
    // wake up threads waiting on it, and remove the graph from entry
    synchronized (entry.get(g)) {
        entry.get(g).notifyAll();
    }
    entry.remove(g);
    return result;
}
This method is called by many many threads. Before a thread starts calculation, it looks up the entry to see if there is another thread calculating an identical graph. If so, it waits.
If not, it continues to calculate. After it figures out the result, it notifies all the threads that are waiting on it.
I use a map to pair up a graph and an object. The object is the lock.
Please notice that this map can recognize two identical graphs, that is, the comparison in the following code is true.
Graph g = new Graph();
entry.put(g, new Object());
Graph copy = new Graph(g);
entry.get(g) == entry.get(copy) //this is true
Thus, entry.get(g) should be OK to use as the lock/monitor.
However, most of the threads are never awakened; only 3-4 threads are.
When the number of waiting threads equals the number of threads my computer can create, which means all the threads are waiting, the program never terminates.
Why exactly does entry.get(g).notifyAll() not work?
Due to the fact that you have unsynchronized gaps between the times you check the map and the times you operate on the map, you have many holes in your logic where threads can proceed incorrectly. You either need to synchronize around your map checks or use some of the special atomic methods of ConcurrentMap.
When writing concurrent code, I like to pretend there is a malicious gnome running around in the background changing things wherever possible (e.g. outside of synchronized blocks). Here's the first example to get you started:
if (entry.get(g) != null) { // if the graph is in the entry, waits
    synchronized (entry.get(g)) {
You call entry.get() twice outside of a synchronized block. Therefore, the value you get could be different between those two calls (the evil gnome changes the map as often as possible). In fact, it could be null when you try to synchronize on it, which will throw an exception.
Additionally, wait() calls should always be made in a loop that waits for a condition to change (due to the possibility of spurious wakeups, or, as in your case, multiple wakeups). Lastly, you should change the loop condition before you notify. @AdrianShum gave a pretty good overview of how to use wait/notify correctly. Your while loop should not be around everything, but inside the synchronized block, around the wait call alone. This is not to deal with InterruptedException (a separate issue), but to deal with spurious wakeups and notifyAll calls: when you call notifyAll, all waiting threads wake up, but only one can proceed, so the rest need to go back to waiting (hence the while loop).
In short, writing concurrent code is hard, and what you are attempting to implement is not simple. I would recommend reading a good book first (like Brian Goetz's "Java Concurrency in Practice") before attempting to finish this code.
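As a sketch of the "atomic methods" route mentioned above (it reuses the question's Graph and Result types and assumes Graph has proper equals/hashCode; GraphCache and calculate are made-up names), a ConcurrentMap of futures lets the first thread compute while every other thread blocks on the same result, with no wait/notify at all:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class GraphCache {
    private final ConcurrentMap<Graph, CompletableFuture<Result>> entry = new ConcurrentHashMap<>();

    public Result recursiveMethod(Graph g) {
        CompletableFuture<Result> mine = new CompletableFuture<>();
        CompletableFuture<Result> existing = entry.putIfAbsent(g, mine); // atomic check-and-insert
        if (existing != null) {
            return existing.join();            // someone else is (or was) computing: just wait for it
        }
        try {
            Result result = calculate(g);      // the expensive recursive computation
            mine.complete(result);
            return result;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e);     // don't leave waiters stuck forever
            throw e;
        } finally {
            entry.remove(g);                   // drop the entry once the computation has finished
        }
    }

    private Result calculate(Graph g) {
        // placeholder for the real recursive calculation
        return null;
    }
}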
In fact @jtahlborn has already raised the key problem; I am just trying to supplement his answer by pointing out the most obvious issue here.
Try to understand the basics of condition variables and why they solve the race conditions that plain signalling (Windows events, for example) suffers from.
Your logic is currently something like this (assume obj refers to the same object):
Thread 1:
if (!hasResult) {
    synchronized(obj) {
        obj.wait();
    }
}
Thread 2:
hasResult = false;
// do your work
synchronized(obj) {
    obj.notify();
}
hasResult = true;
Remember that thread 1 and thread 2 run in parallel, so you may get an interleaving like this:
Thread 1                             Thread 2
                                     hasResult = false
if (!hasResult)
                                     do your work
                                     synchronized(obj)
                                         obj.notify()
                                     end synchronized(obj)
synchronized(obj)
    obj.wait()
end synchronized(obj)
Thread 1 is going to wait forever.
What you should do is
Thread 1:
synchronized(obj) {
    while (!hasResult) { // wait until the result is actually available
        obj.wait();
    }
}
Thread 2:
hasResult = false;
synchronized(obj) {
    // do your work
    obj.notify();
    hasResult = true;
}
That is one of the biggest holes @jtahlborn is talking about, I believe (and there are others). Note that setting the condition and checking the condition are both protected by the synchronized block. That is the basic idea of how a condition variable solves the race condition illustrated above. Get yourself to understand the idea first, and then redesign your piece of code into something more reasonable.

Condition vs wait notify mechanism

What is the advantage of using Condition interface/implementations over the conventional wait notify mechanism? Here I quote the comments written by Doug Lea:
Condition factors out the Object monitor methods (wait, notify and notifyAll) into distinct objects to give the effect of having multiple wait-sets per object, by combining them with the use of arbitrary Lock implementations. Where a Lock replaces the use of synchronized methods and statements, a Condition replaces the use of the Object monitor methods.
I see this as a more object-oriented way of implementing the wait/notify mechanism. But is there a sound advantage over the former?
The biggest problem is that wait/notify is error-prone for new developers. The main problem is that not knowing how to handle them correctly can result in obscure bugs:
if you call notify() before wait(), the notification is lost.
it can sometimes be unclear whether notify() and wait() are called on the same object.
there is nothing in wait/notify which requires a state change, yet this is required in most cases.
wait() can return spuriously.
Condition wraps this functionality up into a dedicated component; however, it behaves much the same.
There is a question regarding wait/notify posted minutes before this one, and many, many more; search [java]+wait+notify.
When you use Condition's await()/signal() you can distinguish which object or group of objects/threads gets a specific signal. Here is a short example where some threads, the producers, will get the isEmpty signal while the consumers will get the isFull signal:
private int data;                              // the shared value (added so the example compiles)
private volatile boolean usedData = true;      // true when the current data has been consumed
private final Lock lock = new ReentrantLock();
private final Condition isEmpty = lock.newCondition();
private final Condition isFull = lock.newCondition();

public void setData(int data) throws InterruptedException {
    lock.lock();
    try {
        while (!usedData) {    // wait for the previous data to be used
            isEmpty.await();
        }
        this.data = data;
        isFull.signal();       // tell a consumer that data is now available
        usedData = false;      // tell others I created new data
    } finally {
        lock.unlock();         // interrupted or not, release the lock
    }
}

public void getData() throws InterruptedException {
    lock.lock();
    try {
        while (usedData) {     // usedData is lingo for "empty"
            isFull.await();
        }
        isEmpty.signal();      // tell a producer to produce some more
        usedData = true;       // tell others I have used the data
    } finally {                // interrupted or not, always release the lock
        lock.unlock();
    }
}
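One possible way to drive the two methods above, assuming they live in a class we'll call SharedData together with the fields shown above:

public class SharedDataDemo {
    public static void main(String[] args) {
        SharedData shared = new SharedData();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    shared.setData(i);             // blocks until the previous value was consumed
                    System.out.println("produced " + i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    shared.getData();              // blocks until a fresh value is available
                    System.out.println("consumed one item");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}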
There are many advantages, as mentioned above, to the Condition interface; some important ones are as follows.
The Condition interface comes with two extra methods:
1) boolean awaitUntil(Date deadline) throws InterruptedException:
Causes the current thread to wait until it is signalled or interrupted, or the specified deadline elapses.
2) awaitUninterruptibly():
Causes the current thread to wait until it is signalled.
If the current thread's interrupted status is set when it enters this method, or it is interrupted while waiting, it will continue to wait until signalled. When it finally returns from this method its interrupted status will still be set.
These two methods are not available on the default monitor built into the Object class. In some situations we want to set a deadline for a thread to wait, and the Condition interface lets us do that with awaitUntil.
In other situations we don't want the thread to be interrupted and want the current thread to wait until it is signalled; for that we can use the awaitUninterruptibly method of the Condition interface.
For more information, see the Condition interface Java documentation:
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Condition.html#awaitUntil%28java.util.Date%29
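A small sketch of awaitUntil (hypothetical class and field names): wait for a flag, but give up once a fixed deadline passes:

import java.util.Date;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlineWait {
    private final Lock lock = new ReentrantLock();
    private final Condition ready = lock.newCondition();
    private boolean done;                       // guarded by lock

    public boolean awaitDone(Date deadline) throws InterruptedException {
        lock.lock();
        try {
            while (!done) {
                if (!ready.awaitUntil(deadline)) {
                    return done;                // deadline elapsed before we were signalled
                }
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    public void markDone() {
        lock.lock();
        try {
            done = true;
            ready.signalAll();
        } finally {
            lock.unlock();
        }
    }
}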
To specifically address why having multiple waitsets is an advantage:
With wait/notify if there are different things that threads are waiting for (the common example is a fixed size blocking queue, with some threads putting things in the queue and blocking when the queue is full, and other threads taking from the queue and blocking when the queue is empty) then if you use notify, causing the scheduler to pick one thread from the wait set to notify, you can have corner cases where the chosen thread isn't interested in being notified for a particular situation. For instance the queue will notify for adding something to the queue, but if the chosen thread is a producer and the queue is full then it can't act on that notification, which you would rather have gone to a consumer. With intrinsic locking you have to use notifyAll in order to make sure that notifications don't get lost.
But notifyAll incurs churn with every call, where every thread wakes up and contends for the lock, but only one can make progress. The other threads all bump around contending for the lock until, one at a time, they can acquire the lock and most likely go back to waiting. It generates a lot of contention for not much benefit, it would be preferable to be able to use notify and know only one thread is notified, where the notification is relevant to that thread.
This is where having separate Conditions to wait on is a big improvement. The queue can invoke signal on a condition and know it will wake up only one thread, where that thread is specifically waiting for the condition.
The API doc for Condition has a code example that shows using multiple conditions for a bounded buffer, it says:
We would like to keep waiting put threads and take threads in separate wait-sets so that we can use the optimization of only notifying a single thread at a time when items or spaces become available in the buffer.
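A condensed sketch of that pattern (not the doc's exact code): put threads wait only on notFull and take threads wait only on notEmpty, so each signal wakes a thread that can actually make progress:

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedBuffer<E> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();
    private final Object[] items = new Object[16];
    private int putIndex, takeIndex, count;

    public void put(E e) throws InterruptedException {
        lock.lock();
        try {
            while (count == items.length) {
                notFull.await();                 // only producers wait here
            }
            items[putIndex] = e;
            putIndex = (putIndex + 1) % items.length;
            count++;
            notEmpty.signal();                   // wakes exactly one consumer
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    public E take() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0) {
                notEmpty.await();                // only consumers wait here
            }
            E e = (E) items[takeIndex];
            items[takeIndex] = null;
            takeIndex = (takeIndex + 1) % items.length;
            count--;
            notFull.signal();                    // wakes exactly one producer
            return e;
        } finally {
            lock.unlock();
        }
    }
}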
In addition to the other well-accepted answers: since a Condition is associated with a Lock object, you can have arbitrary sets of Lock objects (read/write, read, write) in your class and have specific conditions associated with each. Then you can use those sets of conditions to synchronize different parts of your class according to your implementation's semantics. This gives more flexibility and more explicit behavior than wait/notify, IMO.
