BlockingQueue - blocked drainTo() methods

BlockingQueue has a drainTo() method, but it does not block. I need a queue that blocks until something is available and then lets me retrieve all queued objects in a single call.
Object first = blockingQueue.take();
if (blockingQueue.size() > 0)
    blockingQueue.drainTo(list);
I guess the above code will work but I'm looking for an elegant solution.

Are you referring to the comment in the JavaDoc:
Further, the behavior of this operation is undefined if the specified collection
is modified while the operation is in progress.
I believe that this refers to the collection list in your example:
blockingQueue.drainTo(list);
meaning that you cannot modify list while you are draining from blockingQueue into it. However, the blocking queue synchronizes internally, so that while drainTo is running, puts and (see note below) gets will block. If it did not do this, it would not be truly thread-safe. You can look at the source code and verify that drainTo is thread-safe with respect to the blocking queue itself.
Alternately, do you mean that when you call drainTo that you want it to block until at least one object has been added to the queue? In that case, you have little choice other than:
list.add(blockingQueue.take());
blockingQueue.drainTo(list);
to block until one or more items have been added, and then drain the entire queue into the collection list.
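The take-then-drain idiom above can be wrapped in a small helper. This is only a sketch: the class name BlockingDrain and the methods takeAtLeastOne and demo are made up for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class BlockingDrain {
    // Hypothetical helper: blocks until at least one element is available,
    // then drains whatever else is already queued without blocking again.
    static <E> List<E> takeAtLeastOne(BlockingQueue<E> queue) throws InterruptedException {
        List<E> out = new ArrayList<>();
        out.add(queue.take());   // blocks while the queue is empty
        queue.drainTo(out);      // never blocks; may add zero elements
        return out;
    }

    // Usage example: the queue already holds three items, so nothing blocks.
    static List<String> demo() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        queue.add("c");
        try {
            return takeAtLeastOne(queue);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The two calls are not one atomic step: another consumer may remove elements between take() and drainTo(), so the result is only guaranteed to contain at least one element.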
Note: As of Java 7, a separate lock is used for gets and puts. Put operations are now permitted during a drainTo (and a number of other take operations).

If you happen to use Google Guava, there's a nifty Queues.drain() method.
Drains the queue as BlockingQueue.drainTo(Collection, int), but if the
requested numElements elements are not available, it will wait for
them up to the specified timeout.

I found this pattern useful.
List<byte[]> blobs = new ArrayList<byte[]>();
if (queue.drainTo(blobs, batch) == 0) {
    blobs.add(queue.take());
}

With the API available, I don't think you are going to get much more elegant, other than that you can remove the size test.
If you are wanting to atomically retrieve a contiguous sequence of elements even if another removal operation coincides, I don't believe even drainTo guarantees that.

Source code:
public int drainTo(Collection<? super E> c) {
    // argument checks omitted
    lock.lock();
    try {
        // ...
        for (n = 0; n != count; n++) {
            c.add(items[n]);
            // ...
        }
        if (n > 0) {
            // ...
            notFull.signalAll();
        }
        return n;
    } finally {
        lock.unlock();
    }
}
ArrayBlockingQueue is eager to return 0. BTW, it could do it before taking the lock.

Related

Is BlockingQueue completely thread safe in Java

I know that the documentation says that the object is thread safe, but does that mean that all access to it from all methods is thread safe? So if I call put() on it from many threads at once and take() on it at the same instant, will nothing bad happen?
I ask because this answer is making me second guess:
https://stackoverflow.com/a/22006181/4164238

The quick answer is yes, they are thread safe. But let's not leave it there ...
First, a little housekeeping: BlockingQueue is an interface, and any implementation that is not thread safe breaks the documented contract. The link that you included refers to LinkedBlockingQueue, which has some cleverness to it.
The link makes an interesting observation: yes, there are two locks within LinkedBlockingQueue. However, it fails to recognize that the edge case a 'simple' implementation would have fallen foul of is in fact handled, which is why the take and put methods are more complicated than one would at first expect.
LinkedBlockingQueue is optimized to avoid using the same lock for both reading and writing. This reduces contention, but for correct behavior it relies on the queue not being empty. When the queue has elements in it, the push and pop points are not in the same region of memory and contention can be avoided. When the queue is empty, however, contention cannot be avoided, so extra code is required to handle this common 'edge' case. This is a common trade-off between code complexity and performance/scalability.
The question then follows: how does LinkedBlockingQueue know when the queue is empty or not empty, and thus handle the threading? The answer is that it uses an AtomicInteger and a Condition as two extra concurrent data structures. The AtomicInteger is used to check whether the length of the queue is zero, and the Condition is used to wait for a signal that notifies a waiting thread when the queue is probably in the desired state. This extra coordination does have an overhead; however, measurements have shown that as the number of concurrent threads ramps up, the overhead of this technique is lower than the contention introduced by using a single lock.
Below I have copied the code from LinkedBlockingQueue and added comments explaining how they work. At a high level, take() first locks out all other calls to take() and then signals put() as necessary. put() works in a similar way, first it blocks out all other calls to put() and then signals take() if necessary.
From the put() method:
// putLock coordinates the calls to put() only; further coordination
// between put() and take() follows below
putLock.lockInterruptibly();
try {
    // block while the queue is full; count is shared between put() and take()
    // and is safely visible between cores but prone to change between calls
    // a while loop is used because state can change between signals, which is
    // why signals get rechecked and resent.. read on to see more of that
    while (count.get() == capacity) {
        notFull.await();
    }
    // we know that the queue is not full so add
    enqueue(e);
    c = count.getAndIncrement();
    // if the queue is not full, send a signal to wake up
    // any thread that is possibly waiting for the queue to be a little
    // emptier -- note that this is logically part of 'take()' but it
    // has to be here because take() blocks itself
    if (c + 1 < capacity)
        notFull.signal();
} finally {
    putLock.unlock();
}
if (c == 0)
    signalNotEmpty();
From take()
takeLock.lockInterruptibly();
try {
    // wait for the queue to stop being empty
    while (count.get() == 0) {
        notEmpty.await();
    }
    // remove element
    x = dequeue();
    // decrement shared count
    c = count.getAndDecrement();
    // send signal that the queue is not empty
    // note that this is logically part of put(), but
    // for thread coordination reasons is here
    if (c > 1)
        notEmpty.signal();
} finally {
    takeLock.unlock();
}
if (c == capacity)
    signalNotFull();
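The thread-safety claim can be exercised with a small stress test. The sketch below is not proof of correctness, only a sanity check; the class QueueStress and its run method are hypothetical names, and the capacity of 16 is an arbitrary choice meant to force both the full and the empty edge cases.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class QueueStress {
    // Starts `pairs` producers and `pairs` consumers on one small bounded queue.
    // Each producer puts 1..perProducer; each consumer takes perProducer items.
    // Returns the sum of everything that was consumed.
    static long run(int pairs, int perProducer) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(16);
        AtomicLong sum = new AtomicLong();
        // One thread per task, so blocked producers can never starve consumers.
        ExecutorService pool = Executors.newFixedThreadPool(pairs * 2);
        for (int p = 0; p < pairs; p++) {
            pool.execute(() -> {
                try {
                    for (int i = 1; i <= perProducer; i++) queue.put(i);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool.execute(() -> {
                try {
                    for (int i = 0; i < perProducer; i++) sum.addAndGet(queue.take());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum.get();
    }
}
```

If no element is lost or delivered twice, run(4, 1000) returns 4 * (1000 * 1001 / 2) = 2002000.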

Yes, all implementations of BlockingQueue are thread safe for put and take and all actions.
The link just goes halfway...and is not covering the full details. It is thread safe.

That answer is a little strange - for a start, BlockingQueue is an interface so it doesn't have any locks. Implementations such as ArrayBlockingQueue use the same lock for add() and take() so would be fine. Generally, if any implementation is not thread safe then it is a buggy implementation.

I think @Chris K has missed some points. Regarding "When the queue has elements within it, then the push and the pop points are not at the same region of memory and contention can be avoided.": notice that when the queue has exactly one element, head.next and tail point to the same node, and put() and take() can both acquire their locks and execute.
I think the empty and full conditions can be handled by synchronizing put() and take(). However, when it comes to the one-element case, the LinkedBlockingQueue has a dummy head node with a null item, which may have something to do with the thread safety.

I tried this implementation on Leetcode
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingDeque;
class FooBar {
private final BlockingQueue<Object> line = new LinkedBlockingDeque<>(1);
private static final Object PRESENT = new Object();
private int n;
public FooBar(int n) {
this.n = n;
}
public void foo(Runnable printFoo) throws InterruptedException {
for (int i = 0; i < n; i++) {
line.put(PRESENT);
// printFoo.run() outputs "foo". Do not change or remove this line.
printFoo.run();
}
}
public void bar(Runnable printBar) throws InterruptedException {
for (int i = 0; i < n; i++) {
line.take();
// printBar.run() outputs "bar". Do not change or remove this line.
printBar.run();
}
}
}
With n = 3, most of the time I get the correct output foobarfoobarfoobar, but sometimes I get barbarfoofoofoobar, which is quite surprising.
I resolved to use ReentrantLock and Condition instead. @chris-k, can you shed more light on this?
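One plausible explanation for the out-of-order output: put() and take() only order the queue operations themselves, not the print calls that follow them, so bar() can take the token and print before foo() has printed. A minimal sketch of a strict hand-off using two semaphores is shown below. The class name FooBarSemaphore and the demo method are made up for illustration, and this is a common alternative rather than the poster's ReentrantLock solution.

```java
import java.util.concurrent.Semaphore;

final class FooBarSemaphore {
    private final int n;
    private final Semaphore fooTurn = new Semaphore(1); // foo is allowed to go first
    private final Semaphore barTurn = new Semaphore(0); // bar must wait for foo

    FooBarSemaphore(int n) {
        this.n = n;
    }

    void foo(Runnable printFoo) throws InterruptedException {
        for (int i = 0; i < n; i++) {
            fooTurn.acquire();
            printFoo.run();     // printing happens while holding the turn
            barTurn.release();  // only now may bar proceed
        }
    }

    void bar(Runnable printBar) throws InterruptedException {
        for (int i = 0; i < n; i++) {
            barTurn.acquire();
            printBar.run();
            fooTurn.release();
        }
    }

    // Usage example: runs foo and bar on two threads and returns the combined output.
    static String demo(int n) {
        FooBarSemaphore fb = new FooBarSemaphore(n);
        StringBuffer out = new StringBuffer();
        Thread fooThread = new Thread(() -> {
            try {
                fb.foo(() -> out.append("foo"));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread barThread = new Thread(() -> {
            try {
                fb.bar(() -> out.append("bar"));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        barThread.start(); // start order does not matter
        fooThread.start();
        try {
            fooThread.join();
            barThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

The difference from the queue version is that the print call sits between acquiring one's own turn and releasing the other thread's turn, so the two prints can never overlap or swap.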

How to read the last X entries of a vector while being thread safe?

I have a singleton logger that contains a vector. Objects from outside can append information to this vector by calling singletonLogger.append(String data) and read the whole vector by calling singletonLogger.getLogEntries() which returns a string.
It would be nice to overload the getLogEntries-method with an int-parameter, e.g. getLogEntries(int x), to be able to get only the last x entries instead of the whole log.
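Ignoring multiple threads, this would be easy, something like:
String getLogEntries(int x) {
    StringBuilder sb = new StringBuilder();
    int size = vector.size();
    // walk backwards from the last valid index (size - 1), never past index 0
    for (int i = size - 1; i >= Math.max(0, size - x); i--) {
        sb.append(vector.elementAt(i));
    }
    return sb.toString();
}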
But of course, this is not really safe when taking multiple threads into account. Imagine the vector gets cleared by another method shortly after its size was determined by the method above, the loop will crash.
On the other hand, I do not want to mark the whole method as synchronized, because the loop processing could last 5 - 10 seconds. This would block all the code that is trying to call the logger's methods, right?
Is there another way to reliably get the last x elements of a vector?
Thanks

Edit
Vector has a subList() method that should work and is synchronized, but that doesn't solve the problem of someone clearing the Vector from another thread. You could use a ReadWriteLock: take the readLock() when reading from the end of the Vector using subList(), and the writeLock() (which guarantees exclusive access) when clear() needs to be called. If your background thread is writing the log entries to disk or something similar, it should count the number of lines written, then take the writeLock() and remove only those from the front of the list instead of calling clear(). That would limit the time spent under the lock.
You might also consider maintaining your own internal queue so you can control the synchronization specifically. This may make it easier to clear the earlier entries from the queue. Then again you may need a ReadWriteLock for that as well.

Did you consider copying the relevant elements to a new Vector in a synchronized block and then handling them outside one?
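A minimal sketch of that copy-then-process idea follows. The class LogBuffer and its methods are hypothetical stand-ins for the singleton logger in the question. It relies on the fact that Vector's own methods synchronize on the Vector instance, so locking on the same object keeps append and clear out while the copy is made.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

final class LogBuffer {
    private final Vector<String> entries = new Vector<>();

    void append(String data) {
        entries.add(data);
    }

    void clear() {
        entries.clear();
    }

    // Returns the last x entries (oldest first), one per line.
    String getLogEntries(int x) {
        List<String> snapshot;
        // Hold the Vector's own monitor only long enough to copy the tail.
        synchronized (entries) {
            int size = entries.size();
            int wanted = Math.max(0, Math.min(x, size));
            snapshot = new ArrayList<>(entries.subList(size - wanted, size));
        }
        // The slow formatting work runs outside the lock, on the private copy.
        StringBuilder sb = new StringBuilder();
        for (String entry : snapshot) {
            sb.append(entry).append('\n');
        }
        return sb.toString();
    }
}
```

The lock is held for the duration of a short copy rather than the 5 to 10 seconds of processing, so other threads calling append are blocked only briefly.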

How can I atomically "enqueue if free space OR dequeue then enqueue" for a Java queue / list?

I've got a requirement for a list in Java with a fixed capacity but which always allows threads to add items to the start. If it's full it should remove an item from the end to make space. No other process will remove items, but other processes will wish to iterate over the items.
Is there something in the JDK which would allow me to do this atomically?
My current plan is just to use some existing threadsafe Collection (e.g. LinkedBlockingQueue) and further synchronise on it when I check capacity / add / remove. Would that work as well?
Thanks.

Your idea would work but would involve taking out multiple locks (see example below). Given you need to synchronize multiple operations when adding data you may as well wrap a LinkedList implementation of a Queue to avoid the overhead of additional locks.
// Create queue with fixed capacity.
Queue<Item> queue = new LinkedBlockingQueue<Item>(1000);
...
// Attempt to add item to queue, removing items if required.
synchronized (queue) {             // First lock
    while (!queue.offer(item)) {   // Second lock
        queue.poll();              // Third lock; poll() because Queue declares no take()
    }
}

I'm working in an old version of Java (yes 1.3, I have no choice), so even if it's there in later Javas I can't use it. So I coded along these lines:
public class Fifo {
    private LinkedList fifoContents = new LinkedList();
    public synchronized void put(Object element) {
        if (fifoContents.size() > 100) {
            fifoContents.removeFirst();
            logger.logWarning("*** Backlog, discarding message ");
        }
        fifoContents.add(element);
    }
    public synchronized Object get() throws NoSuchElementException {
        return fifoContents.removeFirst();
    }
}

You may be able to get away with just testing/removing/inserting without additional locks:
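class DroppingQueue<E> extends ArrayBlockingQueue<E> {
    // ArrayBlockingQueue has no no-arg constructor, so a capacity is required
    DroppingQueue(int capacity) {
        super(capacity);
    }
    @Override
    public boolean add(E item) {
        // poll() instead of take(): take() throws InterruptedException,
        // which add() is not allowed to declare
        while (!offer(item)) {
            poll();
        }
        return true;
    }
}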
Although this method itself is not synchronized, the offer and removal calls inside it still are, so the worst that can happen is that thread #1 calls offer and finds the queue full, thread #2 does the same, and both remove an item, temporarily reducing the number of items below the maximum before both threads successfully add their items. This will probably not cause serious problems.

There's no such class in JDK.
If you are going to implement such a collection, you might want to use an array with floating head/tail pointers: since the size is fixed, you don't need a linked list at all.
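A rough sketch of such an array-backed structure is shown below. The class OverwritingRingBuffer is a made-up name, and plain synchronized methods are an assumption chosen for simplicity, not the only way to make it thread-safe. Readers get a copied snapshot so they can iterate without holding the lock.

```java
import java.util.ArrayList;
import java.util.List;

final class OverwritingRingBuffer<E> {
    private final Object[] items;
    private int head; // index of the oldest element
    private int size; // number of stored elements

    OverwritingRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        items = new Object[capacity];
    }

    // Adds an element; when full, the oldest element is overwritten.
    synchronized void add(E element) {
        int tail = (head + size) % items.length; // next write position
        items[tail] = element;
        if (size == items.length) {
            head = (head + 1) % items.length;    // the oldest element was just replaced
        } else {
            size++;
        }
    }

    // Returns a copy of the contents, oldest first.
    @SuppressWarnings("unchecked")
    synchronized List<E> snapshot() {
        List<E> out = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            out.add((E) items[(head + i) % items.length]);
        }
        return out;
    }
}
```

Because the check for free space, the eviction, and the insert all happen inside one synchronized method, the "enqueue if free space or dequeue then enqueue" step is atomic with respect to other callers of this class.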

Is this java code thread-safe?

I am planning to use this schema in my application, but I was not sure whether this is safe.
To give a little background, a bunch of servers will compute results of sub-tasks that belong to a single task and report them back to the central server. This piece of code is used to register the results, and also to check whether all the subtasks for the task have completed and, if so, report that fact only once.
The important point is that each task must be reported once and only once, as soon as it is completed (all subTaskResults are set).
Can anybody help? Thank you! (Also, if you have a better idea to solve this problem, please let me know!)
*Note that I simplified the code for brevity.
Solution I
class Task {
    // Populated with (id, new AtomicReference()) pairs;
    // the actual app uses a read-only HashMap
    Map<Id, AtomicReference<SubTaskResult>> subtasks = populatedMap();
    Semaphore permission = new Semaphore(1);
    public Task set(Id id, SubTaskResult result) {
        // null check omitted
        subtasks.get(id).set(result);
        return check() ? this : null;
    }
    private boolean check() {
        for (AtomicReference<SubTaskResult> ref : subtasks.values()) {
            if (ref.get() == null) {
                return false;
            }
        }
        // only the first caller to get here obtains the single permit
        return permission.tryAcquire();
    }
}
Stephen C kindly suggested to use a counter. Actually, I have considered that once, but I reasoned that the JVM could reorder the operations and thus, a thread can observe a decremented counter (by another thread) before the result is set in AtomicReference (by that other thread).
*EDIT: I now see this is thread safe. I'll go with this solution. Thanks, Stephen!
Solution II
class Task {
    // Populated with (id, new AtomicReference()) pairs;
    // the actual app uses a read-only HashMap
    Map<Id, AtomicReference<SubTaskResult>> subtasks = populatedMap();
    AtomicInteger counter = new AtomicInteger(subtasks.size());
    public Task set(Id id, SubTaskResult result) {
        // null check omitted
        subtasks.get(id).set(result);
        // In the actual app, if !compareAndSet(null, result) return null;
        return check() ? this : null;
    }
    private boolean check() {
        return counter.decrementAndGet() == 0;
    }
}

I assume that your use-case is that there are multiple threads calling set, but for any given value of id, the set method will be called once only. I'm also assuming that populatedMap creates the entries for all used id values, and that subtasks and permission are really private.
If so, I think that the code is thread-safe.
Each thread should see the initialized state of the subtasks Map, complete with all keys and all AtomicReference references. This state never changes, so subtasks.get(id) will always give the right reference. The set(result) call operates on an AtomicReference, so the subsequent get() method calls in check() will give the most up-to-date values ... in all threads. Any potential races with multiple threads calling check seem to sort themselves out.
However, this is a rather complicated solution. A simpler solution would be to use a concurrent counter; e.g. replace the Semaphore with an AtomicInteger and use decrementAndGet instead of repeatedly scanning the subtasks map in check.
In response to this comment in the updated solution:
Actually, I have considered that once,
but I reasoned that the JVM could
reorder the operations and thus, a
thread can observe a decremented
counter (by another thread) before the
result is set in AtomicReference (by
that other thread).
The AtomicInteger and AtomicReference by definition are atomic. Any thread that tries to access one is guaranteed to see the "current" value at the time of the access.
In this particular case, each thread calls set on the relevant AtomicReference before it calls decrementAndGet on the AtomicInteger. This cannot be reordered. Actions performed by a thread are performed in order. And since these are atomic actions, the effects will be visible to other threads in order as well.
In other words, it should be thread-safe ... AFAIK.
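The counter-based approach can be sketched in a self-contained form as follows. The class CountingTask, the long ids, and the String results are stand-ins for the question's Task, Id, and SubTaskResult types. The compareAndSet guard is taken from the comment in Solution II, and the unknown-id check is an added assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

final class CountingTask {
    // Filled in the constructor and never modified afterwards (read-only map).
    private final Map<Long, AtomicReference<String>> subtasks = new HashMap<>();
    private final AtomicInteger remaining;

    CountingTask(long... ids) {
        for (long id : ids) {
            subtasks.put(id, new AtomicReference<>());
        }
        remaining = new AtomicInteger(subtasks.size());
    }

    // Returns true for exactly one call: the one that stores the last missing result.
    boolean set(long id, String result) {
        AtomicReference<String> slot = subtasks.get(id);
        if (slot == null || result == null) {
            return false; // unknown id or missing result: ignored in this sketch
        }
        if (!slot.compareAndSet(null, result)) {
            return false; // duplicate report for this id: do not decrement again
        }
        // The result is stored before the decrement, so whoever observes zero
        // also observes every stored result.
        return remaining.decrementAndGet() == 0;
    }
}
```

Since decrementAndGet returns each value exactly once, only one thread can ever see zero, which gives the "reported once and only once" property without a Semaphore.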

The atomicity guaranteed (per class documentation) explicitly for AtomicReference.compareAndSet extends to set and get methods (per package documentation), so in that regard your code appears to be thread-safe.
I am not sure, however, why you have Semaphore.tryAcquire as a side effect there; without complementary code to release the semaphore, that part of your code looks wrong.

The second solution does provide a thread-safe latch, but it's vulnerable to calls to set() that provide an ID that's not in the map -- which would trigger a NullPointerException -- or more than one call to set() with the same ID. The latter would mistakenly decrement the counter too many times and falsely report completion when there are presumably other subtasks IDs for which no result has been submitted. My criticism isn't with regard to the thread safety, but rather to the invariant maintenance; the same flaw would be present even without the thread-related concern.
Another way to solve this problem is with AbstractQueuedSynchronizer, but it's somewhat gratuitous: you can implement a stripped-down counting semaphore, where each call to set() would call releaseShared(), decrementing the counter via a spin on compareAndSetState(), and tryAcquireShared() would only succeed when the count is zero. That's more or less what you implemented above with the AtomicInteger, but you'd be reusing a facility that offers more capabilities you can use for other portions of your design.
To flesh out the AbstractQueuedSynchronizer-based solution requires adding one more operation to justify the complexity: being able to wait on the results from all the subtasks to come back, such that the entire task is complete. That's Task#awaitCompletion() and Task#awaitCompletion(long, TimeUnit) in the code below.
Again, it's possibly overkill, but I'll share it for the purpose of discussion.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
final class Task
{
private static final class Sync extends AbstractQueuedSynchronizer
{
public Sync(int count)
{
setState(count);
}
@Override
protected int tryAcquireShared(int ignored)
{
return 0 == getState() ? 1 : -1;
}
@Override
protected boolean tryReleaseShared(int ignored)
{
int current;
do
{
current = getState();
if (0 == current)
return true;
}
while (!compareAndSetState(current, current - 1));
return 1 == current;
}
}
public Task(int count)
{
if (count < 0)
throw new IllegalArgumentException();
sync_ = new Sync(count);
}
public boolean set(int id, Object result)
{
// Ensure that "id" refers to an incomplete task. Doing so requires
// additional synchronization over the structure mapping subtask
// identifiers to results.
// Store result somehow.
return sync_.releaseShared(1);
}
public void awaitCompletion()
throws InterruptedException
{
sync_.acquireSharedInterruptibly(0);
}
public void awaitCompletion(long time, TimeUnit unit)
throws InterruptedException
{
sync_.tryAcquireSharedNanos(0, unit.toNanos(time));
}
private final Sync sync_;
}

I have a weird feeling reading your example program, but it depends on the larger structure of your program what to do about that. A set function that also checks for completion is almost a code smell. :-) Just a few ideas.
If you have synchronous communication with your servers, you might use an ExecutorService with as many threads as there are servers doing the communication. From this you get a bunch of Futures, and you can naturally proceed with your calculation: the get calls will block at the moment the result is needed but not yet there.
If you have asynchronous communication with the servers, you might also use a CountDownLatch after submitting the task to the servers. The await call blocks the main thread until all subtasks have completed, and other threads can receive the results and call countDown on each received result.
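A minimal sketch of the CountDownLatch variant follows. The class LatchDemo, the fake "result-N" strings, and the pool of four threads standing in for the remote servers are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class LatchDemo {
    // Simulates `subtasks` results arriving on other threads and waits for all of them.
    // Returns the number of results that were stored.
    static int collect(int subtasks) {
        CountDownLatch done = new CountDownLatch(subtasks);
        Map<Integer, String> results = new ConcurrentHashMap<>();
        ExecutorService receivers = Executors.newFixedThreadPool(4); // stand-in for the servers
        try {
            for (int i = 0; i < subtasks; i++) {
                final int id = i;
                receivers.execute(() -> {
                    results.put(id, "result-" + id); // store first ...
                    done.countDown();                // ... then signal
                });
            }
            // Blocks until every subtask has called countDown(), or the timeout expires.
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for subtasks");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            receivers.shutdown();
        }
        return results.size();
    }
}
```

The only thread-safety requirement left is the result store itself, which here is a ConcurrentHashMap. Note that the latch counts calls, not distinct ids, so duplicate reports would still need a guard.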
With all these methods you don't need special threadsafety measures other than that the concurrent storing of the results in your structure is threadsafe. And I bet there are even better patterns for this.

A bounded BlockingQueue that doesn't block

The title of this question makes me doubt if this exists, but still:
I'm interested in whether there is an implementation of Java's BlockingQueue that is bounded by size and never blocks, but rather throws an exception when trying to enqueue too many elements.
Edit - I'm passing the BlockingQueue to an Executor, which I suppose uses its add() method, not offer(). One can write a BlockingQueue that wraps another BlockingQueue and delegates calls to add() to offer().

Edit: Based on your new description I believe that you're asking the wrong question. If you're using a Executor you should probably define a custom RejectedExecutionHandler rather than modifying the queue. This only works if you're using a ThreadPoolExecutor, but if you're not it would probably be a better idea to modify the Executor rather than the queue.
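To illustrate the RejectedExecutionHandler route, here is a small sketch, assuming a ThreadPoolExecutor with one worker and a bounded queue of capacity one. The class RejectDemo and its method are made-up names. AbortPolicy is the JDK's default handler and is passed explicitly only to make the behavior visible.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class RejectDemo {
    // Returns true if the third task is rejected with an exception instead of blocking.
    static boolean thirdTaskRejected() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),           // bounded work queue
                new ThreadPoolExecutor.AbortPolicy()); // throws when saturated
        Runnable blocker = () -> {
            try {
                release.await(); // keeps the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker); // handed directly to the single worker thread
            pool.execute(blocker); // fills the queue's single slot
            try {
                pool.execute(blocker); // no worker free and no queue space left
                return false;
            } catch (RejectedExecutionException expected) {
                return true;
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }
}
```

The queue keeps its normal contract, and the "throw instead of block" behavior lives in the executor's saturation policy, where a custom handler could be substituted.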
It's my opinion that it's a mistake to override offer and make it behave like add. Interface methods constitute a contract. Client code that uses blocking queues depends on the methods actually doing what the documentation specifies. Breaking that rule opens up a world of hurt. It's also inelegant.
The add() method on BlockingQueues does that, but they also have an offer() method which is generally a better choice. From the documentation for offer():
Inserts the specified element at the
tail of this queue if it is possible
to do so immediately without exceeding
the queue's capacity, returning true
upon success and false if this queue
is full. This method is generally
preferable to method add(E), which can
fail to insert an element only by
throwing an exception.
This works for all such queues regardless of the specific implementation (ArrayBlockingQueue, LinkedBlockingQueue etc.)
BlockingQueue<String> q = new LinkedBlockingQueue<String>(2);
System.out.println(q.offer("foo")); // true
System.out.println(q.offer("bar")); // true
System.out.println(q.offer("baz")); // false

One can write a BlockingQueue that
wraps another BlockingQueue and
delegates calls to add() to offer().
If that is supposed to be a question ... the answer is "Yes", but you can do it more neatly by creating a subclass that overrides the add(). The only catch (in both cases) is that your version of add cannot throw any checked exceptions that aren't in the method you are overriding, so your "would block" exception will need to be unchecked.

This is sad: you cannot block. There are so many use cases where you would want to block that the whole idea of providing your own bounded blocking queue to the executor loses its meaning.
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    if (poolSize >= corePoolSize || !addIfUnderCorePoolSize(command)) {
        if (runState == RUNNING && workQueue.offer(command)) { // note: offer(), not put()
            if (runState != RUNNING || poolSize == 0)
                ensureQueuedTaskHandled(command);
        }
        else if (!addIfUnderMaximumPoolSize(command))
            reject(command); // is shutdown or saturated
    }
}

A simple use case: queries are executed against a source db in batches (one executor), then enriched in batches and written to another db (another executor). You would want to execute queries only as fast as results are being written to the other db. In that case, the destination executor should accept a bounded queue that blocks, rather than forcing the caller to keep polling and checking how many tasks have completed before executing more queries.
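One commonly used workaround is a RejectedExecutionHandler that puts the rejected task back on the queue with the blocking put(), so the submitting thread waits for space. The sketch below assumes that approach. BlockingSubmit, newBlockingPool, and demo are hypothetical names, and the shutdown check only narrows, and does not eliminate, the race in which a task is queued just as the pool shuts down.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BlockingSubmit {
    // Builds a fixed-size pool whose execute() blocks the caller while the queue is full.
    static ThreadPoolExecutor newBlockingPool(int threads, int queueCapacity) {
        RejectedExecutionHandler blockWhenFull = (task, executor) -> {
            if (executor.isShutdown()) {
                throw new RejectedExecutionException("executor is shut down");
            }
            try {
                executor.getQueue().put(task); // waits until a slot frees up
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException(e);
            }
        };
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity), blockWhenFull);
    }

    // Usage example: pushes `tasks` jobs through a pool with one thread and one queue slot.
    static int demo(int tasks) {
        ThreadPoolExecutor pool = newBlockingPool(1, 1);
        AtomicInteger ran = new AtomicInteger();
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(ran::incrementAndGet); // blocks instead of throwing when saturated
            }
        } finally {
            pool.shutdown();
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ran.get();
    }
}
```

With this handler the producer is throttled to the consumer's pace, which is the back-pressure the use case above asks for.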
