Is BlockingQueue completely thread safe in Java?

I know that the documentation says that the object is thread safe, but does that mean that all access to it, from all methods, is thread safe? So if I call put() on it from many threads at once, and take() on the same instance at the same time, will nothing bad happen?
I ask because this answer is making me second guess:
https://stackoverflow.com/a/22006181/4164238

The quick answer is yes, they are thread safe. But let's not leave it there ...
Firstly, a little housekeeping: BlockingQueue is an interface, and any implementation that is not thread safe will be breaking the documented contract. The link that you included was referring to LinkedBlockingQueue, which has some cleverness to it.
The link that you included makes an interesting observation: yes, there are two locks within LinkedBlockingQueue. However, it fails to see that the edge case that a 'simple' implementation would have fallen foul of is in fact being handled, which is why the take and put methods are more complicated than one would at first expect.
LinkedBlockingQueue is optimized to avoid using the same lock on both reading and writing; this reduces contention, but for correct behavior it relies on the queue not being empty. When the queue has elements within it, the put and the take points are not at the same region of memory and contention can be avoided. When the queue is empty, however, the contention cannot be avoided, and so extra code is required to handle this common 'edge' case. This is a common trade-off between code complexity and performance/scalability.
The question then follows: how does LinkedBlockingQueue know when the queue is empty/not empty, and thus handle the threading? The answer is that it uses an AtomicInteger and a Condition as two extra concurrent data structures. The AtomicInteger is used to check whether the length of the queue is zero, and the Condition is used to wait for a signal that notifies a waiting thread when the queue is probably in the desired state. This extra coordination has an overhead, but measurements have shown that, when ramping up the number of concurrent threads, the overhead of this technique is lower than the contention introduced by using a single lock.
Below I have copied the code from LinkedBlockingQueue and added comments explaining how they work. At a high level, take() first locks out all other calls to take() and then signals put() as necessary. put() works in a similar way, first it blocks out all other calls to put() and then signals take() if necessary.
From the put() method:
// putLock coordinates the calls to put() only; further coordination
// between put() and take() follows below
putLock.lockInterruptibly();
try {
    // block while the queue is full; count is shared between put() and take()
    // and is safely visible between cores, but prone to change between calls.
    // A while loop is used because state can change between signals, which is
    // why signals get rechecked and resent -- read on to see more of that
    while (count.get() == capacity) {
        notFull.await();
    }
    // we know that the queue is not full, so add
    enqueue(e);
    c = count.getAndIncrement();
    // if the queue is still not full, send a signal to wake up
    // any other thread that is possibly waiting for the queue to be a little
    // emptier -- note that this is logically part of take(), but it
    // has to be here because take() blocks itself
    if (c + 1 < capacity)
        notFull.signal();
} finally {
    putLock.unlock();
}
if (c == 0)
    signalNotEmpty();
From the take() method:
takeLock.lockInterruptibly();
try {
    // wait for the queue to stop being empty
    while (count.get() == 0) {
        notEmpty.await();
    }
    // remove an element
    x = dequeue();
    // decrement the shared count
    c = count.getAndDecrement();
    // if the queue is still not empty, signal another waiting taker --
    // note that this is logically part of put(), but for thread
    // coordination reasons it is here
    if (c > 1)
        notEmpty.signal();
} finally {
    takeLock.unlock();
}
if (c == capacity)
    signalNotFull();
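For completeness, signalNotEmpty() and signalNotFull() are small helpers that acquire the opposite lock before signalling; the following is essentially what the OpenJDK source does (the comments are mine):

    // called from put() when the queue may have gone from empty to non-empty;
    // takeLock must be held because notEmpty belongs to that lock
    private void signalNotEmpty() {
        final ReentrantLock takeLock = this.takeLock;
        takeLock.lock();
        try {
            notEmpty.signal();
        } finally {
            takeLock.unlock();
        }
    }

    // called from take() when the queue may have gone from full to non-full
    private void signalNotFull() {
        final ReentrantLock putLock = this.putLock;
        putLock.lock();
        try {
            notFull.signal();
        } finally {
            putLock.unlock();
        }
    }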

Yes, all implementations of BlockingQueue are thread safe for put and take and all other queue operations.
The link only goes halfway... it does not cover the full details. It is thread safe.

That answer is a little strange; for a start, BlockingQueue is an interface, so it doesn't have any locks. Implementations such as ArrayBlockingQueue use the same lock for add() and take(), so they would be fine. Generally, if any implementation is not thread safe then it is a buggy implementation.

I think @Chris K has missed some points. "When the queue has elements within it, then the push and the pop points are not at the same region of memory and contention can be avoided." Notice that when the queue has one element, head.next and tail point to the same node, and put() and take() can both acquire their locks and execute.
I think the empty and full conditions could be handled by synchronizing put() and take(). However, when it comes to one element, the LinkedBlockingQueue has a null dummy head node, which may have something to do with the thread safety.

I tried this implementation on Leetcode
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingDeque;

class FooBar {

    private final BlockingQueue<Object> line = new LinkedBlockingDeque<>(1);
    private static final Object PRESENT = new Object();
    private int n;

    public FooBar(int n) {
        this.n = n;
    }

    public void foo(Runnable printFoo) throws InterruptedException {
        for (int i = 0; i < n; i++) {
            line.put(PRESENT);
            // printFoo.run() outputs "foo". Do not change or remove this line.
            printFoo.run();
        }
    }

    public void bar(Runnable printBar) throws InterruptedException {
        for (int i = 0; i < n; i++) {
            line.take();
            // printBar.run() outputs "bar". Do not change or remove this line.
            printBar.run();
        }
    }
}
With n = 3, most times I get the correct response of foobarfoobarfoobar, but sometimes I get barbarfoofoofoobar, which is quite surprising.
I resolved to using ReentrantLock and Condition instead. @chris-k, can you shed more light?

Related

What would happen if putting and taking simultaneously when a Java LinkedBlockingQueue only has one element?

LinkedBlockingQueue has two locks: one for putting, one for taking. When the size of the queue is 1, I think two threads can lock and manipulate the queue simultaneously, which will cause undefined behavior. Am I wrong?
// method put:
putLock.lockInterruptibly(); // put lock
...
while (count.get() == capacity) {
    notFull.await();
}
enqueue(node);

// method enqueue:
last = last.next = node;

// method take:
takeLock.lockInterruptibly(); // take lock
...
while (count.get() == 0) {
    notEmpty.await();
}
x = dequeue();

// method dequeue:
Node<E> h = head;
Node<E> first = h.next;
h.next = h;
head = first;
E x = first.item;
first.item = null;
return x;
Clearly the put thread and the take thread can both acquire their locks when there is only one item in the queue, so they will execute the code in enqueue and dequeue respectively. I mean, if the take thread enters dequeue and does all that pointer modification, doesn't that collide with the code in enqueue?
The link here says, "However when the queue is empty then the contention cannot be avoided, and so extra code is required to handle this common 'edge' case":
Is BlockingQueue completely thread safe in Java
The javadoc for BlockingQueue (the interface implemented by LinkedBlockingQueue) states this:
BlockingQueue implementations are thread-safe. All queuing methods achieve their effects atomically using internal locks or other forms of concurrency control.
The word "atomically" means that if two operations (for example a put and a take) happen simultaneously, then the implementation will ensure that they behave according to the contract. The effect will be as if the put happens before get or vice-versa. That applies to edge-cases as well, such as your example of a queue with one element.
In fact, since put and get are blocking operations, the relative ordering of the two operations won't matter. With offer / poll or add / remove the order does matter, but you can't control it.
Please note that the above is based solely on what the javadoc says. Assuming that I have interpreted the javadoc correctly, then it applies to all1 BlockingQueue implementations, irrespective of whether they use one or two locks ... or none at all. If a BlockingQueue implementation doesn't behave as above, that is a bug!
1 - All implementations that implement the API correctly. That should cover all of the Java SE classes.
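To illustrate the blocking/non-blocking distinction in code, here is a small self-contained sketch (the class and variable names are mine, not from the javadoc):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.TimeUnit;

    public class BlockingVsNonBlocking {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> q = new ArrayBlockingQueue<>(1);
            q.put("a");                                  // blocks while the queue is full
            String taken = q.take();                     // blocks while the queue is empty
            boolean offered = q.offer("b");              // returns false immediately if full
            String polled = q.poll(1, TimeUnit.SECONDS); // null if nothing arrives in time
            System.out.println(taken + " " + offered + " " + polled);
        }
    }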
put implementation of a LinkedBlockingQueue
public void put(E e) throws InterruptedException {
    // some lock and node code

    // the part that matters here
    try {
        while (count.get() == capacity) {
            notFull.await();
        }
        // put the item in the queue.
    } finally {
        // not important here
    }
}
Basically, in put, the calling thread waits for the count to drop below the capacity before continuing.
Even though the thread putting the value on the queue grabs a lock that is different from the take thread's lock, it waits to add the item until the queue is not full.
take has a similar implementation with regards to notEmpty instead of notFull.
After 2 days of searching, I finally got it...
When the queue has only one item, by LinkedBlockingQueue's design there are actually two nodes: the dummy head and the real item (which last points to). It's true that the put thread and the take thread can both get their locks, but they modify different parts of the queue.
The put thread will call
last = last.next = node; // before this, last points to the only item in the queue
The take thread will call
Node<E> h = head;
Node<E> first = h.next; // first also points to the only item in the queue
h.next = h;
head = first;
E x = first.item;
first.item = null;
return x;
The intersection of these two threads is the node that last points to in the put thread and that first points to in the take thread.
Notice that the put thread only modifies last.next, and the take thread only modifies first.item. Although the two threads touch the same node instance, they modify different fields of it, so it won't bring about any thread conflicts.
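To make the disjoint writes concrete, here is a minimal sketch of the node (field names follow the OpenJDK source; the per-thread annotations are my reading of the code above):

    static class Node<E> {
        E item;        // cleared by the take thread (first.item = null)
        Node<E> next;  // written by the put thread (last.next = node)
        Node(E x) { item = x; }
    }
    // With one element queued, 'last' in put() and 'first' in take()
    // reference the same Node, but put() writes only 'next' and take()
    // writes only 'item', so the two writes never touch the same field.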

Is this a proper customized synchronizer?

I had a strong need for a synchronizer similar to a CountDownLatch, but where the starting number for the countdown is unknown. To add context: I'm going through a buffered recordset (say, from a text file or a query) and kicking off a runnable for each record, but I don't know how many records there will be. I need a synchronizer that signals when the iteration is complete and all runnables are complete.
This is the synchronizer I came up with... a BufferedLatch. A method is called in the iteration loop for each record, incrementing recordsetSize. At the end of each runnable kicked off for a record, processedRecordsetSize is incremented. When the iteration through all records is complete (though runnables may still be in flight), the setDownloadComplete() method is called, letting the BufferedLatch know that recordsetSize is now fixed. The await() method waits until iterationComplete is true (recordsetSize is now fixed) and recordsetSize == processedRecordsetSize.
Is this an optimal implementation of this synchronizer? Is there more concurrency that the synchronization is holding back? Although testing seems to work fine, are there any gotchas I'm overlooking?
import java.util.concurrent.atomic.AtomicInteger;

public final class BufferedLatch {

    /** A customized synchronizer built for concurrent iteration processes
     *  where the number of objects to be iterated is unknown and a runnable
     *  will be kicked off for each object; the await() method will wait for
     *  all runnables to be complete.
     */

    private final AtomicInteger recordsetSize = new AtomicInteger(0);
    private final AtomicInteger processedRecordsetSize = new AtomicInteger(0);
    private volatile boolean iterationComplete = false;

    public int incrementRecordsetSize() throws Exception {
        if (iterationComplete) {
            throw new Exception("Cannot increase recordsize after download is flagged complete!");
        }
        else {
            return recordsetSize.incrementAndGet();
        }
    }

    public void incrementProcessedRecordSize() {
        synchronized (this) {
            processedRecordsetSize.incrementAndGet();
            if (iterationComplete) {
                if (processedRecordsetSize.get() == recordsetSize.get()) {
                    this.notifyAll();
                }
            }
        }
    }

    public void setDownloadComplete() {
        synchronized (this) {
            iterationComplete = true;
        }
    }

    public void await() throws InterruptedException {
        while (! (iterationComplete && (processedRecordsetSize.get() == recordsetSize.get()))) {
            synchronized (this) {
                while (! (iterationComplete && (processedRecordsetSize.get() == recordsetSize.get()))) {
                    this.wait();
                }
            }
        }
    }
}
UPDATE: NEW CODE
public final class BufferedLatch {

    /** A customized synchronizer built for concurrent iteration processes
     *  where the number of objects to be iterated is unknown and a runnable
     *  will be kicked off for each object; the await() method will wait for
     *  all runnables to be complete.
     */

    private int recordCount = 0;
    private int processedRecordCount = 0;
    private boolean iterationComplete = false;

    public synchronized void incrementRecordCount() throws Exception {
        if (iterationComplete) {
            throw new Exception("Cannot increase recordCount after download is flagged complete!");
        }
        else {
            recordCount++;
        }
    }

    public synchronized void incrementProcessedRecordCount() {
        processedRecordCount++;
        if (iterationComplete && recordCount == processedRecordCount) {
            this.notifyAll();
        }
    }

    public synchronized void setIterationComplete() {
        iterationComplete = true;
        if (iterationComplete && recordCount == processedRecordCount) {
            this.notifyAll();
        }
    }

    public synchronized void await() throws InterruptedException {
        while (! (iterationComplete && (recordCount == processedRecordCount))) {
            this.wait();
        }
    }
}
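For context, a hedged sketch of how this latch might be driven (readRecords() and process() are stand-ins, not part of the original code; the enclosing method is assumed to declare throws Exception):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    BufferedLatch latch = new BufferedLatch();
    ExecutorService pool = Executors.newFixedThreadPool(4);
    for (String record : readRecords()) {    // readRecords() is a stand-in
        latch.incrementRecordCount();
        pool.submit(() -> {
            try {
                process(record);             // process() is a stand-in
            } finally {
                latch.incrementProcessedRecordCount();
            }
        });
    }
    latch.setIterationComplete();
    latch.await();                           // returns once all runnables finish
    pool.shutdown();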
Probably not. I think conceptually you're onto something here, as it looks like your application needs something more than just a CountDownLatch. However, the implementation seems to have several problems.
First, I note that it looks odd to mix atomics/volatiles AND ordinary object monitor locks (synchronized). While there may be proper uses that mix these different constructs, mixing in this case I believe will lead to errors.
Consider incrementRecordsetSize() which first checks iterationComplete and only if it's false does it increment recordsetSize. The iterationComplete variable is volatile so updates from other threads will be visible. However, the fact that no locking is done here allows TOCTOU race conditions (time-of-check vs time-of-use). The rule seems to be, recordsetSize must not be incremented if iterationComplete is true. Suppose thread T1 comes along and finds iterationComplete to be false, so it decides to increment recordsetSize. Before it does so, another thread T2 comes along and sets iterationComplete to be true. This would allow T1 to do the increment improperly. Worse, before it does so, suppose another thread T3 came along and called incrementProcessedRecordSize(). It would increment processedRecordsetSize and then find iterationComplete true. It further might find that processedRecordsetSize equals recordsetSize and then notify all waiters, who then proceed as if the processing is complete. But it's not, as T1 then proceeds to increment recordsetSize and presumably continues with its processing.
The problem here is that this object's state consists of the fusion of three independent pieces of state -- two int counters and a boolean -- and all three must be read and written atomically. If certain bits of logic attempt to take advantage of individual volatile or atomic properties, it introduces the possibility of race conditions such as the one I described.
I'd suggest rewriting this as a plain object with two plain ints and a boolean (not atomic, not volatile) and just lock around everything. This should certainly clear up the logic and make things easier to understand.
In incrementProcessedRecordSize I note that the condition essentially duplicates the condition in the await method. A simplifying convention is for all updates to notify and have the condition evaluated only by the waiters. This may result in some unnecessary wakeups. If this is a problem, you might consider minimizing the number of notifies, but you need to think about maintainability. If you're not careful, the wait/notify conditions will become spread across the code and will be very hard to reason about. Alternatively, you could refactor the condition into a method and call it from the different places that do waiting and notification.
It looks like await() does a complicated form of double-checked locking. Instead of testing a volatile boolean outside the lock, it tests several separate pieces of information both outside and inside the lock. This seems susceptible to TOCTOU problems (as above) but it might be safe if you can prove the state really latches, that is, that once it becomes true it never returns to false. I'd have to stare at the code for a long time before I'd be able to convince myself it's correct.
On the other hand, what does this buy you? It seems to optimize away just the taking of the lock. If you have a zillion threads that are going to come by after processing is complete, it might be worth it, but it doesn't seem like it. I'd just remove the outer while loop and check the variables within a synchronized block.
Finally, having an object that represents counters and a boolean may very well be sensible for what you're doing, but other things you've said (in the question and in comments) are that some threads are generating a workload (e.g. reading lines from a file) and other threads are retiring that workload. This implies that there is some other data structure like a queue that contains this workload, and you have a producer-consumer problem here. That other structure has to be thread-safe, of course, since multiple threads are interacting over it. But the counters and boolean in this structure need to be updated in lockstep with the updates to the workload structure, otherwise there could be race conditions between checking and updating these separate objects.
It seems to me you could replace the counters in this object with the queue and just put simple locks around everything. The producers would append to the queue until they're done, at which time they set iterationComplete to true which prevents more work from being added. The consumers pull from the queue until iterationComplete is true and the queue is empty, at which point they're done. If they find the queue empty but iterationComplete is false, they know to block while awaiting further work.
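A minimal sketch of that queue-based alternative, assuming plain locking throughout (this is my reading of the suggestion, not code from the answer):

    import java.util.ArrayDeque;
    import java.util.Queue;

    public final class WorkQueue<T> {
        private final Queue<T> queue = new ArrayDeque<>();
        private boolean iterationComplete = false;

        public synchronized void add(T item) {
            if (iterationComplete) throw new IllegalStateException("closed");
            queue.add(item);
            notifyAll(); // wake consumers waiting for work
        }

        public synchronized void close() {
            iterationComplete = true;
            notifyAll(); // wake consumers so they can observe completion
        }

        // Returns null once the queue is drained and no more work will arrive.
        public synchronized T take() throws InterruptedException {
            while (queue.isEmpty() && !iterationComplete) {
                wait();
            }
            return queue.poll();
        }
    }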
I'd say to stick with simple locking and avoid volatiles/atomics until you get the basics correct. If there are bottlenecks in that code, then apply optimizations selectively while preserving the same invariants.

Implement-your-own blocking queue in Java

I know this question has been asked and answered many times before, but I just couldn't figure out a trick in the examples found around the internet, like this one or that one.
Both of these solutions check for emptiness of the blocking queue's array/queue/linkedlist in order to notifyAll waiting threads in the put() method, and vice versa in the get() methods. A comment in the second link emphasizes this situation and mentions that it is not necessary.
So the question is: it also seems a bit odd to me to check whether the queue is empty or full before notifying all waiting threads. Any ideas?
Thanks in advance.
I know this is an old question by now, but after reading the question and answers I couldn't help myself; I hope you find this useful.
Regarding checking whether the queue is actually full or empty before notifying other waiting threads: you're missing something. put(T t) and T get() are both synchronized methods, so only one thread can enter either of them at a time. That alone does not serialize the whole interaction, though, because a thread that calls wait() inside put(T t) releases the monitor, so another thread can enter and start executing T get() before the first thread has exited put(T t). The double-checking design makes the developer feel a little safer, because you can't know if or when a CPU context switch will happen.
A better and more recommended approach is to use ReentrantLock and Condition:
// I've edited the source code from this link
// (class declaration, fields, imports, and the isFull/isEmpty helpers were
// added here so the fragment compiles)
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class BQueue<T> {

    private final Queue<T> q = new LinkedList<>();
    private final int limit;
    private final Lock lock;
    private final Condition isFullCondition;
    private final Condition isEmptyCondition;

    public BQueue() {
        this(Integer.MAX_VALUE);
    }

    public BQueue(int limit) {
        this.limit = limit;
        lock = new ReentrantLock();
        isFullCondition = lock.newCondition();
        isEmptyCondition = lock.newCondition();
    }

    private boolean isFull() { return q.size() >= limit; }
    private boolean isEmpty() { return q.isEmpty(); }

    public void put(T t) {
        lock.lock();
        try {
            while (isFull()) {
                try {
                    isFullCondition.await();
                } catch (InterruptedException ex) {}
            }
            q.add(t);
            isEmptyCondition.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public T get() {
        T t = null;
        lock.lock();
        try {
            while (isEmpty()) {
                try {
                    isEmptyCondition.await();
                } catch (InterruptedException ex) {}
            }
            t = q.poll();
            isFullCondition.signalAll();
        } finally {
            lock.unlock();
        }
        return t;
    }
}
Using this approach there's no need for double checking, because the lock object is shared between the two methods, so only one thread can be inside either of them at a time. And since put and get wait on two separate Condition objects, only the threads waiting because the queue is full are notified when there's more space, and the same goes for the threads waiting because the queue is empty; this leads to better CPU utilization.
You can find a more detailed example with source code here.
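A short usage fragment for the sketch above (my own, assuming the completed class and an enclosing main that tolerates blocking):

    BQueue<Integer> bq = new BQueue<>(2);
    new Thread(() -> {
        for (int i = 0; i < 5; i++) bq.put(i); // blocks whenever 2 items are queued
    }).start();
    for (int i = 0; i < 5; i++) {
        System.out.println(bq.get());          // prints 0..4 in order
    }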
I think that, logically, there is no harm doing that extra check before notifyAll().
You can simply notifyAll() once you put/get something from the queue. Everything will still work, and your code is shorter. However, there is also no harm in checking whether anyone is potentially waiting (by checking whether you are at the boundary of the queue) before you invoke notifyAll(). This extra piece of logic saves unnecessary notifyAll() invocations.
It just depends on whether you want shorter, cleaner code or code that runs more efficiently. (I haven't looked into notifyAll()'s implementation; if it is a cheap operation when no one is waiting, the performance gain from the extra check may not be noticeable anyway.)
The reason why the authors used notifyAll() is simple: they had no clue whether or not it was necessary, so they decided for the "safer" option.
In the above example it would be sufficient to just call notify(), as for each single element added only a single waiting thread can be served, under all circumstances.
This becomes more obvious if your queue also has an option to add multiple elements in one step, like addAll(Collection<T> list), as in this case more than one thread waiting on an empty list could be served; to be exact: as many threads as elements have been added.
notifyAll(), however, causes extra overhead in the special single-element case, as many threads are woken up unnecessarily and have to be put back to sleep, blocking queue access in the meantime. So replacing notifyAll() with notify() would improve speed in this special case.
But then, not using wait/notify and synchronized at all, and instead using the java.util.concurrent package, would increase speed by far more than any smart wait/notify implementation could ever reach.
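For example, a hand-rolled queue like the ones discussed here can usually be replaced wholesale by a queue from java.util.concurrent (a sketch with an arbitrary capacity):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
    queue.put("item");            // blocks while the queue is full
    String item = queue.take();   // blocks while the queue is empty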
I would like to write a simple blocking queue implementation which should help people understand this easily. This is for someone who is new to this.
import java.util.LinkedList;
import java.util.List;

class BlockingQueue {

    private List queue = new LinkedList();
    private int limit = 10;

    public BlockingQueue(int limit) {
        this.limit = limit;
    }

    public synchronized void enqueue(Object ele) throws InterruptedException {
        while (queue.size() == limit)
            wait();
        // if the queue was empty, takers may be waiting
        if (queue.size() == 0)
            notifyAll();
        // add
        queue.add(ele);
    }

    public synchronized Object deque() throws InterruptedException {
        while (queue.size() == 0)
            wait();
        // if the queue was full, putters may be waiting
        if (queue.size() == limit)
            notifyAll();
        return queue.remove(0);
    }
}
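A quick driver to exercise the class above (my own sketch, not part of the original answer):

    public class Demo {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue q = new BlockingQueue(2);
            Thread producer = new Thread(() -> {
                try {
                    for (int i = 0; i < 5; i++) q.enqueue(i);
                } catch (InterruptedException ignored) {}
            });
            producer.start();
            for (int i = 0; i < 5; i++) {
                System.out.println(q.deque()); // prints 0..4 in order
            }
            producer.join();
        }
    }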

Implementing a blocking queue in JavaME: how to optimize it?

I'm trying to implement a simple blocking queue in Java ME. In the JavaME API, the concurrency utilities of Java SE are not available, so I have to use wait/notify like in the old times.
This is my provisional implementation. I'm using notify instead of notifyAll because in my project there are multiple producers but only a single consumer. I used a dedicated object for wait/notify on purpose to improve readability, even though it wastes a reference:
import java.util.Vector;

public class BlockingQueue {

    private Vector queue = new Vector();
    private Object queueLock = new Object();

    public void put(Object o) {
        synchronized (queueLock) {
            queue.addElement(o);
            queueLock.notify();
        }
    }

    public Object take() {
        Object ret = null;
        synchronized (queueLock) {
            while (queue.isEmpty()) {
                try {
                    queueLock.wait();
                } catch (InterruptedException e) {}
            }
            ret = queue.elementAt(0);
            queue.removeElementAt(0);
        }
        return ret;
    }
}
My main question is about the put method. Could I put the queue.addElement line out of the synchronized block? Will performance improve if so?
Also, the same applies to take: could I take the two operations on queue out of the synchronized block?
Any other possible optimization?
EDIT:
As @Raam correctly pointed out, the consumer thread can starve even after being awakened from wait. So what are the alternatives to prevent this? (Note: in JavaME I don't have all these nice classes from Java SE. Think of it as the old Java v1.2.)
Individual Vector methods are synchronized, but your compound operations (such as the check-then-remove in take) are not atomic, so you should synchronize around them like you have done. Unless you have evidence that your current solution has performance problems, I wouldn't worry about it.
On a side note, I see no harm in using notifyAll rather than notify to support multiple consumers.
synchronized is used to protect access to shared state and to ensure atomicity.
Note that the methods of Vector are already synchronized, so Vector protects its own shared state itself. Your synchronization blocks are therefore only needed to ensure atomicity of your compound operations.
You certainly cannot move the operations on queue out of the synchronized block in your take() method, because atomicity is crucial for the correctness of that method. But, as far as I understand, you can move the queue operation out of the synchronized block in the put() method (I cannot imagine a situation where it can go wrong).
However, the reasoning above is purely theoretical, because in all cases you have double synchronization: you synchronize on queueLock, and the methods of Vector implicitly synchronize on queue. Therefore the proposed optimization doesn't make much sense, and its correctness depends on the presence of that double synchronization.
To avoid double synchronization you need to synchronize on queue as well:
synchronized (queue) { ... }
Another option would be to use a non-synchronized collection (such as ArrayList) instead of Vector, but JavaME doesn't support it. In that case you won't be able to use the proposed optimization either, because synchronized blocks would also protect the shared state of the non-synchronized collection.
Unless you have performance issues specifically due to garbage collection, I would rather use a linked list than a Vector to implement a queue (first in, first out).
I would also write code that can be reused when your project (or another one) gets multiple consumers. In that case, though, you need to be aware that the Java language specifications do not impose a particular way to implement monitors. In practice, that means you don't control which consumer thread gets notified (half of the existing Java virtual machines implement monitors using a FIFO model and the other half use a LIFO model).
I also think that whoever is using the blocking class should deal with the InterruptedException. After all, the client code would otherwise have to deal with a null Object return.
So, something like this:
/*package*/ class LinkedObject {

    private Object iCurrentObject = null;
    private LinkedObject iNextLinkedObject = null;

    LinkedObject(Object aNewObject, LinkedObject aNextLinkedObject) {
        iCurrentObject = aNewObject;
        iNextLinkedObject = aNextLinkedObject;
    }

    Object getCurrentObject() {
        return iCurrentObject;
    }

    LinkedObject getNextLinkedObject() {
        return iNextLinkedObject;
    }
}

public class BlockingQueue {

    private LinkedObject iLinkedListContainer = null;
    private Object iQueueLock = new Object();
    private int iBlockedThreadCount = 0;

    public void appendObject(Object aNewObject) {
        synchronized (iQueueLock) {
            iLinkedListContainer = new LinkedObject(aNewObject, iLinkedListContainer);
            if (iBlockedThreadCount > 0) {
                iQueueLock.notify(); // one at a time because we only appended one object
            }
        } // synchronized(iQueueLock)
    }

    public Object getFirstObject() throws InterruptedException {
        Object result = null;
        synchronized (iQueueLock) {
            while (null == iLinkedListContainer) { // 'while' guards against spurious wakeups
                ++iBlockedThreadCount;
                try {
                    iQueueLock.wait();
                    --iBlockedThreadCount; // instead of having a "finally" statement
                } catch (InterruptedException iex) {
                    --iBlockedThreadCount;
                    throw iex;
                }
            }
            result = iLinkedListContainer.getCurrentObject();
            iLinkedListContainer = iLinkedListContainer.getNextLinkedObject();
            if ((iBlockedThreadCount > 0) && (null != iLinkedListContainer)) {
                iQueueLock.notify();
            }
        } // synchronized(iQueueLock)
        return result;
    }
}
I think that if you try to put less code in the synchronized blocks, the class will not be correct anymore.
There seem to be some issues with this approach. You can have scenarios where the consumer can miss notifications and wait on the queue even when there are elements in the queue.
Consider the following sequence in chronological order
T1 - Consumer acquires the queueLock and then calls wait. Wait will release the lock and cause the thread to wait for a notification
T2 - One producer acquires the queueLock and adds an element to the queue and calls notify
T3 - The Consumer thread is notified and attempts to acquire queueLock BUT fails as another producer comes at the same time. (from the notify java doc - The awakened thread will compete in the usual manner with any other threads that might be actively competing to synchronize on this object; for example, the awakened thread enjoys no reliable privilege or disadvantage in being the next thread to lock this object.)
T4 - The second producer now adds another element and calls notify. This notify is lost, since the consumer is blocked trying to acquire queueLock rather than sitting in its wait set.
So, theoretically, it is possible for the consumer to starve (forever stuck trying to acquire queueLock), and you can also run into memory issues with multiple producers adding elements that are never read and removed from the queue.
Some changes that I would suggest are as follows:
Keep an upper bound on the number of items that can be added to the queue (see the sketch below).
Ensure that the consumer always reads all the elements. Here is a program which shows how the producer-consumer problem can be coded.
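A sketch of the first suggestion, bounding the queue from the question so that producers block when it is full ('limit' is an added capacity field, and once producers also wait, notifyAll is safer than notify):

    public void put(Object o) throws InterruptedException {
        synchronized (queueLock) {
            while (queue.size() >= limit) { // block producers while full
                queueLock.wait();
            }
            queue.addElement(o);
            queueLock.notifyAll(); // wake both producers and consumers
        }
    }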

BlockingQueue - blocked drainTo() methods

BlockingQueue has a method called drainTo(), but it does not block. I need a queue that blocks but is also able to retrieve queued objects in a single method.
Object first = blockingQueue.take();
if (blockingQueue.size() > 0)
    blockingQueue.drainTo(list);
I guess the above code will work, but I'm looking for an elegant solution.
Are you referring to the comment in the JavaDoc:
Further, the behavior of this operation is undefined if the specified collection
is modified while the operation is in progress.
I believe that this refers to the collection list in your example:
blockingQueue.drainTo(list);
meaning that you cannot modify list at the same time you are draining from blockingQueue into list. However, the blocking queue internally synchronizes, so that when drainTo is called, puts and (see the note below) takes will block. If it did not do this, it would not be truly thread-safe. You can look at the source code and verify that drainTo is thread-safe with regard to the blocking queue itself.
Alternately, do you mean that when you call drainTo that you want it to block until at least one object has been added to the queue? In that case, you have little choice other than:
list.add(blockingQueue.take());
blockingQueue.drainTo(list);
to block until one or more items have been added, and then drain the entire queue into the collection list.
Note: As of Java 7, a separate lock is used for takes and puts. Put operations are now permitted during a drainTo (and a number of other take operations).
If you happen to use Google Guava, there's a nifty Queues.drain() method.
Drains the queue as BlockingQueue.drainTo(Collection, int), but if the
requested numElements elements are not available, it will wait for
them up to the specified timeout.
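Usage looks roughly like this (assuming a BlockingQueue<String> named queue and an enclosing method that may throw InterruptedException):

    import com.google.common.collect.Queues;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    List<String> buffer = new ArrayList<>();
    // Waits up to 1 second for 10 elements; returns however many were drained.
    int drained = Queues.drain(queue, buffer, 10, 1, TimeUnit.SECONDS);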
I found this pattern useful.
List<byte[]> blobs = new ArrayList<byte[]>();
if (queue.drainTo(blobs, batch) == 0) {
    blobs.add(queue.take());
}
With the API available, I don't think you are going to get much more elegant than this, other than that you can remove the size test.
If you want to atomically retrieve a contiguous sequence of elements, even when another removal operation coincides, I don't believe even drainTo guarantees that.
Source code:
public int drainTo(Collection<? super E> c) {
    // arg. check
    lock.lock();
    try {
        for (n = 0; n != count; n++) {
            c.add(items[n]);
        }
        if (n > 0) {
            notFull.signalAll();
        }
        return n;
    } finally {
        lock.unlock();
    }
}
ArrayBlockingQueue is happy to return 0 when the queue is empty. By the way, it could do that before taking the lock.
