A bounded BlockingQueue that doesn't block - java

The title of this question makes me doubt whether this exists, but still:
I'm interested in whether there is an implementation of Java's BlockingQueue that is bounded by size and never blocks, but rather throws an exception when trying to enqueue too many elements.
Edit - I'm passing the BlockingQueue to an Executor, which I suppose uses its add() method, not offer(). One can write a BlockingQueue that wraps another BlockingQueue and delegates calls to add() to offer().

Edit: Based on your new description I believe that you're asking the wrong question. If you're using an Executor you should probably define a custom RejectedExecutionHandler rather than modifying the queue. This only works if you're using a ThreadPoolExecutor, but if you're not it would probably be a better idea to modify the Executor rather than the queue.
It's my opinion that it's a mistake to override offer and make it behave like add. Interface methods constitute a contract. Client code that uses blocking queues depends on the methods actually doing what the documentation specifies. Breaking that rule opens up a world of hurt. That, and it's inelegant.
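As a rough illustration of the RejectedExecutionHandler route, here is a minimal sketch. The class name RejectionDemo, the pool sizes, and the exception message are assumptions made for the example; the handler simply throws when the bounded queue is full and no more threads may be started.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {
    /** Submits three tasks to a 1-thread pool with a 1-slot queue and counts the rejections. */
    static int countRejections() throws InterruptedException {
        final CountDownLatch release = new CountDownLatch(1);
        // Hypothetical handler: fail loudly instead of blocking or dropping the task.
        RejectedExecutionHandler handler = (task, executor) -> {
            throw new RejectedExecutionException("queue is full");
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1), handler);
        // Each task waits on the latch, so the worker stays busy while we submit.
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int rejected = 0;
        for (int i = 0; i < 3; i++) {
            try {
                pool.execute(blocker); // 1st runs, 2nd is queued, 3rd has nowhere to go
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected: " + countRejections());
    }
}
```

The queue itself stays a plain ArrayBlockingQueue with its contract intact; only the executor's saturation policy changes.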
The add() method on BlockingQueues does that, but they also have an offer() method which is generally a better choice. From the documentation for offer():
Inserts the specified element at the
tail of this queue if it is possible
to do so immediately without exceeding
the queue's capacity, returning true
upon success and false if this queue
is full. This method is generally
preferable to method add(E), which can
fail to insert an element only by
throwing an exception.
This works for all such queues regardless of the specific implementation (ArrayBlockingQueue, LinkedBlockingQueue etc.)
BlockingQueue<String> q = new LinkedBlockingQueue<String>(2);
System.out.println(q.offer("foo")); // true
System.out.println(q.offer("bar")); // true
System.out.println(q.offer("baz")); // false

One can write a BlockingQueue that
wraps another BlockingQueue and
delegates calls to add() to offer().
If that is supposed to be a question ... the answer is "Yes", but you can do it more neatly by creating a subclass that overrides the add(). The only catch (in both cases) is that your version of add cannot throw any checked exceptions that aren't in the method you are overriding, so your "would block" exception will need to be unchecked.

This is sad: you cannot block. There are many use cases where you would want to block, and because execute() calls offer() on the work queue, never put(), providing your own bounded blocking queue to the executor does not give you that:
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    if (poolSize >= corePoolSize || !addIfUnderCorePoolSize(command)) {
        if (runState == RUNNING && workQueue.offer(command)) { // offer(), so a full queue never blocks
            if (runState != RUNNING || poolSize == 0)
                ensureQueuedTaskHandled(command);
        }
        else if (!addIfUnderMaximumPoolSize(command))
            reject(command); // is shutdown or saturated
    }
}

A simple use case: queries are executed against a source db in batches (one executor), and the results are enriched in batches and written to another db (a second executor). You want to execute queries only as fast as the results are being written to the other db. In that case the destination executor should accept a bounded queue that blocks the submitter when full, rather than forcing you to keep polling to check how many tasks have completed before executing more queries.
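One way to get that back-pressure with a stock ThreadPoolExecutor is a rejection handler that puts the task on the queue itself, which blocks the submitting thread until space frees up. This is only a sketch under assumptions: the names BlockingSubmitDemo and BLOCK_WHEN_FULL and the pool sizes are made up, and the shutdown check is not race-free if another thread shuts the pool down concurrently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockingSubmitDemo {
    /** Hypothetical handler: makes execute() wait for queue space instead of failing. */
    static final RejectedExecutionHandler BLOCK_WHEN_FULL = (task, executor) -> {
        if (executor.isShutdown()) {
            throw new RejectedExecutionException("executor is shut down");
        }
        try {
            executor.getQueue().put(task); // blocks the submitting thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException("interrupted while waiting", e);
        }
    };

    /** Runs n trivial tasks through a small pool and returns how many completed. */
    static int runTasks(int n) throws InterruptedException {
        final AtomicInteger done = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(2), BLOCK_WHEN_FULL);
        for (int i = 0; i < n; i++) {
            pool.execute(() -> done.incrementAndGet()); // may block here when the queue is full
        }
        pool.shutdown(); // already queued tasks still run
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTasks(50));
    }
}
```

The producer is throttled to the consumer's pace without any polling, at the cost of the caveat noted above.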

Related

What thread safety issues do java.util.Stack or java.util.Queue have?

If I ignore the size() inaccuracy, and assume I allocated a large enough underlying Vector so that no reallocation happens, what thread safety issues do java.util.Stack or java.util.Queue have?
I cannot think of a valid/reasonable consistency argument to say they are thread unsafe.
Does anybody have some insight?
"Thread safe" isn't an absolute attribute for a class -- what's safe or unsafe is your usage of the object. You can come up with unsafe ways to use a ConcurrentHashMap, and you can come up with thread-safe ways to use a plain HashMap.
When people say a class is thread-safe, they generally mean that each method is implemented in a way that's thread-safe on its own. In that sense, a Stack is thread-safe. But its interface doesn't allow for easy/safe handling of common use cases, so in that sense it's not very thread-safe.
For instance, if your code checks that the Stack is not empty and, if so, pops an element, that's unsafe: the stack could have had one element (and thus was not empty), but someone else popped it before you got a chance to, in which case you're trying to pop an empty stack and will get an exception.
To be more thread-safe, you really need a single method that handles that case for you. A BlockingQueue gives you that. For instance, take() will block until there's a value to pop, while poll() will instantly return back a value or null if there's no element to pop.
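A small sketch of the difference. The helper popOrNull and the class name are made up for illustration; the helper relies on the fact that Stack's synchronized methods lock on the stack object itself, so holding that same monitor makes the check and the pop one atomic step.

```java
import java.util.Stack;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class CheckThenActDemo {
    /** Pops an element or returns null; the lock covers both the check and the pop. */
    static <T> T popOrNull(Stack<T> stack) {
        synchronized (stack) {
            return stack.isEmpty() ? null : stack.pop();
        }
    }

    public static void main(String[] args) {
        Stack<String> stack = new Stack<String>();
        stack.push("a");
        System.out.println(popOrNull(stack)); // the element
        System.out.println(popOrNull(stack)); // null instead of an EmptyStackException

        // A BlockingQueue offers the same "take it if it is there" step as one method.
        BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        queue.offer("a");
        System.out.println(queue.poll()); // the element
        System.out.println(queue.poll()); // null, no separate emptiness check needed
    }
}
```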
Stack, which extends Vector, has every method synchronized. This means that interactions with individual methods are thread-safe.
Queue is an interface. The safety of use across threads is up to the individual implementations. For example, an ArrayBlockingQueue is thread safe, but a LinkedList is not.
Look at this method from ArrayBlockingQueue (leave any existing synchronisation aside):
private void insert(E x) {
items[putIndex] = x;
// HERE
putIndex = inc(putIndex);
++count;
notEmpty.signal();
}
Let thread A progress until HERE, and let thread B take over and execute the method; then let A continue. It is easy to see that B's E x overwrites A's E x, with count being incremented by 2 and putIndex being advanced twice.
Similar HEREs can be found in other methods as well.
All data structures with memory for data and variables for bookkeeping are blatantly vulnerable to unsynced concurrent access.

Incremental Future of list extensions

I essentially have a Future<List<T>> that is fetched in batches from the server. For some clients I'd like to provide incremental results while it loads in addition to the whole collection when future is fulfilled.
Is there a common Future extension defined somewhere for this? What typical patterns/combinators exist for such futures?
I assume that given IncrementalListFuture<T> I can easily define a map operation. What else comes to your mind?
Is there a common Future extension defined somewhere for this?
I assume you are talking about incremental results from an ExecutorService. You should consider using an ExecutorCompletionService which allows you to be informed as soon as one of the Future objects is get-able.
To quote from the javadocs:
CompletionService<Result> ecs = new ExecutorCompletionService<Result>(e);
for (Callable<Result> s : solvers) {
ecs.submit(s);
}
int n = solvers.size();
for (int i = 0; i < n; ++i) {
// this waits for one of the futures to finish and provide a result
Future<Result> future = ecs.take();
Result result = future.get();
if (result != null) {
// do something with the result
}
}
Sorry. I initially misread the question and thought that you were asking about a List<Future<?>>. It may be that you could refactor your code to actually return a number of Futures so I'll leave this for posterity.
I would not pass back the list in this case in a Future. You aren't going to be able to get the return until the job finishes.
If possible, I would pass in some sort of BlockingQueue so both the caller and the thread can access it:
final BlockingQueue<T> queue = new LinkedBlockingQueue<T>();
// build out job with the queue
threadPool.submit(new SomeJob(queue));
threadPool.shutdown();
// now we can consume from the queue as it is built:
while (true) {
    T result = queue.take();
    // you could use some constant result object to mean that the job finished
    if (result == SOME_END_OBJECT) {
        break;
    }
    // provide intermediate results
}
You could also have some sort of SomeJob.take() method which calls through to a BlockingQueue defined inside of your job class.
// the blocking queue in this case is hidden inside your job object
T result = someJob.take();
...
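For completeness, a self-contained sketch of the queue-with-end-marker idea. Everything named here is an assumption for the example: the class IncrementalResultsDemo, the END marker (standing in for SOME_END_OBJECT), and the fake "batch-N" results that stand in for data fetched from the server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class IncrementalResultsDemo {
    // Hypothetical end marker; compared by identity, so it cannot collide with real data.
    static final String END = new String("END");

    /** Produces the given number of fake batches on a worker thread and consumes them as they arrive. */
    static List<String> consume(final int batches) throws InterruptedException {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        ExecutorService threadPool = Executors.newSingleThreadExecutor();
        threadPool.submit(() -> {
            try {
                for (int i = 0; i < batches; i++) {
                    queue.add("batch-" + i); // unbounded queue, so add() cannot fail for capacity
                }
            } finally {
                queue.add(END); // always signal completion, even if the job fails
            }
        });
        threadPool.shutdown();

        List<String> seen = new ArrayList<String>();
        while (true) {
            String result = queue.take(); // blocks until the next batch (or the marker) arrives
            if (result == END) {
                break;
            }
            seen.add(result); // an intermediate result, available before the job is done
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(consume(3));
    }
}
```

Putting the marker in a finally block keeps the consumer from waiting forever if the producer throws.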
Here's what I would do:
In the thread that populates the List, make it thread-safe by wrapping the list using Collections.synchronizedList.
Make the list publicly available, but not modifiable, by adding a public method to the thread which returns the list wrapped by Collections.unmodifiableList.
Instead of giving clients a Future<List<T>>, give them a handle to the thread, or some kind of wrapper of it, so that they can call the public method above.
Alternatively, as Gray has suggested, BlockingQueues are great for thread coordination like this. This may require more changes to your client code, however.
To answer my own question: there has been lots of development in this area recently. Among most used are: Play iteratees (http://www.playframework.org/documentation/2.0/Iteratees) and Rx for .NET (http://msdn.microsoft.com/en-us/data/gg577609.aspx)
Instead of Future they define something like:
interface Observable<T> {
Disposable subscribe(Observer<T> observer);
}
interface Observer<T> {
void onCompleted();
void onError(Exception error);
void onNext(T value);
}
and lots of combinators.
As an alternative to Observables, you can take a look at Twitter's approach.
They use Spool, which is an asynchronous version of the Stream.
Basically it is a simple trait similar to the List
trait Spool[+A] {
def head: A
/**
* The (deferred) tail of the spool. Invalid for empty spools.
*/
def tail: Future[Spool[A]]
}
that allows you to do functional stuff like map, filter and foreach on top of it.
Future is really designed to return a single (atomic) result, not for communicating intermediate results in this manner. What you will really want to do is to use multiple futures, one per batch.
We have a similar requirement where we have a bunch of things that we need to get from different remote servers, and each will return at a different time. We don't want to wait until the last one has returned, but rather process them in the order they return. For this we created the AsyncCompleter, which takes an Iterable<Callable<T>> and returns an Iterable<T> that blocks on iteration, completely abstracting usage of the Future interface.
If you look at how that class is implemented, you'll see how to use a CompletionService to receive results from an Executor in the order in which they become available, if you need to build this for yourself.
edit: just saw that the second half of Gray's answer is similar, basically using an ExecutorCompletionService

Piping data between threads with Java

I am writing a multi-threaded application that mimics a movie theater. Each person involved is its own thread and concurrency must be done completely by semaphores. The only issue I am having is how to basically link threads so that they can communicate (via a pipe for instance).
For instance:
Customer[1] which is a thread, acquires a semaphore that lets it walk up to the Box Office. Now Customer[1] must tell the Box Office Agent that they want to see movie "X". Then BoxOfficeAgent[1] also a thread, must check to make sure the movie isn't full and either sell a ticket or tell Customer[1] to pick another movie.
How do I pass that data back and forth while still maintaining concurrency with the semaphores?
Also, the only class I can use from java.util.concurrent is the Semaphore class.
One easy way to pass data back and forth between threads is to use the implementations of the interface BlockingQueue<E>, located in the package java.util.concurrent.
This interface has methods to add elements to the collection with different behaviors:
add(E): adds if possible, otherwise throws exception
boolean offer(E): returns true if the element has been added, false otherwise
boolean offer(E, long, TimeUnit): tries to add the element, waiting the specified amount of time
put(E): blocks the calling thread until the element has been added
It also defines methods for element retrieval with similar behaviors:
take(): blocks until there's an element available
poll(long, TimeUnit): retrieves an element, waiting up to the specified amount of time, or returns null if none becomes available
The implementations I use most frequently are: ArrayBlockingQueue, LinkedBlockingQueue and SynchronousQueue.
The first one, ArrayBlockingQueue, has a fixed size, defined by a parameter passed to its constructor.
The second, LinkedBlockingQueue, is unbounded by default (its capacity is Integer.MAX_VALUE). Unless you pass a capacity to its constructor, it will accept any element: offer will return true immediately, and add will never throw an exception.
The third, and to me the most interesting one, SynchronousQueue, is exactly a pipe. You can think of it as a queue with size 0. It will never keep an element: this queue will only accept elements if there's some other thread trying to retrieve elements from it. Conversely, a retrieval operation will only return an element if there's another thread trying to push it.
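A short sketch of that hand-off behavior (the class and method names are made up for the example): put() waits for a taker, and a non-blocking offer() is refused when nobody is waiting.

```java
import java.util.concurrent.SynchronousQueue;

public class HandoffDemo {
    /** Passes one message from a producer thread to the calling thread. */
    static String handOff(final String message) throws InterruptedException {
        final SynchronousQueue<String> pipe = new SynchronousQueue<String>();
        Thread producer = new Thread(() -> {
            try {
                pipe.put(message); // waits until another thread takes the element
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        String received = pipe.take(); // waits until the producer hands something over
        producer.join();
        return received;
    }

    /** With no thread waiting in take(), the queue has nowhere to keep the element. */
    static boolean offerWithoutConsumer() {
        return new SynchronousQueue<String>().offer("nobody is waiting");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(handOff("movie X"));
        System.out.println(offerWithoutConsumer()); // false
    }
}
```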
To fulfill the homework requirement of synchronization done exclusively with semaphores, you could get inspired by the description I gave you about the SynchronousQueue, and write something quite similar:
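import java.util.concurrent.Semaphore;

class Pipe<E> {
    private E e;
    private final Semaphore read = new Semaphore(0);  // permits = elements ready to be taken
    private final Semaphore write = new Semaphore(1); // permits = free slots
    // acquire() can be interrupted, so both methods declare InterruptedException
    public final void put(final E e) throws InterruptedException {
        write.acquire();
        this.e = e;
        read.release();
    }
    public final E take() throws InterruptedException {
        read.acquire();
        E e = this.e;
        write.release();
        return e;
    }
}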
Notice that this class presents similar behavior to what I described about the SynchronousQueue.
Once the method put(E) is called, it acquires the write semaphore, which is then left with no permits, so that another call to the same method would block at its first line. The method then stores a reference to the object being passed and releases the read semaphore. This release makes it possible for any thread calling the take() method to proceed.
The first step of the take() method is then, naturally, to acquire the read semaphore, in order to prevent any other thread from retrieving the element concurrently. After the element has been retrieved and kept in a local variable (exercise: what would happen if that line, E e = this.e, were removed?), the method releases the write semaphore, so that put(E) may be called again by any thread, and returns what has been saved in the local variable.
As an important remark, observe that the reference to the object being passed is kept in a private field, and the methods take() and put(E) are both final. This is of utmost importance, and often missed. If these methods were not final (or worse, the field not private), an inheriting class would be able to alter the behavior of take() and put(E) breaking the contract.
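To see the class in use, here is a sketch of one request/reply exchange between two threads using two pipes. The scenario names (sellTicket, request, reply) are invented for the example, and Pipe is repeated, with InterruptedException declared, so that the block compiles on its own. Only Semaphore is used from java.util.concurrent.

```java
import java.util.concurrent.Semaphore;

public class PipeDemo {
    // The Pipe from the answer, repeated here so the example is self-contained.
    static class Pipe<E> {
        private E e;
        private final Semaphore read = new Semaphore(0);
        private final Semaphore write = new Semaphore(1);
        public final void put(final E e) throws InterruptedException {
            write.acquire();
            this.e = e;
            read.release();
        }
        public final E take() throws InterruptedException {
            read.acquire();
            E e = this.e;
            write.release();
            return e;
        }
    }

    /** Hypothetical exchange: a customer thread asks for a movie, the caller plays the agent. */
    static String sellTicket(final String movie) throws InterruptedException {
        final Pipe<String> request = new Pipe<String>();
        final Pipe<String> reply = new Pipe<String>();
        final String[] answer = new String[1];
        Thread customer = new Thread(() -> {
            try {
                request.put(movie);       // tell the agent which movie we want
                answer[0] = reply.take(); // wait for the agent's answer
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        customer.start();
        reply.put("ticket for " + request.take()); // agent: read the request, send the reply
        customer.join(); // join() makes the customer's write to answer[0] visible here
        return answer[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sellTicket("X"));
    }
}
```

Using one pipe per direction keeps requests and replies from being confused with each other.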
Finally, you could avoid the need to declare a local variable in the take() method by using try {} finally {} as follows:
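class Pipe<E> {
    // ...
    public final E take() throws InterruptedException {
        // acquire outside the try: if it is interrupted, write must not be released
        read.acquire();
        try {
            return e;
        } finally {
            write.release();
        }
    }
}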
Here, the point of this example is just to show a use of try/finally that goes unnoticed among inexperienced developers. Obviously, in this case, there's no real gain.
Oh damn, I've mostly finished your homework for you. In return, and to test your knowledge about semaphores, why don't you implement some of the other methods defined by the BlockingQueue contract? For example, you could implement an offer(E) method and a poll(long, TimeUnit)!
Good luck.
Think of it in terms of shared memory with a read/write lock.
Create a buffer to put the message.
The access to the buffer should be controlled by using a lock/semaphore.
Use this buffer for inter thread communication purpose.
Regards
PKV

Java's BlockingQueue design question

the method java.util.concurrent.BlockingQueue.add(E e)'s JavaDoc reads:
boolean add(E e)
Inserts the specified element into
this queue if it is possible to do so
immediately without violating capacity
restrictions, returning true upon
success and throwing an
IllegalStateException if no space is
currently available. When using a
capacity-restricted queue, it is
generally preferable to use offer.
My question is: will it ever return false? if not, why does this method return a boolean?
It seems weird to me. What is the design decision behind this?
Thanks for your knowledge!
Manuel
It follows the contract of Collection.add(E e) (since BlockingQueue is a subtype of Collection):
If a collection refuses to add a
particular element for any reason
other than that it already contains
the element, it must throw an
exception (rather than returning
false). This preserves the invariant
that a collection always contains the
specified element after this call
returns.
The design decision behind it is: fail fast. An IllegalStateException is thrown if the queue has a limited capacity and is full. IllegalStateException is a RuntimeException, so if it gets thrown, you probably have a fault in your application logic, or your application logic is not defensive enough. In other words: if you want to use a bounded queue, your application should deal with it properly (use offer instead).
I'm guessing it has a boolean return type because BlockingQueue is a subinterface of Queue, which also has a boolean add(E obj) method (which in turn is derived from Collection). Collection.add returns false only when the collection is left unchanged because it already contains the element and does not permit duplicates (a Set, for example); a full queue signals failure by throwing instead.
Thus, the answer to your question is that implementations of BlockingQueue will never return false.
The method returns a boolean because it overrides Collection#add(E e).
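A quick way to see both behaviors side by side (a sketch; the class and method names are made up): on a full bounded queue, offer reports failure through its return value while add reports it by throwing.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AddVsOfferDemo {
    /** Fills a one-slot queue, then tries offer() and add() on the full queue. */
    static String fillPastCapacity() {
        BlockingQueue<String> q = new ArrayBlockingQueue<String>(1);
        boolean added = q.add("a");     // succeeds: the queue had room
        boolean offered = q.offer("b"); // queue is full: returns false, no exception
        try {
            q.add("c");                 // queue is full: throws instead of returning false
            return added + "," + offered + ",no exception";
        } catch (IllegalStateException e) {
            return added + "," + offered + ",IllegalStateException";
        }
    }

    public static void main(String[] args) {
        System.out.println(fillPastCapacity());
    }
}
```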

BlockingQueue - blocked drainTo() methods

BlockingQueue has a method called drainTo(), but it does not block. I need a queue that blocks but also lets me retrieve all queued objects in a single call.
Object first = blockingQueue.take();
if ( blockingQueue.size() > 0 )
blockingQueue.drainTo( list );
I guess the above code will work but I'm looking for an elegant solution.
Are you referring to the comment in the JavaDoc:
Further, the behavior of this operation is undefined if the specified collection
is modified while the operation is in progress.
I believe that this refers to the collection list in your example:
blockingQueue.drainTo(list);
meaning that you cannot modify list at the same time you are draining from blockingQueue into list. However, the blocking queue internally synchronizes so that when drainTo is called, puts and (see note below) gets will block. If it did not do this, then it would not be truly Thread-safe. You can look at the source code and verify that drainTo is Thread-safe regarding the blocking queue itself.
Alternately, do you mean that when you call drainTo that you want it to block until at least one object has been added to the queue? In that case, you have little choice other than:
list.add(blockingQueue.take());
blockingQueue.drainTo(list);
to block until one or more items have been added, and then drain the entire queue into the collection list.
Note: As of Java 7, a separate lock is used for gets and puts. Put operations are now permitted during a drainTo (and a number of other take operations).
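The take-then-drain idiom wraps up naturally into a small helper. This is a sketch; the class BlockingDrain and the method takeAtLeastOne are names invented here, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BlockingDrain {
    /** Blocks until at least one element is available, then returns everything queued so far. */
    static <T> List<T> takeAtLeastOne(BlockingQueue<T> queue) throws InterruptedException {
        List<T> batch = new ArrayList<T>();
        batch.add(queue.take()); // blocks while the queue is empty
        queue.drainTo(batch);    // never blocks; grabs whatever else is already there
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<Integer>();
        queue.add(1);
        queue.add(2);
        queue.add(3);
        System.out.println(takeAtLeastOne(queue));
    }
}
```

The two calls are not one atomic operation, so another consumer may remove elements between take() and drainTo(); the batch is still valid, just possibly smaller.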
If you happen to use Google Guava, there's a nifty Queues.drain() method.
Drains the queue as BlockingQueue.drainTo(Collection, int), but if the
requested numElements elements are not available, it will wait for
them up to the specified timeout.
I found this pattern useful.
List<byte[]> blobs = new ArrayList<byte[]>();
if (queue.drainTo(blobs, batch) == 0) {
blobs.add(queue.take());
}
With the API available, I don't think you are going to get much more elegant, except that you can remove the size test.
If you are wanting to atomically retrieve a contiguous sequence of elements even if another removal operation coincides, I don't believe even drainTo guarantees that.
Source code:
public int drainTo(Collection<? super E> c) {
    //arg. check
    lock.lock();
    try {
        // ...
        for (n = 0 ; n != count ; n++) {
            c.add(items[n]);
            // ...
        }
        if (n > 0) {
            // ...
            notFull.signalAll();
        }
        return n;
    } finally {
        lock.unlock();
    }
}
ArrayBlockingQueue is eager to return 0. BTW, it could do it before taking the lock.
