Block before draining ArrayBlockingQueue

I find myself repeating this pattern and have often wondered whether it is idiomatic in Java or whether there is a better way to achieve this behaviour.
Problem: Given a producer/consumer setup, the consumer wants to process items in batches, so it uses drainTo(). However, drainTo() does not block: it only removes the items already in the queue and may return none. To avoid this, I precede the drain with a take(), which blocks until at least one item is available.
One problem I see with this approach, in many use cases and with one dataset in particular, is that the batch size is often irregular, alternating between 1 and N (1, N, 1, N). In general, is this a common way to solve the problem?
Example:
ArrayBlockingQueue<Foo> queue;
void produce() throws InterruptedException {
    while (true) {
        queue.put(createFoo());
    }
}
void consumeBatchSpin() {
    while (true) {
        List<Foo> batch = Lists.newLinkedList();
        queue.drainTo(batch);
        doSomething(batch);
        // the problem here is that if nothing is being produced, this loop will spin
    }
}
void consumeBatchTake() throws InterruptedException {
    while (true) {
        List<Foo> batch = Lists.newLinkedList();
        batch.add(queue.take()); // force at least one item to be there
        queue.drainTo(batch);
        doSomething(batch);
    }
}

Have you considered adding to a list and taking the whole list on get?
I have posted one here recently. It is undergoing code review here but my tests suggest it is robust.
Essentially, when you do a put you add your new element to the current list. When you do a get you get the whole list and atomically replace it with a new empty one.
No need to use drainTo and no spinning at all.
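A minimal sketch of that swap-on-get idea, assuming a plain lock is acceptable. This is not the code the answer links to: the class name BatchBuffer and its put/takeAll methods are made up for illustration, and unlike an ArrayBlockingQueue the buffer here is unbounded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: producers append to the current list; the
// consumer blocks until it is non-empty and then takes the whole list.
final class BatchBuffer<T> {
    private final Object lock = new Object();
    private List<T> current = new ArrayList<>();

    void put(T item) {
        synchronized (lock) {
            current.add(item);
            lock.notifyAll(); // wake a consumer waiting in takeAll()
        }
    }

    // Blocks until at least one item is available, then swaps in an empty list.
    List<T> takeAll() throws InterruptedException {
        synchronized (lock) {
            while (current.isEmpty()) {
                lock.wait();
            }
            List<T> batch = current;
            current = new ArrayList<>();
            return batch;
        }
    }
}
```

The consumer loop then becomes `doSomething(buffer.takeAll())`, with no spinning and no separate take() before a drain.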

Related

Java - concurrent clear of the list

I am trying to find a good way to achieve the following API:
void add(Object o);
void processAndClear();
The class would store the objects and upon calling processAndClear would iterate through the currently stored ones, process them somehow, and then clear the store. This class should be thread safe.
The obvious approach is to use locking, but I wanted to be more "concurrent". This is the approach I would use:
class Store{
private AtomicReference<CopyOnWriteArrayList<Object>> store = new AtomicReference<>(new CopyOnWriteArrayList <>());
void add(Object o){
store.get().add(o);
}
void processAndClear(){
CopyOnWriteArrayList<Object> objects = store.get();
store.compareAndSet(objects, new CopyOnWriteArrayList<>());
for (Object object : objects) {
//do sth
}
}
}
This would allow threads that try to add objects to proceed almost immediately without any locking/waiting for the clearing to complete. Is this more or less the correct approach?

Your above code is not thread-safe. Imagine the following:
Thread A is put on hold at add() right after store.get()
Thread B is in processAndClear(), replaces the list, processes all elements of the old one, then returns.
Thread A resumes and adds a new item to the now obsolete list that will never be processed.
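The interleaving above can be replayed on a single thread, which makes the lost element easy to see. This is only an illustration: LostAddDemo is a made-up name, and it uses getAndSet where the question uses get followed by compareAndSet, which does not change the outcome for this schedule.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;

// Replays the problematic schedule step by step so the result is deterministic.
final class LostAddDemo {
    // Returns how many elements were neither processed nor left in the live store.
    static int lostElements() {
        AtomicReference<List<Object>> store =
                new AtomicReference<>(new CopyOnWriteArrayList<>());

        // "Thread A": add() has executed store.get() but not yet add(o).
        List<Object> seenByA = store.get();

        // "Thread B": processAndClear() swaps the list and processes the old one.
        List<Object> old = store.getAndSet(new CopyOnWriteArrayList<>());
        int processed = old.size(); // 0, nothing had been added yet

        // "Thread A" resumes and adds to the list it fetched earlier.
        seenByA.add("late item");

        int stillReachable = store.get().size(); // 0, the live list never saw the item
        return 1 - processed - stillReachable;
    }

    public static void main(String[] args) {
        System.out.println("lost elements: " + lostElements()); // lost elements: 1
    }
}
```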
Probably the easiest solution here would be to use a LinkedBlockingQueue, which would also simplify the task a lot:
class Store{
final LinkedBlockingQueue<Object> queue = new LinkedBlockingQueue<>();
void add(final Object o) throws InterruptedException { // put() may be interrupted while waiting
queue.put(o); // blocks until there is free space in the optionally bounded queue
}
void processAndClear(){
Object element;
while ((element = queue.poll()) != null) { // does not block on empty list but returns null instead
doSomething(element);
}
}
}
Edit: How to do this with synchronized:
class Store{
final LinkedList<Object> queue = new LinkedList<>(); // final, so every thread always locks on the same object
void add(final Object o){
synchronized(queue) { // on the queue as this is the shared object in question
queue.add(o);
}
}
void processAndClear() {
final LinkedList<Object> elements = new LinkedList<>(); // temporary local list
synchronized(queue) { // here as well, as every access needs to be properly synchronized
elements.addAll(queue);
queue.clear();
}
for (Object e : elements) {
doSomething(e); // this is thread-safe as only this thread can access these now local elements
}
}
}
Why this is not a good idea
Although this is thread-safe, it is much slower than the concurrent version. Assume that you have a system with 100 threads that frequently call add, while one thread calls processAndClear. Then the following performance bottlenecks will occur:
If one thread calls add the other 99 are put on hold in the meantime.
During the first part of processAndClear all 100 threads are put on hold.
If you assume that those 100 adding threads have nothing else to do, you can easily show that the application runs at the same speed as a single-threaded application, minus the cost of synchronization. That means adding will effectively be slower with 100 threads than with 1. This is not the case if you use a concurrent collection as in the first example.
There will, however, be a minor performance gain for the processing thread, as doSomething can run on the old elements while new ones are added. But again, the concurrent example could be faster, as you could have multiple threads do the processing simultaneously.
In effect, synchronized can be used as well, but you automatically introduce performance bottlenecks, potentially causing the application to run slower than a single-threaded one and forcing you to run complicated performance tests. In addition, extending the functionality always carries a risk of introducing threading issues, because locking has to be done manually. A concurrent collection, in contrast, solves these problems without additional code, and the code can easily be changed or extended later on.

The class would store the objects and upon calling processAndClear would iterate through the currently stored ones, process them somehow, and then clear the store.
This seems like you should use a BlockingQueue for this task. Your add(...) method would add to the queue and your consumer would call take() which blocks waiting for the next item. The BlockingQueue (ArrayBlockingQueue is a typical implementation) takes care of all of the synchronization and signaling for you.
This means that you need neither a CopyOnWriteArrayList nor an AtomicReference. What you would lose is a collection that you can iterate through for reasons other than the ones your post currently describes.
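Roughly, the add/take arrangement could look like the sketch below. The class name QueueConsumerDemo, the capacity of 16 and the fixed item count are assumptions made to keep the example finite; a real consumer would usually loop until told to stop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class QueueConsumerDemo {
    // Starts one consumer, feeds it the given items, and returns what it processed.
    static List<String> run(String... items) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(16); // capacity chosen arbitrarily
        List<String> processed = Collections.synchronizedList(new ArrayList<>());
        int expected = items.length;

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < expected; i++) {
                    processed.add(queue.take()); // blocks until an item is available
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop
            }
        });
        consumer.start();

        for (String item : items) {
            queue.put(item); // the add(...) side: blocks only while the queue is full
        }
        consumer.join();
        return processed;
    }
}
```

There is no processAndClear step any more: each item is handled as soon as the consumer can take it, and the queue does all the locking and signalling.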

Is there an improved alternative to Java CopyOnWriteArrayList implementation and how can I request a change to Java spec?

CopyOnWriteArrayList almost has the behavior I want, and if unnecessary copies were removed it would be exactly what I am looking for. In particular, it could act exactly like ArrayList for adds made to the end of the ArrayList - i.e., there is no reason to actually make a new copy every single time which is so wasteful. It could just virtually restrict the end of the ArrayList to capture the snapshot for the readers, and update the end after the new items are added.
This enhancement seems like it would be worth having since for many applications the most common type of addition would be to the end of the ArrayList - which is even a reason for choosing to use an ArrayList to begin with.
There also would be no extra overhead, since the copy would be skipped only when appending; it would still have to check whether a resize is necessary, but ArrayList has to do that anyway.
Is there any alternative implementation or data structure that has this behavior without the unnecessary copies for additions at the end (i.e., thread-safe and optimized to allow frequent reads with writes only being additions at the end of the list)?
How can I submit a change request to request a change to the Java specification to eliminate copies for additions to the end of a CopyOnWriteArrayList (unless a re-size is necessary)?
I'd really like to see this changed in the core Java libraries rather than maintaining and using my own custom code.

Sounds like you're looking for a BlockingQueue such as ArrayBlockingQueue (or, if you need access to both ends, a BlockingDeque such as LinkedBlockingDeque).
You may also want a ConcurrentLinkedQueue, which uses a non-blocking (lock-free) algorithm and may therefore be faster in many circumstances. It's only a Queue (not a Deque), so you can only insert at the tail and remove at the head of the collection, but it sounds like that might be good for your use case. In exchange for the non-blocking algorithm, it has to use a linked list rather than an array internally, and that means more memory (including more garbage when you pop items) and worse memory locality. The algorithm also relies on a compare-and-set (CAS) loop, which means that while it's faster in the "normal" case, it can actually be slower under high contention, as each thread may need to try its CAS several times before it wins and is able to move forward.
My guess is that the reason lists don't get as much love in java.util.concurrent is that a list is an inherently racy data structure in most use cases other than iteration. For instance, something like if (!list.isEmpty()) { return list.get(0); } is racy unless it's surrounded by a synchronized block, in which case you don't need an inherently thread-safe structure. What you really need is a "list-type" interface that only allows operations at the ends, and that's exactly what Queue and Deque are.
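To make the isEmpty()/get(0) point concrete, here is a small sketch (HeadAccessDemo is just an illustrative name). On a queue the check and the access are a single call, so there is no window for another thread to empty the collection in between.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class HeadAccessDemo {
    // Replaces "if (!list.isEmpty()) return list.get(0);" with one call.
    static String peekHead(Queue<String> queue) {
        return queue.peek(); // head without removing it, or null if empty
    }

    // Same idea when the element should also be removed.
    static String takeHead(Queue<String> queue) {
        return queue.poll(); // head, removed in the same step, or null if empty
    }

    public static void main(String[] args) {
        Queue<String> queue = new ConcurrentLinkedQueue<>();
        queue.add("first");
        System.out.println(peekHead(queue)); // first
        System.out.println(takeHead(queue)); // first
        System.out.println(takeHead(queue)); // null
    }
}
```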

To answer your questions:
I'm not aware of an alternative implementation that is a fully functional list.
If your idea is truly viable, I can think of a number of ways to proceed:
You can submit "requests for enhancement" (RFE) through the Java Bugs Database. However, in this case I doubt that you will get a positive response. (Certainly, not a timely one!)
You could create an RFE issue on Guava or Apache Commons issues tracker. This might be more fruitful, though it depends on convincing them ...
You could submit a patch to the OpenJDK team with an implementation of your idea. I can't say what the result might be ...
You could submit a patch (as above) to Guava or Apache Commons via their respective issues trackers. This is the approach that is most likely to succeed, though it still depends on convincing "them" that it is technically sound, and "a good thing".
You could just put the code for your proposed alternative implementation on Github, and see what happens.
However, all of this presupposes that your idea is actually going to work. Based on the scant information you have provided, I'm doubtful. I suspect that there may be issues with incomplete encapsulation, concurrency and/or not implementing the List abstraction fully / correctly.
I suggest that you put your code on Github so that other people can take a good hard look at it.

there is no reason to actually make a new copy every single time which is so wasteful.
This is how it works: every mutation builds a new array and swaps it in for the previous one in a single atomic step. It is a key part of the thread-safety design that you always get a new array, even if all you do is replace an entry.
thread-safe and optimized to allow frequent reads with writes only being additions at the end of the list
CopyOnWriteArrayList is heavily optimised for reads. Most alternatives will be faster for writes but slower for reads, so you have to decide which one you really want.
You can write a custom data structure that is the best of both worlds, but it is no longer a generic solution, which is what CopyOnWriteArrayList and ArrayDeque provide.
How can I submit a change request to request a change to the Java specification to eliminate copies for additions to the end of a CopyOnWriteArrayList (unless a re-size is necessary)?
You can do this through the bugs database, but what you propose is a fundamental change in how the data structure works. I suggest proposing a new/different data structure which works the way you want. In the meantime I suggest implementing it yourself as a working example, as you will get what you want faster.
I would start with an AtomicReferenceArray, as this can be used to perform the low-level actions you need. The only problem is that it is not resizable, so you would need to determine the maximum size you would ever need.
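As a starting point, an append-only list on top of AtomicReferenceArray might look like the sketch below. It is one possible design under stated assumptions, not a drop-in List: the name AppendOnlyList is invented, appends are serialized with synchronized for simplicity, the capacity is fixed up front, and only add, get and size are provided.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

final class AppendOnlyList<T> {
    private final AtomicReferenceArray<T> elements;
    private volatile int size; // published element count; readers never look past it

    AppendOnlyList(int maxSize) {
        elements = new AtomicReferenceArray<>(maxSize);
    }

    // Writers are serialized; nothing is copied on append.
    synchronized boolean add(T item) {
        int index = size;
        if (index == elements.length()) {
            return false; // full: this sketch does not resize
        }
        elements.set(index, item); // store the element first...
        size = index + 1;          // ...then publish it with the volatile write
        return true;
    }

    int size() {
        return size;
    }

    // Lock-free read: any index below the published size is fully visible.
    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return elements.get(index);
    }
}
```

A reader that captures size() once and walks indexes below it gets the same kind of snapshot view that the question asks for, since existing slots are never rewritten.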

CopyOnWriteArrayList has a performance drawback: it copies the list's underlying array on every write operation, and that copying makes writes slow. CopyOnWriteArrayList is therefore best suited to a List with a high read rate and a low write rate.
Eventually I started coding my own implementation using java.util.concurrent.locks.ReadWriteLock. My implementation simply maintains an object-level ReadWriteLock instance, acquiring the read lock in read operations and the write lock in write operations. The code looks like this.
public class ConcurrentList< T > implements List< T >
{
private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();
private final List< T > list;
public ConcurrentList( List<T> list )
{
this.list = list;
}
public boolean remove( Object o )
{
readWriteLock.writeLock().lock();
boolean ret;
try
{
ret = list.remove( o );
}
finally
{
readWriteLock.writeLock().unlock();
}
return ret;
}
public boolean add( T t )
{
readWriteLock.writeLock().lock();
boolean ret;
try
{
ret = list.add( t );
}
finally
{
readWriteLock.writeLock().unlock();
}
return ret;
}
public void clear()
{
readWriteLock.writeLock().lock();
try
{
list.clear();
}
finally
{
readWriteLock.writeLock().unlock();
}
}
public int size()
{
readWriteLock.readLock().lock();
try
{
return list.size();
}
finally
{
readWriteLock.readLock().unlock();
}
}
public boolean contains( Object o )
{
readWriteLock.readLock().lock();
try
{
return list.contains( o );
}
finally
{
readWriteLock.readLock().unlock();
}
}
public T get( int index )
{
readWriteLock.readLock().lock();
try
{
return list.get( index );
}
finally
{
readWriteLock.readLock().unlock();
}
}
//etc
}
The performance improvement observed was notable.
Total time taken for 5000 reads + 5000 writes (read/write ratio 1:1) by 10 threads:
ArrayList - 16450 ns (not thread safe)
ConcurrentList - 20999 ns
Vector - 35696 ns
CopyOnWriteArrayList - 197032 ns
Please follow this link for more info about the test case used to obtain the above results.
However, in order to avoid ConcurrentModificationException when using the Iterator, I just created a copy of the current List and returned the iterator of that. This means this list does not return an Iterator which can modify the original List. Well, for me, this is OK for the moment.
public Iterator<T> iterator()
{
readWriteLock.readLock().lock();
try
{
return new ArrayList<T>( list ).iterator();
}
finally
{
readWriteLock.readLock().unlock();
}
}
After some googling I found out that CopyOnWriteArrayList has a similar implementation, as it does not return an Iterator which can modify the original List. The Javadoc says,
The returned iterator provides a snapshot of the state of the list when the iterator was constructed. No synchronization is needed while traversing the iterator. The iterator does NOT support the remove method.
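The snapshot behaviour the Javadoc describes is easy to check with a few lines. SnapshotIteratorDemo and its method names are only for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class SnapshotIteratorDemo {
    // Number of elements an iterator sees when the list grows after it was created.
    static int seenBySnapshot() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        Iterator<String> it = list.iterator(); // snapshot of [a, b]
        list.add("c");                         // later write, not part of the snapshot
        int seen = 0;
        while (it.hasNext()) {
            it.next();
            seen++;
        }
        return seen;
    }

    // True if the iterator rejects remove(), as the quoted Javadoc says.
    static boolean removeIsRejected() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a"));
        Iterator<String> it = list.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(seenBySnapshot());   // 2
        System.out.println(removeIsRejected()); // true
    }
}
```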

Incremental Future of list extensions

I essentially have a Future<List<T>> that is fetched in batches from the server. For some clients I'd like to provide incremental results while it loads in addition to the whole collection when future is fulfilled.
Is there a common Future extension defined somewhere for this? What typical patterns/combinators exist for such futures?
I assume that given IncrementalListFuture<T> I can easily define map operation. What else comes to your mind?

Is there a common Future extension defined somewhere for this?
I assume you are talking about incremental results from an ExecutorService. You should consider using an ExecutorCompletionService which allows you to be informed as soon as one of the Future objects is get-able.
To quote from the javadocs:
CompletionService<Result> ecs = new ExecutorCompletionService<Result>(e);
for (Callable<Result> s : solvers) {
ecs.submit(s);
}
int n = solvers.size();
for (int i = 0; i < n; ++i) {
// this waits for one of the futures to finish and provide a result
Future<Result> future = ecs.take();
Result result = future.get();
if (result != null) {
// do something with the result
}
}
Sorry. I initially misread the question and thought that you were asking about a List<Future<?>>. It may be that you could refactor your code to actually return a number of Futures so I'll leave this for posterity.
I would not pass back the list in this case in a Future. You aren't going to be able to get the return until the job finishes.
If possible, I would pass in some sort of BlockingQueue so both the caller and the thread can access it:
final BlockingQueue<T> queue = new LinkedBlockingQueue<T>();
// build out job with the queue
threadPool.submit(new SomeJob(queue));
threadPool.shutdown();
// now we can consume from the queue as it is built:
while (true) {
T result = queue.take();
// you could use some constant result object to mean that the job finished
if (result == SOME_END_OBJECT) {
break;
}
// provide intermediate results
}
You could also have some sort of SomeJob.take() method which calls through to a BlockingQueue defined inside of your job class.
// the blocking queue in this case is hidden inside your job object
T result = someJob.take();
...

Here's what I would do:
In the thread that populates the List, make it thread-safe by wrapping the list using Collections.synchronizedList
Make the list publicly available, but not modifiable, by adding a public method to the thread which returns the list wrapped by Collections.unmodifiableList
Instead of giving clients a Future<List<T>>, give them a handle to the thread, or some kind of wrapper of it, so that they can call the public method above.
Alternatively, as Gray has suggested, BlockingQueues are great for thread coordination like this. This may require more changes to your client code, however.

To answer my own question: there has been lots of development in this area recently. Among most used are: Play iteratees (http://www.playframework.org/documentation/2.0/Iteratees) and Rx for .NET (http://msdn.microsoft.com/en-us/data/gg577609.aspx)
Instead of Future they define something like:
interface Observable<T> {
Disposable subscribe(Observer<T> observer);
}
interface Observer<T> {
void onCompleted();
void onError(Exception error);
void onNext(T value);
}
and lots of combinators.

Alternatively to Observables you can take a look at twitter's approach.
They use Spool, which is an asynchronous version of the Stream.
Basically it is a simple trait similar to the List
trait Spool[+A] {
def head: A
/**
* The (deferred) tail of the spool. Invalid for empty spools.
*/
def tail: Future[Spool[A]]
}
that allows you to do functional stuff like map, filter and foreach on top of it.

Future is really designed to return a single (atomic) result, not for communicating intermediate results in this manner. What you will really want to do is to use multiple futures, one per batch.
We have a similar requirement where we have a bunch of things that we need to get from different remote servers, and each will return at a different time. We don't want to wait until the last one has returned, but rather process them in the order they return. For this we created the AsyncCompleter, which takes an Iterable<Callable<T>> and returns an Iterable<T> that blocks on iteration, completely abstracting usage of the Future interface.
If you look at how that class is implemented, you'll see how to use a CompletionService to receive results from an Executor in the order in which they become available, if you need to build this for yourself.
edit: just saw that the second half of Gray's answer is similar, basically using an ExecutorCompletionService

How can I atomically "enqueue if free space OR dequeue then enqueue" for a Java queue / list?

I've got a requirement for a list in Java with a fixed capacity but which always allows threads to add items to the start. If it's full it should remove an item from the end to make space. No other process will remove items, but other processes will wish to iterate over the items.
Is there something in the JDK which would allow me to do this atomically?
My current plan is just to use some existing threadsafe Collection (e.g. LinkedBlockingQueue) and further synchronise on it when I check capacity / add / remove. Would that work as well?
Thanks.

Your idea would work but would involve taking out multiple locks (see example below). Given you need to synchronize multiple operations when adding data you may as well wrap a LinkedList implementation of a Queue to avoid the overhead of additional locks.
// Create queue with fixed capacity.
Queue<Item> queue = new LinkedBlockingQueue<Item>(1000);
...
// Attempt to add item to queue, removing items if required.
synchronized(queue) { // First lock
while (!queue.offer(item)) { // Second lock
queue.poll(); // Third lock (poll(), because the Queue interface does not declare take())
}
}
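The single-lock alternative mentioned above, wrapping a LinkedList, might be sketched as follows. BoundedRecentList and its method names are hypothetical; the behaviour follows the question (add at the start, evict from the end, readers iterate).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class BoundedRecentList<T> {
    private final LinkedList<T> items = new LinkedList<>();
    private final int capacity;

    BoundedRecentList(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Check, evict and insert all happen under one lock, so together they are atomic.
    synchronized void addFirst(T item) {
        if (items.size() == capacity) {
            items.removeLast(); // drop the oldest element
        }
        items.addFirst(item);
    }

    // Readers iterate over a copy, so they never race with writers.
    synchronized List<T> snapshot() {
        return new ArrayList<>(items);
    }
}
```

Copying on every read is a deliberate simplification; if readers are frequent and the list is large, that cost would need measuring.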

I'm working in an old version of Java (yes 1.3, I have no choice), so even if it's there in later Javas I can't use it. So I coded along these lines:
public class Fifo {
private LinkedList fifoContents = new LinkedList();
public synchronized void put(Object element) {
if ( fifoContents.size() > 100){
fifoContents.removeFirst();
logger.logWarning("*** Backlog, discarding message ");
}
fifoContents.add (element);
return;
}
public synchronized Object get() throws NoSuchElementException {
return fifoContents.removeFirst();
}
}

You may be able to get away with just testing/removing/inserting without additional locks:
class DroppingQueue<E> extends ArrayBlockingQueue<E> {
    DroppingQueue(int capacity) {
        super(capacity); // ArrayBlockingQueue has no no-argument constructor
    }
    @Override
    public boolean add(E item) {
        while (!offer(item)) {
            poll(); // non-blocking removal; take() would force handling InterruptedException
        }
        return true;
    }
}
Although this method is not synchronized, offer and poll are each still thread-safe, so the worst that can happen is that thread #1 calls offer and finds the queue full, thread #2 does the same, and both remove items, temporarily reducing the number of items to less than the maximum before both threads successfully add their items. This will probably not cause serious problems.

There's no such class in JDK.
If you are going to implement such collection, you might want to use array with floating head/tail pointers - since you have fixed size you don't need linked list at all.
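A fixed-size array with moving head and count could be sketched like this. The RingBuffer name and the coarse synchronized locking are assumptions for illustration; snapshot() returns elements oldest first, so a caller wanting newest first would reverse it.

```java
import java.util.ArrayList;
import java.util.List;

final class RingBuffer<T> {
    private final Object[] slots;
    private int head;  // index of the oldest element
    private int count; // number of stored elements

    RingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        slots = new Object[capacity];
    }

    // Adds an element, overwriting the oldest one when the buffer is full.
    synchronized void add(T item) {
        int tail = (head + count) % slots.length; // equals head when full
        slots[tail] = item;
        if (count == slots.length) {
            head = (head + 1) % slots.length; // the oldest element was overwritten
        } else {
            count++;
        }
    }

    // Copy of the contents, oldest to newest.
    synchronized List<T> snapshot() {
        List<T> out = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            @SuppressWarnings("unchecked")
            T item = (T) slots[(head + i) % slots.length];
            out.add(item);
        }
        return out;
    }
}
```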

Is this java code thread-safe?

I am planning to use this schema in my application, but I was not sure whether this is safe.
To give a little background, a bunch of servers will compute results of sub-tasks that belong to a single task and report them back to the central server. This piece of code is used to register the results, and also to check whether all the subtasks for the task have completed and, if so, report that fact only once.
The important point is that every task must be reported once and only once, as soon as it is completed (all subTaskResults are set).
Can anybody help? Thank you! (Also, if you have a better idea to solve this problem, please let me know!)
*Note that I simplified the code for brevity.
Solution I
class Task {
    // Populate with a bunch of (id, new AtomicReference()) pairs
    // Actual app uses read only HashMap
    Map<Id, AtomicReference<SubTaskResult>> subtasks = populatedMap();
    Semaphore permission = new Semaphore(1);
    public Task set(Id id, SubTaskResult result) {
        // null check omitted
        subtasks.get(id).set(result);
        return check() ? this : null;
    }
    private boolean check() {
        for (AtomicReference<SubTaskResult> ref : subtasks.values()) {
            if (ref.get() == null) {
                return false;
            }
        }
        return permission.tryAcquire();
    }
}
Stephen C kindly suggested to use a counter. Actually, I have considered that once, but I reasoned that the JVM could reorder the operations and thus, a thread can observe a decremented counter (by another thread) before the result is set in AtomicReference (by that other thread).
*EDIT: I now see this is thread safe. I'll go with this solution. Thanks, Stephen!
Solution II
class Task {
    // Populate with a bunch of (id, new AtomicReference()) pairs
    // Actual app uses read only HashMap
    Map<Id, AtomicReference<SubTaskResult>> subtasks = populatedMap();
    AtomicInteger counter = new AtomicInteger(subtasks.size());
    public Task set(Id id, SubTaskResult result) {
        // null check omitted
        subtasks.get(id).set(result);
        // In the actual app, if !compareAndSet(null, result) return null;
        return check() ? this : null;
    }
    private boolean check() {
        return counter.decrementAndGet() == 0;
    }
}

I assume that your use-case is that there are multiple threads calling set, but for any given value of id, the set method will be called only once. I'm also assuming that populatedMap creates the entries for all used id values, and that subtasks and permission are really private.
If so, I think that the code is thread-safe.
Each thread should see the initialized state of the subtasks Map, complete with all keys and all AtomicReference references. This state never changes, so subtasks.get(id) will always give the right reference. The set(result) call operates on an AtomicReference, so the subsequent get() method calls in check() will give the most up-to-date values ... in all threads. Any potential races with multiple threads calling check seem to sort themselves out.
However, this is a rather complicated solution. A simpler solution would be to use a concurrent counter; e.g. replace the Semaphore with an AtomicInteger and use decrementAndGet instead of repeatedly scanning the subtasks map in check.
In response to this comment in the updated solution:
Actually, I have considered that once, but I reasoned that the JVM could reorder the operations and thus, a thread can observe a decremented counter (by another thread) before the result is set in AtomicReference (by that other thread).
The AtomicInteger and AtomicReference by definition are atomic. Any thread that tries to access one is guaranteed to see the "current" value at the time of the access.
In this particular case, each thread calls set on the relevant AtomicReference before it calls decrementAndGet on the AtomicInteger. This cannot be reordered. Actions performed by a thread are performed in order. And since these are atomic actions, the effects will be visible to other threads in order as well.
In other words, it should be thread-safe ... AFAIK.

The atomicity guaranteed (per class documentation) explicitly for AtomicReference.compareAndSet extends to set and get methods (per package documentation), so in that regard your code appears to be thread-safe.
I am not sure, however, why you have Semaphore.tryAcquire as a side-effect there, but without complementary code to release the semaphore, that part of your code looks wrong.

The second solution does provide a thread-safe latch, but it's vulnerable to calls to set() that provide an ID that's not in the map (which would trigger a NullPointerException) or to more than one call to set() with the same ID. The latter would mistakenly decrement the counter too many times and falsely report completion when there are presumably other subtask IDs for which no result has been submitted. My criticism isn't with regard to the thread safety, but rather to the invariant maintenance; the same flaw would be present even without the thread-related concern.
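One way to guard against both problems is to reject unknown IDs explicitly and to decrement only when compareAndSet(null, result) succeeds, which is the check the question mentions in a comment. The sketch below assumes long IDs and String results purely to stay self-contained; GuardedTask is a made-up name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

final class GuardedTask {
    // Filled once in the constructor and never modified afterwards.
    private final Map<Long, AtomicReference<String>> subtasks = new HashMap<>();
    private final AtomicInteger remaining;

    GuardedTask(long... ids) {
        for (long id : ids) {
            subtasks.put(id, new AtomicReference<>());
        }
        remaining = new AtomicInteger(subtasks.size());
    }

    // Returns true for exactly one call: the one that stores the last missing result.
    boolean set(long id, String result) {
        Objects.requireNonNull(result, "result");
        AtomicReference<String> slot = subtasks.get(id);
        if (slot == null) {
            throw new IllegalArgumentException("unknown subtask id " + id);
        }
        if (!slot.compareAndSet(null, result)) {
            return false; // duplicate report: do not count it twice
        }
        return remaining.decrementAndGet() == 0;
    }
}
```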
Another way to solve this problem is with AbstractQueuedSynchronizer, but it's somewhat gratuitous: you can implement a stripped-down counting semaphore, where each call to set() would call releaseShared(), decrementing the counter via a spin on compareAndSetState(), and tryAcquireShared() would only succeed when the count is zero. That's more or less what you implemented above with the AtomicInteger, but you'd be reusing a facility that offers more capabilities you can use for other portions of your design.
To flesh out the AbstractQueuedSynchronizer-based solution requires adding one more operation to justify the complexity: being able to wait on the results from all the subtasks to come back, such that the entire task is complete. That's Task#awaitCompletion() and Task#awaitCompletion(long, TimeUnit) in the code below.
Again, it's possibly overkill, but I'll share it for the purpose of discussion.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
final class Task
{
private static final class Sync extends AbstractQueuedSynchronizer
{
public Sync(int count)
{
setState(count);
}
@Override
protected int tryAcquireShared(int ignored)
{
return 0 == getState() ? 1 : -1;
}
@Override
protected boolean tryReleaseShared(int ignored)
{
int current;
do
{
current = getState();
if (0 == current)
return true;
}
while (!compareAndSetState(current, current - 1));
return 1 == current;
}
}
public Task(int count)
{
if (count < 0)
throw new IllegalArgumentException();
sync_ = new Sync(count);
}
public boolean set(int id, Object result)
{
// Ensure that "id" refers to an incomplete task. Doing so requires
// additional synchronization over the structure mapping subtask
// identifiers to results.
// Store result somehow.
return sync_.releaseShared(1);
}
public void awaitCompletion()
throws InterruptedException
{
sync_.acquireSharedInterruptibly(0);
}
public boolean awaitCompletion(long time, TimeUnit unit)
throws InterruptedException
{
return sync_.tryAcquireSharedNanos(0, unit.toNanos(time)); // false if the timeout elapsed first
}
private final Sync sync_;
}

I have a weird feeling reading your example program, but it depends on the larger structure of your program what to do about that. A set function that also checks for completion is almost a code smell. :-) Just a few ideas.
If you have synchronous communication with your servers, you might use an ExecutorService with as many threads as there are servers to do the communication. From this you get a bunch of Futures, and you can naturally proceed with your calculation: the get calls will block at the moment the result is needed but not yet there.
If you have asynchronous communication with the servers, you might also use a CountDownLatch after submitting the task to the servers. The await call blocks the main thread until all subtasks have completed, and other threads can receive the results and call countDown for each received result.
With all these methods you don't need special threadsafety measures other than that the concurrent storing of the results in your structure is threadsafe. And I bet there are even better patterns for this.
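The CountDownLatch variant could look roughly like this. It is a sketch, not the poster's code: LatchDemo, the thread-per-subtask setup and the string results are made up for illustration, and in the real system the countDown() calls would come from whichever threads receive the server responses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

final class LatchDemo {
    // Simulates subtaskCount asynchronous results and waits until all have arrived.
    static Map<Integer, String> collect(int subtaskCount) throws InterruptedException {
        Map<Integer, String> results = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(subtaskCount);

        for (int i = 0; i < subtaskCount; i++) {
            final int id = i;
            new Thread(() -> {
                results.put(id, "result-" + id); // store the result first...
                done.countDown();                // ...then signal that it is available
            }).start();
        }

        done.await(); // returns once every subtask has counted down
        return results;
    }
}
```

Because each result is stored before its countDown(), everything in the map is visible to the thread that returns from await(), and completion is observed exactly once by that thread.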
