Concurrent Set Queue - java

Maybe this is a silly question, but I cannot seem to find an obvious answer.
I need a concurrent FIFO queue that contains only unique values. Attempting to add a value that already exists in the queue simply ignores that value. Which, if not for the thread safety, would be trivial. Is there a data structure in Java, or maybe a code snippet on the interwebs, that exhibits this behavior?

If you want better concurrency than full synchronization, there is one way I know of to do it, using a ConcurrentHashMap as the backing map. The following is a sketch only.
public final class ConcurrentHashSet<E> extends ForwardingSet<E>
implements Set<E>, Queue<E> {
private enum Dummy { VALUE }
private final ConcurrentMap<E, Dummy> map;
ConcurrentHashSet(ConcurrentMap<E, Dummy> map) {
super(map.keySet());
this.map = Preconditions.checkNotNull(map);
}
@Override public boolean add(E element) {
return map.put(element, Dummy.VALUE) == null;
}
@Override public boolean addAll(Collection<? extends E> newElements) {
// just the standard implementation
boolean modified = false;
for (E element : newElements) {
modified |= add(element);
}
return modified;
}
@Override public boolean offer(E element) {
return add(element);
}
@Override public E remove() {
E polled = poll();
if (polled == null) {
throw new NoSuchElementException();
}
return polled;
}
@Override public E poll() {
for (E element : this) {
// Not convinced that removing via iterator is viable (check this?)
if (map.remove(element) != null) {
return element;
}
}
return null;
}
@Override public E element() {
return iterator().next();
}
@Override public E peek() {
Iterator<E> iterator = iterator();
return iterator.hasNext() ? iterator.next() : null;
}
}
All is not sunshine with this approach. We have no decent way to select a head element other than using the backing map's entrySet().iterator().next(), the result being that the map gets more and more unbalanced as time goes on. This unbalancing is a problem both due to greater bucket collisions and greater segment contention.
Note: this code uses Guava in a few places.

There's not a built-in collection that does this. There are some concurrent Set implementations that could be used together with a concurrent Queue.
For example, an item is added to the queue only after it was successfully added to the set, and each item removed from the queue is removed from the set. In this case, the contents of the queue, logically, are really whatever is in the set, and the queue is just used to track the order and provide efficient take() and poll() operations found only on a BlockingQueue.
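A rough sketch of that composition (the class name and the use of ConcurrentHashMap.newKeySet(), a Java 8 addition, are my own choices here, not something from the answer):

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public final class UniqueBlockingQueue<E> {
    private final Set<E> pending = ConcurrentHashMap.newKeySet();
    private final LinkedBlockingQueue<E> queue = new LinkedBlockingQueue<>();

    // Add only if not already pending; duplicates are silently ignored.
    public boolean offer(E e) {
        if (!pending.add(e)) {
            return false;
        }
        queue.add(e);          // unbounded queue, so this cannot fail here
        return true;
    }

    // Blocks until an element is available; removing it from the set
    // makes the same value eligible to be offered again.
    public E take() throws InterruptedException {
        E e = queue.take();
        pending.remove(e);
        return e;
    }
}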

I would use a synchronized LinkedHashSet until there was enough justification to consider alternatives. The primary benefit that a more concurrent solution could offer is lock splitting.
The simplest concurrent approach would be a ConcurrentHashMap (acting as a set) and a ConcurrentLinkedQueue. The ordering of operations would provide the desired constraint. An offer() would first perform a CHM#putIfAbsent() and, if successful, insert into the CLQ. A poll() would take from the CLQ and then remove the element from the CHM. This means that an entry is considered to be in the queue only if it is in the map; the CLQ provides the ordering. The performance could then be tuned by increasing the map's concurrencyLevel. If you can tolerate additional raciness, then a cheap CHM#get() could act as a reasonable precondition (though it may reflect a slightly stale view).
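A minimal sketch of that offer()/poll() ordering, assuming the described CHM-as-set plus CLQ pair (class and method names are illustrative only):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

public final class DedupConcurrentQueue<E> {
    private static final Object PRESENT = new Object();
    private final ConcurrentMap<E, Object> map = new ConcurrentHashMap<>();
    private final ConcurrentLinkedQueue<E> queue = new ConcurrentLinkedQueue<>();

    public boolean offer(E e) {
        // putIfAbsent() is the atomic gate: only the thread that wins it enqueues.
        if (map.putIfAbsent(e, PRESENT) != null) {
            return false;                     // duplicate, ignored
        }
        queue.add(e);
        return true;
    }

    public E poll() {
        E e = queue.poll();                   // the CLQ supplies the FIFO order
        if (e != null) {
            map.remove(e);                    // an element "is in the queue" only while it is in the map
        }
        return e;
    }

    // Cheap but possibly stale precondition check, as mentioned above.
    public boolean mightContain(E e) {
        return map.containsKey(e);
    }
}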

A java.util.concurrent.ConcurrentLinkedQueue gets you most of the way there.
Wrap the ConcurrentLinkedQueue with your own class that checks for the uniqueness of an add. Your code has to be thread safe.

What do you mean by a concurrent queue with Set semantics? If you mean a truly concurrent structure (as opposed to a thread-safe structure) then I would contend that you are asking for a pony.
What happens, for instance, if you call put(element) and detect that something is already there, but that something is removed immediately afterwards? For instance, what does it mean in your case if offer(element) || queue.contains(element) returns false?
These kinds of things often need to be thought about slightly differently in a concurrent world, as nothing is as it seems unless you stop the world (lock it down). Otherwise you are usually looking at something in the past. So, what are you actually trying to do?

Perhaps extend ArrayBlockingQueue. In order to get access to the (package-access) lock, I had to put my sub-class within the same package. Caveat: I haven't tested this.
package java.util.concurrent;
import java.util.Collection;
import java.util.concurrent.locks.ReentrantLock;
public class DeDupingBlockingQueue<E> extends ArrayBlockingQueue<E> {
public DeDupingBlockingQueue(int capacity) {
super(capacity);
}
public DeDupingBlockingQueue(int capacity, boolean fair) {
super(capacity, fair);
}
public DeDupingBlockingQueue(int capacity, boolean fair, Collection<? extends E> c) {
super(capacity, fair, c);
}
@Override
public boolean add(E e) {
final ReentrantLock lock = this.lock;
lock.lock();
try {
if (contains(e)) return false;
return super.add(e);
} finally {
lock.unlock();
}
}
@Override
public boolean offer(E e) {
final ReentrantLock lock = this.lock;
lock.lock();
try {
if (contains(e)) return true;
return super.offer(e);
} finally {
lock.unlock();
}
}
@Override
public void put(E e) throws InterruptedException {
final ReentrantLock lock = this.lock;
lock.lockInterruptibly(); //Should this be lock.lock() instead?
try {
if (contains(e)) return;
super.put(e); //if it blocks, it does so without holding the lock.
} finally {
lock.unlock();
}
}
@Override
public boolean offer(E e, long timeout, TimeUnit unit) throws InterruptedException {
final ReentrantLock lock = this.lock;
lock.lock();
try {
if (contains(e)) return true;
return super.offer(e, timeout, unit); //if it blocks, it does so without holding the lock.
} finally {
lock.unlock();
}
}
}

A simple answer for a queue of unique objects can be as follows:
import java.util.concurrent.ConcurrentLinkedQueue;
public class FinalQueue {
class Bin {
private int a;
private int b;
public Bin(int a, int b) {
this.a = a;
this.b = b;
}
@Override
public int hashCode() {
return a * b;
}
public String toString() {
return a + ":" + b;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Bin other = (Bin) obj;
if ((a != other.a) || (b != other.b))
return false;
return true;
}
}
private ConcurrentLinkedQueue<Bin> queue;
public FinalQueue() {
queue = new ConcurrentLinkedQueue<Bin>();
}
public synchronized void enqueue(Bin ipAddress) {
if (!queue.contains(ipAddress))
queue.add(ipAddress);
}
public Bin dequeue() {
return queue.poll();
}
public String toString() {
return "" + queue;
}
/**
* @param args
*/
public static void main(String[] args) {
FinalQueue queue = new FinalQueue();
Bin a = queue.new Bin(2,6);
queue.enqueue(a);
queue.enqueue(queue.new Bin(13, 3));
queue.enqueue(queue.new Bin(13, 3));
queue.enqueue(queue.new Bin(14, 3));
queue.enqueue(queue.new Bin(13, 9));
queue.enqueue(queue.new Bin(18, 3));
queue.enqueue(queue.new Bin(14, 7));
Bin x= queue.dequeue();
System.out.println(x.a);
System.out.println(queue.toString());
System.out.println("Dequeue..." + queue.dequeue());
System.out.println("Dequeue..." + queue.dequeue());
System.out.println(queue.toString());
}
}

Related

ArrayBlockingQueue in which queue head is removed if the queue is full while adding an element

I am trying to write a simple queue like ArrayBlockingQueue in which the head of the queue will be removed if the queue is full while adding an element. The class should have just the public methods below:
To get the size of the Queue
To get an element from the head of the queue. If no element available block.
To add an element at the tail of the queue
Can someone review the below code and let me know if there is a better way of doing this?
public class CircularArrayNonBlockingQueue<E> {
private ArrayBlockingQueue<E> blockingQueue;
public CircularArrayNonBlockingQueue(int size) {
blockingQueue = new ArrayBlockingQueue<>(size);
}
public synchronized int size() {
return blockingQueue.size();
}
public synchronized void add(E element) {
if(blockingQueue.remainingCapacity() <= 0) {
blockingQueue.poll();
}
blockingQueue.add(element);
}
public synchronized E poll() {
return blockingQueue.poll();
}
}
EDIT
Based on the discussion in the comments, I don't need to make all the methods synchronized. The updated code is below:
public class CircularNonBlockingQueue<E> {
private final ArrayBlockingQueue<E> blockingQueue;
public CircularNonBlockingQueue(int size) {
blockingQueue = new ArrayBlockingQueue<>(size);
}
public int size() {
return blockingQueue.size();
}
public synchronized void add(E element) {
if(blockingQueue.remainingCapacity() <= 0) {
blockingQueue.poll();
}
blockingQueue.add(element);
}
public E take() throws InterruptedException {
return blockingQueue.take();
}
}
Having a thread-safe backing collection does not necessarily make a correct program. When only your add method is synchronized, the take() method may run concurrently with it, so it is possible that after your if(blockingQueue.remainingCapacity() <= 0) test within add, a concurrently running take() removes an element, and the poll() within add then removes an element unnecessarily. This is observably different from the situation where add() completes before the take(), as the consuming thread would receive a different item. In other words, the effect would be as if add sometimes removed not the oldest item, but the second oldest one.
On the other hand, if you use synchronized for all of your methods consistently, there is no need to have a thread-safe backend collection:
import java.util.ArrayDeque;
public class CircularBlockingQueue<E> {
private final ArrayDeque<E> blockingQueue;
private final int maxSize;
public CircularBlockingQueue(int size) {
if(size<1) throw new IllegalArgumentException("size == "+size);
blockingQueue = new ArrayDeque<>(size);
maxSize = size;
}
public synchronized int size() {
return blockingQueue.size();
}
public synchronized void add(E element) {
if(blockingQueue.size() == maxSize) {
blockingQueue.poll();
}
blockingQueue.add(element);
notify();
}
public synchronized E take() throws InterruptedException {
while(blockingQueue.isEmpty()) wait();
return blockingQueue.remove();
}
}
However, if you can live with weaker guarantees regarding the oldest element, you can use a BlockingQueue and don’t need any synchronized:
public class CircularBlockingQueue<E> {
private final ArrayBlockingQueue<E> blockingQueue;
public CircularBlockingQueue(int size) {
blockingQueue = new ArrayBlockingQueue<>(size);
}
public int size() {
return blockingQueue.size();
}
public void add(E element) {
while(!blockingQueue.offer(element)) {
blockingQueue.poll();
}
}
public E take() throws InterruptedException {
return blockingQueue.take();
}
}
It must be noted that neither of these solutions provides “fairness”. So if the number of producer and consumer threads is large compared to the queue’s capacity, there is the risk that producers repeatedly remove items without reactivating threads blocked in take(). So you should always ensure to have a sufficiently large capacity.

Use value if non-null, otherwise wait and atomically get it, repeat in a loop

Suppose an object has a field that changes from null to non-null and back again, depending on the operation of one thread.
A second thread shall lazily take some action whenever it happens to get hold of a non-null value. In particular the second thread shall wait until the value switches to non-null. If it falls out of the wait, I want to be sure that it has a non-null in its hand.
This does not seem like a queue situation, because the second thread will not take the element away, it just uses it if one happens to be available.
It also does not fit for semaphore use, because again it would not .acquire() a permit.
Rather, it is reminiscent of a compare-and-get with a built-in wait, but that does not seem to exist.
Is there a predefined device for this in java.util.concurrent that I have failed to identify? How can this be done?
This is similar but does not have an accepted answer or one that would help here.
Here's an implementation relying on ReentrantLock to manage a volatile field. This borrows heavily from the double-checked locking idiom, but instead of creating the value itself, a read operation waits on a condition to signal that a value has been set.
The get() method is overloaded with a version that accepts a timeout. Both versions are interruptible.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
public class BlockingRef<V> {
private final Lock lock = new ReentrantLock(true);
private final Condition signal = lock.newCondition();
private volatile V value;
public BlockingRef() {
this(null);
}
public BlockingRef(V initialValue) {
this.value = initialValue;
}
public final void set(V value) {
lock.lock();
try {
this.value = value;
signal.signalAll();
} finally {
lock.unlock();
}
}
public final V get() throws InterruptedException {
V result = value;
if (result == null) {
lock.lockInterruptibly();
try {
for (result = value; result == null; result = value)
signal.await();
} finally {
lock.unlock();
}
}
return result;
}
public final V get(long time, TimeUnit unit)
throws TimeoutException, InterruptedException
{
V result = value;
if (result == null) {
long start = System.nanoTime();
if (!lock.tryLock(time, unit)) throw new TimeoutException();
try {
time = unit.toNanos(time);
for (result = value; result == null; result = value) {
long wait = time - (System.nanoTime() - start);
if (wait <= 0) throw new TimeoutException();
signal.await(wait, TimeUnit.NANOSECONDS);
}
} finally {
lock.unlock();
}
}
return result;
}
@Override
public String toString() {
return String.valueOf(value);
}
}
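A small usage sketch (using a Java 8 lambda for brevity; the producer/consumer split and the printed value are purely illustrative):

import java.util.concurrent.TimeUnit;

public class BlockingRefDemo {
    public static void main(String[] args) throws Exception {
        BlockingRef<String> ref = new BlockingRef<>();

        // Producer thread: sets the field to non-null after some work.
        new Thread(() -> {
            try {
                Thread.sleep(200);       // simulate work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            ref.set("ready");            // wakes any thread blocked in get()
        }).start();

        // Consumer: blocks until a non-null value is available.
        System.out.println(ref.get());                     // waits, then prints "ready"
        System.out.println(ref.get(1, TimeUnit.SECONDS));  // value is still set, so this returns immediately
    }
}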

Take a snapshot of a Set

I have a Set with items and want to send it for parallel processing.
However, I want to modify the original set afterwards, and that would cause some concurrency issues, so I think it'd be nice to take a snapshot or something of the Set and send THAT for the processing.
Will clone() work well?
Or should I make a new Set of it myself?
Or is there some nice way I'm missing?
Edit: I'm now using this; it seems to work pretty nicely:
public class BufferedHashSet<E> extends HashSet<E> {
private List<E> toAdd = new LinkedList<E>();
private List<Object> toRemove = new LinkedList<Object>();
@Override
public boolean add(E e)
{
synchronized (this) {
toAdd.add(e);
return true;
}
}
@Override
public boolean remove(Object e)
{
synchronized (this) {
toRemove.add(e);
return true;
}
}
public void flush()
{
synchronized (this) {
for (E e : toAdd) {
super.add(e);
}
for (Object e : toRemove) {
super.remove(e);
}
toAdd.clear();
toRemove.clear();
}
}
}
In my opinion the most elegant solution is to use the Set.addAll() method.
Set set;
Set snapshot = new TreeSet<>(); //or any Set implementation you use
snapshot.addAll(set);
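Note that the copy itself still races with concurrent writers. If other threads can modify the set while you snapshot it, the copy needs to be guarded; a sketch, assuming the set is shared through a synchronized wrapper:

Set<String> shared = Collections.synchronizedSet(new HashSet<String>());

// Snapshot under the wrapper's lock so the copy constructor sees a consistent state.
Set<String> snapshot;
synchronized (shared) {
    snapshot = new HashSet<String>(shared);
}
// 'snapshot' can now be handed off for parallel processing; later changes to
// 'shared' will not affect it.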

Asynchronous Iterator

I have the following code:
while(slowIterator.hasNext()) {
performLengthTask(slowIterator.next());
}
Because both iterator and task are slow it makes sense to put those into separate threads. Here is a quick and dirty attempt for an Iterator wrapper:
class AsyncIterator<T> implements Iterator<T> {
private final BlockingQueue<T> queue = new ArrayBlockingQueue<T>(100);
private AsyncIterator(final Iterator<T> delegate) {
new Thread() {
@Override
public void run() {
while(delegate.hasNext()) {
queue.put(delegate.next()); // try/catch removed for brevity
}
}
}.start();
}
@Override
public boolean hasNext() {
return true;
}
@Override
public T next() {
return queue.take(); // try/catch removed for brevity
}
// ... remove() throws UnsupportedOperationException
}
However this implementation lacks support for "hasNext()". It would be ok of course for the hasNext() method to block until it knows whether to return true or not. I could have a peek object in my AsyncIterator and I could change hasNext() to take an object from the queue and have next() return this peek. But this would cause hasNext() to block indefinitely if the delegate iterator's end has been reached.
Instead of utilizing the ArrayBlockingQueue I could of course do thread communication myself:
private static class AsyncIterator<T> implements Iterator<T> {
private final Queue<T> queue = new LinkedList<T>();
private boolean delegateDone = false;
private AsyncIterator(final Iterator<T> delegate) {
new Thread() {
@Override
public void run() {
while (delegate.hasNext()) {
final T next = delegate.next();
synchronized (AsyncIterator.this) {
queue.add(next);
AsyncIterator.this.notify();
}
}
synchronized (AsyncIterator.this) {
delegateDone = true;
AsyncIterator.this.notify();
}
}
}.start();
}
@Override
public boolean hasNext() {
synchronized (this) {
while (queue.size() == 0 && !delegateDone) {
try {
wait();
} catch (InterruptedException e) {
throw new Error(e);
}
}
}
return queue.size() > 0;
}
@Override
public T next() {
return queue.remove();
}
@Override
public void remove() {
throw new UnsupportedOperationException();
}
}
However all the extra synchronizations, waits and notifys don't really make the code any more readable and it is easy to hide a race condition somewhere.
Any better ideas?
Update
Yes I do know about common observer/observable patterns. However the usual implementations don't foresee an end to the flow of data and they are not iterators.
I specifically want an iterator here, because actually the above mentioned loop exists in an external library and it wants an Iterator.
This is a tricky one, but I think I got the right answer this time. (I deleted my first answer.)
The answer is to use a sentinel. I haven't tested this code, and I removed try/catches for clarity:
public class AsyncIterator<T> implements Iterator<T> {
private BlockingQueue<T> queue = new ArrayBlockingQueue<T>(100);
private T sentinel = (T) new Object();
private T next;
private AsyncIterator(final Iterator<T> delegate) {
new Thread() {
@Override
public void run() {
while (delegate.hasNext()) {
queue.put(delegate.next());
}
queue.put(sentinel);
}
}.start();
}
@Override
public boolean hasNext() {
if (next != null) {
return true;
}
next = queue.take(); // blocks if necessary
if (next == sentinel) {
return false;
}
return true;
}
@Override
public T next() {
T tmp = next;
next = null;
return tmp;
}
}
The insight here is that hasNext() needs to block until the next item is ready. It also needs some kind of quit condition, and it can't use an empty queue or a boolean flag for that because of threading issues. A sentinel solves the problem without any locking or synchronization.
Edit: cached "next" so hasNext() can be called more than once.
Or save yourself the headache and use RxJava:
import java.util.Iterator;
import rx.Observable;
import rx.Scheduler;
import rx.observables.BlockingObservable;
import rx.schedulers.Schedulers;
public class RxAsyncIteratorExample {
public static void main(String[] args) throws InterruptedException {
final Iterator<Integer> slowIterator = new SlowIntegerIterator(3, 7300);
// the scheduler you use here will depend on what behaviour you
// want but io is probably what you want
Iterator<Integer> async = asyncIterator(slowIterator, Schedulers.io());
while (async.hasNext()) {
performLengthTask(async.next());
}
}
public static <T> Iterator<T> asyncIterator(
final Iterator<T> slowIterator,
Scheduler scheduler) {
final Observable<T> tObservable = Observable.from(new Iterable<T>() {
@Override
public Iterator<T> iterator() {
return slowIterator;
}
}).subscribeOn(scheduler);
return BlockingObservable.from(tObservable).getIterator();
}
/**
* Uninteresting implementations...
*/
public static void performLengthTask(Integer integer)
throws InterruptedException {
log("Running task for " + integer);
Thread.sleep(10000l);
log("Finished task for " + integer);
}
private static class SlowIntegerIterator implements Iterator<Integer> {
private int count;
private final long delay;
public SlowIntegerIterator(int count, long delay) {
this.count = count;
this.delay = delay;
}
@Override
public boolean hasNext() {
return count > 0;
}
@Override
public Integer next() {
try {
log("Starting long production " + count);
Thread.sleep(delay);
log("Finished long production " + count);
}
catch (InterruptedException e) {
throw new IllegalStateException(e);
}
return count--;
}
@Override
public void remove() {
throw new UnsupportedOperationException();
}
}
private static final long startTime = System.currentTimeMillis();
private static void log(String s) {
double time = ((System.currentTimeMillis() - startTime) / 1000d);
System.out.println(time + ": " + s);
}
}
Gives me:
0.031: Starting long production 3
7.332: Finished long production 3
7.332: Starting long production 2
7.333: Running task for 3
14.633: Finished long production 2
14.633: Starting long production 1
17.333: Finished task for 3
17.333: Running task for 2
21.934: Finished long production 1
27.334: Finished task for 2
27.334: Running task for 1
37.335: Finished task for 1

How to use ConcurrentSkipListMap correctly?

I am trying to use a ConcurrentSkipListMap. I had problems with how to use a synchronized LinkedHashMap correctly, so I decided to give ConcurrentSkipListMap a try.
I have the same sort of problem. The unit test below fails because when I get the entry set, it has null values even though size() indicates that the map is not empty. As near as I can tell, I have all access to the map synchronized.
I would think that one would not need to do this (synchronize), since this is a concurrent map.
The server just puts the numbers 0, 1, 2, 3, ... into the map, keeping its size below a threshold. It tries to put one number in for each millisecond that has passed since the server was started.
Any pointers will be appreciated.
Thanks
import static org.junit.Assert.*;
import java.util.*;
import java.util.Map.Entry;
import java.util.concurrent.ConcurrentSkipListMap;
import org.junit.*;
class DummyServer implements Runnable {
DummyServer(int pieces) {
t0=System.currentTimeMillis();
this.pieces=pieces;
max=pieces;
lruMap=new ConcurrentSkipListMap<Long,Long>();
}
Set<Map.Entry<Long,Long>> entrySet() {
Set<Entry<Long,Long>> entries=null;
synchronized(lruMap) {
entries=Collections.unmodifiableSet(lruMap.entrySet());
}
return entries;
}
Set<Long> keySet() {
Set<Long> entries=null;
synchronized(lruMap) {
entries=Collections.unmodifiableSet(lruMap.keySet());
}
return entries;
}
@Override public void run() {
int n=0;
while(piece<stopAtPiece) {
long target=piece(System.currentTimeMillis()-t0);
long n0=piece;
for(;piece<target;piece++,n++)
put(piece);
if(n>max+max/10) {
Long[] keys=keySet().toArray(new Long[0]);
synchronized(lruMap) {
for(int i=0;n>max;i++,n--)
lruMap.remove(keys[i]);
}
}
try {
Thread.sleep(10);
} catch(InterruptedException e) {
e.printStackTrace();
break;
}
}
}
private void put(long piece) {
synchronized(lruMap) {
lruMap.put(piece,piece);
}
}
public long piece() {
return piece;
}
public Long get(long piece) {
synchronized(lruMap) {
return lruMap.get(piece);
}
}
public int size() {
synchronized(lruMap) {
return lruMap.size();
}
}
public long piece(long dt) {
return dt/period*pieces+dt%period*pieces/period;
}
private long piece;
int period=2000;
private volatile Map<Long,Long> lruMap;
public final long t0;
protected final int pieces;
public final int max;
public long stopAtPiece=Long.MAX_VALUE;
}
public class DummyServerTestCase {
void checkMap(Long n) {
if(server.size()>0) {
final Set<Map.Entry<Long,Long>> mapValues=server.entrySet();
@SuppressWarnings("unchecked") final Map.Entry<Long,Long>[] entries=new Map.Entry[mapValues.size()];
mapValues.toArray(entries);
try {
if(entries[0]==null)
System.out.println(server.piece());
assertNotNull(entries[0]);
} catch(Exception e) {
fail(e.toString());
}
}
}
@Test public void testRunForFirstIsNotZero() {
server.stopAtPiece=1*server.pieces;
Thread thread=new Thread(server);
thread.start();
while(thread.isAlive()) {
for(long i=0;i<server.piece();i++) {
server.get(i);
Thread.yield();
checkMap(server.piece());
Thread.yield();
}
}
}
DummyServer server=new DummyServer(1000);
}
The problem is that you are performing
final Map.Entry<Long,Long>[] entries=new Map.Entry[mapValues.size()]; // size>0
mapValues.toArray(entries); // size is 0.
Between creating the array and calling toArray, the server thread removes entries from the map, so the pre-sized array ends up larger than the collection and toArray leaves the unused slots as null.
If you take a copy using the Iterator you will not get this race condition.
void checkMap(Long n) {
final Set<Map.Entry<Long, Long>> mapValues = server.entrySet();
Set<Map.Entry<Long, Long>> entries = new LinkedHashSet<>(mapValues);
for (Entry<Long, Long> entry : entries) {
assertNotNull(entry);
}
}
or
void checkMap(Long n) {
for (Entry<Long, Long> entry : server.entrySet())
assertNotNull(entry);
}
First, you shouldn't ever have to synchronize a thread-safe collection implementation unless you have to do some compound operation. ConcurrentMap offers good atomic compound methods, so even then you often shouldn't have to.
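For example, the usual check-then-act sequences can be replaced by the map's own atomic methods (a sketch; the key and value types are just placeholders):

ConcurrentMap<Long, Long> map = new ConcurrentSkipListMap<Long, Long>();

map.putIfAbsent(1L, 100L);       // atomic "insert only if absent"
map.replace(1L, 100L, 200L);     // atomic "update only if still mapped to the expected value"
map.remove(1L, 200L);            // atomic "remove only if still mapped to this value"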
Second, you should never rely on the size method being exact while doing concurrent operations. The javadoc notes:
Beware that, unlike in most collections, the size method is not a
constant-time operation. Because of the asynchronous nature of these
maps, determining the current number of elements requires a traversal
of the elements.
The size can change between the time you start the call and the time it returns.
In short your test isn't a valid concurrent test. Can you elaborate more on what you're trying to achieve?
