Producer-consumer problem with a twist - java

The producer is finite, as should be the consumer.
The problem is when to stop, not how to run.
Communication can happen over any type of BlockingQueue.
Can't rely on poisoning the queue(PriorityBlockingQueue)
Can't rely on locking the queue(SynchronousQueue)
Can't rely on offer/poll exclusively(SynchronousQueue)
Probably even more exotic queues in existence.
Creates a queued seq on another (presumably lazy) seq s. The queued
seq will produce a concrete seq in the background, and can get up to
n items ahead of the consumer. n-or-q can be an integer n buffer
size, or an instance of java.util.concurrent BlockingQueue. Note
that reading from a seque can block if the reader gets ahead of the
producer.
http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/seque
My attempts so far + some tests: https://gist.github.com/934781
Solutions in Java or Clojure appreciated.

class Reader {
private final ExecutorService ex = Executors.newSingleThreadExecutor();
private final List<Object> completed = new ArrayList<Object>();
private final BlockingQueue<Object> doneQueue = new LinkedBlockingQueue<Object>();
private int pending = 0;
public synchronized Object take() {
removeDone();
queue();
Object rVal;
if(completed.isEmpty()) {
try {
rVal = doneQueue.take();
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
pending--;
} else {
rVal = completed.remove(0);
}
queue();
return rVal;
}
private void removeDone() {
Object current = doneQueue.poll();
while(current != null) {
completed.add(current);
pending--;
current = doneQueue.poll();
}
}
private void queue() {
while(pending < 10) {
pending++;
ex.submit(new Runnable() {
#Override
public void run() {
doneQueue.add(compute());
}
private Object compute() {
//do actual computation here
return new Object();
}
});
}
}
}

Not exactly an answer I'm afraid, but a few remarks and more questions. My first answer would be: use clojure.core/seque. The producer needs to communicate end-of-seq somehow for the consumer to know when to stop, and I assume the number of produced elements is not known in advance. Why can't you use an EOS marker (if that's what you mean by queue poisoning)?
If I understand your alternative seque implementation correctly, it will break when elements are taken off the queue outside your function, since channel and q will be out of step in that case: channel will hold more #(.take q) elements than there are elements in q, causing it to block. There might be ways to ensure channel and q are always in step, but that would probably require implementing your own Queue class, and it adds so much complexity that I doubt it's worth it.
Also, your implementation doesn't distinguish between normal EOS and abnormal queue termination due to thread interruption - depending on what you're using it for you might want to know which is which. Personally I don't like using exceptions in this way — use exceptions for exceptional situations, not for normal flow control.

Related

Reading from blocking queue with multiple threads

I have a producer-consumer model using a blocking queue where 4 threads read files from a directory puts it to the blocking queue and 4 threads(consumer) reads from blocking queue.
My problem is every time only one consumer reads from the Blockingqueue and the other 3 consumer threads are not reading:
final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>(QUEUE_SIZE);
CompletableFuture<Void> completableFutureProducer = produceUrls(files, queue, checker);
//not providing code for produceData , it is working file with all 4 //threads writing to Blocking queue. Here is the consumer code.
private CompletableFuture<Validator> consumeData(
final Response checker,
final CompletableFuture<Void> urls
) {
return CompletableFuture.supplyAsync(checker, 4)
.whenComplete((result, err) -> {
if (err != null) {
LOG.error("consuming url worker failed!", err);
urls.cancel(true);
}
});
}
completableFutureProducer.join();
completableFutureConsumer.join();
This is my code. Can someone tell me what I am doing wrong? Or help with correct code.
Why is one consumer reading from the Blocking queue.
Adding code for Response class reading from Blocking queue :
#Slf4j
public final class Response implements Supplier<Check> {
private final BlockingQueue<byte[]> data;
private final AtomicBoolean producersComplete;
private final Calendar calendar = Calendar.getInstance();
public ResponseCode(
final BlockingQueue<byte[]> data
) {
this.data = data;
producersDone = new AtomicBoolean();
}
public void notifyProducersDone() {
producersComplete.set(true);
}
#Override
public Check get() {
try {
Check check = null;
try {
while (!data.isEmpty() || !producersDone.get()) {
final byte[] item = data.poll(1, TimeUnit.SECONDS);
if (item != null) {
LOG.info("{}",new String(item));
// I see only one thread printing result here .
validator = validateData(item);
}
}
} catch (InterruptedException | IOException e) {
Thread.currentThread().interrupt();
throw new WriteException("Exception occurred while data validation", e);
}
return check;
} finally {
LOG.info("Done reading data from BlockingQueue");
}
}
}
It's hard to diagnose from this alone, but it's probably not correct to check for data.isEmpty() because the queue may happen to be temporarily empty (but later get items). So your threads might exit as soon as they encounter a temporarily empty queue.
Instead, you can exit if producers were done AND you got an empty result from the poll. That way the threads only exit when there are truly no more items to process.
It's a bit odd though that you are returning the result of the last item (alone). Are you sure this is what you want?
EDIT: I've done something very similar recently. Here is a class that reads from a file, transforms the lines in a multi-threaded way, then writes to a different file (the order of lines are preserved).
It also uses a BlockingQueue. It's very similar to your code, but it doesn't check for quue.isEmpty() for the aforementioned reason. It works fine for me.
4+4 threads is not that many, so you better do not use asynchronous tools like CompletableFuture. Simple multithreaded program would be simpler and work faster.
Having
BlockingQueue<byte[]> data;
don't use data.poll();
use data.take();
When you have lets say 1 item in the queue, and 4 consumers, one of them will poll the item rendering queue to be empty. Then 3 of the rest of the consumers checks if queue.isEmpty(), and since it is - quits the loop.

Java wait/notify implementation without synchronized

I have a scenario with dozens of producer and one single consumer. Timing is critical: for performance reason I want to avoid any locking of producers and I want the consumer to wait as little as possible when no messages are ready.
I've started using a ConcurrentLinkedQueue, but I don't like to call sleep on the consumer when queue.poll() == null because I could waste precious milliseconds, and I don't want to use yield because I end up wasting cpu.
So I came to implement a sort of ConcurrentBlockingQueue so that the consumer can run something like:
T item = queue.poll();
if(item == null) {
wait();
item = queue.poll();
}
return item;
And producer something like:
queue.offer(item);
notify();
Unfortunately wait/notify only works on synchronized block, which in turn would drastically reduce producer performance. Is there any other implementation of wait/notify mechanism that does not require synchronization?
I am aware of the risks related to not having wait and notify synchronized, and I managed to resolve them by having an external thread running the following:
while(true) {
notify();
sleep(100);
}
I've started using a ConcurrentLinkedQueue, but I don't like to call sleep on the consumer when queue.poll() == null
You should check the BlockingQueue interface, which has a take method that blocks until an item becomes available.
It has several implementations as detailed in the javadoc, but ConcurrentLinkedQueue is not one of them:
All Known Implementing Classes:
ArrayBlockingQueue, DelayQueue, LinkedBlockingDeque, LinkedBlockingQueue, LinkedTransferQueue, PriorityBlockingQueue, SynchronousQueue
I came out with the following implementation:
private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
private final Semaphore semaphore = new Semaphore(0);
private int size;
public void offer(T item) {
size += 1;
queue.offer(item);
semaphore.release();
}
public T poll(long timeout, TimeUnit unit) {
semaphore.drainPermits();
T item = queue.poll();
if (item == null) {
try {
semaphore.tryAcquire(timeout, unit);
} catch (InterruptedException ex) {
}
item = queue.poll();
}
if (item == null) {
size = 0;
} else {
size = Math.max(0, size - 1);
}
return item;
}
/** An inaccurate representation O(1)-access of queue size. */
public int size() {
return size;
}
With the following properties:
producers never go to SLEEP state (which I think can go with BlockingQueue implementations that use Lock in offer(), or with synchronized blocks using wait/notify)
consumer only goes to SLEEP state when queue is empty but it is soon woken up whenever a producer offer an item (no fixed-time sleep, no yield)
consumer can be sometime woken up even with empty queue, but it's ok here to waste some cpu cycle
Is there any equivalent implementation in jdk that I'm not aware of? Open for criticism.

Iterate through threads run via ThreadPoolTaskExecutor

I have a ThreadPoolTaskExecutor and when I create a Process which implements Runnable I run it via: executor.execute(process).
Now, before calling execute I want to check one field from Process object and compare it with ALL other currently running processes, executed by my ThreadPoolTaskExecutor. How I can do that, not generating a concurrent problem?
Code:
public class MyApp {
ThreadPoolTaskExecutor executor;
//...
public void runProcesses {
Process firstone = new Process(1);
Process nextOne = new Process(1);
// iterate through all processes started via executor and currently running,
// verify if there is any process.getX() == 1, if not run it
executor.execute(firstone );
//wait till firstone will end becouse have the same value of X
executor.execute(nextOne); // this cant be perform until the first one will end
}
}
public class Process {
private int x;
//...
public Process (int x){
this.x = x;
}
public int getX(){
return this.x;
}
}
I was thinking about createing simple Set of process started and add new one to it. But I have problem how to determine is it still running and remove it from set when it is done. So now I'm thinking about iterating through running threads, but completly dunno how.
I think that your initial idea is pretty good and can be made to work with not too much code.
It will require some tinkering in order to decouple "is a Runnable for this value already running" from "execute this Runnable", but here's a rough illustration that doesn't take care about that:
Implement equals() and hashCode() in Process, so that instances can safely be used in unordered sets and maps.
Create a ConcurrentMap<Process, Boolean>
You won't be using Collections.newSetFromMap(new ConcurrentHashMap<Process, Boolean>) because you'd want to use the map's putIfAbsent() method.
Try to add in it using putIfAbsent() each Process that you will be submitting and bail if the returned value is not null.
A non-null return value means that there's already an equivalent Process in the map (and therefore being processed).
The trivial and not very clean solution will be to inject a reference to the map in each Process instance and have putIfAbsent(this, true) as the first thing you do in your run() method.
Remove from it each Process that has finished processing.
The trivial and not very clean solution will be inject a reference to the map in each Process instance and have remove(this) as the last thing you do in your run() method.
Other solutions can have Process implement Callable and return its unique value as a result, so that it can be removed from the map, or use CompletableFuture and its thenAccept() callback.
Here's a sample that illustrates the trivial and not very clean solution described above (code too long to paste directly here).
Though #Dimitar provided very good solution for solving this problem I want to make an addition with another approach.
Having your requirements, it seems like you need to keep all submitted Processes, slicing them by x into separate queues and executing processes in queues one by one.
API of ThreadPoolExecutor empowers to enhance behaviour of Executor and I came to the following implementation of ThreadPoolExecutor:
ThreadPoolExecutor executor = new ThreadPoolExecutor(2, 2,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<>()) {
private final ConcurrentMap<Integer, Queue<Runnable>> processes = new ConcurrentHashMap<>();
#Override
public void execute(Runnable command) {
if (command instanceof Process) {
int id = ((Process) command).getX();
Queue<Runnable> t = new ArrayDeque<>();
Queue<Runnable> queue = this.processes.putIfAbsent(id, t);
if (queue == null) {
queue = t;
}
synchronized (queue) {
queue.add(command);
if (!processes.containsKey(id)) {
processes.put(id, queue);
}
if (queue.size() == 1) {
super.execute(queue.peek()); // removal of current process would be done in #afterExecute
}
}
} else {
super.execute(command);
}
}
#Override
protected void afterExecute(Runnable r, Throwable t) {
super.afterExecute(r, t);
if (r instanceof Process) {
int id = ((Process) r).getX();
Queue<Runnable> queue = this.processes.get(id);
synchronized (queue) {
queue.poll(); // remove completed prev process
Runnable nextProcess = queue.peek(); // retrieve next process
if (nextProcess != null) {
super.execute(nextProcess);
} else {
this.processes.remove(id);
}
}
}
}
}

AtomicReference to a mutable object and visibility

Say I have an AtomicReferenceto a list of objects:
AtomicReference<List<?>> batch = new AtomicReference<List<Object>>(new ArrayList<Object>());
Thread A adds elements to this list: batch.get().add(o);
Later, thread B takes the list and, for example, stores it in a DB: insertBatch(batch.get());
Do I have to do additional synchronization when writing (Thread A) and reading (Thread B) to ensure thread B sees the list the way A left it, or is this taken care of by the AtomicReference?
In other words: if I have an AtomicReference to a mutable object, and one thread changes that object, do other threads see this change immediately?
Edit:
Maybe some example code is in order:
public void process(Reader in) throws IOException {
List<Future<AtomicReference<List<Object>>>> tasks = new ArrayList<Future<AtomicReference<List<Object>>>>();
ExecutorService exec = Executors.newFixedThreadPool(4);
for (int i = 0; i < 4; ++i) {
tasks.add(exec.submit(new Callable<AtomicReference<List<Object>>>() {
#Override public AtomicReference<List<Object>> call() throws IOException {
final AtomicReference<List<Object>> batch = new AtomicReference<List<Object>>(new ArrayList<Object>(batchSize));
Processor.this.parser.parse(in, new Parser.Handler() {
#Override public void onNewObject(Object event) {
batch.get().add(event);
if (batch.get().size() >= batchSize) {
dao.insertBatch(batch.getAndSet(new ArrayList<Object>(batchSize)));
}
}
});
return batch;
}
}));
}
List<Object> remainingBatches = new ArrayList<Object>();
for (Future<AtomicReference<List<Object>>> task : tasks) {
try {
AtomicReference<List<Object>> remainingBatch = task.get();
remainingBatches.addAll(remainingBatch.get());
} catch (ExecutionException e) {
Throwable cause = e.getCause();
if (cause instanceof IOException) {
throw (IOException)cause;
}
throw (RuntimeException)cause;
}
}
// these haven't been flushed yet by the worker threads
if (!remainingBatches.isEmpty()) {
dao.insertBatch(remainingBatches);
}
}
What happens here is that I create four worker threads to parse some text (this is the Reader in parameter to the process() method). Each worker saves the lines it has parsed in a batch, and flushes the batch when it is full (dao.insertBatch(batch.getAndSet(new ArrayList<Object>(batchSize)));).
Since the number of lines in the text isn't a multiple of the batch size, the last objects end up in a batch that isn't flushed, since it's not full. These remaining batches are therefore inserted by the main thread.
I use AtomicReference.getAndSet() to replace the full batch with an empty one. It this program correct with regards to threading?
Um... it doesn't really work like this. AtomicReference guarantees that the reference itself is visible across threads i.e. if you assign it a different reference than the original one the update will be visible. It makes no guarantees about the actual contents of the object that reference is pointing to.
Therefore, read/write operations on the list contents require separate synchronization.
Edit: So, judging from your updated code and the comment you posted, setting the local reference to volatile is sufficient to ensure visibility.
I think that, forgetting all the code here, you exact question is this:
Do I have to do additional synchronization when writing (Thread A) and
reading (Thread B) to ensure thread B sees the list the way A left it,
or is this taken care of by the AtomicReference?
So, the exact response to that is: YES, atomic take care of visibility. And it is not my opinion but the JDK documentation one:
The memory effects for accesses and updates of atomics generally follow the rules for volatiles, as stated in The Java Language Specification, Third Edition (17.4 Memory Model).
I hope this helps.
Adding to Tudor's answer: You will have to make the ArrayList itself threadsafe or - depending on your requirements - even larger code blocks.
If you can get away with a threadsafe ArrayList you can "decorate" it like this:
batch = java.util.Collections.synchronizedList(new ArrayList<Object>());
But keep in mind: Even "simple" constructs like this are not threadsafe with this:
Object o = batch.get(batch.size()-1);
The AtomicReference will only help you with the reference to the list, it will not do anything to the list itself. More particularly, in your scenario, you will almost certainly run into problems when the system is under load where the consumer has taken the list while the producer is adding an item to it.
This sound to me like you should be using a BlockingQueue. You can then Limit the memory footprint if you producer is faster than your consumer and let the queue handle all contention.
Something like:
ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<Object> (50);
// ... Producer
queue.put(o);
// ... Consumer
List<Object> queueContents = new ArrayList<Object> ();
// Grab everything waiting in the queue in one chunk. Should never be more than 50 items.
queue.drainTo(queueContents);
Added
Thanks to #Tudor for pointing out the architecture you are using. ... I have to admit it is rather strange. You don't really need AtomicReference at all as far as I can see. Each thread owns its own ArrayList until it is passed on to dao at which point it is replaced so there is no contention at all anywhere.
I am a little concerned about you creating four parser on a single Reader. I hope you have some way of ensuring each parser does not affect the others.
I personally would use some form of producer-consumer pattern as I have described in the code above. Something like this perhaps.
static final int PROCESSES = 4;
static final int batchSize = 10;
public void process(Reader in) throws IOException, InterruptedException {
final List<Future<Void>> tasks = new ArrayList<Future<Void>>();
ExecutorService exec = Executors.newFixedThreadPool(PROCESSES);
// Queue of objects.
final ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<Object> (batchSize * 2);
// The final object to post.
final Object FINISHED = new Object();
// Start the producers.
for (int i = 0; i < PROCESSES; i++) {
tasks.add(exec.submit(new Callable<Void>() {
#Override
public Void call() throws IOException {
Processor.this.parser.parse(in, new Parser.Handler() {
#Override
public void onNewObject(Object event) {
queue.add(event);
}
});
// Post a finished down the queue.
queue.add(FINISHED);
return null;
}
}));
}
// Start the consumer.
tasks.add(exec.submit(new Callable<Void>() {
#Override
public Void call() throws IOException {
List<Object> batch = new ArrayList<Object>(batchSize);
int finishedCount = 0;
// Until all threads finished.
while ( finishedCount < PROCESSES ) {
Object o = queue.take();
if ( o != FINISHED ) {
// Batch them up.
batch.add(o);
if ( batch.size() >= batchSize ) {
dao.insertBatch(batch);
// If insertBatch takes a copy we could merely clear it.
batch = new ArrayList<Object>(batchSize);
}
} else {
// Count the finishes.
finishedCount += 1;
}
}
// Finished! Post any incopmplete batch.
if ( batch.size() > 0 ) {
dao.insertBatch(batch);
}
return null;
}
}));
// Wait for everything to finish.
exec.shutdown();
// Wait until all is done.
boolean finished = false;
do {
try {
// Wait up to 1 second for termination.
finished = exec.awaitTermination(1, TimeUnit.SECONDS);
} catch (InterruptedException ex) {
}
} while (!finished);
}

How to correctly use synchronized?

This piece of code:
synchronized (mList) {
if (mList.size() != 0) {
int s = mList.size() - 1;
for (int i = s; i > 0; i -= OFFSET) {
mList.get(i).doDraw(canv);
}
getHead().drawHead(canv);
}
}
Randomly throws AIOOBEs. From what I've read, the synchronized should prevent that, so what am I doing wrong?
Edits:
AIOOBE = Array Index Out Of Bounds Exception
The code's incomplete, cut down to what is needed. But to make you happy, OFFSET is 4, and just imagine that there is a for-loop adding a bit of data at the beginning. And a second thread reading and / or modifying the list.
Edit 2:
I've noticed it happens when the list is being drawn and the current game ends. The draw-thread hasn't drawn all elements when the list is emptied. Is there a way of telling the game to wait with emtying the list untill it's empty?
Edit 3:
I've just noticed that I'm not sure if this is a multi-threading problem. Seems I only have 2 threads, one for calculating and drawing and one for user input.. Gonna have to look into this a bit more than I thought.
What you're doing looks right... but that's all:
It doesn't matter on what object you synchronize, it needn't be the list itself.
What does matter is if all threads always synchronize on the same object, when accessing a shared resource.
Any access to SWING (or another graphic library) must happen in the AWT-Thread.
To your edit:
I've noticed it happens when the list is being drawn and the current game ends. The draw-thread hasn't drawn all elements when the list is emptied. Is there a way of telling the game to wait with emtying the list untill it's empty?
I think you mean "...wait with emptying the list until the drawing has completed." Just synchronize the code doing it on the same lock (i.e., the list itself in your case).
Again: Any access to a shared resource must be protected somehow. It seems like you're using synchronized just here and not where you're emptying the list.
The safe solution is to only allow one thread to create objects, add and remove them from a List after the game has started.
I had problems myself with random AIOOBEs erros and no synchornize could solve it properly plus it was slowing down the response of the user.
My solution, which is now stable and fast (never had an AIOOBEs since) is to make UI thread inform the game thread to create or manipulate an object by setting a flag and coordinates of the touch into the persistent variables.
Since the game thread loops about 60 times per second this proved to be sufficent to pick up the message from the UI thread and do something.
This is a very simple solution and it works great!
My suggestion is to use a BlockingQueue and I think you are looking for this solution also. How you can do it? It is already shown with an example in the javadoc :)
class Producer implements Runnable {
private final BlockingQueue queue;
Producer(BlockingQueue q) { queue = q; }
public void run() {
try {
while (true) { queue.put(produce()); }
} catch (InterruptedException ex) { ... handle ...}
}
Object produce() { ... }
}
class Consumer implements Runnable {
private final BlockingQueue queue;
Consumer(BlockingQueue q) { queue = q; }
public void run() {
try {
while (true) { consume(queue.take()); }
} catch (InterruptedException ex) { ... handle ...}
}
void consume(Object x) { ... }
}
class Setup {
void main() {
BlockingQueue q = new SomeQueueImplementation();
Producer p = new Producer(q);
Consumer c1 = new Consumer(q);
Consumer c2 = new Consumer(q);
new Thread(p).start();
new Thread(c1).start();
new Thread(c2).start();
}
}
The beneficial things for you are, you need not to worry about synchronizing your mList. BlockingQueue offers 10 special method. You can check it in the doc. Few from javadoc:
BlockingQueue methods come in four forms, with different ways of handling operations that cannot be satisfied immediately, but may be satisfied at some point in the future: one throws an exception, the second returns a special value (either null or false, depending on the operation), the third blocks the current thread indefinitely until the operation can succeed, and the fourth blocks for only a given maximum time limit before giving up.
To be in safe side: I am not experienced with android. So not certain whether all java packages are allowed in android. But at least it should be :-S, I wish.
You are getting Index out of Bounds Exception because there are 2 threads that operate on the list and are doing it wrongly.
You should have been synchronizing at another level, in such a way that no other thread can iterate through the list while other thread is modifying it! Only on thread at a time should 'work on' the list.
I guess you have the following situation:
//piece of code that adds some item in the list
synchronized(mList){
mList.add(1, drawableElem);
...
}
and
//code that iterates you list(your code simplified)
synchronized (mList) {
if (mList.size() != 0) {
int s = mList.size() - 1;
for (int i = s; i > 0; i -= OFFSET) {
mList.get(i).doDraw(canv);
}
getHead().drawHead(canv);
}
}
Individually the pieces of code look fine. They seam thread-safe. But 2 individual thread-safe pieces of code might not be thread safe at a higher level!
It's just you would have done the following:
Vector v = new Vector();
if(v.length() == 0){ v.length() itself is thread safe!
v.add("elem"); v.add() itself is also thread safe individually!
}
BUT the compound operation is NOT!
Regards,
Tiberiu

Categories

Resources