Blocking Queue Vs Classical implementation of Producer Consumer pattern

I learned that we can use BlockingQueue instead of the classical wait() and notify() when implementing the Producer Consumer pattern. My question is, which implementation is more efficient? An article about blocking queues (http://javarevisited.blogspot.com/2012/02/producer-consumer-design-pattern-with.html#ixzz2lczIZ3Mo) says: "you don't require to use wait and notify to communicate between Producer and Consumer". Does this simplicity come at the cost of efficiency?

The BlockingQueue will usually be faster, because it does not use wait/notify or synchronized for the queue access. The standard implementations (ArrayBlockingQueue, LinkedBlockingQueue) use the explicit locks of java.util.concurrent (ReentrantLock with Condition), and some other classes in the package, such as ConcurrentLinkedQueue, implement lock-free algorithms using the Atomic classes.
Think about a queue of 100 elements, and 1000 threads wanting to do their work. With a synchronized implementation, for each element 999 threads need to wait until 1 thread has picked its task. With a lock-free algorithm, 100 threads simultaneously pick their task, and only the other 900 have to wait.

If the number of objects produced/consumed every second is less than 100000, then you won't be able to see a difference between the standard implementations and your own.
Otherwise, you have the following options to speed up your code:
Use ArrayBlockingQueue instead of LinkedBlockingQueue: there is no need to create a wrapper object for each transferred message. Another advantage of ArrayBlockingQueue is that the producer thread is blocked if the queue is full. Indeed, the producer should slow down if the consumer is not fast enough; otherwise, we will end up exhausting memory.
Send messages in batches, say in arrays of 10 messages each. This reduces thread contention on the shared object.
If you have to send tens of millions of messages per second, look at the LMAX Disruptor.
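The batching option above can be sketched roughly as follows. This is only an illustration, not a benchmark: the class name BatchingDemo, the batch size of 10, the queue capacity of 16 and the empty-list end marker are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class: one producer sends integers in batches to one consumer.
class BatchingDemo {
    static final int BATCH_SIZE = 10; // assumed batch size, taken from "arrays of 10 messages"
    // End-of-stream marker, recognised by identity (real batches are never this instance).
    static final List<Integer> POISON = new ArrayList<>();

    /** Sends `total` integers in batches and returns how many the consumer received. */
    static int run(int total) {
        // Each queue operation now transfers up to BATCH_SIZE messages at once.
        BlockingQueue<List<Integer>> queue = new ArrayBlockingQueue<>(16);
        int[] received = {0};

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    List<Integer> batch = queue.take();
                    if (batch == POISON) {
                        break;
                    }
                    received[0] += batch.size();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            List<Integer> batch = new ArrayList<>(BATCH_SIZE);
            for (int i = 0; i < total; i++) {
                batch.add(i);
                if (batch.size() == BATCH_SIZE) {
                    queue.put(batch); // blocks if the consumer falls behind
                    batch = new ArrayList<>(BATCH_SIZE);
                }
            }
            if (!batch.isEmpty()) {
                queue.put(batch); // flush the last, partially filled batch
            }
            queue.put(POISON);
            consumer.join(); // join makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received[0];
    }
}
```

Calling BatchingDemo.run(95) pushes nine full batches and one partial batch through the queue, so the threads meet at the queue 11 times (including the end marker) instead of 96.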

BlockingQueue is simply a class that puts wait() and notify() to this common use. Generally, doing it yourself is just reinventing the wheel, and only worth it if you have lots of producers and consumers and you can optimize in some way that's specific to your code.
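To make the "reinventing the wheel" point concrete, here is a rough sketch of what a hand-written bounded buffer with wait() and notifyAll() tends to look like. The class name HandRolledBoundedBuffer and the transfer helper are hypothetical, and this is not how the JDK queues are implemented internally.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical hand-written bounded buffer using intrinsic locks.
class HandRolledBoundedBuffer<T> {
    private final Queue<T> items = new ArrayDeque<>();
    private final int capacity;

    HandRolledBoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    synchronized void put(T item) throws InterruptedException {
        // Always wait in a loop: wakeups can be spurious or meant for another thread.
        while (items.size() == capacity) {
            wait();
        }
        items.add(item);
        notifyAll(); // wake any consumers waiting for an element
    }

    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        T item = items.remove();
        notifyAll(); // wake any producers waiting for free space
        return item;
    }

    /** Moves 1..n through a capacity-2 buffer on two threads and returns the consumed sum. */
    static long transfer(int n) {
        HandRolledBoundedBuffer<Integer> buffer = new HandRolledBoundedBuffer<>(2);
        long[] sum = {0};
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    sum[0] += buffer.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (int i = 1; i <= n; i++) {
                buffer.put(i);
            }
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }
}
```

Replacing the buffer with new ArrayBlockingQueue<>(2) removes the put and take methods entirely and leaves only the transfer logic, which is the practical argument for the library class.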

Related

Does a 'blocking' queue defeat the very purpose of multi threading

The ArrayBlockingQueue will block the producer thread if the queue is full and it will block the consumer thread if the queue is empty.
Doesn't this concept of blocking go against the very idea of multithreading? Say I have a 'main' thread and I want to delegate all 'Logging' activities to another thread. So, inside my main thread, I create a Runnable to log the output and I put the Runnable on an ArrayBlockingQueue. The whole purpose of doing this is to have the 'main' thread return immediately without wasting any time on an expensive logging operation.
But if the queue is full, the main thread will be blocked and will wait until a spot is available. So how does it help us?

The queue doesn't block out of spite, it blocks to introduce an additional quality into the system. In this case, it's prevention of starvation.
Picture a set of threads, one of which produces work units really fast. If the queue were allowed to grow without bound, the "rapid producer" thread could potentially hog all the processing capacity. Sometimes, preventing such side effects is more important than keeping all threads unblocked.

I think this is the designer's decision. If the designer chooses blocking mode, ArrayBlockingQueue provides it with the put method. If the designer doesn't want blocking mode, ArrayBlockingQueue has the offer method, which returns false when the queue is full, but then the designer needs to decide what to do with the rejected logging event.
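A small sketch of the non-blocking choice described above, using the offer method of BlockingQueue. The class name OfferVsPut, the log messages and the decision to simply count rejected events are assumptions for illustration; what to do with a rejected event is up to the application.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of offer() as the non-blocking alternative to put().
class OfferVsPut {
    /** Tries to enqueue `attempts` log lines into a queue of `capacity`; returns how many were rejected. */
    static int countRejected(int capacity, int attempts) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int rejected = 0;
        for (int i = 0; i < attempts; i++) {
            // offer never blocks: it returns false immediately when the queue is full.
            if (!queue.offer("log line " + i)) {
                rejected++; // the caller decides: drop, count, log synchronously, ...
            }
        }
        return rejected;
    }

    /** Middle ground: wait a bounded time for space, then give up. */
    static boolean offerWithTimeout(BlockingQueue<String> queue, String message, long millis) {
        try {
            return queue.offer(message, millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With no consumer running, countRejected(3, 5) accepts three messages and rejects two, whereas put would have blocked the calling thread on the fourth.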

In your example I would consider blocking to be a feature: It prevents an OutOfMemoryError.
Generally speaking, one of your threads is just not fast enough to cope with the assigned load. So the others must slow down somehow in order not to endanger the whole application.
On the other hand, if the load is balanced, the queue will not block.

Blocking is a necessary function of multithreading. You must block to have synchronized access to data. It does not defeat the purpose of multithreading.
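I would suggest throwing an exception when the producer attempts to submit an item to a queue which is full. The add method of a bounded queue already behaves this way (it throws IllegalStateException when the queue is full), and remainingCapacity() lets you test for free space beforehand, although with several producers that answer may be stale by the time you insert.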
This would allow the invoking code to decide how it wants to handle a full queue.
If execution order when processing items from the queue is unimportant, I recommend using a threadpool (known as an ExecutorService in Java).

It depends on the nature of your multi threading philosophy. For those of us who favour Communicating Sequential Processes a blocking queue is nearly perfect. In fact, the ideal would be one where no message can be put into the queue at all unless the receiver is ready to receive it.
So no, I don't think that a blocking queue goes against the very purpose of multi-threading. In fact, the scenario that you describe (the main thread eventually getting stalled) is a good illustration of the major problem with the actor-model of multi-threading; you've no idea whether or not it will deadlock / block, and you can't exhaustively test for it either.
In contrast, imagine a blocking queue that is zero messages deep. That way for the system to work at all you'd have to find a way to ensure that the logger is always guaranteed to be able to receive a message from the main thread. That's CSP. It might mean that in your hypothetical logger thread you have to have application defined buffering (as opposed to some framework developer's best guess of how deep a FIFO should be), a fast I/O subsystem, checks for keeping up, ways of dealing with falling behind, etc. In short it doesn't let you get away with it, you're forced to address every aspect of your system's performance.
That is of course harder, but that way you end up with a system that's definitely OK rather than the questionable "maybe" that you have if your blocking queues are an unknown number of messages deep.
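The "zero messages deep" queue described above corresponds closely to java.util.concurrent.SynchronousQueue, which has no internal capacity. The following is a minimal sketch under that assumption; the class name RendezvousDemo and the logger thread are hypothetical.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical demo of a rendezvous channel: every put must meet a take.
class RendezvousDemo {
    /** With no receiver waiting, a non-blocking offer cannot succeed. */
    static boolean offerWithoutReceiver() {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        return channel.offer("nobody is listening");
    }

    /** Hands one message directly to a logger thread and returns what it received. */
    static String handOff(String message) {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        String[] received = new String[1];
        Thread logger = new Thread(() -> {
            try {
                received[0] = channel.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        logger.start();
        try {
            channel.put(message); // blocks until the logger is ready to receive
            logger.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received[0];
    }
}
```

Because nothing is ever buffered, a sender that stalls here shows immediately that the receiver is not keeping up, rather than hiding the problem behind a FIFO of unknown depth.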

It sounds like you have the general idea right of why you'd use something like an ArrayBlockingQueue to talk between threads.
Having a blocking queue gives you the option to do something different in case something goes wrong with your background worker threads, rather than blindly adding more requests to the queue. If there is room in the queue, there is no blocking.
For your specific use case, though, I would use an ExecutorService, which manages a pool of background worker threads, rather than reading/writing queues directly:
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ExecutorService.html
ExecutorService pool = Executors.newFixedThreadPool(poolSize);
pool.submit(myRunnable);

A multithreaded program is non-deterministic insofar as you can't say beforehand: n producer actions will take exactly as long as m consumer actions. Therefore, synchronization between n producers and m consumers is necessary in every case.
You'll want to choose the queue size so that the number of active producers and consumers is maximized most of the time. But the thread model of Java does not guarantee that any consumer will run unless it is the only unblocked thread. (Yet, of course, on multi-core CPUs it is very likely that the consumer will run.)

You have to make a choice about what to do when a queue is full. With an ArrayBlockingQueue and put, that choice is to wait.
Another option would be to just throw away new objects if the queue is full; you can achieve this with offer.
You have to make a trade-off.

Java, blocked queues

I've been reading about blocking queues and some questions came up. All the examples I've read showed only one consumer and one producer thread. The question is: suppose we have 1 producer and 3 consumers, and at some moment all consumers have called take() but the queue is empty, so they are all waiting for the first element to appear. Which of the consumer threads will take the first element when it appears? The consumer thread which called take() first?

I don't know if you can tell. The real question is: why do you need to know? All listeners should be equivalent. It should not matter which one handles a request. If you have to know, you designed and implemented it incorrectly.

Check the constructor ArrayBlockingQueue(int capacity, boolean fair): if fair is true, queue accesses for threads blocked on insertion or removal are processed in FIFO order.
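A short sketch of the scenario from the question using the fair constructor. The class name FairQueueDemo and the consumer names are made up for the example. The code does not try to prove the ordering, because which consumer reaches take() first still depends on thread scheduling.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo: three consumers blocked in take() on a fair queue.
class FairQueueDemo {
    /** Starts three consumers, feeds them three elements, and returns what each one took. */
    static List<String> run() {
        // fair = true: threads blocked on insertion or removal are served in FIFO order.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10, true);
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        List<Thread> consumers = new ArrayList<>();

        for (int c = 1; c <= 3; c++) {
            String name = "consumer-" + c;
            Thread t = new Thread(() -> {
                try {
                    log.add(name + " took " + queue.take());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            consumers.add(t);
            t.start();
        }

        try {
            for (int i = 0; i < 3; i++) {
                queue.put(i);
            }
            for (Thread t : consumers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }
}
```

Every element is taken exactly once either way; fairness only affects which waiting thread gets it, and it usually costs some throughput.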

Which of the consumer threads will take the first element when it will appear? The consumer thread which called take() first?
This is tied to the blocking queue implementation as well as the JVM in question but the short answer is most likely yes. Each of the threads will be waiting on a condition and the first thread in the wait queue will be awoken when the condition is signaled.
That said, you should not depend on this functionality since it is very dependent on the particulars of the blocking queue in question as well as the JVM and OS version.

I agree with duffymo, the idea of having multiple threads waiting indefinitely for some new elements to pop up in the queue does not sound very well structured.
Also, if you need to know which one of the consumers removes the element, that makes me think that the consumers are actually doing different things, producing different outputs in different scenarios, depending on the order in which the consumers perform the take(). If that is the case, you might want to have different queues for the different threads.
If you are not planning to change your code, what about having the threads perform a poll on a regular basis?

Is multithreading a part of Queue contract?

Currently I have an algorithm which somewhat looks like web-spiders or file search systems - it has a collection of the elements to process and processing elements can lead to enqueuing more elements.
However this algorithm is single threaded - it's because I fetch data from the db and would like to have only single db connection at once.
In my current situation performance is not critical - I'm doing this only for the visualization purposes to ease up debugging.
To me it seems natural to use the queue abstraction, however it seems that using queues implies multithreading: as I understand it, most of the standard Java queue implementations reside in the java.util.concurrent package.
I understand that I can go with any data structure that supports pull and push, but I would like to know which data structure is more natural to use in this case (is it ok to use a queue in a single-threaded application?).

It's basically fine to use the java.util.concurrent structures with a single thread.
The main thing to watch out for is blocking calls. If you use a bounded-size structure like an ArrayBlockingQueue, and you call the put method on a queue that's full, then the calling thread will block until there is space in the queue. If you use any kind of blocking queue and you call take when it's empty, the calling thread will block until there's something in the queue. If your application is single-threaded, then no other thread can ever free space or add an element, so that means blocking forever.
To avoid put blocking, you could use an unbounded structure like a LinkedBlockingQueue. To avoid blocking on removal, use a non-blocking operation - remove throws an exception if the queue is empty, and poll returns null.
Having said that, there are implementations of the Queue interface that are not in java.util.concurrent. ArrayDeque would probably be a good choice.
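For the single-threaded case in the question, a plain ArrayDeque used through the Queue interface might look like the sketch below. The class name SingleThreadedCrawl and the in-memory links map are stand-ins for the real database lookups, which the question does not show.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical single-threaded work queue: processing an element may enqueue more elements.
class SingleThreadedCrawl {
    /** Visits everything reachable from `start`; `links` stands in for the real data source. */
    static List<String> visitAll(String start, Map<String, List<String>> links) {
        Queue<String> pending = new ArrayDeque<>(); // no locking, no blocking calls
        Set<String> seen = new HashSet<>();
        List<String> order = new ArrayList<>();

        pending.add(start);
        seen.add(start);

        String current;
        // poll() returns null on an empty queue instead of blocking or throwing.
        while ((current = pending.poll()) != null) {
            order.add(current);
            for (String next : links.getOrDefault(current, Collections.emptyList())) {
                if (seen.add(next)) { // add returns false for elements already seen
                    pending.add(next);
                }
            }
        }
        return order;
    }
}
```

The loop ends naturally when the queue runs dry, which is exactly the situation where take() on a blocking queue would hang a single-threaded program.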

Queue is defined in java.util. LinkedList is a Queue and is not very concurrency-friendly. None of the Queue methods block, so they should be safe from a single-threaded perspective.

It is ok to use any queue in a single threaded application. Synchronization overhead, in absence of concurrent threads, should be negligible, and is noticeable only if element processing time is very short.

If you want to use a Queue with a thread pool, I suggest using an ExecutorService, which combines both for you. The pools created by Executors.newFixedThreadPool use a LinkedBlockingQueue by default.
http://tutorials.jenkov.com/java-util-concurrent/executorservice.html
http://recursor.blogspot.co.uk/2007/03/mini-executorservice-future-how-to-for.html
http://www.vogella.com/articles/JavaConcurrency/article.html

Reading from multiple BlockingQueues within a single thread

I have three Java LinkedBlockingQueue instances and I'd like to read from them (take operation) using only one thread. The naive approach is to have one thread per queue.
Is there anything like the UNIX select system call for blocking queues in Java?
Thanks.

Well, those BlockingQueues were really meant to be serviced by their own Threads.
Something I'd consider trying is to set up a 4th queue for much smaller items, say Booleans, and have the offer() calls on each of the 3 other queues accompany their insertion by inserting a Boolean into that 4th queue. Your thread can then go to sleep on the 4th queue, and when it wakes up it can peek() in the other 3 to find out where to get the goods.
Highly inelegant solution, I think, and I suspect there are possible race conditions where you won't be cleanly woken up some times. But it should basically work.
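Here is one way the extra-queue idea could be sketched. It is a variation on the suggestion above rather than a literal transcription: the signalling queue carries the index of the source queue instead of a Boolean, so the reader does not have to peek() at all three. The class name QueueSelector is hypothetical, and producers must insert through its put method for the scheme to hold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: one reader thread blocks on a single "ready" queue of source indexes.
class QueueSelector<T> {
    private final List<BlockingQueue<T>> sources = new ArrayList<>();
    private final BlockingQueue<Integer> ready = new LinkedBlockingQueue<>();

    QueueSelector(int sourceCount) {
        for (int i = 0; i < sourceCount; i++) {
            sources.add(new LinkedBlockingQueue<>());
        }
    }

    /** Producers insert here so that every element has a matching token. */
    void put(int source, T item) {
        sources.get(source).add(item); // element first ...
        ready.add(source);             // ... then the token, so a token never precedes its element
    }

    /** Blocks until any source has an element, then returns it. */
    T take() throws InterruptedException {
        int source = ready.take();
        // The element is guaranteed to be present because its token was added after it.
        return sources.get(source).remove();
    }

    /** Small single-threaded walk-through: elements come back in arrival order across queues. */
    static List<String> demo() {
        QueueSelector<String> selector = new QueueSelector<>(3);
        selector.put(1, "b0");
        selector.put(0, "a0");
        selector.put(2, "c0");
        List<String> received = new ArrayList<>();
        try {
            for (int i = 0; i < 3; i++) {
                received.add(selector.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

Adding the element before its token is what avoids the missed-wakeup race mentioned above. If the three queues exist only to feed this one reader, a single shared queue of tagged messages would be simpler still.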

LinkedBlockingQueue vs ConcurrentLinkedQueue

My question relates to this question asked earlier. In situations where I am using a queue for communication between producer and consumer threads would people generally recommend using LinkedBlockingQueue or ConcurrentLinkedQueue?
What are the advantages / disadvantages of using one over the other?
The main difference I can see from an API perspective is that a LinkedBlockingQueue can be optionally bounded.

For a producer/consumer thread, I'm not sure that ConcurrentLinkedQueue is even a reasonable option - it doesn't implement BlockingQueue, which is the fundamental interface for producer/consumer queues IMO. You'd have to call poll(), wait a bit if you hadn't found anything, and then poll again etc... leading to delays when a new item comes in, and inefficiencies when it's empty (due to waking up unnecessarily from sleeps).
From the docs for BlockingQueue:
BlockingQueue implementations are designed to be used primarily for producer-consumer queues
I know it doesn't strictly say that only blocking queues should be used for producer-consumer queues, but even so...

This question deserves a better answer.
Java's ConcurrentLinkedQueue is based on the famous algorithm by Maged M. Michael and Michael L. Scott for non-blocking lock-free queues.
"Non-blocking" as a term here for a contended resource (our queue) means that regardless of what the platform's scheduler does, like interrupting a thread, or if the thread in question is simply too slow, other threads contending for the same resource will still be able to progress. If a lock is involved for example, the thread holding the lock could be interrupted and all threads waiting for that lock would be blocked. Intrinsic locks (the synchronized keyword) in Java can also come with a severe penalty for performance - like when biased locking is involved and you do have contention, or after the VM decides to "inflate" the lock after a spin grace period and block contending threads ... which is why in many contexts (scenarios of low/medium contention), doing compare-and-sets on atomic references can be much more efficient and this is exactly what many non-blocking data-structures are doing.
Java's ConcurrentLinkedQueue is not only non-blocking, but it has the awesome property that the producer does not contend with the consumer. In a single producer / single consumer scenario (SPSC), this really means that there will be no contention to speak of. In a multiple producer / single consumer scenario, the consumer will not contend with the producers. This queue does have contention when multiple producers try to offer(), but that's concurrency by definition. It's basically a general purpose and efficient non-blocking queue.
As for it not being a BlockingQueue, well, blocking a thread to wait on a queue is a freakishly terrible way of designing concurrent systems. Don't. If you can't figure out how to use a ConcurrentLinkedQueue in a consumer/producer scenario, then just switch to higher-level abstractions, like a good actor framework.

LinkedBlockingQueue blocks the consumer or the producer when the queue is empty or full and the respective consumer/producer thread is put to sleep. But this blocking feature comes with a cost: every put or take operation is lock contended between the producers or consumers (if many), so in scenarios with many producers/consumers the operation might be slower.
ConcurrentLinkedQueue does not use locks but CAS on its add/poll operations, potentially reducing contention with many producer and consumer threads. But being a "wait free" data structure, ConcurrentLinkedQueue will not block when empty, meaning that the consumer will need to deal with poll() returning null, for example by "busy waiting", with the consumer thread eating up CPU.
So which one is "better" depends on the number of consumer threads, on the rate they consume/produce, etc. A benchmark is needed for each scenario.
One particular use case where the ConcurrentLinkedQueue is clearly better is when producers first produce something and finish their job by placing the work in the queue, and only afterwards the consumers start to consume, knowing that they will be done when the queue is empty. (Here there is no concurrency between producers and consumers, only between producer-producer and consumer-consumer.)
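The produce-first, consume-afterwards case described above can be sketched like this. The class name DrainAfterProduce and the summing workload are invented for the example; the point is only that an empty poll() doubles as the termination signal, so no blocking is needed.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo: the queue is filled completely before any consumer starts.
class DrainAfterProduce {
    /** Enqueues 1..items, then lets `consumers` threads drain the queue; returns the sum consumed. */
    static long run(int items, int consumers) {
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        for (int i = 1; i <= items; i++) {
            queue.add(i); // producing phase, finished before consumption begins
        }

        AtomicLong sum = new AtomicLong();
        Thread[] workers = new Thread[consumers];
        for (int w = 0; w < consumers; w++) {
            workers[w] = new Thread(() -> {
                Integer item;
                // A null from poll() means the queue is empty, which here means "all work done".
                while ((item = queue.poll()) != null) {
                    sum.addAndGet(item);
                }
            });
            workers[w].start();
        }

        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum.get();
    }
}
```

This shortcut stops being valid as soon as producers may still be running while consumers poll, since an empty queue would then no longer mean the work is finished.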

Another solution (that does not scale well) is rendezvous channels : java.util.concurrent SynchronousQueue

If your queue is not expandable and has only one producer thread and one consumer thread, you can use a lockless queue (you don't need to lock the data access).
