I need a queue that can be processed by multiple readers.
The readers will dequeue an element and send it to a REST service.
What's important to note:
Each reader should be dequeueing different elements. If the queue has elements A, B & C, Thread 1 should dequeue A and Thread 2 should dequeue B in concurrent fashion. And so forth until there's nothing in the queue.
I understand that it is CPU intensive to always run in busy loop, peeking into the queue for items. So I am not sure if a Blocking queue is a good option.
What are my options?
ConcurrentLinkedQueue or LinkedBlockingQueue are two options that immediately come to mind, depending on whether you want blocking behavior or not.
As Adamski notes, the take() method of the LinkedBlockingQueue does not needlessly burn cpu cycles while waiting for data to arrive.
I am not sure from your question description whether the threads need to dequeue elements in a strict round-robin fashion. Assuming this isn't a restriction you can use BlockingQueue's take() method, which will cause the thread to block until data is available (therefore not consuming CPU cycles).
Also note that take() implementations are atomic (e.g. LinkedBlockingQueue): if multiple threads are blocked on take() and a single element is enqueued, then only one thread's take() call will return; the others will remain blocked.
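To make that concrete, here is a minimal sketch of several readers sharing one LinkedBlockingQueue. The class name, the STOP sentinel and the "handled" collection (standing in for the REST call) are my own assumptions for illustration, not something from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MultiReaderDemo {
    // Hypothetical sentinel that tells a reader to stop; one is enqueued per reader.
    private static final String STOP = "__STOP__";

    /** Starts `readers` threads that take() from one shared queue and returns every handled item. */
    static List<String> process(List<String> items, int readers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(items);
        for (int i = 0; i < readers; i++) {
            queue.add(STOP);
        }
        Queue<String> handled = new ConcurrentLinkedQueue<>();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < readers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String item = queue.take(); // blocks without spinning while the queue is empty
                        if (item.equals(STOP)) {
                            return;
                        }
                        handled.add(item); // stand-in for sending the item to the REST service
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "reader-" + i);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new ArrayList<>(handled);
    }

    public static void main(String[] args) {
        List<String> handled = process(List.of("A", "B", "C"), 2);
        System.out.println(handled.size() + " items handled");
    }
}
```

Each element is returned by exactly one take() call, so no two readers ever process the same item. Which reader gets which item is not specified.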
The major difference between ConcurrentLinkedQueue and LinkedBlockingQueue is throughput. Under moderate thread contention, ConcurrentLinkedQueue greatly outperforms all BlockingQueue implementations. Under heavy contention, however, a BlockingQueue is a slightly better choice, as it will appropriately put contending threads into the waiting thread set.
Related
I got to know that we can use BlockingQueue instead of the classical wait() and notify() when implementing the Producer Consumer pattern. My question is, which implementation is more efficient? An article about blocking queues (http://javarevisited.blogspot.com/2012/02/producer-consumer-design-pattern-with.html#ixzz2lczIZ3Mo) says: "you don't require to use wait and notify to communicate between Producer and Consumer".
Does this simplicity come at the cost of efficiency?
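The BlockingQueue will likely be faster, because the java.util.concurrent implementations do not use wait/notify or synchronized for queue access. LinkedBlockingQueue and ArrayBlockingQueue are built on ReentrantLock and Condition, while other classes in the package (such as ConcurrentLinkedQueue) use lock-free algorithms based on the Atomic classes.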
Think about a queue of 100 elements, and 1000 threads wanting to do their work. With a synchronized implementation, for each element 999 threads need to wait until 1 thread has picked its task. With a lock-free algorithm, 100 threads simultaneously pick their task, and only the other 900 have to wait.
If the number of objects produced/consumed every second is less than 100000, then you'll be unable to see the difference for standard or your own implementations.
Otherwise, you have the following options to speed up your code:
use ArrayBlockingQueue instead of LinkedBlockingQueue: there is no need to create a wrapper object for each transferred message. Another advantage of ArrayBlockingQueue is that the producer thread is blocked if the queue is full. Indeed, the producer should slow down if the consumer is not fast enough, otherwise we will end up exhausting memory.
send messages in batches, say in arrays of 10 messages each. This reduces thread contention on the shared object.
If you have to send tens of millions of messages per second, look at the LMAX Disruptor.
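The first two options can be combined. Below is a rough sketch, assuming int messages and an empty array as a made-up end-of-stream marker; the class and method names are only for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BatchingDemo {
    static final int BATCH_SIZE = 10; // "arrays of 10 messages each"

    /** Sends the numbers 0..messageCount-1 through a bounded queue in batches and returns their sum. */
    static long sumViaBatches(int messageCount, int capacity) {
        BlockingQueue<int[]> queue = new ArrayBlockingQueue<>(capacity);
        int[] end = new int[0]; // hypothetical end marker: an empty batch
        Thread producer = new Thread(() -> {
            try {
                for (int start = 0; start < messageCount; start += BATCH_SIZE) {
                    int len = Math.min(BATCH_SIZE, messageCount - start);
                    int[] batch = new int[len];
                    for (int i = 0; i < len; i++) {
                        batch[i] = start + i;
                    }
                    queue.put(batch); // blocks while the queue is full, slowing the producer down
                }
                queue.put(end);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        long sum = 0;
        try {
            while (true) {
                int[] batch = queue.take(); // one queue operation per batch, not per message
                if (batch.length == 0) {
                    break;
                }
                for (int value : batch) {
                    sum += value;
                }
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumViaBatches(1000, 4));
    }
}
```

With a capacity of 4 batches, the producer can never get more than about 40 messages ahead of the consumer, which is the back-pressure described above.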
BlockingQueue is simply an interface whose implementations package the wait/notify style of coordination for this common use. Generally, doing it yourself is just reinventing the wheel, and only worth it if you have lots of producers and consumers and you can optimize in some way that's specific to your code.
I am using a LinkedBlockingQueue together with the producer/consumer pattern to buffer tasks. To add tasks to the queue, my producers use: Queue.put(Object); To take a task from my queue, my consumers use: Queue.take();
I found in the Java API that both of these methods will block until the queue becomes available. My problem is: I know for a fact that there are more producers of tasks in my system than consumers. And all my tasks need to be processed. So I need my consumers, when blocked, to have priority over the producers to get the queue.
Is there a way to do this without changing the methods of LinkedBlockingQueue too much?
LinkedBlockingQueue uses two ReentrantLock instances:
private final ReentrantLock putLock = new ReentrantLock();
private final ReentrantLock takeLock = new ReentrantLock();
Since the two locks are separate, and put and take acquire separate locks to carry out their operations, blocking one operation does not impact the other.
Cheers !!
There is no need to prioritize consumers over producers, because they block under entirely different conditions: if the producer is blocked because the queue is full, then the consumers won't be blocked as a result of the queue being empty.
For example, producer1 has a blocked put call because the queue is full. Consumer1 then executes take, which proceeds as normal because the queue is not empty (unless your queue has a capacity of 0, which would be silly) - the consumer doesn't know or care that a producer's put call is blocked, all it cares about is that the queue is not empty.
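A small single-threaded sketch of that scenario, assuming a capacity of 1 and using a timed offer() as a stand-in for a producer stuck in put() (the class name and task strings are made up):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class FullQueueDemo {
    static String run() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
        queue.add("task-1"); // the queue is now full
        try {
            // A producer cannot insert while the queue is full; the timed offer gives up after 50 ms.
            boolean acceptedWhileFull = queue.offer("task-2", 50, TimeUnit.MILLISECONDS);
            // A consumer is unaffected: the queue is not empty, so take() returns right away.
            String taken = queue.take();
            // Once the consumer has made room, the producer gets through.
            boolean acceptedAfterTake = queue.offer("task-2", 50, TimeUnit.MILLISECONDS);
            return acceptedWhileFull + "," + taken + "," + acceptedAfterTake;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "false,task-1,true"
    }
}
```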
The producers being blocked doesn't block consumers due to multiple independent locks.
The documentation for take() states:
Retrieves and removes the head of this queue, waiting if necessary until an element becomes available.
The documentation for put() states:
Inserts the specified element at the tail of this queue, waiting if necessary for space to become available.
If there is no space, then put will block but take won't get blocked as it's by design waiting only if the queue is empty, obviously not the case here.
Original comment:
As far as I know, this queue, by design, won't block consumers even if producers are blocked due to the queue being full.
I've been reading about blocking queues and a few questions came up. All the examples I've read show only one consumer and one producer thread. The question is: suppose we have 1 producer and 3 consumers, and at this moment all consumers have called take() but the queue is empty, so they are all waiting for the first element to appear. Which of the consumer threads will take the first element when it appears? The consumer thread which called take() first?
I don't know if you can tell. The real question is: why do you need to know? All listeners should be equivalent. It should not matter which one handles a request. If you have to know, you designed and implemented it incorrectly.
Check ArrayBlockingQueue(int capacity, boolean fair): if fair is true, then queue accesses for threads blocked on insertion or removal are processed in FIFO order.
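If you want to see this for yourself, here is a rough experiment with the fair constructor. The thread names, the 1 ms polling and the use of Thread.State.WAITING to detect a consumer parked in take() are my own assumptions. Treat the output as an observation on your JVM, not a guarantee.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class FairQueueDemo {
    /** Parks `consumers` threads in take() one after another, then feeds one element at a time. */
    static List<String> serveOrder(int consumers) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(consumers, true); // fair = true
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        try {
            for (int i = 0; i < consumers; i++) {
                Thread t = new Thread(() -> {
                    try {
                        queue.take();
                        order.add(Thread.currentThread().getName());
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }, "consumer-" + i);
                t.setDaemon(true);
                t.start();
                // Wait until this consumer is parked inside take() before starting the next one.
                while (t.getState() != Thread.State.WAITING) {
                    Thread.sleep(1);
                }
            }
            for (int served = 1; served <= consumers; served++) {
                queue.put(served);
                // Let the woken consumer record its name before releasing the next element.
                while (order.size() < served) {
                    Thread.sleep(1);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        System.out.println(serveOrder(3));
    }
}
```

With fair set to true I would expect the names to come out in the order the consumers started waiting, but as the other answers say, your design should not depend on it.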
Which of the consumer threads will take the first element when it will appear? The consumer thread which called take() first?
This is tied to the blocking queue implementation as well as the JVM in question but the short answer is most likely yes. Each of the threads will be waiting on a condition and the first thread in the wait queue will be awoken when the condition is signaled.
That said, you should not depend on this functionality since it is very dependent on the particulars of the blocking queue in question as well as the JVM and OS version.
I agree with duffymo, the idea of having multiple threads waiting indefinitely for some new elements to pop up in the queue does not sound very well structured.
Also, if you need to know which one of the consumers removes the element, that makes me think that the consumers are actually doing different things, producing different outputs in different scenarios, depending on the order in which the consumers perform the take(). If that is the case, you might want to have different queues for the different threads.
If you are not planning to change your code, what about having the threads poll on a regular basis?
We are developing a Java application with several worker threads. These threads will have to deliver a lot of computation results to our UI thread. The order in which the results are delivered does not matter.
Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered.
Is there a data structure that supports simultaneous insertions with each insertion completing in constant time?
Thanks,
Martin
ConcurrentLinkedQueue is designed for high contention. Producers enqueue stuff on one end and consumers collect elements at the other end, so everything will be processed in the order it's added.
ArrayBlockingQueue is better for lower contention, with lower space overhead.
Edit: Although that's not what you asked for. Simultaneous inserts? You may want to give every thread one output queue (say, an ArrayBlockingQueue) and then have the UI thread poll the separate queues. However, I'd think you'll find one of the two above Queue implementations sufficient.
Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered.
Do you have any evidence indicating that this is actually a problem? If the computation performed by those threads is even the least little bit complex (and you don't have literally millions of threads), then lock contention on the result stack is simply a non-issue because when any given thread delivers its results, all others are most likely busy doing their computations.
Take a step back and evaluate whether performance is the key design consideration here. Don't think, know: does profiling back it up?
If not, I'd say a bigger concern is clarity and readability of design, and not introducing new code to maintain. It just so happens that, if you're using Swing, there is a library for doing exactly what you're trying to do, called SwingWorker.
Take a look at java.util.concurrent.ConcurrentLinkedQueue, java.util.concurrent.ConcurrentHashMap or java.util.concurrent.ConcurrentSkipListSet. They might do what you need. ConcurrentSkipListSet, for instance, claims to have "expected average log(n) time cost for the contains, add and remove operations and their variants. Insertion, removal, and access operations safely execute concurrently by multiple threads."
Two other patterns you might want to look at are
each thread has its own collection, when polled it returns the collection and creates a new one, so the collection only holds the pending items between polls. The thread needs to protect operations on its collection, but there is no contention between threads. This is blocking (each thread cannot add to its collection while the UI thread pulls updates from it), but can reduce contention (no contention between threads).
each thread has its own collection, and appends the results to a common queue which is protected using a Lock.tryLock(). The thread continues processing if it fails to acquire the lock. This makes it less likely that a thread will block waiting for the shared queue.
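A minimal sketch of the second pattern, assuming Integer results and a final blocking flush so that nothing is lost when a worker finishes. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

class TryLockCollector {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<Integer> shared = new ArrayList<>();

    /** Moves the local results to the shared list only if the lock is free right now. */
    boolean tryFlush(List<Integer> local) {
        if (!lock.tryLock()) {
            return false; // someone else holds the lock: keep computing, try again later
        }
        try {
            shared.addAll(local);
            local.clear();
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Blocking flush, used once when a worker is done. */
    void flush(List<Integer> local) {
        lock.lock();
        try {
            shared.addAll(local);
            local.clear();
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try {
            return shared.size();
        } finally {
            lock.unlock();
        }
    }

    /** Runs the workers and returns how many results reached the shared list. */
    static int run(int workers, int resultsPerWorker) {
        TryLockCollector collector = new TryLockCollector();
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(() -> {
                List<Integer> local = new ArrayList<>(); // thread-private, no locking needed
                for (int i = 0; i < resultsPerWorker; i++) {
                    local.add(i); // stand-in for a computation result
                    collector.tryFlush(local);
                }
                collector.flush(local);
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return collector.size();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));
    }
}
```

The UI thread would read the shared list under the same lock. In real code you would probably attempt a flush every N results instead of after each one.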
My question relates to this question asked earlier. In situations where I am using a queue for communication between producer and consumer threads would people generally recommend using LinkedBlockingQueue or ConcurrentLinkedQueue?
What are the advantages / disadvantages of using one over the other?
The main difference I can see from an API perspective is that a LinkedBlockingQueue can be optionally bounded.
For a producer/consumer thread, I'm not sure that ConcurrentLinkedQueue is even a reasonable option - it doesn't implement BlockingQueue, which is the fundamental interface for producer/consumer queues IMO. You'd have to call poll(), wait a bit if you hadn't found anything, and then poll again etc... leading to delays when a new item comes in, and inefficiencies when it's empty (due to waking up unnecessarily from sleeps).
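To illustrate the difference, here is a sketch of both consumer styles side by side. The 5 ms sleep, the 20 ms producer delay and the names are arbitrary choices of mine.

```java
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PollVsTake {
    /** Non-blocking queue: poll, sleep when empty, poll again. */
    static String consumeByPolling(Queue<String> queue, long sleepMillis) throws InterruptedException {
        String item;
        while ((item = queue.poll()) == null) {
            // An item arriving during this sleep waits up to sleepMillis before it is noticed.
            Thread.sleep(sleepMillis);
        }
        return item;
    }

    /** Blocking queue: the thread sleeps until a producer hands it something. */
    static String consumeByTaking(BlockingQueue<String> queue) throws InterruptedException {
        return queue.take();
    }

    static String[] demo() {
        Queue<String> nonBlocking = new ConcurrentLinkedQueue<>();
        BlockingQueue<String> blocking = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(20); // the consumer has to wait for the producer
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            nonBlocking.offer("polled");
            blocking.offer("taken");
        });
        producer.start();
        try {
            return new String[] { consumeByPolling(nonBlocking, 5), consumeByTaking(blocking) };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new String[0];
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", demo()));
    }
}
```

Both get the item eventually. The polling version has to pick a sleep interval: too long adds latency, too short wastes wake-ups.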
From the docs for BlockingQueue:
BlockingQueue implementations are designed to be used primarily for producer-consumer queues
I know it doesn't strictly say that only blocking queues should be used for producer-consumer queues, but even so...
This question deserves a better answer.
Java's ConcurrentLinkedQueue is based on the famous algorithm by Maged M. Michael and Michael L. Scott for non-blocking lock-free queues.
"Non-blocking" as a term here for a contended resource (our queue) means that regardless of what the platform's scheduler does, like interrupting a thread, or if the thread in question is simply too slow, other threads contending for the same resource will still be able to progress. If a lock is involved for example, the thread holding the lock could be interrupted and all threads waiting for that lock would be blocked. Intrinsic locks (the synchronized keyword) in Java can also come with a severe penalty for performance - like when biased locking is involved and you do have contention, or after the VM decides to "inflate" the lock after a spin grace period and block contending threads ... which is why in many contexts (scenarios of low/medium contention), doing compare-and-sets on atomic references can be much more efficient and this is exactly what many non-blocking data-structures are doing.
Java's ConcurrentLinkedQueue is not only non-blocking, but it has the awesome property that the producer does not contend with the consumer. In a single producer / single consumer scenario (SPSC), this really means that there will be no contention to speak of. In a multiple producer / single consumer scenario, the consumer will not contend with the producers. This queue does have contention when multiple producers try to offer(), but that's concurrency by definition. It's basically a general purpose and efficient non-blocking queue.
As for it not being a BlockingQueue, well, blocking a thread to wait on a queue is a freakishly terrible way of designing concurrent systems. Don't. If you can't figure out how to use a ConcurrentLinkedQueue in a consumer/producer scenario, then just switch to higher-level abstractions, like a good actor framework.
LinkedBlockingQueue blocks the consumer or the producer when the queue is empty or full and the respective consumer/producer thread is put to sleep. But this blocking feature comes with a cost: every put or take operation is lock contended between the producers or consumers (if many), so in scenarios with many producers/consumers the operation might be slower.
ConcurrentLinkedQueue does not use locks but CAS for its add/poll operations, potentially reducing contention with many producer and consumer threads. But being a "wait free" data structure, ConcurrentLinkedQueue will not block when empty, meaning that the consumer will need to deal with poll() returning null, for example by "busy waiting", with the consumer thread eating up CPU.
So which one is "better" depends on the number of consumer threads, on the rate they consume/produce, etc. A benchmark is needed for each scenario.
One particular use case where ConcurrentLinkedQueue is clearly better is when producers first produce something and finish their job by placing the work in the queue, and only afterwards do the consumers start to consume, knowing that they will be done when the queue is empty. (Here there is no concurrency between producers and consumers, only between producer-producer and consumer-consumer.)
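That two-phase use case could look roughly like this. The class name and the runAll helper are hypothetical, and the "work" is just counting items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

class TwoPhaseDemo {
    /** Phase 1: producers fill the queue. Phase 2: consumers drain it. Returns the number of items consumed. */
    static int run(int producers, int consumers, int itemsPerProducer) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        runAll(producers, () -> {
            for (int i = 0; i < itemsPerProducer; i++) {
                queue.offer(i);
            }
        });
        // All producers have finished, so a null from poll() really means "no more work".
        AtomicInteger processed = new AtomicInteger();
        runAll(consumers, () -> {
            while (queue.poll() != null) {
                processed.incrementAndGet();
            }
        });
        return processed.get();
    }

    /** Hypothetical helper: runs the task on n threads and waits for all of them. */
    private static void runAll(int n, Runnable task) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(task);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(4, 3, 1000));
    }
}
```

No consumer ever has to wait here, so the missing blocking behavior costs nothing.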
Another solution (that does not scale well) is rendezvous channels: java.util.concurrent.SynchronousQueue.
If your queue is not expandable and has only one producer thread and one consumer thread, you can use a lockless queue (you don't need to lock the data access).
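For reference, a tiny sketch of what "rendezvous" means with SynchronousQueue. The class and method names are made up for the example.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.atomic.AtomicReference;

class RendezvousDemo {
    static String handOff(String message) {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        // The queue has no capacity: with no consumer waiting, a non-blocking offer is refused.
        boolean acceptedWithoutTaker = channel.offer(message);
        AtomicReference<String> received = new AtomicReference<>();
        Thread consumer = new Thread(() -> {
            try {
                received.set(channel.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            channel.put(message); // returns only once the consumer has taken the element
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acceptedWithoutTaker + ":" + received.get();
    }

    public static void main(String[] args) {
        System.out.println(handOff("hello")); // prints "false:hello"
    }
}
```

Every put() has to meet a take(), so producer and consumer move in lockstep, which is why it does not scale well.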