In a multithreaded environment, what is the policy regarding the take() method (removing an Object) for the various implementations of Java's BlockingQueue (for example LinkedBlockingQueue)?
Does the thread that calls take() first get the first available Object? In other words, is access first come, first served, random, or governed by some other policy when multiple threads use the queue? I cannot find anything about this in the docs.
public BlockingQueue<Message> Queue;
Queue = new LinkedBlockingQueue<>();
I know that if I use, say, a synchronized List, I need to surround it in synchronized blocks to use it safely across threads.
Is the same true for blocking queues?
No, you do not need to surround it with synchronized blocks.
From the JDK javadocs:
BlockingQueue implementations are thread-safe. All queuing methods achieve their effects atomically using internal locks or other forms of concurrency control. However, the bulk Collection operations addAll, containsAll, retainAll and removeAll are not necessarily performed atomically unless specified otherwise in an implementation. So it is possible, for example, for addAll(c) to fail (throwing an exception) after adding only some of the elements in c.
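As a minimal sketch of what that means in practice, the example below passes numbers from a producer thread to a consumer through a LinkedBlockingQueue with no synchronized blocks at all. The class name ProducerConsumerDemo and the method transfer are made up for this illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ProducerConsumerDemo {
    /** Sends 1..count through a shared queue and returns the sum seen by the consumer. */
    static long transfer(int count) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();

        // Producer thread: no synchronized block around the queue access.
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= count; i++) {
                queue.add(i); // the queue is unbounded, so add does not fail for capacity reasons
            }
        });
        producer.start();

        long sum = 0;
        try {
            for (int i = 0; i < count; i++) {
                sum += queue.take(); // blocks until an element is available
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(transfer(1000)); // prints 500500
    }
}
```

The consumer relies only on take() blocking until the producer has added something; the queue's internal locking does the rest.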
I just want to point out that, in my experience, the classes in the java.util.concurrent package of the JDK do not need synchronized blocks. Those classes manage the concurrency for you and are typically thread-safe. Whether intentional or not, java.util.concurrent seems to have largely removed the need for synchronized blocks in modern Java code.
It depends on the use case. Here are two scenarios: one where you do not need synchronized blocks and one where you may.
Case 1: Not required when using the queuing methods, e.g. put, take, etc.
The javadoc explains why; the important line is below:
BlockingQueue implementations are thread-safe. All queuing methods
achieve their effects atomically using internal locks or other forms
of concurrency control.
Case 2: Required while iterating over blocking queues and most concurrent collections
An iterator (one example from the comments) is weakly consistent, meaning it reflects some but not necessarily all of the changes that have been made to its backing collection since it was created. So if you care about seeing all changes, you need to use synchronized blocks or Locks while iterating, and the same lock has to guard every other access to the queue as well.
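A small sketch of what weak consistency looks like, using hypothetical helper names (QueueSnapshotDemo, snapshot, iterateWhileRemoving). It shows that modifying the queue during iteration does not throw, and that copying the queue first gives a list that no longer changes. How consistent the copy is under concurrent writers depends on the queue implementation, so treat that part as an assumption rather than a guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueSnapshotDemo {
    /** Copies the current contents; later changes to the queue do not affect the copy. */
    static <T> List<T> snapshot(BlockingQueue<T> queue) {
        return new ArrayList<>(queue);
    }

    /**
     * Removes elements while iterating. A weakly consistent iterator never throws
     * ConcurrentModificationException, but exactly which elements it visits is not specified.
     */
    static int iterateWhileRemoving(BlockingQueue<Integer> queue) {
        int visited = 0;
        for (Integer ignored : queue) {
            queue.poll(); // structural change in the middle of the iteration
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(Arrays.asList(1, 2, 3));
        List<Integer> copy = snapshot(queue);
        queue.poll();
        System.out.println("copy=" + copy + ", queue=" + queue);
        System.out.println("visited=" + iterateWhileRemoving(queue));
    }
}
```

Note that wrapping the loop in synchronized (queue) would not keep other threads out, because the java.util.concurrent queues do not use their own monitor for locking.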
You are thinking about synchronization at too low a level. It doesn't have anything to do with what classes you use. It's about protecting data and objects that are shared between threads.
If one thread is able to modify any single data object or group of related data objects while other threads are able to look at or modify the same object(s) at the same time, then you probably need synchronization. The reason is, it often is not possible for one thread to modify data in a meaningful way without temporarily putting the data into an invalid state.
The purpose of synchronization is to prevent other threads from seeing the invalid state and possibly doing bad things to the same data or to other data as a result.
Java's Collections.synchronizedList(...) gives you a way for two or more threads to share a List in such a way that the list itself is safe from being corrupted by the actions of the different threads. But it does not offer any protection for the data objects that are in the List. If your application needs that protection, then it's up to you to supply it.
If you need the equivalent protection for a queue, you can use any of the several classes that implement java.util.concurrent.BlockingQueue. But beware! The same caveat applies. The queue itself will be protected from corruption, but the protection does not automatically extend to the objects that your threads pass through the queue.
Currently we have LinkedBlockingQueue and ConcurrentLinkedQueue.
LinkedBlockingQueue can be bounded, but it uses locks.
ConcurrentLinkedQueue doesn't use locks, but it is not bounded. It also doesn't block, which makes it hard to poll.
Obviously I can't have a queue that both blocks and is lock-free (wait-free or non-blocking or something else). I am not asking for academic definitions.
Does anyone know a queue implementation that is mostly lock-free (doesn't use a lock in the hot path), blocks when empty (no need for busy waiting), and is bounded (blocking when full)? Off-heap solutions are welcome as well.
I heard about LMAX Disruptor, but it doesn't look like a queue at all.
I am happy to know non-general solutions too (Single-Producer-Single-Consumer, SPMC, MPSC)
If there are no known implementations, I am also happy to know possible algorithms.
Lock-free data structures use atomic operations (e.g. compare-and-swap) to eliminate the need for locks. Naturally, these data structures never block.
What you describe is a queue that uses lock-free mechanisms for calls that do not need to block, e.g. remove() on a non-empty queue, and uses a lock to block otherwise, e.g. remove() on an empty queue.
As you might realize, this is not possible to implement. If, for example, you were to check after a pop operation whether the queue was in fact empty and then proceed to block, then by the time you block, the queue might already have had one or more items inserted by another thread.
The Javadoc for ConcurrentLinkedQueue clearly states that it is an unbounded thread-safe queue, whereas the Javadoc for LinkedTransferQueue only mentions the unbounded nature of the queue and says nothing about thread safety.
I am not referring to the transfer method.
I mean a producer calling the add method and a consumer calling the poll method.
In short, the answer is yes: the class j.u.c.LinkedTransferQueue is thread-safe. Since the collection class is thread-safe, you can safely call any of its methods from any thread, including add and poll.
The following words from the javadoc can be taken as evidence of that:
Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a LinkedTransferQueue happen-before actions subsequent to the access or removal of that element from the LinkedTransferQueue in another thread.
Also, j.u.c.BlockingQueue doesn't make much sense in a single-threaded environment. You may use it, but there are more lightweight options behind the plain j.u.Queue interface. The main application area of BlockingQueue is producer-consumer applications, where the consumer can block waiting for the next element, which can only come from another thread because the current one is blocked. Since j.u.c.TransferQueue extends BlockingQueue, its implementations are also expected to be thread-safe.
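To illustrate the add/poll usage from the question, here is a minimal sketch with one producer and one consumer sharing a LinkedTransferQueue. The names TransferQueueDemo and produceAndConsume are invented for the example, and the consumer uses the timed poll(timeout, unit) variant rather than plain poll() so that it does not spin while the queue is momentarily empty.

```java
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.TimeUnit;

class TransferQueueDemo {
    /** Returns how many elements the consumer received from the producer thread. */
    static int produceAndConsume(int count) {
        LinkedTransferQueue<Integer> queue = new LinkedTransferQueue<>();

        Thread producer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                queue.add(i); // never blocks: the queue is unbounded
            }
        });
        producer.start();

        int received = 0;
        try {
            while (received < count) {
                // Timed poll: waits up to one second instead of returning null immediately.
                Integer item = queue.poll(1, TimeUnit.SECONDS);
                if (item == null) {
                    break; // the producer stalled; give up rather than wait forever
                }
                received++;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(produceAndConsume(10_000));
    }
}
```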
Currently I have an algorithm which somewhat looks like web-spiders or file search systems - it has a collection of the elements to process and processing elements can lead to enqueuing more elements.
However this algorithm is single threaded - it's because I fetch data from the db and would like to have only single db connection at once.
In my current situation performance is not critical - I'm doing this only for the visualization purposes to ease up debugging.
To me it seems natural to use a queue abstraction. However, it seems that using queues implies multithreading, since, as I understand it, most of the standard Java queue implementations reside in the java.util.concurrent package.
I understand that I can go with any data structure that supports pull and push, but I would like to know which data structure is more natural to use in this case (is it OK to use a queue in a single-threaded application?).
It's basically fine to use the java.util.concurrent structures with a single thread.
The main thing to watch out for is blocking calls. If you use a bounded-size structure like an ArrayBlockingQueue, and you call the put method on a queue that's full, then the calling thread will block until there is space in the queue. If you use any kind of blocking queue and you call take when it's empty, the calling thread will block until there's something in the queue. If your application is single-threaded, then nothing else can ever free space or add an element, so that means blocking forever.
To avoid put blocking, you could use an unbounded structure like a LinkedBlockingQueue. To avoid blocking on removal, use a non-blocking operation - remove throws an exception if the queue is empty, and poll returns null.
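A short single-threaded sketch of those two non-blocking removal calls on an empty queue. The class and method names are hypothetical.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class NonBlockingRemovalDemo {
    /** Describes what poll() and remove() do on an empty queue. */
    static String describeEmptyQueue() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        String polled = queue.poll(); // returns null instead of blocking

        boolean removeThrew = false;
        try {
            queue.remove(); // throws instead of blocking
        } catch (NoSuchElementException e) {
            removeThrew = true;
        }
        return "poll=" + polled + ", removeThrew=" + removeThrew;
    }

    public static void main(String[] args) {
        System.out.println(describeEmptyQueue()); // prints poll=null, removeThrew=true
    }
}
```

Calling take() at the same point would block the only thread forever, which is the situation described above.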
Having said that, there are implementations of the Queue interface that are not in java.util.concurrent. ArrayDeque would probably be a good choice.
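For the spider-like algorithm in the question, a single-threaded worklist over an ArrayDeque might look like the sketch below. The Map of links stands in for whatever the database lookup returns; it, WorklistDemo and process are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class WorklistDemo {
    /** Processes elements breadth-first; processing one element may enqueue more. */
    static List<String> process(String start, Map<String, List<String>> links) {
        Queue<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        List<String> order = new ArrayList<>();

        pending.add(start);
        seen.add(start);

        String current;
        while ((current = pending.poll()) != null) { // poll returns null when the work runs out
            order.add(current);
            for (String next : links.getOrDefault(current, Collections.emptyList())) {
                if (seen.add(next)) { // enqueue each element only once
                    pending.add(next);
                }
            }
        }
        return order;
    }
}
```

Because poll() returns null on an empty queue, the loop ends on its own once no element produces further work.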
Queue is defined in java.util. LinkedList is a Queue and is not very concurrency-friendly. None of the Queue methods block, so they should be safe from a single-threaded perspective.
It is ok to use any queue in a single threaded application. Synchronization overhead, in absence of concurrent threads, should be negligible, and is noticeable only if element processing time is very short.
If you want to use a Queue with a thread pool, I suggest using an ExecutorService, which combines both for you. The pools created by Executors.newFixedThreadPool use a LinkedBlockingQueue by default.
http://tutorials.jenkov.com/java-util-concurrent/executorservice.html
http://recursor.blogspot.co.uk/2007/03/mini-executorservice-future-how-to-for.html
http://www.vogella.com/articles/JavaConcurrency/article.html
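A minimal sketch of that combination: tasks are submitted to a fixed-size pool, which queues them internally, and the results are collected through Futures. The names ExecutorDemo and sumOfSquares, and the pool size of 4, are arbitrary choices for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorDemo {
    /** Computes 1^2 + 2^2 + ... + n^2 with one task per term. */
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int value = i;
                results.add(pool.submit(() -> value * value)); // queued until a worker is free
            }
            int sum = 0;
            for (Future<Integer> result : results) {
                sum += result.get(); // waits for that task to finish
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // prints 385
    }
}
```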
The getQueue() method provides access to the underlying blocking queue in the ThreadPoolExecutor, but this does not seem to be safe.
A traversal over the queue returned by this function might miss updates made to the queue by the ThreadPoolExecutor.
"Method getQueue() allows access to the work queue for purposes of monitoring and debugging. Use of this method for any other purpose is strongly discouraged."
What would you do if you wanted to traverse the workQueue used by the ThreadPoolExecutor? Or is there an alternate approach?
This is a continuation of:
Choosing a data structure for a variant of producer consumer problem
Now I am trying the multiple-producer, multiple-consumer case, but I want to use an existing thread pool, since I don't want to manage the thread pool myself. I also want a callback when the ThreadPoolExecutor has finished executing a task, along with the ability to examine the "in-progress transactions" data structure in a thread-safe way.
You can override the beforeExecute and afterExecute methods to let you know that a task has started and finished. You can override execute() to know when a task is added.
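One way this could look, as a sketch only: a ThreadPoolExecutor subclass that records which tasks are running in its own concurrent set, so the work queue never has to be traversed. TrackingExecutor and its members are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class TrackingExecutor extends ThreadPoolExecutor {
    // Tasks currently being run by a worker thread.
    private final Set<Runnable> inProgress = ConcurrentHashMap.newKeySet();
    private final AtomicInteger completed = new AtomicInteger();

    TrackingExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        inProgress.add(r); // r is the submitted Runnable for execute(), a wrapper for submit()
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        inProgress.remove(r);
        completed.incrementAndGet(); // the "task finished" callback hook
        super.afterExecute(r, t);
    }

    int inProgressCount() { return inProgress.size(); }

    int completedCount() { return completed.get(); }

    /** Runs the given number of empty tasks and returns how many afterExecute calls were seen. */
    static int runTasks(int tasks) {
        TrackingExecutor executor = new TrackingExecutor(2);
        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> { });
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("executor did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        }
        if (executor.inProgressCount() != 0) {
            throw new IllegalStateException("tasks still marked as in progress");
        }
        return executor.completedCount();
    }
}
```

The set is maintained by the executor's own hooks, so reading it does not race with workers removing tasks from the queue.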
The problem you have is that the Queue is not designed to be queried, and a task can be consumed before you see it. One way around this is to create your own implementation of a Queue (perhaps overriding or wrapping a ConcurrentLinkedQueue).
BTW: The queue is thread-safe, however it is not guaranteed you will see every entry.
A ConcurrentLinkedQueue.iterator() is documented as
Returns an iterator over the elements in this queue in proper sequence. The returned iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.
If you wish to copy the items in the queue and ensure that what you have in the queue has not been executed, you might try this:
a) Introduce the ability to pause and resume execution. See: http://download.oracle.com/javase/1,5.0/docs/api/java/util/concurrent/ThreadPoolExecutor.html
b) first pause the queue, then copy the queue, then resume the queue.
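The pause/copy/resume sequence could be sketched as follows. It is modelled on the pausable executor example in the ThreadPoolExecutor javadoc; the class name PausableExecutor and the helper pendingWhilePaused are made up here. Pausing only holds workers back just before they run a task, so a task that is already running continues.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class PausableExecutor extends ThreadPoolExecutor {
    private final ReentrantLock pauseLock = new ReentrantLock();
    private final Condition unpaused = pauseLock.newCondition();
    private boolean paused; // guarded by pauseLock

    PausableExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        pauseLock.lock();
        try {
            while (paused) {
                unpaused.await(); // hold the worker here until resume() is called
            }
        } catch (InterruptedException e) {
            t.interrupt();
        } finally {
            pauseLock.unlock();
        }
    }

    void pause() {
        pauseLock.lock();
        try { paused = true; } finally { pauseLock.unlock(); }
    }

    void resume() {
        pauseLock.lock();
        try { paused = false; unpaused.signalAll(); } finally { pauseLock.unlock(); }
    }

    /** Pauses a single-threaded pool, queues tasks, and returns how many were copied from the queue. */
    static int pendingWhilePaused(int tasks) {
        PausableExecutor executor = new PausableExecutor(1);
        executor.pause();                                     // first pause
        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> { });
        }
        List<Runnable> copy = new ArrayList<>(executor.getQueue()); // then copy
        executor.resume();                                    // then resume
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return copy.size();
    }
}
```

With one worker thread, the first task is handed straight to the new worker and never enters the queue, so the copy holds one task fewer than was submitted.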
And then I have my own question. The problem I see is that when you execute your "Runnable", it is not that "Runnable" that is placed in the queue but a FutureTask "wrapper", and I cannot find any way to determine which one of my runnables I'm looking at. So grabbing and examining the queue is pretty useless. Does anybody know what I missed there?
If you are following Jon Skeet's advice in your accepted answer from your previous question, then you'll be controlling access to your queues via locks. If you acquire a lock on the in-progress queue then you can guarantee that a traversal will not miss any items in it.
The problem with this, of course, is that while you are traversing, all other operations on the queue (other producers and consumers trying to access it) will block, which could have a pretty dire effect on performance.