One basic argument to use a Queue over an ArrayList is that Queue guarantees FIFO behavior.
But if I add 10 elements to an ArrayList and then iterate over the elements starting from the 0th element, then I will retrieve the elements in the same order as they were added. So essentially, that guarantees a FIFO behavior.
What is so special about Queue as compared to traditional ArrayList?
You can look at the javadoc here. The main difference is a List lets you look at any element whenever you want. A queue only lets you look at the "next" one.
Think about it as a real queue or as a line for the cash register at a grocery store. You don't ask the guy in the middle or the end to pay next, you always ask the guy who's in the front/been waiting the longest.
It's worth noting that some lists are queues. Look at LinkedList, for example.
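For instance, a minimal sketch (plain java.util classes) showing a LinkedList used through the Queue interface:
Queue<String> queue = new LinkedList<>(); // LinkedList implements both List and Queue
queue.offer("first");
queue.offer("second");
System.out.println(queue.poll()); // prints "first" - the head of the queue, FIFO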
If I gave you a Queue instance then you would know that by iteratively calling remove() you would retrieve the elements in FIFO order. If I gave you an ArrayList instance, then you can make no such guarantee.
Take the following code as an example:
ArrayList<Integer> list = new ArrayList<Integer>();

// the values arrive in the order 5, 4, 3, 2, 1 ...
list.add(5);
list.add(4);
list.add(3);
list.add(2);
list.add(1);

// ... and are then overwritten in place, so the insertion order can no longer be recovered
list.set(4, 5);
list.set(3, 4);
list.set(2, 3);
list.set(1, 2);
list.set(0, 1);

System.out.println(list); // prints [1, 2, 3, 4, 5]
If I were now to give you this list, then by iterating from 0 to 4 you would not get the elements in FIFO order.
Also, I would say another difference is abstraction. With a Queue instance you don't have to worry about indexes and this makes things easier to think about if you don't need everything ArrayList has to offer.
The limitations imposed on a queue (FIFO, no random access), as compared to an ArrayList, allow for the data structure to be better optimized, have better concurrency, and be a more appropriate and cleaner design when called for.
In regards to optimization and concurrency, imagine the common scenario where a producer is filling a queue while a consumer consumes it. If we used an ArrayList for this, then in the naive implementation each removal of the first element would cause a shift operation on the ArrayList in order to move down every other element. This is very inefficient, especially in a concurrent implementation, since the list would be locked for the duration of the entire shift operation.
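As a rough sketch of that scenario (just an illustration with java.util.concurrent, not anyone's production code), a BlockingQueue handles the producer/consumer hand-off without any manual shifting or explicit locking in user code:
BlockingQueue<String> queue = new ArrayBlockingQueue<>(100); // bounded, thread-safe

// producer thread
new Thread(() -> {
    try {
        queue.put("task");              // blocks if the queue is full
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}).start();

// consumer thread
new Thread(() -> {
    try {
        String task = queue.take();     // blocks until an element is available
        System.out.println(task);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}).start();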
In regards to design, if items are to be accessed in a FIFO fashion then using a queue automatically communicates that intention, whereas a list does not. This clarity of communication allows for easier understanding of the code, and may possibly make the code more robust and bug free.
The difference is that for a Queue, you are guaranteed to pull elements out in FIFO order. For an ArrayList, you have no idea what order the elements were added. Depending on how you use it, you could enforce FIFO ordering on an ArrayList. I could also design a wrapper for a Queue that allowed me to pull out whichever element I wanted.
The point I'm trying to make is that these classes are designed to be good at something. You don't have to use them for that, but that's what they are designed and optimized for. Queues are very good at adding and removing elements, but bad if you need to search through them. ArrayLists, on the other hand, are a bit slower to add elements, but allow easy random access. You won't see it in most applications you write, but there is often a performance penalty for choosing one over the other.
Yes!
I would have used the poll() and peek() methods of Queue: poll() returns the head element and removes it, while peek() just examines it. These methods return a special value (null) when the operation fails, instead of throwing an exception the way remove() does (it throws NoSuchElementException).
Ref: docs.oracle.com
For example, the Queue methods poll() and remove() retrieve the head element and remove it from the Queue.
Some implementations of the Queue interface (such as PriorityQueue) let you assign a priority to the elements and retrieve them according to that priority. In that case it is much more than FIFO behaviour.
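A quick sketch of that behaviour:
Queue<Integer> pq = new PriorityQueue<>(); // orders by natural ordering, not by insertion order
pq.offer(5);
pq.offer(1);
pq.offer(3);
System.out.println(pq.poll()); // prints 1 - the smallest element, not the first one added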
Consider a situation in which several processes update an ArrayList at random times and we are supposed to process the items in FIFO order.
There is absolutely no way to do that other than changing the data structure from an ArrayList to a queue.
Related
In my Android app I use this code while drawing some waypoints on a map
Iterator<Waypoint> iterator = waypoints.iterator();
while (iterator.hasNext()) {
    Waypoint w = iterator.next();
}
But I am getting this error
Fatal Exception: java.util.ConcurrentModificationException
java.util.ArrayList$ArrayListIterator.next (ArrayList.java:573)
I am not modifying the list directly in the loop I am iterating over.
But it is possible that I modify the list in another thread, because a user can move some waypoints. And the drawing of a waypoint can happen at the same time a user uses the touch display to move a waypoint.
Can I avoid that exception somehow?
If you want to maintain a List you use in several threads, it's best you use a concurrent list, such as CopyOnWriteArrayList.
Locally, you can avoid the exception by creating a copy of the waypoints list first and iterate that:
Iterator<Waypoint> iterator = new ArrayList<>(waypoints).iterator();
while (iterator.hasNext()) {
    handle(iterator.next());
}
The iterator provided by ArrayList is a fail-fast iterator, meaning it fails as soon as the underlying list is modified during iteration.
One way to avoid the exception is to take a snapshot of the list into another list and then iterate over it.
Iterator<Waypoint> iterator = new ArrayList<>(waypoints).iterator();
while (iterator.hasNext()) {
    Waypoint w = iterator.next();
}
Another way is to use a collection that provides fail-safe iterators, such as CopyOnWriteArrayList.
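A minimal sketch of that option (the Waypoint type comes from the question, draw(...) is just a placeholder):
List<Waypoint> waypoints = new CopyOnWriteArrayList<>();

// other threads may add or move waypoints at any time
for (Waypoint w : waypoints) {   // the iterator works on its own snapshot of the array
    draw(w);                     // so concurrent modifications cannot break the loop
}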
I see some options:
a. Avoid multithreading. Well, you don't have to avoid multithreading completely, just for access to the array. All accesses to the array (even read) must happen from the same thread. Heavy computations can happen on some other threads, of course. This might be a reasonable approach when you can iterate fast.
b. Lock the ArrayList, even for reading. This can be tricky, as excessive locking can introduce deadlocks.
c. Use data copies. Remember, you copy just the references; you usually don't have to clone all the objects. For large data structures, it might be worth considering some persistent data structure, which does not require copying all the data.
d. Deal with the ConcurrentModificationException somehow. Maybe restart the computation. This might be useful in some cases, but it might get tricky in complex code. Also, in some cases when accessing multiple shared data structures, you might get a livelock – two (or more) threads causing ConcurrentModificationException repeatedly to each other.
EDIT: For some approaches (at least A), you might find reactive programming useful, because this programming style reduces the time spent in the main thread.
I have a class which has:
2 fields holding time-ordered lists (list1, list2).
3 read-only methods which iterate over the above lists to generate summary statistics.
1 mutating method, which looks for a match of given 'new-item' in list1. If match is not found, it adds 'new-item' to list1. If match is found, it removes the match from list1 and adds both match and 'new-item' to list2.
Let's assume that multiple concurrent invocations of all methods are possible. I need to achieve thread-safety while maximising performance.
Approach1 (extremely slow) - Declare the field types as ArrayList and use the synchronized keyword on all methods.
Approach2 - Declare the field types as CopyOnWriteArrayList and synchronize the mutating method.
Questions
Does Approach2 ensure thread-safety?
Are there better alternatives?
Do you need the random access offered by an ArrayList? Can you instead use a thread-safe ordered collection like ConcurrentSkipListSet (non-blocking) or PriorityBlockingQueue (blocking)? Both have O(log n) insertions.
Mutation methods in both cases are thread-safe.
Edit: Just note, you would still run into atomicity concerns. If you need the adds to be done atomically then you would need more coarse-grained locking.
Approach number 2 does not guarantee thread-safety.
The two operations on collections are not atomic: first you remove an item, then you add it to the other collection. Some thread might in the meantime execute a read-only method to find out that the item is missing in list 1, and is not yet added to the list 2. It depends on your application whether this is acceptable.
On the other hand, it is also possible that: a read-only method first iterates through list 1, and finds that it contains item x; in the meantime the updating method executes and transfers item x; the read-only method continues and iterates through list 2, in which it finds item x one more time. Again, it depends on your application whether this is acceptable.
Other solutions are possible, but they would require more details about what you are trying to achieve exactly.
One obvious way would be to modify approach number 1, and instead of using synchronized on every method, use a readers-writer lock. You would read-lock in every read-only method and write-lock in the mutating one.
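A rough sketch of that variant, assuming a hypothetical Item type and made-up method names:
private final ReadWriteLock lock = new ReentrantReadWriteLock();
private final List<Item> list1 = new ArrayList<>();

public int summarize() {               // read-only method: many readers may hold the lock at once
    lock.readLock().lock();
    try {
        return list1.size();           // iterate / compute statistics here
    } finally {
        lock.readLock().unlock();
    }
}

public void update(Item newItem) {     // mutating method: exclusive access
    lock.writeLock().lock();
    try {
        list1.add(newItem);            // match / remove / transfer logic goes here
    } finally {
        lock.writeLock().unlock();
    }
}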
You could also use two separate readers-writer locks. One for the first collection and one for the other. If your read-only methods iterate through both of the lists, they would have to read-acquire both of the locks up front, before doing anything. On the other hand the mutating method would have to first write-acquire the first lock, and if it wishes to transfer an item, then it should write-acquire the second lock.
You'd need to do some testing to see if it works nicely for you. Still there are definitely even better ways to handle it, but you'd need to provide more details.
The time it takes to lock a method is less than a microsecond. If a fraction of a microsecond matters, you might consider something more complex, but otherwise something simple is usually better.
Just using a thread-safe collection is not enough when you perform multiple operations, e.g. removing from one list and adding to another is two operations, and any number of threads can get in between those operations.
Note: if you do lots of updates this can be slower.
I'm building a Java Running class that will process a set of items one by one. While it is working (running), that set may be updated (items added only).
How can I loop over that list while being sure that it takes newly added elements into consideration?
Update
Following the answers, I implemented the code that I suggested on Code Review.
Answer
You should use a Queue, e.g. java.util.concurrent.ConcurrentLinkedQueue, java.util.concurrent.LinkedBlockingQueue, java.util.concurrent.ArrayBlockingQueue or one of the other implementations that suits your needs.
They have different methods which allow you to implement different scenarios. You can check javadocs for differences between Queue methods (there are methods throwing exceptions, or returning nulls; methods for viewing elements or for retrieving them with removal).
          Throws exception      Returns special value
Insert    add(e)                offer(e)
Remove    remove()              poll()
Examine   element()             peek()
In the case of BlockingQueue implementations there are two additional options: blocking methods and methods that time out. The extended table of methods is in its javadoc.
Choose the required implementation carefully. Do you want a fixed capacity? Do you want to block on retrieval from an empty queue? Do you want your Queue bounded or unbounded? If still in doubt, look up Stack Overflow answers that explain the differences between the queue types, Google them, or check the javadocs.
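For example, a small sketch of how the two method families differ on an empty queue:
Queue<String> queue = new ArrayDeque<>();
queue.offer("a");                    // inserts the element; on a full capacity-restricted queue this would return false instead of throwing
System.out.println(queue.poll());    // "a"
System.out.println(queue.poll());    // null - the queue is empty, no exception
System.out.println(queue.remove());  // throws NoSuchElementException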
RANT
There is one problem you may or may not run into, depending on your design - how to tell whether the queue is empty because you are done producing elements. Is your producer (whatever is inserting elements into your queue) not fast enough to add items to the queue before they are consumed? Or is the queue empty because all the tasks were completed? In the latter case, if you use a blocking queue, you can block on retrieval of an element when none is available. You can consider a non-blocking queue in such a case, using a "poison pill" marker element that means the producers are done producing, or even better decouple producer from consumer by using an intermediary mediator class which holds the queue, so that producer and consumer interact only with the mediator.
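A minimal sketch of the poison-pill idea (the marker and the process(...) call are made up for illustration):
BlockingQueue<String> queue = new LinkedBlockingQueue<>();
final String POISON_PILL = "STOP";      // dedicated marker the producer enqueues when it is done

// consumer loop
try {
    while (true) {
        String item = queue.take();     // blocks until an item is available
        if (item == POISON_PILL) {      // reference comparison works because the producer
            break;                      // enqueues exactly this marker instance
        }
        process(item);                  // process(...) is just a placeholder
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}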
Use a queue (for example, java.util.concurrent.LinkedBlockingQueue) instead of a list. Queues are specifically designed for this kind of scenario.
Don't go with List.
If you use a Queue instance then you can call remove() and you will retrieve the elements in FIFO order. If you use a List instance you can make no such guarantee.
Take the following code as an example:
ArrayList<Integer> list = new ArrayList<Integer>();
list.add(5);
list.add(4);
list.add(3);
list.add(2);
list.add(1);
list.set(4,5);
list.set(3,4);
list.set(2,3);
list.set(1,2);
list.set(0,1);
System.out.println(list);
Also, another difference is abstraction. With a Queue instance you don't have to worry about indexes and this makes things easier to think about if you don't need everything a List has to offer.
Sorry if this was asked before, but I could not find my exact scenario.
Currently I have a background thread that adds an element to a list and removes the old data every few minutes. Theoretically there can be at most 2 items in the list at a time and the items are immutable. I also have multiple threads that will grab the first element in the list whenever they need it. In this scenario, is it necessary to explicitly serialize operations on the list? My assumption is that since I am just grabbing references to the elements, if the background thread deletes elements from the list, that should not matter, since the thread already grabs a copy of the reference before the deletion. There is probably a better way to do this. Thanks in advance.
Yes, synchronization is still needed here, because adding and removing are not atomic operations. If one thread calls add(0, new Object()) at the same time another calls remove(0), the result is undefined; for example, the remove() might end up having no effect.
Depending on your usage, you might be able to use a non-blocking list class like ConcurrentLinkedQueue. However, given that you are pushing one change every few minutes, I doubt you are gaining much in performance by avoiding synchronization.
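A minimal sketch of what the synchronization could look like here (the Item type and newItem are illustrative names):
final List<Item> items = new ArrayList<>();   // guarded by its own monitor

// background thread, every few minutes
synchronized (items) {
    items.add(newItem);                       // publish the new item
    if (items.size() > 2) {
        items.remove(0);                      // drop the stale one
    }
}

// reader threads
Item first;
synchronized (items) {
    first = items.isEmpty() ? null : items.get(0);
}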
My program has 100 threads.
Every single thread does this:
1) if the ArrayList is empty, add an element with certain properties to it
2) if the ArrayList is not empty, iterate through the elements in the ArrayList; if a suitable element is found (matching certain properties), take it and remove it from the ArrayList
The problem here is that while one thread is iterating through the ArrayList, the other 99 threads are waiting for the lock on it.
What would you suggest if I want all 100 threads to work without locking, so they all have work to do?
Thanks
Have you looked at shared vs exclusive locking? You could use a shared lock on the list, and then have a 'deleted' property on the list elements. The predicate you use to check the list elements would need to make sure the element is not marked 'deleted' in addition to whatever other queries you have - also due to potential read-write conflicts, you would need to lock on each element as you traverse. Then periodically get an exclusive lock on the list to perform the deletes for real.
The read lock allows for a lot of concurrency on the list. The exclusive locks on each element of the list are not as nice, but you need the memory model to propagate your 'deleted' flag to every thread, so there's no way around that.
First, if you're not running on a machine that has 64 cores or more, your 100 threads are probably a performance hog in themselves.
Then an ArrayList for what you're describing is certainly not a good choice because removing an element does not run in amortized constant time but in linear time O(n). So that's a second performance hog. You probably want to use a LinkedList instead of your ArrayList (if you insist on using a List).
Now of course I doubt very much that you need to iterate over your complete list each time you need to find one element: wouldn't another data structure be more appropriate? Maybe the elements that you put in your list have a concept of "equality", and hence a Map with O(1) lookup time could be used instead?
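For instance, a sketch assuming the elements expose a usable key (Task, key and newTask are made-up names):
ConcurrentMap<String, Task> tasks = new ConcurrentHashMap<>();

tasks.putIfAbsent(key, newTask);   // add the element if no match exists yet
Task match = tasks.remove(key);    // atomically find and remove a matching element, or get null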
That's just for a start: as I showed you, there are at least two serious performance issues in what you described... Maybe you should clarify your question if you want more help.
If your notion of "suitable element (matching certain properties)" can be encoded using a Comparator then a PriorityBlockingQueue would allow each thread to poll the queue, taking the next element without having to search the list or enqueuing a new element if the queue is empty.
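A small sketch of that approach (the Job type, its priority() accessor and newJob are made up):
BlockingQueue<Job> queue = new PriorityBlockingQueue<>(
        11, Comparator.comparingInt(Job::priority));  // "most suitable" job comes out first

queue.offer(newJob);          // enqueue new work; the queue is unbounded, so this never blocks
Job next = queue.poll();      // best match according to the comparator, or null if empty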
Addendum: Thilo raises an essential point: as your approach evolves, you may want to determine empirically how many threads are optimal.
The key is to use the object lock on the ArrayList only when you actually need to.
A good idea would be to subclass ArrayList and provide synchronization on the individual read, write and delete operations.
This will ensure fine-grained locking while allowing the threads to run through the ArrayList and still protecting its semantics.
Have a single thread own the array and be responsible for adding to it and iterating over it to find work to do. Once a unit of work is found, put the work on a BlockingQueue. Have all your worker threads use take() to remove work from the queue.
This allows multiple units of work to be discovered per pass through the array and they can be handed off to waiting worker threads fairly efficiently.
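A rough sketch of that layout (Work, backlog and process(...) are illustrative names):
BlockingQueue<Work> workQueue = new LinkedBlockingQueue<>();

// owner thread: the only thread that touches the backlog collection
for (Work w : backlog) {
    if (w.isReady()) {
        workQueue.offer(w);            // hand the unit of work to the pool
    }
}

// each worker thread
try {
    Work unit = workQueue.take();      // blocks until work is available
    process(unit);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}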