There are many posts that suggest using Iterators to safely remove an element from a collection. Something like this:
Iterator<Book> i = books.iterator();
while (i.hasNext()) {
    if (i.next().isbn().equals(isbn)) {
        i.remove();
    }
}
According to the documentation, the benefit of using an Iterator is that it is "fail-fast": if any thread modifies the collection (books in the above example) while the iterator is in use, the iterator throws a ConcurrentModificationException.
However, the documentation of this exception also says
Note that fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast operations throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: ConcurrentModificationException should be used only to detect bugs.
Does this mean that using iterators is not an option if 100% correctness has to be guaranteed? Do I need to design my code in such a way that removal while the collection is modified would always result in correct behavior? If so, can anyone give an example where using the .remove() method of an iterator is useful outside of testing?
Iterator#remove guarantees 100% correctness for single-threaded processing. In multi-threaded processing it depends on how you process the data (synchronized vs. unsynchronized access, collecting the elements to be removed into a separate list, etc.).
If you do not need to modify the collection while iterating over it, you can collect the elements to be removed into a separate List and use List#removeAll(Collection<?> c), as shown below:
import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        list.add(1);
        list.add(2);
        list.add(3);
        list.add(4);

        List<Integer> elementsToBeRemoved = new ArrayList<>();
        for (Integer i : list) {
            if (i % 2 == 0) {
                elementsToBeRemoved.add(i);
            }
        }

        list.removeAll(elementsToBeRemoved);
        System.out.println(list);
    }
}
Output:
[1, 3]
In a loop, never remove elements using the index
For a beginner, it may be tempting to use List#remove(int index) to remove elements by index, but every remove operation shifts the remaining elements and shrinks the List, which can produce confusing results, e.g.
import java.util.Iterator;
import java.util.List;
import java.util.Vector;

public class Main {
    public static void main(String[] args) {
        List<Integer> list = new Vector<>();
        list.add(1);
        list.add(2);

        Iterator<Integer> i = list.iterator();
        while (i.hasNext()) {
            System.out.println("I'm inside the iterator loop.");
            i.next();
            list.remove(0);
        }

        System.out.println(list);
    }
}
Output:
I'm inside the iterator loop.
[2]
The reason for this output: on the first pass, i.next() returns 1 and list.remove(0) shrinks the list to [2]; on the next check, hasNext() sees that the cursor (1) is no longer less than the size (1), so the loop exits after a single pass without calling next() again (which is the call that would have thrown a ConcurrentModificationException).
Iterator.remove will work as long as no other thread changes the Collection while you're iterating over it. Sometimes it's a handy feature.
When it comes to a multithreaded environment, it really depends on how you organize the code.
For example, if you create a collection inside a web request and do not share it with other requests (say, it only gets passed to some methods via method parameters), you can still safely use this method of traversing the collection.
On the other hand, if you have a 'global' queue of metrics snapshots shared among all the requests, where each request adds stats to the queue and some other thread reads the queue elements and deletes the metrics, this approach won't be appropriate.
So it's all about the use case and how you organize the code.
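A minimal sketch of that shared-queue case, using java.util.concurrent.ConcurrentLinkedQueue (MetricsSnapshot and process() are made-up placeholders, not from the question): a concurrent queue lets the writer threads add and the reader thread drain without any iterator-based removal at all.

Queue<MetricsSnapshot> metrics = new ConcurrentLinkedQueue<>();

// in the request threads:
metrics.add(new MetricsSnapshot(/* ... */));

// in the reader thread:
MetricsSnapshot snapshot;
while ((snapshot = metrics.poll()) != null) {
    process(snapshot); // poll() atomically retrieves and removes the head of the queue
}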
As for the example that you're asking for: say you have a collection of Strings and would like to remove all the strings that start with the letter "a" by modifying the existing collection:
Iterator<String> i = strings.iterator();
while (i.hasNext()) {
    if (i.next().startsWith("a")) {
        i.remove();
    }
}
Of course in Java 8+ you can achieve almost the same with Streams:
List<String> filtered = strings.stream()
    .filter(s -> !s.startsWith("a"))
    .collect(Collectors.toList());
However, this method creates a new collection, rather than modifying the existing one (like in the case with iterators).
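If you do want in-place removal as a one-liner on Java 8+, there is also Collection#removeIf, whose default implementation does roughly what the iterator loop above does (it traverses with the collection's iterator and calls its remove()):

strings.removeIf(s -> s.startsWith("a"));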
In the pre-Java 8 world (and iterators appeared long before Java 8 was available), there were no streams, so code like this was not a straightforward task to write.
Here is an interesting piece of code (could be a good interview question). Would this program compile? And if so, would it run without exceptions?
List<Integer> list = new Vector<>();
list.add(1);
list.add(2);

Iterator<Integer> i = list.iterator();
while (i.hasNext()) {
    i.next();
    list.remove(0);
}
Answer: yes. It would compile and run without exceptions. It is worth knowing that there are two remove methods on List:
E remove(int index)
Removes the element at the specified position in this list (optional operation).
boolean remove(Object o)
Removes the first occurrence of the specified element from this list, if it is present (optional operation).
Since the argument is the int literal 0, the overload that gets called is E remove(int index) (that match needs no boxing), so the element at index 0 is removed and the list shrinks from [1, 2] to [2]. After that, hasNext() returns false because the cursor is no longer less than the size, so next() is never called again after the modification and no ConcurrentModificationException is thrown, exactly as in the Vector example above. This doesn't mean that there's something wrong with the concept of an iterator, but it shows that, even in a single-threaded situation, just because an iterator is used does not mean the developer cannot make mistakes.
Does this mean that using iterators is not an option if 100% correctness has to be guaranteed?
Not necessarily.
First of all, it depends on your criteria for correctness. Correctness can only be measured against specified requirements. Saying something is 100% correct is meaningless if you don't say what the requirements are.
There are also some generalizations that we can make.
If a collection (and its iterator) is used by one thread only, 100% correctness can be guaranteed.
Concurrent collection types can be safely accessed and updated via their iterators from any number of threads. There are some caveats though:
An iteration is not guaranteed to see structural changes made after the iteration starts.
An iterator is not designed to be shared by multiple threads.
Bulk operations on a ConcurrentHashMap are not atomic.
If your correctness criteria do not depend on these things, then 100% correctness can be guaranteed.
Note: I'm not saying that iterators guarantee correctness. I am saying that iterators can be part of a correct solution, assuming that you use them the right way.
Do I need to design my code in such a way that removal while the collection is modified would always result in correct behavior?
It depends how you use the collection. See above.
But as a general rule, you do need to design and implement your code to be correct. (Correctness won't happen by magic ...)
If so, can anyone give an example where using the remove() method of an iterator is useful outside of testing?
In any example where only one thread can access the collection, using remove() is 100% safe, for all standard collection classes.
In many examples where the collection is a concurrent type, remove() is 100% safe. (But there is no guarantee that an element will stay removed if another thread is simultaneously trying to add it. Or that it will be added for that matter.)
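As a sketch of the concurrent-collection case (my example, not from the question): ConcurrentHashMap's view iterators are weakly consistent and support remove(), so the following never throws a ConcurrentModificationException, even if other threads update the map during the iteration.

Map<String, Integer> stats = new ConcurrentHashMap<>();
stats.put("errors", 3);
stats.put("warnings", 0);

Iterator<Map.Entry<String, Integer>> it = stats.entrySet().iterator();
while (it.hasNext()) {
    if (it.next().getValue() == 0) {
        it.remove(); // removes the mapping from the map
    }
}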
The bottom line is that if your application is multi-threaded, then you have to understand how different threads may interact with shared collections. There is no way to avoid that.
Related
Question: What is the optimal (performance-wise) solution for the add, removal, modification of items within an ArrayList which at the same time avoids the ConcurrentModificationException from being thrown during operations?
Context: Based on my research into this question, there don't seem to be any straightforward answers to the question at hand - most recommend using CopyOnWriteArrayList, but my understanding is that it is not recommended for lists of large size (which I am working with, hence the performance aspect of the question).
Thus, my understanding can be summarized as follows, but I want to make sure whether it is correct or incorrect:
IMPORTANT NOTE: The following statements all assume that the operation is done within a synchronized block.
Removal during iteration of an ArrayList should be done with an Iterator, because removing from the middle of the collection with an index-based for loop results in unpredictable behavior. Example:
Iterator<Item> itemIterator = items.iterator();
while (itemIterator.hasNext()) {
    Item item = itemIterator.next();
    // check if item needs to be removed
    itemIterator.remove();
}
Add operations cannot be done with an Iterator, but can be done with a ListIterator. Example:
ListIterator<Item> itemIterator = list.listIterator();
while (itemIterator.hasNext()) {
    // do some operation which requires iteration of the ArrayList
    itemIterator.add(item);
}
For add operations, a ListIterator does not necessarily have to be used (i.e. simply calling items.add(item) should not cause any problems).
Add operations while going through the collection can be done with EITHER a ListIterator or a for loop, but NOT an Iterator. Example:
Iterator<Item> itemIterator = items.iterator();
while (itemIterator.hasNext()) {
    // do some operation which requires iteration of the ArrayList
    items.add(item); // NOT acceptable - cannot modify the ArrayList while iterating over it with its Iterator
}
Modification of an item within an ArrayList can be done with either an Iterator or a for loop with the same performance complexity (is this true?). Example:
// iterator example
Iterator<Item> itemIterator = items.iterator();
while (itemIterator.hasNext()) {
    Item item = itemIterator.next();
    item.update(); // modifies the item within the ArrayList during iteration
}

// for loop example
for (Item item : items) {
    item.update();
}
Will modification during iteration with the Iterator have the same performance as the for loop? Are there any thread-safety differences between the approaches?
Bonus question: what advantage does wrapping the ArrayList with Collections.synchronizedList have for add/remove/modify operations, compared to a for loop or an iterator, if it also requires a synchronized block?
There is no difference between while loops and for loops; in fact, the idiomatic form of a loop using an iterator explicitly is a for loop:
for (Iterator<Item> it = items.iterator(); it.hasNext(); ) {
    Item item = it.next();
    item.update();
}
which gets compiled to exactly the same code as
for (Item item : items) {
    item.update();
}
Identical compiled code has no performance difference that depends on which source form was used to produce it.
Instead of focusing on the loop form, you have to focus on the fundamental limitations of inserting into or removing from an ArrayList. Each time you insert or remove an element, the elements after the affected index have to be copied to new positions. This isn't very expensive, as the array only consists of references to the objects, but the costs can easily add up when doing it repeatedly.
So, if you know that the number of insertions or removals is predictably small or will happen at the end or close to the end (so there is only a small number of elements to copy), it’s not a problem. But when inserting or removing an arbitrary number of elements at arbitrary positions in a loop, you run into a quadratic time complexity.
You can avoid this by using
items.removeIf(item -> /* check and return whether to remove the item*/);
This will use an internal iteration and postpone the moving of elements until their final position is known, leading to a linear time complexity.
If that's not feasible, you might be better off copying the list into a new list, skipping the unwanted elements (see the sketch below). This will be slightly less efficient but still has linear time complexity. That's also the solution for inserting a significant number of items at arbitrary positions.
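A minimal sketch of that copy-into-a-new-list variant, using the Item/items names from the question above (shouldRemove is a placeholder for whatever predicate you apply):

List<Item> kept = new ArrayList<>(items.size());
for (Item item : items) {
    if (!shouldRemove(item)) { // keep everything that is not to be removed
        kept.add(item);
    }
}
items = kept; // or: items.clear(); items.addAll(kept); if the same list object must be reused

Each element is copied at most once, so the whole pass stays linear regardless of how many elements are dropped.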
The item.update(); call is in an entirely different category. "The item within the ArrayList" is the wrong mindset. As said above, the ArrayList contains references to objects, and the object itself is not affected by "being inside the ArrayList". In fact, objects can be in multiple collections at the same time, as all standard collections only contain references.
So item.update(); changes the Item object, which is an operation independent of the ArrayList, and it is dangerous to assume thread safety for it based on the list.
When you have code like
Item item = items.get(someIndex);
// followed by using item
where get is from a synchronizedList, or any manually synchronized retrieval operation that returns the item to the caller, or any other code that uses a retrieved Item outside the synchronized block,
then your code is not thread safe. It doesn't help that the update() call is done under synchronization or a lock while looping over the list, when the other uses are outside that synchronization or lock. To be thread safe, all uses of an object must be protected by the same thread safety construct.
So even when you use the synchronizedList, you must not only guard your loops manually, as the documentation already says, you also have to extend that protection to all other uses of the contained elements, if they are mutable.
Alternatively, you could have different mechanisms for the list and the contained elements, if you know what you are doing, but it still means that the simplicity of “just wrap the list with synchronizedList” isn’t there.
So what advantage does it have? Actually none. It might have helped developers during the migration from Java 1.1 and its all-synchronized Vector and Hashtable to Java 2’s Collection API. But I never had a use for the synchronized wrappers at all. Any nontrivial use case requires manual synchronization (or locking) anyway.
AFAIK, there are two approaches:
Iterate over a copy of the collection
Use the iterator of the actual collection
For instance,
List<Foo> fooListCopy = new ArrayList<Foo>(fooList);
for (Foo foo : fooListCopy) {
    // modify actual fooList
}
and
Iterator<Foo> itr = fooList.iterator();
while (itr.hasNext()) {
    // modify actual fooList using itr.remove()
}
Are there any reasons to prefer one approach over the other (e.g. preferring the first approach for the simple reason of readability)?
Let me give a few examples with some alternatives to avoid a ConcurrentModificationException.
Suppose we have the following collection of books
List<Book> books = new ArrayList<Book>();
books.add(new Book(new ISBN("0-201-63361-2")));
books.add(new Book(new ISBN("0-201-63361-3")));
books.add(new Book(new ISBN("0-201-63361-4")));
Collect and Remove
The first technique consists of collecting all the objects that we want to delete (e.g. using an enhanced for loop) and, after we finish iterating, removing all the found objects.
ISBN isbn = new ISBN("0-201-63361-2");
List<Book> found = new ArrayList<Book>();
for (Book book : books) {
    if (book.getIsbn().equals(isbn)) {
        found.add(book);
    }
}
books.removeAll(found);
This is supposing that the operation you want to do is "delete".
If you want to "add", this approach would also work, but I would assume you would iterate over a different collection to determine which elements you want to add to a second collection, and then issue an addAll call at the end, as sketched below.
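A sketch of that add variant (otherBooks is an assumed second source collection and the condition is a placeholder, not from the question):

List<Book> toAdd = new ArrayList<Book>();
for (Book book : otherBooks) {
    if (shouldAdd(book)) { // shouldAdd is a placeholder predicate
        toAdd.add(book);
    }
}
books.addAll(toAdd);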
Using ListIterator
If you are working with lists, another technique consists in using a ListIterator which has support for removal and addition of items during the iteration itself.
ListIterator<Book> iter = books.listIterator();
while (iter.hasNext()) {
    if (iter.next().getIsbn().equals(isbn)) {
        iter.remove();
    }
}
Again, I used the "remove" method in the example above, which is what your question seemed to imply, but you may also use its add method to add new elements during iteration, as sketched below.
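For instance, a sketch of inserting while iterating (the second ISBN here is made up): ListIterator.add places the new element immediately after the one returned by the last next(), and the iteration continues without a ConcurrentModificationException because the change goes through the iterator itself.

ListIterator<Book> iter = books.listIterator();
while (iter.hasNext()) {
    if (iter.next().getIsbn().equals(isbn)) {
        iter.add(new Book(new ISBN("0-201-63361-5"))); // inserted right after the match
    }
}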
Using JDK >= 8
For those working with Java 8 or later, there are a couple of other techniques you could use to take advantage of it.
You could use the new removeIf method on the Collection interface:
ISBN other = new ISBN("0-201-63361-2");
books.removeIf(b -> b.getIsbn().equals(other));
Or use the new stream API:
ISBN other = new ISBN("0-201-63361-2");
List<Book> filtered = books.stream()
.filter(b -> b.getIsbn().equals(other))
.collect(Collectors.toList());
In this last case, filtered contains the matching elements, so to remove them from the original collection you can call books.removeAll(filtered). If you instead want to keep only the non-matching elements by reassigning the original reference (i.e. books = filtered), negate the predicate in the filter.
Use Sublist or Subset
There are other alternatives as well. If the list is sorted, and you want to remove consecutive elements you can create a sublist and then clear it:
books.subList(0,5).clear();
Since the sublist is backed by the original list this would be an efficient way of removing this subcollection of elements.
Something similar could be achieved with sorted sets using the NavigableSet.subSet method, or any of the other slicing methods offered there, as sketched below.
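For example (a sketch with made-up data): NavigableSet.subSet returns a view backed by the original set, so clearing the view removes that range from the set itself.

NavigableSet<String> titles = new TreeSet<>(Arrays.asList("alpha", "apple", "banana", "cherry"));
// removes every title in the half-open range ["a", "b"), i.e. those starting with 'a'
titles.subSet("a", true, "b", false).clear();
System.out.println(titles); // [banana, cherry]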
Considerations:
What method you use might depend on what you are intending to do
The collect and removeAll technique works with any Collection (List, Set, etc.).
The ListIterator technique obviously only works with lists, provided that their given ListIterator implementation offers support for add and remove operations.
The Iterator approach would work with any type of collection, but it only supports remove operations.
With the ListIterator/Iterator approach the obvious advantage is not having to copy anything since we remove as we iterate. So, this is very efficient.
The JDK 8 stream example doesn't actually remove anything; it looks for the desired elements, and then we replace the original collection reference with the new one and let the old one be garbage collected. So we iterate over the collection only once, which is efficient.
In the collect-and-removeAll approach the disadvantage is that we have to iterate twice. First we iterate in the for loop looking for objects that match our removal criteria, and once we have found them, we ask to remove them from the original collection, which implies a second round of iteration to look for each of these items in order to remove it.
I think it is worth mentioning that the remove method of the Iterator interface is marked as "optional" in Javadocs, which means that there could be Iterator implementations that throw UnsupportedOperationException if we invoke the remove method. As such, I'd say this approach is less safe than others if we cannot guarantee the iterator support for removal of elements.
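One concrete case of that "optional" caveat is the fixed-size list returned by Arrays.asList; its iterator does not support removal (a small sketch just to illustrate):

List<String> fixed = Arrays.asList("a", "b", "c"); // fixed-size list backed by an array
Iterator<String> it = fixed.iterator();
it.next();
it.remove(); // throws UnsupportedOperationException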
Old Timer Favorite (it still works):
List<String> list;
for (int i = list.size() - 1; i >= 0; --i)
{
    if (list.get(i).contains("bad"))
    {
        list.remove(i);
    }
}
Benefits:
It only iterates over the list once
No extra objects created, or other unneeded complexity
No problems with trying to use the index of a removed item, because... well, think about it!
In Java 8, there is another approach: Collection#removeIf. For example:
List<Integer> list = new ArrayList<>();
list.add(1);
list.add(2);
list.add(3);
list.removeIf(i -> i > 2);
Are there any reasons to prefer one approach over the other
The first approach will work, but has the obvious overhead of copying the list.
The second approach will not work because many containers don't permit modification during iteration. This includes ArrayList.
If the only modification is to remove the current element, you can make the second approach work by using itr.remove() (that is, use the iterator's remove() method, not the container's). This would be my preferred method for iterators that support remove().
Only the second approach modifies the original collection in place during iteration; when iterating with the collection's own iterator, iterator.remove() is the only safe way to remove elements. Any other attempt to remove from the collection during that iteration will cause a ConcurrentModificationException.
You can't always do the second: on some Collection implementations the Iterator's remove() isn't supported, so even though you use the remove() method on the Iterator, you'll get an UnsupportedOperationException thrown.
Personally, I would prefer the first for all Collection instances. Despite the additional overhead of creating a new Collection, I find it less error-prone when the code is later edited by other developers. You can read more in the docs for Iterator.
The third alternative is to create a new Collection, iterate over the original, and add to the second Collection all the members of the first Collection that are not up for deletion. Depending on the size of the Collection and the number of deletes, this could significantly save on memory compared to the first approach.
I would choose the second, as you don't have to make a copy of the collection and the Iterator works faster. So you save memory and time.
You can see this in the following sample; if, for example, we want to remove the odd values from a list, we can keep only the even ones:
public static void main(String[] args) {
    Predicate<Integer> isEven = v -> v % 2 == 0;
    List<Integer> listArr = Arrays.asList(5, 7, 90, 11, 55, 60);
    listArr = listArr.stream().filter(isEven).collect(Collectors.toList());
    listArr.forEach(System.out::println);
}
Why not this?
for (int i = 0; i < foo.size(); i++)
{
    if (foo.get(i).equals(someTest)) // someTest stands for whatever condition you check
    {
        foo.remove(i);
        i--; // step back so the element that shifted into this index is not skipped
    }
}
And if it's a Map, not a List, you can iterate over its keySet() in the same way, as sketched below.
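A sketch of the Map case (made-up data): keySet() is a view backed by the map, so removing through the view's iterator removes the whole entry.

Map<String, Integer> map = new HashMap<>();
map.put("good", 1);
map.put("bad", 2);

Iterator<String> keys = map.keySet().iterator();
while (keys.hasNext()) {
    if (keys.next().contains("bad")) {
        keys.remove(); // removes the entry from the map, no ConcurrentModificationException
    }
}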
I'm attempting to use the number of iterations of an iterator as a counter, but I was wondering about the ramifications of doing so.
private int length(Iterator<?> it) {
    int i = 0;
    while (it.hasNext()) {
        it.next();
        i++;
    }
    return i;
}
This works fine, but I'm worried about what the iterator may do behind the scenes. Perhaps, as I iterate over a stack, it pops the items off the stack, or, if I'm using a priority queue, it modifies the priorities.
The Javadoc says this about Iterator.next():
next
E next()
Returns the next element in the iteration.
Returns:
the next element in the iteration
Throws:
NoSuchElementException - if the iteration has no more elements
I don't see a guarantee that iterating over this unknown collection won't modify it. Am I thinking of unrealistic edge cases, or is this a concern? Is there a better way?
The Iterator simply provides an interface into some sort of stream; therefore not only is it perfectly possible for next() to destroy data in some way, but it's even possible for the data in an Iterator to be unique and irreplaceable.
We could come up with more direct examples, but an easy one is the Iterator in DirectoryStream. While a DirectoryStream is technically Iterable, it only allows one Iterator to be constructed, so if you tried to do the following:
Path dir = ...
try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
    int count = length(stream.iterator());
    for (Path entry : stream) {
        ...
    }
}
You would get an exception in the foreach block, because the stream can only be iterated once. So in summary, it is possible for your length() method to change objects and lose data.
Furthermore, there's no reason an Iterator has to be associated with some separate data store. Take for example an answer I gave a few months ago providing a clean way to select n random numbers. By using an infinite Iterator we are able to provide, filter, and pass around arbitrarily large amounts of random data lazily, with no need to store it all at once, or even compute it until it's needed. Because the Iterator isn't backed by any data structure, querying it is obviously destructive.
Now that said, these examples don't make your method bad. Notice that the Guava library (which everyone should be using) provides an Iterators class with exactly the behavior you detail above, called size() to conform with the Collections Framework. The burden is then on the user of such methods to be aware of what sort of data they're working with, and avoid making careless calls such as trying to count the number of results in an Iterator that they know cannot be replaced.
As far as I can tell, the Collection specification does not explicitly state that iterating over a collection does not modify it, but no classes in the standard library show that behaviour (actually at least one does, see dimo414's answer), so any class that did would be highly suspect. I don't think you need to worry about this.
Note that the Guava library implements Iterators.size() and Iterables.size() in the same way that you are, so clearly they find it safe in the general case.
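For reference (assuming Guava is on the classpath, and with collection standing in for whatever Iterable you are counting), the call consumes the iterator just like the hand-written method above:

import com.google.common.collect.Iterators;

int count = Iterators.size(collection.iterator());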
No, iterating over a collection will not modify the collection. The Iterator class does have a remove() method, which is the only safe way of removing an element from a collection during iteration. But simply calling hasNext() and next() will not modify the collection.
Keep in mind that if you modify the object returned by next(), those changes will be present in your collection.
Think about it -- methods that return things are (if they are written correctly) accessor methods, meaning that they just return data. They do not modify it (they are not mutator methods).
Here's an example I had on my disk of how an iterator might be implemented. As you can see, no values are actually modified.
public class ArraySetIterator implements Iterator
{
    private int nextIndex;
    private ArraySet theArraySet;

    public ArraySetIterator(ArraySet a)
    {
        this.nextIndex = 0;
        this.theArraySet = a;
    }

    public boolean hasNext()
    {
        return this.nextIndex < this.theArraySet.size();
    }

    public Object next()
    {
        return this.theArraySet.get(this.nextIndex++);
    }
}
According to http://download.oracle.com/javase/1.4.2/docs/api/java/util/Vector.html
The Iterators returned by Vector's iterator and listIterator methods are fail-fast: if the Vector is structurally modified at any time after the Iterator is created, in any way except through the Iterator's own remove or add methods, the Iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the Iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future. The Enumerations returned by Vector's elements method are not fail-fast. Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
Could you give me an example to validate the above set of statements? I'm still unclear about the fail-fast behaviour of the iterators returned by Vector's iterator and listIterator methods.
if the Vector is structurally modified at any time after the Iterator is created, in any way except through the Iterator's own remove or add methods, the Iterator will throw a ConcurrentModificationException.
Here is an example:
import java.util.*;

public class Test {
    public static void main(String[] args) {
        List<String> strings = new Vector<String>();
        strings.add("lorem");
        strings.add("ipsum");
        strings.add("dolor");
        strings.add("sit");

        int i = 0;
        Iterator<String> iter = strings.iterator();
        while (iter.hasNext()) {
            System.out.println(iter.next());
            // Modify the list in the middle of iteration.
            if (i++ == 1)
                strings.remove(0);
        }
    }
}
Output:
lorem
ipsum
Exception in thread "main" java.util.ConcurrentModificationException
at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
at java.util.AbstractList$Itr.next(AbstractList.java:343)
at Test.main(Test.java:18)
The program does the following:
Creates a Vector and gets an iterator
Calls next() twice.
Modifies the vector (by removing the first element)
Calls next() again (after the vector has been modified)
This causes a ConcurrentModificationException to be thrown.
Since Java's for-each loops rely on iterators, these constructs may also throw ConcurrentModificationExceptions. The solution is to make a copy of the list before iterating (so you iterate over the copy) or to use, for instance, a CopyOnWriteArrayList like this:
import java.util.*;
import java.util.concurrent.CopyOnWriteArrayList;

public class Test {
    public static void main(String[] args) {
        List<String> strings = new CopyOnWriteArrayList<String>();
        strings.add("lorem");
        strings.add("ipsum");
        strings.add("dolor");
        strings.add("sit");

        int i = 0;
        Iterator<String> iter = strings.iterator();
        while (iter.hasNext()) {
            System.out.println(iter.next());
            // Modify the list in the middle of iteration.
            if (i++ == 1)
                strings.remove(0);
        }
    }
}
Output:
lorem
ipsum
dolor
sit
A simple way to trigger a concurrent modification exception is
List<String> strings = new ArrayList<String>();
strings.add("a");
strings.add("b");
strings.add("c");

for (String s : strings)
    strings.remove(s);
(Note that at least three elements are needed here: with only two, removing the first element makes hasNext() return false and the loop ends before the check inside next() can detect the modification.)
This triggers an exception because the collection is changed while iterating over the collection.
The reason the Iterator fails fast is to help you detect that a collection was modified concurrently (which these collections don't support) and to help you find where the error occurred. Without this feature you could have subtle bugs which might not show a problem until much later in your code (making them much harder to trace).
The newer concurrent collections handle concurrent modification differently and generally don't do this. They were introduced into core Java in 2004 (with Java 5); I suggest you have a look at these newer collections.
BTW: Don't use Vector unless you have to.
Say you have a Vector of Integers containing 1-10 and you want to remove the odd numbers. You iterate over this list looking for odd numbers and use the Iterator's remove() method. After this you have some code that, of course, assumes there are no odd numbers left in the Vector. If another thread modifies the vector during this process, there might sometimes actually be an odd number left (depending on the race condition), breaking the code that comes after. Perhaps it doesn't even break right away; maybe it doesn't cause a problem until hours or days later -- very hard to troubleshoot. This is what happens with the Enumerations returned by the elements() method, which are not fail-fast.
Fail-fast means trying to detect this (potential) problem as soon as it occurs and sounding the alarm, which makes it much easier to troubleshoot. As soon as another thread is found to have modified the collection, an exception is thrown. This is what happens with the iterators.
The Iterators returned by iterator() and listIterator() actively watch for unexpected modifications to the underlying list. The Vector class (actually its parent AbstractList) increments a counter each time it is modified. When iterators for the Vector are created, they store a copy of the Vector's modification counter. Each time you call next() or remove(), the Iterator compares its stored value of the counter to the Vector's actual counter. If they differ, it throws a ConcurrentModificationException.
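Here is a stripped-down sketch of that counter mechanism (my own illustration, not the JDK source): the collection bumps a modification counter on every structural change, and the iterator compares its snapshot of the counter on every next().

import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Tiny array-backed bag, written only to illustrate the fail-fast check
class FailFastBag implements Iterable<Integer> {
    private int[] data = new int[10];
    private int size;
    private int modCount; // incremented on every structural change

    public void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = value;
        modCount++;
    }

    public void removeFirst() {
        if (size == 0) throw new NoSuchElementException();
        System.arraycopy(data, 1, data, 0, --size);
        modCount++;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int cursor;
            private final int expectedModCount = modCount; // snapshot taken when the iterator is created

            @Override
            public boolean hasNext() {
                return cursor < size;
            }

            @Override
            public Integer next() {
                if (modCount != expectedModCount) { // the bag was changed behind our back
                    throw new ConcurrentModificationException();
                }
                if (cursor >= size) {
                    throw new NoSuchElementException();
                }
                return data[cursor++];
            }
        };
    }
}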
I'm iterating over a JRE Collection which enforces the fail-fast iterator concept, and thus will throw a ConcurrentModificationException if the Collection is modified while iterating, other than by using the Iterator.remove() method. However, I need to remove an object's "logical partner" if the object meets a condition, thus preventing the partner from also being processed. How can I do that? Perhaps by using a better collection type for this purpose?
Example.
Collection<BusinessObject> myCollection = ...;

for (BusinessObject anObject : myCollection)
{
    if (someConditionIsTrue)
    {
        myCollection.remove(anObjectsPartner); // throws ConcurrentModificationException
    }
}
Thanks.
It's not a fault of the collection, it's the way you're using it. Modifying the collection while halfway through an iteration leads to this error (which is a good thing as the iteration would in general be impossible to continue unambiguously).
Edit: Having reread the question this approach won't work, though I'm leaving it here as an example of how to avoid this problem in the general case.
What you want is something like this:
for (Iterator<BusinessObject> iter = myCollection.iterator(); iter.hasNext(); )
{
    BusinessObject anObject = iter.next();
    if (someConditionIsTrue)
    {
        iter.remove();
    }
}
If you remove objects through the Iterator itself, it's aware of the removal and everything works as you'd expect. Note that while I think all standard collections work nicely in this respect, Iterators are not required to implement the remove() method so if you have no control over the class of myCollection (and thus the implementation class of the returned iterator) you might need to put more safety checks in there.
An alternative approach (say, if you can't guarantee the iterator supports remove() and you require this functionality) is to create a copy of the collection to iterate over, then remove the elements from the original collection.
Edit: You can probably use this latter technique to achieve what you want, but then you still end up coming back to the reason why iterators throw the exception in the first place: What should the iteration do if you remove an element it hasn't yet reached? Removing (or not) the current element is relatively well-defined, but you talk about removing the current element's partner, which I presume could be at a random point in the iterable. Since there's no clear way that this should be handled, you'll need to provide some form of logic yourself to cope with this. In which case, I'd lean towards creating and populating a new collection during the iteration, and then assigning this to the myCollection variable at the end. If this isn't possible, then keeping track of the partner elements to remove and calling myCollection.removeAll would be the way to go.
You want to remove an item from a list and continue to iterate on the same list. Can you implement a two-step solution where in step 1 you collect the items to be removed in an interim collection and in step 2 remove them after identifying them?
Some thoughts (it depends on what exactly the relationship is between the two objects in the collection):
A Map with the object as the key and the partner as the value.
A CopyOnWriteArrayList, but you have to notice when you hit the partner
Make a copy into a different Collection object, and iterate over one while removing from the other. If the original Collection can be a Set, that would certainly be helpful for removal.
You could try finding all the items to remove first and then removing them once you have finished processing the entire list, skipping over the items already marked for deletion as you find them.
Collection<BusinessObject> myCollection = ...;
List<BusinessObject> deletedObjects = new ArrayList<>(myCollection.size());

for (BusinessObject anObject : myCollection)
{
    if (!deletedObjects.contains(anObject))
    {
        if (someConditionIsTrue)
        {
            deletedObjects.add(anObjectsPartner);
        }
    }
}

myCollection.removeAll(deletedObjects);
CopyOnWriteArrayList will do what you want.
Why not use a Collection of all the original BusinessObjects and then a separate structure (such as a Map) which associates them (i.e. creates the partner relationship)? Put both into a class of their own, so that you can always remove the partner when a BusinessObject is removed, instead of making it the caller's responsibility every time they need to remove a BusinessObject from the Collection.
i.e.
class BusinessObjectCollection implements Collection<BusinessObject> {

    Collection<BusinessObject> objects;
    Map<BusinessObject, BusinessObject> associations;

    public boolean remove(Object o) {
        ...
        // remove from the collection and disassociate the partner...
    }
}
The best answer is the second: use an iterator.