Collection - Iterator.remove() vs Collection.remove()

Collection - Iterator.remove() vs Collection.remove() - java

As per Sun ,
"Iterator.remove is the only safe way to modify a collection during
iteration; the behavior is unspecified if the underlying collection is
modified in any other way while the iteration is in progress."
I have two questions :
What makes this operation "Iterator.remove()" stable than the others ?
Why did they provide a "Collection.remove()" method if it will not be useful in most of the use-cases?

First of all, Collection.remove() is very useful. It is applicable in a lot of use cases, probably more so than Iterator.remove().
However, the latter solves one specific problem: it allows you to modify the collection while iterating over it.
The problem solved by Iterator.remove() is illustrated below:
List<Integer> l = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4));
for (int el : l) {
if (el < 3) {
l.remove(el);
}
}
This code is invalid since l.remove() is called during iteration over l.
The following is the correct way to write it:
Iterator<Integer> it = l.iterator();
while (it.hasNext()) {
int el = it.next();
if (el < 3) {
it.remove();
}
}

If you're iterating over a collection and use:
Collection.remove()
you can get runtime errors (specifically ConcurrentModifcationException) because you're changing the state of the object used previously to construct the explicit series of calls necessary to complete the loop.
If you use:
Iterator.remove()
you tell the runtime that you would like to change the underlying collection AND re-evaluate the explicit series of calls necessary to complete the loop.

As the documentation you quoted clearly states,
Iterator.remove is the only safe way to modify a collection during iteration
(emphasis added)
While using am iterator, you cannot modify the collection, except by calling Iterator.remove().
If you aren't iterating the collection, you would use Collection.remove().

What makes this operation "Iterator.remove()" stable than the others ?
It means that iterator knows you removed the element so it won't produce a ConcurrentModifcationException.
Why did they provide a "Collection.remove()" method if it will not be useful in most of the use-cases ?
Usually you would use Map.remove() or Collection.remove() as this can be much more efficient than iterating over every objects. If you are removing while iterating often I suspect you should be using different collections.

Is just a design choice. It would have been possible to specify a different behavior (i.e. the iterator has to skip values that were removed by Collection.remove()), but that would have made the implementation of the collection framework much more complex. So the choice to leave it unspecified.
It's quite useful. If you know the object you want to remove, why iterate?

From what I understand, the Collection.remove(int index) will also return the removed object. Iterative.remove() will not.

Related

Does Kotlin's version of Java's for each have the same limitations?

I'm currently learning Kotlin coming from Java. In Java, doing
for(String s:stringList){
if(condition == true) stringList.remove(s);
}
doesn't work, as you can only read data with for each. Does this also apply to Kotlin?

It's the same restriction as in Java - when you're iterating over a collection, you're limited to modifying it through the Iterator's remove() method
Removes from the underlying collection the last element returned by this iterator (optional operation). This method can be called only once per call to next(). The behavior of an iterator is unspecified if the underlying collection is modified while the iteration is in progress in any way other than by calling this method.
Kotlin has its own equivalent (bearing in mind Kotlin uses the concept of mutable and immutable lists, so you can only do this with a MutableList)
The problem with the foreach / enhanced for structure is it doesn't expose the Iterator you need to call remove() on, and you're not allowed to call remove() on the collection itself. And Kotlin's the same - so your options are basically
use for/forEach (but you can't modify the collection)
call the collection's iterator() method and iterate over it yourself (now you can call remove() on it, with the restrictions I quoted)
avoid mutability and produce a new collection
The last one is what you're encouraged to do in Kotlin - instead of modifying a collection, try to use immutable ones and transform them into new collections instead:
stringList.filterNot { condition }
If you get familiar with all the collection functions (yeah, there are a lot of them) you'll get an idea of the ways you can transform A to B, and the convenience functions that are there to help you do it (e.g. filterTo which lets you provide a MutableList to populate with the items you want to keep).
If you do want to modify mutable collections in place though, it's possible but you have to manage it yourself, because it's inherently dangerous and the system can't guarantee you're doing it safely.

Yes, if you rewrite this code in Kotlin, you'll get the same concurrent modification problem:
for (s in stringList) {
if(condition == true) stringList.remove(s)
}
To avoid this problem, you can obtain a mutable iterator from a mutable list and use it to remove elements on the go. But unlike Java, you don't have to iterate it manually with hasNext()/next() calls — you can use the same for operator for iterating it:
val iterator = stringList.iterator()
for (s in iterator) {
if (condition == true) iterator.remove()
}

Need of Iterator class in Java?

The question might be pretty vague I know. But the reason I ask this is because the class must have been made with some thought in mind.
This question came into my mind while browsing through a few questions here on SO.
Consider the following code:
class A
{
private int myVar;
A(int varAsArg)
{
myVar = varAsArg;
}
public static void main(String args[])
{
List<A> myList = new LinkedList<A>();
myList.add(new A(1));
myList.add(new A(2));
myList.add(new A(3));
//I can iterate manually like this:
for(A obj : myList)
System.out.println(obj.myVar);
//Or I can use an Iterator as well:
for(Iterator<A> i = myList.iterator(); i.hasNext();)
{
A obj = i.next();
System.out.println(obj.myVar);
}
}
}
So as you can see from the above code, I have a substitute for iterating using a for loop, whereas, I could do the same using the Iterator class' hasNext() and next() method. Similarly there can be an example for the remove() method. And the experienced users had commented on the other answers to use the Iterator class instead of using the for loop to iterate through the List. Why?
What confuses me even more is that the Iterator class has only three methods. And the functionality of those can be achieved with writing a little different code as well.
Some people might argue that the functionality of many classes can be achieved by writing one's own code instead of using the class made for the purpose. Yes,true. But as I said, Iterator class has only three methods. So why go through the hassle of creating an extra class when the same job can be done with a simple block of code which is not way too complicated to understand either.
EDIT:
While I'm at it, since many of the answers say that I can't achieve the remove functionality without using Iterator,I would just like to know if the following is wrong, or will it have some undesirable result.
for(A obj : myList)
{
if(obj.myVar == 1)
myList.remove(obj);
}
Doesn't the above code snippet do the same thing as remove() ?

Iterator came long before the for statement that you show in the evolution of Java. So that's why it's there. Also if you want to remove something, using Iterator.remove() is the only way you can do it (you can't use the for statement for that).

First of all, the for-each construct actually uses the Iterator interface under the covers. It does not, however, expose the underlying Iterator instance to user code, so you can't call methods on it.
This means that there are some things that require explicit use of the Iterator interface, and cannot be achieved by using a for-each loop.
Removing the current element is one such use case.
For other ideas, see the ListIterator interface. It is a bidirectional iterator that supports inserting elements and changing the element under the cursor. None of this can be done with a for-each loop.
for(A obj : myList)
{
if(obj.myVar == 1)
myList.remove(obj);
}
Doesn't the above code snippet do the same thing as remove() ?
No, it does not. All standard containers that I know of will throw ConcurrentModificationException when you try to do this. Even if it were allowed to work, it is ambiguous (what if obj appears in the list twice?) and inefficient (for linked lists, it would require linear instead of constant time).

The foreach construct (for (X x: list)) actually uses Iterator as its implementation internally. You can feed it any Iterable as a source of elements.
And, as others already remarked: Iterator is longer in Java than foreach, and it provides remove().
Also: how else would you implement your own provider class (myList in your example)? You make it Iterable and implement a method that creates an Iterator.

For one thing, Iterator was created way before the foreach loop (shown in your code sample above) was introduced into Java. (The former came in Java2, the latter only in Java5).
Since Java5, indeed the foreach loop is the preferred idiom for the most common scenario (when you are iterating through a single Iterable at a time, in the default order, and do not need to remove or index elements). Note though that the foreach uses an iterator in the background for standard collection classes; in other words it is just syntactic sugar.

Iterator, listIterator both are used to allow different permission to user, like list iterator have 9 methods but iterator have only 3 methods, but have remove functionality which you can't achieve with for loop. Enumeration is another thing which is also used to give only read permissions.

Iterator is an implementation of the classical GoF design pattern. In that way you can achieve clear behaviour separation from the 'technical code' which iterates (the Iterator) and your business code.
Imagine you have to change the 'next' behaviour (say, by getting not the next element but the next EVEN element). If you rely only on for loops you will have to change manually every single for loop, in a way like this
for (int i; i < list.size(); i = i+2)
while if you use an Iterator you can simply override/rewrite the "next()" and "hasNext()" methods and the change will be visible everywhere in your application.

I think answer to your question is abstraction. Iterator is written because to abstract iterating over different set of collections.
Every collection has different methods to iterate over their elements. ArrayList has indexed access. Queues has poll and peek methods. Stack has pop and peek.
Usually you only need to iterate over elements so Iterator comes into play. You do not care about which type of Collection you need to iterate. You only call iterator() method and user Iterator object itself to do this.
If you ask why not put same methods on Collection interface and get rid of extra object creation. You need to know your current position in collection so you can not implement next method in Collection because you can not use it on different locations because every time you call next() method it will increment index (simplifying every collection has different implementation) so you will skip some objects if you use same collection at different places. Also if collection support concurrency than you can not write a multi-thread safe next() method in collection.
It is usually not safe to remove an object from collection iterating by other means than iterator. Iterator.remove() method is safest way to do it. For ArrayList example:
for(int i=0;i

for-each vs for vs while

I wonder what is the best way to implement a "for-each" loop over an ArrayList or every kind of List.
Which of the followings implementations is the best and why? Or is there a best way?
Thank you for your help.
List values = new ArrayList();
values.add("one");
values.add("two");
values.add("three");
...
//#0
for(String value : values) {
...
}
//#1
for(int i = 0; i < values.size(); i++) {
String value = values.get(i);
...
}
//#2
for(Iterator it = values.iterator(); it.hasNext(); ) {
String value = it.next();
...
}
//#3
Iterator it = values.iterator();
while (it.hasNext()) {
String value = (String) it.next();
...
}

#3 has a disadvantage because the scope of the iterator it extends beyond the end of the loop. The other solutions don't have this problem.
#2 is exactly the same as #0, except #0 is more readable and less prone to error.
#1 is (probably) less efficient because it calls .size() every time through the loop.
#0 is usually best because:
it is the shortest
it is least prone to error
it is idiomatic and easy for other people to read at a glance
it is efficiently implemented by the compiler
it does not pollute your method scope (outside the loop) with unnecessary names

The short answer is to use version 0. Take a peek at the section title Use Enhanced For Loop Syntax at Android's documentation for Designing for Performance. That page has a bunch of goodies and is very clear and concise.

#0 is the easiest to read, in my opinion, but #2 and #3 will work just as well. There should be no performance difference between those three.
In almost no circumstances should you use #1. You state in your question that you might want to iterate over "every kind of List". If you happen to be iterating over a LinkedList then #1 will be n^2 complexity: not good. Even if you are absolutely sure that you are using a list that supports efficient random access (e.g. ArrayList) there's usually no reason to use #1 over any of the others.

In response to this comment from the OP.
However, #1 is required when updating (if not just mutating the current item or building the results as a new list) and comes with the index. Since the List<> is an ArrayList<> in this case, the get() (and size()) is O(1), but that isn't the same for all List-contract types.
Lets look at these issues:
It is certainly true that get(int) is not O(1) for all implementations of the List contract. However, AFAIK, size() is O(1) for all List implementations in java.util. But you are correct that #1 is suboptimal for many List implementations. Indeed, for lists like LinkedList where get(int) is O(N), the #1 approach results in a O(N^2) list iteration.
In the ArrayList case, it is a simple matter to manually hoist the call to size(), assigning it to a (final) local variable. With this optimization, the #1 code is significantly faster than the other cases ... for ArrayLists.
Your point about changing the list while iterating the elements raises a number of issues:
If you do this with a solution that explicitly or implicitly uses iterators, then depending on the list class you may get ConcurrentModificationExceptions. If you use one of the concurrent collection classes, you won't get the exception, but the javadocs state that the iterator won't necessarily return all of the list elements.
If you do this using the #1 code (as is) then, you have a problem. If the modification is performed by the same thread, you need to adjust the index variable to avoid missing entries, or returning them twice. Even if you get everything right, a list entry concurrently inserted before the current position won't show up.
If the modification in the #1 case is performed by a different thread, it hard to synchronize properly. The core problem is that get(int) and size() are separate operations. Even if they are individually synchronized, there is nothing to stop the other thread from modifying the list between a size and get call.
In short, iterating a list that is being concurrently modified is tricky, and should be avoided ... unless you really know what you are doing.

Whats the replacement of For-Each loop for filtering?

Though for-each loop has many advantages but the problem is ,it doesn't work when you want to Filter(Filtering means removing element from List) a List,Can you please any replacement as even traversing through Index is not a good option..

What do you mean by "filtering"? Removing certain elements from a list? If so, you can use an iterator:
for(Iterator<MyElement> it = list.iterator(); it.hasNext(); ) {
MyElement element = it.next();
if (some condition) {
it.remove();
}
}
Update (based on comments):
Consider the following example to illustrate how iterator works. Let's say we have a list that contains 'A's and 'B's:
A A B B A
We want to remove all those pesky Bs. So, using the above loop, the code will work as follows:
hasNext()? Yes. next(). element points to 1st A.
hasNext()? Yes. next(). element points to 2nd A.
hasNext()? Yes. next(). element points to 1st B. remove(). iterator counter does NOT change, it still points to a place where B was (technically that's not entirely correct but logically that's how it works). If you were to call remove() again now, you'd get an exception (because list element is no longer there).
hasNext()? Yes. next(). element points to 2nd B. The rest is the same as #3
hasNext()? Yes. next(). element points to 3rd A.
hasNext()? No, we're done. List now has 3 elements.
Update #2: remove() operation is indeed optional on iterator - but only because it is optional on an underlying collection. The bottom line here is - if your collection supports it (and all collections in Java Collection Framework do), so will the iterator. If your collection doesn't support it, you're out of luck anyway.

ChssPly76's answer is the right approach here - but I'm intrigued as to your thinking behind "traversing through index is not a good option". In many cases - the common case in particular being that of an ArrayList - it's extremely efficient. (In fact, in the arraylist case, I believe that repeated calls to get(i++) are marginally faster than using an Iterator, though nowhere near enough to sacrifice readability).
Broadly speaking, if the object in question implements java.util.RandomAccess, then accessing sequential elements via an index should be roughly the same speed as using an Iterator. If it doesn't (e.g. LinkedList would be a good counterexample) then you're right; but don't dismiss the option out of hand.

I have had success using the
filter(java.util.Collection collection, Predicate predicate)
method of CollectionUtils in commons collections.
http://commons.apache.org/collections/api-2.1.1/org/apache/commons/collections/CollectionUtils.html#filter(java.util.Collection,%20org.apache.commons.collections.Predicate)

If you, like me, don't like modifying a collection while iterating through it's elements or if the iterator just doesn't provide an implementation for remove, you can use a temporary collection to just collect the elements you want to delete. Yes, yes, its less efficient compared to modifying the iterator, but to me it's clearer to understand whats happening:
List<Object> data = getListFromSomewhere();
List<Object> filter = new ArrayList<Object>();
// create Filter
for (Object item: data) {
if (throwAway(item)) {
filter.add(item);
}
}
// use Filter
for (Object item:filter) {
data.remove(item);
}
filter.clear();
filter = null;

How to safely remove other elements from a Collection while iterating through the Collection

I'm iterating over a JRE Collection which enforces the fail-fast iterator concept, and thus will throw a ConcurrentModificationException if the Collection is modified while iterating, other than by using the Iterator.remove() method . However, I need to remove an object's "logical partner" if the object meets a condition. Thus preventing the partner from also being processed. How can I do that? Perhaps by using better collection type for this purpose?
Example.
myCollection<BusinessObject>
for (BusinessObject anObject : myCollection)
{
if (someConditionIsTrue)
{
myCollection.remove(anObjectsPartner); // throws ConcurrentModificationException
}
}
Thanks.

It's not a fault of the collection, it's the way you're using it. Modifying the collection while halfway through an iteration leads to this error (which is a good thing as the iteration would in general be impossible to continue unambiguously).
Edit: Having reread the question this approach won't work, though I'm leaving it here as an example of how to avoid this problem in the general case.
What you want is something like this:
for (Iterator<BusinessObject> iter = myCollection.iterator; iter.hasNext(); )
{
BusinessObject anObject = iter.next();
if (someConditionIsTrue)
{
iter.remove();
}
}
If you remove objects through the Iterator itself, it's aware of the removal and everything works as you'd expect. Note that while I think all standard collections work nicely in this respect, Iterators are not required to implement the remove() method so if you have no control over the class of myCollection (and thus the implementation class of the returned iterator) you might need to put more safety checks in there.
An alternative approach (say, if you can't guarantee the iterator supports remove() and you require this functionality) is to create a copy of the collection to iterate over, then remove the elements from the original collection.
Edit: You can probably use this latter technique to achieve what you want, but then you still end up coming back to the reason why iterators throw the exception in the first place: What should the iteration do if you remove an element it hasn't yet reached? Removing (or not) the current element is relatively well-defined, but you talk about removing the current element's partner, which I presume could be at a random point in the iterable. Since there's no clear way that this should be handled, you'll need to provide some form of logic yourself to cope with this. In which case, I'd lean towards creating and populating a new collection during the iteration, and then assigning this to the myCollection variable at the end. If this isn't possible, then keeping track of the partner elements to remove and calling myCollection.removeAll would be the way to go.

You want to remove an item from a list and continue to iterate on the same list. Can you implement a two-step solution where in step 1 you collect the items to be removed in an interim collection and in step 2 remove them after identifying them?

Some thoughts (it depends on what exactly the relationship is between the two objects in the collection):
A Map with the object as the key and the partner as the value.
A CopyOnWriteArrayList, but you have to notice when you hit the partner
Make a copy into a different Collection object, and iterate over one, removing the other. If this original Collection can be a Set, that would certaily be helpful in removal.

You could try finding all the items to remove first and then remove them once you have finished processing the entire list. Skipping over the deleted items as you find them.
myCollection<BusinessObject>
List<BusinessObject> deletedObjects = new ArrayList(myCollection.size());
for (BusinessObject anObject : myCollection)
{
if (!deletedObjects.contains(anObject))
{
if (someConditionIsTrue)
{
deletedObjects.add(anObjectsPartner);
}
}
}
myCollection.removeAll(deletedObjects);

CopyOnWriteArrayList will do what you want.

Why not use a Collection of all the original BusinessObject and then a separate class (such as a Map) which associates them (ie creates partner)? Put these both as a composite elements in it's own class so that you can always remove the Partner when Business object is removed. Don't make it the responsibility of the caller every time they need to remove a BusinessObject from the Collection.
IE
class BusinessObjectCollection implements Collection<BusinessObject> {
Collection<BusinessObject> objects;
Map<BusinessObject, BusinessObject> associations;
public void remove(BusinessObject o) {
...
// remove from collection and dissasociate...
}
}

The best answer is the second, use an iterator.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Collection - Iterator.remove() vs Collection.remove() - java

From what I understand, the Collection.remove(int index) will also return the removed object. Iterative.remove() will not.

Related

Does Kotlin's version of Java's for each have the same limitations?

Need of Iterator class in Java?

for-each vs for vs while

Whats the replacement of For-Each loop for filtering?

How to safely remove other elements from a Collection while iterating through the Collection

Categories

Resources