Java Set iterator, safe for removal of elements? - java

I would like to iterate over a Set and remove the elements from the set that match some condition. The documentation of iterator says nothing about modifying the set while iterating over it.
Is this possible? If not, what would be the best way to do it? Note that I only want to remove elements from the set that are provided by the Iterator.
Edit: It was quickly shown that this is possible. Can I also do it with the following syntax?
for (Node n : mySet) {
    mySet.remove(n);
}

Yes, you can use the iterator to remove the current element safely:
iterator.remove();
The javadoc of Iterator.remove() says:
Removes from the underlying collection the last element returned by this iterator (optional operation). This method can be called only once per call to next().
Answer to your next question: No, you can't. Removing from a set directly while iterating over it with an enhanced for loop will typically cause a ConcurrentModificationException.

The answer of tangens is correct. If you don't use iterator.remove() but remove directly from the Set, you will receive an exception called ConcurrentModificationException.
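To make the difference concrete, here is a minimal sketch contrasting the two approaches on a HashSet of integers. The class and method names (SetRemovalDemo, removeEvensWithIterator, forEachRemovalThrows) are made up for illustration and are not part of any API.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class SetRemovalDemo {

    // Removes even numbers through the iterator, which is the supported way.
    static Set<Integer> removeEvensWithIterator(Set<Integer> source) {
        Set<Integer> set = new HashSet<>(source); // work on a copy
        Iterator<Integer> it = set.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove(); // removes the element just returned by next()
            }
        }
        return set;
    }

    // Calls Set.remove() inside a for-each loop and reports whether a
    // ConcurrentModificationException was thrown.
    static boolean forEachRemovalThrows(Set<Integer> source) {
        Set<Integer> set = new HashSet<>(source); // work on a copy
        try {
            for (Integer n : set) {
                set.remove(n); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<Integer> numbers = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        System.out.println("after iterator removal: " + removeEvensWithIterator(numbers));
        System.out.println("for-each removal threw: " + forEachRemovalThrows(numbers));
    }
}
```

Note that the exception is thrown on the next call to next() after the modification, so a set with a single element can slip through without any exception.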

This has actually improved in Java 8. Now you can just
mySet.removeIf(element -> someConditionMatches());
The above is implemented as a default method in java.util.Collection and should save everyone from writing boring loops. That said, it should work for any type of collection, and not just Set.
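As a small sketch of how removeIf might be used in practice: the predicate below (drop strings shorter than a minimum length) and the names RemoveIfDemo and dropShort are assumptions chosen for illustration. removeIf returns true if any element was removed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class RemoveIfDemo {

    // Removes every string shorter than minLength.
    // Returns true if the set changed as a result.
    static boolean dropShort(Set<String> words, int minLength) {
        return words.removeIf(w -> w.length() < minLength);
    }

    public static void main(String[] args) {
        Set<String> words = new HashSet<>(Arrays.asList("a", "bb", "ccc", "dddd"));
        boolean changed = dropShort(words, 3);
        System.out.println("changed: " + changed + ", remaining: " + words);
    }
}
```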

This is what .remove() does:
http://download.oracle.com/javase/6/docs/api/java/util/Iterator.html#remove%28%29
"Removes from the underlying collection the last element returned by the iterator (optional operation). This method can be called only once per call to next. The behavior of an iterator is unspecified if the underlying collection is modified while the iteration is in progress in any way other than by calling this method. "

for (Node n : mySet) {
    mySet.remove(n);
}
won't work, since you are modifying the set you are iterating over. Removal during iteration can only be done through the iterator, and the enhanced for loop gives you no access to it.
This is one disadvantage of using enhanced for loops.

Related

How to remove an element from set using Iterator?

I have a scenario in which I am iterating over a set using an iterator. Now I want to remove the 1st element while my iterator is on the 2nd element. How can I do it? I know a Set is unordered and there is no such thing as a first or second element, but my question is: I want to remove an element which is not the one currently returned by Iterator.next().
I don't want to convert this set to a list and use a ListIterator.
I don't want to collect all objects to be removed in another set and call removeAll.
I can't store it and remove it after the iteration.
Sample code:
Set<MyObject> mySet = new HashSet<MyObject>();
mySet.add(MyObject1);
mySet.add(MyObject2);
...
Iterator<MyObject> itr = mySet.iterator();
while (itr.hasNext())
{
    // Now iterator is at second element and I want to remove first element
}
Given the constraints as you have stated them, I don't think that there is solution to the problem.
The Iterator.remove() method will only remove the "current" element.
You have excluded "remembering" objects and removing them from the HashSet in a second pass / phase.
Schemes that involve using two iterators simultaneously and removing using one of them will result in ConcurrentModificationExceptions on the second one.
The three approaches that you suggested (but then excluded) would all work. I think the 2nd one would be the most performant.
Another idea would be to implement a new hash table-based Set type which has an Iterator with an extra remove operation. (You could start with the source code of HashSet etcetera, rename the class and then modify it to do what you need.)
Set.iterator() returns a java.util.Iterator. This iterator only provides methods to remove the current element and to iterate forward.
So if you don't want to convert your set, you cannot remove the previous element using only the Iterator.
What you can do, for example, is collect the elements you want to remove and, after you have iterated through the whole set, remove the collected elements, e.g. with Set.removeAll(removableCollection):
List<MyObject> removableList = new ArrayList<>();
MyObject previous = null; // there is no previous element before the first next()
Iterator<MyObject> itr = mySet.iterator();
while (itr.hasNext()) {
    MyObject current = itr.next();
    // If you find you want to remove the previous element:
    if (previous != null && someCondition) {
        removableList.add(previous);
    }
    previous = current;
}
// Remove the collected elements once the iteration has finished
mySet.removeAll(removableList);
HashSet is unordered, and the javadoc clearly states that Iterator's remove method "Removes from the underlying collection the last element returned by this iterator (optional operation)". So the answer is no, not through an iterator. Since a HashSet contains unique elements, you can call Set.remove(Object) after retrieving the first element; in this case you don't even need to advance to the 2nd element:
HashSet<K> hs; // your HashSet containing unique elements
if (!hs.isEmpty())
{
    hs.remove(hs.iterator().next());
}
Just remember HashSet is unordered and there is no such thing as 1st or 2nd element
Alternatively, you could use a LinkedHashSet, which gives you a Set whose iteration order follows insertion order.
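A minimal sketch of that last idea, assuming a LinkedHashSet so that "first" has a defined meaning (the first-inserted element). The helper name removeFirst and the class FirstInsertedDemo are hypothetical. It uses a fresh iterator, so it is meant to be called outside of any other ongoing iteration over the same set.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;

class FirstInsertedDemo {

    // Removes and returns the first-inserted element, or null if the set is empty.
    static <T> T removeFirst(LinkedHashSet<T> set) {
        Iterator<T> it = set.iterator();
        if (!it.hasNext()) {
            return null;
        }
        T first = it.next(); // LinkedHashSet iterates in insertion order
        it.remove();         // removes the element just returned
        return first;
    }

    public static void main(String[] args) {
        LinkedHashSet<String> set = new LinkedHashSet<>(Arrays.asList("x", "y", "z"));
        System.out.println("removed: " + removeFirst(set) + ", remaining: " + set);
    }
}
```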

Collection - Iterator.remove() vs Collection.remove()

As per Sun ,
"Iterator.remove is the only safe way to modify a collection during
iteration; the behavior is unspecified if the underlying collection is
modified in any other way while the iteration is in progress."
I have two questions:
What makes the operation Iterator.remove() more stable than the others?
Why did they provide a Collection.remove() method if it will not be useful in most use cases?
First of all, Collection.remove() is very useful. It is applicable in a lot of use cases, probably more so than Iterator.remove().
However, the latter solves one specific problem: it allows you to modify the collection while iterating over it.
The problem solved by Iterator.remove() is illustrated below:
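List<Integer> l = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4));
for (int el : l) {
    if (el < 3) {
        l.remove(el);
    }
}
This code is invalid since l.remove() is called during iteration over l; it fails at runtime with a ConcurrentModificationException. (Note also that because el is an int, l.remove(el) resolves to remove(int index) and removes by position, not by value.)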
The following is the correct way to write it:
Iterator<Integer> it = l.iterator();
while (it.hasNext()) {
    int el = it.next();
    if (el < 3) {
        it.remove();
    }
}
If you're iterating over a collection and use:
Collection.remove()
you can get runtime errors (specifically ConcurrentModificationException) because you're changing the state of the object used previously to construct the explicit series of calls necessary to complete the loop.
If you use:
Iterator.remove()
you tell the runtime that you would like to change the underlying collection AND re-evaluate the explicit series of calls necessary to complete the loop.
As the documentation you quoted clearly states,
Iterator.remove is the only safe way to modify a collection during iteration
(emphasis added)
While using an iterator, you cannot modify the collection, except by calling Iterator.remove().
If you aren't iterating the collection, you would use Collection.remove().
What makes the operation Iterator.remove() more stable than the others?
It means that the iterator knows you removed the element, so it won't throw a ConcurrentModificationException.
Why did they provide a Collection.remove() method if it will not be useful in most use cases?
Usually you would use Map.remove() or Collection.remove(), as this can be much more efficient than iterating over every object. If you often remove while iterating, I suspect you should be using a different collection.
It's just a design choice. It would have been possible to specify a different behavior (e.g. the iterator has to skip values that were removed by Collection.remove()), but that would have made the implementation of the collections framework much more complex. Hence the choice to leave it unspecified.
It's quite useful. If you know the object you want to remove, why iterate?
From what I understand, List.remove(int index) also returns the removed object (Collection.remove(Object) returns a boolean), whereas Iterator.remove() returns nothing.
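The different return types can be seen side by side in a short sketch. The wrapper methods removeAt and removeValue are hypothetical names used only to make the two overloads explicit; the underlying calls are the standard List.remove(int) and Collection.remove(Object).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class RemoveOverloadsDemo {

    // List.remove(int index): returns the element that was removed.
    static Integer removeAt(List<Integer> list, int index) {
        return list.remove(index);
    }

    // Collection.remove(Object): returns true if the element was present.
    static boolean removeValue(List<Integer> list, int value) {
        return list.remove(Integer.valueOf(value)); // boxing selects remove(Object)
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        System.out.println("removeAt(0) returned " + removeAt(list, 0));
        System.out.println("removeValue(30) returned " + removeValue(list, 30));

        Iterator<Integer> it = list.iterator();
        it.next();
        it.remove(); // Iterator.remove() is void: nothing is returned
        System.out.println("list is now " + list);
    }
}
```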

Removing elements from ListBuffer

According to this posting, it is said that ListBuffer allows constant-time removal of the first and last elements. I've been looking into the API reference and the ListBuffer source code, but I can't find how I remove the last element in constant time while remove(0) will do the job for the first element. What would be the proper way to remove the last element?
Another question: is it possible to remove an element efficiently while iterating over a ListBuffer? In Java it can be done with Iterator.remove() but the Scala iterator doesn't seem to have the remove() method...
The first question has an easy if disappointing answer: you can't remove the last element in constant time, as doing so would require a reference to the element-before-last. (It's a singly linked list, inside a wrapper class that holds the beginning and end elements of the list.)
The second question is equally easy and perhaps disappointing: Iterators in Scala are simply views of the collection. They don't modify the underlying collection. (This is in keeping with the "immutable by default, mutable only when necessary" philosophy.)
You can remove the last element with trimEnd(1)

Iterator behavior in Java

I have a question regarding the iterator behavior in Java.
I have a call such as this:
myIterable.iterator().hasNext()
If this call returns true, can I be sure that the collection has at least two elements?
From the Java API specification, I could only find out that true means there is at least one more element to go, which can be reached by next(). But what happens if the pointer is at the very beginning (that is, does hasNext() treat the first element separately)?
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/Iterator.html
It says hasNext() returns true if the iteration has more elements. But could "more elements" also mean the very first one?
[Edit]
How can I know whether the iterator has exactly two elements to iterate through? Of course, I can iterate and count, but I can't go back, iterate twice, or clone the iterator in my case; this is a Hadoop iterator.
It can mean the first one. You can only be sure that it has more than zero elements.
hasNext() returning true or false makes you able to discern between zero items and one-or-more (at the start of the iteration that is).
You want to know whether it has two-or-more, without starting the iteration. I think, basically, you can not. Exposing that information is more than an iterator has to do: make available the next and only the next item (and information about whether this item exists).
Possibly, the iterator itself doesn't even have that knowledge yet!
But of course you are free to memorize the items you already took out of the list. Then you could use them later, once you know there's actually two or more items in the list.
If you need to pass the iterator to other code, you could write your own class that implements iterator, internally remembers the first two items as member variables and hands out those first, then continues to iterate over the rest of the items in the original iterator (if there's any more left) - a reference to the original iterator therefore also needs to be stored in your custom made iterator
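A rough sketch of such a wrapper might look like the following. The class name LookaheadIterator and the method hasAtLeastTwo are invented for illustration. It buffers up to two leading elements when constructed, answers the "two or more" question from the buffer, and then replays the buffered elements before delegating to the original iterator. If the source iterator reuses the same object instance for every element (which is often said of Hadoop value iterators), the buffered elements would need to be copied first; that is not handled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: wraps an iterator and pre-reads up to two elements.
class LookaheadIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final List<T> buffered = new ArrayList<>(2);
    private int position = 0; // index of the next buffered element to hand out

    LookaheadIterator(Iterator<T> source) {
        this.source = source;
        while (buffered.size() < 2 && source.hasNext()) {
            buffered.add(source.next());
        }
    }

    // True if the source held at least two elements when it was wrapped.
    boolean hasAtLeastTwo() {
        return buffered.size() == 2;
    }

    @Override
    public boolean hasNext() {
        return position < buffered.size() || source.hasNext();
    }

    @Override
    public T next() {
        if (position < buffered.size()) {
            return buffered.get(position++); // replay buffered elements first
        }
        return source.next(); // then continue with the original iterator
    }

    public static void main(String[] args) {
        LookaheadIterator<String> it =
                new LookaheadIterator<>(Arrays.asList("a", "b", "c").iterator());
        System.out.println("at least two: " + it.hasAtLeastTwo());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

To answer "exactly two" rather than "at least two", the same idea would need to buffer a third element as well.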
Java iterators are positioned before the first element. What your expression myIterable.iterator().hasNext() shows is that there is at least one element.
hasNext() tells you there is another element accessible with next().
So it just means you have at least one element.
If myIterable.iterator().hasNext() returns true, it means there is at least one element and you can use next() to access it.
How can I know whether the iterator has exactly two elements to iterate through? Of course, I can iterate and count, but I can't go back or iterate twice or clone the iterator in my case, this is hadoop iterable.
There is no way to do this. Maybe an iterator is wrong for your scenario.
No. If hasNext() returns true, it means that the collection has at least one element, because the iterator starts positioned before the first element.
List<Integer> l = new ArrayList<>();
l.add(1);
Iterator<Integer> it = l.iterator();
while (it.hasNext()) {
    System.out.println(it.next());
}
The result will be 1, because the iterator starts positioned before the first element. The first call to it.hasNext() returns true because there is one element; it.next() then returns that element, which is printed, and moves the iterator past it. The second call to it.hasNext() returns false.
Internally, an iterator typically tracks its position in a field. In ArrayList's iterator, for example, a field named cursor starts at 0 and is incremented by each call to next(); hasNext() returns true as long as cursor is not equal to the list's size.
List and map iterators differ in the concrete implementation but follow the same principle: a list iterator checks its position against the size, while a map iterator checks whether there is a next node.

Whats the replacement of For-Each loop for filtering?

Though the for-each loop has many advantages, the problem is that it doesn't work when you want to filter a List (filtering means removing elements from the List). Can you please suggest a replacement? Even traversing by index is not a good option.
What do you mean by "filtering"? Removing certain elements from a list? If so, you can use an iterator:
for (Iterator<MyElement> it = list.iterator(); it.hasNext(); ) {
    MyElement element = it.next();
    if (some condition) {
        it.remove();
    }
}
Update (based on comments):
Consider the following example to illustrate how iterator works. Let's say we have a list that contains 'A's and 'B's:
A A B B A
We want to remove all those pesky Bs. So, using the above loop, the code will work as follows:
1. hasNext()? Yes. next(). element points to 1st A.
2. hasNext()? Yes. next(). element points to 2nd A.
3. hasNext()? Yes. next(). element points to 1st B. remove(). The iterator counter does NOT change; it still points to the place where B was (technically that's not entirely correct, but logically that's how it works). If you were to call remove() again now, you'd get an exception (because the list element is no longer there).
4. hasNext()? Yes. next(). element points to 2nd B. The rest is the same as #3.
5. hasNext()? Yes. next(). element points to 3rd A.
6. hasNext()? No, we're done. List now has 3 elements.
Update #2: The remove() operation is indeed optional on an iterator, but only because it is optional on the underlying collection. The bottom line here is: if your collection supports it (and the standard modifiable collections in the Java Collections Framework do), so will the iterator. If your collection doesn't support it, you're out of luck anyway.
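Both failure modes mentioned above can be observed with a small sketch. The class OptionalRemoveDemo and its helper methods are hypothetical; the unmodifiable wrapper from Collections.unmodifiableList is used here as one example of a collection that does not support removal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class OptionalRemoveDemo {

    // Tries to remove the first element through the iterator.
    // Returns false if the collection does not support removal.
    // Assumes the collection is not empty.
    static boolean iteratorRemoveSupported(Collection<Integer> c) {
        Iterator<Integer> it = c.iterator();
        it.next();
        try {
            it.remove();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    // Calls remove() twice after a single next() and reports whether the
    // second call was rejected with an IllegalStateException.
    static boolean secondRemoveRejected(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        it.next();
        it.remove();
        try {
            it.remove(); // no next() in between
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> modifiable = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> readOnly = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(1, 2, 3)));
        System.out.println("ArrayList supports it: " + iteratorRemoveSupported(modifiable));
        System.out.println("unmodifiable list supports it: " + iteratorRemoveSupported(readOnly));
        System.out.println("double remove rejected: " + secondRemoveRejected(modifiable));
    }
}
```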
ChssPly76's answer is the right approach here, but I'm intrigued as to your thinking behind "traversing through index is not a good option". In many cases (the common case in particular being that of an ArrayList) it's extremely efficient. (In fact, in the ArrayList case, I believe that repeated calls to get(i++) are marginally faster than using an Iterator, though nowhere near enough to sacrifice readability.)
Broadly speaking, if the object in question implements java.util.RandomAccess, then accessing sequential elements via an index should be roughly the same speed as using an Iterator. If it doesn't (e.g. LinkedList would be a good counterexample) then you're right; but don't dismiss the option out of hand.
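For completeness, index-based filtering could be sketched as below. The predicate (drop negative numbers) and the names IndexFilterDemo and removeNegatives are assumptions for the example. Walking backwards is one common way to avoid skipping elements, since a removal only shifts elements that have already been visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexFilterDemo {

    // Removes negative numbers in place by walking the indices backwards.
    static void removeNegatives(List<Integer> list) {
        for (int i = list.size() - 1; i >= 0; i--) {
            if (list.get(i) < 0) {
                list.remove(i); // i is an int, so this is List.remove(int index)
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, -2, -3, 4, -5));
        removeNegatives(numbers);
        System.out.println(numbers);
    }
}
```

This suits RandomAccess lists such as ArrayList; on a LinkedList each get(i) has to walk the list, so the iterator-based loop is the better fit there.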
I have had success using the
filter(java.util.Collection collection, Predicate predicate)
method of CollectionUtils in commons collections.
http://commons.apache.org/collections/api-2.1.1/org/apache/commons/collections/CollectionUtils.html#filter(java.util.Collection,%20org.apache.commons.collections.Predicate)
If you, like me, don't like modifying a collection while iterating through its elements, or if the iterator just doesn't provide an implementation for remove, you can use a temporary collection to collect the elements you want to delete. Yes, it's less efficient than removing through the iterator, but to me it's clearer what's happening:
List<Object> data = getListFromSomewhere();
List<Object> filter = new ArrayList<Object>();
// create Filter
for (Object item : data) {
    if (throwAway(item)) {
        filter.add(item);
    }
}
// use Filter
for (Object item : filter) {
    data.remove(item);
}
filter.clear();
filter = null;
