I have a question regarding the iterator behavior in Java.
I have a call such as this:
myIterable.iterator().hasNext()
If this call returns true, can I be sure that the collection has at least two elements?
From the Java API specification, I could only find out that true means there is one more element to go, which can be reached by next(). But what happens if the pointer is at the very beginning? In other words, does hasNext() count the first element as well?
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/Iterator.html
It says "true if the iteration has more elements". But could "more elements" also mean the very first one?
[Edit]
How can I know whether the iterator has exactly two elements to iterate through? Of course, I can iterate and count, but in my case I can't go back, iterate twice, or clone the iterator, because this is a Hadoop iterator.
It can mean the first one. You can only be sure that there is at least one element.
hasNext() returning true or false lets you distinguish between zero items and one-or-more items (at the start of the iteration, that is).
You want to know whether it has two or more, without starting the iteration. Basically, you cannot. Exposing that information is more than an iterator has to do; its job is to make available the next and only the next item (and to tell you whether that item exists).
Possibly, the iterator itself doesn't even have that knowledge yet!
But of course you are free to remember the items you have already taken out of the iterator. Then you can use them later, once you know there are actually two or more items.
If you need to pass the iterator to other code, you could write your own class that implements Iterator, internally remembers the first two items as member variables and hands those out first, and then continues to iterate over the rest of the items in the original iterator (if there are any left). Your custom iterator therefore also needs to store a reference to the original iterator.
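The wrapper described above could look roughly like the following. This is only a sketch: the class name LookaheadIterator and its prefetched() method are made up for illustration, and it buffers references without copying them. If the underlying framework reuses the same object instance between calls to next() (which may be the case for Hadoop value iterators), the buffered items would need to be copied first.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;

// Hypothetical helper, not part of the JDK or Hadoop.
final class LookaheadIterator<T> implements Iterator<T> {
    private final Deque<T> buffer = new ArrayDeque<>(); // note: ArrayDeque rejects null items
    private final Iterator<T> source;
    private final int prefetched;

    LookaheadIterator(Iterator<T> source, int lookahead) {
        this.source = source;
        // Eagerly pull up to `lookahead` items out of the source.
        while (buffer.size() < lookahead && source.hasNext()) {
            buffer.addLast(source.next());
        }
        this.prefetched = buffer.size();
    }

    // How many items were found up front; less than `lookahead` means the source ran out.
    int prefetched() {
        return prefetched;
    }

    @Override
    public boolean hasNext() {
        return !buffer.isEmpty() || source.hasNext();
    }

    @Override
    public T next() {
        // Replay buffered items first, then fall through to the source.
        return buffer.isEmpty() ? source.next() : buffer.removeFirst();
    }

    public static void main(String[] args) {
        LookaheadIterator<String> it =
                new LookaheadIterator<>(Arrays.asList("a", "b").iterator(), 3);
        System.out.println("exactly two: " + (it.prefetched() == 2));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Asking for a lookahead of 3 is what makes "exactly two" decidable: if only two items could be prefetched, the source had nothing more to give, and all items are still available to the code that consumes the wrapper.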
Java iterators are positioned before the first element. What your expression myIterable.iterator().hasNext() shows is that there is at least one element.
hasNext() tells you there is another element accessible with next().
So it just means you have at least one element.
If myIterable.iterator().hasNext() returns true it means there is at least one element and you can use next() to access that.
How can I know whether the iterator has exactly two elements to iterate through? Of course, I can iterate and count, but I can't go back or iterate twice or clone the iterator in my case, this is a Hadoop iterable.
There is no way to do this. Maybe an iterator is wrong for your scenario.
No. If hasNext() returns true, the collection has at least one element, because the iterator starts out positioned before the first element.
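import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

List<Integer> l = new ArrayList<>();
l.add(1);
Iterator<Integer> it = l.iterator();
while (it.hasNext()) {
    System.out.println(it.next()); // prints 1
}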
The result will be 1, because the iterator starts out positioned before the first element. When we check it.hasNext() for the first time, it returns true because there is one element left. We then print that element using it.next(), which moves the iterator past the first position. When we check it.hasNext() a second time, it returns false.
Internally, the iterator of an array-backed list such as ArrayList has a field named cursor, which holds the index of the next element to return and starts at zero. Each call to next() increments it. hasNext() checks whether the cursor has reached the size of the list: if it has, it returns false; otherwise it returns true.
Lists and maps differ in the concrete implementation, but the principle is the same: an array-backed list compares the cursor with the size, while a hash map's iterator checks whether there is a next node.
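To make the cursor idea concrete, here is a deliberately simplified sketch of an index-based iterator over an array. The class name ArrayCursorIterator is invented for this example, and real JDK iterators do more (for instance, they also detect concurrent modification), so treat it as an illustration rather than the actual library code.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Simplified illustration only; not the JDK implementation.
final class ArrayCursorIterator<T> implements Iterator<T> {
    private final T[] items;
    private int cursor = 0; // index of the element the next call to next() will return

    ArrayCursorIterator(T[] items) {
        this.items = items;
    }

    @Override
    public boolean hasNext() {
        // There is something left as long as the cursor has not reached the end.
        return cursor != items.length;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return items[cursor++];
    }

    public static void main(String[] args) {
        Iterator<String> it = new ArrayCursorIterator<>(new String[] {"only"});
        System.out.println(it.hasNext()); // true: the first (and only) element is still ahead
        System.out.println(it.next());    // only
        System.out.println(it.hasNext()); // false
    }
}
```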
I have a scenario in which I am iterating over a set using an iterator. Now I want to remove the 1st element while my iterator is on the 2nd element. How can I do that? I know a Set is unordered and there is no such thing as a first or second element, but my point is that I want to remove an element other than the one most recently returned by Iterator.next().
I don't want to convert this set to a list and use a ListIterator.
I don't want to collect all the objects to be removed in another set and call removeAll.
I can't store it and remove it after the iteration.
Sample code:
Set<MyObject> mySet = new HashSet<MyObject>();
mySet.add(MyObject1);
mySet.add(MyObject2);
...
Iterator<MyObject> itr = mySet.iterator();
while (itr.hasNext())
{
    MyObject current = itr.next();
    // Now iterator is at second element and I want to remove first element
}
Given the constraints as you have stated them, I don't think that there is a solution to the problem.
The Iterator.remove() method will only remove the "current" element.
You have excluded "remembering" objects and removing them from the HashSet in a second pass / phase.
Schemes that involve using two iterators simultaneously and removing through one of them will result in a ConcurrentModificationException on the other.
The three approaches that you suggested (but then excluded) would all work. I think the 2nd one would be the most performant.
Another idea would be to implement a new hash table-based Set type which has an Iterator with an extra remove operation. (You could start with the source code of HashSet etcetera, rename the class and then modify it to do what you need.)
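The two-iterator problem mentioned above can be reproduced with a few lines. This is a small sketch with made-up names (TwoIteratorsDemo, secondIteratorFails); note that the javadoc describes fail-fast behavior as best effort, so the exception should be seen as a diagnostic rather than a guarantee.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

final class TwoIteratorsDemo {
    // Returns true if the second iterator fails after the first one removes an element.
    static boolean secondIteratorFails() {
        Set<String> set = new HashSet<>(Arrays.asList("a", "b", "c"));
        Iterator<String> first = set.iterator();
        Iterator<String> second = set.iterator();

        first.next();
        first.remove(); // structural modification made through `first`

        try {
            second.next(); // `second` detects that the set changed behind its back
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second iterator failed: " + secondIteratorFails());
    }
}
```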
Set.iterator() returns a java.util.Iterator. This iterator only provides methods to iterate forward and to remove the current element.
So if you don't want to convert your set, you cannot remove the previous element using only the Iterator.
What you can do, for example, is collect the elements you want to remove and, after you have iterated through the whole set, remove them all at once, e.g. with Set.removeAll(removableList):
List<MyObject> removableList = new ArrayList<>();
MyObject previous = null; // nothing has been visited yet
Iterator<MyObject> itr = mySet.iterator();
while (itr.hasNext()) {
    MyObject current = itr.next();
    // If you find you want to remove the previous element:
    if (previous != null && someCondition) {
        removableList.add(previous);
    }
    previous = current;
}
mySet.removeAll(removableList);
HashSet is unordered, and the javadoc clearly states that Iterator's remove method "Removes from the underlying collection the last element returned by this iterator (optional operation)". So the answer is no, not through an iterator. Since a HashSet contains unique elements, you can use Set.remove(Object) after retrieving the first element; in this case you don't even need to advance to the 2nd element.
HashSet<K> hs; // your HashSet containing unique elements
if (!hs.isEmpty()) {
    hs.remove(hs.iterator().next());
}
Just remember that HashSet is unordered and there is no such thing as a 1st or 2nd element.
Alternatively, you could use LinkedHashSet, which gives you a Set ordered by insertion order.
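As a rough illustration of the LinkedHashSet suggestion: because it iterates in insertion order, the first element returned by a fresh iterator is the one that was added first. The helper name removeFirstInserted below is hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

final class LinkedHashSetDemo {
    // Removes and returns the first element in iteration order, or null if the set is empty.
    // For a LinkedHashSet, that is the element that was inserted first.
    static <T> T removeFirstInserted(Set<T> set) {
        Iterator<T> it = set.iterator();
        if (!it.hasNext()) {
            return null;
        }
        T first = it.next();
        it.remove();
        return first;
    }

    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>();
        set.add("x");
        set.add("y");
        set.add("z");
        System.out.println(removeFirstInserted(set)); // x
        System.out.println(set);                      // [y, z]
    }
}
```

This still does not allow removing an earlier element in the middle of another ongoing iteration; it only makes "first" well defined.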
I have a java ArrayList to which I add 5 objects.
If I iterate over the list and print the objects, and then iterate over the list and print them again, will the retrieval order in these 2 cases be the same? (I know it may be different from the insertion order.)
Yes, assuming you haven't modified the list in-between. From http://docs.oracle.com/javase/6/docs/api/java/util/List.html:
iterator
Iterator<E> iterator()
Returns an iterator over the elements in this list in proper sequence.
A bit vague, perhaps, but in other portions of that page, this term is defined:
proper sequence (from first to last element)
(I know it may be different from the insertion order)
No it won't. The contract of List requires that the add order is the same as the iteration order, since add inserts at the end, and iterator produces an iterator that iterates from start to end in order.
Set doesn't require this, so you may be confusing the contract of Set and List regarding iteration order.
From the Javadoc:
Iterator<E> iterator()
Returns an iterator over the elements in this list in proper sequence.
It's in the specification of the List interface to preserve order.
It's the Set classes that don't preserve order.
If you're not mutating the list, then the iteration order will stay the same. Lists have a contractually specified ordering, and the iterator specification guarantees that it iterates over elements in that order.
Yes, an ArrayList guarantees iteration order over its elements - that is, they will come out in the same order you inserted them, provided that you don't make any insertions while iterating over the ArrayList.
Retrieval does not vary unless you change the iterator you are using. As long as you are using the same method for retrieval and have not changed the list itself then the items will be returned in the same order.
When you add an element to an ArrayList using add(E e), the element is appended to the end of the list. Consequently, if all you do is call the single-argument add method a number of times and then iterate, the iteration will be in exactly the same order as the calls to add.
The iteration order will be the same every time you iterate over the same unmodified list.
Also, assuming you add the elements using the add() method, the iteration order will be the same as the insertion order since this method appends elements to the end of the list.
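A quick way to convince yourself of this is to iterate twice and compare. The following is a small sketch; the class and method names (IterationOrderDemo, snapshot) are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class IterationOrderDemo {
    // Copies the elements in the order the iterator returns them.
    static List<Integer> snapshot(List<Integer> list) {
        List<Integer> seen = new ArrayList<>();
        for (Integer value : list) {
            seen.add(value);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        list.add(5);
        list.add(3);
        list.add(9);
        list.add(1);
        list.add(7);

        List<Integer> firstPass = snapshot(list);
        List<Integer> secondPass = snapshot(list);
        System.out.println(firstPass.equals(secondPass)); // true
        System.out.println(firstPass);                    // [5, 3, 9, 1, 7]
    }
}
```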
Yes, the retrieval order is guaranteed to be the same as long as the list is not mutated and you use the same iterator, but needing to rely on retrieval order suggests something fishy in the design. It is generally not a good idea to base business logic on a particular retrieval order.
Even Sets will return the same result, if you don't modify them (adding or removing items to them).
If I am using a for loop (the standard for loop, not an enhanced for statement), I fail to see how an iterator increases efficiency when searching through a collection. If I have a statement such as:
(Assuming that aList is a List of generic objects, type E, nextElement refers to the next element within the list)
for (int index = 0; index < aList.size(); index++){
E nextElement = aList.get(index);
// do something with nextElement...
}
and I have the get method that looks something like:
Node<E> nodeRef = head;
for (int i = 0; i < index; i++){
nodeRef = nodeRef.next;
// possible other code
}
this would essentially be searching through the List, one element at a time. However, if I use an iterator, will it not be doing the same operation? I know an iterator is supposed to be O(1) speed, but wouldn't it be O(n) if it has to search through the entire list anyway?
It's not primarily about efficiency, IMO. It's about abstraction. Using an index ties you to collections which can retrieve an item for a given index efficiently (so it won't work well with a linked list, say)... and it doesn't express what you're trying to do, which is iterate over the list.
With an iterator, you can express the idea of iterating over a sequence of items whether that sequence can easily be indexed or not, whether the size is known in advance or not, and even in cases where it's effectively infinite.
Your second case is still written using a for loop which increments an index, which isn't the idiomatic way of thinking about it - it should simply be testing whether or not it's reached the end. For example, it might be:
for (Node<E> nodeRef = head; nodeRef != null; nodeRef = nodeRef.next)
{
}
Now we have the right abstraction: the loop expresses where we start (the head), when we stop (when there are no more elements) and how we go from one element to the next (using the next field). This expresses the idea of iterating more effectively than "I've got a counter starting at 0, and I'm going to ask for the value at the particular counter on each iteration until the value of the counter is greater than some value which happens to be the length of the list."
We're fairly used to the latter way of expressing things, but it doesn't really say what we mean nearly as well as the iterator approach.
Iterators are not about increasing efficiency; they're about abstraction in the object-oriented sense. Implementation-wise, the iterator is doing something similar to what you're doing, going through your collection one element at a time, at least if the collection is index-based. It's supposed to be O(1) when retrieving the next element, not the entire list. Iterators also help mask what collection is underneath: it could be a linked list or a set, etc., but you don't have to know.
Also, notice how connected your for loop is to your specific logic that you want to do on each element, while with an iterator you can abstract out the looping logic from whatever action you want to do.
I think the question you are asking refers to the efficiency of iterators vs. a for-loop using an explicit get on the collection object.
If you write code with a naive version of get, and you iterate through your list using it, then it takes you
one step to "get" the first element
two steps to "get" the second
three steps to get the third
...
n steps to get the last
for a total of n(n+1)/2 operations, which is O(n^2).
But if you used an iterator which internally kept track of the next element (i.e. one step to advance), then iterating the whole list is O(n), a big improvement.
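The difference can be made visible by counting node visits on a hand-rolled singly linked list. This is only a sketch under a simple cost model (one step per node visited, including the head); the class LinkedTraversalCost and its methods are made up for illustration and are not how java.util.LinkedList is implemented.

```java
final class LinkedTraversalCost {
    private static final class Node {
        Node next;
    }

    // Builds a singly linked list with n nodes and returns its head.
    static Node build(int n) {
        Node head = null;
        for (int i = 0; i < n; i++) {
            Node node = new Node();
            node.next = head;
            head = node;
        }
        return head;
    }

    // Indexed access: every get(index) restarts from the head.
    static long stepsWithIndexedGet(Node head, int n) {
        long steps = 0;
        for (int index = 0; index < n; index++) {
            Node ref = head;
            steps++; // visiting the head
            for (int i = 0; i < index; i++) {
                ref = ref.next;
                steps++; // one more node visited
            }
        }
        return steps;
    }

    // Iterator-style traversal: keep a reference and advance it once per element.
    static long stepsWithCursor(Node head) {
        long steps = 0;
        for (Node ref = head; ref != null; ref = ref.next) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        int n = 100;
        Node head = build(n);
        System.out.println(stepsWithIndexedGet(head, n)); // 5050, i.e. n(n+1)/2
        System.out.println(stepsWithCursor(head));        // 100, i.e. n
    }
}
```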
Like Jon said, iterators have nothing to do with efficiency; they just abstract the concept of being able to iterate over a collection. So you are right: if you are just searching through a list, there is no real benefit to an iterator over a for loop, but in some cases iterators provide convenient ways of doing things that would be difficult with a simple for loop. For example:
while (itr.hasNext()) {
    if (itr.next().equals(somethingBad)) {
        itr.remove();
    }
}
In other cases, iterators provide a way to traverse the elements of a collection that you cannot fetch by index (e.g. a HashSet). In this case an indexed for loop is not an option.
Remember that it's also a Design Pattern.
"The Iterator Pattern allows traversal of the elements of an aggregate without exposing the underlying implementation. It also places the task of traversal on the iterator object, not on the aggregate, which simplifies the aggregate interface and implementation, and places the responsibility where it should be." (From: Head First Design Pattern)
It's about encapsulation and also the 'single responsibility' principle.
Cheers,
Wim
You are using a linked list here. Iterating over that list without an iterator takes O(n^2) steps, where n is the size of the list. O(n) for iterating over the list and O(n) each time for finding the next element.
The iterator, on the other hand, remembers the node it has visited the last time, and therefore needs only O(1) to find the next element. So eventually the complexity is O(n), which is faster.
I would like to iterate over a Set and remove the elements from the set that match some condition. The documentation of iterator says nothing about modifying the list while iterating over it.
Is this possible? If not, what would be the best way to do it? Note that I only want to remove elements from the set that are provided by the Iterator.
Edit: It was quickly shown that this is possible. Can I also do it with the following syntax?
for(Node n : mySet) {
mySet.remove(n);
}
Yes, you can use the iterator to remove the current element safely:
iterator.remove();
For comparison, the javadoc of Set.remove(Object), which removes directly from the set rather than through the iterator, says:
Removes the specified element from this set if it is present (optional operation). More formally, removes an element e such that (o==null ? e==null : o.equals(e)), if this set contains such an element. Returns true if this set contained the element (or equivalently, if this set changed as a result of the call). (This set will not contain the element once the call returns.)
Answer to your next question: No, you can't. Modifying a set while iterating over it with an enhanced for loop will cause a ConcurrentModificationException.
The answer of tangens is correct. If you don't use iterator.remove() but remove directly from the Set, you will get a ConcurrentModificationException.
This has actually improved in Java 8. Now you can just
mySet.removeIf(element -> someConditionMatches());
The above is implemented as a default method in java.util.Collection and should save everyone from writing boring loops. That said, it should work for any type of collection, and not just Set.
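For a self-contained illustration of removeIf, here is a minimal sketch that drops the even numbers from a set. The predicate and the names RemoveIfDemo and withoutEvens are just examples; removeIf itself is the default method on java.util.Collection (Java 8 and later).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class RemoveIfDemo {
    // Returns a copy of the input with all even numbers removed.
    static Set<Integer> withoutEvens(Set<Integer> input) {
        Set<Integer> copy = new HashSet<>(input);
        copy.removeIf(n -> n % 2 == 0); // removes every element matching the predicate
        return copy;
    }

    public static void main(String[] args) {
        Set<Integer> numbers = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(withoutEvens(numbers)); // contains 1, 3 and 5
    }
}
```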
This is what .remove() does:
http://download.oracle.com/javase/6/docs/api/java/util/Iterator.html#remove%28%29
"Removes from the underlying collection the last element returned by the iterator (optional operation). This method can be called only once per call to next. The behavior of an iterator is unspecified if the underlying collection is modified while the iteration is in progress in any way other than by calling this method. "
for (Node n : mySet) {
    mySet.remove(n);
}
won't work, since you are modifying the set while iterating over it. Removal during iteration is only safe through the iterator itself, which this loop does not use.
This is one disadvantage of using enhanced for loops.
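The following sketch contrasts the two approaches on a small HashSet. The names are invented for the example, and since fail-fast detection is documented as best effort, the exception in the first method is what typical JDKs do in this situation rather than a contractual guarantee.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

final class ForEachRemovalDemo {
    // Removing through the set inside an enhanced for loop.
    static boolean forEachRemovalFails() {
        Set<String> set = new HashSet<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : set) {
                set.remove(s); // bypasses the loop's hidden iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator itself is supported.
    static int sizeAfterIteratorRemoval() {
        Set<String> set = new HashSet<>(Arrays.asList("a", "b", "c"));
        for (Iterator<String> it = set.iterator(); it.hasNext(); ) {
            it.next();
            it.remove();
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("for-each removal failed: " + forEachRemovalFails());
        System.out.println("size after iterator removal: " + sizeAfterIteratorRemoval());
    }
}
```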
Though the for-each loop has many advantages, the problem is that it doesn't work when you want to filter a List (filtering meaning removing elements from the List). Can you please suggest a replacement, since even traversing by index is not a good option?
What do you mean by "filtering"? Removing certain elements from a list? If so, you can use an iterator:
for(Iterator<MyElement> it = list.iterator(); it.hasNext(); ) {
MyElement element = it.next();
if (some condition) {
it.remove();
}
}
Update (based on comments):
Consider the following example to illustrate how iterator works. Let's say we have a list that contains 'A's and 'B's:
A A B B A
We want to remove all those pesky Bs. So, using the above loop, the code will work as follows:
1. hasNext()? Yes. next(). element points to 1st A.
2. hasNext()? Yes. next(). element points to 2nd A.
3. hasNext()? Yes. next(). element points to 1st B. remove(). iterator counter does NOT change, it still points to a place where B was (technically that's not entirely correct but logically that's how it works). If you were to call remove() again now, you'd get an exception (because list element is no longer there).
4. hasNext()? Yes. next(). element points to 2nd B. The rest is the same as #3
5. hasNext()? Yes. next(). element points to 3rd A.
6. hasNext()? No, we're done. List now has 3 elements.
Update #2: the remove() operation is indeed optional on an iterator, but only because it is optional on the underlying collection. The bottom line here is: if your collection supports it (and the standard modifiable collections in the Java Collections Framework do), so will the iterator. If your collection doesn't support it, you're out of luck anyway.
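To see what "optional" means in practice, here is a small sketch that probes whether a list's iterator supports remove(). It uses Collections.unmodifiableList as an example of a collection without removal support; the helper name removeIsUnsupported is made up, and it assumes a non-empty list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

final class OptionalRemoveDemo {
    // Returns true if the iterator refuses to remove. Assumes the list is not empty.
    // Note: on a modifiable list this really does remove the first element.
    static boolean removeIsUnsupported(List<String> list) {
        Iterator<String> it = list.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> readOnly = Collections.unmodifiableList(Arrays.asList("a", "b"));
        List<String> modifiable = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(removeIsUnsupported(readOnly));   // true
        System.out.println(removeIsUnsupported(modifiable)); // false
    }
}
```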
ChssPly76's answer is the right approach here - but I'm intrigued as to your thinking behind "traversing through index is not a good option". In many cases - the common case in particular being that of an ArrayList - it's extremely efficient. (In fact, in the arraylist case, I believe that repeated calls to get(i++) are marginally faster than using an Iterator, though nowhere near enough to sacrifice readability).
Broadly speaking, if the object in question implements java.util.RandomAccess, then accessing sequential elements via an index should be roughly the same speed as using an Iterator. If it doesn't (e.g. LinkedList would be a good counterexample) then you're right; but don't dismiss the option out of hand.
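One way to act on this, sketched below with invented names (TraversalChoice, sum), is to check for the RandomAccess marker interface and fall back to an iterator otherwise. Whether the indexed branch is actually faster for a given list is something to measure rather than assume.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

final class TraversalChoice {
    // Sums the list, using indexed access only when the list advertises cheap random access.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);
            }
        } else {
            // e.g. LinkedList: get(i) would walk the list each time, so use the iterator.
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new ArrayList<>(Arrays.asList(1, 2, 3))));  // 6
        System.out.println(sum(new LinkedList<>(Arrays.asList(1, 2, 3)))); // 6
    }
}
```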
I have had success using the
filter(java.util.Collection collection, Predicate predicate)
method of CollectionUtils in commons collections.
http://commons.apache.org/collections/api-2.1.1/org/apache/commons/collections/CollectionUtils.html#filter(java.util.Collection,%20org.apache.commons.collections.Predicate)
If you, like me, don't like modifying a collection while iterating through its elements, or if the iterator just doesn't provide an implementation of remove, you can use a temporary collection to collect the elements you want to delete. Yes, it's less efficient than removing through the iterator, but to me it makes what's happening clearer:
List<Object> data = getListFromSomewhere();
List<Object> filter = new ArrayList<Object>();
// create Filter
for (Object item: data) {
if (throwAway(item)) {
filter.add(item);
}
}
// use Filter
for (Object item:filter) {
data.remove(item);
}
filter.clear();
filter = null;