for-each vs for vs while - java

I wonder what the best way is to implement a "for-each" loop over an ArrayList or any kind of List.
Which of the following implementations is the best, and why? Or is there a best way?
Thank you for your help.
List<String> values = new ArrayList<>();
values.add("one");
values.add("two");
values.add("three");
...
//#0
for(String value : values) {
...
}
//#1
for(int i = 0; i < values.size(); i++) {
String value = values.get(i);
...
}
//#2
for(Iterator<String> it = values.iterator(); it.hasNext(); ) {
String value = it.next();
...
}
//#3
Iterator<String> it = values.iterator();
while (it.hasNext()) {
String value = it.next();
...
}

#3 has a disadvantage because the scope of the iterator it extends beyond the end of the loop. The other solutions don't have this problem.
#2 is exactly the same as #0, except #0 is more readable and less prone to error.
#1 is (probably) less efficient because it calls .size() every time through the loop.
#0 is usually best because:
it is the shortest
it is least prone to error
it is idiomatic and easy for other people to read at a glance
it is efficiently implemented by the compiler
it does not pollute your method scope (outside the loop) with unnecessary names

The short answer is to use version 0. Take a peek at the section titled Use Enhanced For Loop Syntax in Android's documentation for Designing for Performance. That page has a bunch of goodies and is very clear and concise.

#0 is the easiest to read, in my opinion, but #2 and #3 will work just as well. There should be no performance difference between those three.
In almost no circumstances should you use #1. You state in your question that you might want to iterate over "every kind of List". If you happen to be iterating over a LinkedList then #1 will be n^2 complexity: not good. Even if you are absolutely sure that you are using a list that supports efficient random access (e.g. ArrayList) there's usually no reason to use #1 over any of the others.
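To make the LinkedList point concrete, here is a minimal sketch (the class and method names are made up for illustration). Both methods return the same sum, but on a LinkedList the index-based version has to walk the chain on every get(i), while the for-each version advances node by node through the list's iterator.

```java
import java.util.LinkedList;
import java.util.List;

final class ListTraversalSketch {
    // Style #1: on a LinkedList each get(i) walks from one end of the list,
    // so the loop as a whole is O(n^2).
    static long sumByIndex(List<Integer> values) {
        long sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += values.get(i);
        }
        return sum;
    }

    // Style #0: uses the list's own iterator, so the loop is O(n)
    // for both ArrayList and LinkedList.
    static long sumByForEach(List<Integer> values) {
        long sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = new LinkedList<>();
        for (int i = 1; i <= 1000; i++) {
            values.add(i);
        }
        // Same result, different cost.
        System.out.println(sumByIndex(values));
        System.out.println(sumByForEach(values));
    }
}
```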

In response to this comment from the OP.
However, #1 is required when updating (if not just mutating the current item or building the results as a new list) and comes with the index. Since the List<> is an ArrayList<> in this case, the get() (and size()) is O(1), but that isn't the same for all List-contract types.
Let's look at these issues:
It is certainly true that get(int) is not O(1) for all implementations of the List contract. However, AFAIK, size() is O(1) for all List implementations in java.util. But you are correct that #1 is suboptimal for many List implementations. Indeed, for lists like LinkedList where get(int) is O(N), the #1 approach results in a O(N^2) list iteration.
In the ArrayList case, it is a simple matter to manually hoist the call to size(), assigning it to a (final) local variable. With this optimization, the #1 code is significantly faster than the other cases ... for ArrayLists.
Your point about changing the list while iterating the elements raises a number of issues:
If you do this with a solution that explicitly or implicitly uses iterators, then depending on the list class you may get ConcurrentModificationExceptions. If you use one of the concurrent collection classes, you won't get the exception, but the javadocs state that the iterator won't necessarily return all of the list elements.
If you do this using the #1 code (as is) then, you have a problem. If the modification is performed by the same thread, you need to adjust the index variable to avoid missing entries, or returning them twice. Even if you get everything right, a list entry concurrently inserted before the current position won't show up.
If the modification in the #1 case is performed by a different thread, it is hard to synchronize properly. The core problem is that get(int) and size() are separate operations. Even if they are individually synchronized, there is nothing to stop the other thread from modifying the list between a size and get call.
In short, iterating a list that is being concurrently modified is tricky, and should be avoided ... unless you really know what you are doing.
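For the single-threaded case, the index adjustment mentioned above might look like the following sketch. The class and method names are hypothetical, and it works on a copy so the caller's list is left alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IndexedRemovalSketch {
    // Removes every element equal to target using the index-based loop (#1).
    static List<String> removeAll(List<String> source, String target) {
        List<String> values = new ArrayList<>(source);
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i).equals(target)) {
                values.remove(i);
                // The following element has slid into slot i;
                // step back so the next pass examines it.
                i--;
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(removeAll(Arrays.asList("a", "x", "x", "b"), "x"));
    }
}
```

Without the `i--`, the second of two adjacent matches would be skipped. Note that size() must not be hoisted out of the loop here, because the list shrinks as elements are removed.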

Related

For loop vs. Iterator to avoid ConcurrentModificationException with an ArrayList

Question: What is the optimal (performance-wise) solution for the add, removal, modification of items within an ArrayList which at the same time avoids the ConcurrentModificationException from being thrown during operations?
Context: Based on my research looking into this question, there doesn't seem to be any straight-forward answers to the question at hand - most recommend using CopyOnWriteArrayList, but my understanding is that it is not recommended for array lists of large size (which I am working with, hence the performance-aspect of the question).
Thus, my understanding can be summarized as follows, but I want to make sure whether it is correct or incorrect:
IMPORTANT NOTE: The following statements all assume that the operation is done within a synchronized block.
Remove during iteration of an ArrayList should be done with an Iterator, because for loop results in unpredictable behavior if removal is done within the middle of a collection. Example:
Iterator<Item> itemIterator = items.iterator();
while (itemIterator.hasNext()) {
Item item = itemIterator.next();
// check if item needs to be removed
itemIterator.remove();
}
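Add operations cannot be done with an Iterator, but can be done with a ListIterator. Example:
ListIterator<Item> itemIterator = items.listIterator();
while(itemIterator.hasNext()){
Item current = itemIterator.next(); // advance the cursor, otherwise the loop never ends
// do some operation which requires iteration of the ArrayList
itemIterator.add(item); // inserted after current; the iterator will not revisit it
}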
For add operations, a ListIterator does NOT necessarily have to be used (i.e. simply items.add(item) should not cause any problems).
Add operations while going through the collection can be done with EITHER a ListIterator or a for loop, but NOT an Iterator. Example:
Iterator<Item> itemIterator = items.iterator();
while (itemIterator.hasNext()) {
Item current = itemIterator.next();
// do some operation which requires iteration of the ArrayList
items.add(item); // NOT acceptable - cannot modify ArrayList while in an Iterator of that ArrayList
}
Modification of an item within an ArrayList can be done with either an Iterator or a for loop with the same performance complexity (is this true?). Example:
// iterator example
Iterator<Item> itemIterator = items.iterator();
while (itemIterator.hasNext()) {
Item item = itemIterator.next();
item.update(); // modifies the item within the ArrayList during iteration
}
// for loop example
for (Item item : items){
item.update();
}
Will modification during iteration with the Iterator have the same performance as the for loop? Are there any thread-safety differences between the approaches?
Bonus question: what advantage does using a synchronizedList of the ArrayList have for add/remove/modify operations, compared with a for loop or an iterator, if it also requires a synchronized block?
There is no difference between while loops and for loops and in fact, the idiomatic form of a loop using an iterator explicitly, is a for loop:
for(Iterator<Item> it = items.iterator(); it.hasNext(); ) {
Item item = it.next();
item.update();
}
which gets compiled to exactly the same code as
for(Item item: items) {
item.update();
}
Identical compiled code performs identically, regardless of the source code used to produce it.
Instead of focusing on the loop form, you have to focus on the fundamental limitations when inserting or removing elements of an ArrayList. Each time you insert or remove an element, the elements behind the affected index have to be copied to a new location. This isn’t very expensive, as the array only consists of references to the objects, but the costs can easily add up when doing it repeatedly.
So, if you know that the number of insertions or removals is predictably small or will happen at the end or close to the end (so there is only a small number of elements to copy), it’s not a problem. But when inserting or removing an arbitrary number of elements at arbitrary positions in a loop, you run into a quadratic time complexity.
You can avoid this, by using
items.removeIf(item -> /* check and return whether to remove the item*/);
This will use an internal iteration and postpone the moving of elements until their final position is known, leading to a linear time complexity.
If that’s not feasible, you might be better off copying the list into a new list, skipping the unwanted elements. This will be slightly less efficient but still have a linear time complexity. That’s also the solution for inserting a significant number of items at arbitrary positions.
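The two linear-time options described above can be sketched as follows. The helper names are made up for illustration; only `List.removeIf` and the plain copy loop are standard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class LinearRemovalSketch {
    // In-place removal: the list implementation decides how to move elements.
    static <T> void removeInPlace(List<T> items, Predicate<? super T> unwanted) {
        items.removeIf(unwanted);
    }

    // Copying alternative: build a new list, skipping the unwanted elements.
    // Each element is visited once and appended at most once.
    static <T> List<T> copyWithout(List<T> items, Predicate<? super T> unwanted) {
        List<T> kept = new ArrayList<>(items.size());
        for (T item : items) {
            if (!unwanted.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        System.out.println(copyWithout(numbers, n -> n % 2 == 0));
        removeInPlace(numbers, n -> n % 2 == 0);
        System.out.println(numbers);
    }
}
```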
The item.update(); is in an entirely different category. “the item within the ArrayList” is a wrong mindset. As said above, the ArrayList contains references to objects whereas the object itself is not affected by “being inside the ArrayList”. In fact, objects can be in multiple collections at the same time, as all standard collections only contain references.
So item.update(); changes the Item object, which is an operation independent of the ArrayList, which is dangerous when you assume a thread safety based on the list.
When you have code like
Item item = items.get(someIndex);
// followed by using item
where get is from a synchronizedList
or a manually synchronized retrieval operation which returns the item to the caller or any other form of code which uses a retrieved Item outside the synchronized block,
then your code is not thread safe. It doesn’t help when the update() call is done under a synchronization or lock when looping over the list, when the other uses are outside the synchronization or lock. To be thread safe, all uses of an object must be protected by the same thread safety construct.
So even when you use the synchronizedList, you must not only guard your loops manually, as the documentation already tells, you also have to expand the protection to all other uses of the contained elements, if they are mutable.
Alternatively, you could have different mechanisms for the list and the contained elements, if you know what you are doing, but it still means that the simplicity of “just wrap the list with synchronizedList” isn’t there.
So what advantage does it have? Actually none. It might have helped developers during the migration from Java 1.1 and its all-synchronized Vector and Hashtable to Java 2’s Collection API. But I never had a use for the synchronized wrappers at all. Any nontrivial use case requires manual synchronization (or locking) anyway.
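As a rough sketch of the manual guarding discussed above: traversal of a `Collections.synchronizedList` still has to be wrapped in a `synchronized` block on the list itself, as its documentation requires. The class and method names below are made up, and Integer elements are used deliberately because they are immutable; with mutable elements this block would protect only the list structure, not the elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SynchronizedListSketch {
    // Individual calls such as add() are synchronized by the wrapper,
    // but a whole traversal is not, so it must hold the list's lock itself.
    static int sum(List<Integer> syncList) {
        synchronized (syncList) {
            int total = 0;
            for (int value : syncList) {
                total += value;
            }
            return total;
        }
    }

    public static void main(String[] args) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        shared.add(1);
        shared.add(2);
        shared.add(3);
        System.out.println(sum(shared));
    }
}
```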

using Iterator is a performance hit? [duplicate]

I would like to know what are the advantages of Enhanced for loop and Iterators in Java +5 ?
The strengths and also the weaknesses are pretty well summarized in Stephen Colebourne's (Joda-Time, JSR-310, etc.) "Enhanced for each loop iteration control" proposal to extend it in Java 7:
FEATURE SUMMARY:
Extends the Java 5 for-each loop to allow access to the
loop index, whether this is the first
or last iteration, and to remove the
current item.
MAJOR ADVANTAGE
The for-each loop is almost certainly the most new
popular feature from Java 5. It works
because it increases the abstraction
level - instead of having to express
the low-level details of how to loop
around a list or array (with an index
or iterator), the developer simply
states that they want to loop and the
language takes care of the rest.
However, all the benefit is lost as
soon as the developer needs to access
the index or to remove an item.
The original Java 5 for each work took
a relatively conservative stance on a
number of issues aiming to tackle the
80% case. However, loops are such a
common form in coding that the
remaining 20% that was not tackled
represents a significant body of code.
The process of converting the loop
back from the for each to be index or
iterator based is painful. This is
because the old loop style if
significantly lower-level, is more
verbose and less clear. It is also
painful as most IDEs don't support
this kind of 'de-refactoring'.
MAJOR BENEFIT:
A common coding idiom is expressed at
a higher abstraction than at present.
This aids readability and clarity.
...
To sum up, the enhanced for loop offers a concise, higher-level syntax to loop over a list or array, which improves clarity and readability. However, it misses some parts: it gives no access to the loop index and no way to remove an item.
See also
Java 7 - For-each loop control access
Stephen Colebourne's original writeup
For me, it's clear, the main advantage is readability.
for(Integer i : list){
....
}
is clearly better than something like
for(int i=0; i < list.size(); ++i){
....
}
I think it's pretty much summed up by the documentation page introducing it here.
Iterating over a collection is uglier than it needs to be
So true..
The iterator is just clutter. Furthermore, it is an opportunity for error. The iterator variable occurs three times in each loop: that is two chances to get it wrong. The for-each construct gets rid of the clutter and the opportunity for error.
Exactly
When you see the colon (:) read it as “in.” The loop above reads as “for each TimerTask t in c.” As you can see, the for-each construct combines beautifully with generics. It preserves all of the type safety, while removing the remaining clutter. Because you don't have to declare the iterator, you don't have to provide a generic declaration for it. (The compiler does this for you behind your back, but you need not concern yourself with it.)
I'd like to sum it up more, but I think that page does it pretty much perfectly.
You can iterate over any collection that's Iterable and also arrays.
And the performance difference isn't anything you should be worried about at all.
Readability is important.
Prefer this
for (String s : listofStrings)
{
...
}
over
for (Iterator<String> iter = listofStrings.iterator(); iter.hasNext(); )
{
String s = iter.next();
...
}
Note that if you need to delete elements as you iterate, you need to use Iterator.
For example,
List<String> list = getMyListofStrings();
for (Iterator<String> iter = list.iterator(); iter.hasNext(); )
{
String s = iter.next();
if (someCondition) {
iter.remove();
}
}
You can't use for(String s : myList) to delete an element in the list.
Also note that when iterating through an array, foreach (or enhanced for) can only read the elements: assigning to the loop variable does not change the array slot (you can still call methods that mutate the object an element refers to).
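A small sketch of the array case (names are illustrative only): assigning to the for-each loop variable changes a local copy, so the array is left untouched, whereas an index-based loop can write back into the slot.

```java
import java.util.Arrays;

final class ArrayForEachSketch {
    // The assignment below only changes the local variable x.
    static int[] tryDoubleWithForEach(int[] input) {
        int[] data = input.clone();
        for (int x : data) {
            x = x * 2; // no effect on data
        }
        return data;
    }

    // Writing through the index updates the array slot itself.
    static int[] doubleWithIndex(int[] input) {
        int[] data = input.clone();
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i] * 2;
        }
        return data;
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3};
        System.out.println(Arrays.toString(tryDoubleWithForEach(numbers)));
        System.out.println(Arrays.toString(doubleWithIndex(numbers)));
    }
}
```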
For more info, see this.
Major drawback is the creation of an Iterator, which is not there with an index-based loop.
It is usually OK, but in performance-critical sections (in a real-time application for instance, when it has to run several hundred times a second), it can cause major GC intervention...
A cleaner syntax !
There is no difference from the performance perspective as this is just a convenience for a programmer.
As others said, the enhanced for loop provides cleaner syntax, more readable code and less typing.
Plus, it avoids the possible 'index out of bound' error scenario too. For example, when you iterate a list manually, you might use the index variable in a wrong way, like:
for(int i=0; i<= list.size(); i++)
which will throw an IndexOutOfBoundsException on the last pass. In the case of the enhanced for loop, we leave the iteration to the compiler, which avoids this error case entirely.
As others have already answered, it is syntactic sugar for cleaner code. Compared to the classic Iterator loop, you will find there is one less variable to declare.
A foreach/enhanced for/for loop serves to provide a cursor onto a data object. This is particularly useful when you think in terms of “walk a file line by line” or “walk a result set record by record” as it is simple and straightforward to implement.
This also provides a more general and improved way of iterating compared to index-based methods because there is no longer any need for the caller (for loop) to know how values are fetched or collection sizes or other implementation details.
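To illustrate that the for-each loop needs nothing from the caller beyond `Iterable`, here is a sketch with a made-up `Countdown` class. The loop in `render` knows nothing about how the values are produced or how many there are.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example type: yields start, start-1, ..., 1.
final class Countdown implements Iterable<Integer> {
    private final int start;

    Countdown(int start) {
        this.start = start;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = start;

            @Override
            public boolean hasNext() {
                return next > 0;
            }

            @Override
            public Integer next() {
                if (next <= 0) {
                    throw new NoSuchElementException();
                }
                return next--;
            }
        };
    }

    // The caller only sees "for each n in the countdown".
    static String render(int start) {
        StringBuilder sb = new StringBuilder();
        for (int n : new Countdown(start)) {
            sb.append(n).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(render(3));
    }
}
```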
It is more concise. Only problem is null checking.
for (String str : strs) { // make sure strs is not null here
// Do whatever
}
Less typing! Plus more help from the compiler
The enhanced for-loop offers the following main advantage over an index-based loop such as:
for (int i=0; i < list.size(); i++)
It eliminates the repeated call to list.size() on every iteration in the non-enhanced version above. This is a performance advantage that matters.
Alternatively, you may calculate the size outside the loop as follows using an additional variable:
int size = list.size();
for (int i=0; i < size; i++)

Enhanced for-loop does not accept Iterator

Excuse me if this has been asked before. My search did not bring up any other similar question. This is something that surprised me in Java.
Apparently, the enhanced for-loop only accepts an array or an instance of java.lang.Iterable. It does not accept a java.util.Iterator as a valid obj reference to iterate over. For example, Eclipse shows an error message for the following code. It says: "Can only iterate over an array or an instance of java.lang.Iterable"
Set<String> mySet = new HashSet<String>();
mySet.add("dummy");
mySet.add("test");
Iterator<String> strings = mySet.iterator();
for (String str : strings) {
// Process str value ...
}
Why would this be so? I want to know why the enhanced for-loop was designed to work this way. Although an Iterator is not a collection, it can be used to return one element at a time from a collection. Note that the only method in the java.lang.Iterable interface is Iterator<T> iterator() which returns an Iterator. Here, I am directly handing it an Iterator. I know that hasNext() and next() can be used but using the enhanced for-loop makes it look cleaner.
One thing I understand now is that I could use the enhanced for-loop directly over mySet. So I don't even need the extra call to get an Iterator. So, that would be the way to code this, and yes - it does make some sense.
The enhanced for loop was part of JSR 201. From that page, you can download the proposed final draft documents, which include a FAQ section directly addressing your question:
Appendix I. Design FAQ
Why can't I use the enhanced for statement with an Iterator (rather
than an Iterable or array)?
Two reasons: (1) The construct would not
provide much in the way on syntactic improvement if you had an
explicit iterator in your code, and (2) Execution of the loop would
have the "side effect" of advancing (and typically exhausting) the
iterator. In other words, the enhanced for statement provides a
simple, elegant, solution for the common case of iterating over a
collection or array, and does not attempt to address more complicated
cases, which are better addressed with the traditional for statement.
Why can't I use the enhanced for statement to:
remove elements as I traverse a collection ("filtering")?
simultaneously iterate over multiple collections or arrays?
modify the current slot in an array or list?
See Item 1 above. The expert group considered these cases, but
opted for a simple, clean extension that dose(sic) one thing well. The
design obeys the "80-20 rule" (it handles 80% of the cases with 20% of
the effort). If a case falls into the other 20%, you can always use an
explicit iterator or index as you've done in the past.
In other words, the committee chose to limit the scope of the proposal, and some features that one could imagine being part of the proposal didn't make the cut.
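If you really do have only an Iterator in hand, one workaround (on Java 8 or later, and not something the FAQ itself proposes) is to wrap it in a single-use Iterable with a lambda. The sketch below uses made-up names and also shows the "side effect" from reason (2): once the loop finishes, the iterator is exhausted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class IteratorAdapterSketch {
    // Wraps the iterator so the enhanced for statement accepts it.
    // The resulting Iterable can be looped over only once.
    static <T> List<T> drain(Iterator<T> iterator) {
        Iterable<T> once = () -> iterator;
        List<T> seen = new ArrayList<>();
        for (T item : once) {
            seen.add(item);
        }
        return seen;
    }

    public static void main(String[] args) {
        Iterator<String> strings = Arrays.asList("dummy", "test").iterator();
        System.out.println(drain(strings));
        System.out.println(strings.hasNext()); // the iterator has been used up
    }
}
```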
The enhanced for loop was introduced in Java 5 as a simpler way to
iterate through all the elements of a Collection [or an array].
http://www.cis.upenn.edu/~matuszek/General/JavaSyntax/enhanced-for-loops.html
An iterator is not a collection of elements,
it is an object that enables a programmer to traverse a container.
An iterator may be thought of as a type of pointer.
https://en.wikipedia.org/wiki/Iterator
So enhanced for loops work by going through all the elements in a structure that contains elements, while an iterator doesn't contain elements, it acts more like a pointer.
In your example, you are creating an iterator but not using it properly. To answer your question of why the error is reported (it is a compile-time error, not an exception), it comes from the line:
for (String str : strings) {
"strings" is an iterator here, not a collection that you can iterate through. So you have a few options. You can iterate through the set by using an enhanced for loop:
for(String myString : mySet){
//do work
}
or you can iterate through the set using an iterator:
Iterator<String> strings = mySet.iterator();
while(strings.hasNext()){
String myString = strings.next(); // advance the iterator, otherwise the loop never ends
//do work
}
hope you find this helpful.
The error occurs because you are trying to iterate over an Iterator, and not a List or Collection. If you want to use the Iterator, I recommend using its next() and hasNext() methods:
Set<String> mySet = new HashSet<String>();
mySet.add("dummy");
mySet.add("test");
Iterator<String> strings = mySet.iterator();
while(strings.hasNext()){
String temp = strings.next();
System.out.println(temp);
}

Collection - Iterator.remove() vs Collection.remove()

As per Sun ,
"Iterator.remove is the only safe way to modify a collection during
iteration; the behavior is unspecified if the underlying collection is
modified in any other way while the iteration is in progress."
I have two questions :
What makes this operation "Iterator.remove()" stable than the others ?
Why did they provide a "Collection.remove()" method if it will not be useful in most of the use-cases?
First of all, Collection.remove() is very useful. It is applicable in a lot of use cases, probably more so than Iterator.remove().
However, the latter solves one specific problem: it allows you to modify the collection while iterating over it.
The problem solved by Iterator.remove() is illustrated below:
List<Integer> l = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4));
for (int el : l) {
if (el < 3) {
l.remove(el);
}
}
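This code is broken since l.remove() is called during iteration over l; with an ArrayList this results in a ConcurrentModificationException. (Note also that because el is an int, l.remove(el) resolves to remove(int index), not remove(Object).)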
The following is the correct way to write it:
Iterator<Integer> it = l.iterator();
while (it.hasNext()) {
int el = it.next();
if (el < 3) {
it.remove();
}
}
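Here is a self-contained sketch contrasting the two approaches (class and method names are made up). The first method uses an `Integer` loop variable so that `remove(Object)` is the overload called; with an ArrayList and this data, the next step of the loop then fails with a ConcurrentModificationException. The second method removes through the iterator and completes normally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class RemoveWhileIteratingSketch {
    // Returns true if removing through the list during a for-each loop
    // was detected by the iterator.
    static boolean forEachRemovalFails() {
        List<Integer> l = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer el : l) {
                if (el < 3) {
                    l.remove(el); // remove(Object): modifies l behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator keeps iterator and list in step.
    static List<Integer> iteratorRemoval() {
        List<Integer> l = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = l.iterator(); it.hasNext(); ) {
            if (it.next() < 3) {
                it.remove();
            }
        }
        return l;
    }

    public static void main(String[] args) {
        System.out.println(forEachRemovalFails());
        System.out.println(iteratorRemoval());
    }
}
```

Keep in mind that this fail-fast check is documented as best-effort, so it should be treated as a bug detector rather than something to rely on.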
If you're iterating over a collection and use:
Collection.remove()
you can get runtime errors (specifically ConcurrentModificationException) because you're changing the state of the object used previously to construct the explicit series of calls necessary to complete the loop.
If you use:
Iterator.remove()
you tell the runtime that you would like to change the underlying collection AND re-evaluate the explicit series of calls necessary to complete the loop.
As the documentation you quoted clearly states,
Iterator.remove is the only safe way to modify a collection during iteration
(emphasis added)
While using an iterator, you cannot modify the collection, except by calling Iterator.remove().
If you aren't iterating the collection, you would use Collection.remove().
What makes this operation "Iterator.remove()" stable than the others ?
It means that the iterator knows you removed the element, so it won't produce a ConcurrentModificationException.
Why did they provide a "Collection.remove()" method if it will not be useful in most of the use-cases ?
Usually you would use Map.remove() or Collection.remove(), as this can be much more efficient than iterating over every object. If you are often removing while iterating, I suspect you should be using different collections.
It is just a design choice. It would have been possible to specify a different behavior (i.e. the iterator has to skip values that were removed by Collection.remove()), but that would have made the implementation of the collection framework much more complex. Hence the choice to leave it unspecified.
It's quite useful. If you know the object you want to remove, why iterate?
From what I understand, List.remove(int index) also returns the removed object (Collection.remove(Object) returns a boolean). Iterator.remove() returns nothing.
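A short sketch of the return values of the two `List` overloads, using made-up names. With a `List<Integer>` the distinction matters, since an `int` argument selects the index overload and an `Integer` argument selects the element overload.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveOverloadsSketch {
    static String describe() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        // remove(int index): returns the element that was at that index.
        Integer removedByIndex = list.remove(0);
        // remove(Object o): returns whether a matching element was found.
        boolean removedByValue = list.remove(Integer.valueOf(30));
        return removedByIndex + " " + removedByValue + " " + list;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "10 true [20]"
    }
}
```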
