Which iteration method to use for Java ArrayList [duplicate] - java

This question already has an answer here:
Time Complexity for Java ArrayList
(1 answer)
Closed 9 years ago.
When writing a for loop, we can write code like:
ArrayList<Object> myList = ...
for(int i=0; i < myList.size(); i++){
...
}
This way we are invoking .size() every time. Is it better to get the size in a variable and use that, i.e.
ArrayList<Object> myList = ...
int listSize = myList.size();
for(int i=0; i < listSize ; i++){
...
}
And there is another way for iteration, i.e.
for ( Object o : myList) { ... }
Which iteration method should be used as an efficient coding practice?
Thanks

Yes, size is a constant-time operation.
Since you're using the concrete type ArrayList, the call will almost certainly be inlined by the JIT compiler.
The inlining will also probably open the door for hoisting, so the actual machine code will be exactly as if you manually extracted size into a local variable.
It will almost never actually matter whether it's inlined/hoisted or not.
If your loop runs for at least 100k iterations, does almost nothing in the body, and is the inner loop executed many times over, then it starts making sense to wonder about the performance impact of the size call.
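For reference, the three loop forms from the question can be put side by side as below. This is only a sketch: the class name LoopForms and the summing body are invented for illustration, and the code shows that the forms are interchangeable in behaviour, not how fast each one is.

```java
import java.util.ArrayList;
import java.util.Arrays;

class LoopForms {
    // Calls size() in the loop condition on every iteration.
    static long sumCallingSize(ArrayList<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Reads size() once and keeps it in a variable scoped to the loop.
    static long sumCachedSize(ArrayList<Integer> list) {
        long sum = 0;
        for (int i = 0, n = list.size(); i < n; i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Enhanced for loop; it walks the list through its Iterator.
    static long sumForEach(ArrayList<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(sumCallingSize(list)); // 10
        System.out.println(sumCachedSize(list));  // 10
        System.out.println(sumForEach(list));     // 10
    }
}
```

Which of these is fastest on a given JVM is something only a measurement can tell you.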

Check the implementation. Yes, it does run in constant time: there is a field that holds the size.

The for-each loop should be used whenever possible (that is, whenever you are not modifying the list during iteration), as it lets the list choose the most efficient way to traverse itself.
If you must use an indexed for loop, you can easily avoid calling size() on every check by running backwards:
for (int i = list.size()-1; i>= 0; --i)
Edit: Following Marko Topolnik's comment, I wrote a small program to test the efficiency of Iterators, and it turned out that the Iterator is actually faster than the index implementation. See here for the code.
This is only true once the JVM has fully optimized that code; otherwise the Iterator is about 2% slower than the index implementation, but even then the difference is irrelevant for normal program execution.
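A runnable version of the backwards loop might look like the sketch below. The names BackwardLoop and visitBackwards are made up for this example; the point is only that size() appears in the initializer, so it is evaluated once, at the cost of visiting the elements in reverse order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BackwardLoop {
    // Visits the elements from last to first; size() is evaluated once,
    // in the loop initializer, and the condition only compares i with 0.
    static List<String> visitBackwards(List<String> list) {
        List<String> visited = new ArrayList<>();
        for (int i = list.size() - 1; i >= 0; --i) {
            visited.add(list.get(i));
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        System.out.println(visitBackwards(list)); // [c, b, a]
    }
}
```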

Related

For loops vs. While loops [duplicate]

This question already has answers here:
For vs. while in C programming?
(19 answers)
Disadvantage of for loop [closed]
(4 answers)
Closed 9 years ago.
It seems like, at least in all the languages I'm used to, a while loop can do all the things that a for loop can, and more. Since I'm most acquainted with Java, I'll use that for an example:
int foo = 6;
while (foo > 0)
{
this.bar();
foo--;
}
seems functionally identical to
for (int foo = 6; foo > 0; foo--)
this.bar();
From this, it looks to me like the for loop is wholly redundant in function to the while one. What am I missing here? Is one faster or more streamlined than the other once compiled? Does one automatically ditch the foo counter once it's no longer needed? Are they exactly the same in some compilers?
I'd be really surprised if they were completely identical, because, you know, DRY.
I've seen similar questions asked before, but none of them sought a really detailed answer.
Internally, both loops compile to the same machine code. The existence of two ways to repeat execution of code stems from two different use cases for each of them: for loops are usually used to iterate through a finite collection of identical objects, processing them in the same way. while loops, on the other hand, are slightly more versatile, and are usually used to repeat a piece of code until a condition is fulfilled. For example, in this piece of pseudocode:
while(document.nextLine())
document.doStuff();
the while loop iterates through the lines of a document, processing each line. It is not known in advance how many lines there are. This cannot be done as easily in a for loop, where you typically know in advance when you are going to stop.
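The document pseudocode above could be approximated in real Java as follows. This is a sketch under assumptions: LineCounter and countLines are hypothetical names, and java.util.Scanner stands in for the unspecified document object.

```java
import java.util.Scanner;

class LineCounter {
    // Counts lines without knowing in advance how many there are:
    // the loop simply runs until the scanner reports no more input.
    static int countLines(String text) {
        int lines = 0;
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine(); // consume the current line
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(countLines("first\nsecond\nthird")); // 3
    }
}
```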
A for loop is a while loop. The only difference is that the for loop includes initialization and state-change clauses, whereas a while loop requires you to do those separately.
The distinction is really more for ease of use and readability than it is purely functional. Say you want to iterate through a list:
for(int i = 0; i < list.size(); i++){
//code
}
is a lot nicer looking than:
int i = 0;
while(i<list.size()){
//code
i++;
}
They are (almost) functionally identical, though. Why did I say almost? Well, the scope of the variable i is different in those two cases. If, say, you wanted to do another iteration through that list after the first one, you could declare i again if you used the for loop. But if you used the while loop, the i would still be in scope (and you might not want that, as its purpose was only to assist in that iteration).
The two are functionally equivalent. However, there are also do-while loops, which always execute their body at least once (something that for loops do not guarantee), and there are enhanced for loops:
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/for.html
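To make the comparison concrete, the sketch below counts how often the body runs for each loop kind. The class LoopKinds and its method names are invented for this example, and the foo counter follows the question's code.

```java
class LoopKinds {
    // while: the condition is checked before the first iteration.
    static int whileRuns(int start) {
        int runs = 0;
        int foo = start; // stays in scope after the loop
        while (foo > 0) {
            runs++;
            foo--;
        }
        return runs;
    }

    // for: same behaviour, but foo is scoped to the loop itself.
    static int forRuns(int start) {
        int runs = 0;
        for (int foo = start; foo > 0; foo--) {
            runs++;
        }
        return runs;
    }

    // do-while: the body runs once before the condition is checked.
    static int doWhileRuns(int start) {
        int runs = 0;
        int foo = start;
        do {
            runs++;
            foo--;
        } while (foo > 0);
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(whileRuns(6) + " " + forRuns(6) + " " + doWhileRuns(6)); // 6 6 6
        System.out.println(whileRuns(0) + " " + forRuns(0) + " " + doWhileRuns(0)); // 0 0 1
    }
}
```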

Java for loop performance

Which is better in a for loop?
This:
for(int i = 0; i<someMethod(); i++)
{//some code
}
or:
int a = someMethod();
for(int i = 0; i<a; i++)
{//some code
}
Let's just say that someMethod() returns something large.
The first version will execute someMethod() on each iteration, thus decreasing speed; the second is faster, but let's say that there are a lot of similar loops in the application, so declaring a variable will consume more memory.
So which is better, or am I just thinking stupidly?
The second is better - assuming someMethod() does not have side effects.
It actually caches the value calculated by someMethod(), so you won't have to recalculate it (assuming it is a relatively expensive op).
If it does have side effects, the two code snippets are not equivalent, and you should do what is correct.
Regarding the "size for variable a": it is not an issue anyway. The returned value of someMethod() needs to be stored in some intermediate temporary before the comparison in any case (and even if it didn't, the size of one integer is negligible).
P.S.
In some cases, compiler / JIT optimizer might optimize the first code into the second, assuming of course no side effects.
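The difference between the two versions is easy to observe when someMethod() has a visible side effect. In the sketch below, the class HoistingDemo, the calls counter, and the return value 5 are all assumptions made for illustration; they are not part of the original question.

```java
class HoistingDemo {
    static int calls = 0; // side effect: counts invocations of someMethod()

    static int someMethod() {
        calls++;
        return 5;
    }

    // someMethod() is part of the condition, so it runs before every check.
    static int callsWhenInCondition() {
        calls = 0;
        for (int i = 0; i < someMethod(); i++) {
            // some code
        }
        return calls;
    }

    // someMethod() runs once; the loop compares against the cached value.
    static int callsWhenHoisted() {
        calls = 0;
        int a = someMethod();
        for (int i = 0; i < a; i++) {
            // some code
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWhenInCondition()); // 6: five passing checks plus the failing one
        System.out.println(callsWhenHoisted());     // 1
    }
}
```

Because the counter makes the calls observable, no optimizer may remove them here; the JIT can only hoist a call when it can prove doing so does not change behaviour.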
If in doubt, test. Use a profiler. Measure.
Assuming the iteration order isn't relevant, and also assuming you really want to nano-optimize your code, you may do this :
for (int i=someMethod(); i-->0;) {
//some code
}
But an additional local variable (your a) isn't such a burden. In practice, this isn't much different from your second version.
If you don't need this variable after loop, there is simple way to hide it inside:
for (int count = someMethod (), i = 0; i < count; i++)
{
// some code
}
It really depends on how long it takes to generate the output of someMethod(). The memory usage would be the same, because someMethod() first has to generate the output and store it anyway. The second way saves your CPU from computing the same output on every iteration, and it should not take more memory. So the second one is better.
I would not consider the memory consumption of the variable a a problem, as it is an int, which takes 32 bits in Java even on a 64-bit machine. So I would prefer the second alternative, as its execution efficiency is better.
The most important part of loop optimization is allowing the JVM to unroll the loop. To do so in the first variant, it has to be able to inline the call to someMethod(). Inlining has a budget, and that budget can be exhausted at some point. If someMethod() is long enough, the JVM may decide not to inline it.
The second variant is more helpful (to JIT compiler) and likely to work better.
My way of writing the loop is:
for (int i=0, max=someMethod(); i<max; i++){...}
max doesn't pollute the code, you ensure no side effects from multiple calls of someMethod() and it's compact (single liner)
If you need to optimize this, then this is the clean / obvious way to do it:
int a = someMethod();
for (int i = 0; i < a; i++) {
//some code
}
The alternative version suggested by @dystroy
for (int i=someMethod(); i-->0;) {
//some code
}
... has three problems.
He is iterating in the opposite direction.
That iteration is non-idiomatic, and hence less readable. Especially if you ignore the Java style guide and don't put whitespace where you are supposed to.
There is no proof that the code will actually be faster than the more idiomatic version ... especially once the JIT compiler has optimized them both. (And even if the less readable version is faster, the difference is likely to be negligible.)
On the other hand, if someMethod() is expensive (as you postulate) then "hoisting" the call so that it is only done once is likely to be worthwhile.
I was a bit confused about the same thing, so I ran a sanity test with a list of 10,000,000 integers. The difference was more than two seconds, with the latter being faster:
int a = someMethod();
for(int i = 0; i<a; i++)
{//some code
}
My results on Java 8 (MacBook Pro, 2.2 GHz Intel Core i7) were:
using list object:
Start- 1565772380899,
End- 1565772381632
calling list in 'for' expression:
Start- 1565772381633,
End- 1565772384888

Best Practice in `For` loop in java [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
for loop optimization
In Java I have a block of code:
List e = {element1, element2, ...., elementn};
for(int i = 0; i < e.size(); i++){//Do something in here
};
and another block:
List e = {element1, element2, ...., elementn};
int listSize = e.size();
for(int i = 0; i < listSize; i++){//Do something in here
};
I think that the second block is better, because in the first block, after each i++, we have to calculate e.size() one more time to evaluate the condition in the for loop. Is that right or wrong?
And comparing the two blocks above, what is the best practice for writing a for loop? And why? Explain clearly and try this loop yourself.
Personally I'd use the enhanced for statement instead:
for (Object element : e) {
// Use element
}
Unless you need the index, of course.
If I had to use one of the two forms, I'd use the first as it's tidier (it doesn't introduce another local variable which is only used in that loop), until I had concrete evidence that it was causing a problem. (In most list implementations, e.size() is a simple variable access which can be inlined by the JIT anyway.)
Usually, the most brief and readable code is the best choice, all things being equal. In the case of Java, the enhanced for loop (which works with any class that implements Iterable) is the way to go.
for (Object object : someCollection) { // do something }
In terms solely of the two you posted, I think the first is the better option. It's more readable, and you have to remember that, under the hood, JIT will attempt to optimize a great deal of the code you write anyway.
EDIT: Have you heard the phrase "premature optimisation is the root of all evil"? Your second block is an example of premature optimisation.
If you check the size() implementation in the LinkedList class, you will find that the size is incremented or decremented when an element is added to or removed from the list.
Calling size() just returns the value of this field and does not involve any calculation.
So calling size() directly should be better, as you save the storage for another integer.
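The idea of a maintained size field can be sketched with a toy list. CountingList below is a hypothetical class written for this illustration; it is not the JDK's LinkedList source, only the same bookkeeping idea in miniature.

```java
import java.util.NoSuchElementException;

class CountingList<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;

        Node(T value) {
            this.value = value;
        }
    }

    private Node<T> head;
    private int size; // kept up to date by every add and remove

    void addFirst(T value) {
        Node<T> node = new Node<>(value);
        node.next = head;
        head = node;
        size++;
    }

    T removeFirst() {
        if (head == null) {
            throw new NoSuchElementException();
        }
        T value = head.value;
        head = head.next;
        size--;
        return value;
    }

    // Constant time: returns the field, no traversal of the nodes.
    int size() {
        return size;
    }

    public static void main(String[] args) {
        CountingList<String> list = new CountingList<>();
        list.addFirst("a");
        list.addFirst("b");
        list.removeFirst();
        System.out.println(list.size()); // 1
    }
}
```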
I would always use (if you need an index variable):
List e = {element1, element2, ...., elementn};
for(int i = 0, size = e.size(); i < size; i++){
// Do something in here
};
Since e.size() could be an expensive operation.
Your 2nd option is not good, since it introduces a new variable outside of the for loop. I recommend keeping variable visibility as limited as possible.
Otherwise a
for (MyClass myObj : list) {
// Do something here
}
is even cleaner, but might introduce a small performance hit (the index approach doesn't require instantiating an Iterator).
Yes, the second form is marginally more efficient, as you don't repeatedly perform the size() method invocation. Compilers are good at doing this sort of optimisation themselves.
However, it's unlikely that this would be the performance bottleneck of your application. Avoid premature optimisation. Make your code clean and readable foremost.
HotSpot will hoist e.size() out of the loop in most cases, so it will calculate the size of the List only once.
As for me I prefer the following notation:
for (Object elem: e) {
//Do something
}
I think this should be much better.
Maybe initializing the int variable every time can be avoided this way:
List e = {element1, element2, ...., elementn};
int listSize = e.size();
int i=0;
for(i = 0; i < listSize; i++){//Do something in here
};
The second one is the better approach, because in the first block you call e.size(), a method, on every pass through the loop, which is an extra burden on the JVM.
I'm not so sure, but I think the Java optimizer will replace the call with a fixed value, so in the end it will be the same.
To avoid all this indexing, iterators, and checks when writing the code, use the following simple, most readable form, which also performs as well as possible.
Why this has maximum performance (details are coming up)
for (Object object : aCollection) {
// Do something here
}
If the index is needed then:
To choose between the above two forms:
The second is better, as you said, because it only calculates the size once.
I think now we have a tendency to write short and understandable code, so the first option is better.
The second is better, because in the body of the first loop you might execute a statement such as e.remove, and then the size of e will change, so it is better to save the size in a variable before the loop.
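Combining a saved size with removal inside the loop needs care, though: once an element is removed, the saved value no longer matches the list. The sketch below uses invented names (RemoveWhileLooping, cachedSizeOverruns, removeEvensSafely) to show the overrun and one alternative, Collection.removeIf, available since Java 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveWhileLooping {
    // Returns true if looping up to the saved size runs past the end
    // of the list after elements have been removed in the body.
    static boolean cachedSizeOverruns(List<Integer> list) {
        int size = list.size(); // stale as soon as an element is removed
        try {
            for (int i = 0; i < size; i++) {
                if (list.get(i) % 2 == 0) {
                    list.remove(i); // remove(int index): shrinks the list
                }
            }
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // Lets the collection handle removal during traversal itself.
    static void removeEvensSafely(List<Integer> list) {
        list.removeIf(value -> value % 2 == 0);
    }

    public static void main(String[] args) {
        List<Integer> first = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(cachedSizeOverruns(first)); // true

        List<Integer> second = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        removeEvensSafely(second);
        System.out.println(second); // [1, 3]
    }
}
```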

How bad is declaring arrays inside a for loop in Java?

I come from a C background, so I admit that I'm still struggling with letting go of memory management when writing in Java. Here's one issue that's come up a few times that I would love to get some elaboration on. Here are two ways to write the same routine, the only difference being when double[] array is declared:
Code Sample 1:
double[] array;
for (int i=0; i<n; ++i) {
array = calculateSomethingAndReturnAnArray(i);
if (someFunctionOnArrays(array)) {
// DO ONE THING
} else {
// DO SOME OTHER THING
}
}
Code Sample 2:
for (int i=0; i<n; ++i) {
double[] array = calculateSomethingAndReturnAnArray(i);
if (someFunctionOnArrays(array)) {
// DO ONE THING
} else {
// DO SOME OTHER THING
}
}
Here, private double[] calculateSomethingAndReturnAnArray(int i) always returns an array of the same length. I have a strong aversion to Code Sample 2 because it creates a new array for each iteration when it could just overwrite the existing array. However, I think this might be one of those times when I should just sit back and let Java handle the situation for me.
What are the reasons to prefer one of the ways over the other or are they truly identical in Java?
There's nothing special about arrays here because you're not allocating for the array, you're just creating a new variable, it's equivalent to:
Object foo;
for(...){
foo = func(...);
}
When you declare the variable outside the loop, the variable (which holds the location of the thing it refers to) is only ever allocated once. When you declare it inside the loop, the variable may be reallocated in each iteration, but my guess is that the compiler or the JIT will fix that in an optimization step.
I'd consider this a micro-optimization. If you're running into problems with this segment of your code, you should make decisions based on measurements rather than on the specs alone; if you're not, you should do the semantically correct thing and declare the variable in the scope that makes sense.
See also this similar question about best practices.
A declaration of a local variable without an initializing expression will do NO work whatsoever. The work happens when the variable is initialized.
Thus, the following are identical with respect to semantics and performance:
double[] array;
for (int i=0; i<n; ++i) {
array = calculateSomethingAndReturnAnArray(i);
// ...
}
and
for (int i=0; i<n; ++i) {
double[] array = calculateSomethingAndReturnAnArray(i);
// ...
}
(You can't even quibble that the first case allows the array to be used after the loop ends. For that to be legal, array has to have a definite value after the loop, and it doesn't unless you add an initializer to the declaration; e.g. double[] array = null;)
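A self-contained version of the two variants is sketched below. The class ArrayDeclarationScope and the calculate helper are hypothetical stand-ins for calculateSomethingAndReturnAnArray; the helper allocates a new array on every call, which is where the real work happens in both variants.

```java
class ArrayDeclarationScope {
    // Hypothetical stand-in for calculateSomethingAndReturnAnArray(i):
    // allocates and returns a fresh array of fixed length on each call.
    static double[] calculate(int i) {
        return new double[] { i, i * 2.0 };
    }

    // Variable declared once, outside the loop.
    static double sumDeclaredOutside(int n) {
        double total = 0;
        double[] array;
        for (int i = 0; i < n; ++i) {
            array = calculate(i);
            total += array[1];
        }
        return total;
    }

    // Variable declared inside the loop body.
    static double sumDeclaredInside(int n) {
        double total = 0;
        for (int i = 0; i < n; ++i) {
            double[] array = calculate(i);
            total += array[1];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumDeclaredOutside(4)); // 12.0
        System.out.println(sumDeclaredInside(4));  // 12.0
    }
}
```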
To elaborate on @Mark Elliot's point about micro-optimization:
This is really an attempt to optimize rather than a real optimization, because (as I noted) it should have no effect.
Even if the Java compiler actually emitted some non-trivial executable code for double[] array;, the chances are that the time to execute would be insignificant compared with the total execution time of the loop body, and of the application as a whole. Hence, this is most likely to be a pointless optimization.
Even if this is a worthwhile optimization, you have to consider that you have optimized for a specific target platform; i.e. a particular combination of hardware and JVM version. Micro-optimizations like this may not be optimal on other platforms, and could in theory be anti-optimizations.
In summary, you are most likely wasting your time if you focus on things like this when writing Java code. If performance is a concern for your application, focus on the MACRO level performance; e.g. things like algorithmic complexity, good database / query design, patterns of network interactions, and so on.
Both create a new array for each iteration. They have the same semantics.

Calling a method n times: should I use a converted for-each loop or a traditional for loop?

Given the need to loop up to an arbitrary int value, is it better programming practice to convert the value into an array and for-each the array, or just use a traditional for loop?
FYI, I am calculating the number of 5 and 6 results ("hits") in multiple throws of 6-sided dice. My arbitrary int value is the dicePool which represents the number of multiple throws.
As I understand it, there are two options:
Convert the dicePool into an array and for-each the array:
public int calcHits(int dicePool) {
int[] dp = new int[dicePool];
for (Integer a : dp) {
// call throwDice method
}
}
Use a traditional for loop:
public int calcHits(int dicePool) {
for (int i = 0; i < dicePool; i++) {
// call throwDice method
}
}
My view is that option 1 is clumsy code and involves unnecessary creation of an array, even though the for-each loop is more efficient than the traditional for loop in Option 2.
At this point, speed isn't important (insert premature-optimization comment ;). What matters is how quickly you can understand what the code does, which is to call a method dicePool times.
The first method allocates an array of size dicePool and iterates through its values, which happens to run the loop body dicePool times (I'll pretend you meant int instead of Integer to avoid the unrelated autoboxing issue). This is potentially inefficient for the computer running the code, but more importantly it's inefficient for the human reading the code as it's conceptually distant from what you wanted to accomplish. Specifically, you force the reader to think about the new array you've just made, AND the value of the variable a, which will be 0 for every iteration of the loop, even though neither of those are related to your end goal.
Any Java programmer looking at the second method will realize that you're executing the loop body dicePool times with i 'counting up' to dicePool. While the latter part isn't especially important, the beginning is exactly what you meant to do. Using this common Java idiom minimizes the unrelated things a reader needs to think about, so it's the best choice.
When in doubt, go with simplicity. :D
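Filled in, the second method might look like the sketch below. The throwDice helper, the Random parameter, and the class name DiceHits are assumptions added so the example runs; the question only says that a throwDice method is called and that results of 5 and 6 count as hits.

```java
import java.util.Random;

class DiceHits {
    // Hypothetical helper: one throw of a six-sided die, returning 1 to 6.
    static int throwDice(Random rng) {
        return rng.nextInt(6) + 1;
    }

    // Throws the die dicePool times and counts results of 5 or 6.
    static int calcHits(int dicePool, Random rng) {
        int hits = 0;
        for (int i = 0; i < dicePool; i++) {
            if (throwDice(rng) >= 5) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The result varies from run to run because the generator is unseeded.
        System.out.println(calcHits(10, new Random()));
    }
}
```

Passing the Random in is a design choice made here so that a seeded generator gives repeatable results in tests.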
Why would you need to allocate an array to loop over a variable that can be safely incremented and used without any need of allocation?
It sounds unnecessarily inefficient. You might need to allocate an array if you needed to swap the order of the ints, but this is not the case. I would go for option 2 for sure.
The for-each loop is useful when you want to iterate over a collection, but creating a collection just to iterate over it when you don't otherwise need it makes no sense.
(2) is the obvious choice because there's no point in creating the array, based on your description. If there is, of course things change.
What makes you think that the for-each loop is more efficient?
Iterating over a set is very likely less efficient than a simple loop and counter.
It might help if you gave more context about the problem, specifically whether there's more to this question than choosing one syntax over the other. I am having trouble thinking of a problem to which #1 would be a better solution.
I wouldn't write the first one. It's not necessary to use the latest syntax in every setting.
Your instinct is a good one: if it feels and looks clumsy, it probably is.
Go with #2 and sleep at night.
