Java: instantiate variables in loop: good or bad style? - java

I've got one simple question. Normally I write code like this:
String myString = "hello";
for (int i=0; i<10; i++)
{
    myString = "hello again";
}
because I think the following would not be good style, since it would create too many unnecessary objects:
for (int i=0; i<10; i++)
{
    String myString = "hello again";
}
Is this even correct? Or is this only the case when I've got an explicit object, like an instance of a class I created? What if it was a boolean or an int? What is better coding style: instantiating it once before the loop and using it in the loop, or instantiating it again on every iteration? And why? Because the program is faster, or less storage is used, or...?
Someone told me that if it was a boolean I should instantiate it directly in the loop. He said it would not make a difference for the heap, and it would be clearer that the variable belongs inside the loop. So what is correct?
Thanks for an answer! :-)
====
Thanks for all your answers!
In conclusion: it is preferable to declare an object inside the smallest scope possible. There is no performance improvement from declaring and instantiating objects outside the loop, even if the object is reinstantiated on every iteration.

No, the latter code isn't actually valid. It would be with braces though:
for (int i=0; i<10; i++)
{
String myString = "hello again";
}
(Basically you can't use a variable declaration as a single-statement body for an if statement, a loop etc.)
It would be pointless, but valid - and preferable to the first version, IMO. It takes no more memory, but it's generally a good idea to give your local variables the narrowest scope you can, declaring as late as you can, ideally initializing at the same point. It makes it clearer where each variable can be used.
Of course, if you need to refer to the variable outside the loop (before or afterwards) then you'll need to declare it outside the loop too.
You need to differentiate between variables and objects when you consider efficiency. The above code uses at most one object - the String object referred to by the literal "hello again".
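To make the variable/object distinction concrete, here is a minimal sketch. The class and method names (LiteralVsNew, distinctWithLiteral, distinctWithNew) are made up for illustration. It counts, by reference identity, how many distinct String objects a loop sees: a literal refers to one interned object, whereas new String(...) allocates a fresh object on every iteration.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical demo class; the names are illustrative only.
class LiteralVsNew {

    // Counts distinct objects (compared by reference, not equals)
    // when the loop variable is assigned a string literal.
    static int distinctWithLiteral(int iterations) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (int i = 0; i < iterations; i++) {
            String myString = "hello again"; // same interned object every time
            seen.add(myString);
        }
        return seen.size();
    }

    // Same loop, but an explicit constructor call allocates each time.
    static int distinctWithNew(int iterations) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (int i = 0; i < iterations; i++) {
            String myString = new String("hello again"); // new object every time
            seen.add(myString);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctWithLiteral(10)); // prints 1
        System.out.println(distinctWithNew(10));     // prints 10
    }
}
```

In both methods the variable is declared inside the loop; only the expression on the right-hand side decides how many objects exist.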

As Binyamin Sharet mentioned, you generally want to declare a variable within the smallest scope possible. In your specific examples, the second one is generally preferable unless you need access to the variable outside your loop.
However, under certain conditions this can have performance implications--namely, if you are instantiating the same object over and over again. In your particular example, you benefit from Java's automatic pooling of String literals. But suppose you were actually creating a new instance of the same object on every iteration of the loop, and this loop was being executed hundreds or thousands of times:
for (int i=0; i<1000; i++)
{
String myString = new String("hello again"); // 1000 Strings are created--one on every iteration
...
}
If your loop is looping hundreds or thousands of times but it just so happens that you're instantiating the same object over and over again, instantiating it inside the loop is going to result in a lot of unnecessary garbage collection, because you create and throw away a new object on every iteration. In that case, you would be better off declaring and instantiating the variable once outside of the loop:
String myString = new String("hello again"); // only one String is created
for (int i=0; i<1000; i++)
{
...
}
And, to come full circle, you can manually limit the scope by adding extra braces around the relevant section of code:
{ // Limit the scope
String myString = new String("hello again");
for (int i=0; i<1000; i++)
{
...
}
}

It seems you mean declare, not instantiate. In general, you should declare a variable in the smallest scope required (in this case, in the loop).

If you are going to use the variable outside the for loop, then declare it outside; otherwise it's better to keep the scope to a minimum.

The problem with the second is that you create objects and someone (the GC) has to clean them up; of course, for 10 iterations it is unimportant.
BTW, in your specific example I would have written
String myString = null;
final String HELLO_AGAIN="hello again";
for (int i=0; i<10; i++)
myString = HELLO_AGAIN;

Unless value is changed, you should definitely instantiate outside of the loop.

The problem here is that String is an immutable object: you cannot change the value of a string; you can only create new String objects. Either way, if your goal is to assign a variable a new object instance, then limit your scope and declare it inside the body of your loop.
If your object is mutable, then it would be reasonable to reuse the object in each subsequent iteration of the loop and just change the attributes you need. This concept applies when you run the same query multiple times with different parameters: you use a PreparedStatement.
In the extreme case, you would even maintain pools of objects which can be shared within the whole application. You create additional objects as you run out of resources, you shrink if you detect a reasonable amount of non-use. This concept is used to maintain a Connection Pool.

Related

Memory footprint with for loop

I have the following code:
for (int i = 0; i < array.length; i++) {
int current = array[i];
//do something with current...
}
and the function
int current = 0;
for (int i = 0; i < array.length; i++) {
current = array[i];
//do something with current...
}
My question is, do they have the same memory footprint??
I mean, it is clear that the 2nd function will only have 1 variable "current". But how about the first function? Let's assume array has length 1000; does this mean 1000 integer variables "current" will be created in the inner loop?
No difference. But IMHO you should generally give variables the smallest scope you can, so declare it inside the loop to limit its scope. You should also initialize variables when they are defined, which is another reason not to declare it outside the loop.
They have exactly the same footprint. They even have (without regard to some variable numbering) the exact same bytecode. You can try by putting this in a Test.java, compile it and disassemble it with "javap -c Test"
HTH :)
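A minimal sketch of such a test file follows. The class name ScopeTest and the summing logic are assumptions added for illustration (the answer only mentions a Test.java), so the disassembly command would be "javap -c ScopeTest". Comparing the two methods in the output should show the same loop instructions, apart from local variable slot numbers and the extra initial store of 0 in the second method.

```java
// Hypothetical test class for comparing both forms with "javap -c ScopeTest".
class ScopeTest {

    // "current" is declared inside the loop body.
    static int sumInside(int[] array) {
        int sum = 0;
        for (int i = 0; i < array.length; i++) {
            int current = array[i];
            sum += current; // stands in for "do something with current"
        }
        return sum;
    }

    // "current" is declared once, before the loop.
    static int sumOutside(int[] array) {
        int sum = 0;
        int current = 0;
        for (int i = 0; i < array.length; i++) {
            current = array[i];
            sum += current;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumInside(data));  // prints 10
        System.out.println(sumOutside(data)); // prints 10
    }
}
```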
There is no difference. The compiler is smart enough to generate similar bytecode for both cases by making the right optimizations.
If you want to use the variable outside the loop, declare it outside it, otherwise, in order to give the variable the smallest scope, declare it inside the loop (and consider making it final in this case).
The two code fragments are equivalent. May even compile to the exact same bytecode (someone will decompile it). Each just creates a single local variable (that is reused in the loop).

Best Practice in `For` loop in java [duplicate]

Possible Duplicate:
for loop optimization
In java i have a block of code:
List e = {element1, element2, ...., elementn};
for(int i = 0; i < e.size(); i++){//Do something in here
};
and another block:
List e = {element1, element2, ...., elementn};
int listSize = e.size();
for(int i = 0; i < listSize; i++){//Do something in here
};
I think that the second block is better, because in the first block, after each i++, we have to calculate e.size() one more time to evaluate the condition in the for loop. Is that right or wrong?
Comparing the two blocks above, what is the best practice for writing a for loop? And why? Explain clearly and try this loop yourself.
Personally I'd use the enhanced for statement instead:
for (Object element : e) {
// Use element
}
Unless you need the index, of course.
If I had to use one of the two forms, I'd use the first as it's tidier (it doesn't introduce another local variable which is only used in that loop), until I had concrete evidence that it was causing a problem. (In most list implementations, e.size() is a simple variable access which can be inlined by the JIT anyway.)
Usually, the most brief and readable code is the best choice, all things being equal. In the case of Java, the enhanced for loop (which works with any class that implements Iterable) is the way to go.
for (Object object : someCollection) {
    // do something
}
In terms solely of the two you posted, I think the first is the better option. It's more readable, and you have to remember that, under the hood, JIT will attempt to optimize a great deal of the code you write anyway.
EDIT: Have you heard the phrase "premature optimisation is the root of all evil"? Your second block is an example of premature optimisation.
If you check the size() implementation in the LinkedList class, you will find that the size is incremented or decremented when an element is added to or removed from the list.
Calling size() just returns the value of this property and does not involve any calculation.
So directly calling the size() method should be better, as you will save the storage for another integer.
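The idea of a size counter maintained on every add and remove can be sketched as follows. CountedList is a hypothetical, stripped-down list written for this illustration; it is not the JDK's LinkedList source, only the same bookkeeping pattern.

```java
// Hypothetical singly linked list that keeps its size in a field.
class CountedList<T> {

    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node<T> head;
    private int size; // updated on every add and remove

    void addFirst(T value) {
        head = new Node<>(value, head);
        size++;
    }

    // Removes and returns the first element, or null if the list is empty.
    T removeFirst() {
        if (head == null) {
            return null;
        }
        T value = head.value;
        head = head.next;
        size--;
        return value;
    }

    // No traversal here: just a field read.
    int size() {
        return size;
    }
}
```

With this layout, calling size() in a loop condition costs a field read per iteration rather than a walk over the nodes.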
I would always use (if you need an index variable):
List e = {element1, element2, ...., elementn};
for(int i = 0, size = e.size(); i < size; i++){
// Do something in here
};
Since e.size() could be an expensive operation.
Your 2nd option is not good, since it introduces a new variable outside of the for loop. I recommend keeping variable visibility as limited as possible.
Otherwise a
for (MyClass myObj : list) {
// Do something here
}
is even cleaner, but might introduce a small performance hit (the index approach doesn't require instantiating an Iterator).
Yes, the second form is marginally more efficient as you don't repeatedly perform the size() method invocation. Compilers are good at doing this sort of optimisation themselves.
However, it's unlikely that this would be the performance bottleneck of your application. Avoid premature optimisation. Make your code clean and readable foremost.
HotSpot will hoist e.size() out of the loop in most cases, so it will calculate the size of the List only once.
As for me I prefer the following notation:
for (Object elem: e) {
//Do something
}
I think this should be much better.
Maybe initializing the int variable every time can be avoided this way:
List e = {element1, element2, ...., elementn};
int listSize = e.size();
int i=0;
for(i = 0; i < listSize; i++){//Do something in here
};
The second one is the better approach because in the first block you are calling e.size(), a method call, on every iteration of the loop, which is an extra burden on the JVM.
I'm not so sure, but I think Java's optimizer will replace the call with a fixed value, so in the end it will be the same.
To avoid all this indexing, iterators and checking when writing the code, use the following simple, readable code, which also performs as well as possible.
Why this has maximum performance (details are coming up):
for (Object object : aCollection) {
// Do something here
}
If the index is needed, then to choose between the above two forms:
the second is better, as you said, because it calculates the size only once.
I think now we have a tendency to write short and understandable code, so the first option is better.
The second is better, because in the body of the first loop you might execute a statement such as e.remove, and then the size of e will change, so it is better to save the size in a variable before the loop.

Temporary variable used for each iteration of a large loop, strings are immutable so what should I use?

I have a loop like:
String tmp;
for(int x = 0; x < 1000000; x++) {
// use temp
temp = ""; // reset
}
This string is holding at most 10 characters.
What would be the most efficient way of creating a variable for this use case?
Should I use a fixed size array? Or a StringBuffer instead?
I don't want to create 1 million variables when I don't have to, and it matters for this method (performance).
Edit
I simplified my scenario; I actually need this variable to be at class-level scope, as there are some events that take place, i.e. it can't be declared within the loop.
Why not simply declare temp inside the loop like so:
for(int x = 0; x < 1000000; x++) {
String temp;
// use temp
}
You even get a very (very, very) slight performance increase because you don't have to waste time resetting the value of temp to "".
With regard to your update, it still depends on what you do with temp, but a StringBuffer would probably be the easiest to use. And especially if you need to concatenate together a String, it would be quite fast.
What exactly are you looking to do with tmp (or temp)?
Honestly, I'd just try declaring your variables within the loop if they aren't needed afterwards, and profile it. Many of the obscurities that have been used in the past to help with performance issues within loops are no longer needed in recent versions of Java, due to optimizations and other improvements in the compiler and the Hotspot JVM.
What's the problem with using a fixed array? I think an array will do. Here is a similar question I found: Making a very large Java array
Well, StringBuffer or StringBuilder will do too, but StringBuilder is faster than StringBuffer.
And if it based on the performance level, i think you might want to check the types of loops that give better performance.
Try this
public class Robal {
    public void looping()
    {
        for(int x = 0; x < 1000000; x++) {
            String temp=x+"";
            System.out.println(temp);
            temp = ""; // reset
        }
    }
}
The answer really depends on what you do with temp in the loop.
String instances are immutable by definition. If your processing includes string manipulation, you should not use String since you'll end up creating a lot of unnecessary very short-lived immutable instances. In this case use StringBuilder (or StringBuffer if thread-safety is required) instead.
If you merely create a new String (or obtain it from an external source) in every iteration and use it without any string manipulation operations that create new String objects, then you're OK using String. Note that creating a new String instance every iteration is usually quite fast and unless your profiler specifically points to this being a problem, you should not attempt to optimize this prematurely.
Note, also, that unless you specifically rely in each iteration on temp's initial value being a reference to an empty string, there is no need to do temp = "".
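For the string-manipulation case, reusing one StringBuilder across iterations might look like the sketch below. The class name ReuseBuilder, the method totalLength and the "item-" label are assumptions made up for this example; whether it actually beats plain String use in a given program is something only a profiler can confirm.

```java
// Hypothetical example: builds "item-<x>" labels with a single reused builder.
class ReuseBuilder {

    static int totalLength(int iterations) {
        StringBuilder temp = new StringBuilder(16); // allocated once, outside the loop
        int total = 0;
        for (int x = 0; x < iterations; x++) {
            temp.setLength(0);              // reset contents, keep the buffer
            temp.append("item-").append(x); // manipulate without new String objects
            total += temp.length();         // stands in for "use temp"
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalLength(10)); // prints 60
    }
}
```

Because the builder can live in a field just as well as in a local variable, the same pattern fits the class-level scope mentioned in the question's edit.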

How bad is declaring arrays inside a for loop in Java?

I come from a C background, so I admit that I'm still struggling with letting go of memory management when writing in Java. Here's one issue that's come up a few times that I would love to get some elaboration on. Here are two ways to write the same routine, the only difference being when double[] array is declared:
Code Sample 1:
double[] array;
for (int i=0; i<n; ++i) {
array = calculateSomethingAndReturnAnArray(i);
if (someFunctionOnArrays(array)) {
// DO ONE THING
} else {
// DO SOME OTHER THING
}
}
Code Sample 2:
for (int i=0; i<n; ++i) {
double[] array = calculateSomethingAndReturnAnArray(i);
if (someFunctionOnArrays(array)) {
// DO ONE THING
} else {
// DO SOME OTHER THING
}
}
Here, private double[] calculateSomethingAndReturnAnArray(int i) always returns an array of the same length. I have a strong aversion to Code Sample 2 because it creates a new array for each iteration when it could just overwrite the existing array. However, I think this might be one of those times when I should just sit back and let Java handle the situation for me.
What are the reasons to prefer one of the ways over the other or are they truly identical in Java?
There's nothing special about arrays here, because you're not allocating the array; you're just creating a new variable. It's equivalent to:
Object foo;
for(...){
foo = func(...);
}
In the case where you create the variable outside the loop, the variable (which will hold the location of the thing it refers to) will only ever be allocated once. In the case where you create the variable inside the loop, the variable may be reallocated in each iteration, but my guess is the compiler or the JIT will fix that in an optimization step.
I'd consider this a micro-optimization. If you're running into problems with this segment of your code, you should be making decisions based on measurements rather than on the specs alone; if you're not running into issues with this segment of code, you should do the semantically correct thing and declare the variable in the scope that makes sense.
See also this similar question about best practices.
A declaration of a local variable without an initializing expression will do NO work whatsoever. The work happens when the variable is initialized.
Thus, the following are identical with respects to semantics and performance:
double[] array;
for (int i=0; i<n; ++i) {
array = calculateSomethingAndReturnAnArray(i);
// ...
}
and
for (int i=0; i<n; ++i) {
double[] array = calculateSomethingAndReturnAnArray(i);
// ...
}
(You can't even quibble that the first case allows the array to be used after the loop ends. For that to be legal, array has to have a definite value after the loop, and it doesn't unless you add an initializer to the declaration; e.g. double[] array = null;)
To elaborate on @Mark Elliot's point about micro-optimization:
This is really an attempt to optimize rather than a real optimization, because (as I noted) it should have no effect.
Even if the Java compiler actually emitted some non-trivial executable code for double[] array;, the chances are that the time to execute would be insignificant compared with the total execution time of the loop body, and of the application as a whole. Hence, this is most likely to be a pointless optimization.
Even if this is a worthwhile optimization, you have to consider that you have optimized for a specific target platform; i.e. a particular combination of hardware and JVM version. Micro-optimizations like this may not be optimal on other platforms, and could in theory be anti-optimizations.
In summary, you are most likely wasting your time if you focus on things like this when writing Java code. If performance is a concern for your application, focus on the MACRO level performance; e.g. things like algorithmic complexity, good database / query design, patterns of network interactions, and so on.
Both create a new array for each iteration. They have the same semantics.

Which loop has better performance? Why?

String s = "";
for(i=0;i<....){
s = some Assignment;
}
or
for(i=0;i<..){
String s = some Assignment;
}
I don't need to use 's' outside the loop ever again.
The first option is perhaps better since a new String is not initialized each time. The second however would result in the scope of the variable being limited to the loop itself.
EDIT: In response to Milhous's answer. It'd be pointless to assign the String to a constant within a loop wouldn't it? No, here 'some Assignment' means a changing value got from the list being iterated through.
Also, the question isn't because I'm worried about memory management. Just want to know which is better.
Limited Scope is Best
Use your second option:
for ( ... ) {
String s = ...;
}
Scope Doesn't Affect Performance
If you disassemble the code compiled from each (with the JDK's javap tool), you will see that the loop compiles to the exact same JVM instructions in both cases. Note also that Brian R. Bondy's "Option #3" is identical to Option #1. Nothing extra is added or removed from the stack when using the tighter scope, and the same data are used on the stack in both cases.
Avoid Premature Initialization
The only difference between the two cases is that, in the first example, the variable s is unnecessarily initialized. This is a separate issue from the location of the variable declaration. This adds two wasted instructions (to load a string constant and store it in a stack frame slot). A good static analysis tool will warn you that you are never reading the value you assign to s, and a good JIT compiler will probably elide it at runtime.
You could fix this simply by using an empty declaration (i.e., String s;), but this is considered bad practice and has another side-effect discussed below.
Often a bogus value like null is assigned to a variable simply to hush a compiler error that a variable is read without being initialized. This error can be taken as a hint that the variable scope is too large, and that it is being declared before it is needed to receive a valid value. Empty declarations force you to consider every code path; don't ignore this valuable warning by assigning a bogus value.
Conserve Stack Slots
As mentioned, while the JVM instructions are the same in both cases, there is a subtle side-effect that makes it best, at a JVM level, to use the most limited scope possible. This is visible in the "local variable table" for the method. Consider what happens if you have multiple loops, with the variables declared in unnecessarily large scope:
void x(String[] strings, Integer[] integers) {
String s;
for (int i = 0; i < strings.length; ++i) {
s = strings[i];
...
}
Integer n;
for (int i = 0; i < integers.length; ++i) {
n = integers[i];
...
}
}
The variables s and n could be declared inside their respective loops, but since they are not, the compiler uses two "slots" in the stack frame. If they were declared inside the loop, the compiler can reuse the same slot, making the stack frame smaller.
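One way to look at this yourself is the sketch below. SlotDemo and its two methods are hypothetical names, and the summing bodies are filler added so the code runs. After compiling, "javap -v SlotDemo" prints a "locals=" figure for each method (and, if compiled with "javac -g", the local variable table); the narrow-scope method would be expected to need fewer slots, though the exact numbers depend on the compiler.

```java
// Hypothetical class for inspecting stack frame slots with "javap -v SlotDemo".
class SlotDemo {

    // s and n are in scope until the end of the method,
    // so each one occupies its own slot.
    static int wideScope(String[] strings, Integer[] integers) {
        int total = 0;
        String s;
        for (int i = 0; i < strings.length; ++i) {
            s = strings[i];
            total += s.length();
        }
        Integer n;
        for (int i = 0; i < integers.length; ++i) {
            n = integers[i];
            total += n;
        }
        return total;
    }

    // s and n go out of scope with their loops,
    // so the compiler is free to reuse one slot for both.
    static int narrowScope(String[] strings, Integer[] integers) {
        int total = 0;
        for (int i = 0; i < strings.length; ++i) {
            String s = strings[i];
            total += s.length();
        }
        for (int i = 0; i < integers.length; ++i) {
            Integer n = integers[i];
            total += n;
        }
        return total;
    }
}
```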
What Really Matters
However, most of these issues are immaterial. A good JIT compiler will see that it is not possible to read the initial value you are wastefully assigning, and optimize the assignment away. Saving a slot here or there isn't going to make or break your application.
The important thing is to make your code readable and easy to maintain, and in that respect, using a limited scope is clearly better. The smaller scope a variable has, the easier it is to comprehend how it is used and what impact any changes to the code will have.
In theory, it's a waste of resources to declare the string inside the loop.
In practice, however, both of the snippets you presented will compile down to the same code (declaration outside the loop).
So, if your compiler does any amount of optimization, there's no difference.
In general I would choose the second one, because the scope of the 's' variable is limited to the loop. Benefits:
This is better for the programmer because you don't have to worry about 's' being used again somewhere later in the function
This is better for the compiler because the scope of the variable is smaller, and so it can potentially do more analysis and optimisation
This is better for future readers because they won't wonder why the 's' variable is declared outside the loop if it's never used later
If you want to speed up for loops, I prefer declaring a max variable next to the counter so that no repeated lookups for the condition are needed:
instead of
for (int i = 0; i < array.length; i++) {
Object next = array[i];
}
I prefer
for (int i = 0, max = array.length; i < max; i++) {
Object next = array[i];
}
Any other things that should be considered have already been mentioned, so just my two cents (see erickson's post)
Greetz, GHad
To add on a bit to @Esteban Araya's answer, they will both require the creation of a new string each time through the loop (as the return value of the some Assignment expression). Those strings need to be garbage collected either way.
I know this is an old question, but I thought I'd add a bit that is slightly related.
I've noticed while browsing the Java source code that some methods, like String.contentEquals (duplicated below), make redundant local variables that are merely copies of class variables. I believe that there was a comment somewhere that implied that accessing local variables is faster than accessing class variables.
In this case "v1" and "v2" are seemingly unnecessary and could be eliminated to simplify the code, but were added to improve performance.
public boolean contentEquals(StringBuffer sb) {
synchronized(sb) {
if (count != sb.length())
return false;
char v1[] = value;
char v2[] = sb.getValue();
int i = offset;
int j = 0;
int n = count;
while (n-- != 0) {
if (v1[i++] != v2[j++])
return false;
}
}
return true;
}
It seems to me that we need more specification of the problem.
The
s = some Assignment;
is not specified as to what kind of assignment this is. If the assignment is
s = "" + i + "";
then a new string needs to be allocated.
but if it is
s = some Constant;
s will merely point to the constant's memory location, and thus the first version would be more memory efficient.
It seems a little silly to worry about too much optimization of a for loop in an interpreted language, IMHO.
When I'm using multiple threads (50+), I've found this to be a very effective way of handling ghost-thread issues where a process can't be closed correctly. If I'm wrong, please let me know why:
import java.io.BufferedInputStream;
import java.io.IOException;

Process one;
BufferedInputStream two;
try {
    one = Runtime.getRuntime().exec(command);
    two = new BufferedInputStream(one.getInputStream());
} catch (IOException e) {
    e.printStackTrace();
} finally {
    // null to ensure they are erased
    one = null;
    two = null;
    // nudge the gc
    System.gc();
}
