I was wondering which one is better in terms of memory allocation. I know that at this scale they would not be any different, since creating a variable takes so little memory, but I want to get used to coding the better way for the future.
public static void test(Scanner input, int[] arr){
    for (int i = 0; i < 5; i++){
        int age = input.nextInt();
        arr[i] = age;
    }
}
or
public static void test(Scanner input, int[] arr){
int age = null;
for (int i = 0; i < 5; i++){
age = input.nextInt();
arr[i] = age;
}
}
Declaring the variable before the for loop would only allocate one place in memory for all the values I set it to, correct? But if I declare the variable in the for loop, will 5 different places in memory be allocated or just one that will be overwritten when the for loop is run again?
I should also state that the variable age will not be used anywhere else but inside that for loop.
Thanks.
It shouldn't make any difference, so my suggestion is to limit the variable to the required scope (i.e. option 1). Also, int age = null; will not compile, because a primitive int cannot hold null; it reads as if age were a pointer even though it isn't. int age = 0; would be correct and clearer.
There is already plenty of discussion here: Difference between declaring variables before or in loop?
There are other factors, such as whether the variable is static or global.
In general, the more you optimize for performance, the less readable and the less safe the code becomes.
The difference in performance, if not eliminated by the compiler's optimization, would not be noticeable at all. What is noticeable, and should be taken into account, is the readability of your code.
Declaring the variable before the loop only makes the code harder to read, and also pollutes the namespace outside the loop. As a general rule of thumb, declare your variables in the smallest scope where you need them.
Don't introduce a temporary variable yourself; this version is better:
public static void test(Scanner input, int[] arr){
for (int i = 0; i < 5; i++){
arr[i] = input.nextInt();
}
}
If the variable will be used only within the loop, then declare it within the loop, so as not to pollute the function's namespace.
Which is better, first or second?
From a performance perspective, you'd have to measure it. (And in my opinion, if you can measure a difference, the compiler isn't very good).
From a maintenance perspective, first is better. Declare and initialize variables in the same place, in the narrowest scope possible. Don't leave a gaping hole between the declaration and the initialization, and don't pollute namespaces you don't need to.
Related
I am wondering, from perspective of memory usage, performance and clean code, is it better to initialize variable inside or outside the loop.
For example, below I show two options using variable myInt in a for loop.
Which option is better?
I have an intuition about what each option does, but I want a true "Java" clarification of which option is better for 1) performance, 2) memory and 3) code style.
Option 1:
int myInt = 0;
for (int i = 0; i < 100; i++) {
    // some manipulation here with myInt
}
Option 2:
for (int i = 0; i < 100; i++) {
    int myInt = 0;
    // some manipulation here with myInt
}
Variables should always* be declared as locally as possible. If you use the integer inside the loop only, it should be declared inside the loop.
*always - unless you have a really good and proven reason not to
If you want to use myInt only within the for loop, Option 2 is better.
If you want to use it outside the loop, Option 1 is better.
Using variables in the smallest scope is the better option.
Well, these two options provide two different use cases:
The value of myInt in Option 2 will reset on every loop iteration, since its scope is only within the loop.
Option 1 is the way to go if you want to do something with myInt within the loop and then do something with it after the loop.
I personally wouldn't care about memory or performance here, use the scope you need.
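The two use cases described above can be sketched as follows. This is only an illustration: the class name ScopeDemo, both method names, and the extra variable last are made up for this example and are not part of the question.

```java
public class ScopeDemo {
    // Option 1: myInt is declared before the loop, so it keeps its value
    // across iterations and is still visible after the loop ends.
    static int accumulateOutside() {
        int myInt = 0;
        for (int i = 0; i < 100; i++) {
            myInt += i;
        }
        return myInt;
    }

    // Option 2: myInt is declared inside the loop, so it starts from 0 on
    // every iteration and cannot be read after the loop. The hypothetical
    // variable "last" exists only to carry one value out for inspection.
    static int lastInside() {
        int last = -1;
        for (int i = 0; i < 100; i++) {
            int myInt = 0;
            myInt += i;
            last = myInt;
        }
        return last;
    }

    public static void main(String[] args) {
        System.out.println(accumulateOutside()); // sum of 0..99
        System.out.println(lastInside());        // value from the final iteration
    }
}
```

If the running total is what you need, only the first form works; if each iteration is independent, the second form makes that independence visible to the reader.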
I have the following code:
for (int i = 0; i < array.length; i++) {
int current = array[i];
//do something with current...
}
and the function
int current = 0;
for (int i = 0; i < array.length; i++) {
current = array[i];
//do something with current...
}
My question is, do they have the same memory footprint?
I mean, it is clear that the 2nd function will only have 1 variable "current". But how about the first function? Let's assume array has length 1000. Does this mean 1000 integer variables "current" will be created in the inner loop?
No difference. But IMHO you should generally give variables the smallest scope you can, so declare it inside the loop to limit its scope. You should also initialize variables when they are defined, which is another reason not to declare it outside the loop.
They have exactly the same footprint. They even have (without regard to some variable numbering) the exact same bytecode. You can check by putting both versions in a Test.java, compiling it, and disassembling it with "javap -c Test".
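A minimal Test.java for that experiment might look like the sketch below. The class name comes from the answer; the method names sumInside and sumOutside, and the summation that stands in for "do something with current", are assumptions added so that the file compiles and the loop body has an effect.

```java
public class Test {
    // "current" is declared inside the loop body.
    static int sumInside(int[] array) {
        int sum = 0;
        for (int i = 0; i < array.length; i++) {
            int current = array[i];
            sum += current;
        }
        return sum;
    }

    // "current" is declared (and initialized) once, before the loop.
    static int sumOutside(int[] array) {
        int sum = 0;
        int current = 0;
        for (int i = 0; i < array.length; i++) {
            current = array[i];
            sum += current;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumInside(data));
        System.out.println(sumOutside(data));
    }
}
```

After "javac Test.java", running "javap -c Test" prints the instructions for both methods. Comparing the two loops side by side should show the same instructions apart from local variable slot numbers, plus the one-time store for the "= 0" initializer in sumOutside.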
HTH :)
There is no difference. The compiler is smart enough to generate similar bytecode for both cases by making the right optimizations.
If you want to use the variable outside the loop, declare it outside it, otherwise, in order to give the variable the smallest scope, declare it inside the loop (and consider making it final in this case).
The two code fragments are equivalent. They may even compile to the exact same bytecode (someone will decompile it). Each just creates a single local variable that is reused in the loop.
I have a loop like:
String tmp;
for(int x = 0; x < 1000000; x++) {
// use temp
temp = ""; // reset
}
This string is holding at most 10 characters.
What would be the most efficient way of creating a variable for this use case?
Should I use a fixed-size array? Or a StringBuffer instead?
I don't want to create 1 million variables when I don't have to, and it matters for this method (performance).
Edit
I simplified my scenario. I actually need this variable to be at class-level scope because some events take place, i.e. it can't be declared within the loop.
Why not simply declare temp inside the loop like so:
for(int x = 0; x < 1000000; x++) {
String temp;
// use temp
}
You even get a very (very, very) slight performance increase because you don't have to waste time resetting the value of temp to "".
With regard to your update, it still depends on what you do with temp, but a StringBuffer would probably be the easiest to use. Especially if you need to concatenate a String together, it would be quite fast.
What exactly are you looking to do with tmp (or temp)?
Honestly, I'd just try declaring your variables within the loop if they aren't needed afterwards, and profile it. Many of the obscurities that have been used in the past to help with performance issues within loops are no longer needed in recent versions of Java, due to optimizations and other improvements in the compiler and the Hotspot JVM.
What's the problem with using a fixed array? I think an array will do. Here is a similar question I found: Making a very large Java array
Well, StringBuffer or StringBuilder will do too, but StringBuilder is faster than StringBuffer.
And if performance is the concern, you might want to check which types of loops give better performance.
Try this
public class Robal {
    public void looping() {
        for (int x = 0; x < 1000000; x++) {
            String temp = x + "";
            System.out.println(temp);
            temp = ""; // reset
        }
    }
}
The answer really depends on what you do with temp in the loop.
String instances are immutable by definition. If your processing includes string manipulation, you should not use String since you'll end up creating a lot of unnecessary very short-lived immutable instances. In this case use StringBuilder (or StringBuffer if thread-safety is required) instead.
If you merely create a new String (or obtain it from an external source) in every iteration and use it without any string manipulation operations that create new String objects, then you're OK using String. Note that creating a new String instance every iteration is usually quite fast and unless your profiler specifically points to this being a problem, you should not attempt to optimize this prematurely.
Note, also, that unless each iteration specifically relies on temp initially being a reference to an empty string, there is no need to do temp = "" at all.
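As a rough sketch of the StringBuilder suggestion above, assuming the variable has to live at class level as the question's edit says: the class name BufferReuse, the method run and the "#" prefix are all invented for illustration, since the question does not say what is done with temp.

```java
public class BufferReuse {
    // Hypothetical class-level buffer; one instance is reused by every iteration.
    private final StringBuilder temp = new StringBuilder(10);

    // Builds a short label in place on each iteration and returns the
    // total number of characters produced, so the work has a visible result.
    long run(int iterations) {
        long totalLength = 0;
        for (int x = 0; x < iterations; x++) {
            temp.setLength(0);           // reset the contents without allocating a new buffer
            temp.append('#').append(x);  // the manipulation happens inside the same buffer
            totalLength += temp.length();
        }
        return totalLength;
    }

    public static void main(String[] args) {
        System.out.println(new BufferReuse().run(1_000_000));
    }
}
```

Whether this actually beats creating a fresh String per iteration depends on the workload, so it is worth profiling both before committing to either.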
I come from a C background, so I admit that I'm still struggling with letting go of memory management when writing in Java. Here's one issue that's come up a few times that I would love to get some elaboration on. Here are two ways to write the same routine, the only difference being when double[] array is declared:
Code Sample 1:
double[] array;
for (int i=0; i<n; ++i) {
array = calculateSomethingAndReturnAnArray(i);
if (someFunctionOnArrays(array)) {
// DO ONE THING
} else {
// DO SOME OTHER THING
}
}
Code Sample 2:
for (int i=0; i<n; ++i) {
double[] array = calculateSomethingAndReturnAnArray(i);
if (someFunctionOnArrays(array)) {
// DO ONE THING
} else {
// DO SOME OTHER THING
}
}
Here, private double[] calculateSomethingAndReturnAnArray(int i) always returns an array of the same length. I have a strong aversion to Code Sample 2 because it creates a new array for each iteration when it could just overwrite the existing array. However, I think this might be one of those times when I should just sit back and let Java handle the situation for me.
What are the reasons to prefer one of the ways over the other or are they truly identical in Java?
There's nothing special about arrays here, because you're not allocating the array in the declaration; you're just creating a new variable. It's equivalent to:
Object foo;
for(...){
foo = func(...);
}
When you create the variable outside the loop, the variable (which holds the location of the thing it refers to) is only ever allocated once. When you create the variable inside the loop, the variable may be reallocated in each iteration, but my guess is that the compiler or the JIT will fix that in an optimization step.
I'd consider this a micro-optimization. If you're running into problems with this segment of your code, you should make decisions based on measurements rather than on the specs alone. If you're not, you should do the semantically correct thing and declare the variable in the scope that makes sense.
See also this similar question about best practices.
A declaration of a local variable without an initializing expression will do NO work whatsoever. The work happens when the variable is initialized.
Thus, the following are identical with respect to semantics and performance:
double[] array;
for (int i=0; i<n; ++i) {
array = calculateSomethingAndReturnAnArray(i);
// ...
}
and
for (int i=0; i<n; ++i) {
double[] array = calculateSomethingAndReturnAnArray(i);
// ...
}
(You can't even quibble that the first case allows the array to be used after the loop ends. For that to be legal, array has to have a definite value after the loop, and it doesn't unless you add an initializer to the declaration; e.g. double[] array = null;)
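The parenthetical point about definite assignment can be illustrated with a small sketch. The class DefiniteAssignment and the helpers compute and lastFirstElement are hypothetical stand-ins, not code from the question.

```java
public class DefiniteAssignment {
    // Hypothetical stand-in for calculateSomethingAndReturnAnArray(i).
    static double[] compute(int i) {
        return new double[] { i, i * 2.0 };
    }

    static double lastFirstElement(int n) {
        // The initializer is required here: the loop may run zero times,
        // so without "= null" the compiler rejects the read after the loop
        // because array might not have been initialized.
        double[] array = null;
        for (int i = 0; i < n; ++i) {
            array = compute(i);
        }
        return array == null ? Double.NaN : array[0];
    }

    public static void main(String[] args) {
        System.out.println(lastFirstElement(3));
        System.out.println(lastFirstElement(0));
    }
}
```

Removing the "= null" initializer and recompiling is a quick way to see the compiler error for yourself.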
To elaborate on @Mark Elliot's point about micro-optimization:
This is really an attempt to optimize rather than a real optimization, because (as I noted) it should have no effect.
Even if the Java compiler actually emitted some non-trivial executable code for double[] array;, the chances are that the time to execute would be insignificant compared with the total execution time of the loop body, and of the application as a whole. Hence, this is most likely to be a pointless optimization.
Even if this is a worthwhile optimization, you have to consider that you have optimized for a specific target platform; i.e. a particular combination of hardware and JVM version. Micro-optimizations like this may not be optimal on other platforms, and could in theory be anti-optimizations.
In summary, you are most likely wasting your time if you focus on things like this when writing Java code. If performance is a concern for your application, focus on the MACRO level performance; e.g. things like algorithmic complexity, good database / query design, patterns of network interactions, and so on.
Both create a new array for each iteration. They have the same semantics.
String s = "";
for(i=0;i<....){
s = some Assignment;
}
or
for(i=0;i<..){
String s = some Assignment;
}
I don't need to use 's' outside the loop ever again.
The first option is perhaps better since a new String is not initialized each time. The second however would result in the scope of the variable being limited to the loop itself.
EDIT: In response to Milhous's answer. It'd be pointless to assign the String to a constant within a loop wouldn't it? No, here 'some Assignment' means a changing value got from the list being iterated through.
Also, the question isn't because I'm worried about memory management. Just want to know which is better.
Limited Scope is Best
Use your second option:
for ( ... ) {
String s = ...;
}
Scope Doesn't Affect Performance
If you disassemble the code compiled from each (with the JDK's javap tool), you will see that the loop compiles to the exact same JVM instructions in both cases. Note also that Brian R. Bondy's "Option #3" is identical to Option #1. Nothing extra is added or removed from the stack when using the tighter scope, and the same data are used on the stack in both cases.
Avoid Premature Initialization
The only difference between the two cases is that, in the first example, the variable s is unnecessarily initialized. This is a separate issue from the location of the variable declaration. This adds two wasted instructions (to load a string constant and store it in a stack frame slot). A good static analysis tool will warn you that you are never reading the value you assign to s, and a good JIT compiler will probably elide it at runtime.
You could fix this simply by using an empty declaration (i.e., String s;), but this is considered bad practice and has another side-effect discussed below.
Often a bogus value like null is assigned to a variable simply to hush a compiler error that a variable is read without being initialized. This error can be taken as a hint that the variable scope is too large, and that it is being declared before it is needed to receive a valid value. Empty declarations force you to consider every code path; don't ignore this valuable warning by assigning a bogus value.
Conserve Stack Slots
As mentioned, while the JVM instructions are the same in both cases, there is a subtle side-effect that makes it best, at a JVM level, to use the most limited scope possible. This is visible in the "local variable table" for the method. Consider what happens if you have multiple loops, with the variables declared in unnecessarily large scope:
void x(String[] strings, Integer[] integers) {
    String s;
    for (int i = 0; i < strings.length; ++i) {
        s = strings[i];
        ...
    }
    Integer n;
    for (int i = 0; i < integers.length; ++i) {
        n = integers[i];
        ...
    }
}
The variables s and n could be declared inside their respective loops, but since they are not, the compiler uses two "slots" in the stack frame. If they were declared inside the loop, the compiler can reuse the same slot, making the stack frame smaller.
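One way to look at this yourself is sketched below. The class SlotDemo and the methods wide and narrow are hypothetical, and the loop bodies (elided in the example above) are filled with a simple running total purely so that the code compiles and does something observable.

```java
public class SlotDemo {
    // Wide scope: s and n are both declared at method level,
    // so each one keeps its own slot until the method ends.
    static int wide(String[] strings, Integer[] integers) {
        int total = 0;
        String s;
        for (int i = 0; i < strings.length; ++i) {
            s = strings[i];
            total += s.length();
        }
        Integer n;
        for (int i = 0; i < integers.length; ++i) {
            n = integers[i];
            total += n;
        }
        return total;
    }

    // Narrow scope: s and n live only inside their loops,
    // so the compiler is free to give them the same slot.
    static int narrow(String[] strings, Integer[] integers) {
        int total = 0;
        for (int i = 0; i < strings.length; ++i) {
            String s = strings[i];
            total += s.length();
        }
        for (int i = 0; i < integers.length; ++i) {
            Integer n = integers[i];
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        String[] strings = {"ab", "c"};
        Integer[] integers = {1, 2};
        System.out.println(wide(strings, integers));
        System.out.println(narrow(strings, integers));
    }
}
```

Compiling this and running "javap -v SlotDemo" prints a "locals=" figure in each method's Code section. With a typical javac the narrow version should report a smaller number, though the exact figures depend on the compiler.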
What Really Matters
However, most of these issues are immaterial. A good JIT compiler will see that it is not possible to read the initial value you are wastefully assigning, and optimize the assignment away. Saving a slot here or there isn't going to make or break your application.
The important thing is to make your code readable and easy to maintain, and in that respect, using a limited scope is clearly better. The smaller scope a variable has, the easier it is to comprehend how it is used and what impact any changes to the code will have.
In theory, it's a waste of resources to declare the string inside the loop.
In practice, however, both of the snippets you presented will compile down to the same code (declaration outside the loop).
So, if your compiler does any amount of optimization, there's no difference.
In general I would choose the second one, because the scope of the 's' variable is limited to the loop. Benefits:
This is better for the programmer because you don't have to worry about 's' being used again somewhere later in the function
This is better for the compiler because the scope of the variable is smaller, and so it can potentially do more analysis and optimisation
This is better for future readers because they won't wonder why the 's' variable is declared outside the loop if it's never used later
If you want to speed up for loops, I prefer declaring a max variable next to the counter so that the condition does not have to repeat the length lookup:
instead of
for (int i = 0; i < array.length; i++) {
    Object next = array[i];
}
I prefer
for (int i = 0, max = array.length; i < max; i++) {
    Object next = array[i];
}
Anything else that should be considered has already been mentioned, so this is just my two cents (see erickson's post).
Greetz, GHad
To add on a bit to @Esteban Araya's answer, they will both require the creation of a new string each time through the loop (as the return value of the some Assignment expression). Those strings need to be garbage collected either way.
I know this is an old question, but I thought I'd add a bit that is slightly related.
I've noticed while browsing the Java source code that some methods, like String.contentEquals (duplicated below), make redundant local variables that are merely copies of class variables. I believe there was a comment somewhere implying that accessing local variables is faster than accessing class variables.
In this case "v1" and "v2" are seemingly unnecessary and could be eliminated to simplify the code, but were added to improve performance.
public boolean contentEquals(StringBuffer sb) {
synchronized(sb) {
if (count != sb.length())
return false;
char v1[] = value;
char v2[] = sb.getValue();
int i = offset;
int j = 0;
int n = count;
while (n-- != 0) {
if (v1[i++] != v2[j++])
return false;
}
}
return true;
}
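The same pattern, reduced to its essentials, might look like the sketch below. The class LocalCopy and the method countSpaces are invented for illustration; whether the local copy makes any measurable difference on a modern JIT is not something this example demonstrates.

```java
public class LocalCopy {
    private final char[] value;

    LocalCopy(String s) {
        this.value = s.toCharArray();
    }

    int countSpaces() {
        // Copy the field reference into a local once, as v1 does in the
        // JDK snippet, then index through the local inside the loop.
        char[] v = value;
        int count = 0;
        for (int i = 0; i < v.length; i++) {
            if (v[i] == ' ') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(new LocalCopy("a b c").countSpaces());
    }
}
```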
It seems to me that we need more specification of the problem.
The
s = some Assignment;
is not specified as to what kind of assignment this is. If the assignment is
s = "" + i + "";
then a new string needs to be allocated.
but if it is
s = some Constant;
s will merely point to the constant's memory location, and thus the first version would be more memory efficient.
It seems a little silly to worry about too much optimization of a for loop in an interpreted language, IMHO.
When I'm using multiple threads (50+), I have found this to be a very effective way of handling ghost-thread issues where a process cannot be closed correctly. If I'm wrong, please let me know why:
import java.io.BufferedInputStream;
import java.io.IOException;

Process one;
BufferedInputStream two;
try {
    one = Runtime.getRuntime().exec(command);
    two = new BufferedInputStream(one.getInputStream());
} catch (IOException e) {
    e.printStackTrace();
} finally {
    // null to ensure they are erased
    one = null;
    two = null;
    // nudge the gc
    System.gc();
}