Is the compiler able to optimize reference creation? - java

Take for example a loop like this:
public boolean method() {
    for (int i = 0; i < 5; i++) {
        if (this.object.getSomething().getSomeArray().get(i).getArray().size() > 0)
            return false;
    }
    return true;
}
Each get method simply retrieves a private attribute. A more readable version of the same code would be:
public boolean method() {
    MySomeArray mySomeArray = this.object.getSomething().getSomeArray();
    for (int i = 0; i < 5; i++) {
        MyArray array = mySomeArray.get(i).getArray();
        if (array.size() > 0)
            return false;
    }
    return true;
}
Another version is:
public boolean method() {
    MySomeArray mySomeArray = this.object.getSomething().getSomeArray();
    MyArray array;
    for (int i = 0; i < 5; i++) {
        array = mySomeArray.get(i).getArray();
        if (array.size() > 0)
            return false;
    }
    return true;
}
I know that in theory compilers can optimize many things, and in this case (in my opinion) the three versions of the loop should be optimized into exactly the same machine code.
Am I correct, or would there be a difference in the number of instructions executed across the three versions?

If MySomeArray, as well as all other classes involved in your dereference chain, are at the bottom of their respective class hierarchies, then HotSpot will have an easy time turning all those virtual function calls into "plain" (non-virtual) calls by a technique known as monomorphic call site optimization.
This can also happen even if the classes involved are not leaf classes. The important thing is that at each call site, only one object type ever gets dispatched on.
With the uncertainty of virtual functions out of the way, the compiler can proceed to inline all the calls, and then to perform any further optimizations, like hoisting in your case. The ultimate values retrieved from the chain of dereferencing can be bound to registers, etc.
Note that much of the above is subject to the entire code path being free of any happens-before relations to the actions of other threads. In practice this mostly means no volatile variable access and no synchronized blocks (within your own code as well as within all the code called from your code).
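To make the happens-before caveat concrete, here is a minimal runnable sketch (the class and field names are mine, not from the question): because the flag is volatile, the read in the loop condition cannot be hoisted; every iteration must observe a fresh value, which is precisely what keeps this loop from spinning forever, and also precisely what blocks the optimizations described above.

```java
// Minimal sketch: a loop spinning on a volatile flag. The volatile read in
// the while condition must be re-executed on every iteration, so the JIT
// cannot hoist it out of the loop.
public class VolatileHoisting {
    static volatile boolean stop = false;

    public static void main(String[] args) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            stop = true; // guaranteed to become visible to the spinning loop
        });
        t.start();
        long spins = 0;
        while (!stop) { // a volatile read: cannot be hoisted
            spins++;
        }
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println("loop exited after " + spins + " spins");
    }
}
```

If `stop` were a plain (non-volatile) field, the JIT would be free to hoist the read and this loop might never terminate.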

Write a test case that uses this method and print the generated assembly code when you run it. You can then check yourself how many of the calls are inlined. I'm skeptical about the compiler being able to inline them all, but the JIT compiler can be surprising.
I would prefer the more readable version anyway, because it's more readable.

With enough inlining, the compiler can indeed hoist the method calls out of the loop, very much like you did by hand in your second and third examples. The details of whether it will actually do this depend entirely on the behavior and size of the methods in question, and the sophistication of the JIT involved.
I wrote up your example and tested it with Caliper, and all the methods have equivalent timings. I didn't inspect the assembly, since that's more involved, but I'll bet they are nearly equivalent.

The trouble is that you are making assumptions that the compiler cannot make.
You know that this.object.getSomething().getSomeArray() does not change each time around the loop, but the compiler has no way to know that, especially since other threads may be modifying those variables at the same time...
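A small counting sketch shows what the chained form literally asks for at the source level (the stand-in classes below are hypothetical, shaped after the question's code); whether HotSpot erases the difference at runtime is exactly the inlining-and-hoisting question the other answers discuss.

```java
// Hypothetical stand-in classes shaped after the question's code; a static
// counter records every getter invocation so the two forms can be compared.
public class GetterChain {
    static int getterCalls = 0;

    static class MyArray {
        int size() { return 0; }
    }

    static class Element {
        MyArray getArray() { getterCalls++; return new MyArray(); }
    }

    static class MySomeArray {
        Element get(int i) { getterCalls++; return new Element(); }
    }

    static class Something {
        MySomeArray getSomeArray() { getterCalls++; return new MySomeArray(); }
    }

    static Something something = new Something();

    static Something getSomething() { getterCalls++; return something; }

    public static void main(String[] args) {
        // Chained form: the full chain of 4 getters runs on every iteration.
        getterCalls = 0;
        for (int i = 0; i < 5; i++) {
            if (getSomething().getSomeArray().get(i).getArray().size() > 0) break;
        }
        int chained = getterCalls;   // 4 getters x 5 iterations = 20

        // Hoisted form: 2 getters run once, 2 getters run per iteration.
        getterCalls = 0;
        MySomeArray mySomeArray = getSomething().getSomeArray();
        for (int i = 0; i < 5; i++) {
            if (mySomeArray.get(i).getArray().size() > 0) break;
        }
        int hoisted = getterCalls;   // 2 + 2 x 5 = 12

        System.out.println(chained + " vs " + hoisted); // prints 20 vs 12
    }
}
```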

Related

Should a local variable be introduced for an array element at a specific index which is accessed repeatedly?

If an array element at a specific index is accessed repeatedly in a loop, should a local variable be introduced, for the sake of performance? That is, does array access by index carry overhead?
e.g.
public void test(int[] arr) {
    for (int i = 0; i < (1 << 20); i++) {
        System.out.println(arr[0]);
    }
}

public void test2(int[] arr) {
    int first = arr[0];
    for (int i = 0; i < (1 << 20); i++) {
        System.out.println(first);
    }
}
Is test2() better than test() in the sense of performance?
Update - languages of interests
Golang, C, Java
The question depends entirely on the language used, and more specifically on the toolchain (compilers, JIT, interpreters, etc.). Since the provided code is in Java, I will consider the case of Java on a mainstream JVM like HotSpot.
Mainstream JVM implementations can easily optimize this themselves as long as the loop is hot. Indeed, the JVM can see that arr[0] is loop-invariant here, especially once the function has been executed multiple times. Thus, as is usually the case, it is not a problem, and you should not care about such micro-optimizations unless a benchmark shows it actually is one. The proposed optimization does not matter here anyway, because the println call will be several orders of magnitude slower than anything else in the loop.
Note, however, that when you have a lot of small loops and the code is executed only a few times or very rarely, the second version can be slightly faster. The reason is that the first version produces slightly less efficient bytecode that the JVM may not optimize right away, due to the cost of compiling bytecode to fast native code (it has to find a trade-off).
Jérôme Richard's answer has the most important part in it: don't worry about this kind of micro-optimization unless/until you have a benchmark showing that it's important.
I'll answer from the Go and C sides in a different way though. The two bits of code have different meanings here. (I'm not really a Java programmer so I'll just refer to Aliasing in Java with Arrays and Reference Types for the Java variant of this point.) Let's also change the code so that we have a mystery function, rather than some known-to-do-nothing-but-print function:
/* C */
extern void f(int);

void test(int *arr) {
    int i;
    for (i = 0; i < (1 << 20); i++) {
        f(arr[0]);
    }
}
// Go
func test(arr []int, f func(int)) {
    for i := 0; i < (1 << 20); i++ {
        f(arr[0])
    }
}
Now let's consider a valid call to test. Here's part of the C-language implementation of f:
extern int A[];

void f(int arg) {
    /* do something with arg */
    A[0]++;
}
The call to test reads:
test(A);
That is, arr in test is A, and f() modifies A[0]. So each call to f() needs to pass a different integer value.
If you modify test to read:
/* C */
extern void f(int);

void test(int *arr) {
    int i;
    int arg = arr[0];
    for (i = 0; i < (1 << 20); i++) {
        f(arg);
    }
}
then suddenly each call to f passes only the original value of A[0]. So these programs have different meanings.
Go and C have similar aliasing rules. However, Go compilers can often "see further" than C compilers (because the compiler usually gets a better chance to do function inlining, if nothing else) and hence detect whether or not some aliasing may be taking place. It's easier, in a sense, for a Go compiler to grab the arr[0] value once outside the loop, if that's possible, than it is for the C compiler. That's not a function of the language itself: it's a function of the traditional ways that C and Go compilers have been written.
Still, the upshot of all this is that if you intend to pass the same value to your function every trip through the loop, you can write that as code by copying arr[0] to a local variable before running the loop. If you intend to allow arr[0] to be modified each trip through the loop, you can write that by writing the variant without a local variable—but it might also be wise to put in a comment, noting that the called function is intended to be able to modify the array element.
Write the code so that the reader can understand the intent first. Then, if and when it proves to be a bottleneck, write the code in some more-obscure-but-faster form, if that's possible and appropriate.
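The same re-read versus hoisted distinction can be sketched in Java (the mutating f and the array here are hypothetical stand-ins, mirroring the C example): when the callee mutates the element, the two forms observably diverge.

```java
// Sketch: f() mutates the very array slot the loop reads, so re-reading
// arr[0] on each iteration and hoisting it once give different results.
public class Aliasing {
    static int[] a = {1};
    static int sum = 0;

    static void f(int arg) {
        sum += arg;
        a[0]++; // side effect on the array element the loop reads
    }

    public static void main(String[] args) {
        // Fresh read each iteration: f sees 1, then 2, then 3.
        a[0] = 1;
        sum = 0;
        for (int i = 0; i < 3; i++) f(a[0]);
        int fresh = sum;    // 1 + 2 + 3 = 6

        // Hoisted read: f sees 1 every time.
        a[0] = 1;
        sum = 0;
        int arg = a[0];
        for (int i = 0; i < 3; i++) f(arg);
        int hoisted = sum;  // 1 + 1 + 1 = 3

        System.out.println(fresh + " vs " + hoisted); // prints 6 vs 3
    }
}
```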

Overuse of Method-chaining in Java

I see a lot of this kind of code written by Java developers and Java instructors:
for (int x = 0; x < myArray.length; x++)
    accum += (mean() - myArray[x]) * (mean() - myArray[x]);
I am very critical of this because mean() is being invoked twice for every element in the array, when it only has to be invoked once:
double theMean = mean();
for (int x = 0; x < myArray.length; x++)
    accum += (theMean - myArray[x]) * (theMean - myArray[x]);
Is there something about optimization in Java that makes the first example acceptable? Should I stop riding developers about this?
*** More information. An array of samples is stored as an instance variable. mean() has to traverse the array and calculate the mean every time it is invoked.
You are right. Your way (the second code sample) is more efficient. I don't think Java can optimize the first code sample to call mean() just once and reuse its return value, since mean() might have side effects; the compiler can't decide to call it once when your code calls it twice.
Leave your developers alone, it's fine -- it's readable and it works, without introducing unnecessary names and variables.
Optimization should only ever be done under the guidance of a performance monitoring tool which can show you where you're actually slow. And, typically, performance is enhanced more effectively by considering the large scale architecture of an application, not line by line bytecode optimization, which is expensive and usually unhelpful.
Your version will likely run faster, though an optimizing compiler may be able to detect if the mean() method returns the same value every time (e.g. if the value is hard-coded or stored in a field) and eliminate the method call.
If you are recommending this change for efficiency reasons, you may be falling foul of premature optimization. You don't really know where the bottlenecks are in your system until you measure in the appropriate environment under appropriate loads. Even then, improved hardware is often more cost-effective solution than developer time.
If you are recommending it because it will eliminate duplication then I think you might be on stronger ground. If the mean() method took arguments too, it would be especially reasonable to pull that out of the loop and call the method once and only once.
Yes, some compilers will optimize this to just what you say.
Yes, you should stop riding developers about this.
I think your preferred way is better, but not mostly because of the optimization. It is more clear that the value is the same in both places if it does not involve a method call, particularly in cases where the method call is more complex than the one you have here.
For that matter, I think it's better to write
double theMean = mean();
for (int x = 0; x < myArray.length; x++) {
    double curValue = myArray[x];
    double toSquare = theMean - curValue;
    accum += toSquare * toSquare;
}
Because it makes it easier to determine that you are squaring whatever is being accumulated, and just what it is that's being squared.
Normally the compiler will not optimize the method call away, since it cannot know whether the return value would be the same (this is especially true when mean processes an array, as it has no way of checking whether the result can be cached). So yes, the mean() method would be invoked twice.
In this case, if you know for sure that the array is kept the same regardless of the values of x and accum in the loop (more generally, regardless of any change in the program values), then the second code is more optimal.
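The cost difference under discussion can be made concrete with a counting sketch (the sample data and the counter are mine, not from the question): with n elements, the chained form invokes mean() 2*n times, while the hoisted form invokes it exactly once.

```java
// Sketch: an instrumented mean() that counts its own invocations, used by
// both variants of the variance loop from the question.
public class MeanCalls {
    static int meanCalls = 0;
    static double[] samples = {1.0, 2.0, 3.0, 4.0};

    static double mean() {
        meanCalls++;                       // traverses the array on every call
        double sum = 0;
        for (double s : samples) sum += s;
        return sum / samples.length;
    }

    static double varianceChained() {      // mirrors the first example
        double accum = 0;
        for (int x = 0; x < samples.length; x++)
            accum += (mean() - samples[x]) * (mean() - samples[x]);
        return accum;
    }

    static double varianceHoisted() {      // mirrors the second example
        double theMean = mean();
        double accum = 0;
        for (int x = 0; x < samples.length; x++)
            accum += (theMean - samples[x]) * (theMean - samples[x]);
        return accum;
    }

    public static void main(String[] args) {
        meanCalls = 0;
        varianceChained();
        int chained = meanCalls;           // 2 calls x 4 elements = 8
        meanCalls = 0;
        varianceHoisted();
        int hoisted = meanCalls;           // 1
        System.out.println(chained + " vs " + hoisted); // prints 8 vs 1
    }
}
```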

Java for loop performance

What is better in for loop
This:
for (int i = 0; i < someMethod(); i++) {
    // some code
}
or:
int a = someMethod();
for (int i = 0; i < a; i++) {
    // some code
}
Let's just say that someMethod() returns something large.
The first version will execute someMethod() on every iteration, decreasing speed; the second is faster, but say there are a lot of similar loops in the application, so declaring a variable will consume more memory.
So which is better, or am I just thinking about this the wrong way?
The second is better - assuming someMethod() does not have side effects.
It effectively caches the value calculated by someMethod(), so you won't have to recalculate it (assuming it is a relatively expensive operation).
If it does have side effects, the two code snippets are not equivalent, and you should do whatever is correct.
Regarding the "size of variable a": it is not an issue anyway; the returned value of someMethod() needs to be stored in some intermediate temporary variable regardless (and even if that weren't the case, the size of one integer is negligible).
P.S.
In some cases, compiler / JIT optimizer might optimize the first code into the second, assuming of course no side effects.
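A quick counting sketch makes the evaluation count visible (someMethod() here is a hypothetical stand-in that simply counts its own invocations); note the condition is evaluated n+1 times, once per iteration plus the final failing check.

```java
// Sketch: counts how many times the loop bound is evaluated in each form.
public class LoopBound {
    static int calls = 0;

    static int someMethod() {
        calls++;
        return 5;
    }

    public static void main(String[] args) {
        calls = 0;
        for (int i = 0; i < someMethod(); i++) { /* some code */ }
        System.out.println("in condition: " + calls); // prints in condition: 6

        calls = 0;
        int a = someMethod();
        for (int i = 0; i < a; i++) { /* some code */ }
        System.out.println("hoisted: " + calls);      // prints hoisted: 1
    }
}
```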
If in doubt, test. Use a profiler. Measure.
Assuming the iteration order isn't relevant, and also assuming you really want to nano-optimize your code, you may do this :
for (int i = someMethod(); i-- > 0;) {
    // some code
}
But an additional local variable (your a) isn't such a burden. In practice, this isn't much different from your second version.
If you don't need this variable after the loop, there is a simple way to hide it inside:
for (int count = someMethod(), i = 0; i < count; i++) {
    // some code
}
It really depends on how long it takes to compute someMethod()'s result. Memory usage would be the same either way, because someMethod() has to produce and store its result regardless. The second way saves your CPU from computing the same output on every iteration, and it should not take more memory. So the second one is better.
I would not consider the memory consumption of the variable a a problem, as it is a single int. So I would prefer the second alternative, as its execution efficiency is better.
The most important part of loop optimization is allowing the JVM to unroll the loop. To do that in the first variant, it has to be able to inline the call to someMethod(). Inlining has a budget, which can get busted at some point; if someMethod() is long enough, the JVM may decide not to inline it.
The second variant is more helpful to the JIT compiler and likely to perform better.
my way for putting down the loop is:
for (int i = 0, max = someMethod(); i < max; i++) { ... }
max doesn't pollute the code, you ensure no side effects from multiple calls of someMethod(), and it's compact (a one-liner)
If you need to optimize this, then this is the clean / obvious way to do it:
int a = someMethod();
for (int i = 0; i < a; i++) {
    // some code
}
The alternative version suggested by #dystroy
for (int i = someMethod(); i-- > 0;) {
    // some code
}
... has three problems.
He is iterating in the opposite direction.
That iteration is non-idiomatic, and hence less readable. Especially if you ignore the Java style guide and don't put whitespace where you are supposed to.
There is no proof that the code will actually be faster than the more idiomatic version ... especially once the JIT compiler has optimized them both. (And even if the less readable version is faster, the difference is likely to be negligible.)
On the other hand, if someMethod() is expensive (as you postulate) then "hoisting" the call so that it is only done once is likely to be worthwhile.
I was a bit confused about the same thing and ran a sanity test with a list of 10,000,000 integers in it. The difference was more than two seconds, with the latter being faster:
int a = someMethod();
for (int i = 0; i < a; i++) {
    // some code
}
My results on Java 8 (MacBook Pro, 2.2 GHz Intel Core i7) were:
using list object:
Start- 1565772380899,
End- 1565772381632
calling list in 'for' expression:
Start- 1565772381633,
End- 1565772384888

How bad is declaring arrays inside a for loop in Java?

I come from a C background, so I admit that I'm still struggling with letting go of memory management when writing in Java. Here's one issue that's come up a few times that I would love to get some elaboration on. Here are two ways to write the same routine, the only difference being where double[] array is declared:
Code Sample 1:
double[] array;
for (int i = 0; i < n; ++i) {
    array = calculateSomethingAndReturnAnArray(i);
    if (someFunctionOnArrays(array)) {
        // DO ONE THING
    } else {
        // DO SOME OTHER THING
    }
}
Code Sample 2:
for (int i = 0; i < n; ++i) {
    double[] array = calculateSomethingAndReturnAnArray(i);
    if (someFunctionOnArrays(array)) {
        // DO ONE THING
    } else {
        // DO SOME OTHER THING
    }
}
Here, private double[] calculateSomethingAndReturnAnArray(int i) always returns an array of the same length. I have a strong aversion to Code Sample 2 because it creates a new array for each iteration when it could just overwrite the existing array. However, I think this might be one of those times when I should just sit back and let Java handle the situation for me.
What are the reasons to prefer one of the ways over the other or are they truly identical in Java?
There's nothing special about arrays here, because you're not allocating the array at the declaration; you're just creating a new variable. It's equivalent to:
Object foo;
for (...) {
    foo = func(...);
}
In the case where you create the variable outside the loop, the variable (which holds the location of the thing it refers to) is only ever allocated once. In the case where you create the variable inside the loop, the variable may be reallocated on each iteration, but my guess is the compiler or the JIT will fix that in an optimization step.
I'd consider this a micro-optimization. If you're running into problems with this segment of your code, make decisions based on measurements rather than on the specs alone; if you're not running into issues with this segment of code, do the semantically correct thing and declare the variable in the scope that makes sense.
See also this similar question about best practices.
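A counting sketch supports this (the factory method below is a hypothetical stand-in for calculateSomethingAndReturnAnArray): the allocation happens inside the called method, so both declaration styles allocate exactly the same number of arrays.

```java
// Sketch: count array allocations for both declaration styles. The counter
// lives in the factory method, which is where the allocation actually occurs.
public class ArrayDecl {
    static int allocations = 0;

    static double[] calculateSomethingAndReturnAnArray(int i) {
        allocations++;            // one new array per call, either way
        return new double[4];
    }

    public static void main(String[] args) {
        int n = 3;

        // Declaration outside the loop.
        allocations = 0;
        double[] array;
        for (int i = 0; i < n; i++) {
            array = calculateSomethingAndReturnAnArray(i);
        }
        int outside = allocations;

        // Declaration inside the loop.
        allocations = 0;
        for (int i = 0; i < n; i++) {
            double[] inner = calculateSomethingAndReturnAnArray(i);
        }
        int inside = allocations;

        System.out.println(outside + " == " + inside); // prints 3 == 3
    }
}
```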
A declaration of a local variable without an initializing expression does NO work whatsoever. The work happens when the variable is initialized.
Thus, the following are identical with respect to semantics and performance:
double[] array;
for (int i = 0; i < n; ++i) {
    array = calculateSomethingAndReturnAnArray(i);
    // ...
}
and
for (int i = 0; i < n; ++i) {
    double[] array = calculateSomethingAndReturnAnArray(i);
    // ...
}
(You can't even quibble that the first case allows the array to be used after the loop ends. For that to be legal, array has to have a definite value after the loop, and it doesn't unless you add an initializer to the declaration; e.g. double[] array = null;)
To elaborate on #Mark Elliot 's point about micro-optimization:
This is really an attempt to optimize rather than a real optimization, because (as I noted) it should have no effect.
Even if the Java compiler actually emitted some non-trivial executable code for double[] array;, the chances are that the time to execute would be insignificant compared with the total execution time of the loop body, and of the application as a whole. Hence, this is most likely to be a pointless optimization.
Even if this is a worthwhile optimization, you have to consider that you have optimized for a specific target platform; i.e. a particular combination of hardware and JVM version. Micro-optimizations like this may not be optimal on other platforms, and could in theory be anti-optimizations.
In summary, you are most likely wasting your time if you focus on things like this when writing Java code. If performance is a concern for your application, focus on the MACRO level performance; e.g. things like algorithmic complexity, good database / query design, patterns of network interactions, and so on.
Both create a new array for each iteration. They have the same semantics.

Which loop has better performance? Why?

String s = "";
for (i = 0; i < ....) {
    s = some Assignment;
}
or
for (i = 0; i < ..) {
    String s = some Assignment;
}
I don't need to use 's' outside the loop ever again.
The first option is perhaps better since a new String is not initialized each time. The second however would result in the scope of the variable being limited to the loop itself.
EDIT: In response to Milhous's answer: it'd be pointless to assign the String to a constant within a loop, wouldn't it? No, here 'some Assignment' means a changing value obtained from the list being iterated through.
Also, the question isn't because I'm worried about memory management. Just want to know which is better.
Limited Scope is Best
Use your second option:
for ( ... ) {
    String s = ...;
}
Scope Doesn't Affect Performance
If you disassemble the compiled code from each (with the JDK's javap tool), you will see that the loop compiles to exactly the same JVM instructions in both cases. Note also that Brian R. Bondy's "Option #3" is identical to Option #1. Nothing extra is added to or removed from the stack when using the tighter scope, and the same data are used on the stack in both cases.
Avoid Premature Initialization
The only difference between the two cases is that, in the first example, the variable s is unnecessarily initialized. This is a separate issue from the location of the variable declaration. This adds two wasted instructions (to load a string constant and store it in a stack frame slot). A good static analysis tool will warn you that you are never reading the value you assign to s, and a good JIT compiler will probably elide it at runtime.
You could fix this simply by using an empty declaration (i.e., String s;), but this is considered bad practice and has another side-effect discussed below.
Often a bogus value like null is assigned to a variable simply to hush a compiler error that a variable is read without being initialized. This error can be taken as a hint that the variable scope is too large, and that it is being declared before it is needed to receive a valid value. Empty declarations force you to consider every code path; don't ignore this valuable warning by assigning a bogus value.
Conserve Stack Slots
As mentioned, while the JVM instructions are the same in both cases, there is a subtle side-effect that makes it best, at a JVM level, to use the most limited scope possible. This is visible in the "local variable table" for the method. Consider what happens if you have multiple loops, with the variables declared in unnecessarily large scope:
void x(String[] strings, Integer[] integers) {
    String s;
    for (int i = 0; i < strings.length; ++i) {
        s = strings[i];
        ...
    }
    Integer n;
    for (int i = 0; i < integers.length; ++i) {
        n = integers[i];
        ...
    }
}
The variables s and n could be declared inside their respective loops, but since they are not, the compiler uses two "slots" in the stack frame. If they were declared inside the loops, the compiler could reuse the same slot, making the stack frame smaller.
What Really Matters
However, most of these issues are immaterial. A good JIT compiler will see that it is not possible to read the initial value you are wastefully assigning, and optimize the assignment away. Saving a slot here or there isn't going to make or break your application.
The important thing is to make your code readable and easy to maintain, and in that respect, using a limited scope is clearly better. The smaller scope a variable has, the easier it is to comprehend how it is used and what impact any changes to the code will have.
In theory, it's a waste of resources to declare the string inside the loop.
In practice, however, both of the snippets you presented will compile down to the same code (declaration outside the loop).
So, if your compiler does any amount of optimization, there's no difference.
In general I would choose the second one, because the scope of the 's' variable is limited to the loop. Benefits:
This is better for the programmer because you don't have to worry about 's' being used again somewhere later in the function
This is better for the compiler because the scope of the variable is smaller, and so it can potentially do more analysis and optimisation
This is better for future readers because they won't wonder why the 's' variable is declared outside the loop if it's never used later
If you want to speed up for loops, I prefer declaring a max variable next to the counter so that no repeated lookups for the condition are needed:
Instead of
for (int i = 0; i < array.length; i++) {
    Object next = array[i];
}
I prefer
for (int i = 0, max = array.length; i < max; i++) {
    Object next = array[i];
}
Any other things that should be considered have already been mentioned, so just my two cents (see ericksons post)
Greetz, GHad
To add on a bit to #Esteban Araya's answer, they will both require the creation of a new string each time through the loop (as the return value of the some Assignment expression). Those strings need to be garbage collected either way.
I know this is an old question, but I thought I'd add a bit that is slightly related.
I've noticed while browsing the Java source code that some methods, like String.contentEquals (duplicated below), create redundant local variables that are merely copies of class variables. I believe there was a comment somewhere implying that accessing local variables is faster than accessing class variables.
In this case "v1" and "v2" are seemingly unnecessary and could be eliminated to simplify the code, but were added to improve performance.
public boolean contentEquals(StringBuffer sb) {
    synchronized (sb) {
        if (count != sb.length())
            return false;
        char v1[] = value;
        char v2[] = sb.getValue();
        int i = offset;
        int j = 0;
        int n = count;
        while (n-- != 0) {
            if (v1[i++] != v2[j++])
                return false;
        }
    }
    return true;
}
It seems to me that we need more specification of the problem.
The
s = some Assignment;
is not specified as to what kind of assignment this is. If the assignment is
s = "" + i + "";
then a new string needs to be allocated.
but if it is
s = some Constant;
s will merely point to the constant's memory location, and thus the first version would be more memory efficient.
Seems a little silly to worry too much about optimizing a for loop for an interpreted lang, IMHO.
When I'm using multiple threads (50+), I've found this to be a very effective way of handling ghost-thread issues caused by not being able to close a process correctly... if I'm wrong, please let me know why:
Process one = null;
BufferedInputStream two = null;
try {
    one = Runtime.getRuntime().exec(command);
    two = new BufferedInputStream(one.getInputStream());
} catch (Exception e) {
    e.printStackTrace();
} finally {
    // null to ensure they are erased
    one = null;
    two = null;
    // nudge the gc
    System.gc();
}
