Will the Java Compiler optimize simple repeated math operations like:
if (prevX / width != curX / width) {
// Do something with prevX / width value
} else {
// Do something with curX / width value
}
I know I can just assign the results to variables before the if statement, and return the variables, but it's kind of cumbersome. If the compiler automatically recognizes that the same calculation is being made and caches the result in a temporary variable on its own, I'd rather stick to the above convention.
*Edit - I'm an idiot. I tried to simplify/abstract my question too much. It's not as simple as: if (x > y)
The answer is yes. This is called Common Subexpression Elimination and is a standard (and powerful) compiler optimization used in Java, C/C++ and others...
This page confirms that the HotSpot JVM will do this optimization.
That said, whether or not the compiler/run-time will be able to do this optimization when you expect it to is another story. So I usually prefer to do these optimizations myself if it also enhances readability.
double xw = x / width;
double yw = y / width;
if (xw > yw) {
    return xw;
} else {
    return yw;
}
The compiler may perform such optimizations. Whether it actually does depends on the answers to the following:
Is the compiler allowed to do this by the JLS?
In some cases it is not. For instance, if prevX were a volatile instance variable, then it would have to be fetched from memory each time the source code says it is used. Another case is where the common subexpression involves a method call with an observable side effect; i.e. where something else in the program might be able to tell whether the method is called once or twice.
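To make the volatile case concrete, here is a minimal sketch (the class name and field values are assumptions, not from the question):

class Tracker {
    volatile int prevX;   // volatile: every read must come from main memory
    int width = 10;

    int bucketTwice() {
        // Each read of prevX may observe a different value written by another
        // thread, so the two divisions below are NOT a common subexpression
        // and the JIT may not legally merge them into one.
        return (prevX / width) + (prevX / width);
    }
}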
Is the compiler capable of doing this?
A compiler needs to analyze the code to detect common subexpressions that could legally be optimized. There are two issues here:
Is the compiler capable of performing the necessary reasoning? For instance, one could hypothesize a compiler that can determine that a specific method call will be side-effect free and can therefore be optimized. However, building a compiler that is actually capable of doing this is ... an interesting problem.
Is the optimization worthwhile? There is a trade-off between the cost of performing an optimization and the benefits. It is not a straightforward trade-off. It needs to take into account the cost of looking to see whether an optimization can be performed ... when it actually can't. In other words, the impact on compilation time. (Bear in mind that in Java the optimizations are mostly done at runtime by the JIT compiler ... so this impacts application performance.)
In a simple example like yours, the optimization is legal (modulo volatile) and one should expect a half-decent JIT compiler to perform it.
The other question is whether you should try to help the compiler by evaluating the common expressions explicitly in your code and assigning the results to temporaries.
IMO, the answer is generally no.
A good compiler will probably do as good a job as you at this. And if it doesn't, then the next generation may.
The code probably doesn't warrant hand optimization. Unless you've profiled your code to determine where the bottlenecks are, your hand optimizations stand a good chance of being irrelevant to actual application performance ... and a waste of your time.
There is a chance that you will stuff it up; e.g. by forgetting that a method call has an important side-effect or that a variable is volatile for a good reason.
On the other hand, if the rewrite makes your code more readable, that's a good reason to do it.
In general, "yes" - the compiler will optimize the code if it can, and the HotSpot JVM can also improve repeatedly-executed code blocks.
In this case, however, you would do better to refactor the code like this:
if (x > y)
    return x / width;
return y / width;
which avoids one division operation if x > y.
Related
Is there any difference between doing
if (numberOfEntries >= array.length) {do stuff}; // Check if array is full directly
over doing something like
private boolean isArrayFull() {
    return numberOfEntries >= array.length;
}
if (isArrayFull()) {do stuff}; // Call a check function
Over large arrays, many iterations, and any other execution environment, is there any difference between these methods other than readability and code duplication, if I need to check whether the array is full anywhere else?
Forget about performance. That is negligible.
But if you are doing it many times, a utility method isArrayFull() makes sense, because if you later add more conditions to the check, changing the function is reflected everywhere.
As said above, first make your design good and then track down performance issues using proper tools. Java's JIT performs inlining optimisations, so there is no difference.
The JIT aggressively inlines methods, removing the overhead of method calls
from https://techblug.wordpress.com/2013/08/19/java-jit-compiler-inlining/
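If you want to watch the inlining happen, here is a minimal, self-contained sketch (class and field names are assumed). Running it with the HotSpot diagnostic flags -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining should show isArrayFull() being inlined once the loop gets hot:

public class FullCheck {
    private final int[] array = new int[16];
    private int numberOfEntries;

    private boolean isArrayFull() {
        return numberOfEntries >= array.length;
    }

    public static void main(String[] args) {
        FullCheck c = new FullCheck();
        long hits = 0;
        for (int i = 0; i < 10_000_000; i++) {
            c.numberOfEntries = i % 32;   // vary the state
            if (c.isArrayFull()) hits++;  // hot call site, candidate for inlining
        }
        System.out.println(hits);         // use the result so it isn't dead code
    }
}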
Note: The explanation below is not specific to any language. It is generic.
The difference shows up when you analyze the options at machine level. A function call actually involves JMP operations and a lot of PUSH/POP operations on the CPU, whereas an IF is usually a single CMP (compare) operation, which is much cheaper than what happens during a function call.
If your IFs usually evaluate the same way, I wouldn't worry about it, as the CPU handles IFs very well by predicting the result, as long as they are "predictable" (usually true, usually false, or following some pattern).
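A quick way to see branch prediction at work (a hedged sketch; the threshold and sizes are arbitrary): summing the large values of an array is dramatically faster when the array is sorted, because the branch outcome becomes predictable.

int[] data = new int[1 << 20];
java.util.Random rnd = new java.util.Random(42);
for (int i = 0; i < data.length; i++) data[i] = rnd.nextInt(256);
java.util.Arrays.sort(data);          // comment this out and time the difference
long sum = 0;
for (int pass = 0; pass < 100; pass++)
    for (int v : data)
        if (v >= 128) sum += v;       // predictable once data is sorted
System.out.println(sum);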
I would go with the IFs in cases where even negligible improvement in performance is a big deal.
In cases like web applications reducing the code redundancy to make the code manageable and readable is way more important than the optimization to save a few instructions at machine level.
I am wondering which of the following is the most efficient?
int x = 1, y = 2;
System.out.print(x+y);
or...
int x = 1, y = 2, z = 3;
System.out.print(z);
I'm guessing it's the first, but not sure - thanks.
The real answer is: talking about efficiency on such a level does not make any sense at all.
Keep in mind that the overall performance and efficiency of a Java program is determined by many many factors - for example when/how the JIT kicks in in order to turn byte code into machine code.
Worrying about such subtleties will not help you to create a meaningful, maintainable, "good OO" design. Heck, in your case, depending on context, it could even be that the compiler does constant folding and turns your whole thing into println(3) (as it is really straightforward to throw away those variables); so maybe in both cases the compiler creates the exact same bytecode.
Don't get me wrong: it is fair to ask/learn/understand what compilers, JVMs and JITs do. But don't assume that you can categorize things that easily into "A is more efficient than B".
If you truly mean the case where you have supplied all the literal values like that, then the difference doesn't exist at all, at least not after your code is JIT-compiled. In either case you will have zero calculation done at runtime. The JIT compiler will work out the result and hardcode it into all its use sites. The optimization techniques involved are Constant Propagation and Constant Folding.
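For illustration, a small sketch (note the final modifiers I've added, which make x and y constant variables per JLS §4.12.4, so even javac folds the expression; without final, the JIT does the equivalent at runtime):

final int x = 1, y = 2;   // constant variables: x + y is a constant expression
System.out.print(x + y);  // compiled as System.out.print(3) - no addition at runtime

Running javap -c on the compiled class shows the literal 3 being pushed, with no iadd instruction at all.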
It would be the second option, as you do not need any memory for the calculation. You're just printing a number instead of adding two together and then printing.
This is a simple example, so the performance difference is not noticeable at this level.
Good practice is to split tasks appropriately into different functions.
I see a lot of this kind of code written by Java developers and Java instructors:
for (int x = 0; x < myArray.length; x++)
    accum += (mean() - myArray[x]) * (mean() - myArray[x]);
I am very critical of this because mean() is being invoked twice for every element in the array, when it only has to be invoked once:
double theMean = mean();
for (int x = 0; x < myArray.length; x++)
    accum += (theMean - myArray[x]) * (theMean - myArray[x]);
Is there something about optimization in Java that makes the first example acceptable? Should I stop riding developers about this?
*** More information. An array of samples is stored as an instance variable. mean() has to traverse the array and calculate the mean every time it is invoked.
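For concreteness, mean() looks roughly like this (a sketch of the shape described above; the field name myArray is taken from the loop): each call walks the whole sample array, so calling it inside the loop makes the accumulation O(n²) instead of O(n).

double mean() {
    double sum = 0;
    for (double sample : myArray) sum += sample;   // full traversal on every call
    return sum / myArray.length;
}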
You are right. Your way (second code sample) is more efficient. I don't think Java can optimize the first code sample to call mean() just once and re-use its return value, since mean() might have side effects, so the compiler can't decide to call it once if your code calls it twice.
Leave your developers alone, it's fine -- it's readable and it works, without introducing unnecessary names and variables.
Optimization should only ever be done under the guidance of a performance monitoring tool which can show you where you're actually slow. And, typically, performance is enhanced more effectively by considering the large scale architecture of an application, not line by line bytecode optimization, which is expensive and usually unhelpful.
Your version will likely run faster, though an optimizing compiler may be able to detect if the mean() method returns the same value every time (e.g. if the value is hard-coded or stored in a field) and eliminate the method call.
If you are recommending this change for efficiency reasons, you may be falling foul of premature optimization. You don't really know where the bottlenecks are in your system until you measure in the appropriate environment under appropriate loads. Even then, improved hardware is often a more cost-effective solution than developer time.
If you are recommending it because it will eliminate duplication then I think you might be on stronger ground. If the mean() method took arguments too, it would be especially reasonable to pull that out of the loop and call the method once and only once.
Yes, some compilers will optimize this to just what you say.
Yes, you should stop riding developers about this.
I think your preferred way is better, but not mostly because of the optimization. It is more clear that the value is the same in both places if it does not involve a method call, particularly in cases where the method call is more complex than the one you have here.
For that matter, I think it's better to write
double theMean = mean();
for (int x = 0; x < myArray.length; x++) {
    double curValue = myArray[x];
    double toSquare = theMean - curValue;
    accum += toSquare * toSquare;
}
because it makes it easier to determine that you are squaring whatever is being accumulated, and just what it is that's being squared.
Normally the compiler will not optimize the method call away, since it cannot know whether the return value would be the same (this is especially true when mean() processes an array, as the compiler has no way of checking whether the result can be cached). So yes, the mean() method would be invoked twice.
In this case, if you know for sure that the array stays the same regardless of the values of x and accum in the loop (more generally, regardless of any change in program state), then the second version is more efficient.
I have the following code
float square(float val) { return val * val; }

boolean isInCircle(final float x, final float y) {
    float squareDistance = square(cx - x) + square(cy - y);
    return squareDistance < square(RADIUS);
}
where RADIUS is a static final float.
Will the java compiler optimize the call square(RADIUS) ?
What happens when this is converted to Dalvik code for Android? Will it remain optimized?
The Java compiler won't do anything with that code.
The HotSpot JVM will almost certainly precompute square(RADIUS).
Android doesn't have that particular JVM.
I personally wouldn't write the square() method at all, just return (cx-x)*(cx-x)+(cy-y)*(cy-y) < RADIUS*RADIUS; and let the compiler/JVM battle it out from there. But then I'm a mathematician ;-)
The Dalvik JIT compiler does in fact inline short functions, such as square() as defined in the question (though it would probably be better to declare it static). However, I couldn't tell you off-hand whether it would definitely get inlined.
Profile the code if it matters!
Optimizations in Java are done (as far as I know) by the HotSpot compiler at runtime (bytecode is optimized when translated to machine code). So the answer is yes and no.
The translated code will be equally optimizable, but what happens to it depends on the JVM. In my experience it is highly dependent on the JVM and probably on its settings (the aggressiveness of the optimizer). I have tried comparing runs of SHA1 with loops and without on a Windows JVM and a Linux one. In one case the code without loops was many times faster; in the second (I think on Linux) the difference was only about 40% of the time taken...
So it is a bit of magic: you can give HotSpot good hints to optimize, or configure the JVM, but in the end it will still depend on the JVM's current algorithm...
The only optimization that will happen is that the compiler will most likely "intern" the value of the static final field as a constant where it is accessed, rather than performing a field lookup at runtime.
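Concretely (a hedged sketch; the value 5.0f is made up): because RADIUS is a static final field initialized with a constant expression, javac substitutes its value at each use site instead of emitting a getstatic lookup, and a direct RADIUS * RADIUS is folded at compile time.

static final float RADIUS = 5.0f;    // compile-time constant, "interned" at use sites

float squaredRadius() {
    return RADIUS * RADIUS;          // folded by javac to 25.0f
}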
I am interested whether should I manually inline small methods which are called 100k - 1 million times in some performance-sensitive algorithm.
At first, I thought that, by not inlining, I was incurring some overhead, since the JVM has to determine whether or not to inline the method (and might even fail to do so).
However, the other day I replaced this manually inlined code with invocations of a static method and saw a performance boost. How is that possible? Does this suggest that there is actually no overhead, and that letting the JVM inline at "its will" actually boosts performance? Or does this hugely depend on the platform/architecture?
(The example in which a performance boost occurred was replacing array swapping (int t = a[i]; a[i] = a[j]; a[j] = t;) with a static method call swap(int[] a, int i, int j). Another example, in which there was no performance difference, was a 10-line method that I inlined at a call site executed 1000000 times.)
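For reference, the static helper described above would look like this (reconstructed directly from the inlined code in the question):

static void swap(int[] a, int i, int j) {
    int t = a[i];
    a[i] = a[j];
    a[j] = t;
}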
I have seen something similar. "Manual inlining" isn't necessarily faster; the resulting program can be too complex for the optimizer to analyze.
In your example, let's make some wild guesses. When you use the swap() method, the JVM may be able to analyze the method body and conclude that, since i and j don't change, only 2 range checks are needed instead of 4, even though there are 4 array accesses. Also, the local variable t isn't necessary; the JVM can use 2 registers to do the job without reading and writing t on the stack.
Later, the body of swap() is inlined into the caller method. That happens after the previous optimization, so the savings are still in place. It's even possible that the caller method body has proved that i and j are always within range, so the 2 remaining range checks are also dropped.
Now in the manually inlined version, the optimizer has to analyze the whole program at once; there are too many variables and too many actions, and it may fail to prove that it's safe to drop the range checks or to eliminate the local variable t. In the worst case this version may cost 6 more memory accesses to do the swap, which is a huge overhead. Even if there is only 1 extra memory read, it is still very noticeable.
Of course, we have no basis to believe that it's always better to do manual "outlining", i.e. to extract small methods in the wishful hope that it will help the optimizer.
--
What I've learned is: forget manual micro-optimizations. It's not that I don't care about micro performance improvements, and it's not that I always trust the JVM's optimization. It's that I have absolutely no idea what would do more good than bad. So I gave up.
The JVM can inline small methods very efficiently. The only benefit of inlining yourself is if you can remove code as a result, i.e. simplify what it does by inlining it.
The JVM looks for certain structures and has some "hand coded" optimisations when it recognises those structures. By using a swap method, the JVM may recognise the structure and optimise it differently with a specific optimisation.
You might be interested to try a debug build of OpenJDK 7, which has an option to print out the native code it generates (the relevant flag is -XX:+PrintAssembly, together with -XX:+UnlockDiagnosticVMOptions, and it requires the hsdis disassembler plugin).
Sorry for my late reply, but I just found this topic and it got my attention.
When developing in Java, try to write "simple and stupid" code. Reasons:
the optimization is done at runtime (since the compilation to native code itself happens at runtime). The compiler will figure out what optimizations to make anyway, since it compiles not the source code you write, but the internal representation it uses (several AST -> VM code -> VM code ... -> native binary code transformations are made at runtime by the JVM compiler and the JVM interpreter)
when optimizing, the compiler relies on common programming patterns to decide what to optimize; so help it help you! Write a private static (maybe also final) method and it will figure out immediately that it can:
inline the method
compile it to native code
If the method is manually inlined, it's just part of another method, which the compiler first tries to understand, to see whether it's time to transform it into binary code or whether it must wait a bit longer to understand the program flow. Also, depending on what the method does, several re-JITs are possible during runtime => the JVM produces optimal binary code only after a "warm-up"... and maybe your program ends before the JVM has warmed up (though in the end I expect the performance to be fairly similar).
Conclusion: it makes sense to hand-optimize code in C/C++ (since the translation into binary happens statically), but the same optimizations usually don't make a difference in Java, where the JIT compiles byte code, not your source code. And by the way, from what I've seen, javac doesn't even bother to make optimizations :)
However, the other day, I replaced this manually inlined code with invocation of static methods and seen a performance boost. How is that possible?
Probably the JVM profiler sees the bottleneck more easily if it is in one place (a static method) than if it is implemented several times separately.
The HotSpot JIT compiler is capable of inlining a lot of things, especially in -server mode, although I don't know how you got an actual performance boost. (My guess would be that inlining decisions are driven by method invocation counts, and the method swapping the two values isn't called often enough.)
By the way, if performance really matters, you could try this for swapping two int values. (I'm not saying it will be faster, but it may be worth a punt.)
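// Caveat (added note): this XOR trick zeroes the element when i == j, so guard that case first.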
a[i] = a[i] ^ a[j];
a[j] = a[i] ^ a[j];
a[i] = a[i] ^ a[j];