Java compiler optimization for repeated method calls? - java

Does the java compiler (the default javac that comes in JDK1.6.0_21) optimize code to prevent the same method from being called with the same arguments over and over? If I wrote this code:
public class FooBar {
    public static void main(String[] args) {
        foo(bar);
        foo(bar);
        foo(bar);
    }
}
Would the method foo(bar) only run once? If so, is there any way to prevent this optimization? (I'm trying to compare runtimes for two algorithms, one iterative and one recursive, and I want to call them a bunch of times to get a representative sample.)
Any insight would be much appreciated; I took this problem to the point of insanity (I thought my computer was insanely fast for a little while, so I kept adding method calls until I got the "code too large" error at 43,671 lines).

The optimization you are observing probably has nothing to do with repeated calls ... because that would be an invalid optimization. More likely, the optimizer has figured out that the method calls have no observable effect on the computation.
The cure is to change the method so that it does affect the result of the computation ...
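For example, a minimal sketch of that cure, with a hypothetical foo that returns a value derived from its argument:
public class FooBar {
    // Hypothetical stand-in: returns a value instead of doing nothing.
    static int foo(int bar) {
        return bar * 31 + 7;
    }

    public static void main(String[] args) {
        int result = foo(1) + foo(2) + foo(3);
        // The result is observable output, so the calls cannot
        // legally be optimized away.
        System.out.println(result);
    }
}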

It doesn't; that would cause a big problem if foo is non-pure (i.e. it changes the global state of the program). For example:
public class FooBar {
    private static int i = 0;  // must be static to be reachable from foo()

    private static int foo() {
        return ++i;
    }

    public static void main(String[] args) {
        foo();
        foo();
        foo();
        System.out.println(i); // prints 3, so all three calls must execute
    }
}

You haven't provided enough information to allow for any definitive answers, but the JVM runtime optimizer is extremely powerful and does all sorts of inlining, runtime dataflow and escape analysis, and all manner of cache tricks.
The end result is to make the sort of micro-benchmarks you are trying to perform all but useless in practice, and extremely difficult to get right even when they are potentially useful.
Definitely read http://www.ibm.com/developerworks/java/library/j-benchmark1.html for a fuller discussion of the problems you face. At the very least you need to ensure that:
foo is called in a loop that runs thousands of times
foo() returns a result, and
that result is used
The following is the minimum starting point, assuming foo() is non-trivial and therefore unlikely to be inlined. Note: you still have to expect loop unrolling and other cache-level optimizations. Also watch out for the HotSpot compile threshold (the default is 10,000 calls with -server), which can completely stuff up your measurements if you try to re-run them in the same JVM.
public class FooBar {
    // Placeholder body; stands in for the non-trivial method under test.
    private static int foo(int i) {
        return i * i;
    }

    public static void main(String[] args) {
        final int ITERATIONS = 10000;
        int sum = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            sum += foo(i);
        }
        System.out.printf("%d iterations returned %d sum%n", ITERATIONS, sum);
    }
}
Seriously, you need to do some reading before you can make any meaningful progress towards writing benchmarks on a modern JVM. The same optimizations that allow modern Java code to match or even sometimes beat C++ make benchmarking really difficult.
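These days the usual answer is to let a harness handle warm-up, forking and dead-code elimination for you. A minimal JMH sketch (foo here is a hypothetical method under test):
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
public class FooBenchmark {
    int input = 42;

    // Hypothetical stand-in for the method under test.
    int foo(int x) {
        return x * x + 1;
    }

    @Benchmark
    public void measureFoo(Blackhole bh) {
        // Consuming the result stops the JIT from eliminating the
        // call as dead code; JMH also manages warm-up iterations.
        bh.consume(foo(input));
    }
}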

The Java compiler is not allowed to perform such optimizations because method calls are very likely to cause side effects, for example I/O actions, changes to any fields they can reach, or calls to other methods that do so.
In functional languages, where each function call is guaranteed to return the same result when called with the same arguments (changes to state are forbidden), a compiler might indeed optimize away multiple calls by memoizing the result.
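Java won't do this for you automatically, but you can opt in by hand when you know a method is pure. A sketch, using a hypothetical pure square function:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Memo {
    private static final Map<Integer, Integer> CACHE = new ConcurrentHashMap<>();

    // Safe only because the computation is pure:
    // same argument, same result, no side effects.
    static int memoizedSquare(int n) {
        return CACHE.computeIfAbsent(n, k -> k * k);
    }
}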
If you feel your algorithms are too fast, try to give them some large or complicated problem sets. There are only a few algorithms which are always quite fast.

Related

java: If I assign a variable the same value as just before, does it change the memory or does JIT recognize this?

For example:
class Main {
    public boolean hasBeenUpdated = false;

    public void updateMain() {
        this.hasBeenUpdated = true;
        /*
        alternative:
        if (!hasBeenUpdated) {
            this.hasBeenUpdated = true;
        }
        */
    }

    public void persistUpdate() {
        this.hasBeenUpdated = false;
    }
}
Main instance = new Main();
instance.updateMain();
instance.updateMain();
instance.updateMain();
Does instance.hasBeenUpdated get updated 3 times in memory?
The reason I ask is that I hoped to use a boolean ("hasBeenUpdated") as a flag, and it could theoretically be "changed" many, many times before I call instance.persistUpdate().
Does the JVM's JIT see this and perform an optimization?
JIT will collapse redundant statements only when it can PROVE that removing the code will not change the behavior. For example, if you did this:
int i;
i = 1;
i = 1;
i = 1;
The first two assignments are provably redundant, and the JIT could eliminate them. If instead it's
int i;
i = someMethodReturningInt();
i = someMethodReturningInt();
i = someMethodReturningInt();
the JIT has no way of knowing what someMethodReturningInt() does, including whether it has any side effects, so it must invoke the method 3 times. Whether or not it actually stores any but the final value is immaterial, as the code would behave the same either way. (Making i a volatile field instead would force each value to be stored; volatile cannot be applied to local variables.)
Of course, if you're doing other things in between the method invocations, then it will be forced to perform the assignments.
The whole topic is part of the more general "happens-before" and "happens-after" concepts documented in the language and JVM specifications.
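As a small illustration of the volatile point above (remembering that volatile applies to fields, not locals), a minimal sketch:
public class Flags {
    // Every write to a volatile field must actually be performed, and a
    // write happens-before any read that observes it, so the JIT cannot
    // coalesce repeated stores.
    private volatile boolean ready = false;

    void writer() {
        ready = true;
    }

    void reader() {
        while (!ready) {
            // spin until the writer's store becomes visible
        }
    }
}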
Optimization is NEVER supposed to change the behavior of a program, except possibly to reduce its runtime. There have been instances where bugs in the optimizer inadvertently did introduce errors, but these have been few and far between. In general you don't need to worry about whether optimization will break your code.
It can perform an optimization, yes.
As a matter of fact, it can issue a single write, or even collapse all three calls to updateMain into one.
But for that to happen, the JIT has to prove that nothing else breaks, or more specifically that the code does not violate the JMM rules. In this specific case, as far as I understand it, it does not.
Given that the choice is between JVM code that implements
move new value to variable
and
compare new value with current value of variable
if not the same
move new value to variable
the JVM would have to be fairly nutty to implement it the latter way. That's a pessimization, not an optimization.
The JVM to a large extent relies on the real machine to do simple operations, and real machines store values in memory when you tell them to store values in memory.

Clean code vs performance

Some principles of clean code are:
functions should do one thing at one abstraction level
functions should be at most 20 lines long
functions should never have more than 2 input parameters
How many cpu cycles are "lost" by adding an extra function call in Java?
Are there compiler options available that transform many small functions into one big function in order to optimize performance?
E.g.
void foo() {
    bar1();
    bar2();
}

void bar1() {
    a();
    b();
}

void bar2() {
    c();
    d();
}
Would become
void foo() {
    a();
    b();
    c();
    d();
}
How many cpu cycles are "lost" by adding an extra function call in Java?
This depends on whether it is inlined or not. If it is inlined, the cost will be nothing (or a notional amount).
If it is not compiled at runtime, it hardly matters, because the cost of interpreting outweighs any such micro-optimisation, and the code is likely not called enough to matter (which is why it wasn't optimised).
The only time it really matters is when the code is called often but for some reason is prevented from being optimised. I would only assume this is the case because a profiler is telling you it is a performance issue, and in that case manual inlining might be the answer.
I design, develop and optimise latency-sensitive code in Java, and I choose to manually inline methods much less than 1% of the time, and only after a profiler, e.g. Flight Recorder, suggests there is a significant performance problem.
In the rare event it matters, how much difference does it make?
I would estimate between 0.03 and 0.1 microseconds in real applications for each extra call; in a micro-benchmark it would be far less.
Are there compiler options available that transform many small functions into one big function in order to optimize performance?
Yes. In fact, what can happen is that not only are all these methods inlined, but the methods which call them are inlined as well, so none of them exist as separate calls at runtime, but only if the code is called enough to be optimised; i.e. not only are a, b, c and d inlined, but foo is inlined into its caller as well.
By default the Oracle JVM can inline to a depth of 9 levels (until the code gets to more than 325 bytes of bytecode).
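If you want to see those decisions for yourself, HotSpot can print them; the tunables behind the numbers above are -XX:MaxInlineLevel and -XX:FreqInlineSize. For example (MyApp is a placeholder for your main class):
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -XX:+PrintCompilation MyApp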
Will clean code help performance
The JVM runtime optimiser has common patterns it optimises for. Clean, simple code is generally easier to optimise, and when you try something tricky or non-obvious, you can end up being much slower. If it's harder for a human to understand, there is a good chance it's hard for the optimiser to understand and optimise.
Runtime behavior and cleanliness of code (a compile-time or lifetime property of code) belong to different requirement categories. There might be cases where optimizing for one category is detrimental to the other.
The question is: which category really needs your attention?
In my view, cleanliness of code (or malleability of software) suffers from a huge lack of attention. You should focus on that first, and only if other requirements start to fall behind (e.g. performance) should you inquire whether that's due to how clean the code is. That means you need to really compare; you need to measure the difference it makes. With regard to performance, use a profiler of your choice: run the "dirty" variant and the clean variant and check the difference. Is the difference marked? Only if the "dirty" variant is significantly faster should you lower the cleanliness.
Consider the following piece of code, which compares code that does 3 things in one for loop with another version that has 3 separate for loops, one per task.
@Test
public void singleLoopVsMultiple() {
    for (int j = 0; j < 5; j++) {
        // single loop
        int x = 0, y = 0, z = 0;
        long l = System.currentTimeMillis();
        for (int i = 0; i < 100000000; i++) {
            x++;
            y++;
            z++;
        }
        l = System.currentTimeMillis() - l;

        // multiple loops doing the same thing
        int a = 0, b = 0, c = 0;
        long m = System.currentTimeMillis();
        for (int i = 0; i < 100000000; i++) {
            a++;
        }
        for (int i = 0; i < 100000000; i++) {
            b++;
        }
        for (int i = 0; i < 100000000; i++) {
            c++;
        }
        m = System.currentTimeMillis() - m;

        System.out.println(String.format("%d,%d", l, m));
    }
}
When I run it, here is the output I get for time in milliseconds.
6,5
8,0
0,0
0,0
0,0
After a few runs, the JVM identifies hotspots of intensive code and optimises those parts to make them significantly faster. In our example, after 2 runs the JVM had already optimised the code so much that the discussion around for-loops became redundant.
Unless we know what's happening inside, we cannot predict the performance implications of changes like the introduction of for-loops. The only way to actually improve the performance of a system is by measuring it and focusing only on fixing the actual bottlenecks.
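One caveat on the measurement itself: System.currentTimeMillis() has millisecond granularity at best, which is part of why whole runs show up as 0 above; System.nanoTime() is the API intended for elapsed-time measurement. A sketch of the adjusted timing pattern:
int x = 0;
long start = System.nanoTime();  // monotonic clock, nanosecond granularity
for (int i = 0; i < 100000000; i++) {
    x++;
}
long elapsedMicros = (System.nanoTime() - start) / 1_000;
System.out.println(x + " in " + elapsedMicros + " us");  // printing x keeps the loop live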
There is a chance that cleaning your code may make it faster for the JVM. But even if that is not the case, every performance optimisation comes with added code complexity. Ask yourself whether the added complexity is worth the future maintenance effort. After all, the most expensive resource on any team is the developer, not the servers, and any additional complexity slows the developer down, adding to the project cost.
The way to deal with this is to figure out your benchmarks: what kind of application you're making and what its bottlenecks are. If you're making a web app, perhaps the DB is taking most of the time, and reducing the number of functions will not make a difference. On the other hand, if it's an app running on a system where performance is everything, every small thing counts.

Does the Java HotSpot compiler remove dead code involving an instance variable's known final state

In the following code it is clear that baa is always false. Will the HotSpot compiler spot this and remove the isBaa() method call and the contained code?
public class Foo {
    public final boolean baa = false;

    public boolean isBaa() {
        return baa;
    }
}
Usage like this
static final Foo foo = new Foo();

public void m() {
    if (foo.isBaa()) {
        // code here...
    }
}
I'd like to know how this code compares to adding
static final Foo foo = new Foo();
static final boolean BAA = foo.isBaa();
and checking with
if (BAA) ...
Interested in runtime speed after HotSpot has done its thing. Is there any way to actually see what the result of HotSpot compilation is? Or do we have to infer it from the implementation details of the HotSpot compiler being used?
The use case is backing isDebugEnabled() with a final variable in very performance-sensitive code, so I'm interested in whether the method call itself is optimized out.
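Concretely, the pattern in question looks something like the sketch below, where Logger.isDebugEnabled() stands in for whatever logging API is actually in use:
public class Service {
    // Hypothetical: captured once during class initialization; from then
    // on HotSpot can treat DEBUG as a constant and potentially drop the
    // guarded block entirely.
    private static final boolean DEBUG = Logger.isDebugEnabled();

    void handle() {
        if (DEBUG) {
            System.out.println("expensive debug detail");
        }
    }
}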
Answering my own question...
I used timing to measure when HotSpot has completely got rid of code: if code in a loop adds no more time to the loop's execution time, it's been compiled out.
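A sketch of that approach, reusing the question's Foo (the sink variable keeps the result live so the loop itself isn't dead code):
Foo foo = new Foo();
boolean sink = false;
long start = System.nanoTime();
for (int i = 0; i < 1_000_000_000; i++) {
    sink |= foo.isBaa();  // the call under test
}
long elapsed = System.nanoTime() - start;
System.out.println(sink + " " + elapsed + " ns");
What I found: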
static final boolean checks can be completely compiled out.
static method calls can be compiled out.
instance methods, even if marked final, are not.
In my tests on IBM JDK 8, final method calls that do nothing took ~6 clock cycles.
HotSpot docs from IBM imply that final instance methods can be inlined, but testing indicates there is still a cost. I guess future VMs might be able to optimize this further.
In my tests it seems HotSpot compilation also affects how loops are distributed across CPU cores.
for (; i < Integer.MAX_VALUE; i++)
is magically run across all 4 CPUs (average time: 1/4 of a clock cycle per iteration), whereas
for (; i < Integer.MAX_VALUE; i++) if (false) ...
takes an average of 1 clock cycle per iteration.
So even code that HotSpot compiles down to nothing can affect optimization of the surrounding code.

Empty loop consuming more memory than non-empty loop in Java

I was reading up on Java performance tuning and ran into this.
When we run
public class test {
    public static void main(String[] a) {
        for (int i = 0; i < 1000000; i++) {
            for (int j = 0; j < 100000; j++) {
                Double d = new Double(1.0);
            }
        }
    }
}
JVisualVM shows a flat memory-consumption graph:
But when we run the below code,
public class test {
    public static void main(String[] a) {
        for (int i = 0; i < 1000000; i++) {
            for (int j = 0; j < 100000; j++) {
            }
        }
    }
}
JVisualVM renders a sawtooth:
Why is this happening?
How and why does the GC triggering threshold differ between the two cases?
Regarding your first snippet's loops: once your local variable has exited its scope, it becomes eligible for collection, so every once in a while the GC will kick in and reclaim that memory.
For your second snippet's loops, those empty loops won't be 'optimized away' by the compiler right away, only after multiple calls, because "the JIT triggers AFTER a certain piece of code has been executed many times" [1]. It is even possible that they will never be removed, because empty loops are well known to be used as a waiting mechanism.
Regarding your sawtooth pattern, ruakh has a very nice explanation of it [2]:
"If I'm not mistaken, part of the reason for this is that the monitor itself is forcing the application to create temporary objects that contain information about the state of garbage-collection and memory usage."
[1] Java: how much time does an empty loop use? - Simone Gianni's answer
[2] Why does an empty Java program consume memory?
The reason lies in how the code is optimized once it is compiled. In the first case, you create a Double on every iteration without keeping any reference to it, which would make the program use the GC constantly; the compiler therefore optimizes the code so that less memory is used. The empty loop is a special case: because many programmers use empty loops to make a thread wait, the compiler won't try to optimize those away.
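If you want to watch the collector's behaviour without attaching JVisualVM, GC logging prints a line per collection; the flag spelling depends on the JDK version (test is the class name from the question):
java -verbose:gc test    (classic flag, works on any JDK)
java -Xlog:gc test       (unified logging, JDK 9 and later)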

Do more methods mean the code is less performant?

Do more methods, even if they are never called, have an effect on the performance of a particular class?
By performance, I mean anything, like does it take longer to create the object, does it take longer to actually execute a method...etc...
Now my understanding is that code will only be compiled by the JIT compiler once execution reaches a code block/method that it has not reached before... which would lead me to think that I am not affecting anything by adding default methods. Yes, it will add to the "size" of the (byte) code, but does it actually affect performance?
Am I right or wrong?
Here is the example:
public interface MyInterface {
    void someMethod();

    public default void myDefaultMethod() {
        System.out.println("hey");
    }
}

public class MyClass implements MyInterface {
    public void someMethod() {
    }
}

public static void main(String[] args) {
    MyClass c = new MyClass();
    c.someMethod();
    c.myDefaultMethod();
}
If I then change MyInterface and add LOTS of default methods, even if they are never called, will it have an effect:
public interface MyInterface {
    void someMethod();

    public default void myDefaultMethod() {
        System.out.println("hey");
    }

    public default void myDefaultMethod1() {
        System.out.println("hey1");
    }

    public default void myDefaultMethod2() {
        System.out.println("hey2");
    }

    // ...

    public default void myDefaultMethod100() {
        System.out.println("hey100");
    }
}
You're right in one sense. There's some nuance, though.
Do more methods, even if they are not called, have an affect on the performance of a particular class...
"Performance" usually refers to speed of execution of the program. Code that is never executed will never (directly) consume any CPU time. Thus, code that is never executed cannot (directly) affect execution time. So in that sense, you are correct.
By performance, I mean anything, like does it take longer to create the object, does it take longer to actually execute a method...etc...
No, and no. There's no reason having extra methods lying around would affect object creation time, as that's a function of object size, and at least in Java objects don't directly contain their methods, if memory serves. Having extra methods definitely won't (directly) affect execution of unrelated methods.
Now my understanding is that the code will only be compiled by the JIT compiler if it reaches a code block/method that it has not reached before...
This isn't totally right. The JITC can revisit the same section of code over and over again if it determines that doing so would be beneficial.
... which would lead me to think that I am not affecting anything by adding default methods. Yes, it will add to the "size" of the (byte) code, but does it actually affect performance?
You're right that the bytecode file would be larger. You're wrong that it wouldn't make a difference.
Code size can have a significant impact on performance. Small programs that can fit mostly/entirely in cache will have a significant advantage over larger programs, as pieces of code don't have to be loaded from RAM or the HDD/SSD, which are much slower than the cache.
The amount of code needed for this to matter might be pretty large, though, so for a method or two it wouldn't matter that much. I'm not sure at what point code size in Java becomes a problem; I've never tried to find out.
If you never call those methods, it might be possible that the bits of code that make up those methods are never loaded, which removes their cache-related performance penalty. I'm not certain if splitting the program code like this is possible, though.
So in the end, it probably wouldn't be harmful, so long as you don't have an excessive number of methods. Having methods around that are never called, though, might be a problem for code maintainability, which is always a factor you should consider.
