How to find out what optimizations the JVM applied to my code?

The JVM (especially the HotSpot VM) is famous for having a huge number of optimizations it can apply at runtime.
Is there a way to look at a certain piece of code and see what the JVM has actually done to it?

One problem is that "what JVM has actually done to it" changes between invocations as the JVM is free to re-generate code.
As an example, a few days ago I investigated what HotSpot does with final methods compared to virtual methods. Judging by microbenchmarks, my conclusions were:
Client JVM: If the method is effectively final (no loaded class overrides it), the JVM uses a non-virtual call. Afterwards, if you load a class that overrides this method, the JVM will change the JIT'ed code to make the calls virtual. So declaring the method as final has no significant relevance.
Server JVM: Here final seems to have no relevance either. What seems to happen is that the JVM generates a non-virtual call for whatever class you use the first time, regardless of which classes are loaded. Afterwards, if you make a call on an object of another class, the JVM will patch all calls with something similar to this (I guess that it also profiles calls so it can swap the fast path and slow path if it did not get it right the first time):
if (object instanceof ClassOfNonVirtualCall) {
    do non-virtual call to ClassOfNonVirtualCall.method
} else {
    do virtual call to object.method
}
If you are really interested in seeing generated code, you may play with DEBUG JVMs from OpenJDK:
http://dlc.sun.com.edgesuite.net/jdk7/binaries/index.html
http://wikis.sun.com/display/HotSpotInternals/PrintAssembly
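If you want to poke at the final-versus-virtual behaviour described above yourself, a small program along these lines may help. It is only a sketch: the class names (Base, Derived, VirtualCallDemo) are made up for illustration, and what the JIT really does with it depends on the JVM and its version. Running it with -XX:+PrintCompilation should show sum being compiled and, on some JVMs, a "made not entrant" line around the time the subclass shows up.

```java
// Sketch only: Base, Derived and VirtualCallDemo are invented names.
class Base {
    // Not declared final, but effectively final until an overriding class is loaded.
    int value() { return 1; }
}

class Derived extends Base {
    @Override
    int value() { return 2; }
}

class VirtualCallDemo {
    // Hot loop with exactly one call site for value().
    static long sum(Base b, int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += b.value();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // Phase 1: only Base is in use, so the JIT is free to treat value() as non-virtual.
        System.out.println(sum(new Base(), 5_000_000));

        // Phase 2: Class.forName is used so that Derived is not touched before this point.
        // Code compiled under the "nobody overrides value()" assumption now has to be
        // thrown away or patched by the JVM.
        Base late = (Base) Class.forName("Derived").getDeclaredConstructor().newInstance();
        System.out.println(sum(late, 5_000_000));
    }
}
```

The program cannot observe the generated code itself; it only creates the situation, and the JVM's own logging has to tell you what happened.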

This is highly JVM specific, and you will most likely need to do some serious investigation in the particular JVM you are looking at.
You can see the available HotSpot VM options here http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
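Besides reading the options list, you can ask a running HotSpot JVM what value an option currently has. The sketch below uses the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean, so it assumes a HotSpot-based JVM; the class name ShowVmOptions and the choice of options are just for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

class ShowVmOptions {
    // Returns the current value of a HotSpot option, or null if it cannot be looked up.
    static String optionValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return null; // no such bean registered on this JVM
        }
        try {
            VMOption option = bean.getVMOption(name);
            return option.getValue();
        } catch (IllegalArgumentException e) {
            return null; // this JVM does not know an option with that name
        }
    }

    public static void main(String[] args) {
        String[] names = {"TieredCompilation", "CompileThreshold", "PrintCompilation"};
        for (String name : names) {
            System.out.println(name + " = " + optionValue(name));
        }
        // The flags that were actually passed on the command line:
        System.out.println(ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```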

The following is a very good resource:
http://wikis.sun.com/display/HotSpotInternals/Home
Particularly interesting are the "LogCompilation tool" and "LogCompilation overview" links (can't post direct links as I've just registered).

Here is a good page on HotSpot optimizations.
Some of the optimizations can be seen by looking at the bytecode emitted by the compiler. Other optimizations are dynamic and only exist at run time. For example, HotSpot can do on-stack replacement, which swaps the interpreted stack frame of a method that is still running (typically stuck in a long loop) for a compiled one.
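A rough way to see on-stack replacement happen is a method that is called once but loops for a long time. The sketch below is an assumption-laden example (OsrDemo and sumOfSquares are invented names): run it with -XX:+PrintCompilation and look for entries marked with %, which is how HotSpot flags OSR compilations. It also prints how much time the JIT reports having spent, via the standard CompilationMXBean.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class OsrDemo {
    // Called only once, but the loop runs long enough to become hot on its own.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean(); // null if there is no JIT
        boolean timed = jit != null && jit.isCompilationTimeMonitoringSupported();

        long before = timed ? jit.getTotalCompilationTime() : 0;
        long result = sumOfSquares(2_000_000);
        long after = timed ? jit.getTotalCompilationTime() : 0;

        System.out.println("result = " + result);
        if (timed) {
            System.out.println(jit.getName() + " reported about " + (after - before)
                    + " ms of compilation while the loop ran");
        }
    }
}
```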

Related

Why is Java's debugging Hot Swap limited to intra-method changes?

I have gone through the hot deployment tutorial and it works.
But I have a question about the limitations (point 3), i.e.:
Hot deploy has supported the code changes in the method implementation only. If you add a new class or a new method, restart is still required.
Basically, why don't we need a server restart when I change an existing method, but do need one when I add a method or a class?
My understanding of how it works: when I change an existing method or introduce a new method, Eclipse places the file at the right location under the web server. If the class has already been loaded by the class loader into PermGen space, it is unloaded from PermGen and the new one is loaded internally without a server restart, so that the new changes (bytecode) are reflected. Is that correct?
If yes, why does hot deployment not work for new methods and new class files?

The reasoning is quite complicated and really only fully known to people with intimate knowledge of the JVM and how it manages memory. There is a decent explanation: Java HotSwap Guide section titled Why is HotSwap limited to method bodies? (although it's really an advertisement for the JRebel product).
The gist: there are two primary factors that prevent HotSwap from handling structural changes to classes: JIT and memory allocation.
The JIT (Just In Time) compiler in the JVM optimizes the bytecode after classes have been loaded and run a few times, basically inlining many calls for increased performance. Implementing that feature safely and effectively in an environment where class signatures and structure can change would be a significant challenge.
Other problems surround what would happen regarding memory management if class structures were allowed to change. The JVM would have to modify existing instances of classes, which would mean relocating them to other parts of the heap storage. Not to mention having to relocate the class objects themselves. The JVM's memory management is already incredibly complex and highly optimized; such changes would only increase the complexity and potentially reduce performance of the JIT compiler (and likely lead to additional bugs).
I think it's safe to assume that the JVM engineers have not been willing to take the performance and bug footprint tradeoffs that would be required to support this feature. Which is why products like JRebel and others have come to exist.
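To make the distinction concrete, here is a tiny made-up class (PriceCalculator is purely illustrative) with comments marking which kind of edit is a method-body change and which is a structural change in the sense used above.

```java
// Illustration only: PriceCalculator is a made-up class.
class PriceCalculator {
    private final double taxRate = 0.2;

    // Method-body change: editing the formula inside this existing method keeps the
    // class's shape (same fields, same method signatures). This is the kind of edit
    // standard HotSwap can apply to a running JVM.
    double gross(double net) {
        return net * (1 + taxRate);
    }

    // Structural change: uncommenting the method below (or adding a field, or changing
    // a signature) alters the class's shape. Standard HotSwap refuses such a
    // redefinition, which is where restarts or tools like JRebel come in.
    //
    // double discounted(double net) { return net * 0.9; }
}
```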

As a side note, the specification itself is not limited.
It just happens some of the available implementations, including the ubiquitous Reference Implementation, are limited.
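After you connect to a remote VM, you can check whether it allows adding methods or redefining classes (JDI, for example, exposes this through VirtualMachine.canAddMethod() and VirtualMachine.canRedefineClasses()).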

You can if you run your Java on a Smalltalk VM. Smalltalk has been doing this basically forever, and it is one of the reasons why Smalltalkers tend to do debugger-driven development as a superior form of test-driven development. Smalltalk VMs do the required clean-up of memory data structures. In Eliot Miranda's Spur (for Squeak, Pharo and Cuis) and in GemStone that is done lazily, but otherwise you might have to wait for all objects to be migrated. The reference implementation Java VM probably has more optimizations than any Smalltalk VM you could run Java on at the moment.

The answer provided by E-Riz already gives a good explanation of why the standard Java HotSwap technology only supports modifications to existing methods and not the addition of new classes or of new methods to classes.
However, as has been described in a related SO discussion, the level of hot swapping you can achieve depends on the tool chain you use. So, if you add the JRebel plug-in, you would be able to hot swap even when new methods and classes have been added.
There is another project, Hot Swap Agent: a Java agent that you run your Java container with and activate using a couple of command-line parameters (as mentioned in its quickstart).

Cross-compiler vs JVM

I am wondering about the purpose of JVM. If JVM was created to allow platform independent executable code, then can't a cross-compiler which is capable of producing platform independent executable code replace a JVM?
The information about cross-compilers was retrieved from: http://en.wikipedia.org/wiki/Cross_compiler

The advantage of the bytecode format and the JVM is the ability to optimize code at runtime, based on profiling data acquired during the actual run. In other words, not having statically compiled native code is a win.
A particularly interesting example of the advantages of runtime compilation is the monomorphic call site: for each place in the code where an instance method is called, the runtime keeps track of exactly which object types the method is called on. In very many cases it will turn out that there is only one object type involved, and the JVM will then compile that call as if it were a static method (no vtables involved). This further allows it to inline the call and then do even more optimizations such as escape analysis, register allocation, constant folding, and much more.
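As a sketch of what a monomorphic call site looks like in source code (all names here are invented): the interface has two implementations, so nothing in the source proves which one gets called, but at run time the call site in totalArea only ever sees Square. That is the kind of situation where the profile lets the JIT compile the call as a direct, inlinable one. Whether it actually does is up to the JVM.

```java
// Sketch only: Shape, Circle, Square and CallSiteDemo are invented names.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

final class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

class CallSiteDemo {
    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area(); // the call site whose receiver types get profiled
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] onlySquares = new Shape[1_000];
        for (int i = 0; i < onlySquares.length; i++) {
            onlySquares[i] = new Square(2);
        }
        double sum = 0;
        // Enough repetitions for the method to get hot while seeing a single receiver type.
        for (int round = 0; round < 20_000; round++) {
            sum += totalArea(onlySquares);
        }
        System.out.println(sum);
    }
}
```

Mixing a few Circle instances into the array would make the same call site see two types, which is a cheap way to compare timings between the monomorphic and non-monomorphic case.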
In fact, your criticism could (some say, should) be turned upside-down: why does Java define bytecode at all, fixing many design decisions which could have been left up to the implementation? The modern trend is to distribute source code and have the JIT compiler work on that.

The JVM does much more than compile. It is an interpreter for bytecode that also contains a JIT (just-in-time) compiler, and the same bytecode can be compiled differently depending on the runtime context (the JIT decides at run time how your bytecode is compiled). The JIT does a lot of optimization: it tries to compile your code in the most efficient way. A cross-compiler can't do (all of) this because it doesn't know how your code will be used at run time. This is the big advantage of the JVM over a cross-compiler.
I haven't used a cross-compiler before, but I guess its advantage is that you have better control over how your code is compiled.

platform independent executable code
That's what Java bytecode is. The problem with "platform independent executable code" is that it can't be native to every platform (otherwise being platform independent would be a trivial, uninteresting property). In other words, there is no format which runs natively on all platforms.
The JVM is, depending on your definition of the term, either the ISA which defines Java bytecode, or the component that allows Java bytecode to be run on platforms whose native format for executable code isn't Java bytecode.
Of course, there is an infinite design space for alternative platform independent executable code and the above is true for any other occupant of said space. So yes, in a sense you can replace the JVM with another thing which fulfills the same function for another platform independent executable code format.
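To see that bytecode really is a fixed, platform-independent file format, you can read the header of any class file: it starts with the same magic number and a version on every platform. A small sketch (ClassFileHeader is an invented name; it assumes the class file can be found as a resource through the system class loader):

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    // Returns { magic, minorVersion, majorVersion } of a class given its internal name,
    // e.g. "java/lang/Object".
    static int[] header(String internalName) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(internalName + ".class")) {
            if (in == null) {
                throw new IllegalArgumentException("class file not found: " + internalName);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();           // first four bytes of every class file
            int minor = data.readUnsignedShort(); // class file format minor version
            int major = data.readUnsignedShort(); // class file format major version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] h = header("java/lang/Object");
        System.out.println("magic = 0x" + Integer.toHexString(h[0]) + ", major version = " + h[2]);
    }
}
```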

Can I force the JVM to natively compile a given method?

I have a performance-critical method that is called often when my app starts up. Eventually it gets JIT-compiled, but only after running in the interpreter for a noticeable amount of time.
Is there any way I can tell the JVM that I want this method compiled right from the start (without tweaking other internals with stuff like -XX:CompileThreshold)?

The only way I know of is the -Xcomp flag, but that is not generally advisable to use. It forces immediate JIT compilation of ALL classes and methods first time they are run. The downside is that you will see a performance decrease on initial startup (due to increased JIT activity). The other major limitation with this flag is that it appears to disable the incremental profiling-based optimization that JIT would normally do. In standard mixed mode, the JIT compiler can (and will) deoptimize and re-compile parts of the code continually based on profiling and run-time information collected. This allows it to "correct" faulty optimizations like boundary checks that were omitted but turned out to be needed, sub-optimal inlinings etc. -Xcomp disables the profiling-based optimization and depending on program, can cause significant performance losses overall for only a small or no real gain in startup, which is why it's not recommended to use.
Beyond -Xcomp (which is pretty brutal) and -XX:CompileThreshold (which controls how many times a given method is executed in interpreted mode to gather stats before the JIT compiles/optimizes it), there is also -Xbatch. This forces JIT compilation to the "foreground", essentially blocking calls to a method until it has been compiled, rather than compiling it in the background as normally happens.
You didn't specify which Java version you are using, but if Java 7 is an option for you, it introduces a new JIT model called "tiered compilation" (activated with the -XX:+TieredCompilation switch). Tiered compilation allows an initial, smaller compilation pass on the first use of a method and then an additional, larger compilation/optimization pass later, based on collected profiling data. Sounds like it should be interesting to you.
It supposedly requires some additional tweaking and parameters/configurations, but I've not got around to checking it out further.
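To judge whether any of these switches helps in your case, it may be enough to time the method in batches and watch when it gets fast. The following is only a sketch with a stand-in workload (WarmupTiming and checksum are made-up names, not your method). Try running it plain, then with -Xint, -Xcomp, -Xbatch or -XX:+TieredCompilation, and compare the first few batches.

```java
class WarmupTiming {
    // Stand-in for the performance-critical method (hypothetical workload).
    static int checksum(int seed, int rounds) {
        int h = seed;
        for (int i = 0; i < rounds; i++) {
            h = h * 31 + i;
        }
        return h;
    }

    public static void main(String[] args) {
        int sink = 0; // printed at the end so the calls cannot be optimized away entirely
        for (int batch = 0; batch < 10; batch++) {
            long start = System.nanoTime();
            for (int i = 0; i < 2_000; i++) {
                sink += checksum(i, 1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("batch " + batch + ": " + micros + " us");
        }
        System.out.println("(ignore) " + sink);
    }
}
```

The timings are noisy and machine-dependent, so treat them as a rough indication only; early batches are typically slower than later ones in default mixed mode.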

I'm not sure if it'll completely precompile the code, but you could add the class with the critical method to the JVM's shared data dump. See this question for more details.
Also, have you considered JNI? If your method is very CPU-intensive, it might speed things up considerably.

Hotspot JIT optimizations

In a lecture about JIT in Hotspot I want to give as many examples as possible of the specific optimizations that JIT performs.
I know just about "method inlining", but there should be much more. Give a vote for every example.

Well, you should scan Brian Goetz's articles for examples.
In brief, HotSpot can and will:
Inline methods
Join adjacent synchronized blocks on the same object
Eliminate locks if monitor is not reachable from other threads
Eliminate dead code (hence most of micro-benchmarks are senseless)
Drop memory write for non-volatile variables
Replace interface calls with direct method calls for methods only implemented once
et cetera
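For a lecture, source patterns that are candidates for some of the items above might look like the sketch below. The names are invented, and nothing here forces the JIT to apply the optimization; these are simply shapes of code it is allowed to transform.

```java
class JitFriendlyPatterns {
    // Lock elimination candidate: StringBuffer's methods are synchronized, but 'sb'
    // never leaves this method, so no other thread can ever reach its monitor.
    static String join(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        return sb.toString();
    }

    private static final Object LOCK = new Object();
    private static int counter;

    // Lock joining candidate: two adjacent synchronized blocks on the same object.
    static int bumpTwice() {
        synchronized (LOCK) {
            counter++;
        }
        synchronized (LOCK) {
            return ++counter;
        }
    }

    // Dead code candidate: the result is never used, so the JIT may drop the loop.
    // A microbenchmark timing this method may end up timing nothing at all.
    static void busyWorkIgnored(int n) {
        long unused = 0;
        for (int i = 0; i < n; i++) {
            unused += i;
        }
    }

    // Returning the result keeps the work observable to the caller.
    static long busyWorkUsed(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }
}
```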

There is a great presentation on the optimizations used by modern JVMs on the Jikes RVM site:
ACACES’06 - Dynamic Compilation and Adaptive Optimization in Virtual Machines
It discusses architecture, tradeoffs, measurements and techniques. And names at least 20 things JVMs do to optimize the machine code.

I think the interesting stuff is what the JIT can do and a conventional compiler can't. Inlining methods, eliminating dead code, CSE, liveness analysis, etc. are all done by your average C++ compiler as well, so there is nothing "special" here.
But optimizing something based on optimistic assumptions and then deoptimizing later if they turn out to be wrong (assuming a specific type, removing branches that will fail later anyhow if not done, ...)? Removing virtual calls if we can guarantee that only one class exists at the moment (again something that only works reliably with deoptimization)? Adaptive optimization is, I think, the one thing that really distinguishes the JIT from your run-of-the-mill C++ compiler.
Maybe also mention the runtime profiling done by the JIT to analyse which optimizations it should apply (not that unique anymore with all the profile-guided optimizations though).

There's an old but likely still valid overview in this article.
The highlights seem to be performing classical optimizations based on available runtime profiling information:
JITting "hot spots" into native code
Adaptive inlining – inlining the most commonly called implementations for a given method dispatch to avoid a huge code size
And some minor ones like generational GC which makes allocating short lived objects cheaper, and various other smaller optimizations, plus whatever else was added since that article was published.
There's also a more detailed official whitepaper, and a fairly nitty-gritty HotSpot Internals wiki page that lists how to write fast Java code that should let you extrapolate what use cases were optimized.

Jumps to equivalent native machine code instead of JVM interpretation of the op-codes. The lack of a need to simulate a machine (the JVM) in machine code for a heavily used part of a Java application (which is the equivalent of an extension of the JVM) provides a good speed increase.
Of course, that's most of what HotSpot is.

What can you not do on the Dalvik VM (Android's VM) that you can in Sun VM?

I know that you can run almost all Java in Dalvik's VM that you can in Java's VM but the limitations are not very clear. Has anyone run into any major stumbling blocks? Any major libraries having trouble? Any languages that compile to Java byte code (Scala, Jython etc...) not work as expected?

There are a number of things that Dalvik will not handle, or will not handle quite the same way as standard Java bytecode, though most of them are quite advanced.
The most severe example is runtime bytecode generation and custom class loading. Let's say you would like to create some bytecode and then use a class loader to load it for you. If that trick works on your normal machine, it is guaranteed not to work on Dalvik unless you change your bytecode generation.
That prevents you from using certain dependency injection frameworks, the best-known example being Google Guice (though I am sure some people are working on that). On the other hand, AspectJ should work, as it uses bytecode instrumentation as a compilation step (though I don't know if anyone has tried).
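For readers who have not seen the trick in question, here is a minimal sketch of runtime bytecode generation plus custom class loading on a standard JVM. It writes the bytes of an empty class by hand and feeds them to defineClass. All names (RuntimeClassDemo, ByteLoader, "Generated") are invented, and the class name is assumed to be in the default package. This is the step that is not expected to work on Dalvik, since the bytes are standard JVM bytecode.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RuntimeClassDemo {
    // Builds a minimal class file: "public class <name> extends Object", no fields, no methods.
    static byte[] emptyClassBytes(String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                           // magic number
            out.writeShort(0);                                  // minor version
            out.writeShort(52);                                 // major version 52 (Java 8 format)
            out.writeShort(5);                                  // constant pool count = entries + 1
            out.writeByte(7); out.writeShort(2);                // #1 Class, name at #2
            out.writeByte(1); out.writeUTF(name);               // #2 Utf8: this class's name
            out.writeByte(7); out.writeShort(4);                // #3 Class, name at #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8: superclass name
            out.writeShort(0x0021);                             // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                  // this_class = #1
            out.writeShort(3);                                  // super_class = #3
            out.writeShort(0);                                  // interfaces count
            out.writeShort(0);                                  // fields count
            out.writeShort(0);                                  // methods count
            out.writeShort(0);                                  // attributes count
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Custom class loader that exposes the protected defineClass method.
    static final class ByteLoader extends ClassLoader {
        Class<?> define(String name, byte[] classFile) {
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    static Class<?> generate(String name) {
        return new ByteLoader().define(name, emptyClassBytes(name));
    }

    public static void main(String[] args) {
        Class<?> c = generate("Generated");
        System.out.println(c.getName() + " extends " + c.getSuperclass().getName());
    }
}
```

Real frameworks generate far richer classes with a bytecode library, but the loading step at the end is the same idea.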
As to other JVM languages -- anything that in the end compiles to standard bytecode and does not use bytecode instrumentation at runtime can be converted to Dalvik and should work. I know people did run Jython on Android and it worked OK.
Another thing to be aware of is that there is no just-in-time compilation. This is not strictly Dalvik's problem (you can always compile any bytecode on the fly if you wish), but Android does not support that and is unlikely to do so. As a result, while microbenchmarking for standard Java was useless -- components had different runtime characteristics in tests than as parts of larger systems -- microbenchmarks for Android phones totally make sense.

If you watch the "Dalvik Virtual Machine internals" Google IO session, you will find that Dalvik does not support generational GC.
So it could perform worse when objects are created and discarded frequently. The Java VM supports generational GC, so it would show better GC performance in the same situation.
Also, Dalvik uses a trace-granularity JIT instead of a method-granularity JIT.

Another thing that I guess could be added here is that Dalvik apparently does not preserve field order when listing the fields of a class using the reflection API. Now, the reflection API does not make any guarantees on it anyway (so ideally you shouldn't depend on it anyway), but most of the other VMs out there do preserve the order.
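If code needs a stable field listing across VMs, one option is to impose an order yourself instead of relying on what reflection returns. A small sketch (FieldOrderDemo and Point are invented names; sorting by name is just one possible choice of order):

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FieldOrderDemo {
    static class Point {
        int x;
        int y;
        String label;
    }

    // Returns the declared field names in alphabetical order, independent of the
    // order in which the VM happens to report them.
    static List<String> sortedFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!field.isSynthetic()) { // skip compiler- or tool-generated fields
                names.add(field.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedFieldNames(Point.class));
    }
}
```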

Just to add to the conversation, not intended to revive an old thread. I just ran across this in my search, and want to add that Jython does not work out of the box with Dalvik either. Simply trying to do a hello world example will yield the following:
