I have a performance-critical method that is called often when my app starts up. Eventually it gets JIT-compiled, but only after it has spent a noticeable amount of time running in the interpreter.
Is there any way I can tell the JVM that I want this method compiled right from the start (without tweaking other internals with stuff like -XX:CompileThreshold)?
The only way I know of is the -Xcomp flag, but that is not generally advisable to use. It forces immediate JIT compilation of ALL classes and methods the first time they are run. The downside is that you will see a performance decrease on initial startup (due to increased JIT activity). The other major limitation of this flag is that it appears to disable the incremental profiling-based optimization that the JIT would normally do. In standard mixed mode, the JIT compiler can (and will) deoptimize and recompile parts of the code continually, based on the profiling and run-time information it collects. This allows it to "correct" faulty optimizations, such as bounds checks that were omitted but turned out to be needed, sub-optimal inlinings, etc. -Xcomp disables the profiling-based optimization and, depending on the program, can cause significant performance losses overall for little or no real gain in startup, which is why it's not recommended.
Beyond -Xcomp (which is pretty brutal) and -XX:CompileThreshold (which controls how many executions of a given method the JIT will run in interpreted mode to gather stats before compiling/optimizing it), there is also -Xbatch. This forces JIT compilation to the "foreground", essentially blocking calls to a method until it has been compiled, rather than compiling it in the background as normally happens.
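When experimenting with these switches it is easy to lose track of which ones a given JVM was actually started with. The following is a minimal sketch, assuming a HotSpot-style JVM, that lists the flags seen by the running process. The class name `JitFlagsSketch` and the helper `hasFlag` are made up for illustration, and the `java.vm.info` property is HotSpot-specific, so it may be absent or differently worded on other VMs.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class JitFlagsSketch {
    // Hypothetical helper: true if the exact flag string was passed to the JVM.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments given to the JVM itself (not to the application's main method).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String flag : new String[] {"-Xcomp", "-Xbatch"}) {
            System.out.println(flag + " passed: " + hasFlag(jvmArgs, flag));
        }
        // On HotSpot this usually describes the execution mode; may be null elsewhere.
        System.out.println("java.vm.info = " + System.getProperty("java.vm.info"));
    }
}
```

Running it once plainly and once with, say, `java -Xbatch JitFlagsSketch` should make the difference visible in the first two lines of output.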
You didn't specify which Java version you are using, but if Java 7 is an option for you, it introduces a new JIT model called "tiered compilation" (activated with the -XX:+TieredCompilation switch). Tiered compilation allows an initial, smaller compilation pass on the first use of a method and then an additional, larger compilation/optimization pass later, based on collected profiling data. It sounds like it should be interesting to you.
It supposedly requires some additional tweaking and parameters/configurations, but I've not got around to checking it out further.
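To compare how long a method stays slow under different flag combinations, it can help to time successive batches of calls and watch where the per-batch time drops. This is only a rough sketch: `criticalMethod` is a hypothetical stand-in for the real method, the batch size of 2,000 is arbitrary, and the numbers are noisy enough that they should be read as a trend rather than a measurement.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

final class StartupTimingSketch {
    // Written by every batch so the JIT cannot discard the loop as dead code.
    static volatile long sink;

    // Hypothetical stand-in for the performance-critical method.
    static long criticalMethod(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += (long) i * i;
        }
        return acc;
    }

    // Times one batch of calls and returns the elapsed wall-clock nanoseconds.
    static long timeBatchNanos(int calls) {
        long start = System.nanoTime();
        long acc = 0;
        for (int i = 0; i < calls; i++) {
            acc += criticalMethod(100);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed;
    }

    public static void main(String[] args) {
        for (int batch = 1; batch <= 10; batch++) {
            System.out.println("batch " + batch + ": " + timeBatchNanos(2_000) + " ns");
        }
        // Total time the JIT has spent compiling so far, if the VM reports it.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println(jit.getName() + ": "
                    + jit.getTotalCompilationTime() + " ms spent compiling");
        }
    }
}
```

Early batches typically take longer than later ones; how quickly the times settle gives a feel for what a given flag changes.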
I'm not sure if it'll completely precompile the code, but you could add the class containing the critical method to the JVM's shared data dump. See this question for more details.
Also, have you considered JNI? If your method is very CPU-intensive, it might speed things up considerably.
I've been thinking about it lately, and it seems to me that most advantages given to JIT compilation should more or less be attributed to the intermediate format instead, and that jitting in itself is not much of a good way to generate code.
So these are the main pro-JIT compilation arguments I usually hear:
Just-in-time compilation allows for greater portability. Isn't that attributable to the intermediate format? I mean, nothing keeps you from compiling your virtual bytecode into native bytecode once you've got it on your machine. Portability is an issue in the 'distribution' phase, not during the 'running' phase.
Okay, then what about generating code at runtime? Well, the same applies. Nothing keeps you from integrating a just-in-time compiler for a real just-in-time need into your native program.
But the runtime compiles it to native code just once anyways, and stores the resulting executable in some sort of cache somewhere on your hard drive. Yeah, sure. But it's optimized your program under time constraints, and it's not making it better from there on. See the next paragraph.
It's not as if ahead-of-time compilation has no advantages either. Just-in-time compilation has time constraints: you can't keep the end user waiting forever while your program launches, so it has to make a tradeoff somewhere. Most of the time JIT compilers just optimize less. A friend of mine had profiling evidence that inlining functions and unrolling loops "manually" (obfuscating source code in the process) had a positive impact on the performance of his C# number-crunching program; doing the same on my side, with my C program performing the same task, yielded no positive results, and I believe this is due to the extensive transformations my compiler was allowed to make.
And yet we're surrounded by jitted programs. C# and Java are everywhere, Python scripts can compile to some sort of bytecode, and I'm sure a whole bunch of other programming languages do the same. There must be a good reason that I'm missing. So what makes just-in-time compilation so superior to ahead-of-time compilation?
EDIT To clear some confusion, maybe it would be important to state that I'm all for an intermediate representation of executables. This has a lot of advantages (and really, most arguments for just-in-time compilation are actually arguments for an intermediate representation). My question is about how they should be compiled to native code.
Most runtimes (or compilers for that matter) will prefer to either compile them just-in-time or ahead-of-time. As ahead-of-time compilation looks like a better alternative to me because the compiler has more time to perform optimizations, I'm wondering why Microsoft, Sun and all the others are going the other way around. I'm kind of dubious about profiling-related optimizations, as my experience with just-in-time compiled programs displayed poor basic optimizations.
I used an example with C code only because I needed an example of ahead-of-time compilation versus just-in-time compilation. The fact that C code wasn't emitted to an intermediate representation is irrelevant to the situation, as I just needed to show that ahead-of-time compilation can yield better immediate results.
Greater portability: The deliverable (byte-code) stays portable.
At the same time, more platform-specific: Because the JIT-compilation takes place on the same system that the code runs, it can be very, very fine-tuned for that particular system. If you do ahead-of-time compilation (and still want to ship the same package to everyone), you have to compromise.
Improvements in compiler technology can have an impact on existing programs. A better C compiler does not help you at all with programs already deployed. A better JIT-compiler will improve the performance of existing programs. The Java code you wrote ten years ago will run faster today.
Adapting to run-time metrics. A JIT-compiler can not only look at the code and the target system, but also at how the code is used. It can instrument the running code, and make decisions about how to optimize according to, for example, what values the method parameters usually happen to have.
You are right that JIT adds to start-up cost, and so there is a time-constraint for it, whereas ahead-of-time compilation can take all the time that it wants. This makes it more appropriate for server-type applications, where start-up time is not so important and a "warm-up phase" before the code gets really fast is acceptable.
I suppose it would be possible to store the result of a JIT compilation somewhere, so that it could be re-used the next time. That would give you "ahead-of-time" compilation for the second program run. Maybe the clever folks at Sun and Microsoft are of the opinion that a fresh JIT is already good enough and the extra complexity is not worth the trouble.
The ngen tool page spilled the beans (or at least provided a good comparison of native images versus JIT-compiled images). Executables that are compiled ahead-of-time typically have the following benefits:
Native images load faster because they have less startup activity, and they need a fixed amount less memory (the memory required by the JIT compiler);
Native images can share library code, while JIT-compiled images cannot.
Just-in-time compiled executables typically have the upper hand in these cases:
Native images are larger than their bytecode counterpart;
Native images must be regenerated whenever the original assembly or one of its dependencies is modified.
The need to regenerate an ahead-of-time compiled image every time one of its components changes is a huge disadvantage for native images. On the other hand, the fact that JIT-compiled images can't share library code can cause a serious memory hit. The operating system can load any native library at one physical location and share the immutable parts of it with every process that wants to use it, leading to significant memory savings, especially with system frameworks that virtually every program uses. (I imagine that this is somewhat offset by the fact that JIT-compiled programs only compile what they actually use.)
The general consideration of Microsoft on the matter is that large applications typically benefit from being compiled ahead-of-time, while small ones generally don't.
Simple logic tells us that compiling a huge program the size of MS Office, even from byte-code, will simply take too much time. You'll end up with a huge start-up time, and that will scare anyone away from your product. Sure, you can precompile during installation, but this also has consequences.
Another reason is that not all parts of an application will be used. The JIT will compile only those parts that the user cares about, leaving potentially 80% of the code untouched and saving time and memory.
And finally, JIT compilation can apply optimizations that normal compilers can't, like inlining virtual methods or parts of methods with trace trees. In theory, this can make JIT-compiled programs faster.
Better reflection support. This could be done in principle in an ahead-of-time compiled program, but it almost never seems to happen in practice.
Optimizations that can often only be figured out by observing the program dynamically. For example, inlining virtual functions, escape analysis to turn heap allocations into stack allocations, and lock coarsening.
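To make the virtual-inlining point concrete, here is a small sketch of the kind of code where a profiling JIT has an opening that a purely static compiler usually lacks. All names (`VirtualCallSketch`, `Shape`, `Square`) are invented for the example. Whether a particular JVM really inlines this call depends on the VM and its settings; the code only shows the shape of the situation.

```java
final class VirtualCallSketch {
    interface Shape {
        double area();
    }

    static final class Square implements Shape {
        private final double side;

        Square(double side) {
            this.side = side;
        }

        @Override
        public double area() {
            return side * side;
        }
    }

    // s.area() is an interface call in the bytecode. If the runtime only ever
    // observes Square at this call site, a JIT may speculate on that, inline
    // Square.area(), and deoptimize later if another Shape ever shows up.
    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = new Shape[1000];
        for (int i = 0; i < shapes.length; i++) {
            shapes[i] = new Square(2.0);
        }
        System.out.println(totalArea(shapes));
    }
}
```

An ahead-of-time compiler that cannot prove no other `Shape` implementation will be loaded later has to keep the indirect call.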
Maybe it has to do with the modern approach to programming. You know, many years ago you would write your program on a sheet of paper, some other people would transform it into a stack of punched cards and feed into THE computer, and tomorrow morning you would get a crash dump on a roll of paper weighing half a pound. All that forced you to think a lot before writing the first line of code.
Those days are long gone. When using a scripting language such as PHP or JavaScript, you can test any change immediately. That's not the case with Java, though appservers give you hot deployment. So it is just very handy that Java programs can be compiled fast, as bytecode compilers are pretty straightforward.
But, there is no such thing as JIT-only languages. Ahead-of-time compilers have been available for Java for quite some time, and more recently Mono introduced it to CLR. In fact, MonoTouch is possible at all because of AOT compilation, as non-native apps are prohibited in Apple's app store.
I have been trying to understand this as well because I saw that Google is moving towards replacing their Dalvik Virtual Machine (essentially another Java Virtual Machine like HotSpot) with Android Runtime (ART), which is an AOT compiler, but Java usually uses HotSpot, which is a JIT compiler. Apparently, ART is ~2x faster than Dalvik... so I thought to myself "why doesn't Java use AOT as well?".
Anyways, from what I can gather, the main difference is that JIT uses adaptive optimization during run time, which (for example) allows ONLY those parts of the bytecode that are being executed frequently to be compiled into native code, whereas AOT compiles the entire program into native code, and a smaller amount of code runs faster than a larger amount.
I have to imagine that most Android apps are composed of a small amount of code, so on average it makes more sense to compile the entire program to native code AOT and avoid the overhead associated with interpretation / optimization.
It seems that this idea has been implemented in Dart language:
https://hackernoon.com/why-flutter-uses-dart-dd635a054ebf
JIT compilation is used during development, using a compiler that is especially fast. Then, when an app is ready for release, it is compiled AOT. Consequently, with the help of advanced tooling and compilers, Dart can deliver the best of both worlds: extremely fast development cycles, and fast execution and startup times.
One advantage of JIT which I don't see listed here is the ability to inline/optimize across separate assemblies/dlls/jars (for simplicity I'm just going to use "assemblies" from here on out).
If your application references assemblies which might change after install (e. g. pre-installed libraries, framework libraries, plugins), then a "compile-on-install" model must refrain from inlining methods across assembly boundaries. Otherwise, when the referenced assembly is updated we would have to find all such inlined bits of code in referencing assemblies on the system and replace them with the updated code.
In a JIT model, we can freely inline across assemblies because we only care about generating valid machine code for a single run during which the underlying code isn't changing.
The difference between platform-browser-dynamic and platform-browser is the way your angular app will be compiled.
Using the dynamic platform makes Angular send the Just-in-Time compiler to the front-end along with your application, which means your application is compiled on the client side.
On the other hand, using platform-browser leads to an Ahead-of-Time pre-compiled version of your application being sent to the browser. Which usually means a significantly smaller package being sent to the browser.
The angular2-documentation for bootstrapping at https://angular.io/docs/ts/latest/guide/ngmodule.html#!#bootstrap explains it in more detail.
I just installed the Unnecessary Code Detector for Eclipse and ran it on my project. I see a lot of so-called "dead code". Although, from an organizational standpoint, it makes sense to remove dead/unnecessary code, it got me thinking:
Does dead code actually hinder a Java app's performance?!?!
To me, if code is truly "dead", it never gets executed, so I don't see how removing it (again, except for organizational/housekeeping/code cleanup purposes) could ever improve performance.
I don't think "dead code" will hinder application performance, but it will hinder development performance, which is invariably more expensive.
Where possible the JIT compiler may remove this kind of dead-code - see dead-code elimination. I suppose in theory if the JIT compiler removed huge amounts of dead-code it could impact initial compilation.
However I doubt this would occur in practice, and I'd only recommend dead-code removal to make development easier/faster.
It could affect a few things...
Size of the application
Memory used when application is running
Decreased performance on package scanning (if applicable)
It could affect it a bit, but the JIT compiler should be able to detect and eliminate methods that are never used. And even if it doesn't, the overheads (memory, load time, JIT compilation time, etc) are likely to be small.
A much better reason to eliminate dead methods is to get rid of "old stuff" that makes your codebase harder to read, test, maintain. If this is code that you might conceivably need again, you can always get it back from version control.
What if I ask the user which method they want to call, take the input as a String, and then invoke that method using reflection? The JIT can't tell which method will be used, so it can't remove any methods :).
Good point. So it probably won't eliminate methods in practice. (But it could ... if the classloader knew where to reload the method from ...)
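The reflection scenario from the comment above can be sketched as follows. The class and method names are invented for the example. Neither `greet` nor `part` is called directly anywhere, yet either one can be reached at run time depending on a string that is only known once the program is running.

```java
import java.lang.reflect.Method;

final class ReflectiveCallSketch {
    // Neither method is referenced statically anywhere in this class.
    static String greet() {
        return "hello";
    }

    static String part() {
        return "bye";
    }

    // Looks up a no-argument static method by name and invokes it.
    static String callByName(String methodName) {
        try {
            Method m = ReflectiveCallSketch.class.getDeclaredMethod(methodName);
            return (String) m.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no callable method named " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // The name could just as well come from user input or a config file.
        String name = args.length > 0 ? args[0] : "greet";
        System.out.println(callByName(name));
    }
}
```

Because the method name arrives as data, nothing short of running the program can say which of the two "dead-looking" methods is actually live.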
Dead methods increase the size of the method area in the JVM.
Yes, though the percentage memory increase is probably insignificant. And the consequent performance reduction is probably even less significant.
Also, a method that is never called will never be JIT-compiled, so it is likely to avoid 50% or more of the memory usage of a typical live method.
So too much dead code might lead to unloading of classes from the method area (heap), which could affect app performance. Am I right?
That is highly unlikely. A class will only be unloaded if nothing references it and its classloader is also unreachable. And if it does happen, the class would not be used again anyway, so it is right to unload it.
It could affect performance of your application.
Edit
One way to look at it: dead code is going to add some extra memory to your running application, so it will impact application performance as well.
I have been looking into the Java JIT compiler and I cannot figure out why some of the code is still interpreted. Why doesn't the JIT compiler translate everything to native code? Interpretation is much slower, am I missing something?
It's all a matter of tradeoffs
the time taken to compile + execute code can be longer than the time to interpret once
you can often optimise things much more efficiently if you have statistics on branching, etc
some things can't be compiled (anything that does RTTI, probably)
some things you don't want compiled (line numbers for stack traces, etc)
I'm sure there's others.
If you are running a JVM like HotSpot, it JIT-compiles opportunistically, only focusing on code that executes frequently. It determines which code to optimise on the fly by counting frequency of each code block (or method — I'm not sure which). Consequently, at startup time, everything is interpreted.
The intent behind this is to allow much more aggressive and expensive optimisations by only requiring a small fraction of the code to be optimised.
Two main reasons:
Interpretation is not slower if code is only run a few times. The cost of compilation alone can be much more expensive than interpreting the code if it is only run a few times.
While interpreting it is possible to gather statistics at runtime that are useful for optimising the code later. For example, you can count how many times a particular branch is taken and optimise the code to be faster for the more frequent case. This kind of trick can make JIT compilation better than ahead-of-time compilation (which doesn't have the opportunity to exploit the runtime statistics)
Hence the Java JIT takes a sensible strategy: don't compile until you observe that the same code is being run multiple times, at which point you have evidence that doing the compilation is probably worthwhile and you are able to make some additional optimisations.
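The "don't compile until the code has proven itself hot" strategy can be illustrated with a toy model. This is emphatically not how HotSpot is implemented; it is a sketch in which two ordinary lambdas stand in for the interpreted and compiled versions of a method, and a plain counter with an arbitrary threshold decides when to switch. All names are hypothetical.

```java
import java.util.function.IntUnaryOperator;

final class CountingDispatcherSketch {
    private final IntUnaryOperator interpreted; // stands in for interpreter execution
    private final IntUnaryOperator compiled;    // stands in for JIT-compiled code
    private final int threshold;                // arbitrary; real VMs use tuned heuristics
    private int invocations;

    CountingDispatcherSketch(IntUnaryOperator interpreted,
                             IntUnaryOperator compiled,
                             int threshold) {
        this.interpreted = interpreted;
        this.compiled = compiled;
        this.threshold = threshold;
    }

    // Uses the slow path until the method has been called `threshold` times,
    // then switches to the fast path for every later call.
    int call(int x) {
        if (invocations < threshold) {
            invocations++;
            return interpreted.applyAsInt(x);
        }
        return compiled.applyAsInt(x);
    }

    boolean isHot() {
        return invocations >= threshold;
    }

    public static void main(String[] args) {
        CountingDispatcherSketch square =
                new CountingDispatcherSketch(x -> x * x, x -> x * x, 3);
        for (int i = 1; i <= 5; i++) {
            int result = square.call(i);
            System.out.println("call " + i + " -> " + result + ", hot = " + square.isHot());
        }
    }
}
```

Code that is called fewer times than the threshold never pays the "compilation" cost at all, which is the tradeoff the answers above describe.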
I'm in the process of benchmarking an app I've written. I ran my app through the benchmark 10 times in a loop (to get 10 results instead of only 1). Each time, the first iteration seems to take some 50 - 100 milliseconds longer than the rest of the iterations.
Is this related to the JIT compiler and is there anything one could do to "reset" the state so that you would get the initial "lag" included with all iterations?
To benchmark a long-running application you should allow for an initialization (first) pass. That's because classes have to be loaded, code has to be generated, in web apps JSPs compile to servlets, etc. JIT of course also plays its role. Sometimes a pass could take longer if garbage collection occurs.
It is probably caused by the JIT kicking in, however you probably want to ignore the initial lag anyway. At least most benchmarks try to, because it heavily distorts the statistics.
You can't "uncompile" code that has been compiled but you can turn compiling off completely by using the -Xint command line switch.
The first pass will probably always be slower because of the JIT. I'd even expect to see differences when more runs are made because of possible incremental compilation or better branch prediction.
For benchmarking, follow the recommendations given in the other answers (except that I wouldn't turn off the JIT, because your app would be running with the JIT enabled in a production environment).
In any case use a profiler such as JVisualVM (included in JDK).
Is this related to the JIT compiler
Probably yes, though there are other potential sources of "lag":
Bootstrapping the JVM and creation of the initial classloader.
Reading and loading the application's classes, and the library classes that are used.
Initializing the classes.
JIT compilation.
Heap warmup effects; e.g. the overheads of having a heap that is initially too small. (This can result on the GC running more often than normal ... until the heap reaches a size that matches the application's peak working set size.)
Virtual memory warmup effects; e.g. the OS overheads incurred when the JVM grows the process address space and physical pages are allocated.
... and is there anything one could do to "reset" the state so that you would get the initial "lag" included with all iterations?
There is nothing you can do, apart from starting the JVM over again.
However, there are things that you can do to remove some of these sources of "lag"; e.g. turning off JIT compilation, using a large initial heap size, and running on an otherwise idle machine.
Also, the link that @Joachim contributed above is worth a thorough read.
There are certain structures you might have in your code, such as singletons which are initialized only once and consume system resources. If you're using a database connection pool for example, this might be the case. Moreover it is the time needed by Java classes to be initialized. For these reasons, I think you should discard that first value and keep only the rest.
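The advice to discard the first value can be sketched as a small timing loop. This is a hand-rolled illustration, not a substitute for a proper benchmarking harness; the names `timeRuns` and `meanExcludingFirst` and the dummy workload are invented for the example.

```java
import java.util.Arrays;

final class BenchmarkSketch {
    // Runs the workload `runs` times and records each run's wall-clock nanoseconds.
    static long[] timeRuns(Runnable workload, int runs) {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    // Averages all samples except the first, which carries class loading,
    // initialization and early JIT activity.
    static double meanExcludingFirst(long[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        return Arrays.stream(samples).skip(1).average().getAsDouble();
    }

    public static void main(String[] args) {
        // Dummy workload standing in for the app under test.
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i % 10);
            }
            if (sb.length() != 100_000) {
                throw new IllegalStateException("unexpected length");
            }
        };
        long[] samples = timeRuns(workload, 10);
        System.out.println("first run:    " + samples[0] + " ns");
        System.out.println("mean of rest: " + meanExcludingFirst(samples) + " ns");
    }
}
```

Printing the first run separately, rather than silently dropping it, keeps the start-up cost visible in case it matters for the application.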
I know that the JVM can do some pretty serious optimizations at runtime, especially in -server mode. Of course, it takes a little while for the JVM to settle down and reach peak performance. Is there any way to take a snapshot of those optimizations so they can be applied immediately the next time you run your app?
"Hey JVM! Great job optimizing my code. Could you write that down for me for later?"
Basically not yet with Sun's VM, but they have it in mind.
See various postings/comments under here:
http://blogs.oracle.com/fatcatair/category/Java
(Sorry: I can't find quite the right one about retaining stats over restart for immediate C1 compilation of known-hot-at-startup methods.)
But I don't know where all this stuff is right now.
Note that optimisations appropriate in steady-state may well not be appropriate at start-up and might indeed reduce start-up performance, and indeed two runs may not have the same hotspots...
Perhaps this might help: http://wikis.sun.com/display/HotSpotInternals/PrintAssembly.