Impact of changing JVM vendor on you application

Impact of changing JVM vendor on you application - java

If you want to change JVM(only vendor and stick to same version of JAVA) for your application, what aspects one should verify to make sure your application will perform optimally.
The differences between these JVMs which will have obvious impact on your application's performance & maintenance would be:
Memory management & garbage collection algorithms
Security & stability patches
So is it safe to assume that if your application performance tests results are good and if the security & stability patch support from the vendor is acceptable then you can go ahead with the change.
I am assuming that switching JVMs will not have any functional impact and impact will be only on performance of your application.

There is a difference between JVM and Java(and implicitly javac compiler).
When you say change vendor you probably mean both: re-compile the .java files with the new compiler and running those generated .class files on the new jvm vendor.
In this case the bytecode might differ (content of .class file). Generally you don't care about that, because your code still runs fine. But it's good to know that is might change and you can't rely on that.
The problem with a different jvm is far bigger as far as I can tell. Each vendor might do their optimizations on it and it might differ quite a lot. For example, your current jvm might inline methods after 2000 calls, while your new jvm might do that after 10000 calls. So JIT might differ. It might use different Garbage Collector implementation and even might slice your memory differently than Oracle Hotspot, so heap might differ.
You might have tunning parameters enabled now (-XX:SomeProp) that simply will not be understood by your new vendor. Tools provided for debugging problems might also be different: tools and start-up properties are to be be watched for.
Then there are bugs. Even if all compilers must comply with jls, they all have bugs (it's still software) and while some things would normally compile/run now, they might suddenly break due to the change that you are making.
I don't know if there is safe way to change this. Because it depends on what you actually define as safe. If all you care about is that code runs, tests also and performance tests are OK - then you are fine.
As a last note: we were involved in such a move from J9 to Oracle Hotspot. The change was not easy, mainly because J9 AOT Compiler, heap memory j9 != oracle, different GC (it was quite a while ago and I do not want to repeat that story).

Related

Why is it important that Java (and other JVM languages) is highly portable? [duplicate]

I've been thinking about it lately, and it seems to me that most advantages given to JIT compilation should more or less be attributed to the intermediate format instead, and that jitting in itself is not much of a good way to generate code.
So these are the main pro-JIT compilation arguments I usually hear:
Just-in-time compilation allows for greater portability. Isn't that attributable to the intermediate format? I mean, nothing keeps you from compiling your virtual bytecode into native bytecode once you've got it on your machine. Portability is an issue in the 'distribution' phase, not during the 'running' phase.
Okay, then what about generating code at runtime? Well, the same applies. Nothing keeps you from integrating a just-in-time compiler for a real just-in-time need into your native program.
But the runtime compiles it to native code just once anyways, and stores the resulting executable in some sort of cache somewhere on your hard drive. Yeah, sure. But it's optimized your program under time constraints, and it's not making it better from there on. See the next paragraph.
It's not like ahead-of-time compilation had no advantages either. Just-in-time compilation has time constraints: you can't keep the end user waiting forever while your program launches, so it has a tradeoff to do somewhere. Most of the time they just optimize less. A friend of mine had profiling evidence that inlining functions and unrolling loops "manually" (obfuscating source code in the process) had a positive impact on performance on his C# number-crunching program; doing the same on my side, with my C program filling the same task, yielded no positive results, and I believe this is due to the extensive transformations my compiler was allowed to make.
And yet we're surrounded by jitted programs. C# and Java are everywhere, Python scripts can compile to some sort of bytecode, and I'm sure a whole bunch of other programming languages do the same. There must be a good reason that I'm missing. So what makes just-in-time compilation so superior to ahead-of-time compilation?
EDIT To clear some confusion, maybe it would be important to state that I'm all for an intermediate representation of executables. This has a lot of advantages (and really, most arguments for just-in-time compilation are actually arguments for an intermediate representation). My question is about how they should be compiled to native code.
Most runtimes (or compilers for that matter) will prefer to either compile them just-in-time or ahead-of-time. As ahead-of-time compilation looks like a better alternative to me because the compiler has more time to perform optimizations, I'm wondering why Microsoft, Sun and all the others are going the other way around. I'm kind of dubious about profiling-related optimizations, as my experience with just-in-time compiled programs displayed poor basic optimizations.
I used an example with C code only because I needed an example of ahead-of-time compilation versus just-in-time compilation. The fact that C code wasn't emitted to an intermediate representation is irrelevant to the situation, as I just needed to show that ahead-of-time compilation can yield better immediate results.

Greater portability: The
deliverable (byte-code) stays
portable
At the same time, more platform-specific: Because the
JIT-compilation takes place on the
same system that the code runs, it
can be very, very fine-tuned for
that particular system. If you do
ahead-of-time compilation (and still
want to ship the same package to
everyone), you have to compromise.
Improvements in compiler technology can have an impact on
existing programs. A better C
compiler does not help you at all
with programs already deployed. A
better JIT-compiler will improve the
performance of existing programs.
The Java code you wrote ten years ago will run faster today.
Adapting to run-time metrics. A JIT-compiler can not only look at
the code and the target system, but
also at how the code is used. It can
instrument the running code, and
make decisions about how to optimize
according to, for example, what
values the method parameters usually
happen to have.
You are right that JIT adds to start-up cost, and so there is a time-constraint for it,
whereas ahead-of-time compilation can take all the time that it wants. This makes it
more appropriate for server-type applications, where start-up time is not so important
and a "warm-up phase" before the code gets really fast is acceptable.
I suppose it would be possible to store the result of a JIT compilation somewhere, so that it could be re-used the next time. That would give you "ahead-of-time" compilation for the second program run. Maybe the clever folks at Sun and Microsoft are of the opinion that a fresh JIT is already good enough and the extra complexity is not worth the trouble.

The ngen tool page spilled the beans (or at least provided a good comparison of native images versus JIT-compiled images). Executables that are compiled ahead-of-time typically have the following benefits:
Native images load faster because they don't have much startup activities, and require a static amount of fewer memory (the memory required by the JIT compiler);
Native images can share library code, while JIT-compiled images cannot.
Just-in-time compiled executables typically have the upper hand in these cases:
Native images are larger than their bytecode counterpart;
Native images must be regenerated whenever the original assembly or one of its dependencies is modified.
The need to regenerate an image that is ahead-of-time compiled every time one of its components is a huge disadvantage for native images. On the other hand, the fact that JIT-compiled images can't share library code can cause a serious memory hit. The operating system can load any native library at one physical location and share the immutable parts of it with every process that wants to use it, leading to significant memory savings, especially with system frameworks that virtually every program uses. (I imagine that this is somewhat offset by the fact that JIT-compiled programs only compile what they actually use.)
The general consideration of Microsoft on the matter is that large applications typically benefit from being compiled ahead-of-time, while small ones generally don't.

Simple logic tell us that compiling huge MS Office size program even from byte-codes will simply take too much time. You'll end up with huge starting time and that will scare anyone off your product. Sure, you can precompile during installation but this also has consequences.
Another reason is that not all parts of application will be used. JIT will compile only those parts that user care about, leaving potentially 80% of code untouched, saving time and memory.
And finally, JIT compilation can apply optimizations that normal compilators can't. Like inlining virtual methods or parts of the methods with trace trees. Which, in theory, can make them faster.

Better reflection support. This could be done in principle in an ahead-of-time compiled program, but it almost never seems to happen in practice.
Optimizations that can often only be figured out by observing the program dynamically. For example, inlining virtual functions, escape analysis to turn stack allocations into heap allocations, and lock coarsening.

Maybe it has to do with the modern approach to programming. You know, many years ago you would write your program on a sheet of paper, some other people would transform it into a stack of punched cards and feed into THE computer, and tomorrow morning you would get a crash dump on a roll of paper weighing half a pound. All that forced you to think a lot before writing the first line of code.
Those days are long gone. When using a scripting language such as PHP or JavaScript, you can test any change immediately. That's not the case with Java, though appservers give you hot deployment. So it is just very handy that Java programs can be compiled fast, as bytecode compilers are pretty straightforward.
But, there is no such thing as JIT-only languages. Ahead-of-time compilers have been available for Java for quite some time, and more recently Mono introduced it to CLR. In fact, MonoTouch is possible at all because of AOT compilation, as non-native apps are prohibited in Apple's app store.

I have been trying to understand this as well because I saw that Google is moving towards replacing their Dalvik Virtual Machine (essentially another Java Virtual Machine like HotSpot) with Android Run Time (ART), which is a AOT compiler, but Java usually uses HotSpot, which is a JIT compiler. Apparently, ARM is ~ 2x faster than Dalvik... so I thought to myself "why doesn't Java use AOT as well?".
Anyways, from what I can gather, the main difference is that JIT uses adaptive optimization during run time, which (for example) allows ONLY those parts of the bytecode that are being executed frequently to be compiled into native code; whereas AOT compiles the entire source code into native code, and code of a lesser amount runs faster than code of a greater amount.
I have to imagine that most Android apps are composed of a small amount of code, so on average it makes more sense to compile the entire source code to native code AOT and avoid the overhead associated from interpretation / optimization.

It seems that this idea has been implemented in Dart language:
https://hackernoon.com/why-flutter-uses-dart-dd635a054ebf
JIT compilation is used during development, using a compiler that is especially fast. Then, when an app is ready for release, it is compiled AOT. Consequently, with the help of advanced tooling and compilers, Dart can deliver the best of both worlds: extremely fast development cycles, and fast execution and startup times.

One advantage of JIT which I don't see listed here is the ability to inline/optimize across separate assemblies/dlls/jars (for simplicity I'm just going to use "assemblies" from here on out).
If your application references assemblies which might change after install (e. g. pre-installed libraries, framework libraries, plugins), then a "compile-on-install" model must refrain from inlining methods across assembly boundaries. Otherwise, when the referenced assembly is updated we would have to find all such inlined bits of code in referencing assemblies on the system and replace them with the updated code.
In a JIT model, we can freely inline across assemblies because we only care about generating valid machine code for a single run during which the underlying code isn't changing.

The difference between platform-browser-dynamic and platform-browser is the way your angular app will be compiled.
Using the dynamic platform makes angular sending the Just-in-Time compiler to the front-end as well as your application. Which means your application is being compiled on client-side.
On the other hand, using platform-browser leads to an Ahead-of-Time pre-compiled version of your application being sent to the browser. Which usually means a significantly smaller package being sent to the browser.
The angular2-documentation for bootstrapping at https://angular.io/docs/ts/latest/guide/ngmodule.html#!#bootstrap explains it in more detail.

Why is Java's debugging Hot Swap limited to intra-method changes?

I have gone through hot deployment tutorial and it works.
But i have questions about the limitations(point 3) i.e
Hot deploy has supported the code changes in the method implementation only. If you add a new class or a new method, restart is still required.
Basically why we don't need server restart if i make changes in existing method but required in case of adding method or class.
My understanding how it works :- When i make the changes in existing method or introduced a new method, Eclipse will place the file the at right location
under webserver. If class has been already loaded by classloader in perm gen space, it will unload it from permgen space and load the new the one internally without server restart so that new changes(byte code) is reflected . Is that correct ?
If yes why hot deployment does not work for new methods and new class files ?

The reasoning is quite complicated and really only fully known to people with intimate knowledge of the JVM and how it manages memory. There is a decent explanation: Java HotSwap Guide section titled Why is HotSwap limited to method bodies? (although it's really an advertisement for the JRebel product).
The gist: there are two primary factors that prevent HotSwap from handling structural changes to classes: JIT and memory allocation.
The JIT (Just In Time) compiler in the JVM optimizes the bytecode after classes have been loaded and run a few times, basically inlining many calls for increased performance. Implementing that feature safely and effectively in an environment where class signatures and structure can change would be a significant challenge.
Other problems surround what would happen regarding memory management if class structures were allowed to change. The JVM would have to modify existing instances of classes, which would mean relocating them to other parts of the heap storage. Not to mention having to relocate the class objects themselves. The JVM's memory management is already incredibly complex and highly optimized; such changes would only increase the complexity and potentially reduce performance of the JIT compiler (and likely lead to additional bugs).
I think it's safe to assume that the JVM engineers have not been willing to take the performance and bug footprint tradeoffs that would be required to support this feature. Which is why products like JRebel and others have come to exist.

As a side note, the specification itself is not limited.
It just happens some of the available implementations, including the ubiquitous Reference Implementation, are limited.
After you connect to a remote VM, you can check whether it allows to add methods or redefine classes.

You can if you run your java on a smalltalk vm. Smalltalk has been doing this basically forever, and it is one of the reasons why Smalltalkers tend to do debugger driven development as a superior form of test driven development. Smalltalk vms do the required clean-up of memory data structures. In Eliot Miranda's Spur (for Squeak, Pharo and Cuis) and Gemstone that is done lazily, but otherwise you might have to wait for all objects to be migrated. The reference implementation java vm probably has more optimizations than any smalltalk vm you could run java on a.t.m.

The answer provided by E-Riz already has a good explanation of the reasons why the standard Java HotSwap technology only supports the modifications to existing methods and not addition of new class or methods to classes.
However, as has been described in a related SO discussion the level of hot swapping you achieve is dependent on the tool chain you use. So, if you end up adding JRebel plug-in you would be able to perform hot swapping even when new methods and classes have been added.
There is another project :Hot Swap Agent - this is typically a java agent that can be used to run your Java container and you can activate it using a couple of command line parameters (as mentioned in the quickstart).

Why operating systems are not written in java?

All the operating systems till date have been written in C/C++ while there is none in Java. There are tonnes of Java applications but not an OS. Why?

Because we have operating systems already, mainly. Java isn't designed to run on bare metal, but that's not as big of a hurdle as it might seem at first. As C compilers provide intrinsic functions that compile to specific instructions, a Java compiler (or JIT, the distinction isn't meaningful in this context) could do the same thing. Handling the interaction of GC and the memory manager would be somewhat tricky also. But it could be done. The result is a kernel that's 95% Java and ready to run jars. What's next?
Now it's time to write an operating system. Device drivers, a filesystem, a network stack, all the other components that make it possible to do things with a computer. The Java standard library normally leans heavily on system calls to do the heavy lifting, both because it has to and because running a computer is a pain in the ass. Writing a file, for example, involves the following layers (at least, I'm not an OS guy so I've surely missed stuff):
The filesystem, which has to find space for the file, update its directory structure, handle journaling, and finally decide what disk blocks need to be written and in what order.
The block layer, which has to schedule concurrent writes and reads to maximize throughput while maximizing fairness.
The device driver, which has to keep the device happy and poke it in the right places to make things happen. And of course every device is broken in its own special way, requiring its own driver.
And all this has to work fine and remain performant with a dozen threads accessing the disk, because a disk is essentially an enormous pile of shared mutable state.
At the end, you've got Linux, except it doesn't work as well because it doesn't have near as much effort invested into functionality and performance, and it only runs Java. Possibly you gain performance from having a single address space and no kernel/userspace distinction, but the gain isn't worth the effort involved.
There is one place where a language-specific OS makes sense: VMs. Let the underlying OS handle the hard parts of running a computer, and the tenant OS handles turning a VM into an execution environment. BareMetal and MirageOS follow this model. Why would you bother doing this instead of using Docker? That's a good question.

Indeed there is a JavaOS http://en.wikipedia.org/wiki/JavaOS
And here is discuss about why there is not many OS written in java Is it possible to make an operating system using java?
In short, Java need to run on JVM. JVM need to run on an OS. writing an OS using Java is not a good choice.
OS needs to deal with hardware which is not doable using java (except using JNI). And that is because JVM only provided limited commands which can be used in Java. These command including add, call a method and so on. But deal with hardware need command to operate reg, memory, CPU, hardware drivers directly. These are not supported directly in JVM so JNI is needed. That is back to the start - it is still needed to write an OS using C/assembly.
Hope this helps.

One of the main benefits of using Java is that abstracts away a lot of low level details that you usually don't really need to care about. It's those details which are required when you build an OS. So while you could work around this to write an OS in Java, it would have a lot of limitations, and you'd spend a lot of time fighting with the language and its initial design principles.

For operating systems you need to work really low-level. And that is a pain in Java. You do need e.g. unsigned data types, and Java only has signed data types. You need struct objects that have exactly the memory alignment the driver expects (and no object header like Java adds to every object).
Even key components of Java itself are no longer written in Java.
And this is -by no means- a temporary thing. More and more does get rewritten in native code to get better performance. The HotSpot VM adds "intrinsics" for performance critical native code, and there is work underway to reduce the overall cost of native calls.
For example JavaFX: The reason why it is much faster than AWT/Swing ever were is because it contains/uses a huge amount of native code. It relies on native code for rendering, and e.g. if you add the "webview" browser component it is actually using the webkit C library to provide the browser.
There is a number of things Java does really well. It is a nicely structured language with a fantastic toolchain. Python is much more compact to write, but its toolchain is a mess, e.g. refactoring tools are disappointing. And where Java shines is at optimizing polymorphism at run-time. Where C++ compilers would need to do expensive virtual calls - because at compile time it is not known which implementation will be used - there Hotspot can aggressively inline code to get better performance. But for operating systems, you do not need this much. You can afford to manually optimize call sites and inlining.

This answer does not mean to be exhaustive in any way, but I'd like to share my thoughts on the (very vast) topic.
Although it is theoretically possible to write some OS in pure java, there are practical matters that make this task really difficult. The main problem is that there is no (currently up to date and reliable) java compiler able to compile java to byte code. So there is no existing tool to make writing a whole OS from the ground up feasible in java, at least as far as my knowledge goes.
Java was designed to run in some implementation of the java virtual machine. There exist implementations for Windows, Mac, Linux, Android, etc. The design of the language is strongly based on the assumption that the JVM exists and will do some magic for you at runtime (think garbage collection, JIT compiler, reflection, etc.). This is most likely part of the reason why such a compiler does not exist: where would all these functionality go? Compiled down to byte code? It's possible but at this point I believe it would be difficult to do. Even Android, whose SDK is purely java based, runs Dalvik (a version of the JVM that supports a subset of the language) on a Linux Kernel.

Why is java bytecode interpreted?

As far as I understand Java compiles to Java bytecode, which can then be interpreted by any machine running Java for its specific CPU. Java uses JIT to interpret the bytecode, and I know it's gotten really fast at doing so, but why doesn't/didn't the language designers just statically compile down to machine instructions once it detects the particular machine it's running on? Is the bytecode interpreted every single pass through the code?

The original design was in the premise of "compile once run anywhere". So every implementer of the virtual machine can run the bytecodes generated by a compiler.
In the book Masterminds for Programming, James Gosling explained:
James: Exactly. These days we’re
beating the really good C and C++
compilers pretty much always. When you
go to the dynamic compiler, you get
two advantages when the compiler’s
running right at the last moment. One
is you know exactly what chipset
you’re running on. So many times when
people are compiling a piece of C
code, they have to compile it to run
on kind of the generic x86
architecture. Almost none of the
binaries you get are particularly well
tuned for any of them. You download
the latest copy of Mozilla,and it’ll
run on pretty much any Intel
architecture CPU. There’s pretty much
one Linux binary. It’s pretty generic,
and it’s compiled with GCC, which is
not a very good C compiler.
When HotSpot runs, it knows exactly
what chipset you’re running on. It
knows exactly how the cache works. It
knows exactly how the memory hierarchy
works. It knows exactly how all the
pipeline interlocks work in the CPU.
It knows what instruction set
extensions this chip has got. It
optimizes for precisely what machine
you’re on. Then the other half of it
is that it actually sees the
application as it’s running. It’s able
to have statistics that know which
things are important. It’s able to
inline things that a C compiler could
never do. The kind of stuff that gets
inlined in the Java world is pretty
amazing. Then you tack onto that the
way the storage management works with
the modern garbage collectors. With a
modern garbage collector, storage
allocation is extremely fast.

Java is commonly compiled to machine instructions; that's what just-in-time (JIT) compilation is. But Sun's Java implementation by default only does that for code that is run often enough (so startup and shutdown bytecode, that is executed only once, is still interpreted to prevent JIT overhead).

Bytecode interpretation is usually "fast enough" for a lot of cases. Compiling, on the other hand, is rather expensive. If 90% of the runtime is spent in 1% of the code it's far better to just compile that 1% and leave the other 99% alone.

Static compiling can blow up on you because all the other libraries you use also need to be write-once run everywhere (i.e. byte-code), including all of their dependencies. This can lead to a chain of compilations following dependencies that can blow up on you. Compiling only the code as (while running) the runtime discovers it actually needs that section of code compiled is the general idea I think. There may be many code paths you don't actually follow, especially when libraries come into question.

Java Bytecode is interpreted because bytecodes are portable across various platforms.JVM, which is platform dependent,converts and executes bytecodes to specific instruction set of that machine whether it may be a Windows or LINUX or MAC etc...

One important difference of dynamic compiling is that it optimises the code base don how it is run. There is an option -XX:CompileThreshold= which is 10000 by default. You can decrease this so it optimises the code sooner, but if you run a complex application or benchmark, you can find that reducing this number can result in slower code. If you run a simple benchmark, you may not find it makes any difference.
One example where dynamic compiling has an advantage over static compiling is inlining "virtual" methods, esp those which can be replaced. For example, the JVM can inline up to two heavily used "virtual" methods, which may be in a separate jar compiled after the caller was compiled. The called jar(s) can even be removed from the running system e.g. OSGi and have another jar added or replace it. The replacement JAR's methods can then be inlined. This can only be achieved with dynamic compiling.

What can you not do on the Dalvik VM (Android's VM) that you can in Sun VM?

I know that you can run almost all Java in Dalvik's VM that you can in Java's VM but the limitations are not very clear. Has anyone run into any major stumbling blocks? Any major libraries having trouble? Any languages that compile to Java byte code (Scala, Jython etc...) not work as expected?

There is a number of things that Dalvik will not handle or will not handle quite the same way as standard Java bytecode, though most of them are quite advanced.
The most severe example is runtime bytecode generation and custom class loading. Let's say you would like to create some bytecode and then use classloader to load it for you, if that trick works on your normal machine, it is guaranteed to not work on Dalvik, unless you change your bytecode generation.
That prevents you from using certain dependency injection frameworks, most known example being Google Guice (though I am sure some people work on that). On the other hand AspectJ should work as it uses bytecode instrumentation as a compilation step (though I don't know if anyone tried).
As to other jvm languages -- anything that in the end compiles to standard bytecode and does not use bytecode instrumentation at runtime can be converted to Dalvik and should work. I know people did run Jython on Android and it worked ok.
Other thing to be aware of is that there is no just in time compilation. This is not strictly Dalviks problem (you can always compile any bytecode on the fly if you wish) but that Android does not support that and is unlikely to do so. In the effect while microbenchmarking for standard Java was useless -- components had different runtime characterstics in tests than as parts of larger systems -- microbenchmarks for Android phones totally make sense.

If you see "Dalvik Virtual Machine internals" Google IO session, you can find Dalvik does not support generational GC.
So, it could degrade performance of frequent object creation and deletion. Java VM supports generational GC so, it would show better GC performance for the same situation.
And also, Dalvik uses trace-granuality JIT instead of method granuality JIT.

Another thing that I guess could be added here is that Dalvik apparently does not preserve field order when listing the fields of a class using the reflection API. Now, the reflection API does not make any guarantees on it anyway (so ideally you shouldn't depend on it anyway), but most of the other VMs out there do preserve the order.

Just to add to the conversation, not intended to revive an old thread. I just ran across this in my search, and want to add that Jython does not work out of the box with Dalvik either. Simply trying to do a hello world example will yield the following:

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.