How to set up a development environment for the Java HotSpot VM? - java

What is the best way to understand the Java HotSpot VM? And if I want to make modifications to the source code and add my own features, what would be the best development environment (does ctags work well with the large code base, or do I need a full-blown IDE)?

I doubt that you would want to dive into the HotSpot code-base... I'm copying parts of my answer from this question:
Which JVM to choose for GC hacking?
I think the Maxine Research VM from Oracle Labs would be a good starting point. Here's a quote from the first page of their wiki:
Project Overview
In this era of modern, managed languages we demand ever more from our virtual machines: better performance, more scalability, and support for the latest new languages. Research and experimentation is essential but no longer practical in the context of mature, complex, production VMs written in multiple languages.
The Maxine VM is a next generation platform that establishes a new standard of productivity in this area of research. It is written entirely in Java, completely compatible with modern Java IDEs and the standard JDK, features a modular architecture that permits alternate implementations of subsystems such as GC and compilation to be plugged in, and is accompanied by a dedicated development tool (the Maxine Inspector) for debugging and visualizing nearly every aspect of the VM's runtime state.
Here's an excellent video demonstrating its memory monitoring utilities:
Introduction to the Maxine Inspector

Related

What are the differences between JVisualVM and Java Mission Control?

Other than the more 'advanced' GUI from Java Mission Control, how are they different?
At first glance they seem to offer very similar functionality (Interpreting JMX data and Memory/CPU profiling).
However, as they are both shipped with the JDK (I'm using JDK 1.7.0_51 SE) I'm assuming there are significant differences, otherwise they would be combined into a single solution. Especially as this increases the size of the JDK significantly.
Is Java Mission Control ultimately going to replace JVisualVM in the future?
One important point is that Mission Control is potentially not free to use on production environments. It is free for applications running in DEV & QA and Oracle are not currently enforcing the charges for production applications (as of Nov 2014). However, their executives have made it clear this may change in time.
The JMX Console part of Java Mission Control is just like any other JMX console. I'm of course biased, but in my opinion it's one of the more feature rich consoles available. The more unique part of JMC is the Java Flight Recorder part.
JMC is targeting production systems, and is very careful to avoid introducing unnecessary overhead. With the Java Flight Recorder you can do production time profiling and diagnostics with an almost unmeasurable overhead.
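For readers who haven't used a JMX console before: under the hood it connects to the target JVM's MBean server and reads attributes. Here is a minimal sketch in Java of that mechanism, assuming the target JVM was started with remote JMX enabled on port 9010 (host and port are placeholders):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class JmxHeapProbe {
        public static void main(String[] args) throws Exception {
            // Placeholder address; the target JVM must expose remote JMX,
            // e.g. via -Dcom.sun.management.jmxremote.port=9010
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                // Read the standard memory MBean, the same data a console charts.
                Object heap = conn.getAttribute(
                        new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
                System.out.println("HeapMemoryUsage = " + heap);
            } finally {
                connector.close();
            }
        }
    }

A console like JMC does essentially this continuously, for many MBeans at once, and charts the results.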

Which JVM to choose for GC hacking?

I have a design for a GC algorithm that I would like to implement for a JVM, to allow benchmarking.
Does anyone have any experience as to which implementation would allow the easy hacking, but which still has a built in GC that would make for a meaningful comparison?
Edited: I want a JVM that has garbage collection, as I want to collect stats using it, then rip out its GC, put my own in, and then compare. I want it to have a good GC, as otherwise the comparison is meaningless, but I want something with code that is not too difficult to work with (HotSpot has a lot of assembler, making the task more difficult).
I think that the Maxine Research VM from Oracle Labs would be a perfect match for your needs.
Quote from the first page of their wiki:
Project Overview
In this era of modern, managed languages we demand ever more from our virtual machines: better performance, more scalability, and support for the latest new languages. Research and experimentation is essential but no longer practical in the context of mature, complex, production VMs written in multiple languages.
The Maxine VM is a next generation platform that establishes a new standard of productivity in this area of research. It is written entirely in Java, completely compatible with modern Java IDEs and the standard JDK, features a modular architecture that permits alternate implementations of subsystems such as GC and compilation to be plugged in, and is accompanied by a dedicated development tool (the Maxine Inspector) for debugging and visualizing nearly every aspect of the VM's runtime state.
Here's an excellent video demonstrating its memory monitoring utilities:
Introduction to the Maxine Inspector
I'm not aware of any that don't have a built-in GC; not much of a Java without one. Why not start with OpenJDK or Harmony?
Maybe you don't need a JVM; another virtual machine might be sufficient for testing your algorithm. Unless you are obliged to use a JVM, you could use Apache Harmony, or I would recommend another VM that came out of a PhD thesis, called VMKit. You can take a look at it and browse the source.

Is there an advantage to running JRuby if you don't know any Java?

I've heard great things about JRuby and I know you can run it without knowing any Java. My development skills are strong; Java is just not one of the tools I know. It's a massive platform with a myriad of accompanying tools such as Maven/Ant/JUnit etc.
Is it worth moving my current Rails applications to JRuby for performance reasons alone? Perhaps if I pick up some basic Java alongside, there could be some added benefits that aren't obvious, such as better debugging/performance optimization tools?
Would love some advice on this one.
I think you pretty much nailed it.
JRuby is just yet another Ruby execution engine, just like MRI, YARV, IronRuby, Rubinius, MacRuby, MagLev, SmallRuby, Ruby.NET, XRuby, RubyGoLightly, tinyrb, HotRuby, BlueRuby, Red Sun and all the others.
The main differences are:
portability: for example, YARV is only officially supported on 32-bit x86 Linux. It is not supported on OS X or Windows or 64-bit Linux. Rubinius only works on Unix, not on Windows. JRuby OTOH runs everywhere: desktops, servers, phones, App Engine, you name it. It runs on the Oracle JDK, OpenJDK, IBM J9, Apple SoyLatte, RedHat IcedTea and Oracle JRockit JVMs (and probably a couple of others I forgot about) and also on the Dalvik VM. It runs on Windows, Linux, OS X, Solaris, several BSDs, other proprietary and open Unices, OpenVMS and several mainframe OSs, Android and Google App Engine. In fact, on Windows, JRuby passes more RubySpec tests than "Ruby" (meaning MRI or YARV) itself!
extensibility: Ruby programs running on JRuby can use any arbitrary Java library. Through JRuby-FFI, they can also use any arbitrary C library. And with the new C extension support in JRuby 1.6, they can even use a large subset of MRI and YARV C extensions, like Mongrel for example. (Note that a "Java" or "C" library does not actually mean written in those languages, only with a Java or C API; they could be written in Scala or Clojure or C++ or Haskell.) A sketch of the embedding direction follows after this list.
tooling: whenever someone writes a new tool for YARV or MRI (like e.g. memprof), it turns out that JRuby already had a tool 5 years ago which does the same thing, only better. The Java ecosystem has some of the best tools for "runtime behavior comprehension" (which is a term I just made up, by which I mean much more than just simple profiling, I mean tools for deeply understanding what exactly your program does at runtime, what its performance characteristics are, where the bottlenecks are, where the memory is going, and most importantly why all of that is happening) and visualization available on the market, and pretty much all of those work with JRuby, at least to some extent.
deployment: assuming that your target system already has a JVM installed, deploying a JRuby app (and I'm not just talking about Rails, I also mean desktop, mobile, other kinds of servers) is literally just copying one JAR (or WAR) and a double-click.
performance: JRuby has much higher startup overhead. In return you get much higher throughput. In practice, this means that deploying a Rails app to JRuby is a good idea, as is running your integration tests, but for developer unit tests and scripts, MRI, YARV or Rubinius are better choices. Note that many Rails developers simply develop and unit test on MRI and integration test and deploy on JRuby. There's no need to choose a single execution engine for everything.
concurrency: JRuby runs Ruby threads concurrently. This means two things: if your locking is correct, your program will run faster, and if your locking is incorrect, your program will break. (Unfortunately, neither MRI nor YARV nor Rubinius run threads concurrently, so there's still some broken multithreaded Ruby code out there that doesn't know it's broken, because obviously concurrency bugs can only show up if there's actual concurrency.)
platforms (this is somewhat related to portability): there are some amazing Java platforms out there, e.g. the Azul JCA with 768 GiBytes of RAM and 864 CPU cores specifically designed for memory-safe, pointer-safe, garbage-collected, object-oriented languages. Android. Google App Engine. All of those run JRuby.
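To make the extensibility point concrete, the integration also works in the opposite direction: a Java program can host JRuby through its embedding API. A minimal sketch, assuming the jruby jar is on the classpath:

    import org.jruby.embed.ScriptingContainer;

    public class EmbedRuby {
        public static void main(String[] args) {
            // Runs a Ruby snippet inside this JVM; the snippet can in turn
            // call back into any Java class on the classpath.
            ScriptingContainer ruby = new ScriptingContainer();
            ruby.put("greeting", "hello from Java");   // share a variable with Ruby
            ruby.runScriptlet("puts greeting.upcase"); // prints HELLO FROM JAVA
        }
    }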
I would modify what Peter said slightly. JRuby may use more memory compared to standard Ruby, but that's usually because you're doing in a single process the work that would take several processes with standard Ruby.
You should try the Rails.threadsafe! option with a single JRuby runtime (for example, the Trinidad gem with the --threadsafe option). We've heard several stories where it gives you great performance and low memory usage, while leveraging multiple CPU cores with a single process.
JRuby is one of the few implementations that uses native threads. So if you care to do some multithreading, go for it.
As far as hosting is concerned, you have to put your app in some sort of java container, which I personally find to be far less straightforward than using something like passenger (for Rack apps)
I use JRuby for an app as we communicate over JMS and it works fine, but if I wasn't using any Java I would certainly stick to CRuby. My biggest beef is that in testing, running tests takes forever with JRuby as you have to spin up a VM each time you run them. This makes it a lot harder to TDD as it's a significant hit on your testing time.
JRuby has advantages if you're on Windows. It supports 64-bit, and you can use a lot of proprietary databases with standard JDBC drivers.
The latest releases are significantly faster than Ruby but also use significantly more memory. If performance is your only reason for using JRuby, I wouldn't bother unless you have a specific performance need that it solves, simply because, while it is pretty popular, it is less standard for hosting and fewer people use it compared to standard Ruby. That being said, there are many other reasons to use JRuby, such as a need for interoperability with existing Java code, or the need to deploy in environments where Java has been "blessed" by the operations department and Ruby has not.

What are Scala's future platform concerns people should be prepared for?

At the moment Scala runs only on the JVM, with an outdated implementation for the CLR.
But there are some voices at the moment that Microsoft is interested in funding an up-to-date Scala port for .NET.
Considering the lack of any plan or oversight on Oracle's side as to what to do with Java/the JVM/the ecosystem, how can a Scala developer be prepared for the possibility that in the end there might be no decent platform left to run Scala on?
Are there any plans to have some "independent" implementation of a Scala VM in the future, which maps Scala's features to some bytecode/VM, instead of having to live with all these legacy bugs in current VM implementations (no generics, covariant arrays, weird annotations, no tail calls etc.)?
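(To illustrate one of those legacy quirks: JVM arrays are covariant, so a type error that generics would reject at compile time only surfaces at run time.)

    public class CovariantArrays {
        public static void main(String[] args) {
            Object[] objs = new String[1]; // legal: String[] is a subtype of Object[]
            objs[0] = Integer.valueOf(42); // compiles, but throws ArrayStoreException
        }
    }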
Here's another view regarding the VM:
While not really Sun's brightest moment if you look at the whole picture, slapping the GPL license on the JDK and related things has actually caused this wonderful situation where the whole JVM platform is completely independent from Oracle. I mean, the virtual machine isn't tied to Java, the garbage collectors aren't tied to Java and, most importantly, the Java programmers aren't really tied to Java and thus Oracle.
As a Java programmer, I'd say we won - if Oracle decides to deprecate everything in Java world in hopes of bigger profits, we can just grab the VM and a modern language such as Scala and let Larry Ellison sail to sunset in his yacht for all we care.
The current implementation of Scala is very much focused on the JVM. Much in the Scala library depends on classes in the Java standard library and Java classes are also exposed to user programs.
If there are going to be Scala implementations on other platforms such as the CLR or LLVM, then programs written for the current Java-oriented Scala implementation will not be automatically compatible with those other implementations (unless those implementations go to great lengths to support the classes available in Java).
I agree with Randall that the JVM is not going to disappear anytime soon; it's probably the most successful and widespread virtual machine platform, deployed on billions of devices, from smartcards and handheld devices to the biggest servers. In fact, the Java programming language might disappear much sooner than the JVM itself. There is no reason to fear for the disappearance of the JVM in the foreseeable future.
And even in the unlikely case that it does - does it really matter? You'd still be able to program in your favorite programming language Scala, on one of the other platforms.
I wouldn't worry too much about the death of the JVM due to Oracle mismanagement, just as Esko said.
As of now, I do worry about the JVM in another way: the JVM was not constructed as a platform for multiple languages. Most languages running on the JVM use dynamic typing, and are in a way freed from the complexity of compiling to bytecode.
Scala compiles to bytecode, and was constructed with the JVM in mind by the man (Odersky) who wrote the Java compiler (1.1-1.4). Scala is the only language written by someone with intimate knowledge of the JVM, and we do not really know how hard it was for him to do it.
I worry that the JVM eventually will dwindle in popularity due to the fact that it is not a multi-language platform to begin with.

Performance Cost of Profiling a Web-Application in Production

I am attempting to solve performance issues with a large and complex Tomcat Java web application. The biggest issue at the moment is that, from time to time, the memory usage spikes and the application becomes unresponsive. I've fixed everything I can fix with log profilers and Bayesian analysis of the log files. I'm considering running a profiler on the production Tomcat server.
A Note to the Reader with Gentle Sensitivities:
I understand that some may find the very notion of profiling a production app offensive. Please be assured that I have exhausted most of the other options. The reason I am considering this is that I do not have the resources to completely duplicate our production setup on my test server, and I have been unable to cause the failures of interest on my test server.
Questions:
I am looking for answers which either work for a Java web application running on Tomcat, or answer this question in a language-agnostic way.
What are the performance costs of profiling?
Any other reasons why it is a bad idea to remotely connect and profile a web application in production (strange failure modes, security issues, etc)?
How much does profiling affect the memory footprint?
Specifically, are there Java profiling tools that have very low performance costs?
Any Java profiling tools designed for profiling web applications?
Does anyone have benchmarks on the performance costs of profiling with VisualVM?
What size applications and datasets can VisualVM scale to?
OProfile and its ancestor DCPI were developed for profiling production systems. The overhead for these is very low, and they profile your full system, including the kernel, so you can find performance problems in the VM and in the kernel and libraries.
To answer your questions:
Overhead: These are sampling profilers; that is, they generate timer or performance-counter interrupts at some regular interval and take a look at what code is currently executing. They use that to build a histogram of where you spend your time, and the overhead is very low (1-8% is what they claim) for reasonable sampling intervals. (A toy sketch at the end of this answer shows the sampling idea in code.)
Take a look at this graph of sampling frequency vs. overhead for OProfile. You can tune the sampling frequency for lower overhead if the defaults are not to your liking.
Usage in production: The only caveat to using OProfile is that you'll need to install it on your production machine. I believe there's kernel support in Red Hat since RHEL3, and I'm pretty sure other distributions support it.
Memory: I'm not sure what the exact memory footprint of OProfile is, but I believe it keeps relatively small buffers around and dumps them to log files occasionally.
Java: OProfile includes profiling agents that support Java and that are aware of code running in JITs. So you'll be able to see Java calls, not just the C calls in the interpreter and JIT.
Web Apps: OProfile is a system-level profiler, so it's not aware of things like sessions, transactions, etc. that a web app would have.
That said, it is a full-system profiler, so if your performance problem is caused by bad interactions between the OS and the JIT, or if it's in some third-party library, you'll be able to see that, because OProfile profiles the kernel and libraries. This is an advantage for production systems, as you can catch problems that are due to misconfigurations or particulars of the production environment that might not exist in your test environment.
VisualVM: Not sure about this one, as I have no experience with VisualVM
Here's a tutorial on using OProfile to find performance bottlenecks.
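OProfile itself is a native, out-of-process tool, but the sampling idea is easy to see in code. Here is a toy in-process sketch in Java (illustrative only; a real sampler works outside the target process, at a much lower level, precisely to keep overhead down):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Every 10 ms, record the top stack frame of each live thread.
    // Hot methods accumulate the most samples, yielding a rough time histogram.
    public class MiniSampler {
        private final Map<String, Integer> histogram = new ConcurrentHashMap<>();
        private final ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();

        public void start() {
            timer.scheduleAtFixedRate(() -> {
                for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
                    if (stack.length > 0) {
                        histogram.merge(stack[0].toString(), 1, Integer::sum);
                    }
                }
            }, 0, 10, TimeUnit.MILLISECONDS);
        }

        public Map<String, Integer> snapshot() {
            return new ConcurrentHashMap<>(histogram); // copy for safe reading
        }
    }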
I've used YourKit to profile apps in a high-load production environment, and while there was certainly an impact, it was easily an acceptable one. YourKit makes a big deal of being able to do this in a non-invasive manner, such as selectively turning off certain profiling features that are more expensive (it's a sliding scale, really).
My favourite aspect of it is that you can run the VM with the YourKit agent attached and it has zero performance impact; it's only when you connect the GUI and start profiling that it has an effect.
There is nothing wrong with profiling production apps. If you work on distributed applications, there are times when an OutOfMemoryError occurs in a very low-probability scenario that is very difficult to reproduce in a dev/stage/UAT environment.
You can try using custom profilers, but if you are in a hurry and plugging in/setting up a profiler on a production box would take time, you can also use the JVM itself to take a memory dump (the JVM's memory dump also gives you a thread dump).
You can activate automatic generation on the JVM command line by using the following option:
-XX:+HeapDumpOnOutOfMemoryError
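If you would rather trigger a dump on demand instead of waiting for an OutOfMemoryError, HotSpot also exposes the same facility programmatically through the HotSpotDiagnosticMXBean. A minimal sketch (the output path is a placeholder):

    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    public class HeapDumper {
        public static void main(String[] args) throws Exception {
            HotSpotDiagnosticMXBean diag = ManagementFactory.getPlatformMXBean(
                    HotSpotDiagnosticMXBean.class);
            // live = true dumps only reachable objects, which keeps the file smaller.
            diag.dumpHeap("/tmp/app-heap.hprof", true);
        }
    }

The jmap tool that ships with the JDK can do the same from outside the process.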
The Eclipse Memory Analyzer project has a very powerful feature called "group by value", which makes it possible to build an object query and regroup the instances by a field value. This is useful when you have a lot of instances containing a smaller set of possible values, and you want to see which values are used the most. This has really helped me understand some complex memory dumps, so I recommend you try it out.
You may also consider using the modern HotSpot JVM's Java Flight Recorder together with Java Mission Control. It is a set of tools that allow you to collect low-level runtime information with CPU overhead of about 5% (I cannot prove that figure myself; it is the claim of the Oracle engineer who presented the feature in a live demo).
You can use this tool as long as your application is running on a 1.7u40 JVM or higher. To enable the runtime info collection, you need to start the JVM with particular flags:
By default, JFR is disabled in the JVM. To enable JFR, you must launch your Java application with the -XX:+FlightRecorder option. Because JFR is a commercial feature, available only in the commercial packages based on Java Platform, Standard Edition (Oracle Java SE Advanced and Oracle Java SE Suite), you also have to enable commercial features using the -XX:+UnlockCommercialFeatures options.
(Quoted http://docs.oracle.com/javase/8/docs/technotes/guides/jfr/about.html#sthref7)
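Putting those flags together, a launch line might look like this (Oracle JDK 7u40 or later; the jar name, recording duration and file name are illustrative):

    java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder \
         -XX:StartFlightRecording=duration=60s,filename=myapp.jfr \
         -jar myapp.jar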
I added this answer as this is a viable option for profiling in production, IMO.
There is also an Eclipse plugin that supports JFR and JMC and is capable of displaying the information in a user-friendly way.
The tools have improved vastly over the years. These days, most people who have needs like these use a tool that hooks into Java's instrumentation API instead of the profiling API. Surely there are more examples, but NewRelic and AppDynamics come to mind. Instrumentation-based solutions usually run as an agent in the JVM and constantly collect data. They report the data at a higher level (business transaction, web transaction, database transaction) than the old profiling approach and allow you to dig deeper (down to the method or line) if necessary. You can even set up monitoring and alerts, so you can track/alert on metrics like page load times and performance against SLAs. With these great tools, you really should have no reason to run a profiler in production any longer. The cost of running them is negligible.
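For a sense of what "hooks into Java's instrumentation API" means, here is a minimal agent skeleton (the class and jar names are hypothetical; a real APM agent would rewrite the class bytes rather than returning null):

    import java.lang.instrument.Instrumentation;

    // Packaged in a jar whose manifest declares "Premain-Class: TimingAgent",
    // then loaded with: java -javaagent:timing-agent.jar -jar myapp.jar
    public class TimingAgent {
        public static void premain(String agentArgs, Instrumentation inst) {
            inst.addTransformer((loader, className, classBeingRedefined,
                                 protectionDomain, classfileBuffer) -> {
                // A real agent would weave timing probes into selected methods
                // by transforming classfileBuffer; returning null leaves the
                // class bytes unchanged.
                return null;
            });
        }
    }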
