Cost of GC in Java

Cost of GC in Java - java

Recently I was asked in one the interviews :
What is the cost of Garbage collection in java ?
I answered as : By collecting ununsed objects, we can free up the heap space , in order to avoid OutOfMemoryError.
but it seemed to me like , the interviewer was not satisfied.
Kindly help me in knowing costs of GC in Java.

Garbage collection involves complex memory management and may requires important CPU resources ... There are various GC strategies based on the type of application and the JVM you are running.
If your application creates and releases large amounts of memory, garbage collection may take "some" non negligible amount of CPU and may even lock your application for some time.
Depending on the purpose of your application, this may be acceptable or not. You have to chose a GC strategy based on your needs. This article explains the 5 strategies of OpenJDK (https://dzone.com/articles/choosing-the-right-gc).
Choosing the correct GC strategy is a compromise between periodical JVM blocking and overall efficiency and performance.

Related

Why Garbage-First (G1) targeted for multiprocessor machines with large memories

According to:
9 Garbage-First Garbage Collector
and:
G1: Java's Garbage First Garbage Collector
G1 targeted for multiprocessor machines with large memories.
Those 2 papers (and other web papers), does not describe why we need:
a. large memories
b. multiprocessor (I assume this need due to concurrent & parallel)
What is the technical explanation for those requirements ?

It's other way around. G1 is not targeted for large memories. If your application demands large heap size, G1 is effective.
Why your application demands large heaps? It's depend on business requirements and specific needs of application. You may load huge set of master data Or you may use in - memory caching to reduce response times. Think of big data applications,(Spark,Hadoop) which are processing teta bytes of data and use memory for processing.
Multiprocessors machines have more processing powers and effective for parallel execution of different tasks. Large heap applications obviously demands more processing power.
By setting Max pause time goal, G1GC try to meet that goal. Compared to other algorithms, by default G1GC spends 10% of time in garbage collection activities. You have to fine tune the parameters properly to achieve your pause time goals.
This related question is helpful to get some more insight into G1GC: Java 7 (JDK 7) garbage collection and documentation on G1

G1 is the only collection algorithm in the hotspot VM that can deal with very large heaps efficiently. However, a large heap is NOT a requirement but instead the G1 is built for situations where your application needs a very large heap. In low-heap situations, it is still outperformed by older algorithms. The same is true for the number of processors.

Best suitable JVM implementation for Realtime Applications of Telecom domain

Out of many implementations of JVM, which is most suitable for Realtime Applications like applications for Telecom domain?
I am working on an application of Telecom domain, and wanted some advice regarding the choice of JVM.
Currently using HotSpot but read somewhere regarding JRockit and Azul.
If some one uses one of these JVMs and has seen some major improvements in performance please share.

HostSpot JVM is pretty good and cost-effectve option. It provides few GC algorithms, in particular Concurrent Mark Sweep is good for certain kinds of real-time applications.
G1 is another low-pause GC algorithm, which is actively promoted by Oracle, but so far its results are quite disappointing.
JRockit - is deadend. It never had good low-pause GC algorithm and it is going to be merged in to HotSpot.
Azul Zing - is another league compared to HotSpot/JRockit.It really reliably keep GC pauses in order of milliseconds, but it requires more complex setup. It has few deployment options (virtual appliance, etc) you should check whenever it would fit your infrastructure.
On general note
No JVM could guaranty you minimal GC pause time, it is always best effort. There are lot of factors affecting GC puases duration and most of them very application specific.
If your are seeking guarantied response time below 5ms (not just for 99.9...% responses but for 100%), you should consider techniques avoid Java heap memory usage (i.e. using off-heap memory or static memory preallocation).
Here few links where you can find more specific details about GC algorithms and low-pause tuning.
Understanding GC pauses in JVM, HotSpot's minor GC
Understanding GC pauses in JVM, HotSpot's CMS collector
JRockit GC in action
GC checklist for data grid nodes

Is a garbage collector (.net/java) an issue for real-time systems?

When building a system which needs to respond very consistently and fast, is having a garbage collector a potential problem?
I remember horror stories from years ago where the typical example always was an action game where your character would stop for a few seconds in mid-jump, when the garbage collector would do its cleanup.
We are some years further, but I'm wondering if this is still an issue. I read about the new garbage collector in .Net 4, but it still seems a lot like a big black box, and you just have to trust everything will be fine.
If you have a system which always has to be quick to respond, is having a garbage collector too big of a problem and is it better to chose for a more hardcore, control it yourself language like c++? I would hate it that if it turns out to be a problem, that there is basically almost nothing you can do about it, other than waiting for a new version of the runtime or doing very weird things to try and influence the collector.
EDIT
thanks for all the great resources. However, it seems that most articles/custom gc's/solutions pertain to the Java environment. Does .Net also have tuning capabilities or options for a custom GC?

To be precise, garbage collectors are a problem for real-time systems. To be even more precise, it is possible to write real-time software in languages that have automatic memory management.
More details can be found in the Real Time Specification for Java on one of the approaches for achieving real-time behavior using Java. The idea behind RTSJ is very simple - do not use a heap. RTSJ provides for new varieties of Runnable objects that ensure threads do not access heap memory of any kind. Threads can either access scoped memory (nothing unusual here; values are destroyed when the scope is closed) or immortal memory (that exists throughout the application lifetime). Variables in the immortal memory are written over, time and again with new values.
Through the use of immortal memory, RTSJ ensures that threads do not access the heap, and more importantly, the system does not have a garbage collector that preempts execution of the program by the threads.
More details are available in the paper "Project Golden Gate: Towards Real-Time Java in Space Missions" published by JPL and Sun.

I've written games in Java and .NET and never found this to be a big problem. I expect your "horror stories" are based on the garbage collectors of many years ago - the technology really has moved a long way since then.
The only thing I would hesitate to use Java/.NET for on the the basis of garbage collection would be something like embedded programming with hard real time constraints (e.g. motion controllers).
However you do need to be aware of GC pauses and all of the following can be helpful in minimising the risk of GC pauses:
Minimise new object allocations - while object allocations are extremely fast in modern GC systems, they do contribute to future pauses so should be minimised. You can use techniques like pre-allocating arrays of objects, keeping object pools or using unboxed primitives.
Use specialized low-latency libraries such as Javalution for heavily used functions and data types. These are designed specifically for real-time / low latency application
Make sure you are using the best GC algorithm when there are multiple versions available. I've heard good things about the Sun G1 Collector for low latency applications. The best GC systems do most of their collections concurrently so that garbage collections do not have to "stop the world" for very long if at all.
Tune the GC parameters appropriately. Usually there is a trade-off between overall performance and pause times, you may want to improve the latter at the expense of the former.
If you're very rich, you can of course buy machines with hardware GC support. :-)

Yes, garbage must be handled in a deterministic manner in real-time systems.
One approach is to schedule a certain amount of garbage collection time during each memory allocation. This is called "work-based garbage collection." The idea is that in the absence of leaks, allocation and collection should be proportional.
Another simple approach ("time-based garbage collection") is to schedule a certain proportion of time for periodic garbage collection, whether it is needed or not.
In either case, it is possible that a program will run out of usable memory because it is not allowed to spend enough time to do a full garbage collection. This is in contrast to a non-realtime system, which is permitted to pause as long as it needs to in order to collect garbage.

On a theoretical point of view, garbage collectors are not a problem but a solution. Real-time systems are hard, when there is dynamic memory allocation. In particular, the usual C functions malloc() and free() do not offer real-time guarantees (they are normally fast but have, at least theoretically, "worst cases" where they use inordinate amounts of time).
It so happens that it is possible to build a dynamic memory allocator which offers real-time guarantees, but this requires the allocator to do some heavy stuff, in particular moving some objects in RAM. Object moving implies adjusting pointers (transparently, from the application code point of view), and at that point the allocator is just one small step away from being a garbage collector.
Usual Java or .NET implementations do not offer real-time garbage collection, in the sense of guaranteed response times, but their GC are still heavily optimized and have very short response times most of the time. Under normal conditions, very short average response times are better than guaranteed response times ("guaranteed" does not mean "fast").
Also, note that usual Java or .NET implementations run on operating systems which are not real-time either (the OS can decide to schedule other threads, or may aggressively send some data to a swap file, and so on), and neither is the underlying hardware (e.g. a typical hard disk may make "recalibration pauses" on time to time). If you are ready to tolerate the occasional timing glitch due to the hardware, then you should be fine with a (carefully tuned) JVM garbage collector. Even for games.

It is a potential problem, BUT...
Your character might also freeze in the middle of your C++ program while the OS retrieves a page of memory from an overtaxed hard disk. If you are not using a real-time OS on hardware designed to provide concrete performance guarantees, you are never guaranteed performance.
To get a more specific answer, you'd have to ask about a specific implementation of a specific virtual machine. You can use a garbage-collected virtual machine for real-time systems if it provides suitable performance guarantees about garbage collection.

You bet it is a problem. If you are writing low-latency applications you cannot afford the stop-the-world pauses that most garbage collectors impose. Since Java does not allow you to turn off the GC, your only option is to produce no garbage. That can be done and has been done through object pooling and bootstrapping. I wrote a blog article where I talk about this in detail.

Our company is employing a large .Net-based software application that amongst other things monitors binary sensors over fieldbus networks. In some situations, the sensors activate only for a short amount of time (300 ms) but our software still needs to capture those events as the controlled system will immediately fail when an event is missed. We recently observed increased problems at our customer sites due to the garbage collector running for long timespans (up to 1 second). We are still trying to figure out how to enforce a time limit on the garbage collector. In conclusion of this short story, i would say the garbage collector is a handicap in time critical applications.

On switching out garbage collectors in Java

Recently I heard Kirk Pepperdine speak about changing garbage collectors for better performance -- but what exactly does that mean and what makes one garbage collector better or different than the other?

You ask two questions:
What does it mean to change garbage collectors in Java for better performance?
This is a huge topic, and like some of the other responders, I urge you to do some reading. I recommend Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning from Sun. The information below mostly comes from there. The "turbo-charging" java article recommended in another answer is older.
In brief, one of the many options we have when running the JVM is to select a garbage collector, of which there are presently three:
The serial collector (selected with the -XX:+UseSerialGC option) - this uses a single thread to do all collection work, and everything waits while it happens.
The parallel collector (selected with the -XX:+UseParallelGC option) - this does minor collections (of the young generation) in parallel, but everything waits during the major collections.
The concurrent collector (selected with the -XX:+UseConcMarkSweepGC option) - this allows most collection operations to happen while the application is running.
What makes one garbage collector better than another?
Your application does. Each of the garbage collectors has a "sweet spot" - a range of application profiles for which it is the superior collector.
First, know that the VM is pretty good at selecting a collector for you, and as with most optimizations, you should not consider second-guessing it until you've identified that your application is not performing well, and that garbage collection is the likely culprit.
In that case, you have to ask these questions: 1) is your app running on a single-processor machine, or multi? 2) Are you more concerned with "minimizing pause time", or with "maximizing throughput"? That is, if you had to choose between the application never pausing but getting less work done overall, versus getting more work done overall, but pausing from time to time, which would you pick?
Roughly speaking, as a starting point:
On a Multi-processor machine, mostly concerned with minimizing pause time, you'd tend to use the Concurrent collector (consider enabling incremental mode)
On a Multi-processor machine, mostly concerned with maximizing throughput, you'd tend to use the Parallel collector (consider enabling parallel compaction)
On a Single-processor machine, with small datasets (up to roughly 100Mb), you'd tend to use the Serial collector
On a Single-processor machine, mostly concerned with maximizing throughput, you'd tend to use the Serial collector
On a Single-processor machine, mostly concerned with minimizing pause time, you'd tend to use the Concurrent collector (consider enabling incremental mode)
Again, though, the VM does a pretty good job of selecting a collector for you, and you're better off not overriding that unless and until you discover that it's not working well enough for your application.

Some collectors are better for throughput, others are better for response time. The difference is usually in how the collector chooses to pause the application. Some such as CMS use mutiple passes to triage the garbage before stopping the application. This triage can happen in a background thread while the application is running, and thus not interfere with your application as much as one that "stops the world" to do a GC.
Edit
Check out this document by sun. Also, about half way down there is a nice image showing the default mark-compact collector against the CMS collector. A picture is worth a thousand words, but the article is a good read too ;) Also worth reading is all the documents on the new G1 collector.

The basic problem is that the way that Java program sees memory (you call "new MyObject" and there it is, and when you are done with it you just forget about it) does not map very well to the underlying operating system and hardware.
The job of the garbage collector is to identify those memory areas which are not in use by an object, and "melt" them together to give a LARGE memory area from where new objects can be allocated. This is very vaguely worded in the Java specification HOW this is done, most likely in order to provide maximum flexibility for the designers of this important task.
Several approaches exist, with advantages and disadvantages. What you usually want is a garbage collector that can keep up in the background with the rate of objects being abandoned, as the only way for it to catch up is to stop the program while catching up. That gives really bad user experiences.
A typical trend for Java objects is that either they live for a very short time (current block or method) or a very long time. Modern garbage collectors deal with this by having multiple pools so that young objects are treated differnetly than old objects.

Java performance with very large amounts of RAM

I'm exploring the possibility of running a Java app on a machine with very large amounts of RAM (anywhere from 300GB to 15TB, probably on an SGI Altix 4700 machine), and I'm curious as to how Java's GC is likely to perform in this scenario.
I've heard that IBM's or JRockit's JVMs may be better suited to this than Sun's. Does anyone know of any research or data on JVM performance in this situation?

On the Sun JVM, you can use the option -XX:UseConcMarkSweepGC to turn on the Concurrent mark and sweep Collector, which will avoid the "stop the world" phases of the default GC algorithm almost completely, at the cost of a little bit more overhead.
The advise to use more than on VM on such a machine is IMHO outdated.
In real world applications you often have enough shared data so that the performance with the CMS and one JVM is better.

The question is: do you want to run within a single process (JVM) or not? If you do, then you're going to have a problem. Refer to Tuning Java Virtual Machines, Oracle Coherence User Guide and similar documentation. The rule of thumb I've operated by is try and avoid heaps larger than 1GB. Whereas a 512MB-1GB full GC might take less than a second. A 2-4GB full GC could potentially take 5 seconds or longer. Obvioiusly this depends on many factors but the moral of the story is that GC overhead does not scale linearly and once you get into the one second range performance then degrades rapidly.

Sun's JVM allows you to configure and optimize the heck out of garbage collection, but it's a science unto itself:
http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html
You might have to do some reading and research, but for that kind of machine, GC settings optimized for the machine and application probably make a big difference.

Since 5.0 the Hotspot JVM uses a concept know as Ergonomics to try to optimise the memory usage. This is based on more than just the sheer amount of memory available and effects heap sizes, generation sizes and garbage collection algorithms.
Start by having a read of this, which explains Ergonomics and more:
https://www.oracle.com/technetwork/java/javase/memorymanagement-whitepaper-150215.pdf
There's also a guy called Brian Goetz that's written numerous articles about how Java allocates and uses memory, all of which and more can be found here:
http://www.briangoetz.com/pubs.html

This is not at all answering your question, but if you plan do deploy a huge Java app you might be interested in looking into Azul Systems appliances. They say to be able to garbage-collect without creating a pause in the application up to a single 670 GB heap.

You might want to consider running a virtual Terracotta cluster on this machine.

The only people who can really tell you are SGI. Super computers don't behave like regular servers only bigger.
However, I have found that Java performs best when memory is local to the processors accessing it. Note: the GC needs to be able to walk the whole memory end to end. This means it doesn't scale well if you have a design which is like lots of computers stuck together which may be the case here. The memory module size is 32 GB, so you may get better performance if you limit your JVM to comfortably fit into this size.

The accepted answer for this post is rather old and is now outdated. As of September 2014, if you are using Java 7, you should probably switch to the GC1 collector. From the Java 7 update 4 release notes:
http://www.oracle.com/technetwork/java/javase/7u4-relnotes-1575007.html
"The G1 collector is targeted for applications that fully utilize the large amount of memory available in today's multiprocessor servers, while still keeping garbage collection latencies under control. Applications that require a large heap, have a big active data set, have bursty or non-uniform workloads or suffer from long Garbage Collection induced latencies should benefit from switching to G1."

Surely the answer as to how the GC's going to perform is "who cares?" ;-)

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.