Garbage collection in JAVA (Mark-and-sweep and Reference Counting)


I have some questions about the Garbage Collection concept of Java when working in distributed systems:
Why is mark-and-sweep GC not recommended in an RMI system?
Is it possible to run the GC's "reference counting" algorithm in a parallel thread without suspending the application itself?
Thanks in advance.

Why is mark-and-sweep GC not recommended in an RMI system?
I don't believe it is.
Is it possible to run the GC's "reference counting" algorithm in a parallel thread without suspending the application itself?
While reference counting is not forbidden as a mode of GC, it is not supported by any JVM AFAIK, as it has many limitations, including performance, memory usage, and circular references. I know C++ uses it, but it is a hack compared to what managed memory systems do.
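The circular-reference limitation can be illustrated with a toy model. The sketch below is not how any JVM manages memory; `RefCountDemo`, `Node`, `retain`, `release`, and `link` are hypothetical names invented for this illustration. It only shows why two objects that point at each other never reach a count of zero under naive reference counting.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of naive reference counting; NOT how any JVM manages memory.
final class RefCountDemo {

    // Hypothetical counted object: tracks how many references point at it.
    static final class Node {
        final String name;
        int refCount = 0;
        boolean freed = false;
        final List<Node> fields = new ArrayList<>();

        Node(String name) { this.name = name; }
    }

    // Take a new reference to the target.
    static Node retain(Node target) {
        target.refCount++;
        return target;
    }

    // Drop a reference; at zero, free the object and release what it points to.
    static void release(Node target) {
        if (--target.refCount == 0) {
            target.freed = true;
            for (Node child : target.fields) {
                release(child);
            }
            target.fields.clear();
        }
    }

    // Store a counted reference to 'to' inside 'from'.
    static void link(Node from, Node to) {
        from.fields.add(retain(to));
    }

    public static void main(String[] args) {
        Node a = retain(new Node("a")); // held by a local variable
        Node b = retain(new Node("b"));
        link(a, b);                      // a -> b
        link(b, a);                      // b -> a, which closes a cycle

        release(a);                      // the locals go away
        release(b);

        // Both counts are stuck at 1, so neither object is ever freed.
        System.out.println(a.name + " freed? " + a.freed + ", count=" + a.refCount);
        System.out.println(b.name + " freed? " + b.freed + ", count=" + b.refCount);
    }
}
```

A tracing collector does not have this problem, because it starts from the roots and never reaches the isolated cycle at all.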
Note: MappedByteBuffers use reference counts for some purposes. This is an isolated use case.
There are purely concurrent collectors, the most popular of which is available from Azul: http://www.azulsystems.com/zing/pgc Note: it should really be called "pause less" instead of "pauseless", as it dramatically reduces GC-related pauses but doesn't eliminate them completely. (It is often used for low-latency trading systems in Java.)
If you are really concerned about GC pauses, the best thing to do is avoid using Java RMI. It is designed to be a "full fat" fully featured RPC which does lots of things you possibly never thought of doing. The Serialization isn't very efficient and generates lots of garbage. Using a more targeted RPC solution can reduce garbage by 90 - 99% or much better.

Check Java's web site on this: http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html
Basically, there are several garbage collectors available. I've been running the Parallel collector on a production system for quite a while, but googling around will show you that G1 is also showing great promise.

Why is mark-and-sweep GC not recommended in an RMI system?
I don't know what this means. RMI uses a distributed garbage collection algorithm (DGC), from Modula-3, which uses reference counting, among other things, but that's completely separate from JVM garbage collection. I've never heard of the recommendation you mention. Citation please. I'm not sure the statement even makes sense. Changing 'recommended' to 'convenient' as per your comment doesn't really help.
Is it possible to run the GC's "reference counting" algorithm in a parallel thread without suspending the application itself?
There is no reference counting algorithm in current JVMs, but GC does run in its own thread as far as I am aware. So does RMI DGC.
In short your question doesn't make sense.

Related

Should I use SerialGC or G1GC on a single-CPU system?

I have a container that is limited to 1 CPU. The default for Java 11+ (and probably older versions too) in such a case is to use SerialGC.
Should I force a threaded GC (like G1GC) or just leave it at SerialGC?
Which one will perform better on a single CPU?
I always assumed SerialGC is better in such case but I frequently see G1GC forced in some cases.
EDIT: I'm asking for general case, because we have a lot of different apps running using the same configuration and it is hard to test each and every case.

According to the documentation.
The serial collector uses a single thread to perform all garbage
collection work, which makes it relatively efficient because there is
no communication overhead between threads.
It's best-suited to single processor machines because it can't take
advantage of multiprocessor hardware, although it can be useful on
multiprocessors for applications with small data sets (up to
approximately 100 MB).
I'm assuming processor = core in the documentation (and your question). While the documentation says that the serial collector is not a good option for multi-core machines, it doesn't say that other collectors would be bad for a single-core machine.
The other collectors do tend to use multiple threads though, and you won't get the full benefits of those in a single-core environment.
So why have you seen G1GC used? Maybe no reason other than it was the newest. However if there is a reason, it would most likely be the shorter GC pauses that G1 provides:
If response time is more important than overall throughput and garbage
collection pauses must be kept shorter than approximately one second,
then select a mostly concurrent collector with -XX:+UseG1GC or
-XX:+UseConcMarkSweepGC.
The best case scenario is that in those cases they measured the performance with different collectors and chose the one that provided the best results.
Also consider the String deduplication Holger mentioned in the comments. This is a specific memory optimization that can be the reason behind using G1GC. After all if you have a single core, you probably don't have a lot of memory at your disposal either.

What do you want to optimize? Do you want to always be able to respond extremely fast, or to have better overall performance? In the first case, you should aim for shorter GC pauses; in the second, for a lower sum of all the GC pauses.
There are other factors that you should keep in mind (e.g. how often applications are restarted), so IMO the best approach is a data-driven one. Use GCeasy or GCViewer to analyze the performance of each application and act accordingly.
Please keep in mind that GC tuning is not always required, so if you do not know what you want to achieve, you are probably optimizing prematurely.
In general:
Use the Serial GC for applications that do not have low-pause-time requirements and run in an environment with few resources.
Go with the G1 garbage collector if you have more resources or you need to respond fast (remember to measure the performance before and after the change).

As a more general comment, don't make the assumption that because you only have a single core/CPU that making a task multi-threaded will have no benefit. Depending on the task involved (in this case GC), there may well be situations where one thread becomes blocked (e.g. waiting for IO to complete), which allows other threads performing another part of the task to use the processor and complete useful work. Overall performance is increased, despite only one thread being able to run at a time.

One important thing that has not been mentioned in this thread is that the G1GC can return the memory (uncommit it) back to the OS, so if other applications are running on the server, they can make use of it.
I noticed this when switching from a single-vCPU server to a 2-vCPU server, as Java by default uses SerialGC for a single CPU and G1GC for multiple CPUs (well, at least it does for JDK 11).
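Since the default depends on what the JVM detects at startup, it can help to check which collector a given container actually ended up with. A minimal sketch using the standard `java.lang.management` API follows; the class name `ActiveCollectors` is made up for this example, and the exact bean names printed vary by collector and JDK version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {

    // Names of the collectors the running JVM exposes through its MXBeans.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // The processor count the JVM sees is what drives the default choice.
        System.out.println("Available processors: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("Collectors: " + names());
    }
}
```

Running this once inside the container, with and without an explicit `-XX:+UseG1GC` or `-XX:+UseSerialGC`, shows whether a flag is overriding the default.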

What are the useful JVM options for a multithreaded application?

I am working on an application that creates a lot of threads and relies heavily on String manipulation.
The application works for a good 24 hrs at a time and needs to be always very responsive.
I am trying to keep the creation of objects to a minimum. The application is doing well without any configuration at the moment.
But I was wondering, for my own knowledge, whether there are any advantages (or disadvantages) in using a specific JVM configuration.
Please bear with me, I am pretty new to the subject of JVM/GC configuration:
I was wondering if there were any JVM options I should absolutely use while working with multithreads?
Should I configure the heap?
Should I also configure the GC?
Should I keep the Garbage Collection to a minimum?
I started reading: http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
Any tips on the subject would greatly be appreciated.
Thanks in advance,

Generally, the best initial advice concerning tweaking your JVM is: don't. Unless you are experiencing specific JVM-related problems with the default settings, leave them alone.
If you do need to fiddle around with the settings, I would recommend you set up a representative testcase and use an advanced profiler such as JProfiler.
Furthermore, you should really read the technical documentation regarding the HotSpot VM, specifically the Memory Management Whitepaper, all of which you may find here.

If it is working fine then you should not do anything.
If your application is CPU bound, you should not create a lot of threads.
The reason is that a lot of time is wasted in context switching.
If the String manipulation happens in memory, there should be only as many threads as are required:
Nthreads = NCPU * UCPU * (1 + W/C)
where
Nthreads --> number of threads
NCPU --> number of CPUs
UCPU --> target CPU utilization (between 0 and 1)
W --> wait time
C --> compute time
So for CPU-bound operations (W close to 0) it should be at most (number of CPUs + 1) threads.
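The sizing rule from Java Concurrency in Practice is usually written as Nthreads = NCPU * UCPU * (1 + W/C). The sketch below just evaluates it; `PoolSizing` and `threads` are hypothetical names, and the wait/compute ratios in `main` are made-up inputs that you would replace with measurements from your own application.

```java
final class PoolSizing {

    // Nthreads = NCPU * UCPU * (1 + W/C), rounded to a whole thread, never below 1.
    static int threads(int cpus, double targetUtilization, double waitTime, double computeTime) {
        double raw = cpus * targetUtilization * (1.0 + waitTime / computeTime);
        return Math.max(1, (int) Math.round(raw));
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // CPU-bound work: W is about 0, so the formula gives one thread per CPU
        // (the rule of thumb above adds one spare thread on top of that).
        System.out.println("CPU-bound: " + threads(cpus, 1.0, 0.0, 1.0));
        // I/O-heavy work with an assumed 9:1 wait-to-compute ratio.
        System.out.println("I/O-heavy: " + threads(cpus, 1.0, 9.0, 1.0));
    }
}
```

For example, with 4 CPUs, full target utilization, and tasks that wait nine times as long as they compute, the formula suggests 40 threads.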
Also there are lot of test cases defined for concurrency applications in Java Concurrency in Practice. You may want to check those.

I was wondering if there were any JVM options I should absolutely use while working with multithreads?
All the best options will be on by default. If you look at HotSpot VM Options you can see quite a few are -XX:+ which means they are on by default.
Should I configure the heap?
Possibly. But I would leave the default setting if you can.
Should I also configure the GC?
Possibly. But I would leave the default setting if you can.
Should I keep the Garbage Collection to a minimum?
Reducing the amount of garbage created takes effort. It provides some benefit up to a point. You have to decide what is the best use of your time and how much time to spend reducing the amount of garbage created.
I would always start with a memory profiler and find where you are creating the most garbage. Start from the top of the list rather than trying to tune everything as this ensures you will get the most benefit for the least amount of effort.
BTW: I am an advocate of low garbage and off heap programs where it makes sense to do so. I have written trading systems which can run for a day without even a minor GC and programs which can load/use 500+ GB of data in off heap memory. However, you have to be able to demonstrate or quantify how much difference it will make to the end users or your business to determine whether it is really worth it.

I was wondering if there were any JVM options I should absolutely use while working with multithreads?
No.
Should I configure the heap?
No, apart from setting the heap size to something reasonable (with -Xmx and -Xms)
Should I also configure the GC?
No, unless you have a particular need for "low-pause". The default throughput collector is the best option if you are currently meeting your "responsiveness" goals. If you are not meeting those goals, then you should consider CMS or G1 ... but beware that while they reduce pauses, they also reduce throughput.
Should I keep the Garbage Collection to a minimum?
No. That is not a sensible goal. Your aim is to maximize throughput, and minimizing GC won't necessarily achieve that. In a lot of cases, it is more efficient to generate garbage than to have the application do extra work to avoid generating garbage. (And as Peter Lawrey pointed out, you've also got the extra developer effort in writing and maintaining more complex code.)
I would advise you to use a profiler to see if your application is spending a lot of time (CPU time or elapsed time) in garbage collection relative to doing other productive work. If not, or if the application is already running fast enough, then don't fiddle with the JVM options.
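Before reaching for a full profiler, a rough first number can be read from the JVM itself. The sketch below is only an approximation under stated assumptions: `GcOverhead` is a made-up class name, the times come from the standard `GarbageCollectorMXBean` and `RuntimeMXBean` interfaces, and what counts as "collection time" differs between collectors, so treat the result as a hint rather than a measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {

    // Accumulated collection time in milliseconds, summed over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Approximate share of JVM uptime that was reported as GC time.
    static double gcFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : (double) totalGcMillis() / uptime;
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %d ms, about %.2f%% of uptime%n",
                totalGcMillis(), gcFraction() * 100.0);
    }
}
```

If this share stays small under realistic load, GC tuning is unlikely to be where the time goes.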
If you are worried that your application won't cope with increased load in the future, then tweaking the GC doesn't scale. A better option is to investigate scaling up your hardware and/or figuring out how to do the work on multiple machines. In addition, tuning the GC to improve performance with current load may actually result in worse performance when the load increases. (Consider the problem that arises with CMS when it can't keep up and is forced to do a full stop-the-world collection to recover.)
Finally, it is generally speaking a bad idea to have lots of threads. It is better to use a small number of worker threads (roughly equal to the number of processors / cores) and feed them work via concurrent queues, etc.
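One common way to get "a few workers fed from a queue" without managing threads by hand is a fixed-size `ExecutorService`. This is a minimal sketch, not the only way to do it: `WorkerPoolDemo` and `upperCaseAll` are hypothetical names, and upper-casing strings merely stands in for the real String manipulation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class WorkerPoolDemo {

    // Process every input on a fixed pool sized to the number of cores.
    static List<String> upperCaseAll(List<String> inputs) {
        int workers = Runtime.getRuntime().availableProcessors();
        // A fixed thread pool feeds its workers from an internal task queue.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String s : inputs) {
                Callable<String> task = () -> s.toUpperCase(Locale.ROOT);
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints [ALPHA, BETA, GAMMA]
        System.out.println(upperCaseAll(Arrays.asList("alpha", "beta", "gamma")));
    }
}
```

In a long-running server the pool would be created once and shared, rather than per call as in this self-contained example.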

In the past, I have faced a similar server application: lots of String manipulation and String creation, and it needed to be always very responsive. The app worked fine with the default configuration until it ran into a high-stress situation. You need to enable -XX:+UseConcMarkSweepGC for low pauses, and fine-tune other parameters to ensure the app behaves the way you want. Here is the short list:
-XX:+CMSParallelRemarkEnabled
-XX:+CMSScavengeBeforeRemark
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=nn
-XX:CMSWaitDuration=300000
-XX:GCTimeRatio=nn
-XX:+DisableExplicitGC

Multiple threads and garbage collection in Java

Can someone give me some advice on this? I am reading in an old text and some notes from my teacher that when using multiple threads with Java it's necessary to write a special program for garbage collection.
Does this still apply in Java SE6 and above? If it does could someone provide the standard way to do this.

Using a garbage collector makes writing multi-threaded code easier. This is because manual freeing of resources in a multi-threaded context is hard to get right. With GC, it's something you don't need to worry about most of the time.
I am reading that when using multiple threads it's necessary to write a special program for garbage collection.
I don't believe this was ever the case.
Does this still apply in SE6 and above and if so is there a standard way to do this.
The standard way to do this is to not reference objects you don't need. e.g. if you have a local variable you don't need, let it drop out of scope.
It doesn't have to be complicated.

As far as I know, as long as nothing is pointing to an object, that object gets freed by the garbage collector.
Java's garbage collector is very robust in terms of circular references, and I don't see why it wouldn't work with multiple threads running at the same time.
So it is safe for you to assume that you don't need to write a special program for garbage collection, because java will do it for you very effectively.
If you want to free objects in java, just make sure that no variables are referencing your object. (Including structures (lists, arrays, etc) from java collections or other libraries)

This article from JavaWorld in 2003, J2SE 1.4.1 boosts garbage collection, has this to say about the Java garbage collection prior to J2SE 1.4.1:
Mark and sweep is a "stop-the-world" garbage collection technique;
that is, all application threads stop until garbage collection
completes, or until a higher-priority thread interrupts the garbage
collector. If the garbage collector is interrupted, it must restart,
which can lead to application thrashing with little apparent result.
The other problem with mark and sweep is that many types of
applications can't tolerate its stop-the-world nature. That is
especially true of applications that require near real-time behavior
or those that service large numbers of transaction-oriented clients.
An article in Dr. Dobbs from 2009, G1: Java's Garbage First Garbage Collector, has this to say about Java garbage collector before SE 6.
Until recently, Java SE came with two main collectors: the parallel
collector, and the concurrent-mark-sweep (CMS) collector -- see the
sidebar Parallelism and Concurrency. As of the latest Java SE 6 update
release, the G1 collector is another option. The plan is for G1 to
eventually replace CMS as a low-pause, soft real-time collector. Let's
take a look at how it works.
So it may be that prior to SE 6 some additional precautions to assist with Java garbage collection may have helped, especially with multi-threaded applications with a fair amount of temporary variables generating garbage that needed collecting. However this should entail at most an explicit call to the garbage collector during slow times. Writing something special would seem very unusual.
However, things have improved a great deal since then. In addition, garbage collection can vary between different Java Virtual Machine versions.
So what may have been true years ago is almost definitely not true now with current technology.
This posting, How to monitor Java memory usage?, discusses monitoring Java memory usage as well as some of the pros and cons of calling the garbage collector explicitly.
Oracle has a Java Garbage Collection Basics tutorial that covers Java SE 7 Hotspot JVM.

Use the following code to call the garbage collector explicitly:
Runtime runtime = Runtime.getRuntime();
runtime.gc();
But this is not needed; the JVM will run the GC automatically at appropriate times.

Almost certainly your instructor's notes are stating (correctly) that since Java is a multithreaded environment, more care is needed when implementing the garbage collector inside the Java run time environment than would be necessary if only a single thread were involved. This is true of any multithreaded environment.
As others have said, you the programmer don't see any of this complexity. That's the gift of automatic memory management that gc provides.

Is a garbage collector (.net/java) an issue for real-time systems?

When building a system which needs to respond very consistently and fast, is having a garbage collector a potential problem?
I remember horror stories from years ago where the typical example always was an action game where your character would stop for a few seconds in mid-jump, when the garbage collector would do its cleanup.
We are some years further, but I'm wondering if this is still an issue. I read about the new garbage collector in .Net 4, but it still seems a lot like a big black box, and you just have to trust everything will be fine.
If you have a system which always has to be quick to respond, is having a garbage collector too big of a problem, and is it better to choose a more hardcore, control-it-yourself language like C++? I would hate it if it turned out to be a problem and there were basically nothing you could do about it, other than waiting for a new version of the runtime or doing very weird things to try and influence the collector.
EDIT
thanks for all the great resources. However, it seems that most articles/custom gc's/solutions pertain to the Java environment. Does .Net also have tuning capabilities or options for a custom GC?

To be precise, garbage collectors are a problem for real-time systems. To be even more precise, it is possible to write real-time software in languages that have automatic memory management.
More details can be found in the Real Time Specification for Java on one of the approaches for achieving real-time behavior using Java. The idea behind RTSJ is very simple - do not use a heap. RTSJ provides for new varieties of Runnable objects that ensure threads do not access heap memory of any kind. Threads can either access scoped memory (nothing unusual here; values are destroyed when the scope is closed) or immortal memory (that exists throughout the application lifetime). Variables in the immortal memory are written over, time and again with new values.
Through the use of immortal memory, RTSJ ensures that threads do not access the heap, and more importantly, the system does not have a garbage collector that preempts execution of the program by the threads.
More details are available in the paper "Project Golden Gate: Towards Real-Time Java in Space Missions" published by JPL and Sun.

I've written games in Java and .NET and never found this to be a big problem. I expect your "horror stories" are based on the garbage collectors of many years ago - the technology really has moved a long way since then.
The only thing I would hesitate to use Java/.NET for on the basis of garbage collection would be something like embedded programming with hard real-time constraints (e.g. motion controllers).
However you do need to be aware of GC pauses and all of the following can be helpful in minimising the risk of GC pauses:
Minimise new object allocations - while object allocations are extremely fast in modern GC systems, they do contribute to future pauses so should be minimised. You can use techniques like pre-allocating arrays of objects, keeping object pools or using unboxed primitives.
Use specialized low-latency libraries such as Javolution for heavily used functions and data types. These are designed specifically for real-time / low-latency applications.
Make sure you are using the best GC algorithm when there are multiple versions available. I've heard good things about the Sun G1 Collector for low latency applications. The best GC systems do most of their collections concurrently so that garbage collections do not have to "stop the world" for very long if at all.
Tune the GC parameters appropriately. Usually there is a trade-off between overall performance and pause times, you may want to improve the latter at the expense of the former.
If you're very rich, you can of course buy machines with hardware GC support. :-)
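The object-pool technique from the list above can be sketched as follows. This is a deliberately small, single-threaded illustration: `BufferPool` is a hypothetical class, `StringBuilder` stands in for whatever object is expensive to keep reallocating, and a pool shared between threads would need synchronization that is not shown here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Single-threaded sketch of an object pool; not safe to share between threads as written.
final class BufferPool {
    private final Deque<StringBuilder> free = new ArrayDeque<>();
    private int created = 0;

    // Pre-allocate the buffers up front so steady-state code allocates nothing.
    BufferPool(int preallocate) {
        for (int i = 0; i < preallocate; i++) {
            free.push(new StringBuilder(256));
            created++;
        }
    }

    // Reuse a pooled buffer if one is available; allocate only when the pool is empty.
    StringBuilder acquire() {
        StringBuilder sb = free.poll();
        if (sb == null) {
            sb = new StringBuilder(256);
            created++;
        }
        return sb;
    }

    // Reset the buffer and hand it back so later callers can reuse it.
    void release(StringBuilder sb) {
        sb.setLength(0);
        free.push(sb);
    }

    // How many buffers have been allocated in total.
    int created() {
        return created;
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(2);
        for (int i = 0; i < 1000; i++) {
            StringBuilder sb = pool.acquire();
            sb.append("frame ").append(i);
            pool.release(sb);
        }
        // Still only the two pre-allocated buffers after 1000 iterations.
        System.out.println("buffers allocated: " + pool.created());
    }
}
```

The trade-off is that pooled objects live long enough to be promoted to the old generation, so pooling is usually only worth it for objects that are large or allocated at a very high rate.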

Yes, garbage must be handled in a deterministic manner in real-time systems.
One approach is to schedule a certain amount of garbage collection time during each memory allocation. This is called "work-based garbage collection." The idea is that in the absence of leaks, allocation and collection should be proportional.
Another simple approach ("time-based garbage collection") is to schedule a certain proportion of time for periodic garbage collection, whether it is needed or not.
In either case, it is possible that a program will run out of usable memory because it is not allowed to spend enough time to do a full garbage collection. This is in contrast to a non-realtime system, which is permitted to pause as long as it needs to in order to collect garbage.

From a theoretical point of view, garbage collectors are not a problem but a solution. Real-time systems are hard when there is dynamic memory allocation. In particular, the usual C functions malloc() and free() do not offer real-time guarantees (they are normally fast but have, at least theoretically, "worst cases" where they use inordinate amounts of time).
It so happens that it is possible to build a dynamic memory allocator which offers real-time guarantees, but this requires the allocator to do some heavy stuff, in particular moving some objects in RAM. Object moving implies adjusting pointers (transparently, from the application code point of view), and at that point the allocator is just one small step away from being a garbage collector.
Usual Java or .NET implementations do not offer real-time garbage collection, in the sense of guaranteed response times, but their GC are still heavily optimized and have very short response times most of the time. Under normal conditions, very short average response times are better than guaranteed response times ("guaranteed" does not mean "fast").
Also, note that usual Java or .NET implementations run on operating systems which are not real-time either (the OS can decide to schedule other threads, or may aggressively send some data to a swap file, and so on), and neither is the underlying hardware (e.g. a typical hard disk may make "recalibration pauses" from time to time). If you are ready to tolerate the occasional timing glitch due to the hardware, then you should be fine with a (carefully tuned) JVM garbage collector. Even for games.

It is a potential problem, BUT...
Your character might also freeze in the middle of your C++ program while the OS retrieves a page of memory from an overtaxed hard disk. If you are not using a real-time OS on hardware designed to provide concrete performance guarantees, you are never guaranteed performance.
To get a more specific answer, you'd have to ask about a specific implementation of a specific virtual machine. You can use a garbage-collected virtual machine for real-time systems if it provides suitable performance guarantees about garbage collection.

You bet it is a problem. If you are writing low-latency applications you cannot afford the stop-the-world pauses that most garbage collectors impose. Since Java does not allow you to turn off the GC, your only option is to produce no garbage. That can be done and has been done through object pooling and bootstrapping. I wrote a blog article where I talk about this in detail.

Our company uses a large .NET-based software application that, among other things, monitors binary sensors over fieldbus networks. In some situations, the sensors activate only for a short amount of time (300 ms), but our software still needs to capture those events, as the controlled system will immediately fail when an event is missed. We recently observed increased problems at our customer sites due to the garbage collector running for long timespans (up to 1 second). We are still trying to figure out how to enforce a time limit on the garbage collector. To conclude this short story, I would say the garbage collector is a handicap in time-critical applications.

On switching out garbage collectors in Java

Recently I heard Kirk Pepperdine speak about changing garbage collectors for better performance -- but what exactly does that mean and what makes one garbage collector better or different than the other?

You ask two questions:
What does it mean to change garbage collectors in Java for better performance?
This is a huge topic, and like some of the other responders, I urge you to do some reading. I recommend Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning from Sun. The information below mostly comes from there. The "turbo-charging" java article recommended in another answer is older.
In brief, one of the many options we have when running the JVM is to select a garbage collector, of which there are presently three:
The serial collector (selected with the -XX:+UseSerialGC option) - this uses a single thread to do all collection work, and everything waits while it happens.
The parallel collector (selected with the -XX:+UseParallelGC option) - this does minor collections (of the young generation) in parallel, but everything waits during the major collections.
The concurrent collector (selected with the -XX:+UseConcMarkSweepGC option) - this allows most collection operations to happen while the application is running.
What makes one garbage collector better than another?
Your application does. Each of the garbage collectors has a "sweet spot" - a range of application profiles for which it is the superior collector.
First, know that the VM is pretty good at selecting a collector for you, and as with most optimizations, you should not consider second-guessing it until you've identified that your application is not performing well, and that garbage collection is the likely culprit.
In that case, you have to ask these questions: 1) is your app running on a single-processor machine, or multi? 2) Are you more concerned with "minimizing pause time", or with "maximizing throughput"? That is, if you had to choose between the application never pausing but getting less work done overall, versus getting more work done overall, but pausing from time to time, which would you pick?
Roughly speaking, as a starting point:
On a Multi-processor machine, mostly concerned with minimizing pause time, you'd tend to use the Concurrent collector (consider enabling incremental mode)
On a Multi-processor machine, mostly concerned with maximizing throughput, you'd tend to use the Parallel collector (consider enabling parallel compaction)
On a Single-processor machine, with small datasets (up to roughly 100 MB), you'd tend to use the Serial collector
On a Single-processor machine, mostly concerned with maximizing throughput, you'd tend to use the Serial collector
On a Single-processor machine, mostly concerned with minimizing pause time, you'd tend to use the Concurrent collector (consider enabling incremental mode)
Again, though, the VM does a pretty good job of selecting a collector for you, and you're better off not overriding that unless and until you discover that it's not working well enough for your application.

Some collectors are better for throughput, others are better for response time. The difference is usually in how the collector chooses to pause the application. Some, such as CMS, use multiple passes to triage the garbage before stopping the application. This triage can happen in a background thread while the application is running, and thus does not interfere with your application as much as a collector that "stops the world" to do a GC.
Edit
Check out this document by Sun. About halfway down there is a nice image comparing the default mark-compact collector with the CMS collector. A picture is worth a thousand words, but the article is a good read too ;) Also worth reading are the documents on the new G1 collector.

The basic problem is that the way a Java program sees memory (you call "new MyObject" and there it is, and when you are done with it you just forget about it) does not map very well to the underlying operating system and hardware.
The job of the garbage collector is to identify those memory areas which are no longer in use by any object, and "melt" them together to give a LARGE memory area from which new objects can be allocated. HOW this is done is worded very vaguely in the Java specification, most likely in order to give maximum flexibility to the designers of this important component.
Several approaches exist, with advantages and disadvantages. What you usually want is a garbage collector that can keep up in the background with the rate of objects being abandoned, as the only way for it to catch up is to stop the program while catching up. That gives really bad user experiences.
A typical trend for Java objects is that they live either for a very short time (the current block or method) or for a very long time. Modern garbage collectors deal with this by having multiple pools, so that young objects are treated differently from old objects.
