What is the difference between G1GC options -XX:ParallelGCThreads vs -XX:ConcGCThreads

When configuring G1GC, we have two kinds of thread-count option:
-XX:ParallelGCThreads and -XX:ConcGCThreads.
What is the difference between them, and how do they impact the collector?
Any reference is appreciated.

The G1 algorithm has phases, some of which are "stop-the-world" phases that pause the application during garbage collection, and it also has phases that happen concurrently while the application is running (concurrent marking, etc.). With that information in mind:
The -XX:ParallelGCThreads option affects the number of threads used for phases in which the application threads are stopped, while the -XX:ConcGCThreads flag affects the number of threads used for the concurrent phases.
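To see what these options resolve to on a given machine, you can query the effective VM options at runtime. This is a minimal sketch using the standard HotSpot management API (the class name and the flag values in the example launch command are assumptions for illustration):

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;

    public class GcThreadFlags {
        public static void main(String[] args) {
            // Example launch: java -XX:+UseG1GC -XX:ParallelGCThreads=8 -XX:ConcGCThreads=2 GcThreadFlags
            HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // Prints the effective values, including platform-dependent defaults
            System.out.println(hotspot.getVMOption("ParallelGCThreads").getValue());
            System.out.println(hotspot.getVMOption("ConcGCThreads").getValue());
        }
    }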

These are settings, or more precisely JVM tuning settings: we tell the JVM how many threads to use for that particular type of garbage collection work.
I hope you are already aware of what garbage collection is. When the JVM runs garbage collection, the behavior depends on which algorithm is set as your JVM's default collector.
You might already know that there are various kinds of garbage collectors available, like G1, CMS, etc.
So, based on your setting (here, the number of threads), the GC algorithm will try to use that many threads for heap cleanup. While the JVM runs a full GC, it halts the processing of other threads.
Now suppose your application is live and performing very heavy tasks, with multiple users using it for multiple purposes (a very busy app), and the JVM starts a full GC. In that case, all worker threads come to a pause while GC cleans up, and if all CPU threads are acquired by the JVM during this period, users will see delayed responses. So these options let you tell the JVM: use only this many threads for that type (CMS or parallel) of garbage collection run.
To get more on GC types, how they differ, and what suits your needs, refer to some good articles and the docs from Oracle.
Here is one reference for the options you mentioned:
-XX:ParallelGCThreads: Sets the number of threads used during parallel phases of the garbage collectors. The default value varies with the platform on which the JVM is running.
-XX:ConcGCThreads: Sets the number of threads concurrent garbage collectors will use. The default value varies with the platform on which the JVM is running.

Related

Should I use SerialGC or G1GC on a single CPU system?

I have a container that is limited to 1 CPU; the default for Java 11+ (and probably older versions as well) in such a case is to use SerialGC.
Should I force a threaded GC (like G1GC) or just leave it at SerialGC?
Which one will perform better on a single CPU?
I always assumed SerialGC is better in such a case, but I frequently see G1GC forced in some setups.
EDIT: I'm asking about the general case, because we have a lot of different apps running with the same configuration and it is hard to test each and every case.
According to the documentation:
The serial collector uses a single thread to perform all garbage collection work, which makes it relatively efficient because there is no communication overhead between threads.
It's best-suited to single processor machines because it can't take advantage of multiprocessor hardware, although it can be useful on multiprocessors for applications with small data sets (up to approximately 100 MB).
I'm assuming processor = core in the documentation (and your question). While the documentation says that the serial collector is not a good option for multi-core machines, it doesn't say that other collectors would be bad for a single-core machine.
The other collectors do tend to use multiple threads though, and you won't get the full benefits of those in a single-core environment.
So why have you seen G1GC used? Maybe no reason other than it was the newest. However if there is a reason, it would most likely be the shorter GC pauses that G1 provides:
If response time is more important than overall throughput and garbage collection pauses must be kept shorter than approximately one second, then select a mostly concurrent collector with -XX:+UseG1GC or -XX:+UseConcMarkSweepGC.
The best case scenario is that in those cases they measured the performance with different collectors and chose the one that provided the best results.
Also consider the String deduplication Holger mentioned in the comments. This is a specific memory optimization that can be the reason behind using G1GC. After all, if you have a single core, you probably don't have a lot of memory at your disposal either.
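That deduplication is easy to experiment with. Here is a minimal sketch (the class name and the workload are made up for illustration; -XX:+UseStringDeduplication is a real G1 flag, though the logging flag for observing deduplication statistics varies by JDK version):

    public class DedupDemo {
        public static void main(String[] args) throws InterruptedException {
            // Example launch: java -XX:+UseG1GC -XX:+UseStringDeduplication DedupDemo
            java.util.List<String> strings = new java.util.ArrayList<>();
            for (int i = 0; i < 1_000_000; i++) {
                // Many distinct String objects with identical contents: exactly the
                // pattern G1's background deduplication thread is designed to catch.
                strings.add(new String("duplicated-content"));
            }
            Thread.sleep(5_000); // give the deduplication thread time to run
            System.out.println(strings.size());
        }
    }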
What do you want to optimize? Do you want always to be able to answer extremely fast, or to have better overall performance? In the first case, you should aim for shorter GC pauses; in the second, for the lower sum of all the GC pauses.
There are other factors to keep in mind (e.g. how often the applications are restarted), so IMO the best approach is a data-driven one. Use GCeasy or GCViewer to analyze the performance of each application and act accordingly.
Please keep in mind that GC tuning is not always required, so if you do not know what you want to achieve, you are probably optimizing prematurely.
In general:
use the Serial GC for applications that do not have low pause time requirements and run in environments with low resources
go with the G1 garbage collector if you have more resources or you need to answer fast; remember to measure the performance before and after the change (the snippet below shows how to confirm which collector was selected)
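For the "measure" part, a check like this (a minimal sketch using the standard management API; the class name is made up) shows which collector a given JVM actually selected:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class WhichGc {
        public static void main(String[] args) {
            // Run once with no flags, then with -XX:+UseSerialGC or -XX:+UseG1GC,
            // and compare which collector names the JVM reports.
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.println(gc.getName());
            }
        }
    }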
As a more general comment, don't assume that because you only have a single core/CPU, making a task multi-threaded will have no benefit. Depending on the task involved (in this case GC), there may well be situations where one thread becomes blocked (e.g. waiting for IO to complete), which allows other threads performing another part of the task to use the processor and complete useful work. Overall performance is increased, despite only one thread being able to run at a time.
One important thing that has not been mentioned in this thread is that G1GC can return memory (uncommit it) back to the OS, so if other applications are running on the server, they can make use of it.
I noticed this when switching from a single-vCPU server to a 2-vCPU server, as Java by default uses SerialGC for a single CPU and G1GC for multiple CPUs (at least it does for JDK 11).

Garbage collector for young generation

Have a short question: is it true that all GCs in JDK 7 (other than G1) always use stop-the-world for young generation collection?
Thanks
For OpenJDK, JRockit, IBM JVM, and Sun/Oracle JDK, the young collection is always stop the world for every available collector.
The only JVM I know of which does not have a stop the world collector is Azul's Zing. (Not free)
While OpenJDK/HotSpot has CMS, it is only mostly concurrent. There are still stop-the-world portions, and in some cases CMS will fall back to a full GC, which is stop-the-world.
AFAIK, it is hard to find real-world examples where G1 is faster in terms of pause time than CMS; however, it is improving all the time.
Do your GC logs speak to you?
Almost all Java garbage collectors have some sort of stop-the-world phase where all the Java threads (not native threads) are suspended, waiting for exclusive system operations to complete. This state is sometimes referred to as a safepoint.
Modern garbage collectors run concurrently with the application threads, which means the garbage collector performs its work at the same time as the application threads are running. During the garbage collection process there are phases where exclusive access to memory is needed; in those phases the application's Java threads go into the safepoint state.
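You can watch these pauses yourself by running a small allocation loop with GC logging enabled. This is a minimal sketch (the class name and the sizes are made up); the flags in the comments are the standard logging options for JDK 9+ unified logging and for JDK 8 respectively:

    public class Churn {
        public static void main(String[] args) {
            // JDK 9+: java -Xlog:gc*,safepoint Churn
            // JDK 8:  java -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime Churn
            java.util.List<byte[]> survivors = new java.util.ArrayList<>();
            for (int i = 0; ; i++) {
                byte[] chunk = new byte[64 * 1024];     // mostly short-lived garbage
                if (i % 100 == 0) survivors.add(chunk); // a few objects survive longer
                if (survivors.size() > 2000) survivors.clear();
            }
        }
    }

Each pause entry in the resulting log is a safepoint where all application threads were suspended.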
One alternative for getting rid of stop-the-world garbage collections is to go for the Zing JVM with the C4 collector from Azul Systems. That implementation takes a low-pause approach with no stop-the-world collections at all; instead it uses a concurrent compacting approach with no stop-the-world phase.
No it is not true. Java 7 also supports the older Concurrent Mark Sweep (CMS) collector. CMS is a low pause collector, just like G1.
UPDATE
Apparently CMS is only for the tenured generation ... according to the blog posting that you found at http://blogs.oracle.com/jonthecollector/entry/our_collectors
So that means that your proposition is in fact true.
One could argue that all of the low-pause collectors:
- need to stop the mutator threads to do some phases of their work, and
- may fall back to a Full GC using the mark/sweep collector when they can't keep up.
However, there is a qualitative difference between "mostly concurrent" collectors like G1 and CMS, and other collectors that suspend non-GC threads for the entire duration of the collection process. That is what is normally meant by a "stop-the-world" strategy.

Can the OS stop a Java process from garbage collecting?

I'm monitoring a production system with AppDynamics and we just had the system slow to a crawl and almost freeze up. Just prior to this event, AppDynamics is showing all GC activity (minor and major alike) flatline for several minutes...and then come back to life.
Even during periods of ultra low load on the system, we still see our JVMs doing some GC activity. We've never had it totally flatline and drop to 0.
Also - the network I/O flatlined at the same instant as the GC/memory flatline.
So I ask: can something at the system level cause a JVM to freeze, or cause its garbage collection to hang/freeze? This is on a CentOS machine.
Does your OS have swapping enabled?
I've noticed HUGE problems with Java once it fills up all the RAM on an OS with swapping enabled--it will actually devastate Windows systems, effectively locking them up and causing a reboot.
My theory is this:
The OS RAM gets near full.
The OS requests memory back from Java.
This triggers Java into a full GC to attempt to release memory.
The full GC touches nearly every piece of the VM's memory, even items that have been swapped out.
The system tries to swap data back into memory for the VM (on a system that is already out of RAM).
This keeps snowballing.
At first it doesn't affect the system much, but if you try to launch an app that wants a bunch of memory it can take a really long time, and your system just keeps degrading.
Multiple large VMs can make this worse; I run 3 or 4 huge ones and my system now starts to seize up when I get over 60-70% RAM usage.
This is conjecture, but it describes the behavior I've seen after days of testing.
The effect is that all the swapping seems to "prevent" GC. More accurately, the OS is spending most of the GC time swapping, which makes it look like the JVM is hanging, doing nothing, during GC.
A fix: set -Xmx to a lower value, and drop it until you allow enough room to avoid swapping. This has always fixed my problem; if it doesn't fix yours then I'm wrong about the cause of your problem :)
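If you do lower -Xmx, a quick sanity check like the following confirms the cap the JVM actually applied (a minimal sketch; the class name and the 512m value are just examples):

    public class HeapCheck {
        public static void main(String[] args) {
            // Example launch: java -Xmx512m HeapCheck
            long maxBytes = Runtime.getRuntime().maxMemory();
            System.out.printf("Max heap the JVM will use: %d MB%n", maxBytes / (1024 * 1024));
        }
    }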
It is really difficult to find the exact cause of your problem without more information.
But I can try to answer your question:
Can the OS block the garbage collection?
It is very unlikely that your OS blocks the garbage collector threads while letting the other threads run. You should not investigate that way.
Can the OS block the JVM?
Yes, it perfectly can, and it does so a lot, but so quickly that you think the processes are all running at the same time. The JVM is a process like any other and is under the control of the OS. You have to check the CPU used by the application when it hangs (with monitoring on the server, not in the JVM). If it is very low, then I see two causes (but there are more):
Your server doesn't have enough RAM and is swapping (RAM <-> disk), so processes become extremely slow. In this case CPU usage will be high on the server but low for the JVM.
Another process or server grabs the resources and your application or server receives nothing. Check the priorities on CentOS.
In theory, YES, it can. But in practice, it never should.
In most Java virtual machines, application threads are not the only threads that are running. Apart from the application threads, there are compilation threads, finalizer threads, garbage collection threads, and some more. Scheduling decisions for allocating CPU cores to these threads and to threads from other programs running on the machine are based on many parameters (thread priorities, their last execution time, etc.) which try to be fair to all threads. So, in practice, no thread in the system should wait for CPU allocation for an unreasonably long time, and the operating system should not block any thread for an unlimited amount of time.
There is minimal activity that the garbage collection threads (and other VM threads) need to do. They need to check periodically to see if a garbage collection is needed. Even if the application threads are all suspended, there can be other VM threads, such as the JIT compiler thread or the finalizer thread, that do work and, hence, allocate objects and trigger garbage collection. This is particularly true for meta-circular JVMs that implement VM threads in Java rather than in C/C++.
Moreover, most modern JVMs use a generational garbage collector (a garbage collector that partitions the heap into separate spaces and puts objects of different ages in different parts of the heap). This means that as objects get older, they need to be moved to other, older spaces. Hence, even if there is no need to collect objects, a generational garbage collector may move objects from one space to another.
Of course, the details of each garbage collector differ from JVM to JVM. To rub salt in the wound, some JVMs support more than one type of garbage collector. But seeing minimal garbage collection activity in an idle application is no surprise.

Is a garbage collector (.net/java) an issue for real-time systems?

When building a system which needs to respond very consistently and fast, is having a garbage collector a potential problem?
I remember horror stories from years ago where the typical example always was an action game where your character would stop for a few seconds in mid-jump, when the garbage collector would do its cleanup.
We are some years further, but I'm wondering if this is still an issue. I read about the new garbage collector in .Net 4, but it still seems a lot like a big black box, and you just have to trust everything will be fine.
If you have a system which always has to respond quickly, is a garbage collector too big a problem, and is it better to choose a more hardcore, control-it-yourself language like C++? I would hate it if it turned out to be a problem and there was basically nothing you could do about it, other than waiting for a new version of the runtime or doing very weird things to try and influence the collector.
EDIT
Thanks for all the great resources. However, it seems that most articles/custom GCs/solutions pertain to the Java environment. Does .NET also have tuning capabilities or options for a custom GC?
To be precise, garbage collectors are a problem for real-time systems. To be even more precise, it is possible to write real-time software in languages that have automatic memory management.
More details can be found in the Real-Time Specification for Java (RTSJ), one approach to achieving real-time behavior using Java. The idea behind RTSJ is very simple: do not use the heap. RTSJ provides new varieties of Runnable objects that ensure threads do not access heap memory of any kind. Threads can access either scoped memory (nothing unusual here; values are destroyed when the scope is closed) or immortal memory (which exists throughout the application lifetime). Variables in immortal memory are overwritten, time and again, with new values.
Through the use of immortal memory, RTSJ ensures that threads do not access the heap and, more importantly, that the system does not have a garbage collector that preempts execution of the program's threads.
More details are available in the paper "Project Golden Gate: Towards Real-Time Java in Space Missions" published by JPL and Sun.
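For a feel of what this looks like in code, here is a rough, untested sketch of the RTSJ memory areas using the javax.realtime API; treat the class name, the sizes, and the thread setup as illustrative assumptions (it requires an RTSJ-capable JVM to run):

    import javax.realtime.ImmortalMemory;
    import javax.realtime.LTMemory;
    import javax.realtime.RealtimeThread;

    public class NoHeapSketch {
        public static void main(String[] args) {
            RealtimeThread rt = new RealtimeThread() {
                public void run() {
                    // Scoped memory: allocations made inside enter() are reclaimed
                    // when the scope exits, with no garbage collector involved.
                    LTMemory scope = new LTMemory(16 * 1024, 64 * 1024);
                    scope.enter(() -> {
                        byte[] buffer = new byte[256]; // scoped allocation
                    });
                    // Immortal memory: lives for the whole application lifetime
                    // and is overwritten and reused rather than collected.
                    ImmortalMemory.instance().enter(() -> {
                        // allocate long-lived structures here
                    });
                }
            };
            rt.start();
        }
    }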
I've written games in Java and .NET and never found this to be a big problem. I expect your "horror stories" are based on the garbage collectors of many years ago - the technology really has moved a long way since then.
The only thing I would hesitate to use Java/.NET for, on the basis of garbage collection, would be something like embedded programming with hard real-time constraints (e.g. motion controllers).
However, you do need to be aware of GC pauses, and all of the following can help minimise the risk of them:
Minimise new object allocations - while object allocations are extremely fast in modern GC systems, they contribute to future pauses and so should be minimised. You can use techniques like pre-allocating arrays of objects, keeping object pools (see the sketch after this list) or using unboxed primitives.
Use specialised low-latency libraries such as Javolution for heavily used functions and data types. These are designed specifically for real-time / low-latency applications.
Make sure you are using the best GC algorithm when multiple versions are available. I've heard good things about the Sun G1 collector for low-latency applications. The best GC systems do most of their collections concurrently, so that garbage collections don't have to "stop the world" for very long, if at all.
Tune the GC parameters appropriately. Usually there is a trade-off between overall throughput and pause times; you may want to improve the latter at the expense of the former.
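As an illustration of the pooling point above, here is a minimal object-pool sketch (the Particle type and the sizes are made up for the example). The hot path neither allocates nor frees objects, so it gives the collector nothing new to do:

    import java.util.ArrayDeque;

    final class ParticlePool {
        static final class Particle { double x, y, vx, vy; }

        private final ArrayDeque<Particle> free = new ArrayDeque<>();

        ParticlePool(int size) {
            // Pay the allocation cost once, up front, outside the latency-sensitive path.
            for (int i = 0; i < size; i++) free.push(new Particle());
        }

        Particle acquire() {
            Particle p = free.poll();
            return (p != null) ? p : new Particle(); // fall back only if the pool is exhausted
        }

        void release(Particle p) {
            free.push(p); // reuse instead of letting the object become garbage
        }
    }

A game loop would acquire() particles as they spawn and release() them as they die, keeping per-frame allocation at zero in the steady state.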
If you're very rich, you can of course buy machines with hardware GC support. :-)
Yes, garbage must be handled in a deterministic manner in real-time systems.
One approach is to schedule a certain amount of garbage collection time during each memory allocation. This is called "work-based garbage collection." The idea is that in the absence of leaks, allocation and collection should be proportional.
Another simple approach ("time-based garbage collection") is to schedule a certain proportion of time for periodic garbage collection, whether it is needed or not.
In either case, it is possible that a program will run out of usable memory because it is not allowed to spend enough time to do a full garbage collection. This is in contrast to a non-realtime system, which is permitted to pause as long as it needs to in order to collect garbage.
From a theoretical point of view, garbage collectors are not a problem but a solution. Real-time systems are hard when there is dynamic memory allocation. In particular, the usual C functions malloc() and free() do not offer real-time guarantees (they are normally fast but have, at least theoretically, "worst cases" where they take inordinate amounts of time).
It so happens that it is possible to build a dynamic memory allocator which offers real-time guarantees, but this requires the allocator to do some heavy stuff, in particular moving some objects in RAM. Object moving implies adjusting pointers (transparently, from the application code point of view), and at that point the allocator is just one small step away from being a garbage collector.
Usual Java or .NET implementations do not offer real-time garbage collection, in the sense of guaranteed response times, but their GC are still heavily optimized and have very short response times most of the time. Under normal conditions, very short average response times are better than guaranteed response times ("guaranteed" does not mean "fast").
Also, note that usual Java or .NET implementations run on operating systems which are not real-time either (the OS can decide to schedule other threads, or may aggressively send some data to a swap file, and so on), and neither is the underlying hardware (e.g. a typical hard disk may make "recalibration pauses" from time to time). If you are ready to tolerate the occasional timing glitch due to the hardware, then you should be fine with a (carefully tuned) JVM garbage collector. Even for games.
It is a potential problem, BUT...
Your character might also freeze in the middle of your C++ program while the OS retrieves a page of memory from an overtaxed hard disk. If you are not using a real-time OS on hardware designed to provide concrete performance guarantees, you are never guaranteed performance.
To get a more specific answer, you'd have to ask about a specific implementation of a specific virtual machine. You can use a garbage-collected virtual machine for real-time systems if it provides suitable performance guarantees about garbage collection.
You bet it is a problem. If you are writing low-latency applications you cannot afford the stop-the-world pauses that most garbage collectors impose. Since Java does not allow you to turn off the GC, your only option is to produce no garbage. That can be done and has been done through object pooling and bootstrapping. I wrote a blog article where I talk about this in detail.
Our company employs a large .NET-based software application that, amongst other things, monitors binary sensors over fieldbus networks. In some situations the sensors activate only for a short amount of time (300 ms), but our software still needs to capture those events, as the controlled system will immediately fail when an event is missed. We recently observed increased problems at our customer sites due to the garbage collector running for long timespans (up to 1 second). We are still trying to figure out how to enforce a time limit on the garbage collector. In conclusion to this short story, I would say the garbage collector is a handicap in time-critical applications.

On switching out garbage collectors in Java

Recently I heard Kirk Pepperdine speak about changing garbage collectors for better performance -- but what exactly does that mean and what makes one garbage collector better or different than the other?
You ask two questions:
What does it mean to change garbage collectors in Java for better performance?
This is a huge topic, and like some of the other responders, I urge you to do some reading. I recommend Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning from Sun. The information below mostly comes from there. The "turbo-charging" Java article recommended in another answer is older.
In brief, one of the many options we have when running the JVM is to select a garbage collector, of which there are presently three:
The serial collector (selected with the -XX:+UseSerialGC option) - this uses a single thread to do all collection work, and everything waits while it happens.
The parallel collector (selected with the -XX:+UseParallelGC option) - this does minor collections (of the young generation) in parallel, but everything waits during the major collections.
The concurrent collector (selected with the -XX:+UseConcMarkSweepGC option) - this allows most collection operations to happen while the application is running.
What makes one garbage collector better than another?
Your application does. Each of the garbage collectors has a "sweet spot" - a range of application profiles for which it is the superior collector.
First, know that the VM is pretty good at selecting a collector for you, and as with most optimizations, you should not consider second-guessing it until you've identified that your application is not performing well, and that garbage collection is the likely culprit.
In that case, you have to ask these questions: 1) Is your app running on a single-processor machine or a multi-processor one? 2) Are you more concerned with minimizing pause time, or with maximizing throughput? That is, if you had to choose between the application never pausing but getting less work done overall, versus getting more work done overall but pausing from time to time, which would you pick?
Roughly speaking, as a starting point:
On a Multi-processor machine, mostly concerned with minimizing pause time, you'd tend to use the Concurrent collector (consider enabling incremental mode)
On a Multi-processor machine, mostly concerned with maximizing throughput, you'd tend to use the Parallel collector (consider enabling parallel compaction)
On a Single-processor machine, with small datasets (up to roughly 100Mb), you'd tend to use the Serial collector
On a Single-processor machine, mostly concerned with maximizing throughput, you'd tend to use the Serial collector
On a Single-processor machine, mostly concerned with minimizing pause time, you'd tend to use the Concurrent collector (consider enabling incremental mode)
Again, though, the VM does a pretty good job of selecting a collector for you, and you're better off not overriding that unless and until you discover that it's not working well enough for your application.
Some collectors are better for throughput, others are better for response time. The difference is usually in how the collector chooses to pause the application. Some, such as CMS, use multiple passes to triage the garbage before stopping the application. This triage can happen in a background thread while the application is running, and thus not interfere with your application as much as one that "stops the world" to do a GC.
Edit
Check out this document by Sun. Also, about halfway down there is a nice image showing the default mark-compact collector against the CMS collector. A picture is worth a thousand words, but the article is a good read too ;) Also worth reading are all the documents on the new G1 collector.
The basic problem is that the way a Java program sees memory (you call "new MyObject" and there it is, and when you are done with it you just forget about it) does not map very well to the underlying operating system and hardware.
The job of the garbage collector is to identify those memory areas which are not in use by any object and "melt" them together to give a LARGE memory area from which new objects can be allocated. The Java specification is very vague about HOW this is done, most likely in order to provide maximum flexibility to the implementers of this important task.
Several approaches exist, each with advantages and disadvantages. What you usually want is a garbage collector that can keep up in the background with the rate at which objects are being abandoned, as its only other way to catch up is to stop the program while it does so. That gives really bad user experiences.
A typical trend for Java objects is that they live either for a very short time (the current block or method) or for a very long time. Modern garbage collectors deal with this by having multiple pools, so that young objects are treated differently than old objects.
