On switching out garbage collectors in Java


Recently I heard Kirk Pepperdine speak about changing garbage collectors for better performance -- but what exactly does that mean and what makes one garbage collector better or different than the other?

You ask two questions:
What does it mean to change garbage collectors in Java for better performance?
This is a huge topic, and like some of the other responders, I urge you to do some reading. I recommend Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning from Sun. The information below mostly comes from there. The "turbo-charging" java article recommended in another answer is older.
In brief, one of the many options we have when running the JVM is to select a garbage collector, of which there are three as of Java SE 6:
The serial collector (selected with the -XX:+UseSerialGC option) - this uses a single thread to do all collection work, and everything waits while it happens.
The parallel collector (selected with the -XX:+UseParallelGC option) - this does minor collections (of the young generation) in parallel, but everything waits during the major collections.
The concurrent collector (selected with the -XX:+UseConcMarkSweepGC option) - this allows most collection operations to happen while the application is running.
What makes one garbage collector better than another?
Your application does. Each of the garbage collectors has a "sweet spot" - a range of application profiles for which it is the superior collector.
First, know that the VM is pretty good at selecting a collector for you, and as with most optimizations, you should not consider second-guessing it until you've identified that your application is not performing well, and that garbage collection is the likely culprit.
In that case, you have to ask these questions: 1) is your app running on a single-processor machine, or multi? 2) Are you more concerned with "minimizing pause time", or with "maximizing throughput"? That is, if you had to choose between the application never pausing but getting less work done overall, versus getting more work done overall, but pausing from time to time, which would you pick?
Roughly speaking, as a starting point:
On a multi-processor machine, mostly concerned with minimizing pause time, you'd tend to use the concurrent collector (consider enabling incremental mode)
On a multi-processor machine, mostly concerned with maximizing throughput, you'd tend to use the parallel collector (consider enabling parallel compaction)
On a single-processor machine, with small datasets (up to roughly 100 MB), you'd tend to use the serial collector
On a single-processor machine, mostly concerned with maximizing throughput, you'd tend to use the serial collector
On a single-processor machine, mostly concerned with minimizing pause time, you'd tend to use the concurrent collector (consider enabling incremental mode)
Again, though, the VM does a pretty good job of selecting a collector for you, and you're better off not overriding that unless and until you discover that it's not working well enough for your application.
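Before overriding anything, it helps to see which collector the VM actually picked. A minimal sketch using the standard java.lang.management API follows; the class name ActiveCollectors is made up for illustration, and the collector names printed depend on the JVM and its options.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library.
class ActiveCollectors {
    // Returns the names of the collectors registered by the running JVM,
    // for example "PS Scavenge" and "PS MarkSweep" for the parallel collector on HotSpot.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are -1 when the collector does not report them.
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running it once with default options and once with, say, -XX:+UseSerialGC shows the effect of the switch.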

Some collectors are better for throughput, others are better for response time. The difference is usually in how the collector chooses to pause the application. Some, such as CMS, use multiple passes to triage the garbage before stopping the application. This triage can happen in a background thread while the application is running, and thus does not interfere with your application as much as a collector that "stops the world" to do a GC.
Edit
Check out this document by Sun. Also, about halfway down there is a nice image showing the default mark-compact collector against the CMS collector. A picture is worth a thousand words, but the article is a good read too ;) Also worth reading is the documentation on the new G1 collector.

The basic problem is that the way a Java program sees memory (you call "new MyObject" and there it is, and when you are done with it you just forget about it) does not map very well to the underlying operating system and hardware.
The job of the garbage collector is to identify memory areas that are no longer in use by any object and "melt" them together into a LARGE memory area from which new objects can be allocated. The Java specification is very vague about HOW this is done, most likely in order to give the designers of this important task maximum flexibility.
Several approaches exist, each with advantages and disadvantages. What you usually want is a garbage collector that can keep up in the background with the rate at which objects are abandoned, because if it falls behind, the only way for it to catch up is to stop the program. That gives a really bad user experience.
A typical trend for Java objects is that they live either for a very short time (the current block or method) or for a very long time. Modern garbage collectors deal with this by having multiple pools so that young objects are treated differently from old objects.
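The pools mentioned above can be inspected from inside a running program. This is a small sketch using the standard MemoryPoolMXBean API; the class name HeapPools is hypothetical, and the pool names (for instance an eden, a survivor and an old generation pool) vary with the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
class HeapPools {
    // Names of the pools that make up the Java heap in this JVM.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (pool.getType() == MemoryType.HEAP && usage != null) {
                System.out.printf("%s: %d KB used%n", pool.getName(), usage.getUsed() / 1024);
            }
        }
    }
}
```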

Related

Should I use SerialGC or G1GC on a single CPU system?

I have a container that is limited to 1 CPU. The default for Java 11+ (and probably older versions too) in such a case is to use SerialGC.
Should I force a threaded GC (like G1GC) or just leave it at SerialGC?
Which one will perform better on a single CPU?
I always assumed SerialGC is better in such case but I frequently see G1GC forced in some cases.
EDIT: I'm asking for general case, because we have a lot of different apps running using the same configuration and it is hard to test each and every case.

According to the documentation.
The serial collector uses a single thread to perform all garbage
collection work, which makes it relatively efficient because there is
no communication overhead between threads.
It's best-suited to single processor machines because it can't take
advantage of multiprocessor hardware, although it can be useful on
multiprocessors for applications with small data sets (up to
approximately 100 MB).
I'm assuming processor = core in the documentation (and your question). While the documentation says that the serial collector is not a good option for multi-core machines, it doesn't say that other collectors would be bad for a single-core machine.
The other collectors do tend to use multiple threads though, and you won't get the full benefits of those in a single-core environment.
So why have you seen G1GC used? Maybe no reason other than it was the newest. However if there is a reason, it would most likely be the shorter GC pauses that G1 provides:
If response time is more important than overall throughput and garbage
collection pauses must be kept shorter than approximately one second,
then select a mostly concurrent collector with -XX:+UseG1GC or
-XX:+UseConcMarkSweepGC.
The best case scenario is that in those cases they measured the performance with different collectors and chose the one that provided the best results.
Also consider the String deduplication Holger mentioned in the comments. This is a specific memory optimization that can be the reason behind using G1GC. After all if you have a single core, you probably don't have a lot of memory at your disposal either.

What do you want to optimize? Do you want to always be able to respond extremely fast, or to have better overall performance? In the first case you should aim for shorter individual GC pauses, in the second for a lower sum of all GC pauses.
There are other factors to keep in mind (e.g. how often the applications are restarted), so IMO the best approach is a data-driven one. Use GCeasy or GCViewer to analyze the GC behavior of each application and act accordingly.
Please keep in mind that GC tuning is not always required, so if you do not know what you want to achieve, you are probably optimizing prematurely.
In general:
use the Serial GC for applications that do not have low pause time requirements and run in an environment with few resources
go with the G1 garbage collector if you have more resources or you need to respond fast (remember to measure the performance before and after the change)

As a more general comment, don't make the assumption that because you only have a single core/CPU that making a task multi-threaded will have no benefit. Depending on the task involved (in this case GC), there may well be situations where one thread becomes blocked (e.g. waiting for IO to complete), which allows other threads performing another part of the task to use the processor and complete useful work. Overall performance is increased, despite only one thread being able to run at a time.

One important thing that has not been mentioned in this thread is that the G1GC can return the memory (uncommit it) back to the OS, so if other applications are running on the server, they can make use of it.
I noticed this when switching from a single vCPU server to a 2 vCPU server, as Java by default uses SerialGC for a single CPU and G1GC for multiple CPUs (well, at least it does for JDK 11).
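To see what the JVM believes about its environment, the CPU count and heap limit it works with can be printed from Java. This is only a sketch with a made-up class name (CpuAndHeap); on recent JDKs the reported processor count is expected to reflect container CPU limits, but that is worth verifying on your own setup.

```java
// Hypothetical helper class for illustration only.
class CpuAndHeap {
    // Number of processors the JVM considers available to it.
    static int cpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("available processors: " + cpus());
        System.out.println("max heap (MB): " + maxHeapMb);
    }
}
```

Comparing this output on the 1 vCPU and 2 vCPU machines, together with the collector actually in use, is a cheap way to confirm the default selection described above.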

What is the use of garbage collector methods in java?

If you call System.gc() or Runtime.gc(), there is no guarantee that a garbage collection will happen. It's up to the JVM whether to perform a GC. Then what is the point of having such methods?

The javadoc for System.gc() and Runtime.gc() is alluding to the fact that it is possible to configure a JVM to ignore calls to those methods; e.g. using the -XX:+DisableExplicitGC JVM option.
However, they are not configured that way by default (at least in current versions of Oracle and OpenJDK Java). So, the calls will do something by default.
Having said that, in most situations it is a bad idea to call the garbage collector directly. The few cases where it is reasonable are mostly covered by the following:
if you are trying to investigate or test behavior of GC sensitive code; e.g. finalizers
if you are trying to avoid a GC pause at an inconvenient point, by running the GC at a point where the user won't notice.
I don't understand what is wrong with giving a guaranteed GC when I request System.gc()?
When you are able to invoke the garbage collector via a gc() call, it typically does a full collection. That is expensive, especially when the amount of non-garbage data is large [1]. Unfortunately, a lot of Java programmers don't realize this. So, (as I understand it) the primary reason for the JVM option to ignore explicit gc() calls is to mitigate the potentially catastrophic performance effect of programmers abusing the method.
If you do want your System.gc() calls to trigger a GC, the best advice is to make sure that you don't include -XX:+DisableExplicitGC in your JVM options.
Read the Oracle manual entry for the java command for more information.
[1] Most of the runtime cost of a garbage collection is in tracing and copying the graph of objects that are still reachable. If you tell the collector to run before it needs to, you reduce its efficiency. By contrast, the JVM itself knows when the heap is full, or close enough that a collection is warranted. Indeed, it can optimize for two different requirements: maximizing throughput, or minimizing GC pause times.
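To check whether explicit collections are disabled in a given JVM, and to get a feel for what a System.gc() call costs, something like the following sketch can be used. The class name ExplicitGcCheck is hypothetical, and the check only looks at the arguments the runtime reports, so a flag set some other way (or overridden by a later flag) would be missed.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class for illustration only.
class ExplicitGcCheck {
    // True if the given JVM arguments contain the HotSpot flag that makes System.gc() a no-op.
    static boolean explicitGcDisabled(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+DisableExplicitGC");
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("explicit GC disabled: " + explicitGcDisabled(jvmArgs));

        // Time the call; with a large live heap a full collection can take noticeably long.
        long start = System.nanoTime();
        System.gc();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("System.gc() returned after " + elapsedMs + " ms");
    }
}
```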

From the Java 7 docs
public static void gc()
Runs the garbage collector.
Calling the gc
method suggests that the Java Virtual Machine expend effort toward
recycling unused objects in order to make the memory they currently
occupy available for quick reuse. When control returns from the method
call, the Java Virtual Machine has made a best effort to reclaim space
from all discarded objects.
The call System.gc() is effectively equivalent to the call:
Runtime.getRuntime().gc()
So, essentially, it's a suggestion to the GC heuristics that right now is a good time to free some memory. For example, say you're writing a game where the framerate is locked to 60 FPS. Each frame has a budget of 16.6 (repeating, of course ;)) milliseconds. Say your frame only takes 5 ms to run. Usually, you would wait the remaining time with Thread.sleep. However, you could instead opt to call System.gc() first, to tell the VM "hey, I have some extra time -- feel free to clean up while I wait". Of course, you have no guarantee that the garbage collection will take less than the 11.6 ms you have remaining! But if done carefully it can help your memory usage and prevent garbage collection from happening at a bad time. Similar principles apply to other kinds of applications -- basically, if you know that your application will have some downtime, you can let the VM know with System.gc() and hopefully prevent the GC from instead deciding to run in the middle of something important.
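The frame-budget idea could look roughly like this. It is a sketch only: the class name FrameLoop, the three-frame loop and the "more than half the budget left" threshold are all assumptions made for illustration, and the game work itself is left out.

```java
// Hypothetical sketch of a fixed-rate loop that hints the GC during idle time.
class FrameLoop {
    static final long FRAME_NANOS = 1_000_000_000L / 60; // about 16.6 ms per frame

    // Nanoseconds left in the current frame's budget, never negative.
    static long remainingNanos(long frameStart, long now) {
        return Math.max(0, FRAME_NANOS - (now - frameStart));
    }

    public static void main(String[] args) throws InterruptedException {
        for (int frame = 0; frame < 3; frame++) {
            long start = System.nanoTime();
            // ... update and render would go here ...

            // Arbitrary threshold: only hint the GC when more than half the budget is left.
            if (remainingNanos(start, System.nanoTime()) > FRAME_NANOS / 2) {
                System.gc(); // a hint only; it may still take longer than the time left
            }

            // Sleep for whatever remains of the frame, if anything.
            long left = remainingNanos(start, System.nanoTime());
            Thread.sleep(left / 1_000_000, (int) (left % 1_000_000));
        }
    }
}
```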

These methods are provided in case someone wants to take control of garbage collection (which was valid for old JVMs), but it is always wise to leave garbage collection to the JVM, especially since modern JVM implementations have highly optimized garbage collectors. Modern JVM implementations have made our job easier, so we only have to focus on the Java code.

Basically you are right in saying there is no guarantee that the JVM will actually start a GC right after your System.gc() call. However, the GC can take note of your wish to collect now and actually run.
It depends on the JVM, but as far as I know the HotSpot JVM actually runs a GC after System.gc() or Runtime.gc(), at least most of the time.
So I would say that not having at least one way to suggest that the VM run a GC would have been a mistake. There can be different VM implementations. If a VM wants to guarantee that a GC actually runs after such a call, that doesn't break the specification, and it might be useful in some cases. And as I already mentioned, the HotSpot VM most probably won't ignore this call.

You are correct, gc() calls should not be provided in the first place when there is no use case. I can think of at least one positive and two negative points of using explicit GC calls:
If you are building an application where you have little control of JVM options and want to achieve some God level tuning from within the code, you may use an explicit call. But rest assured, that this is not a magic call to get GC done in scenarios where you suddenly expect low load for a few minutes. You may need to put in a lot of effort to achieve that like estimating responsiveness of GC, amount of memory to be collected, etc.
System.gc() or Runtime.getRuntime().gc() may serve as a reminder or suggestion, but whether to act on it is entirely the prerogative of the JVM. It might not do anything at all upon seeing such a request. Reference: Oracle Java
Having said that, explicit calls are usually avoided because GC is something that can be controlled and handled via external JVM options rather than from the code itself. For example, -XX:+DisableExplicitGC makes the JVM ignore explicit calls.

Multiple threads and garbage collection in Java

Can someone give me some advice on this? I am reading in an old text and some notes from my teacher that when using multiple threads with Java it's necessary to write a special program for garbage collection.
Does this still apply in Java SE6 and above? If it does could someone provide the standard way to do this.

Using a garbage collector makes writing multi-threaded code easier. This is because manual freeing of resources in a multi-threaded context is hard to get right. With GC it's something you don't need to worry about most of the time.
I am reading that when using multiple threads it's necessary to write a special program for garbage collection.
I don't believe this was ever the case.
Does this still apply in SE6 and above and if so is there a standard way to do this.
The standard way to do this is to not reference objects you don't need. e.g. if you have a local variable you don't need, let it drop out of scope.
It doesn't have to be complicated.

As far as I know, as long as nothing is pointing to an object, that object gets freed by the garbage collector.
Java's garbage collector is very robust in terms of circular references, and I don't see why it wouldn't work with multiple threads running at the same time.
So it is safe for you to assume that you don't need to write a special program for garbage collection, because Java will do it for you very effectively.
If you want to free objects in Java, just make sure that no variables are referencing your object (including structures such as lists and arrays from the Java collections or other libraries).
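The point about circular references can be tried out with a WeakReference, which does not keep its target alive. The sketch below uses made-up names (ReachabilityDemo, Node); since System.gc() is only a hint, the second line of output is typical behavior rather than something guaranteed.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class for illustration only.
class ReachabilityDemo {
    static final class Node {
        Node other;
    }

    // Builds two nodes that reference each other (a cycle).
    static Node[] makeCycle() {
        Node a = new Node();
        Node b = new Node();
        a.other = b;
        b.other = a;
        return new Node[] { a, b };
    }

    public static void main(String[] args) {
        Node[] cycle = makeCycle();
        WeakReference<Node> ref = new WeakReference<>(cycle[0]);
        System.out.println("reachable while referenced: " + (ref.get() != null));

        // Drop the only strong references; the cycle alone does not keep the nodes alive.
        cycle[0] = null;
        cycle[1] = null;
        cycle = null;

        System.gc(); // a hint; the collector usually, but not necessarily, runs here
        System.out.println("collected after hint: " + (ref.get() == null));
    }
}
```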

This article from JavaWorld in 2003, J2SE 1.4.1 boosts garbage collection, has this to say about the Java garbage collection prior to J2SE 1.4.1:
Mark and sweep is a "stop-the-world" garbage collection technique;
that is, all application threads stop until garbage collection
completes, or until a higher-priority thread interrupts the garbage
collector. If the garbage collector is interrupted, it must restart,
which can lead to application thrashing with little apparent result.
The other problem with mark and sweep is that many types of
applications can't tolerate its stop-the-world nature. That is
especially true of applications that require near real-time behavior
or those that service large numbers of transaction-oriented clients.
An article in Dr. Dobbs from 2009, G1: Java's Garbage First Garbage Collector, has this to say about Java garbage collector before SE 6.
Until recently, Java SE came with two main collectors: the parallel
collector, and the concurrent-mark-sweep (CMS) collector -- see the
sidebar Parallelism and Concurrency. As of the latest Java SE 6 update
release, the G1 collector is another option. The plan is for G1 to
eventually replace CMS as a low-pause, soft real-time collector. Let's
take a look at how it works.
So it may be that prior to SE 6 some additional precautions to assist with Java garbage collection may have helped, especially with multi-threaded applications with a fair amount of temporary variables generating garbage that needed collecting. However this should entail at most an explicit call to the garbage collector during slow times. Writing something special would seem very unusual.
However, things are much improved compared to how they were. Plus, garbage collection can vary between different versions of Java Virtual Machines.
So what may have been true years ago is almost definitely not true now with current technology.
This posting, How to monitor Java memory usage?, discusses monitoring Java memory usage as well as some of the pros and cons of calling the garbage collector explicitly.
Oracle has a Java Garbage Collection Basics tutorial that covers Java SE 7 Hotspot JVM.

Use the following code to call the garbage collector explicitly:
Runtime runtime = Runtime.getRuntime();
runtime.gc();
But it is not needed; the JVM will run the GC automatically at appropriate times.

Almost certainly your instructor's notes are stating (correctly) that since Java is a multithreaded environment, more care is needed when implementing the garbage collector inside the Java run time environment than would be necessary if only a single thread were involved. This is true of any multithreaded environment.
As others have said, you the programmer don't see any of this complexity. That's the gift of automatic memory management that gc provides.

Is a garbage collector (.net/java) an issue for real-time systems?

When building a system which needs to respond very consistently and fast, is having a garbage collector a potential problem?
I remember horror stories from years ago where the typical example always was an action game where your character would stop for a few seconds in mid-jump, when the garbage collector would do its cleanup.
We are some years further, but I'm wondering if this is still an issue. I read about the new garbage collector in .Net 4, but it still seems a lot like a big black box, and you just have to trust everything will be fine.
If you have a system which always has to be quick to respond, is having a garbage collector too big of a problem, and is it better to choose a more hardcore, control-it-yourself language like C++? I would hate it if it turned out to be a problem and there were basically almost nothing you could do about it, other than waiting for a new version of the runtime or doing very weird things to try and influence the collector.
EDIT
Thanks for all the great resources. However, it seems that most articles/custom GCs/solutions pertain to the Java environment. Does .Net also have tuning capabilities or options for a custom GC?

To be precise, garbage collectors are a problem for real-time systems. To be even more precise, it is possible to write real-time software in languages that have automatic memory management.
More details can be found in the Real Time Specification for Java on one of the approaches for achieving real-time behavior using Java. The idea behind RTSJ is very simple - do not use a heap. RTSJ provides for new varieties of Runnable objects that ensure threads do not access heap memory of any kind. Threads can either access scoped memory (nothing unusual here; values are destroyed when the scope is closed) or immortal memory (that exists throughout the application lifetime). Variables in the immortal memory are written over, time and again with new values.
Through the use of immortal memory, RTSJ ensures that threads do not access the heap, and more importantly, the system does not have a garbage collector that preempts execution of the program by the threads.
More details are available in the paper "Project Golden Gate: Towards Real-Time Java in Space Missions" published by JPL and Sun.

I've written games in Java and .NET and never found this to be a big problem. I expect your "horror stories" are based on the garbage collectors of many years ago - the technology really has moved a long way since then.
The only thing I would hesitate to use Java/.NET for on the basis of garbage collection would be something like embedded programming with hard real-time constraints (e.g. motion controllers).
However you do need to be aware of GC pauses and all of the following can be helpful in minimising the risk of GC pauses:
Minimise new object allocations - while object allocations are extremely fast in modern GC systems, they do contribute to future pauses so should be minimised. You can use techniques like pre-allocating arrays of objects, keeping object pools or using unboxed primitives.
Use specialized low-latency libraries such as Javolution for heavily used functions and data types. These are designed specifically for real-time / low-latency applications.
Make sure you are using the best GC algorithm when there are multiple versions available. I've heard good things about the Sun G1 Collector for low latency applications. The best GC systems do most of their collections concurrently so that garbage collections do not have to "stop the world" for very long if at all.
Tune the GC parameters appropriately. Usually there is a trade-off between overall performance and pause times, you may want to improve the latter at the expense of the former.
If you're very rich, you can of course buy machines with hardware GC support. :-)
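As an illustration of the object pool technique from the list above, here is a deliberately small sketch. SimplePool is a made-up class, it is not thread-safe, and resetting an object's state before reuse is left to the caller.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical minimal pool: reuses instances instead of allocating new ones.
class SimplePool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    SimplePool(Supplier<T> factory, int preallocate) {
        this.factory = factory;
        for (int i = 0; i < preallocate; i++) {
            free.push(factory.get()); // allocate up front, outside the hot path
        }
    }

    // Hands out a pooled instance if one is free, otherwise allocates a new one.
    T acquire() {
        T pooled = free.poll();
        return pooled != null ? pooled : factory.get();
    }

    // Returns an instance to the pool; the caller is responsible for resetting it.
    void release(T object) {
        free.push(object);
    }

    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new, 2);
        StringBuilder sb = pool.acquire();
        sb.append("frame data");
        sb.setLength(0); // reset state before handing it back
        pool.release(sb);
        System.out.println("available: " + pool.available()); // prints "available: 2"
    }
}
```

Whether pooling pays off depends on the workload, since short-lived allocations are already cheap for generational collectors; measuring before and after is the safer route.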

Yes, garbage must be handled in a deterministic manner in real-time systems.
One approach is to schedule a certain amount of garbage collection time during each memory allocation. This is called "work-based garbage collection." The idea is that in the absence of leaks, allocation and collection should be proportional.
Another simple approach ("time-based garbage collection") is to schedule a certain proportion of time for periodic garbage collection, whether it is needed or not.
In either case, it is possible that a program will run out of usable memory because it is not allowed to spend enough time to do a full garbage collection. This is in contrast to a non-realtime system, which is permitted to pause as long as it needs to in order to collect garbage.

From a theoretical point of view, garbage collectors are not a problem but a solution. Real-time systems are hard when there is dynamic memory allocation. In particular, the usual C functions malloc() and free() do not offer real-time guarantees (they are normally fast but have, at least theoretically, "worst cases" where they use inordinate amounts of time).
It so happens that it is possible to build a dynamic memory allocator which offers real-time guarantees, but this requires the allocator to do some heavy stuff, in particular moving some objects in RAM. Object moving implies adjusting pointers (transparently, from the application code point of view), and at that point the allocator is just one small step away from being a garbage collector.
Usual Java or .NET implementations do not offer real-time garbage collection, in the sense of guaranteed response times, but their GCs are still heavily optimized and have very short response times most of the time. Under normal conditions, very short average response times are better than guaranteed response times ("guaranteed" does not mean "fast").
Also, note that usual Java or .NET implementations run on operating systems which are not real-time either (the OS can decide to schedule other threads, or may aggressively send some data to a swap file, and so on), and neither is the underlying hardware (e.g. a typical hard disk may make "recalibration pauses" from time to time). If you are ready to tolerate the occasional timing glitch due to the hardware, then you should be fine with a (carefully tuned) JVM garbage collector. Even for games.

It is a potential problem, BUT...
Your character might also freeze in the middle of your C++ program while the OS retrieves a page of memory from an overtaxed hard disk. If you are not using a real-time OS on hardware designed to provide concrete performance guarantees, you are never guaranteed performance.
To get a more specific answer, you'd have to ask about a specific implementation of a specific virtual machine. You can use a garbage-collected virtual machine for real-time systems if it provides suitable performance guarantees about garbage collection.

You bet it is a problem. If you are writing low-latency applications you cannot afford the stop-the-world pauses that most garbage collectors impose. Since Java does not allow you to turn off the GC, your only option is to produce no garbage. That can be done and has been done through object pooling and bootstrapping. I wrote a blog article where I talk about this in detail.

Our company uses a large .Net-based software application that, among other things, monitors binary sensors over fieldbus networks. In some situations, the sensors activate only for a short amount of time (300 ms), but our software still needs to capture those events, as the controlled system will immediately fail when an event is missed. We recently observed increased problems at our customer sites due to the garbage collector running for long timespans (up to 1 second). We are still trying to figure out how to enforce a time limit on the garbage collector. To conclude this short story, I would say the garbage collector is a handicap in time-critical applications.

How to monitor Java memory usage?

We have a J2EE application running on JBoss and we want to monitor its memory usage. Currently we use the following code:
System.gc();
Runtime rt = Runtime.getRuntime();
long usedMB = (rt.totalMemory() - rt.freeMemory()) / 1024 / 1024;
logger.information(this, "memory usage" + usedMB);
This code works fine. That is, it shows a memory curve that corresponds to reality. When we create a big XML file from a DB the curve goes up, and after the extraction is finished it goes down.
A consultant told us that calling gc() explicitly is wrong: "let the JVM decide when to run gc". Basically his arguments were the same as those discussed here.
But I still don't understand:
how can I have my memory usage curve?
what is wrong with the explicit gc()? I don't care about the small performance issues which can happen with an explicit gc() and which I would estimate at 1-3%. What I need is a memory and thread monitor which helps me analyze our system on the customer site.

If you want to really look at what is going on in the VM memory you should use a good tool like VisualVM. This is Free Software and it's a great way to see what is going on.
Nothing is really "wrong" with explicit gc() calls. However, remember that when you call gc() you are "suggesting" that the garbage collector run. There is no guarantee that it will run at the exact time you run that command.

There are tools that let you monitor the VM's memory usage. The VM can expose memory statistics using JMX. You can also print GC statistics to see how the memory is performing over time.
Invoking System.gc() can harm the GC's performance because objects will be prematurely moved from the new to old generations, and weak references will be cleared prematurely. This can result in decreased memory efficiency, longer GC times, and decreased cache hits (for caches that use weak refs). I agree with your consultant: System.gc() is bad. I'd go as far as to disable it using the command line switch.

You can take a look at stagemonitor. It is an open source Java (web) application performance monitor. It captures response time metrics, JVM metrics, request details (including a call stack captured by the request profiler) and more. The overhead is very low.
Optionally, you can use the great time series database Graphite with it to store a long history of data points that you can look at with fancy dashboards.
Take a look at the project website to see screenshots, feature descriptions and documentation.
Note: I am the developer of stagemonitor

I would say that the consultant is right in the theory, and you are right in practice. As the saying goes:
In theory, theory and practice are the same. In practice, they are not.
The Java spec says that System.gc suggests to call garbage collection. In practice, it just spawns a thread and runs right away on the Sun JVM.
Although in theory you could be messing up some finely tuned JVM implementation of garbage collection, unless you are writing generic code intended to be deployed on any JVM out there, don't worry about it. If it works for you, do it.

Have you tried JMX?
http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html

Peek into what is happening inside Tomcat through VisualVM.
http://www.skill-guru.com/blog/2010/10/05/increasing-permgen-size-in-your-server/

Take a look at the JVM args: http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp#DebuggingOptions
-XX:-PrintGC  Print messages at garbage collection. Manageable.
-XX:-PrintGCDetails  Print more details at garbage collection. Manageable. (Introduced in 1.4.0.)
-XX:-PrintGCTimeStamps  Print timestamps at garbage collection. Manageable. (Introduced in 1.4.0.)
-XX:-PrintTenuringDistribution  Print tenuring age information.
(The list shows each flag in its default, disabled form; write it with a plus sign, e.g. -XX:+PrintGC, to turn it on.)
While you're not going to upset the JVM with explicit calls to System.gc(), they may not have the effect you are expecting. To really understand what's going on with the memory in a JVM, read anything and everything that Brian Goetz writes.

Explicitly running System.gc() on a production system is a terrible idea. If the memory gets to any size at all, the entire system can freeze while a full GC is running. On a multi-gigabyte-sized server, this can easily be very noticeable, depending on how the jvm is configured, and how much headroom it has, etc etc - I've seen pauses of more than 30 seconds.
Another issue is that by explicitly calling GC you're not actually monitoring how the JVM is running the GC, you're actually altering it. Depending on how you've configured the JVM, it's going to garbage collect when appropriate, and usually incrementally (it doesn't just run a full GC when it runs out of memory). What you'll be printing out will be nothing like what the JVM would do on its own. For one thing, you'll probably see fewer automatic / incremental GCs, as you'll be clearing the memory manually.
As Nick Holt's post points out, options to print GC activity already exist as JVM flags.
You could have a thread that just prints out free and available memory at reasonable intervals; this will show you actual memory usage.
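A sampling thread of that kind might look like the sketch below, which uses the same Runtime arithmetic as the question but without forcing a collection. The class name MemorySampler, the one-second interval and the short demo run are assumptions; in a real application the output would go to your logger.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sampler: logs used heap periodically without calling System.gc().
class MemorySampler {
    // Currently used heap in megabytes, including garbage that has not been collected yet.
    static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) throws InterruptedException {
        // Daemon thread so the sampler never keeps the JVM alive on its own.
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "memory-sampler");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleAtFixedRate(
                () -> System.out.println("used MB: " + usedMb()), 0, 1, TimeUnit.SECONDS);

        Thread.sleep(3_500); // stand-in for the application's real work
        timer.shutdownNow();
    }
}
```

The resulting curve is a sawtooth rather than a smooth line, because uncollected garbage is counted until the JVM decides to collect it.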

If you would like a nice way to do this from the command line, use jstat:
http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jstat.html
It gives raw information at configurable intervals which is very useful for logging and graphing purposes.

If you use Java 1.5 or later, you can look at ManagementFactory.getMemoryMXBean(), which gives you numbers on all kinds of memory: heap and non-heap, including perm-gen.
A good example can be found here:
http://www.freshblurbs.com/explaining-java-lang-outofmemoryerror-permgen-space
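For reference, reading those numbers takes only a few lines. This is a minimal sketch with a made-up class name (HeapReport) and an output format chosen purely for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class for illustration only.
class HeapReport {
    // Formats one MemoryUsage snapshot; getMax() is -1 when no maximum is defined.
    static String describe(String label, MemoryUsage usage) {
        return String.format("%s: used=%d KB, committed=%d KB, max=%d bytes",
                label, usage.getUsed() / 1024, usage.getCommitted() / 1024, usage.getMax());
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println(describe("heap", memory.getHeapMemoryUsage()));
        System.out.println(describe("non-heap", memory.getNonHeapMemoryUsage()));
    }
}
```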

If you use the JMX-provided history of GC runs, you get the same before/after numbers; you just don't have to force a GC.
You just need to keep in mind that those GC runs (typically one for the old and one for the new generation) do not happen at regular intervals, so you need to extract the start time as well for plotting (or you plot against a sequence number; for most practical purposes that would be enough).
For example, on the Oracle HotSpot VM with the parallel collector (-XX:+UseParallelGC), there is a JMX MBean called java.lang:type=GarbageCollector,name=PS Scavenge. It has an attribute LastGcInfo, which returns a CompositeData describing the last young generation scavenge. It records the duration, the startTime (in milliseconds since JVM start), and memoryUsageBeforeGc and memoryUsageAfterGc.
Just use a timer to read that attribute. Whenever a new startTime shows up you know that it describes a new GC event; you extract the memory information and keep polling for the next update. (Not sure if an AttributeChangeNotification can somehow be used.)
Tip: in your timer you might measure the time since the last GC run, and if that is too long for the resolution of your plot, you could invoke System.gc() conditionally. But I would not do that in an OLTP instance.
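Inside the same JVM, the attribute can also be read directly through the HotSpot-specific com.sun.management API instead of a remote JMX connection. The following is a sketch under that assumption; the class name LastGcPoller is made up, it reads each collector once rather than on a timer, and the System.gc() call is there only so the demo has an event to show.

```java
import com.sun.management.GcInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.Map;

// Hypothetical reader for the last GC event of each collector (HotSpot-specific API).
class LastGcPoller {
    // Sums the "used" bytes over all memory pools in a before/after snapshot.
    static long totalUsed(Map<String, MemoryUsage> pools) {
        long sum = 0;
        for (MemoryUsage usage : pools.values()) {
            sum += usage.getUsed();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.gc(); // demo only: makes it likely that there is an event to report

        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (!(bean instanceof com.sun.management.GarbageCollectorMXBean)) {
                continue; // not a HotSpot-style bean, no LastGcInfo available
            }
            GcInfo info = ((com.sun.management.GarbageCollectorMXBean) bean).getLastGcInfo();
            if (info == null) {
                continue; // this collector has not run yet
            }
            System.out.printf("%s #%d: start=%d ms, duration=%d ms, used %d -> %d KB%n",
                    bean.getName(), info.getId(), info.getStartTime(), info.getDuration(),
                    totalUsed(info.getMemoryUsageBeforeGc()) / 1024,
                    totalUsed(info.getMemoryUsageAfterGc()) / 1024);
        }
    }
}
```

A real poller would run this on a timer and remember the last id (or startTime) per collector, emitting a data point only when it changes.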

As has been suggested, try VisualVM to get a basic view.
You can also use Eclipse MAT, to do a more detailed memory analysis.
It's OK to do a System.gc() as long as you don't depend on it for the correctness of your program.

The problem with System.gc() is that the JVM already automatically allocates time to the garbage collector based on memory usage.
However, if you are, for instance, working under very memory-limited conditions, such as on a mobile device, System.gc() lets you manually allocate more time to garbage collection, at the cost of CPU time (but, as you said, you aren't that concerned about the performance cost of gc).
Best practice would probably be to only use it where you might be doing large amounts of deallocation (like flushing a large array).
All considered, since you are simply concerned about memory usage, feel free to call gc, or, better yet, see if it makes much of a memory difference in your case, and then decide.

About System.gc()... I just read the following sentence in Oracle's documentation:
The performance effect of explicit garbage collections can be measured by disabling them using the flag -XX:+DisableExplicitGC, which causes the VM to ignore calls to System.gc().
If your VM vendor and version support that flag, you can run your code with and without it and compare performance.
Also note the previous quoted sentence is preceded by this one:
This can force a major collection to be done when it may not be necessary (for example, when a minor collection would suffice), and so in general should be avoided.

JavaMelody might be a solution for your need.
Developed for Java EE applications, this tool measures and reports on the real operation of your applications in any environment. It's free and open source, easy to integrate into applications, and really lightweight: it keeps some history but needs neither a database nor profiling.
