Does periodic garbage collection help JVM performance? - java

I just encountered the following code (slightly simplified):
/* periodically requests garbagecollect to improve memory usage and
   garbage collect performance under most JVMs */
static class GCThread implements Runnable {
    public void run() {
        while (true) {
            try {
                Thread.sleep(300000);
            } catch (InterruptedException e) {}
            System.gc();
        }
    }
}

Thread gcThread = new Thread(new GCThread());
gcThread.setDaemon(true);
gcThread.start();
I respect the author of the code, but I no longer have easy access to him to ask him to defend the assertion in the comment at the top.
Is this true? It very much goes against my intuition that this little hack should improve anything. I would expect the JVM to be much better equipped to decide when to perform a collection.
The code is running in a web application inside IBM WebSphere on z/OS.

It depends.
First, the JVM can completely ignore System.gc(), so this code could do absolutely nothing.
Second, a GC has a cost. If your program wouldn't otherwise have done a GC (say, it doesn't generate much garbage, or it has a huge heap and never needs to collect), then this code is pure added overhead.
And if the program would normally run with just minor GCs while this code causes a major GC, the impact will be negative.
All in all, this kind of optimisation makes absolutely no sense whatsoever unless you have concrete evidence that it provides benefit and you would need to re-evaluate that evidence every time the program materially changed.
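If you do want to gather that evidence, one cheap experiment (a sketch, not specific to the code above) is to compare runs with the explicit calls honoured and with them disabled. On HotSpot-style JVMs the flags below exist; the jar name is just a placeholder, and the IBM JVM mentioned in the question will have its own equivalents:
java -verbose:gc -jar yourapp.jar
java -verbose:gc -XX:+DisableExplicitGC -jar yourapp.jar
Comparing the logged pause counts and durations between the two runs tells you whether the five-minute System.gc() loop is helping or hurting.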

I also share your assumption. If this really were an optimization, it would have found its way into the JVM. Calling the garbage collector explicitly should be avoided; it can even have a negative effect (because you are "disturbing" the JVM).
Most JVMs also have a setting for the GC interval; see Sun's documentation for theirs. And hard-coding such a value is rather questionable for anything, especially for garbage collection.
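If the Sun setting referred to is the RMI distributed-GC interval (that is my assumption, since the original link is gone), it is controlled by system properties rather than by code, along these lines (the jar name is a placeholder):
java -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -jar yourapp.jar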

Maybe it could be a good thing if your application could time the calls so that a GC happens when the application has nothing useful to do, so the memory is clean whenever the next load peak comes. But that is not what this loop does (it simply runs every 5 minutes), so I would advise against it.
But I do invite you to test it: run the application with and without this loop for similar work volumes and measure which takes more time (and perhaps total memory, if that matters), as sketched below. Repeat the test a few times. Maybe you will get some surprising insights.
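A minimal sketch of that experiment, assuming runWorkload() stands in for a representative batch of your application's work (the method name is hypothetical):
long start = System.nanoTime();
runWorkload(); // run this once with the GC thread started, once without
System.out.printf("elapsed: %.1f s%n", (System.nanoTime() - start) / 1e9);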

Related

Forcing Java virtual machine to run garbage collector [duplicate]

I have a complex Java application running on a large dataset. The application performs reasonably fast, but over time it seems to eat lots of memory and slow down. Is there a way to run the JVM garbage collector without restarting the application?
No, you can't force garbage collection.
Even using
System.gc();
you are only making a request; it is up to the JVM whether or not to act on it.
Also, garbage collectors are smart enough to collect unused memory when required, so instead of forcing a collection you should check whether you are handling objects in the wrong way.
If you are handling objects in the wrong way (like keeping references to objects you no longer need), there is hardly anything the JVM can do to free the memory; a sketch of that pattern follows below.
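For illustration, a hedged sketch of the kind of mishandling meant here (the class and field names are made up): a collection that quietly keeps every object reachable.
import java.util.ArrayList;
import java.util.List;

class LeakyCache {
    // Entries are added but never removed, so every byte[] stays strongly
    // reachable and no amount of garbage collection can reclaim it.
    static final List<byte[]> CACHE = new ArrayList<byte[]>();

    void handleRequest() {
        CACHE.add(new byte[1024 * 1024]); // ~1 MB retained per call, forever
    }
}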
From the documentation:
Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.
Open bug regarding the System.gc() documentation:
The documentation for System.gc() is extremely misleading and fails to make reference to the recommended practise of never calling System.gc().
The choice of language leaves it unclear what the behaviour would be when System.gc() is called and what external factors will influence the behaviour.
A few useful links to visit when you think you should force the JVM to free up some memory:
1. How does garbage collection work
2. When does System.gc() do anything
3. Why is it bad practice to call System.gc()?
They all say:
1. You don't have control over GC in Java; even System.gc() doesn't guarantee anything.
2. It is also bad practice, since forcing it may have an adverse effect on performance.
3. Revisit your design and let the JVM do its work :)
You should not rely on System.gc(). If you feel like you need to force the GC to run, it usually means that there is something wrong with your code or design. The GC will run and clear your unused objects once they are ready to be collected. Review your design, think more about memory management, and also look for loops in object references.
The
System.gc()
call in Java suggests to the VM that it run garbage collection, though there is no guarantee that it actually will. Nevertheless, it is the best option you have. As mentioned in other responses, the jvisualvm utility (present in the JDK since JDK 6 update 7) provides garbage collection functionality as well.
EDIT:
Your question whetted my appetite for the topic, and I came across this resource:
oracle gc resource
The application performs reasonably fast but as time goes it seems to eat lots of memory and slow down.
These are classic symptoms of a Java memory leak. It is likely that somewhere in your application there is a data structure that just keeps growing. As the heap gets close to full, the JVM spends an increasing proportion of its time running the GC in a (futile) attempt to claw back some space.
Forcing the GC won't fix this, because the GC can't collect the data structure. In fact forcing the GC to run just makes the application slower.
The cure for the problem is to find what is causing the memory leak, and fix it.
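A common starting point is a heap dump of the live objects, taken with the JDK's jmap tool and opened in a heap analyzer such as VisualVM or Eclipse MAT; a sketch, where the PID and file name are placeholders:
jmap -dump:live,format=b,file=heap.hprof <pid>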
Performance gain or loss depends on how often you need garbage collection, how much memory your JVM has, and how much your program needs.
There is no certainty of a collection when you call System.gc() (it is just a hint to the JVM), but there is at least a probability. With enough calls, you can measure some statistically derived performance multiplier, valid only for your particular setup.
The graph below shows an example program's memory consumption across executions; the JVM was given 1 GB (no gc), 1 GB (gc), 3 GB (gc) and 3 GB (no gc) heaps, respectively, for the four trials.
At first, when the JVM was given only 1 GB of memory while the program needed 3.75 GB, it took more than 50 seconds for the producer thread pool to complete its job, because the constrained heap led to a poor object creation rate.
The second trial is about 40% faster, because System.gc() is called between productions of 150 MB of object data.
In the third trial, the JVM is given 3 GB of heap while keeping the System.gc() calls. More memory gave more performance, as expected.
But when I turned System.gc() off in the same 3 GB environment, it was faster!
Even if we cannot force it, we can see some percentage gain or loss in performance from trying System.gc(), if we try long enough. At least that was the case on my Windows 7 64-bit operating system with the latest JVM.
The garbage collector runs automatically; you can't force it.
I do not suggest that you do this, but to make the garbage collector run from within your Java code you can simply use up all the available memory. This works because the garbage collector will run before the JVM throws an OutOfMemoryError...
try {
    List<Object> tempList = new ArrayList<Object>();
    while (true) {
        // keep allocating in modest chunks until the heap is exhausted;
        // the JVM will run the garbage collector before giving up and
        // throwing OutOfMemoryError
        tempList.add(new byte[1024 * 1024]);
    }
} catch (OutOfMemoryError OME) {
    // OK, Garbage Collector will have run now...
}
My answer is going to be different from the others, but it leads to the same point.
To explain:
YES, it is possible to force the garbage collector by using two methods together, in this order:
System.gc();
System.runFinalization();
These two calls push the garbage collector to execute the finalize() method of any unreachable object and free the memory. However, the performance of the software will drop considerably, because the collector runs in its own thread, which you cannot control, and depending on the algorithm the collector uses this can lead to unnecessary extra processing. It is better to check your code: if you need to call the garbage collector for your program to work well, something in the code is broken.
NOTE: just keep in mind that this only works if the finalize() method does not store a new reference to the object; if it does, the object stays alive and undergoes a "resurrection", which is technically possible.
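A hedged sketch of the resurrection case mentioned in the note; the class name is made up:
class Phoenix {
    static Phoenix rescued;

    @Override
    protected void finalize() {
        // Storing a reachable reference to "this" resurrects the object,
        // so the collector cannot reclaim it on this cycle.
        rescued = this;
    }
}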

Benchmarking inside Java code

I have been looking into benchmarking lately; I have always been interested in logging program data and so on. I was wondering whether we can implement our own memory-usage and time-consumption measurements efficiently inside our program. I know how to check how long a piece of code takes to run:
public static void main(String[] args) {
    long start = System.currentTimeMillis();
    // code
    System.out.println(System.currentTimeMillis() - start);
}
I also looked into Robust Java benchmarking, Part 1: Issues. That tutorial is very comprehensive; it explains the drawbacks of System.currentTimeMillis() and then suggests using System.nanoTime() instead (making it more accurate?).
I also looked at Determining Memory Usage in Java for memory usage. That website shows how to implement it. The code provided looks inefficient, because the author calls
long L = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
and after this calls System.gc() (4 * 4) = 16 times, then repeats the whole process again.
Doesn't this also take up memory?
So in conclusion, is it possible to implement efficient benchmarking code inside your Java program?
Yes, it is possible to implement performance benchmarks effectively in Java code. The important point is that any kind of performance benchmark adds its own overhead; the question is how much of it you are willing to accept. System.currentTimeMillis() is a good enough basis for most performance measurements, and in most cases nanoTime() is overkill.
For memory, System.gc() will show you varying results across runs (since a GC run is never guaranteed). I generally use VisualVM for memory profiling (it's free) and then use TDA for analyzing dumps.
One way to do it less invasively is to use aspect-oriented programming. You can create a single aspect that runs on a particular annotation or set of methods and write an @Around advice to collect performance data.
Here is a small snippet:
public class TestAspect {

    @LogPerformance
    public void thisMethodNeedsToBeMonitored() {
        // Do something
    }

    public void thisMethodNeedsToBeMonitoredToo() {
        // Do something
    }
}

@interface LogPerformance {}

@Aspect
class PerformanceAspect {

    @Around("the pointcut expression to pick up all " +
            "the @LogPerformance annotated methods")
    public void logPerformance() {
        // Log performance here
        // Log it to a file
    }
}
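For reference, a hedged sketch of what a working version of that advice might look like with AspectJ's annotation style; the pointcut expression and package layout are assumptions:
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
class TimingAspect {

    // Intercept every execution of a method annotated with @LogPerformance
    // (assumed to be resolvable here; adjust the expression to your packages).
    @Around("execution(* *(..)) && @annotation(LogPerformance)")
    public Object logPerformance(ProceedingJoinPoint pjp) throws Throwable {
        long start = System.nanoTime();
        try {
            return pjp.proceed(); // run the intercepted method
        } finally {
            long micros = (System.nanoTime() - start) / 1000;
            System.out.println(pjp.getSignature() + " took " + micros + " µs");
        }
    }
}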
It may be impossible to benchmark without some Heisenberg effect, i.e. your benchmarking code itself also being measured. However, if the pieces you measure are large enough, the effect will be negligible.
Any benchmarked code is going to be less efficient than non-benchmarked code, simply because it has more to do. That said, Java in particular causes issues, as the article states, because garbage collection happens whenever the JRE feels like it. Even the documentation for System.gc() says it makes a "best effort".
As for your specific questions:
System.gc shouldn't take up more memory, but it will take processor resources.
It is somewhat possible based on what you're trying to benchmark. There will always be some interference. If you are willing to go outside of your code, there are tools like VisualVM to watch memory usage from outside of your application.
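If you do want a rough in-code number anyway, a hedged sketch of the usual pattern: request a collection, pause briefly, then read the heap counters (results will still vary between runs):
Runtime rt = Runtime.getRuntime();
rt.gc(); // only a hint, as discussed above
try {
    Thread.sleep(100); // give the collector a moment to settle
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
long usedBytes = rt.totalMemory() - rt.freeMemory();
System.out.println("approx. heap in use: " + usedBytes / (1024 * 1024) + " MB");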
Edit: Corrected the wording of what System.gc docs say.

System.gc problem

I'm running a scheduling algorithm with a garbage collection snippet that looks like this:
// garbage collection
if (state.children.isEmpty()) { // if this is a leaf node (no children)
    state.parent.children.remove(state);
    System.gc();
}
At first, the algorithm runs smoothly with no pauses, but after a while, as the tree starts getting bigger, there is a noticeable pause at each gc call.
So I thought: maybe I should call gc less frequently? I modified my code to this:
// garbage collection
if (state.children.isEmpty()) { // if this is a leaf node (no children)
    state.parent.children.remove(state);
    if (index % 10000 == 0)
        System.gc();
}
But this doesn't seem to actually do any cleanup; my program throws an OutOfMemoryError anyway.
How should I invoke the garbage collector correctly so that it isn't called too many times?
You shouldn't need to call the garbage collector explicitly at all. It's very occasionally appropriate, but I would normally be pretty suspicious if you find you need it.
Have you tried running with detailed GC logging turned on? It can be awkward to understand at first, but it should show you what's going on. I wouldn't be surprised to find that you've actually got a leak somewhere, and that by GC-ing on every iteration you've simply slowed your program down enough that you haven't yet reached the point at which it bites.
How much memory have you allocated for the VM? Tweaking the memory settings (and indeed the GC settings) can have a big impact on some workloads.
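For reference, a hedged example of switching that GC logging on from the command line; these are the classic HotSpot flags (JDK 9 and later use -Xlog:gc* instead), and the jar name is a placeholder:
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -jar yourapp.jar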
The pause is probably the garbage collection itself happening. As Frederik mentions, are you sure you have to invoke the GC manually? Generally you shouldn't need to. If you're concerned about your memory usage, feel free to prune your tree more often, but let the GC decide when to run and don't invoke it manually.
You mentioned that your second snippet results in OutOfMemoryErrors, though, so maybe you have some other problems going on; you might want to show some more code.

What conditions would prevent the JVM from running a FULL Garbage Collection?

What conditions would prevent the JVM from running a full garbage collection when the CPU is at 5% to 8% load?
I am seeing a constant cycle of shallow (minor) GCs, but I am not able to tune the JVM into wanting to run a full GC.
Where can I find the conditions under which the JVM decides "I am too busy to run a full collection"?
When I was studying for my SCJP certification, a lot of emphasis was placed on this point:
"You can not do anything to force the GC to run at any given time; you can just give hints to it."
The whole idea of having an automatic GC is precisely not having to worry about how or when it runs to free memory for you. So there is no way to actually change when or how the GC runs; you would have to re-implement the JVM to do what you want.
There are just so many factors involved in this that there may be other, more elegant solutions to your underlying problem.
It depends entirely on the garbage collector algorithm that you're using in your particular JDK. About all you can guarantee about garbage collection is that if the JVM throws an OutOfMemoryError, the garbage collector made its best effort to collect every unreachable/weakly reachable object. Even System.gc() doesn't guarantee anything, a no-op is a completely legal implementation.
Hence, in that light, I don't know if your question carries much weight. If you truly believe that you need to tweak the garbage collector, it would help if you posted the problems you're seeing, and the profiling data that leads you to believe that poor GC performance is the problem.
Outside of this, the garbage collector should be treated like a black box. The logic behind its implementation is surprisingly complex, and there's a very good chance it knows better than you what it ought to be doing at any given time. 99 times out of 100, trying to force the garbage collector to behave in a particular way will lower performance, not increase it.
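If you want that profiling data cheaply, one option is the JDK's jstat tool, which samples the collector's counters from outside the process; a sketch, where the PID is a placeholder and 1000 is the sampling interval in milliseconds:
jstat -gcutil <pid> 1000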
It's not that the JVM is too busy to run a full collection; it simply does not need the extra memory yet.

Java: enough free heap to create an object?

I recently came across this in some code - basically someone trying to create a large object, coping when there's not enough heap to create it:
try {
    // try to perform an operation using a huge in-memory array
    byte[] massiveArray = new byte[BIG_NUMBER];
} catch (OutOfMemoryError oome) {
    // perform the operation in some slower but less
    // memory-intensive way...
}
This doesn't seem right, since Sun themselves recommend that you shouldn't try to catch Error or its subclasses. We discussed it, and another idea that came up was explicitly checking for free heap:
if (Runtime.getRuntime().freeMemory() > SOME_MEMORY) {
    // quick memory-intensive approach
} else {
    // slower, less demanding approach
}
Again, this seems unsatisfactory - particularly in that picking a value for SOME_MEMORY is difficult to easily relate to the job in question: for some arbitrary large object, how can I estimate how much memory its instantiation might need?
Is there a better way of doing this? Is it even possible in Java, or is any idea of managing memory below the abstraction level of the language itself?
Edit 1: in the first example, it might actually be feasible to estimate the amount of memory a byte[] of a given length might occupy, but is there a more generic way that extends to arbitrary large objects?
Edit 2: as @erickson points out, there are ways to estimate the size of an object once it's created, but (ignoring a statistical approach based on previous object sizes) is there a way of doing so for yet-uncreated objects?
There also seems to be some debate as to whether it's reasonable to catch OutOfMemoryError - anyone know anything conclusive?
freeMemory() isn't quite right on its own. You'd also have to add maxMemory() - totalMemory(). For example, assuming you start the VM with max memory = 100M, the JVM may at the time of your method call only have claimed 50M from the OS. Of that, let's say 30M is actually in use. That means you'll see 20M free (roughly, because we're only talking about the heap here), but if you try to create your larger object, the JVM will attempt to grab the other 50M its contract allows it to take from the OS before giving up and throwing the error. So you'd actually (theoretically) have 70M available; see the sketch below.
To make this more complicated, the 30M reported as in use in the above example includes things that may be eligible for garbage collection. So you may actually have even more memory available: if the JVM hits the ceiling, it will run a GC to free more memory before failing.
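A hedged sketch of that arithmetic using the Runtime API:
Runtime rt = Runtime.getRuntime();
long free = rt.freeMemory();                          // unused heap the JVM has already claimed
long unallocated = rt.maxMemory() - rt.totalMemory(); // heap the JVM may still claim from the OS
long potentiallyAvailable = free + unallocated;       // upper bound; ignores collectible garbage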
You can try to get around the uncertainty about collectible garbage by manually triggering a System.gc(), except that that's not a terribly good thing to do, because:
- it's not guaranteed to run immediately
- it will stop everything in its tracks while it runs
Your best bet (assuming you can't easily rewrite your algorithm to deal with smaller memory chunks, or write to a memory-mapped file, or something less memory-intensive) might be to make a safe rough estimate of the memory needed and ensure that it's available before you run your function.
There are some kludges that you can use to estimate the size of an existing object; you could adapt some of these to predict the size of a yet-to-be created object.
However, in this case, I think it might be best to catch the Error. First of all, asking for the free memory doesn't account for what's available after garbage collection, which will be performed before raising an OOME. And, requesting a garbage collection with System.gc() isn't reliable. It's often explicitly disabled because it can wreck performance, and if it's not disabled… well, it can wreck performance when used unnecessarily.
It is impossible to recover from most errors. However, recoverability is up to the caller, not the callee. In this case, if you have a strategy to recover from an OutOfMemoryError, it is valid to catch it and fall back.
I guess that, in practice, it really comes down to the difference between the "slow" and "fast" way. If the "slow" method is fast enough, I'd stick with that, as it's safer and simpler. And, it seems to me, allowing it to be used as a fall back means that it is "fast enough." Don't let small optimizations derail the reliability of your application.
The "try to allocate and handle the error" approach is very dangerous.
What if you barely get your memory? A later OOM error might occur because you brought things too close to the limits, and almost any library call allocates memory, at least briefly.
While your huge allocation is in progress, a different thread may receive an OOM error when trying to allocate a relatively small object, even if your allocation was destined to fail anyway.
The only viable approach is your second one, with the corrections noted in other answers. But you have to be sure and leave extra "slop space" in the heap when you decide to use your memory intensive approach.
I don't believe that there's a reasonable, generic approach to this that could safely be assumed to be 100% reliable. Even the Runtime.freeMemory approach is vulnerable to the fact that you may actually have enough memory after a garbage collection, but you wouldn't know that unless you force a gc. But then there's no foolproof way to force a GC either. :)
Having said that, I suspect that if you really did know approximately how much you needed, ran a System.gc() beforehand, and were running in a simple single-threaded app, you'd have a reasonably decent shot at getting it right with the freeMemory() call.
If any of those constraints fail, though, and you get the OOM error, you're back at square one and therefore probably no better off than just catching the Error subclass. While there are some risks associated with this (Sun's VM does not make a lot of guarantees about what happens after an OOM... there's some risk of internal state corruption), there are many apps for which just catching it and moving on with life will leave you with no serious harm.
A more interesting question in my mind, however, is why are there cases where you do have enough memory to do this and others where you don't? Perhaps some more analysis of the performance tradeoffs involved is the real answer?
Definitely, catching the Error is the worst approach. An Error happens when there is NOTHING you can do about it, maybe not even write a log entry, puff: "... Houston, we lost the VM".
I didn't quite get the second reason. Was it bad because it is hard to relate SOME_MEMORY to the operation in question? Could you rephrase it for me?
The only alternative I see is to use the hard disk as memory (RAM/ROM as in the old days). I guess that is what you're pointing at with your "else: slower, less demanding approach".
Every platform has its limits; Java supports as much RAM as your hardware is willing to give it (well, actually as much as you give it by configuring the VM). In Sun's JVM implementation that is done with the
-Xmx
option, for instance:
java -Xmx8g some.name.YourMemConsumingApp
Of course, you may end up trying to perform an operation that takes 10 GB of RAM.
If that's your case, then you should definitely swap to disk.
Additionally, using the strategy pattern could make for nicer code, although here it may look like overkill:
if (isEnoughMemory(SOME_MEMORY)) {
    strategy = new InMemoryStrategy();
} else {
    strategy = new DiskStrategy();
}
strategy.performTheAction();
But it may help if the "else" branch involves a lot of code and looks bad. Furthermore, if you can somehow use a third approach (like using a cloud for processing), you can add a third strategy:
...
strategy = new ImaginaryCloudComputingStrategy();
...
:P
EDIT
Now that I understand the problem with the second approach: if there are times when you don't know how much RAM is going to be consumed, but you do know how much you have left, you could use a mixed approach (RAM when you have enough, disk when you don't).
Consider this theoretical problem: suppose you receive a file from a stream and don't know how big it is, and you then perform some operation on that stream (encrypt it, for instance).
If you used RAM only it would be very fast, but if the file is large enough to consume all your application's memory, then you have to perform part of the operation in memory, swap to a file, and save temporary data there.
The VM will GC when running low on memory; you get memory back and then you process the next chunk. This repeats until the whole stream has been processed:
while (!isDone()) {
    if (isMemoryLow()) {
        // Runtime.getRuntime().freeMemory() < SOME_MEMORY + some other validations
        swapToDisk(); // and make sure resources are GC'able
    }
    byte[] array = new byte[PREDEFINED_BUFFER_SIZE];
    process(array);
}
cleanUp();
