Does calling System.gc() reset the statistical data for generational collection? - java

In other words:
I need to know if after calling System.gc() object instances (that are not collected) are distributed between generations in the same way as before calling System.gc().
Thanks,

After a Full GC, which a System.gc() may or may not trigger:
the Eden space will be empty, so anything in there will have been moved out or cleaned up,
the Survivor spaces will swap objects from the one which has objects to the empty one, and
the Tenured space will have the same retained objects it had before, but some may have been cleaned up and some may be new to that generation.
In short, the only times System.gc() won't change objects between generations are when
it doesn't do anything because it is ignored, or
no objects have been created or discarded since the last Full GC.
if this could be another reason why calling System.gc() is evil
Mostly because it hurts performance and you gain very little in return. Note: RMI can trigger a full collection periodically to ensure distributed objects are cleaned up, but it tries to keep this to a minimum.

First of all, System.gc() is only a hint to the JVM that you want a GC, it might not trigger one (if using -XX:+DisableExplicitGC with Hotspot for example), and you should not rely on it doing anything -- the JVM usually knows better anyway.
If it does trigger a GC, then it's just like any other GC, and objects might get promoted from the young to the old generation if they satisfy the criteria (for example, having survived enough collections in the survivor spaces).
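One way to check whether the hint did anything on a given JVM is to compare collection counts before and after the call. The following is a minimal sketch, assuming a JVM that exposes the standard java.lang.management beans; the class and method names are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcHintDemo {
    // Sums collection counts over all collectors; a bean reports -1 when its count is undefined.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    // Returns how many collections were recorded across one System.gc() call.
    static long collectionsRecordedDuringHint() {
        long before = totalCollections();
        System.gc(); // a request only; it may be a no-op, e.g. with -XX:+DisableExplicitGC
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        System.out.println("Collections recorded during System.gc(): "
                + collectionsRecordedDuringHint());
    }
}
```

A result of 0 suggests the request was ignored. A positive number only shows that some collection ran during the call, not which objects moved between generations.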

Related

some questions on Garbage Collection internals?

I am trying to understand how the garbage collection process works. I came across a good link.
Most of the articles say that during a minor GC an object is moved from eden to a survivor space, and during a major GC an object is moved from survivor to tenured space; otherwise the memory of all unreachable objects is reclaimed. I have three questions based on the above statements (asked together because they are related):
1) Minor vs major GC: what is the difference between the two, such that one is called major and the other minor? As I understand it, a minor collection happens in parallel with the running application, while a major collection pauses the application for its duration.
2) What actually happens when an object is moved from eden to survivor space? Is the memory location of the object changed internally?
3) Why do three spaces (eden, survivor and tenured) exist instead of just one? I know there must be a reason behind it, but I am missing it. My point is that when the GC runs, it could collect the unreachable objects and leave the reachable ones where they are. Just one space seems to be sufficient. So what advantage do three different spaces provide over one?
1) A minor GC collects the new (young) generation; a major GC collects the old generation. Whether it runs in parallel with the application depends on the kind of GC; only concurrent collectors such as CMS and G1 do part of their work while the application runs.
2) Yes, moving an object during GC changes its physical location, so all pointers to this object are updated.
3) This is to avoid frequent and long application freezes during GC. If there were one big heap, the application would often freeze for long periods of time. The JVM creates objects in a small young generation, where GCs occur frequently but quickly. Most objects die quickly and never reach the old generation, so a major GC happens rarely or may never happen at all.
Source for my answers is this Oracle article on GC basics, so these answers would apply for HotSpot. No clue as to other VMs, although I would guess that the general idea might remain the same if the same implementation techniques were used in other VMs.
Minor vs Major GC collection? What is the difference between two that one is called major and other is called minor collection?
Minor GC is GC of the young generation, where new objects are allocated. Major GC is GC of all live objects, including the permanent generation (which is a bit interesting to me, but that's what the article says). Also, it appears that both major and minor GC are stop-the-world events.
What actually happens when an object is moved from eden to survivor space? Is the memory location of the object changed internally?
I can't seem to find a reference at the moment, but I would assume so. Allowing for memory location to be changed lets compaction be performed, which improves memory allocation performance and ease. Allowing each space to be compacted separately makes sense, so I would guess that moving an object from one part of the heap to another would involve physically moving the object from one memory location to another.
Why do three spaces (i.e. eden, survivor and tenured) exist instead of just one?
Short answer: efficiency. If you have only one space, you'd have to check all objects when you GC, which becomes inefficient if you have lots of long-lived objects (and you're almost guaranteed to have a decent number in a long-running application), as those long-lived objects are likely to still be reachable from one GC to the next. Splitting the heap allows for GC to be optimized, as most of the GC efforts can be concentrated where object life can be assumed to be short (i.e. young generation), with longer-living objects being GC'd less frequently.
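To see how a particular JVM splits this work, you can list its collectors together with the memory pools each one manages. This is a rough sketch using the standard management beans; the names printed (for example a young and an old collector) depend entirely on the JVM and the GC selected, and the class name is just for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CollectorPools {
    // One line per collector: its name and the memory pools it is responsible for.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + " manages " + Arrays.toString(gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

On a generational collector you would typically expect the young collector to list only the eden and survivor pools, while the old collector also lists the old generation pool.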

System.gc() for garbage collection

My code is:
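class Test {
    // Hypothetical size() so the snippet compiles; the original post does not show it.
    int size() { return 0; }
    public static void main(String[] args) {
        Test c = new Test(); // the post had "new Text()", presumably a typo
        System.out.println(c.size());
        System.gc(); // only a request; the JVM may ignore it
    }
}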
Can a programmer use System.gc() for garbage collection in Java? Is it preferable? The JVM performs garbage collection automatically, so why should a programmer call System.gc()?
System.gc() sends a request to the GC to perform a collection cycle. This request may be served or it may be ignored, therefore neither result should be relied on.
A garbage collection cycle will happen automatically (without any action on your part), usually when the generation responsible for allocation of new objects is full or an allocation request cannot be satisfied at all.
In most cases, you should not need to call System.gc() at all in your code. System.gc() should be used in a few cases in which conditions similar to the following apply:
You know that a large amount of memory has just become unreachable.
It is essential that this amount of memory be freed quickly.
Or your program is about to enter a time-critical state where a GC cycle should happen as late as possible (or not at all) and so it helps to perform a GC cycle before you enter that state.
You have at least a rough idea of how the GC of the target environment works.
You have verified that the strategy of that GC is not optimal for your scenario at that point.
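The conditions above can be sketched as follows: a large buffer becomes unreachable, a collection is requested before a (hypothetical) time-critical phase, and used heap is compared before and after. This is only an illustration under those assumptions; the class name and the buffer size are invented, and the numbers reported by Runtime are approximate.

```java
class GcBeforeCriticalPhase {
    // Approximate used heap as seen by the Runtime API.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Allocates the given number of 1 MB blocks, drops them, then requests a GC.
    // Returns {used heap while the blocks were held, used heap after the request}.
    static long[] releaseAndRequestGc(int megabytes) {
        byte[][] blocks = new byte[megabytes][];
        for (int i = 0; i < megabytes; i++) {
            blocks[i] = new byte[1024 * 1024];
        }
        long usedWhileHeld = usedHeapBytes();
        blocks = null;   // a large amount of memory has just become unreachable
        System.gc();     // request a collection now, before the time-critical phase
        long usedAfterRequest = usedHeapBytes();
        return new long[] { usedWhileHeld, usedAfterRequest };
    }

    public static void main(String[] args) {
        long[] used = releaseAndRequestGc(32);
        System.out.println("Used while held:     " + used[0] / (1024 * 1024) + " MB");
        System.out.println("Used after System.gc: " + used[1] / (1024 * 1024) + " MB");
    }
}
```

If the second figure is not lower than the first, the request was probably ignored or deferred, which is exactly why the result should not be relied on.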
You can call it. There will be no harm in that. But there is no guarantee that the memory of the object you expect to be freed is actually freed immediately.
Moreover, the JVM runs GC asynchronously and we do not need to drive it. The JVM is intelligent enough to free memory.
Calling it just to learn how it behaves is OK. If you really think you need to clear memory for some reason, there is definitely a design flaw in your program structure.
Even if you use System.gc(), there is no guarantee that memory will be freed.
From the Oracle site:
Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.
You can call it, but there is no guarantee that memory will be freed. Furthermore, if the memory is released, this can have negative consequences for the execution of your program. I will try to explain, though my English is not very good.
Java heap memory is divided into three zones based on object generations. Oversimplifying: young, adult and old. When the GC is invoked, the first thing it does is check the "young zone" for unused objects and free them. If the GC doesn't free enough memory in the "young zone", it examines the "adult zone". If it doesn't free enough memory in the "adult zone" either, it examines the "old zone". Each generation is more costly for the GC to examine than the previous one.
Objects are initially created in the young zone. If the GC runs on the young zone and an object is still in use, that object passes to the adult zone; likewise for adult -> old. If you invoke the GC yourself, it may decide that a young object is a candidate for the adult zone and move it there. This makes your adult zone grow unnecessarily. Later, when the GC has to examine the adult zone, the operation is more costly, and your program's performance can go down.

System.gc() does not clear everything in a single run; it takes 3 or more calls

I am testing the heap usage of a Java application running on JDK 1.6. I use the tool VisualVM to monitor the heap usage. I saw a maximum heap usage of around 500 MB for a few minutes. I then used the option "Perform GC", which calls System.gc(). The first time I used it, heap usage dropped to 410 MB, the second time to 130 MB and the next time to 85 MB. I made all the four calls one after another without any interval. Why does System.gc() not bring the heap down to 85 MB the first time? Is there some reason behind this, or should I try another method?
The System.gc() will return when all objects have been scanned once.
An object's finalize() is run AFTER the object has been found to be unreachable. Most objects don't implement this method, but the ones which do are added to a queue to be finalized later. This means those objects cannot be cleaned up yet (nor can the queue nodes which hold them), i.e. the act of triggering a GC can increase memory consumption temporarily.
Additionally there are SoftReferences to objects which may or may not be cleaned up by a GC. The assumption is these should only be cleaned up if not much else was cleaned up.
In short, not all objects can be cleaned up in one cycle.
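The different treatment of reference kinds can be observed with a small experiment. The sketch below is illustrative only: the class name is invented, and whether the weak or soft referent is cleared after a single request depends on the JVM and its settings. The only outcome that is guaranteed is that an object which is still strongly reachable is never cleared.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Held in a static field, so it stays strongly reachable for the whole run.
    static final Object PINNED = new Object();

    // True if a weak reference to a strongly reachable object survives a GC request.
    static boolean pinnedSurvivesGcRequest() {
        WeakReference<Object> weakToPinned = new WeakReference<>(PINNED);
        System.gc();
        return weakToPinned.get() != null;
    }

    public static void main(String[] args) {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        SoftReference<Object> soft = new SoftReference<>(new Object());
        System.gc();
        // Typical but not guaranteed: the weak referent is gone, the soft one is kept
        // because the JVM is under no memory pressure.
        System.out.println("weak cleared: " + (weak.get() == null));
        System.out.println("soft cleared: " + (soft.get() == null));
        System.out.println("pinned survives: " + pinnedSurvivesGcRequest());
    }
}
```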
System.gc() requests the JVM to start garbage collection. If you are expecting that GC is invoked as soon as System.gc() then it is a wrong notion. Calling it multiple times will not help. It is not possible to map System.gc() with the actual garbage collection. Also no matter how many times you call System.gc(), JVM will do the GC only when it is ready to do so. What may be happening is that heap size is getting reduced even with the first System.gc() but not exactly as soon as you call it. Garbage collection related to your first System.gc() may be finishing in background and in parallel your code is reaching third System.gc() statement.
If you are pretty sure that only adding multiple System.gc() calls helps you reduce the heap size, then you need to check which objects are being created in the JVM between the first and last System.gc(). There may be other threads creating objects.
One possible reason might be the use of java.lang.ref.Reference types. If the GC is going to break a "Reference" this will happen after the GC proper has completed. Any objects that become unreachable as a result are left for the next GC cycle to deal with.
Finalization works the same way. If an object requires finalization, it and all of the objects reachable from it (only) are likely to only be collectable in the next GC cycle.
Then there is the issue that the GC's algorithm for shrinking the heap is non-aggressive. According to the Java HotSpot VM Options page, the GC only shrinks the heap if more than 70% is free after garbage collection. However, it is not entirely clear if this refers to a full GC or not. So you could get the GC doing a partial GC and shrinking, and then a full GC and shrinking some more.
(Some people infer from the wording of the System.gc() javadocs that it will perform a full GC. However, I suspect that this is actually version / GC dependent.)
But to be honest this should all be moot. Trying to coerce an application into giving back as much memory as possible is pointless. The chances are that you are forcing it to throw away cached data. When the application gets active again it will start reloading its caches.
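If you want to watch this behaviour rather than infer it from a profiler graph, you can log used and committed heap after each request. This is a sketch, assuming the standard MemoryMXBean; the class name and the number of rounds are arbitrary, and the figures will differ between JVMs and collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapShrinkWatch {
    // Requests a GC and returns the heap usage reported right afterwards.
    static MemoryUsage heapAfterGcRequest() {
        System.gc();
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        for (int round = 1; round <= 4; round++) {
            MemoryUsage heap = heapAfterGcRequest();
            // "used" is live data plus uncollected garbage; "committed" is what the JVM holds from the OS.
            System.out.println("round " + round
                    + ": used=" + heap.getUsed() / mb + " MB"
                    + ", committed=" + heap.getCommitted() / mb + " MB");
        }
    }
}
```

A falling "used" value across rounds points at references or finalizers being processed late; a falling "committed" value points at the heap itself being shrunk step by step.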

JVM GC demote object to eden space?

I'm guessing this isn't possible...but here goes. My understanding is that eden space is cheaper to collect than old gen space, especially when you start getting into very large heaps. Large heaps tend to come up with long running applications (server apps) and server apps a lot of the time want to use some kind of caches. Caches with some kind of eviction (LRU) tend to defeat some assumptions that GC makes (temporary objects die quickly). So cache evictions end up filling up old gen faster than you'd like and you end up with a more costly old gen collection.
Now, it seems like this sort of thing could be avoided if java provided a way to mark a reference as about to die (delete keyword)? The difference between this and c++ is that the use is optional. And calling delete does not actually delete the object, but rather is a hint to the GC that it should demote the object back to Eden space (where it will be more easily collected). I'm guessing this feature doesn't exist, but, why not (is there a reason it's a bad idea)?
Actually, the eden space is the zone of memory in which objects are newly created. Once an object leaves the eden space it cannot be placed there again, and Java's GC implementation is so opaque to the programmer that there is usually not much you can do about it.
It would break some constraints in any case. The eden space is cheap to garbage collect because it takes care of removing items that have a short life span. If an object has survived long enough, it has to be moved somewhere else. Demoting it would go against the rules imposed by the GC itself, which is never easy to do in Java.

Why does java wait so long to run the garbage collector?

I am building a Java web app, using the Play! Framework. I'm hosting it on playapps.net. I have been puzzling for a while over the provided graphs of memory consumption. Here is a sample:
The graph comes from a period of consistent but nominal activity. I did nothing to trigger the falloff in memory, so I presume this occurred because the garbage collector ran as it has almost reached its allowable memory consumption.
My questions:
Is it fair for me to assume that my application does not have a memory leak, as it appears that all the memory is correctly reclaimed by the garbage collector when it does run?
(from the title) Why is java waiting until the last possible second to run the garbage collector? I am seeing significant performance degradation as the memory consumption grows to the top fourth of the graph.
If my assertions above are correct, then how can I go about fixing this issue? The other posts I have read on SO seem opposed to calls to System.gc(), ranging from neutral ("it's only a request to run GC, so the JVM may just ignore you") to outright opposed ("code that relies on System.gc() is fundamentally broken"). Or am I off base here, and I should be looking for defects in my own code that is causing this behavior and intermittent performance loss?
UPDATE
I have opened a discussion on PlayApps.net pointing to this question and mentioning some of the points here; specifically @Affe's comment regarding the settings for a full GC being set very conservatively, and @G_H's comment about settings for the initial and max heap size.
Here's a link to the discussion, though you unfortunately need a playapps account to view it.
I will report the feedback here when I get it; thanks so much everyone for your answers, I've already learned a great deal from them!
Resolution
Playapps support, which is still great, didn't have many suggestions for me, their only thought being that if I was using the cache extensively this may be keeping objects alive longer than need be, but that isn't the case. I still learned a ton (woo hoo!), and I gave @Ryan Amos the green check as I took his suggestion of calling System.gc() every half day, which for now is working fine.
Any detailed answer is going to depend on which garbage collector you're using, but there are some things that are basically the same across all (modern, sun/oracle) GCs.
Every time you see the usage in the graph go down, that is a garbage collection. The only way heap gets freed is through garbage collection. The thing is there are two types of garbage collections, minor and full. The heap gets divided into two basic "areas." Young and tenured. (There are lots more subgroups in reality.) Anything that is taking up space in Young and is still in use when the minor GC comes along to free up some memory, is going to get 'promoted' into tenured. Once something makes the leap into tenured, it sits around indefinitely until the heap has no free space and a full garbage collection is necessary.
So one interpretation of that graph is that your young generation is fairly small (by default it can be a fairly small % of total heap on some JVMs) and you're keeping objects "alive" for comparatively very long times. (Perhaps you're holding references to them in the web session?) So your objects are 'surviving' garbage collections until they get promoted into tenured space, where they stick around indefinitely until the JVM is well and truly out of memory.
Again, that's just one common situation that fits with the data you have. Would need full details about the JVM configuration and the GC logs to really tell for sure what's going on.
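Short of full GC logs, a per-pool snapshot already shows whether the memory is sitting in the young or the tenured part of the heap. The following sketch uses the standard MemoryPoolMXBean API; the class name is invented and the pool names it prints vary with the JVM and collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.LinkedHashMap;
import java.util.Map;

class HeapPoolSnapshot {
    // Maps each valid heap pool (eden, survivor, old/tenured, ...) to its used bytes.
    static Map<String, Long> usedBytesByHeapPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP && pool.isValid()) {
                used.put(pool.getName(), pool.getUsage().getUsed());
            }
        }
        return used;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : usedBytesByHeapPool().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() / (1024 * 1024) + " MB used");
        }
    }
}
```

Logging this periodically would show whether the slow climb in the graph comes from the tenured pool filling up, as the interpretation above suggests.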
Java won't run the garbage cleaner until it has to, because the garbage cleaner slows things down quite a bit and shouldn't be run that frequently. I think you would be OK to schedule a cleaning more frequently, such as every 3 hours. If an application never consumes full memory, there should be no reason to ever run the garbage cleaner, which is why Java only runs it when the memory is very high.
So basically, don't worry about what others say: do what works best. If you find performance improvements from running the garbage cleaner at 66% memory, do it.
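If you do decide to schedule a collection request, a background scheduler is one way to do it. The sketch below is only an illustration: the class and method names are invented, the interval is whatever you measure to work for your application (hours in practice, milliseconds here so the demo finishes), and System.gc() remains a request that the JVM may ignore.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PeriodicGcRequest {
    // Starts a daemon-thread scheduler that calls System.gc() at a fixed interval.
    static ScheduledExecutorService start(long period, TimeUnit unit, Runnable afterEachRequest) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-gc-request");
            t.setDaemon(true); // do not keep the JVM alive just for this task
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> {
            System.gc();
            afterEachRequest.run();
        }, period, period, unit);
        return scheduler;
    }

    // Demo helper: true if the request ran at least `times` times within the timeout.
    static boolean runsAtLeast(int times, long periodMillis, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(times);
        ScheduledExecutorService scheduler =
                start(periodMillis, TimeUnit.MILLISECONDS, latch::countDown);
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("GC requested twice within timeout: " + runsAtLeast(2, 200, 10_000));
    }
}
```

In a real application you would call start(...) once at startup with a period in hours and a no-op callback, and shut the scheduler down when the application stops.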
I am noticing that the graph isn't sloping strictly upward until the drop, but has smaller local variations. Although I'm not certain, I don't think memory use would show these small drops if there was no garbage collection going on.
There are minor and major collections in Java. Minor collections occur frequently, whereas major collections are rarer and diminish performance more. Minor collections probably tend to sweep up stuff like short-lived object instances created within methods. A major collection will remove a lot more, which is what probably happened at the end of your graph.
Now, some answers that were posted while I'm typing this give good explanations regarding the differences in garbage collectors, object generations and more. But that still doesn't explain why it would take so absurdly long (nearly 24 hours) before a serious cleaning is done.
Two things of interest that can be set for a JVM at startup are the maximum allowed heap size, and the initial heap size. The maximum is a hard limit, once you reach that, further garbage collection doesn't reduce memory usage and if you need to allocate new space for objects or other data, you'll get an OutOfMemoryError. However, internally there's a soft limit as well: the current heap size. A JVM doesn't immediately gobble up the maximum amount of memory. Instead, it starts at your initial heap size and then increases the heap when it's needed. Think of it a bit as the RAM of your JVM, that can increase dynamically.
If the actual memory use of your application starts to reach the current heap size, a garbage collection will typically be instigated. This might reduce the memory use, so an increase in heap size isn't needed. But it's also possible that the application currently does need all that memory and would exceed the heap size. In that case, it is increased provided that it hasn't already reached the maximum set limit.
Now, what might be your case is that the initial heap size is set to the same value as the maximum. Suppose that would be so, then the JVM will immediately seize all that memory. It will take a very long time before the application has accumulated enough garbage to reach the heap size in memory usage. But at that moment you'll see a large collection. Starting with a small enough heap and allowing it to grow keeps the memory use limited to what's needed.
This is assuming that your graph shows heap use and not allocated heap size. If that's not the case and you are actually seeing the heap itself grow like this, something else is going on. I'll admit I'm not savvy enough regarding the internals of garbage collection and its scheduling to be absolutely certain of what's happening here, most of this is from observation of leaking applications in profilers. So if I've provided faulty info, I'll take this answer down.
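To check the guess about initial versus maximum heap size, the running JVM can report both, along with any -Xms/-Xmx flags it was started with. This is a minimal sketch using the standard management beans; the class name is made up, and a hosting provider may set the heap through other options that this simple filter does not catch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.List;
import java.util.stream.Collectors;

class HeapSizing {
    // Returns the -Xms / -Xmx arguments the JVM was started with, if any.
    static List<String> heapFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .filter(arg -> arg.startsWith("-Xms") || arg.startsWith("-Xmx"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024 * 1024;
        // getInit() and getMax() return -1 when the value is undefined.
        System.out.println("initial heap:   " + heap.getInit() / mb + " MB");
        System.out.println("committed heap: " + heap.getCommitted() / mb + " MB (current heap size)");
        System.out.println("maximum heap:   " + heap.getMax() / mb + " MB");
        System.out.println("heap flags:     " + heapFlags());
    }
}
```

If the initial and maximum values are equal, the heap never needs to grow, which matches the scenario described above where usage climbs for a long time before a large collection.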
As you might have noticed, this does not affect you. The garbage collection only kicks in if the JVM feels there is a need for it to run and this happens for the sake of optimization, there's no use of doing many small collections if you can make a single full collection and do a full cleanup.
The current JVM contains some really interesting algorithms, and the garbage collection itself is divided into 3 different regions. You can find a lot more about this here; here's a sample:
Three types of collection algorithms
The HotSpot JVM provides three GC algorithms, each tuned for a specific type of collection within a specific generation. The copy (also known as scavenge) collection quickly cleans up short-lived objects in the new generation heap. The mark-compact algorithm employs a slower, more robust technique to collect longer-lived objects in the old generation heap. The incremental algorithm attempts to improve old generation collection by performing robust GC while minimizing pauses.
Copy/scavenge collection
Using the copy algorithm, the JVM reclaims most objects in the new generation object space (also known as eden) simply by making small scavenges -- a Java term for collecting and removing refuse. Longer-lived objects are ultimately copied, or tenured, into the old object space.
Mark-compact collection
As more objects become tenured, the old object space begins to reach maximum occupancy. The mark-compact algorithm, used to collect objects in the old object space, has different requirements than the copy collection algorithm used in the new object space.
The mark-compact algorithm first scans all objects, marking all reachable objects. It then compacts all remaining gaps of dead objects. The mark-compact algorithm occupies more time than the copy collection algorithm; however, it requires less memory and eliminates memory fragmentation.
Incremental (train) collection
The new generation copy/scavenge and the old generation mark-compact algorithms can't eliminate all JVM pauses. Such pauses are proportional to the number of live objects. To address the need for pauseless GC, the HotSpot JVM also offers incremental, or train, collection.
Incremental collection breaks up old object collection pauses into many tiny pauses even with large object areas. Instead of just a new and an old generation, this algorithm has a middle generation comprising many small spaces. There is some overhead associated with incremental collection; you might see as much as a 10-percent speed degradation.
The -Xincgc and -Xnoincgc parameters control how you use incremental collection. The next release of HotSpot JVM, version 1.4, will attempt continuous, pauseless GC that will probably be a variation of the incremental algorithm. I won't discuss incremental collection since it will soon change.
This generational garbage collector is one of the most efficient solutions we have for the problem nowadays.
I had an app that produced a graph like that and acted as you describe. I was using the CMS collector (-XX:+UseConcMarkSweepGC). Here is what was going on in my case.
I did not have enough memory configured for the application, so over time I was running into fragmentation problems in the heap. This caused GCs with greater and greater frequency, but it did not actually throw an OOME or fail over from CMS to the serial collector (which it is supposed to do in that case). The reason is that the stats it keeps only count application paused time (GC blocks the world); application concurrent time (GC runs alongside application threads) is ignored for those calculations. I tuned some parameters, mainly gave it a whole crap load more heap (with a very large new space) and set -XX:CMSFullGCsBeforeCompaction=1, and the problem stopped occurring.
You probably do have memory leaks that get cleared every 24 hours.
