Garbage Collection - Old Generation - java

I am learning how garbage collection works.
I am clear on the young generation, but not on the old generation.
When the old generation is full and a major GC is performed, what happens?
Are all objects, live or dead, removed from the old generation, or only the dead ones?
If all objects in the old generation are live at the time of the major GC, what happens? Does it throw OutOfMemoryError?

Young generation: Most of the newly created objects are located here. Since most objects soon become unreachable, many objects are created in the young generation, then disappear. When objects disappear from this area, we say a "minor GC" has occurred.
Old generation: The objects that did not become unreachable and survived from the young generation are copied here. It is generally larger than the young generation. As it is bigger in size, the GC occurs less frequently than in the young generation. When objects disappear from the old generation, we say a "major GC" (or a "full GC") has occurred.
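OldGen: This pool basically contains the tenured space and the virtual (reserved) space, and holds the objects that survived garbage collection in the YoungGen space.
If the old generation is still full after a major GC, that is, the collection cannot free enough room for the pending allocation or promotion, an OutOfMemoryError is thrown.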
Q & A
When the old generation is full and a major GC is performed, what happens? Unreachable objects are removed from memory.
Are all objects, live or dead, removed from the old generation, or only the dead ones? Only dead objects are removed; objects that are still referenced stay in the old generation.
If all objects in the old generation are live, what happens? Does it throw OutOfMemoryError? Yes, an OutOfMemoryError is thrown once an allocation can no longer be satisfied.

Only objects that are not reachable are removed by the GC.
So yes, if all objects in the old generation are reachable and there is no room left for a new allocation, the JVM will throw OutOfMemoryError.
you may look here for more details.
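The rule in these answers (a major GC removes only unreachable objects, and an error follows only when nothing can be freed) can be sketched with a toy model. This is not how the JVM is implemented: `ToyOldGen`, its slot-based capacity, and the `reachable` flag are hypothetical names invented for illustration, and an `IllegalStateException` stands in for the real `OutOfMemoryError`.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an old generation: a fixed number of slots, collected when full. */
final class ToyOldGen {
    /** A slot-sized object; 'reachable' stands in for "reachable from a GC root". */
    static final class Obj {
        boolean reachable = true;
    }

    private final int capacity;
    private final List<Obj> slots = new ArrayList<>();

    ToyOldGen(int capacity) {
        this.capacity = capacity;
    }

    /** Promotes a new object; runs a major GC first if full, and fails only if that frees nothing. */
    Obj promote() {
        if (slots.size() == capacity) {
            majorGc();
            if (slots.size() == capacity) {
                // Stand-in for OutOfMemoryError in this toy model.
                throw new IllegalStateException("old generation is full of live objects");
            }
        }
        Obj o = new Obj();
        slots.add(o);
        return o;
    }

    /** Removes only unreachable objects; live ones stay. Returns the number of slots freed. */
    int majorGc() {
        int before = slots.size();
        slots.removeIf(o -> !o.reachable);
        return before - slots.size();
    }

    int used() {
        return slots.size();
    }

    public static void main(String[] args) {
        ToyOldGen old = new ToyOldGen(3);
        Obj a = old.promote();
        old.promote();
        old.promote();
        a.reachable = false;   // one object dies
        old.promote();         // major GC frees its slot, so this succeeds
        System.out.println("used after GC and promotion: " + old.used());
        try {
            old.promote();     // every remaining object is live
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The second `promote()` call after the collection fails because the collection could not free anything, which is the situation the question asks about.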

Related

How does the GC know whether an object in the old heap references an object in the young heap?

A minor GC is when the garbage collector clears objects in the young generation that are not referenced from the "roots". A minor GC works on the young heap only. But what if a young object is referenced from the old heap?
The garbage collector needs to know which old objects refer to young objects. It could find these references by scanning all old objects, but that would be very expensive. Instead, a remembered set keeps this information. Each thread informs the GC when it changes a reference in a way that could change the remembered set.
A card table (an array of bytes) is a particular type of remembered set. Each byte is called a card and covers a fixed-size region of the old generation. When a reference in that region changes, its card is marked dirty. A dirty card may contain a new pointer from the old generation to the young generation. As a result, the JVM does not scan every old object during a minor GC; it scans only the regions the remembered set points to.
[Figures: "GC1 Card Table and Remembered Set"; "Marking card"]
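The card-marking idea described above can be sketched as a toy in Java. Everything here is an assumption made for illustration: `ToyCardTable`, `writeRef`, `scanDirtyCards`, and a card size of 4 slots are invented names and values, and a real collector works on raw heap addresses and also cleans cards, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy card table: one byte per fixed-size "card" of old-generation reference slots. */
final class ToyCardTable {
    static final int CARD_SIZE = 4;       // slots per card (illustrative value)

    private final Object[] oldGenSlots;   // each slot models a reference field of an old object
    private final byte[] cards;           // 0 = clean, 1 = dirty

    ToyCardTable(int slots) {
        oldGenSlots = new Object[slots];
        cards = new byte[(slots + CARD_SIZE - 1) / CARD_SIZE];
    }

    /** Write barrier: every reference store also dirties the card covering that slot. */
    void writeRef(int slot, Object target) {
        oldGenSlots[slot] = target;
        cards[slot / CARD_SIZE] = 1;
    }

    /** Root scan for a minor GC: visit only the slots that lie on dirty cards. */
    List<Object> scanDirtyCards() {
        List<Object> roots = new ArrayList<>();
        for (int c = 0; c < cards.length; c++) {
            if (cards[c] == 0) {
                continue;                 // clean card: skip all of its slots
            }
            int end = Math.min((c + 1) * CARD_SIZE, oldGenSlots.length);
            for (int s = c * CARD_SIZE; s < end; s++) {
                if (oldGenSlots[s] != null) {
                    roots.add(oldGenSlots[s]);
                }
            }
        }
        return roots;
    }

    int dirtyCards() {
        int n = 0;
        for (byte b : cards) {
            n += b;
        }
        return n;
    }

    public static void main(String[] args) {
        ToyCardTable table = new ToyCardTable(16);
        table.writeRef(5, "youngA");      // dirties card 1
        table.writeRef(14, "youngB");     // dirties card 3
        System.out.println("dirty cards: " + table.dirtyCards() + " of 4");
        System.out.println("roots found: " + table.scanDirtyCards());
    }
}
```

With 16 slots and two stores, the scan touches 8 slots instead of 16; the saving grows with the size of the old generation and the rarity of reference updates.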
A minor GC collects the young generation, but that doesn't mean the GC looks only at the young generation's heap area. References from outside it count too: a reference from the old generation to the young generation keeps the referenced young object alive.
This is described in Minor GC vs Major GC vs Full GC:
During a Minor GC event, Tenured generation is effectively ignored. References from tenured generation to young generation are considered de facto GC roots. References from young generation to Tenured generation are simply ignored during the markup phase.

Why are immutable objects 'more efficient' for generational GC?

"Young GC becomes inefficient if we have tenured objects referring to younger generations" is quoted as one of the reasons to favor immutable objects.
What exactly happens when the collector comes across such an object in the old generation?
Why should it be any more cumbersome than collecting an older object referring to an object in the young generation?
To collect the Eden space (of the young gen.) any live objects are copied from the Eden space to one of the survivor spaces. Objects already in a survivor space are copied from the 'from' space to the 'to' space unless they are old enough to be promoted to the old generation (in which case they are copied there).
All of this involves object relocation. To do this safely any objects in the old generation that point to objects in the new generation (that are being relocated during a minor GC) must have those references updated. The more objects that have references to objects being relocated, the more work the GC has to do during a minor GC.
If you use only immutable objects the number of objects that will contain pointers from the old gen. to the young gen. will be very small (most likely zero). There are only two ways this could happen:
An object is promoted to the old gen. whilst an object it refers to is still in a survivor space.
An object is large enough to be allocated directly in the old gen. and refers to an object in the young gen.
To summarise the answer, by using immutable objects you're reducing the possible number of object references that the GC has to update during a minor collection, therefore improving its efficiency.
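A small Java sketch can show why immutability limits old-to-young pointers. It uses a creation counter as a stand-in for object age; `AgeDemo`, `CLOCK`, `ImmutableNode`, and `MutableNode` are hypothetical names, and the sketch assumes the immutable class is constructed normally (no `this` escaping from the constructor, no lazily initialised fields).

```java
import java.util.concurrent.atomic.AtomicLong;

final class AgeDemo {
    /** Creation counter: a smaller value means the object was created earlier (is "older"). */
    static final AtomicLong CLOCK = new AtomicLong();

    /** Immutable: 'next' must already exist when this node is built, so it is never newer. */
    static final class ImmutableNode {
        final long born = CLOCK.incrementAndGet();
        final ImmutableNode next;

        ImmutableNode(ImmutableNode next) {
            this.next = next;
        }
    }

    /** Mutable: 'next' can be reassigned long after construction, to a much newer object. */
    static final class MutableNode {
        final long born = CLOCK.incrementAndGet();
        MutableNode next;
    }

    public static void main(String[] args) {
        ImmutableNode tail = new ImmutableNode(null);
        ImmutableNode head = new ImmutableNode(tail);
        // The referent was created first, so the pointer goes from newer to older.
        System.out.println("immutable points to older object: " + (head.next.born < head.born));

        MutableNode longLived = new MutableNode();
        MutableNode fresh = new MutableNode();
        longLived.next = fresh;   // a later write makes an older object point at a newer one
        System.out.println("mutable points to newer object: " + (longLived.next.born > longLived.born));
    }
}
```

If `longLived` had already been promoted, the assignment to `next` is exactly the kind of old-to-young reference that a minor GC has to track and update.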
What exactly happens when the collector comes across such an object in the old generation? How does it handle it?
Typically it can't. Before the collector considers collecting objects in the older generation, it promotes anything it could not reap in the younger generation, so by the time the collector considers whether the parent object can be collected, the objects it refers to are no longer in the younger generation. I think the issue is what happens when it comes across the referenced object in the younger generation: it has to skip it, and it could skip it hundreds or thousands of times across young generation GCs before it has to do one on the older generation.
Why should it be any more cumbersome than collecting a younger object referring to an object in the tenured generation?
Being referenced by an older generation object means the object is effectively frozen in the young generation. Being referenced by a younger generation object is no issue, as the younger generations are all resolved before the collector starts on the older generation.
I think that as long as you are disciplined about clearing references to all your unused objects, it will not hurt your GC efficiency, but that can be a lot of extra work in a big application.

Why is major garbage collection slower than minor?

I have gone through this link but am still confused about what actually happens in minor and major GC.
Say I have 100 objects in the young generation, of which 85 are unreachable. When a minor GC runs, it will reclaim the memory of the 85 objects and move the 15 live objects to the older (tenured) generation.
Now 15 objects exist in the older generation, of which 3 are unreachable. Say a major GC takes place. It will keep the 12 reachable objects as they are and reclaim the memory of the 3 unreachable objects. Major GC is said to be slower than minor GC. My question is why? Is it because a major GC generally covers more objects than a minor one, since minor GCs occur more frequently?
As I understand it, a major GC should be faster, as it needs to do less work (reclaiming memory from unreachable objects) than a minor GC because of the high mortality rate in the young generation.
1) A minor GC will first move the 15 objects to one of the survivor spaces, e.g. SS1; the next GC will move those that are still alive to SS2, the next will move the survivors back to SS1, and so forth. Only objects that survive several (e.g. 8) relocations (minor GCs) finally go to the old generation.
2) A major GC typically happens when the JVM cannot allocate an object in the old generation because there is no free space in it. To clean out dead objects, the GC goes over all objects in the old generation. Since the old generation is several times larger than the new generation, it may hold several times more objects, so GC processing will take several times longer.
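The survivor-space ageing in point 1 can be sketched as a toy model. This is an illustration under assumptions, not the JVM's algorithm: `ToyYoungGen`, `Obj`, and the `reachable` flag are invented names, and the threshold of 8 is simply the example value from the answer above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy young generation: eden plus one occupied survivor space, with age-based promotion. */
final class ToyYoungGen {
    static final int TENURING_THRESHOLD = 8;   // illustrative, from the "e.g. 8" above

    static final class Obj {
        int age;                   // number of minor GCs survived
        boolean reachable = true;  // stands in for "reachable from a GC root"
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> survivorFrom = new ArrayList<>();
    final List<Obj> oldGen = new ArrayList<>();

    Obj allocate() {
        Obj o = new Obj();
        eden.add(o);
        return o;
    }

    /** Copies live objects to the empty survivor space, or to the old generation once old enough. */
    void minorGc() {
        List<Obj> survivorTo = new ArrayList<>();
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(survivorFrom);
        for (Obj o : candidates) {
            if (!o.reachable) {
                continue;          // dead objects are never copied; their space is simply reused
            }
            o.age++;
            if (o.age >= TENURING_THRESHOLD) {
                oldGen.add(o);
            } else {
                survivorTo.add(o);
            }
        }
        eden.clear();
        survivorFrom = survivorTo; // the two survivor spaces swap roles
    }

    public static void main(String[] args) {
        ToyYoungGen young = new ToyYoungGen();
        for (int i = 0; i < 100; i++) {
            Obj o = young.allocate();
            if (i >= 15) {
                o.reachable = false;   // 85 of the 100 objects die young
            }
        }
        young.minorGc();
        System.out.println("after 1 minor GC: survivors=" + young.survivorFrom.size()
                + ", old=" + young.oldGen.size());
        for (int i = 1; i < TENURING_THRESHOLD; i++) {
            young.minorGc();
        }
        System.out.println("after 8 minor GCs: survivors=" + young.survivorFrom.size()
                + ", old=" + young.oldGen.size());
    }
}
```

Note that the work done by `minorGc` depends on the 15 live objects, not on the 85 dead ones, which is why a young generation full of garbage is cheap to collect.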
My question is why? Is it because a major GC generally covers more objects than a minor one, since minor GCs occur more frequently?
You pretty much hit the nail on its head. From the Oracle article, emphasis mine:
Often a major collection is much slower because it involves all live objects.
So not only does a major GC analyze those 15 objects in the old generation, it also goes through the young generation (again) and permgen and GCs those areas of the heap. Minor GC only analyzes the young generation, so there generally wouldn't be as many objects to look at.
As I understand it, a major GC should be faster, as it needs to do less work (reclaiming memory from unreachable objects) than a minor GC because of the high mortality rate in the young generation.
I think I understand why you think that. I could imagine that major GC could be run very soon after a minor GC, when objects are promoted to an almost-full old generation. Thus, the young generation would (presumably) not contain too many objects to collect.
However, if I'm remembering things correctly, the old generation is usually larger than the young generation, so not only does the GC have to analyze more space, it also has to go over permgen again, as well as the remaining objects in the young generation (again). So that would probably be why major GC is slower -- simply because there's more stuff to do. You might be able to make major GC faster than minor GC by changing the sizes of the generation spaces such that the young generation is larger than both the old generation and permgen, but I don't think that would be a common setting to use...

How do generations help the garbage collector?

I've read a few articles about how garbage collection works and still don't understand how using generations helps. As I understand it, the main idea is that we start collection from the youngest generation and move to older generations. But why did the authors of this idea decide that starting from the youngest generation is the most efficient way?
The older the generation, the more often its objects have been used, and the more likely they are to be needed again.
Removing a recently created object makes no sense; it may be a temporary (local scope) object.
The authors start with the youngest generation first simply because that's what gets filled up first after your application starts, however in reality which generation is being swept and when is non-deterministic as your application runs.
The important points with generational GC are:
The young generation uses a copying collector, which copies live objects from eden and the current survivor space into a space it considers empty (the unused survivor space). This is fast, so the GC pause is minimal.
Add to this the fact that most objects die young, so the pause required to copy the small number of surviving objects out of eden and the current survivor space is short: only objects with live references are copied, after which eden and the previous survivor space can be wiped.
After being copied several times, objects are copied to the tenured (old) generation. Eventually the tenured generation will fill up; this time, however, there is no clean space to copy the objects to, so the garbage collector has to sweep and compact within the generation, which is slow (compared to the copy performed in eden and the survivor spaces), meaning a longer pause.
The good news, based on the "most objects die young" heuristic, is that major GCs happen much less frequently than minor ones, keeping GC pauses to a minimum over the lifetime of an application.
There is also the benefit that all new objects are allocated at the top of the heap, so minimal instructions are required to do so, with defragmentation occurring naturally as part of the copy process.
Both these pages, Oracle Garbage Collection Tuning and Useful JVM Flags – Part 5 (Young Generation Garbage Collection), describe this.
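The "allocated at the top of the heap" point in the list above is often called bump-pointer allocation. A minimal sketch, assuming a region modelled as a range of integer offsets; `BumpAllocator`, `allocate`, and `reset` are hypothetical names, not JVM APIs.

```java
/** Toy bump-pointer allocator over a region modelled as offsets 0..capacity-1. */
final class BumpAllocator {
    private final int capacity;
    private int top;   // offset of the first free unit; everything below it is in use

    BumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    /**
     * Returns the start offset of the new object, or -1 if the region is full
     * (the point at which a real collector would run a minor GC).
     */
    int allocate(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        if (size > capacity - top) {
            return -1;
        }
        int address = top;
        top += size;   // allocation is one bounds check and one addition
        return address;
    }

    /** After survivors have been copied elsewhere, the whole region is free again. */
    void reset() {
        top = 0;
    }

    public static void main(String[] args) {
        BumpAllocator eden = new BumpAllocator(32);
        System.out.println(eden.allocate(16));   // first object starts at offset 0
        System.out.println(eden.allocate(8));    // next object is placed directly after it
        System.out.println(eden.allocate(16));   // does not fit in the remaining 8 units
        eden.reset();
        System.out.println(eden.allocate(16));   // region is reusable from offset 0
    }
}
```

Because live objects are copied out and the region is reset as a whole, there are never holes between objects, which is the "defragmentation occurring naturally" the answer mentions.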
Read this one.
Using different generations makes the allocation of objects easy and fast, as most allocations are done in a single region of the heap: eden. Based on the observation that most objects die young (the weak generational hypothesis), collections in the young generation find more garbage and so reclaim more memory, and the young generation is relatively small compared to the whole heap, which means the time taken to scan its objects is also shorter. That's why young generation GCs are fast.
For more details on GC and generations, you can refer to this

Generational Garbage Collection

As I understand, a generational GC divides objects into generations.
And on each cycle, GC runs on only one generation.
Why? Why Garbage Collecting of only one generation is enough?
P.S.: I learned all of this from here.
If you read the link I provided in the earlier question you had about generational GC, you will understand why it does so: the cycle runs when the white set memory is filled up.
To optimize for this scenario, memory is managed in generations, or memory pools holding objects of different ages. Garbage collection occurs in each generation when the generation fills up. Objects are allocated in a generation for younger objects or the young generation, and because of infant mortality most objects die there. When the young generation fills up it causes a minor collection. Minor collections can be optimized assuming a high infant mortality rate. The costs of such collections are, to the first order, proportional to the number of live objects being collected. A young generation full of dead objects is collected very quickly. Some surviving objects are moved to a tenured generation. When the tenured generation needs to be collected there is a major collection that is often much slower because it involves all live objects.
Basically, objects are divided into generations (based on the hypothesis about object lifetimes) and placed into a memory heap for their particular generation. When that memory heap fills up, the GC cycle begins; the objects that are still referenced are moved to another memory heap, and fresh objects are added.
It's not always enough -- it's just that it's usually enough, so it saves time by not examining objects that are likely to stay alive anyway.
Every object has a generation, saying how many garbage collections it has survived. If an object has survived a few garbage collections, chances are that it will also survive the next one.
MSDN has a great explanation:
A generational garbage collector makes the following assumptions:
The newer an object is, the shorter its lifetime will be.
The older an object is, the longer its lifetime will be.
Newer objects tend to have strong relationships to each other and are frequently accessed around the same time.
Compacting a portion of the heap is faster than compacting the whole heap.
Because of this, you could save some time by only trying to collect younger objects, and collecting the older generations only if that doesn't free up enough memory.
The answer is there really.
It has been empirically observed that in many programs, the most recently created objects are also those most likely to become unreachable quickly (known as infant mortality or the generational hypothesis).
And
Generational garbage collection is a heuristic approach, and some unreachable objects may not be reclaimed on each cycle. It may therefore occasionally be necessary to perform a full mark and sweep or copying garbage collection to reclaim all available space.
Basically, generational collection gives you better performance over a full garbage collection at the cost of completeness. That's why a mixture of the two is used in practice.
