An object can be promoted from the Young Generation to the Old Generation when it reaches the tenuring threshold, or when the "to" Survivor space is full at the moment the object is being copied into it.
Therefore, my question is: in order to improve performance, if I know my object will be frequently used (referenced), is it possible to automatically or manually allocate an object in the Old/Permanent Generation, so that not allocating it in Eden would delay the need for a Minor Garbage Collection, thus delaying the "Stop The World" event and improving the application's performance?
Generally:
No - not for a specific single object.
In more detail:
An allocation roughly proceeds as follows (a simplified sketch in Java follows the list):
Use the thread-local allocation buffer (TLAB) if tlab_top + size <= tlab_end. This is the fastest path: allocation is just an increment of the tlab_top pointer.
If TLAB is almost full, create a new TLAB in the Eden space and retry in a fresh TLAB.
If the remaining TLAB space is not enough for the object but is still too big to discard, try to allocate the object directly in the Eden space. Allocation in the Eden space needs to be done using an atomic operation, since Eden is shared between all threads.
If allocation in the Eden space fails (eden_top + size > eden_end), typically a minor collection occurs.
If there is not enough space in the Eden space even after a Young GC, an attempt to allocate directly in the old generation is made.
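To make these steps concrete, here is a simplified, hypothetical sketch of the bump-pointer logic in Java. The real implementation is C++ inside the HotSpot VM; all names here (AllocationSketch, tlabTop, edenTop, and so on) are purely illustrative.

import java.util.concurrent.atomic.AtomicLong;

class AllocationSketch {
    // Per-thread TLAB bounds (a bump-pointer region carved out of Eden).
    long tlabTop;
    long tlabEnd;

    // Shared Eden bounds; the top pointer must be advanced atomically.
    static final AtomicLong edenTop = new AtomicLong();
    static long edenEnd;

    /** Returns the "address" of the new object, or -1 if a minor GC would be needed. */
    long allocate(long size) {
        // 1. Fast path: bump the TLAB pointer, no synchronization needed.
        if (tlabTop + size <= tlabEnd) {
            long obj = tlabTop;
            tlabTop += size;
            return obj;
        }
        // 2./3. TLAB exhausted: either retire it and grab a fresh TLAB from Eden,
        // or (if the remaining TLAB space is still too big to waste) allocate the
        // object directly in Eden with an atomic compare-and-swap.
        while (true) {
            long top = edenTop.get();
            if (top + size > edenEnd) {
                return -1; // 4. Eden is full: a minor GC would be triggered (not modeled here).
            }
            if (edenTop.compareAndSet(top, top + size)) {
                return top; // direct Eden allocation succeeded
            }
            // CAS lost to another thread; retry.
        }
    }
}

The fast path is a plain field increment with no synchronization, which is exactly why TLABs exist; only the shared Eden pointer needs an atomic compare-and-swap.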
"Hack":
The following parameter:
-XX:PretenureSizeThreshold=<size>
This parameter defaults to 0, which deactivates it. If set, it defines a size threshold above which objects are automatically allocated in the old generation.
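For example (illustrative only; MyApp is a placeholder main class, and the size is given in bytes, here 1 MB):

java -XX:PretenureSizeThreshold=1048576 MyApp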
BUT:
You should use this with care: setting this parameter wrong may change your GC's behaviour drastically. And: only a few percent of objects survive their first GC, so most objects don't have to be copied during the young GC.
Therefore, the young GC is very fast and you should not really need to "optimize" it by forcing object allocation to old generation.
Java parameters:
If you want to get an overview of the possible Java parameters, run the following:
java -XX:+PrintVMOptions -XX:+AggressiveOpts -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+PrintFlagsFinal -version
This will print all flags you can set.
Different garbage collectors:
Also keep in mind that there are different garbage collectors out there, and that it is planned that Java 9 will use the Garbage-First (G1) GC as the default garbage collector, which again may handle big objects differently (by allocating them into humongous regions).
Additional source:
Stack overflow question: Size of Huge Objects directly allocated to Old Generation
You cannot create an object directly in the old generation; it has to go through the Eden space and survivor spaces (the young generation) before reaching the old generation. However, if you know that your objects are long lived (for example, if you have implemented something like a cache), you can set the following JVM parameters:
-XX:InitialTenuringThreshold=7: Sets the initial tenuring threshold to use in adaptive GC sizing in the parallel young collector. The tenuring threshold is the number of times an object survives a young collection before being promoted to the old, or tenured, generation.
-XX:MaxTenuringThreshold=n: Sets the maximum tenuring threshold for use in adaptive GC sizing. The current largest value is 15. The default value is 15 for the parallel collector and is 4 for CMS.
Source: http://www.oracle.com/technetwork/articles/java/vmoptions-jsp-140102.html
So, you can reduce the application's tenuring threshold.
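For example (values purely illustrative; MyApp is a placeholder main class), a cache-heavy application could be started with a reduced threshold:

java -XX:InitialTenuringThreshold=2 -XX:MaxTenuringThreshold=2 MyApp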
I actually did this and the stop-the-world time for minor GCs was reduced (I had a huge 250 GB JVM, so the effect was quite profound).
Related
My JVM memory allocation settings:
-Xms2048m -Xmx2048m -Xmn1536m
The official recommendation is that the young generation is 3/8 of the heap memory.
What effect will it have if the memory allocated by -Xmn is too small or too large?
The size of the young generation will determine the time between minor GCs. Objects are allocated in the Eden space using a simple pointer-bumping approach, which is very fast (for multiple threads it is a bit more complicated: thread-local allocation buffers are used to eliminate contention). The bigger your Eden space, the longer your application can create objects before the allocation pointer(s) reach the end of the address space.
When no more objects can be allocated in the Eden space, a minor GC is performed that copies live objects from Eden to a survivor space and promotes objects that have reached the tenuring threshold to the old generation. Most objects are very short-lived (the weak generational hypothesis) so, typically, only a small number of objects need to be copied. Making your Eden space larger also means more objects have a chance to become unreferenced before the next collection, so you will end up placing a lower load on the old generation.
The 3/8 advice is good for a wide range of applications. Obviously, for different applications, you may tune this up or down to fit the memory usage profile. One important rule to follow though is to keep the young generation less than 50% of the heap space (i.e. the young generation should always be smaller than the old generation). The reason for this is that, if not, the collector will run a major GC every time a minor GC is run. This is because the collector needs to ensure there is enough space in the old gen to promote objects from the young gen.
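As a concrete instance of the 3/8 guideline: with a 2048 MB heap, 3/8 of it is 768 MB, so a matching configuration (illustrative values; MyApp is a placeholder main class) could be:

java -Xms2048m -Xmx2048m -Xmn768m MyApp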
What are the criteria for putting a young object into an old region (making it an old object) rather than keeping it in the survivor regions?
Point 4 of the Young GC section of the official tutorial states:
"Live objects are evacuated (i.e., copied or moved) to one or more
survivor regions. If the aging threshold is met, some of the objects
are promoted to old generation regions."
But I can't find what that criteria is.
EDIT:
Amit Bhati pointed me to the MaxTenuringThreshold parameter. I didn't understand much from the official doc about it, but I think I've started to understand how it works.
With your help I think I found the answer here:
-XX:InitialTenuringThreshold=7 Sets the initial tenuring threshold for use in adaptive GC sizing in the parallel young collector. The
tenuring threshold is the number of times an object survives a young
collection before being promoted to the old, or tenured, generation.
-XX:MaxTenuringThreshold=n Sets the maximum tenuring threshold for use in adaptive GC sizing. The current largest value is 15. The default
value is 15 for the parallel collector and is 4 for CMS.
It is under the Debugging Options title :)
Under Garbage First (G1) Garbage Collection Options you can find this:
-XX:MaxTenuringThreshold=n Maximum value for tenuring threshold. The default value is 15.
It is not very descriptive if you have not read the InitialTenuringThreshold description in the other section. It seems InitialTenuringThreshold is not a valid G1 option, but I think the algorithm is the one described there.
The following doc is good at explaining how to alter (reduce) the rate at which items are promoted from the survivor spaces to the Old Gen in the G1 collector.
http://java-is-the-new-c.blogspot.co.uk/2013/07/tuning-and-benchmarking-java-7s-garbage.html (the section entitled Tuning The Young Generation)
As the above answers say, MaxTenuringThreshold is the key setting, but it is only an upper limit and will be ignored if your young generation isn't big enough for it to be honoured. In that case you'd need to increase either the overall young generation via NewRatio or just the survivor spaces via SurvivorRatio.
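For instance (illustrative values only; MyApp is a placeholder main class; note that a lower -XX:NewRatio means a larger young generation, and a lower -XX:SurvivorRatio means larger survivor spaces relative to Eden):

java -XX:NewRatio=2 -XX:SurvivorRatio=6 -XX:MaxTenuringThreshold=15 MyApp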
From the Javadocs:
The heap space is divided into the old and the new generation. The new
generation includes the new object space (eden), and two survivor
spaces. The JVM allocates new objects in the eden space, and moves
longer lived objects from the new generation to the old generation.
The young generation uses a fast copying garbage collector which
employs two semi-spaces (survivor spaces) in the eden, copying
surviving objects from one survivor space to the second. Objects that
survive multiple young space collections are tenured, meaning they are
copied to the tenured generation. The tenured generation is larger and
fills up less quickly. So, it is garbage collected less frequently;
and each collection takes longer than a young space only collection.
Collecting the tenured space is also referred to as doing a full
generation collection.
The frequent young space collections are quick (a few milliseconds),
while the full generation collection takes longer (tens of
milliseconds to a few seconds, depending upon the heap size).
Other GC algorithms, such as the Concurrent Mark Sweep (CMS)
algorithm, are incremental. They divide the full GC into several
incremental pieces. This provides a high probability of small pauses.
This process comes with an overhead and is not required for enterprise
web applications.
Also check this article: Java Garbage Collectors – Moving to Java7 Garbage-First (G1) Collector
The young generation comprises one Eden and two Survivor spaces.
The live objects in Eden are copied to the initially empty survivor
space, labeled S1 in the figure, except for ones that are too large to
fit comfortably in the S1 space. Such objects are directly copied to
the old generation. The live objects in the occupied survivor space
(labeled S0) that are still relatively young are also copied to the
other survivor space, while objects that are relatively old are copied
to the old generation. If the S1 space becomes full, the live objects
from Eden or S0 that have not been copied to it are tenured,
regardless of their age. Any objects remaining in Eden or the S0 space
after live objects have been copied are not live and need not be
examined. Figure below illustrates the heap after young generation
collection:
The young generation collection leads to stop the world pause. After
collection, eden and one survivor space are empty. Now let’s see how
CMS handles old generation collection. It essentially consists of two
major steps – marking all live objects and sweeping them.
I've read a few articles about how garbage collection works and still don't understand how using generations helps. As I understood it, the main idea is that we start collection from the youngest generation and move to the older generations. But why did the authors of this idea decide that starting from the youngest generation is the most efficient way?
The older the generation, the more the object has been used, and it will possibly be needed again.
Moving a recently created object there makes no sense; it may just be a temporary (locally scoped) object.
The authors start with the youngest generation first simply because that's what gets filled up first after your application starts; however, in reality, which generation is being swept, and when, is non-deterministic as your application runs.
The important points with generational GC are:
the young generation uses a copying collector which copies objects from Eden and the current survivor space to a space that it considers to be empty (the unused survivor space), and is therefore fast, so the GC pause is minimal.
add to this the fact that most objects die young: the pause required to copy the small number of surviving objects from Eden and the current survivor space is short, as only objects with live references are copied, after which Eden and the previous survivor space can be wiped.
after being copied several times, objects are copied to the tenured (old) generation. Eventually the tenured generation fills up; however, this time there's no clean space to copy the objects to, so the garbage collector has to sweep and compact within the generation, which is slow (compared to the copy performed in Eden and the survivor space), meaning a longer pause.
the good news, based on the "most objects die young" heuristic, is that major GCs happen much less frequently than minor ones, keeping GC pauses to a minimum over the lifetime of an application.
there's also the benefit that all new objects are allocated at the top of the heap, meaning there are minimal instructions required to do so, with defragmentation occurring naturally as part of the copy process.
Both these pages, Oracle Garbage Collection Tuning and Useful JVM Flags – Part 5 (Young Generation Garbage Collection), describe this.
Read this one.
Using different generations makes the allocation of objects easy and fast, as MOST of the allocations are done in a single region of the heap - Eden. Based on the observation that most objects die young (the Weak Generational Hypothesis), collections in the young generation find mostly garbage, so they reclaim a lot of memory, and the region is relatively small compared to the whole heap, which means the time taken to scan the live objects is also short. That's why young generation GCs are fast.
For more details on GC and generations, you can refer to this
Given a java memory configuration like the following
-Xmx2048m -Xms512m
What would be the behaviour of the VM when memory usage increases past 512m? Is there a particular algorithm that it follows? I.e., does it go straight to the max, does it double, does it grow in increments, or does it only allocate as it needs memory? How expensive an operation is it?
I'm looking specifically at the Oracle/Sun JVM, version 1.6. I assume this is documented on the Oracle website somewhere, but I'm having trouble finding it.
It's the garbage collector's job to decide when resizing is necessary, so it is determined by the GC parameter -XX:MinHeapFreeRatio. If the GC needs more space, it will grow the heap to a size where the percentage of the heap specified by that value is free.
A typical value on a modern platform is 40ish, so if you start at 512MB and have less than 40% free, meaning you exceeded ~308MB, it will increase until 40% is free again. So say after collection there are still 400MB worth of live objects, your heap will go up to ~667MB. (Yes it is named ratio but expects a % value as argument... search me!)
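Written out as a tiny sketch (the class and method names are illustrative only, not a real JVM API):

// Illustrative back-of-the-envelope arithmetic only; not real JVM code.
class HeapResizeSketch {
    // -XX:MinHeapFreeRatio=40 means at least 40% of the committed heap should be free after a GC.
    static long targetHeapSize(long liveSize, int minHeapFreeRatio) {
        // free/heap >= ratio  <=>  heap >= live / (1 - ratio/100)
        return (long) Math.ceil(liveSize / (1.0 - minHeapFreeRatio / 100.0));
    }

    public static void main(String[] args) {
        // 400 MB of live objects with a 40% minimum free ratio -> roughly 667 MB heap
        System.out.println(targetHeapSize(400, 40)); // prints 667
    }
}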
Note this is a bit inexact: the garbage collector is "generational" and actually can resize individual generations, but it also has enforced ratios between the generations' sizes, and if your objects are distributed between long-lived and short-lived in roughly the way it estimates, it works out pretty well as a back-of-the-envelope estimate.
This applies to defaults in Java 6. If you use custom garbage collector config it may be different. You can read about that here: http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#generation_sizing.total_heap
(The "expense" of the operation kind of depends on the operating system and what else is going on. If the system is loaded down and the OS has to do some swapping to make a contiguous block of memory for you, then it could be very expensive!)
Use of -verbose:gc and/or -XX:+PrintGCDetails options should give you many finer details.
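For example (MyApp is a placeholder for your main class):

java -verbose:gc -XX:+PrintGCDetails -Xms512m -Xmx2048m MyApp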
Here is an example output with the -verbose:gc option switched on:
[GC 325407K->83000K(776768K), 0.2300771 secs]
[GC 325816K->83372K(776768K), 0.2454258 secs]
[Full GC 267628K->83769K(776768K), 1.8479984 secs]
An explanation of the above taken from the official document:
Here we see two minor collections followed by one major collection.
The numbers before and after the arrow (e.g., 325407K->83000K from the
first line) indicate the combined size of live objects before and
after garbage collection, respectively. After minor collections the
size includes some objects that are garbage (no longer alive) but that
cannot be reclaimed. These objects are either contained in the tenured
generation, or referenced from the tenured or permanent generations.
The next number in parentheses (e.g., (776768K) again from the first
line) is the committed size of the heap: the amount of space usable
for java objects without requesting more memory from the operating
system. Note that this number does not include one of the survivor
spaces, since only one can be used at any given time, and also does
not include the permanent generation, which holds metadata used by the
virtual machine.
The last item on the line (e.g., 0.2300771 secs) indicates the time
taken to perform the collection; in this case approximately a quarter
of a second.
The format for the major collection in the third line is similar.
Running an application this way along with updating minimum and maximum heap sizes can give good insight into heap allocation and garbage collection patterns of the VM.
Note that on initialization the JVM virtually reserves the maximum address space but does not allocate physical memory. Typically the JVM divides the heap into an old and a young generation, with intermediate (survivor) spaces inside the young generation. Newly created objects are placed in the young generation.
When the Eden space fills up, a GC is invoked which moves reachable objects into one of the intermediate spaces, called survivor spaces, in the young generation segment. The GC may use a stop-the-world algorithm (saving each thread's state) or a concurrent algorithm that lets the application threads keep running.
When the survivor spaces fill up, objects overflow into the old generation, and when the old generation fills up the JVM invokes a full GC.
We know that there are a few main memory areas: Young, Tenured (Old Gen) and PermGen.
The Young area is divided into Eden and two Survivor spaces.
OldGen is for surviving objects.
MaxTenuringThreshold keeps objects from being finally copied to the OldGen space too early. It's pretty clear and understandable.
But how does it work? How does the garbage collector deal with objects that are still surviving but have not yet reached MaxTenuringThreshold, and where are they located?
Are they copied back and forth between the Survivor spaces during garbage collection, or does it happen in some other way?
Each object in the Java heap has a header which is used by the garbage collection (GC) algorithm. The young space collector (which is responsible for object promotion) uses a few bits from this header to track the number of collections the object has survived (a 32-bit JVM uses 4 bits for this, a 64-bit one probably a few more).
During a young space collection, every single live object is copied. The object may be copied to one of the survivor spaces (the one which is empty before the young GC) or to the old space. For each object being copied, the GC algorithm increases its age (number of collections survived), and if the age is above the current tenuring threshold the object is copied (promoted) to the old space. The object can also be copied directly to the old space if the survivor space gets full (overflow).
The journey of an object follows this pattern:
allocated in Eden
copied from Eden to a survivor space due to a young GC
copied from one survivor space to the other due to a young GC (this could happen a few times)
promoted from a survivor space (or possibly Eden) to the old space due to a young GC (or full GC)
The actual tenuring threshold is dynamically adjusted by the JVM, but MaxTenuringThreshold sets an upper limit on it.
If you set MaxTenuringThreshold=0, all objects will be promoted immediately.
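As an illustration of the decision described above (the class, enum and parameter names are hypothetical, not HotSpot internals):

// Hypothetical model of the copy/promote decision made for each live object during a young GC.
class PromotionSketch {
    enum Destination { SURVIVOR, OLD }

    static Destination destinationFor(int objectAge, int currentTenuringThreshold, boolean targetSurvivorFull) {
        // Promote if the object has survived enough young collections,
        // or if the target survivor space has no room left (overflow).
        if (objectAge >= currentTenuringThreshold || targetSurvivorFull) {
            return Destination.OLD;
        }
        return Destination.SURVIVOR; // the object's age is incremented as part of the copy
    }
}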
I have a few articles about Java garbage collection; there you can find more details.
(Disclaimer: This covers HotSpot VM only)
As Alexey states, the tenuring threshold that is actually used is determined by the JVM dynamically. There is very little value in setting it. For most applications the default value of 15 will be high enough, as usually way more objects survive the collection.
When many objects survive the collection, the survivor spaces overflow directly into the old generation. This is called premature promotion and is an indicator of a problem. However, it can seldom be solved by tuning MaxTenuringThreshold.
In those cases, SurvivorRatio can sometimes be used to increase the size of the survivor spaces, allowing tenuring to actually work.
However, most often enlarging the young generation is the only good choice (from a configuration perspective).
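For instance (values purely illustrative; MyApp is a placeholder main class), one might fix the young generation size explicitly and enlarge the survivor spaces at the same time:

java -Xms4g -Xmx4g -Xmn1g -XX:SurvivorRatio=6 MyApp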
If you are looking at it from a coding perspective, you should avoid excessive object allocation to let tenuring work as designed.
To answer exactly what you asked:
When an object reaches its JVM-determined tenuring threshold, it is copied to the old generation. Before that, it is copied to the (currently empty) survivor space during each young collection. Objects that survive for a while but become unreferenced before reaching the threshold are cleaned from the survivor space very efficiently.