what are the effects of paging on garbage collection ?
The effects of paging on garbage-collection are pretty much the same as upon anything; it allows access to lots of memory, but hurts performance when it happens.
The more pressing question, is what is the effect of garbage-collection on paging?
Garbage collection can cause areas of memory to be read from and written to that would not be considered otherwise at a given point of time. Reducing the degree to which garbage collection causes paging to happen is therefore advantageous. This is one of the advantages that a generational compacting collector offers, as it leads to more short-lived objects being in one page, collected from that page, and the memory made available to other objects, while also keeping long-lived objects in a page where related objects are more likely to also be (long-lived objects will often be related to other long-lived objects because one long-lived object is keeping the others alive). This not only reduces the amount of paging necessary to perform the collection, but can help reduce the amount of paging necessary for the rest of the application.
First a bit of terminology. In some areas, e.g. Linux-related talks, paging is a feature of the operating system in which executable code needs not be permanently in RAM. Executable code comes from an executable file, and the kernel loads it from the disk on demand, when the CPU walks through the instructions in the program. When memory is tight, the kernel may decide to simply "forget" a page of code, because it knows that it can always reload it from the executable file, if that code needs to be executed again.
The kernel also implements another feature which is called swapping and is about something similar, but for data. Data is not obtained from the executable file. Hence, the kernel cannot simply forget a page of datal; it has to save it somewhere, in a dedicated area called a "swap file" or "swap partition". This makes swapping more expensive than paging: the kernel must write out the data page before reusing the corresponding RAM, whereas a code page can simply be reused directly. In practice, the kernel pages quite a lot before considering swapping.
Paging is thus orthogonal to garbage collection. Swapping, however, is not. The general rule of thumb is that swapping and GC do not mix well. Most GC work by regularly inspecting data, and if said data has been sent to the swap partition, then it will have to be reloaded from that partition, which means that some other data will have to be sent to the said partition, because if the data was in the swap and not in RAM then this means that memory is tight. In the presence of swapping, a GC tends to imply an awful lot of disk activity.
Some GC apply intricate strategies to reduce swap-related strategies. This includes generational GC (which try to explore old data less often) and strict typing (the GC looks at data because it needs to locate pointers; if it knows that a big chunk of RAM contains only non-pointers, e.g. it is some picture data with only pixel values, then it can leave it alone, and in particular not force it back from the swap area). The GC in the Java virtual machine (the one from Sun/Oracle) is known to be quite good at that. But that's only relative: if your Java application hits swap, then you will suffer horribly. But it could have been much worse.
Just buy some extra RAM.
Related
Is there any way to change the frequency of the garbage collector, whether if to reduce it or increase it?
I found some articles that say that in order to increase the frequency, I need to increase the young generation to allow more objects to get into it before a GC is called.
But I didn't find anywhere a real way to do it, with real commands or actions or instructions HOW to make it happen (to reduce or to increase GC frequency).
You first configure which garbage collector you want to use, then you may be able to configure said garbage collector.
Whatever you read about it is irrelevant / invalid / misleading / vastly oversimplified. Garbage Collection is extremely complicated, specifically so complicated that talking about 'frequency' doesn't really make sense. 20 years ago garbage collection was simple:
Freeze every thread.
Make a list of all live objects.
Tree-walk these objects to find all objects reachable from those live objects, and keep walking.
Start from position 0 in the heap memory allocated to the JVM and start moving every object still reachable, updating all pointers as you go, and therefore silently overwriting all non-reachables.
Now you're done; memory is nicely compacted, lots of free space, unfreeze the world.
That model? It died 20 years ago. Garbage collection is vastly more complicated now, with aspects like:
Live tracking: Where the JVM uses heuristic mechanisms to be able to fast-collect a subset of garbage. (basically, reference counting; if the refcount is 0, it's definitely garbage. However, non-0 refcounts could also be garbage, for example if a refers to b, b refers to a, and nothing 'live' refers to either: Both refcounts are 1, but they're still garbage). These garbage collectors still collect them, just, not as quickly as refcount-0 garbage. What does 'frequency' mean now?
Generations, with vastly different approaches between generations. For example, your basic eden/'fast garbage' system works in reverse: A java thread gets a page worth of memory, new objects are created here, completely unreachable by any other thread. Once it is full, the system does a quick check on what this and only this thread can currently reach in context, makes a new page, copies over just the objects still reachable, and marks the old page as free. "Free garbage collection" just occurred. What the heck would 'frequency' mean here? There is nothing to configure: When the page is full, this process kicks in. Until the page is full, it doesn't. There's nothing to configure.
that's just 2 of like 50 things that garbage collectors do that cannot be described simply as a thing to which the term 'frequency' can be applied unambiguously.
Every JDK version sees pretty massive changes to the GC implementations available, and the way these implementations works, and even the settings these implementations support. None of it is part of the core java spec, which means that the OpenJDK team is far more cavalier about changing them between java releases, and for the same reason, alternate JDK providers like Azul, coretto etc often provide extra GC impls and extra settings.
So what do I do?
Stop worrying. The general rule of thumb is: If you mess with GC settings, you'll make everything worse. Get an expert if you need to tweak GC settings, and rest safe in the knowledge that it is highly unlikely you need it.
Forget about what you read. It's outdated information.
There are some articles available online where they mention some of the cons of using Stream-s over old loop-s:
https://blog.jooq.org/2015/12/08/3-reasons-why-you-shouldnt-replace-your-for-loops-by-stream-foreach/
https://jaxenter.com/java-performance-tutorial-how-fast-are-the-java-8-streams-118830.html
But is there any impact from the GC perspective? As I assume (is it correct?) that every stream call creates some short-lived objects underneath. If the particular code fragment which uses streams is called frequently by the underlying system could it cause eventually some performance issue from the GC perspective or put extra pressure on GC? Or the impact is minimal and could be ignored most of the time?
Are there any articles covering this more in detail?
To be fair, it's very complicated to give an answer when Holger already linked the main idea via his answer; still I will try to.
Extra pressure on GC - may be. Extra time for a GC cycle to execute - most probably not. Ignorable? I'd say totally. In the end what you care from a GC - that it takes little time to reclaim lots of space, preferably with super tiny stop-the-world events.
Let's talk about the potential overhead in the GC main two phases : mark and evacuation/realocation (Shenandoah/ZGC). First mark phase, where GC finds out what is garbage (by actually identifying what is alive).
If objects that were created by the Stream internals are not reachable, they will never be scanned (zero overhead here) and if they are reachable, scanning them will be extremely fast. The other side of the story is: when you create an Object and GC might touch it while it's running in the mark phase, the slow path of a LoadBarrier (in case of Shenandoah) will be active. This will add some tens of ns I assume to the total time of that particular phase of the GC as well as some space in the SATB queues. Aleksey Shipilev in one talk said that he tried to measure the overhead from executing a single barrier and could not, so he measured 3 and the time was in the region of tens of ns. I don't know the exact details of ZGC, but a LoadBarrier is there in place too.
The main point is that this mark phase is done in a concurrent fashion, while the application is running, so you application will still run perfectly fine. And even if some GC code will be triggered to do something specific work (Load Barrier), it will be extremely fast and completely transparent to you.
The second phase is "compactation", or making space for future allocations. What a GC does is move live objects from regions with the most garbage (Shenandoah for sure) to regions that are empty. But only live objects. So if a certain region has 100 objects and only 1 is alive, only 1 will be moved, then that entire region is going to be marked as free. So potentially if the Stream implementation generated only garbage (i.e.: not currently alive), it is "free lunch" for GC, it will not even know it existed.
The better picture here is that this phase is still done concurrently. To keep the "concurrency" active, you need to know how much was allocated from start to end of a GC cycle. This amount is the minimum "extra" space you need to have on top of the java process in order for a GC to be happy.
So overall, you are looking at a super tiny impact; if any at all.
For example in a service adapter you might:
a. have an input data model and an output data model, maybe even immutable, with different classes and use Object Mappers to transform between classes and create some short-lived objects along the way
b. have a single data model, some of the classes might be mutable, but the same object that was created for the input is also sent as output
There are other use-cases when you'd have to choose between clear code with many objects and less clear code with less objects and I would like to know if Garbage Collection still has a weight in this decision.
I should make this a comment as IMO it does not qualify as an answer, but it will not fit.
Even if the answer(s) are going to most probably be - do whatever makes your code more readable (and to be honest I still follow that all the time); we have faced this issue of GC in our code base.
Suppose that you want to create a graph of users (we had to - around 1/2 million) and load all their properties in memory and do some aggregations on them and filtering, etc. (it was not my decision), because these graph objects where pretty heavy - once loaded even with 16GB of heap the JVM would fail with OOM or GC would take huge pauses. And it's understandable - lots of data requires lots of memory, you can't run away from it. The solution proposed and that actually worked was to model that with simple BitSets - where each bit would be a property and a potential linkage to some other data; this is by far not readable and extremely complicated to maintain to this day. Lots of shifts, lots of intrinsics of the data - you have to know at all time what the 3-bit means for example, there's no getter for usernameIncome let's say - you have to do quite a lot shifts and map that to a search table, etc. But it would keep the GC pretty low, at least in the ranges where we were OK with that.
So unless you can prove that GC is taken your app time so much - you probably are even safer simply adding more RAM and increasing it(unless you have a leak). I would still go for clear code like 99.(99) % of the time.
Newer versions of Java have quite sophisticated mechanisms to handle very short-living objects so it's not as bad as it was in the past. With a modern JVM I'd say that you don't need to worry about garbage collection times if you create many objects, which is a good thing since there are now many more of them being created on the fly that this was the case with older versions of Java.
What's still valid is to keep the number of created objects low if the creation is coming with high costs, e.g. accessing a database to retrieve data from, network operations, etc.
As other people have said I think it's better to write your code to solve the problem in an optimum way for that problem rather than thinking about what the garbage collector (GC) will do.
The key to working with the GC is to look at the lifespan of your objects. The heap is (typically) divided into two main regions called generations to signify how long objects have been alive (thus young and old generations). To minimise the impact of GC you want your objects to become eligible for collection while they are still in the young generation (either in the Eden space or a survivor space, but preferably Eden space). Collection of objects in the Eden space is effectively free, as the GC does nothing with them, it just ignores them and resets the allocation pointer(s) when a minor GC is finished.
Rather than explicitly calling the GC via System.gc() it's much better to tune your heap. For example, you can set the size of the young generation using command line options like -XX:NewRatio=n, where n signifies the ratio of new to old (e.g. setting it to 3 will make the ratio of new:old 1:3 so the young generation will be 1 quarter of the heap). Alternatively, you can set the size explicitly using -XX:NewSize=n and -XX:MaxNewSize=m. The GC may resize the heap during collections so setting these values to be the same will keep it at a fixed size.
You can profile your code to establish the rate of object creation and how long your objects typically live for. This will give you the information to (ideally) configure your heap to minimise the number of objects being promoted into the old generation. What you really don't want is objects being promoted and then becoming garbage shortly thereafter.
Alternatively, you may want to look at the Zing JVM from Azul (full disclosure, I work for them). This uses a different GC algorithm, called C4, which enables compaction of the heap concurrently with application threads and so eliminates most of the impact of the GC on application latency.
It is not clear to me how much memory my app is actually needs when I use Garbage Collection and Memory Visualizer in Eclipse. Looking at this graph:
At say 0:12 it has acquired a bit more than 0,4 GB. I know it acquires more heap (heap size in graph) than actually needs. But what I want to see is how much memory it really uses. The other two graphs just confuses the picture. Can I do that in gcmv?
There is no clear answer to the question, how much memory is used, as that depends on the definition of used. At some time, the JVM will acquire memory and the application will use memory by writing into it to be able to read it at a later time. You can say, it uses the memory, as long as there will be another access to that memory at a later time. If you’re never going to read a value, the memory holding the value is not really needed, but it’s a technical challenge to find out whether the application is going to read it again.
One way to predict whether there will be an access at a later time, is to utilize the object structure and consider every unreachable object as unused. This analysis, called garbage collection, runs at certain points of time, so at these points, we will often find the actually used memory to be smaller than the memory that the application used between two garbage collection runs. But between these points, there is no knowledge about the used memory amount regarding reachable objects.
So you see two graphs in your chart, one representing the memory that your application has written data into between two garbage collection cycles, and the other connects the resulting still-reachable memory after garbage collection. The required size is assumed to lie somewhere in-between. For arbitrary time points, the higher “before collection” values stem from the fact that the application has used them without a garbage collection happening, but if a garbage collection had happened at that time, the actually used (still reachable) memory is likely to be lower than that, but unlikely to be lower than the “after collection” values (though even that is not known for sure).
When you want to be on the safe side, you ensure that there’s always at least as much memory as the recorded “before collection” value available and the application should just run as in your recorded process. However, you possibly could reduce the available memory to a value between the “before collection” and the maximum “after collection” value and still run the application successfully, but with different garbage collection cycles, possibly spending more CPU time for garbage collection.
I am building a Java web app, using the Play! Framework. I'm hosting it on playapps.net. I have been puzzling for a while over the provided graphs of memory consumption. Here is a sample:
The graph comes from a period of consistent but nominal activity. I did nothing to trigger the falloff in memory, so I presume this occurred because the garbage collector ran as it has almost reached its allowable memory consumption.
My questions:
Is it fair for me to assume that my application does not have a memory leak, as it appears that all the memory is correctly reclaimed by the garbage collector when it does run?
(from the title) Why is java waiting until the last possible second to run the garbage collector? I am seeing significant performance degradation as the memory consumption grows to the top fourth of the graph.
If my assertions above are correct, then how can I go about fixing this issue? The other posts I have read on SO seem opposed to calls to System.gc(), ranging from neutral ("it's only a request to run GC, so the JVM may just ignore you") to outright opposed ("code that relies on System.gc() is fundamentally broken"). Or am I off base here, and I should be looking for defects in my own code that is causing this behavior and intermittent performance loss?
UPDATE
I have opened a discussion on PlayApps.net pointing to this question and mentioning some of the points here; specifically #Affe's comment regarding the settings for a full GC being set very conservatively, and #G_H's comment about settings for the initial and max heap size.
Here's a link to the discussion, though you unfortunately need a playapps account to view it.
I will report the feedback here when I get it; thanks so much everyone for your answers, I've already learned a great deal from them!
Resolution
Playapps support, which is still great, didn't have many suggestions for me, their only thought being that if I was using the cache extensively this may be keeping objects alive longer than need be, but that isn't the case. I still learned a ton (woo hoo!), and I gave #Ryan Amos the green check as I took his suggestion of calling System.gc() every half day, which for now is working fine.
Any detailed answer is going to depend on which garbage collector you're using, but there are some things that are basically the same across all (modern, sun/oracle) GCs.
Every time you see the usage in the graph go down, that is a garbage collection. The only way heap gets freed is through garbage collection. The thing is there are two types of garbage collections, minor and full. The heap gets divided into two basic "areas." Young and tenured. (There are lots more subgroups in reality.) Anything that is taking up space in Young and is still in use when the minor GC comes along to free up some memory, is going to get 'promoted' into tenured. Once something makes the leap into tenured, it sits around indefinitely until the heap has no free space and a full garbage collection is necessary.
So one interpretation of that graph is that your young generation is fairly small (by default it can be a fairly small % of total heap on some JVMs) and you're keeping objects "alive" for comparatively very long times. (perhaps you're holding references to them in the web session?) So your objects are 'surviving' garbage collections until they get promoted into tenured space, where they stick around indefinitely until the JVM is well and good truly out of memory.
Again, that's just one common situation that fits with the data you have. Would need full details about the JVM configuration and the GC logs to really tell for sure what's going on.
Java won't run the garbage cleaner until it has to, because the garbage cleaner slows things down quite a bit and shouldn't be run that frequently. I think you would be OK to schedule a cleaning more frequently, such as every 3 hours. If an application never consumes full memory, there should be no reason to ever run the garbage cleaner, which is why Java only runs it when the memory is very high.
So basically, don't worry about what others say: do what works best. If you find performance improvements from running the garbage cleaner at 66% memory, do it.
I am noticing that the graph isn't sloping strictly upward until the drop, but has smaller local variations. Although I'm not certain, I don't think memory use would show these small drops if there was no garbage collection going on.
There are minor and major collections in Java. Minor collections occur frequently, whereas major collections are rarer and diminish performance more. Minor collections probably tend to sweep up stuff like short-lived object instances created within methods. A major collection will remove a lot more, which is what probably happened at the end of your graph.
Now, some answers that were posted while I'm typing this give good explanations regarding the differences in garbage collectors, object generations and more. But that still doesn't explain why it would take so absurdly long (nearly 24 hours) before a serious cleaning is done.
Two things of interest that can be set for a JVM at startup are the maximum allowed heap size, and the initial heap size. The maximum is a hard limit, once you reach that, further garbage collection doesn't reduce memory usage and if you need to allocate new space for objects or other data, you'll get an OutOfMemoryError. However, internally there's a soft limit as well: the current heap size. A JVM doesn't immediately gobble up the maximum amount of memory. Instead, it starts at your initial heap size and then increases the heap when it's needed. Think of it a bit as the RAM of your JVM, that can increase dynamically.
If the actual memory use of your application starts to reach the current heap size, a garbage collection will typically be instigated. This might reduce the memory use, so an increase in heap size isn't needed. But it's also possible that the application currently does need all that memory and would exceed the heap size. In that case, it is increased provided that it hasn't already reached the maximum set limit.
Now, what might be your case is that the initial heap size is set to the same value as the maximum. Suppose that would be so, then the JVM will immediately seize all that memory. It will take a very long time before the application has accumulated enough garbage to reach the heap size in memory usage. But at that moment you'll see a large collection. Starting with a small enough heap and allowing it to grow keeps the memory use limited to what's needed.
This is assuming that your graph shows heap use and not allocated heap size. If that's not the case and you are actually seeing the heap itself grow like this, something else is going on. I'll admit I'm not savvy enough regarding the internals of garbage collection and its scheduling to be absolutely certain of what's happening here, most of this is from observation of leaking applications in profilers. So if I've provided faulty info, I'll take this answer down.
As you might have noticed, this does not affect you. The garbage collection only kicks in if the JVM feels there is a need for it to run and this happens for the sake of optimization, there's no use of doing many small collections if you can make a single full collection and do a full cleanup.
The current JVM contains some really interesting algorithms and the garbage collection itself id divided into 3 different regions, you can find a lot more about this here, here's a sample:
Three types of collection algorithms
The HotSpot JVM provides three GC algorithms, each tuned for a specific type of collection within a specific generation. The copy (also known as scavenge) collection quickly cleans up short-lived objects in the new generation heap. The mark-compact algorithm employs a slower, more robust technique to collect longer-lived objects in the old generation heap. The incremental algorithm attempts to improve old generation collection by performing robust GC while minimizing pauses.
Copy/scavenge collection
Using the copy algorithm, the JVM reclaims most objects in the new generation object space (also known as eden) simply by making small scavenges -- a Java term for collecting and removing refuse. Longer-lived objects are ultimately copied, or tenured, into the old object space.
Mark-compact collection
As more objects become tenured, the old object space begins to reach maximum occupancy. The mark-compact algorithm, used to collect objects in the old object space, has different requirements than the copy collection algorithm used in the new object space.
The mark-compact algorithm first scans all objects, marking all reachable objects. It then compacts all remaining gaps of dead objects. The mark-compact algorithm occupies more time than the copy collection algorithm; however, it requires less memory and eliminates memory fragmentation.
Incremental (train) collection
The new generation copy/scavenge and the old generation mark-compact algorithms can't eliminate all JVM pauses. Such pauses are proportional to the number of live objects. To address the need for pauseless GC, the HotSpot JVM also offers incremental, or train, collection.
Incremental collection breaks up old object collection pauses into many tiny pauses even with large object areas. Instead of just a new and an old generation, this algorithm has a middle generation comprising many small spaces. There is some overhead associated with incremental collection; you might see as much as a 10-percent speed degradation.
The -Xincgc and -Xnoincgc parameters control how you use incremental collection. The next release of HotSpot JVM, version 1.4, will attempt continuous, pauseless GC that will probably be a variation of the incremental algorithm. I won't discuss incremental collection since it will soon change.
This generational garbage collector is one of the most efficient solutions we have for the problem nowadays.
I had an app that produced a graph like that and acted as you describe. I was using the CMS collector (-XX:+UseConcMarkSweepGC). Here is what was going on in my case.
I did not have enough memory configured for the application, so over time I was running into fragmentation problems in the heap. This caused GCs with greater and greater frequency, but it did not actually throw an OOME or fail out of CMS to the serial collector (which it is supposed to do in that case) because the stats it keeps only count application paused time (GC blocks the world), application concurrent time (GC runs with application threads) is ignored for those calculations. I tuned some parameters, mainly gave it a whole crap load more heap (with a very large new space), set -XX:CMSFullGCsBeforeCompaction=1, and the problem stopped occurring.
Probably you do have memory leaks that's cleared every 24 hours.