Java Memory Consumption - java

I am performing an analysis of different sorting algorithms, currently Insertion Sort and Quick Sort. As part of the analysis, I need to measure the memory consumption.
I am using VisualVM for profiling. However, when I execute the Insertion Sort for a random data set of, let's say, 70,000 elements, I get a different range of heap memory usage on each run. For example, in the first run the heap memory consumption was 75 kbytes, and in the next run it dropped to 35 kbytes. If I execute it a few more times, the value fluctuates randomly.
Is this normal, or am I missing something here? I want to plot a graph of data size versus memory consumption, and with this fluctuation I won't be able to draw a meaningful chart.
java version "1.8.0_65"

This is Java's garbage collector at work; it kicks in at its own pace and does its job. It would probably be best to measure the amount of memory used after explicitly calling System.gc(), so that you're not measuring the garbage as well.
EDIT:
System.gc() should be called after you perform your tests, to explicitly request that the garbage collector kick in. While it is true that System.gc() is treated only as a request and the JVM is not guaranteed to honour it, it is most probably safe enough for your analysis, especially if you perform several runs.
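As a rough illustration, a minimal sketch of that approach (the insertionSort method and the data size are just stand-ins for your own test code):

// Request a collection, sample used heap via Runtime before and after the run.
public class MemorySample {
    public static void main(String[] args) {
        int[] data = new java.util.Random(42).ints(70_000).toArray();

        Runtime rt = Runtime.getRuntime();
        System.gc();                                     // request a collection first
        long before = rt.totalMemory() - rt.freeMemory();

        insertionSort(data);                             // the code under test

        System.gc();                                     // request collection of garbage produced by the run
        long after = rt.totalMemory() - rt.freeMemory();
        System.out.printf("used before=%d bytes, after=%d bytes%n", before, after);
    }

    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i], j = i - 1;
            while (j >= 0 && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }
}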
With regard to measuring memory usage, it is quite tricky, especially for low values. Please see this answer, which contains some nice details.

You may find JMH useful for running benchmarks while isolating side effects from the JVM.
Read through the code samples to understand how to use it.
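As a rough sketch of what a JMH benchmark looks like (this assumes the jmh-core and jmh-generator-annprocess dependencies are on the classpath; the class and method names are only illustrative):

import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class SortBenchmark {
    int[] data;

    @Setup
    public void prepare() {
        data = new Random(42).ints(70_000).toArray();
    }

    @Benchmark
    public int[] insertionSort() {
        int[] copy = Arrays.copyOf(data, data.length);   // sort a fresh copy on each invocation
        for (int i = 1; i < copy.length; i++) {
            int key = copy[i], j = i - 1;
            while (j >= 0 && copy[j] > key) { copy[j + 1] = copy[j]; j--; }
            copy[j + 1] = key;
        }
        return copy;                                      // return the result so JMH keeps it live
    }
}

JMH also ships a GC profiler (run with -prof gc) that reports allocation rates per benchmark, which is often more stable than sampling heap usage directly.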

Related

Inducing Java to do lengthy garbage collection

I would like to demonstrate to my students that using Java in real-time systems might be problematic since Java might do unexpected garbage collection. How can I write a Java program that will:
likely cause Java to stop and do garbage collection at an unexpected time (without System.gc());
The garbage collection will take noticeable time (e.g. several seconds)?
In case this matters, I use Open JDK 8 and Oracle JDK 8 on Ubuntu 16.04.
If it is not possible to do both, then I will be happy with at least item 2, i.e., a program where the garbage collection takes a long time when I do System.gc().
NOTE: I am not looking for a graphic representation of the garbage collection process - only to show that it takes a long time.
You might want to check out 'The Real-Time Specification for Java' here.
You might also want to read through this, an introduction to real time programming on/in java.
You can start with a relatively large heap (java -Xmx512m ...) to give the collector a sufficiently messy heap to work on.
Then create many objects, ideally object graphs with cycles (cyclic references with long paths), and let many of them become obsolete. Multiple threads make it worse still.
Show some animation that visibly halts during garbage collection, ideally plotting the step time and Runtime.freeMemory() in a diagram.
Use a JVM monitor as well; it would be a good moment to introduce memory and CPU profiling in NetBeans or Eclipse.
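A rough sketch along those lines (the heap size, ring size, and thresholds are only a starting point and may need tuning for your machine):

// Run with a constrained heap, e.g.: java -Xmx512m GcPauseDemo
// It keeps a large set of cyclically linked object graphs live, keeps replacing
// parts of that set, and prints each step's duration plus Runtime.freeMemory(),
// so GC pauses show up as spikes in the step time.
import java.util.ArrayList;
import java.util.List;

public class GcPauseDemo {
    static class Node {
        Node next;
        byte[] payload = new byte[128];
    }

    public static void main(String[] args) {
        List<Node> live = new ArrayList<>();
        long last = System.nanoTime();
        for (int step = 0; ; step++) {
            // Build a ring of nodes (a long cyclic reference path) and keep it reachable.
            Node head = new Node();
            Node cur = head;
            for (int i = 0; i < 5_000; i++) {
                cur.next = new Node();
                cur = cur.next;
            }
            cur.next = head;                       // close the cycle
            live.add(head);
            if (live.size() > 200) {
                live.subList(0, 100).clear();      // let the older graphs become garbage
            }
            long now = System.nanoTime();
            System.out.printf("step %d took %.1f ms, free=%d%n",
                    step, (now - last) / 1e6, Runtime.getRuntime().freeMemory());
            last = now;
        }
    }
}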
Have you considered that what you are telling your students is not generally true if you cannot reproduce it? If your toy definition of realtime is "pauses less than 1s" then many JVM applications can be considered realtime.
And that's without resorting to pauseless collectors.
But of course one can also trivially construct cases where a GC takes more than 1 second, e.g. by allocating a sufficiently large heap so that the system starts swapping. But in that case non-GCed languages would also experience latency spikes - maybe not quite as bad - and thus it would be a quite poor demonstration of GC issues.
System.gc() also makes for a bad demonstration, since it currently invokes a single-threaded collection while the default collectors normally make use of multiple cores. Fixes are underway.
To construct a somewhat realistic scenario under which modern collectors will experience >1s pauses, you will need
a large heap - multiple cores can generally chew through small heaps quite fast
a large live set size - dead objects are cheap
lots of small objects - large arrays of primitives are generally collected quickly
lots of references between objects - null fields don't need updating, and random object graphs also make life a lot harder for G1GC
And since you're using OpenJDK you would merely be testing collectors of a JVM targeting desktop and server-class workloads. There are other JVMs and 3rd-party garbage collector implementations which aim for realtime goals so one could simply dismiss your demonstration as calling a downhill bike a horrible bicycle because you can't win a velodrome race with it.
A small, ordinary Java program cannot produce multi-second GC pauses. Instead, run a program that creates a lot of objects with a very small heap, and use the switches below while executing it; they print GC logs in enough detail to be easy to explain.
-XX:+UseSerialGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
i.e. (after compiling with javac): java -XX:+UseSerialGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps Abc
The reason System.gc() takes a long time is that it makes the garbage collector do a full collection, basically looking at all the objects in the heap.
If you create many objects and let them die young, mostly young collections will occur. A young (minor) collection is much faster than a major collection because it only looks at objects in the young generation space.
To ensure a long GC pause, I suggest holding object references until the heap is almost full, then releasing some of the older references and allocating more objects; this way a major collection will likely happen and cause longer latency (see the sketch below).
You can view the latency with the JDK tool Flight Recorder (jmc).
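A rough sketch of that idea (the heap fraction and chunk size are arbitrary and may need adjusting for your collector's generation sizing):

// Run with a small heap and GC logging, e.g.:
//   java -Xmx256m -XX:+UseSerialGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps MajorGcDemo
import java.util.ArrayList;
import java.util.List;

public class MajorGcDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long target = rt.maxMemory() / 2;        // fill roughly half the heap with live data
        List<byte[]> held = new ArrayList<>();
        long allocated = 0;
        while (allocated < target) {
            held.add(new byte[64 * 1024]);       // long-lived chunks get promoted to the old generation
            allocated += 64 * 1024;
        }
        // Drop the older half of the references and allocate the same amount again;
        // the freed space sits in the old generation, so reclaiming it needs a major collection.
        held.subList(0, held.size() / 2).clear();
        int refill = held.size();
        for (int i = 0; i < refill; i++) {
            held.add(new byte[64 * 1024]);
        }
        System.out.println("live chunks: " + held.size());
    }
}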

Java slower with big heap

I have a Java program that operates on a (large) graph. Thus, it uses a significant amount of heap space (~50GB, which is about 25% of the physical memory on the host machine). At one point, the program (repeatedly) picks one node from the graph and does some computation with it. For some nodes, this computation takes much longer than anticipated (30-60 minutes, instead of an expected few seconds). In order to profile these operations to find out what takes so much time, I have created a test program that creates only a very small part of the large graph and then runs the same operation on one of the nodes that took very long to compute in the original program. Thus, the test program obviously only uses very little heap space, compared to the original program.
It turns out that an operation that took 48 minutes in the original program can be done in 9 seconds in the test program. This really confuses me. The first thought might be that the larger program spends a lot of time on garbage collection. So I turned on the verbose mode of the VM's garbage collector. According to that, no full garbage collections are performed during the 48 minutes, and only about 20 collections in the young generation, which each take less than 1 second.
So my questions is what else could there be that explains such a huge difference in timing? I don't know much about how Java internally organizes the heap. Is there something that takes significantly longer for a large heap with a large number of live objects? Could it be that object allocation takes much longer in such a setting, because it takes longer to find an adequate place in the heap? Or does the VM do any internal reorganization of the heap that might take a lot of time (besides garbage collection, obviously).
I am using Oracle JDK 1.7, if that's of any importance.
While bigger memory might mean bigger problems, I'd say there's nothing (except the GC, which you've excluded) that could stretch 9 seconds into 48 minutes (a factor of 320).
A big heap makes worse spatial locality possible, but I don't think it matters here. I disagree with Tim's answer w.r.t. "having to leave the cache for everything".
There's also the TLB, which is a cache for virtual address translation and could cause some problems with very large memory. But again, not a factor of 320.
I don't think there's anything in the JVM which could cause such problems.
The only reason I can imagine is that you have some swap space which gets used - despite the fact that you have enough physical memory. Even slight swapping can be the cause for a huge slowdown. Make sure it's off (and possibly check swappiness).
Even when things are in memory you have multiple levels of caching of data on modern CPUs. Every time you leave the cache to fetch data the slower that will go. Having 50GB of ram could well mean that it is having to leave the cache for everything.
The symptoms and differences you describe are just massive though and I don't see something as simple as cache coherency making that much difference.
The best advice I can give you is to try running a profiler against it both when it's running slow and when it's running fast and compare the difference.
You need solid numbers and timings. "In this environment doing X took Y time". From that you can start narrowing things down.

Java's Serial garbage collector performing far better than other garbage collectors?

I'm testing an API, written in Java, that is expected to minimize latency in processing messages received over a network. To achieve these goals, I'm playing around with the different garbage collectors that are available.
I'm trying four different techniques, which utilize the following flags to control garbage collection:
1) Serial: -XX:+UseSerialGC
2) Parallel: -XX:+UseParallelOldGC
3) Concurrent: -XX:+UseConcMarkSweepGC
4) Concurrent/incremental: -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing
I ran each technique over the course of five hours. I periodically used the list of GarbageCollectorMXBean provided by ManagementFactory.getGarbageCollectorMXBeans() to retrieve the total time spent collecting garbage.
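For reference, polling those beans takes only a few lines of the standard java.lang.management API; roughly:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void printGcTotals() {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }

    public static void main(String[] args) {
        printGcTotals();
    }
}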
My results? Note that "latency" here is "Amount of time that my application+the API spent processing each message plucked off the network."
Serial: 789 GC events totaling 1309 ms; mean latency 47.45 us, median latency 8.704 us, max latency 1197 us
Parallel: 1715 GC events totaling 122518 ms; mean latency 450.8 us, median latency 8.448 us, max latency 8292 us
Concurrent: 4629 GC events totaling 116229 ms; mean latency 707.2 us, median latency 9.216 us, max latency 9151 us
Incremental: 5066 GC events totaling 200213 ms; mean latency 515.9 us, median latency 9.472 us, max latency 14209 us
I find these results to be so improbable that they border on absurd. Does anyone know why I might be having these kinds of results?
Oh, and for the record, I'm using Java HotSpot(TM) 64-Bit Server VM.
I'm working on a Java application that is expected to maximize throughput and minimize latency
Two problems with that:
Those are often contradictory goals, so you need to decide how important each is against the other (would you sacrifice 10% latency to get 20% throughput gain or vice versa? Are you aiming for some specific latency target, beyond which it doesn't matter whether it's any faster? Things like that.)
You haven't given any results for either of these.
All you've shown is how much time is spent in the garbage collector. If you actually achieve more throughput, you would probably expect to see more time spent in the garbage collector. Or to put it another way, I can make a change in the code to minimize the values you're reporting really easily:
// Avoid generating any garbage
Thread.sleep(10000000);
You need to work out what's actually important to you. Measure everything that's important, then work out where the trade-off lies. So the first thing to do is re-run your tests and measure latency and throughput. You may also care about total CPU usage (which isn't the same as CPU in GC of course) but while you're not measuring your primary aims, your results aren't giving you particularly useful information.
I don't find this surprising at all.
The problem with serial garbage collection is that while it's running, nothing else can run at all (aka "stops the world"). That has a good point though: it keeps the amount of work spent on garbage collection to just about its bare minimum.
Almost any sort of parallel or concurrent garbage collection has to do a fair amount of extra work to ensure that all modifications to the heap appear atomic to the rest of the code. Instead of just stopping everything for a while, it has to stop just those things that depend on a particular change, and then for just long enough to carry out that specific change. It then lets that code start running again, gets to the next point that it's going to make a change, stops other pieces of code that depend on it, and so on.
The other point (though in this case, probably a fairly minor one) is that as you process more data, you generally expect to generate more garbage, and therefore spend more time doing garbage collection. Since the serial collector does stop all other processing while it does its job, that not only makes the garbage collection fast, but also prevents any more garbage from being generated during that time.
Now, why do I say that's probably a minor contributor in this case? That's pretty simple: the serial collector only used up a little over a second out of five hours. Even though nothing else got done during that ~1.3 seconds, that's such a small percentage of five hours that it probably didn't make much (if any) real difference to your overall throughput.
Summary: the problem with serial garbage collection isn't that it uses excessive time overall -- it's that it can be very inconvenient if it stops the world right when you happen to need fast response. At the same time, I should add that as long as your collection cycles are short, this can still be fairly minimal. In theory, the other forms of GC mostly limit your worst case, but in fact (e.g., by limiting the heap size) you can often limit your maximum latency with a serial collector as well.
There was an excellent talk by a Twitter engineer at the 2012 QCon Conference on this topic - you can watch it here.
It discussed the various "generations" in the Hotspot JVM memory and garbage collection (Eden, Survivor, Old). In particular note that the "Concurrent" in ConcurrentMarkAndSweep only applies to the Old generation, i.e. objects that hang around for a while.
Short-lived objects are GCd from the "Eden" generation - this is cheap, but is a "stop-the-world" GC event regardless of which GC algorithm you have chosen!
The advice was to tune the young generation first e.g. allocate lots of new Eden so there's more chance for objects to die young and be reclaimed cheaply. Use +PrintGCDetails, +PrintHeapAtGC, +PrintTenuringDistribution... If you get more than 100% survivor then there wasn't room, so objects get quickly promoted to Old - this is Bad.
When tuning for the Old generation, if latency is the top priority, it was recommended to try ParallelOld with auto-tuning first (-XX:+UseAdaptiveSizePolicy etc.), then try CMS, then maybe the new G1GC.
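For reference, the flags mentioned in the talk are passed on the command line along these lines (the class name and young-generation size are only illustrative):

java -Xmn512m -XX:+UseParallelOldGC -XX:+UseAdaptiveSizePolicy -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution MessageProcessor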
You cannot say that one GC is better than another; it depends on your requirements and your application.
But if you want to maximize throughput and minimize latency, GC is your enemy: you should not call GC at all and should also try to prevent the JVM from triggering GC.
Go with the serial collector and use object pools.
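A very small object-pool sketch to illustrate the idea (the buffer type and sizes are arbitrary): reusing buffers instead of allocating new ones keeps the allocation rate, and therefore the GC work, low.

import java.util.ArrayDeque;

public class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;

    public BufferPool(int count, int bufferSize) {
        this.bufferSize = bufferSize;
        for (int i = 0; i < count; i++) {
            free.push(new byte[bufferSize]);               // pre-allocate up front
        }
    }

    public byte[] acquire() {
        byte[] b = free.poll();
        return b != null ? b : new byte[bufferSize];       // grow only if the pool runs dry
    }

    public void release(byte[] buffer) {
        free.push(buffer);                                 // hand the buffer back for reuse
    }
}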
With serial collection, only one thing happens at a time. For example, even when multiple CPUs are available, only one is utilized to perform the collection. When parallel collection is used, the task of garbage collection is split into parts and those subparts are executed simultaneously, on different CPUs. The simultaneous operation enables the collection to be done more quickly, at the expense of some additional complexity and potential fragmentation.
While the serial GC uses only one thread to process a GC, the parallel GC uses several threads and is therefore faster. This GC is useful when there is enough memory and a large number of cores. It is also called the "throughput GC."

Why does java wait so long to run the garbage collector?

I am building a Java web app, using the Play! Framework. I'm hosting it on playapps.net. I have been puzzling for a while over the provided graphs of memory consumption. Here is a sample:
The graph comes from a period of consistent but nominal activity. I did nothing to trigger the falloff in memory, so I presume this occurred because the garbage collector ran when it had almost reached its allowable memory consumption.
My questions:
Is it fair for me to assume that my application does not have a memory leak, as it appears that all the memory is correctly reclaimed by the garbage collector when it does run?
(from the title) Why is java waiting until the last possible second to run the garbage collector? I am seeing significant performance degradation as the memory consumption grows to the top fourth of the graph.
If my assertions above are correct, then how can I go about fixing this issue? The other posts I have read on SO seem opposed to calls to System.gc(), ranging from neutral ("it's only a request to run GC, so the JVM may just ignore you") to outright opposed ("code that relies on System.gc() is fundamentally broken"). Or am I off base here, and I should be looking for defects in my own code that is causing this behavior and intermittent performance loss?
UPDATE
I have opened a discussion on PlayApps.net pointing to this question and mentioning some of the points here; specifically @Affe's comment regarding the settings for a full GC being set very conservatively, and @G_H's comment about settings for the initial and max heap size.
Here's a link to the discussion, though you unfortunately need a playapps account to view it.
I will report the feedback here when I get it; thanks so much everyone for your answers, I've already learned a great deal from them!
Resolution
Playapps support, which is still great, didn't have many suggestions for me; their only thought was that if I was using the cache extensively this might be keeping objects alive longer than needed, but that isn't the case. I still learned a ton (woo hoo!), and I gave @Ryan Amos the green check as I took his suggestion of calling System.gc() every half day, which for now is working fine.
Any detailed answer is going to depend on which garbage collector you're using, but there are some things that are basically the same across all (modern, sun/oracle) GCs.
Every time you see the usage in the graph go down, that is a garbage collection. The only way heap gets freed is through garbage collection. The thing is there are two types of garbage collections, minor and full. The heap gets divided into two basic "areas." Young and tenured. (There are lots more subgroups in reality.) Anything that is taking up space in Young and is still in use when the minor GC comes along to free up some memory, is going to get 'promoted' into tenured. Once something makes the leap into tenured, it sits around indefinitely until the heap has no free space and a full garbage collection is necessary.
So one interpretation of that graph is that your young generation is fairly small (by default it can be a fairly small % of total heap on some JVMs) and you're keeping objects "alive" for comparatively very long times (perhaps you're holding references to them in the web session?). So your objects are 'surviving' garbage collections until they get promoted into tenured space, where they stick around indefinitely until the JVM is well and truly out of memory.
Again, that's just one common situation that fits with the data you have. Would need full details about the JVM configuration and the GC logs to really tell for sure what's going on.
Java won't run the garbage collector until it has to, because the garbage collector slows things down quite a bit and shouldn't be run that frequently. I think you would be OK to schedule a collection more frequently, such as every 3 hours. If an application never consumes the full memory, there should be no reason to ever run the garbage collector, which is why Java only runs it when the memory use is very high.
So basically, don't worry about what others say: do what works best. If you find performance improvements from running the garbage collector at 66% memory, do it.
I am noticing that the graph isn't sloping strictly upward until the drop, but has smaller local variations. Although I'm not certain, I don't think memory use would show these small drops if there was no garbage collection going on.
There are minor and major collections in Java. Minor collections occur frequently, whereas major collections are rarer and diminish performance more. Minor collections probably tend to sweep up stuff like short-lived object instances created within methods. A major collection will remove a lot more, which is what probably happened at the end of your graph.
Now, some answers that were posted while I'm typing this give good explanations regarding the differences in garbage collectors, object generations and more. But that still doesn't explain why it would take so absurdly long (nearly 24 hours) before a serious cleaning is done.
Two things of interest that can be set for a JVM at startup are the maximum allowed heap size, and the initial heap size. The maximum is a hard limit, once you reach that, further garbage collection doesn't reduce memory usage and if you need to allocate new space for objects or other data, you'll get an OutOfMemoryError. However, internally there's a soft limit as well: the current heap size. A JVM doesn't immediately gobble up the maximum amount of memory. Instead, it starts at your initial heap size and then increases the heap when it's needed. Think of it a bit as the RAM of your JVM, that can increase dynamically.
If the actual memory use of your application starts to reach the current heap size, a garbage collection will typically be instigated. This might reduce the memory use, so an increase in heap size isn't needed. But it's also possible that the application currently does need all that memory and would exceed the heap size. In that case, it is increased provided that it hasn't already reached the maximum set limit.
Now, what might be your case is that the initial heap size is set to the same value as the maximum. Suppose that would be so, then the JVM will immediately seize all that memory. It will take a very long time before the application has accumulated enough garbage to reach the heap size in memory usage. But at that moment you'll see a large collection. Starting with a small enough heap and allowing it to grow keeps the memory use limited to what's needed.
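For reference, both limits are set on the command line at JVM startup; for example (values and class name are only illustrative):

java -Xms128m -Xmx1g com.example.PlayApp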
This is assuming that your graph shows heap use and not allocated heap size. If that's not the case and you are actually seeing the heap itself grow like this, something else is going on. I'll admit I'm not savvy enough regarding the internals of garbage collection and its scheduling to be absolutely certain of what's happening here, most of this is from observation of leaking applications in profilers. So if I've provided faulty info, I'll take this answer down.
As you might have noticed, this does not affect you. The garbage collection only kicks in if the JVM feels there is a need for it to run and this happens for the sake of optimization, there's no use of doing many small collections if you can make a single full collection and do a full cleanup.
The current JVM contains some really interesting algorithms, and the garbage collection itself is divided into three different regions; you can find a lot more about this here, here's a sample:
Three types of collection algorithms
The HotSpot JVM provides three GC algorithms, each tuned for a specific type of collection within a specific generation. The copy (also known as scavenge) collection quickly cleans up short-lived objects in the new generation heap. The mark-compact algorithm employs a slower, more robust technique to collect longer-lived objects in the old generation heap. The incremental algorithm attempts to improve old generation collection by performing robust GC while minimizing pauses.
Copy/scavenge collection
Using the copy algorithm, the JVM reclaims most objects in the new generation object space (also known as eden) simply by making small scavenges -- a Java term for collecting and removing refuse. Longer-lived objects are ultimately copied, or tenured, into the old object space.
Mark-compact collection
As more objects become tenured, the old object space begins to reach maximum occupancy. The mark-compact algorithm, used to collect objects in the old object space, has different requirements than the copy collection algorithm used in the new object space.
The mark-compact algorithm first scans all objects, marking all reachable objects. It then compacts all remaining gaps of dead objects. The mark-compact algorithm occupies more time than the copy collection algorithm; however, it requires less memory and eliminates memory fragmentation.
Incremental (train) collection
The new generation copy/scavenge and the old generation mark-compact algorithms can't eliminate all JVM pauses. Such pauses are proportional to the number of live objects. To address the need for pauseless GC, the HotSpot JVM also offers incremental, or train, collection.
Incremental collection breaks up old object collection pauses into many tiny pauses even with large object areas. Instead of just a new and an old generation, this algorithm has a middle generation comprising many small spaces. There is some overhead associated with incremental collection; you might see as much as a 10-percent speed degradation.
The -Xincgc and -Xnoincgc parameters control how you use incremental collection. The next release of HotSpot JVM, version 1.4, will attempt continuous, pauseless GC that will probably be a variation of the incremental algorithm. I won't discuss incremental collection since it will soon change.
This generational garbage collector is one of the most efficient solutions we have for the problem nowadays.
I had an app that produced a graph like that and acted as you describe. I was using the CMS collector (-XX:+UseConcMarkSweepGC). Here is what was going on in my case.
I did not have enough memory configured for the application, so over time I was running into fragmentation problems in the heap. This caused GCs with greater and greater frequency, but it did not actually throw an OOME or fall back from CMS to the serial collector (which it is supposed to do in that case), because the stats it keeps only count application paused time (when GC stops the world); application concurrent time (when GC runs alongside application threads) is ignored in those calculations. I tuned some parameters, mainly gave it a whole crap load more heap (with a very large new space), set -XX:CMSFullGCsBeforeCompaction=1, and the problem stopped occurring.
Probably you do have memory leaks that are getting cleared every 24 hours.

how to get the memory used by a java process in java

I am running a JBoss server and deploying my own classes. Now I have started doing some operations on my application, and I would like to know the memory used by my application before and after performing the operations. Please support me in this regard.
By using MemoryMXBean (retrieved by calling ManagementFactory.getMemoryMXBean()) as well as Runtime.getRuntime()'s methods: .totalMemory(), .maxMemory() and .freeMemory().
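A small sketch that uses both of the APIs mentioned (standard java.lang.management and Runtime):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapUsage {
    public static void print(String label) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        Runtime rt = Runtime.getRuntime();
        System.out.printf("%s: heap used=%d bytes, committed=%d bytes, runtime free=%d of %d%n",
                label, heap.getUsed(), heap.getCommitted(), rt.freeMemory(), rt.totalMemory());
    }
}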
Note that this is not an exact art: while creating a new object, other temporary ones may be allocated, which will not give you an accurate measurement. As we know, Java garbage collection is not guaranteed to run on demand, so you can't necessarily rely on it to eliminate dead objects before you measure.
If you research, you'll see that most code that attempts to do these measurements will have loops of Runtime.gc() calls and sleeps etc to try and ensure that the measurement is accurate. And this will only work on certain JVM implementations...
On an app server/deployed application, you will likely only get gross measurements/usage changes as the heap is allocated and the gc fires, but it should be enough. [I'm presuming that you wouldn't implement gc()'s and sleeps in production code :)]
Get the free memory before doing the operation (Runtime.getRuntime().freeMemory()), then again after finishing the operation, and the difference is roughly the memory used by your operation.
You may find the results you get are inconclusive. The GC will clean up used memory at random points in the background, so if you run the same operations many times you will get different results. You can even appear to have more memory free after performing an operation.
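A rough before/after sketch in the gc()-and-sleep style mentioned earlier (the allocation in the middle is just a stand-in for your own operation, and the result is still only approximate):

public class OperationMemory {
    static long usedMemory() throws InterruptedException {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 3; i++) {        // ask a few times and give the collector a moment
            System.gc();
            Thread.sleep(100);
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws InterruptedException {
        long before = usedMemory();
        byte[][] result = new byte[1000][1024];   // stand-in for "your operation"
        long after = usedMemory();
        System.out.println("approx. bytes retained by the operation: " + (after - before));
        System.out.println(result.length);        // keep the result reachable during measurement
    }
}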
