At my company we are trying an approach with JVM-based microservices. They are designed to scale horizontally, so we run multiple instances of each in rather small containers (up to 2 GB heap, usually 1-1.5 GB). The JVM we use is 1.8.0_40-b25.
Each such instance typically handles up to 100 RPS with a maximum memory allocation rate of around 250 MB/s.
The question is: what kind of GC would be a safe, sensible default to start off with? So far we are using CMS with Xms = Xmx (to avoid pauses during heap resizing), with Xms = Xmx = 1.5 GB. The results are decent: we hardly ever see any major GC performed.
I know that G1 could give me shorter pauses (at the cost of total throughput), but AFAIK it requires a bit more "breathing" space and at least a 3-4 GB heap to perform properly.
Any hints (besides going for Azul's Zing :D) ?
Hint #1: do experiments!
Assuming that your microservice is deployed on at least two nodes, run one on CMS and another on G1, and compare response times.
Not very likely, but what if you find that G1 performs so well that you only need half of the original cluster size?
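For example (a sketch only; the jar name, heap size, and log file names are placeholders for whatever your deployment actually uses), the two nodes could be started with identical settings apart from the collector, with GC logging enabled so the runs can actually be compared:

java -Xms1536m -Xmx1536m -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc-cms.log -jar service.jar
java -Xms1536m -Xmx1536m -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc-g1.log -jar service.jar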
Side notes:
re: "250Mb/s" -> if all of this is stack memory (alternatively, if it's young gen) then G1 would provide little benefit since collection form these areas is free.
re: "100 RPS" -> in many cases on our production we found that reducing concurrent requests in system (either via proxy config, or at application container level) improves throughput. Given small heap it's very likely that you have small cpu number as well (2 to 4).
Additionally there are official Oracle Hints on tuning for a small memory footprint. It might not reflect latest config available on 1.8_40, but it's good read anyway.
Measure how much memory is retained after a full GC. To this, add the amount of memory allocated per second multiplied by 2 to 10, depending on how often you would like a minor GC to occur, e.g. every 2 seconds or every 10 seconds.
E.g. say you have up to 500 MB retained after a full GC and GCing every couple of seconds is fine: then you can have 500 MB + 2 * 250 MB, or a heap of around 1 GB.
The number of RPS is not important.
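As a minimal sketch of that arithmetic (the numbers below are just the example values from this answer, not measurements from any real system):

public class HeapSizeRuleOfThumb {
    public static void main(String[] args) {
        long retainedBytes   = 500L * 1024 * 1024;  // live set measured after a full GC
        long allocPerSecond  = 250L * 1024 * 1024;  // measured allocation rate in bytes/s
        int  secondsPerMinor = 2;                   // acceptable time between minor GCs (2-10)
        long suggestedHeap   = retainedBytes + (long) secondsPerMinor * allocPerSecond;
        System.out.println("Suggested heap: ~" + suggestedHeap / (1024 * 1024) + " MB"); // ~1000 MB
    }
}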
Related
We inherited a system which runs in production and recently started to fail every 10 hours. Basically, our internal software marks the system as failed if it is unresponsive for a minute. We found that the problem is that our full GC cycles last 1.5 minutes; we use a 30 GB heap.
Now, the problem is that we cannot optimize a lot in a short period of time and we cannot partition our service quickly, but we need to get rid of the 1.5-minute pauses as soon as possible, because our system fails in production due to these pauses. For us, an acceptable delay is 20 milliseconds, but not more. What would be the quickest way to tweak the system? Reduce the heap to trigger GCs more frequently? Use System.gc() hints? Any other solutions? We use Java 8 default settings, and we have more and more users, i.e. more and more objects created.
Some GC stat
You have a lot of retained data. There are a few options worth considering.
Increase the heap to 32 GB; this has little impact if you have free memory. Looking again at your totals, it appears you are already using 32 GB rather than 30 GB, so this might not help.
If you don't have plenty of free memory, it is possible that a small portion of your heap is being swapped, as this can increase full GC times dramatically.
There might be some simple ways to make the data structures more compact, e.g. use compact strings, or primitives instead of wrappers, such as a long for a timestamp instead of Date or LocalDateTime (a long is about 1/8th the size); see the sketch below.
If neither of these helps, try moving some of the data off heap. E.g. Chronicle Map is a ConcurrentMap which uses off-heap memory and can reduce your GC times dramatically, i.e. there is no GC overhead for data stored off heap. How easy this is to add depends heavily on how your data is structured.
I suggest analysing how your data is structured to see if there are any easy ways to make it more efficient.
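As an illustration of the "primitives instead of wrappers" point above (a sketch with made-up names, not code from the system in question), a timestamp stored as a primitive long avoids one extra object per record:

import java.time.Instant;
import java.util.Date;

public class Event {
    // Before: private Date createdAt;  // a separate object: header + fields + a reference to it
    private long createdAtEpochMillis;  // after: 8 bytes inline, no extra object for the GC to trace

    public void setCreatedAt(Date date)  { this.createdAtEpochMillis = date.getTime(); }
    public Date getCreatedAt()           { return new Date(createdAtEpochMillis); }
    public Instant getCreatedAtInstant() { return Instant.ofEpochMilli(createdAtEpochMillis); }
}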
There is no one-size-fits-all magic-bullet solution to your problem: you'll need to have a good handle on your application's allocation and liveness patterns, and you'll need to know how those interact with the specific garbage collection algorithm you are running (a function of the Java version and the command-line flags passed to java).
Broadly speaking, a Full GC (that succeeds in reclaiming lots of space) means that lots of objects are surviving the minor collections (but aren't being leaked). Start by looking at the size of your Eden and Survivor spaces: if the Eden is too small, minor collections will run very frequently, and perhaps you aren't giving an object a chance to die before its tenuring threshold is reached. If the Survivors are too small, objects are going to be promoted into the Old gen prematurely.
GC tuning is a bit of an art: you run your app, study the results, tweak some parameters, and run it again. As such, you will need a benchmark version of your application, one which behaves as close as possible to the production one but which hopefully doesn't need 10 hours to cause a full GC.
As you stated that you are running Java 8 with the default settings, I believe that means that your Old collections are running with a Serial collector. You might see some very quick improvements by switching to a Parallel collector for the Old generation (-XX:+UseParallelOldGC). While this might reduce the 1.5-minute pause to some number of seconds (depending on the number of cores on your box and the number of threads you specify for GC), it will not reduce your max pause to 20 ms.
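For example, a first pass might look something like this (the jar name, log path, and thread count are placeholders; GC logging is enabled so the effect of the change can actually be measured):

java -Xmx30g -XX:+UseParallelOldGC -XX:ParallelGCThreads=8 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/app/gc.log -jar your-service.jar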
When this happened to me, it was due to a memory leak caused by a static variable eating up memory. I would go through all recent code changes and look for any possible memory leaks.
A. If I execute a huge simulation program with -Xmx100000m (~100 GB), I see some spikes in the used heap (~30 GB). Those spikes increase the heap size and decrease the memory that can be used by other programs. I would like to limit the heap size to the size that is actually required to run the program without memory exceptions.
B. If I execute my simulation program with -Xmx10000m (~10 GB), I am able to limit the used heap size (~7 GB). The total heap size is less, too (of course). I do not get out-of-memory exceptions in the first phase of the program that is shown in the VisualVM figures (about 16 minutes).
I naively expected that if I increased Xmx from 10 GB (B) to 100 GB (A), the used heap would stay about the same and Java would only use more memory in order to avoid out-of-memory exceptions. However, the behavior seems to be different. I guess that Java works this way in order to improve performance.
An explanation for the large used heap in A might be that the growth behavior of hash maps is different if Xmx is larger? Does Xmx have an effect on the load factor?
In the phase of the program where a lot of mini spikes exist (see for example B at 12:06) instead of a few large ones (A), some Java streams are processed. Does the memory allocation for stream processing automatically adapt to the Xmx value? (There is still some memory left that could be used to have fewer mini spikes at 12:06 in B.)
If not, what might be the reasons for the larger used heap in A?
How can I tell Java to keep the used heap low if possible (like in the curves for B), but to take more memory if an out-of-memory exception would otherwise occur (i.e. allow it to temporarily switch to the behavior of A)? Could this be done by tuning some garbage collection properties?
Edit
As stated in the answer below, the profile can be altered by garbage collection parameters. Applying -Xmx100000m -XX:MaxGCPauseMillis=1000 changes profile A to consume less memory (~20 GB used) and more time (~22 min).
I would like to limit the heap size to the size that is actually required to run the program without memory exceptions.
You do not actually want to do that, because it would make your program extremely slow: providing only an amount equivalent to the application's peak footprint means that every single allocation would trigger a garbage collection while the application is near that maximum.
I guess that Java works this way in order to improve performance.
Indeed.
The JVM has several goals, in descending order:
pause times (latency)
allocation throughput
footprint
If you want to prioritize footprint over other goals you have to relax the other ones.
Set -XX:MaxGCPauseMillis=18446744073709551615; this is the default for the parallel collector, but G1 has a 200 ms default.
Configure the collector to keep less breathing room above the live set.
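A sketch of what that could look like for case A above (the class name is a placeholder and the ratio values are illustrative, not tuned; -XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio control how much free headroom the collector keeps before growing or shrinking the heap):

java -Xmx100000m -XX:MaxGCPauseMillis=1000 -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 MySimulation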
I am using an Infinispan cache to store values. The code writes to the cache every 10 minutes, and the cache reaches a size of about 400 MB.
It has a time-to-live of about 2 hours, and the maximum number of entries is 16 million, although currently in my tests the number of entries doesn't go above 2 million or so (I can see this by checking the MBeans/metrics in JConsole).
When I start JBoss, the Java heap size is 1.5 GB to 2 GB. The -Xmx setting for the maximum memory allocated to JBoss is 4 GB.
When I disable the Infinispan cache, the heap memory usage stays flat at around 1.5 GB to 2 GB. It is very constant and stays at that level.
=> The problem is: when I have the Infinispan cache enabled, the Java heap size grows to about 3.5 GB/4 GB, which is way more than expected.
I have done a heap dump to check the size of the cache in Eclipse MAT, and it is only 300 or 400 MB (which is OK).
So I would expect the memory usage to go to 2.5 GB and stay steady at that level, since the initial heap size is 2 GB and the maximum cache size should only be around 500 MB.
However, it continues to grow and grow over time. Every 2 or 3 hours a garbage collection is done, which brings the usage down to about 1 to 1.5 GB, but it then increases again within 30 minutes up to 3.5 GB.
The number of entries stays steady at about 2 million, so it is not due to more entries going into the cache. (Also, the number of evictions stays at 0.)
What could be holding on to this amount of memory if the cache is only 400-500 MB?
Is it a problem with my garbage collection settings? Or should I look at Infinispan settings?
Thanks!
Edit: you can see the heap size over time here.
What is strange is that even after what looks like a full GC, the memory shoots back up to 3 GB. This corresponds to more entries going into the cache.
Edit: It turns out this has nothing to do with Infinispan. I narrowed the problem down to a single line of code that is using a lot of memory (about 1 GB more than without the call).
But I do think more and more memory is being taken by the Infinispan cache, naturally, because more entries are being added over the 2-hour time-to-live.
I also need to have upwards of 50 users querying Infinispan. When the heap reaches a high value like this (even without the memory leak mentioned above), I know it's not an error scenario in Java; however, I need as much memory available as possible.
Is there any way to "encourage" a heap dump past a certain point? I have tried using GC options to collect at a given proportion of the heap for the old gen, but in general the heap usage tends to creep up.
Probably what you're seeing is the JVM not collecting objects which have been evicted from the cache. Caches in general have a curious relationship with the prevailing idea of generational GC.
The generational GC idea is that, broadly speaking, there are two types of objects in the JVM: short-lived ones, which are used and thrown away quickly, and longer-lived ones, which are usually used throughout the lifetime of the application. In this model you want to tune your GC so that you put most of your effort into identifying the short-lived objects. This means that you avoid looking at the long-lived objects as much as possible.
Caches disrupt this pattern by having objects with intermediate lifespans (i.e. a few seconds/minutes/hours, depending on your cache). These objects often get promoted to the tenured generation, where they're not usually looked at until a full GC becomes necessary, even after they've been evicted from the cache.
If this is what's happening, then you have a couple of choices:
Ignore it, let the full GC semantics do their thing, and just be aware that this is what's happening.
Try to tune the GC so that it takes longer for objects to get promoted to the tenured generation. There are some GC flags which can help with that.
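For example (illustrative values only; the right numbers depend on measuring your own promotion behaviour, e.g. with -XX:+PrintTenuringDistribution), the relevant knobs are the young generation size and the tenuring threshold:

-Xmx4g -Xmn2g -XX:SurvivorRatio=6 -XX:MaxTenuringThreshold=15 -XX:+PrintTenuringDistribution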
We have a four-datanode cluster running CDH 5.0.2, installed through Cloudera Manager parcels.
In order to import 13M user rows into HBase, we wrote a simple Python script and used the hadoop-streaming jar. It works as expected up to about 100k rows. And then... then, one after the other, all datanodes crash with the same message:
The health test result for REGION_SERVER_GC_DURATION has become bad:
Average time spent in garbage collection was 44.8 second(s) (74.60%)
per minute over the previous 5 minute(s).
Critical threshold: 60.00%.
Every attempt to solve the issue by following the advice found around the web (e.g. [1], [2], [3]) has not led anywhere near a solution. "Playing" with the Java heap size is useless. The only thing which "solved" the situation was increasing the Garbage Collection Duration Monitoring Period for region servers from 5 minutes to 50 minutes. Arguably a dirty workaround.
We don't have the workforce to build monitoring for our GC usage right now. We eventually will, but I was wondering how importing 13M rows into HBase could possibly lead to a certain crash of all region servers. Is there a clean solution?
Edit:
JVM Options on Datanodes are:
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled
Datanodes are physical machines running CentOS 6.5, each with 32 GB RAM and one quad-core CPU at 2 GHz with 30 MB cache.
Below is an excerpt of the Python script which we run. We fill two tables: one with a unique user ID as the row key and a single column family with the users' info, and another with all the info we might want to access as the row key.
#!/usr/bin/env python2.7
import sys
import json
import logging
import happybase

# master_ip is assumed to be defined/configured elsewhere in the full script
connection = happybase.Connection(host=master_ip)
hbase_main_table = connection.table('users_table')
hbase_index_table = connection.table('users_index_table')
header = ['ID', 'COL1', 'COL2', 'COL3', 'COL4']

for line in sys.stdin:
    l = line.replace('"', '').strip("\n").split("\t")
    if l[header.index("ID")] == "ID":
        # you are reading the header
        continue
    for h in header[1:]:
        try:
            # one put per column into the main table, keyed by the user ID
            id = str(l[header.index("ID")])
            col = 'info:' + h.lower()
            val = l[header.index(h)].strip()
            hbase_main_table.put(id, {
                col: val
            })
            # maintain the reverse index for the indexed columns
            indexed = ['COL3', 'COL4']
            for typ in indexed:
                idx = l[header.index(typ)].strip()
                if len(idx) == 0:
                    continue
                row = hbase_index_table.row(idx)
                old_ids = row.get('d:s')
                if old_ids is not None:
                    ids = json.dumps(list(set(json.loads(old_ids)).union([id])))
                else:
                    ids = json.dumps([id])
                hbase_index_table.put(idx, {
                    'd:s': ids,
                    'd:t': typ,
                    'd:b': 'ame'
                })
        except Exception:
            msg = 'ERROR ' + str(l[header.index("ID")])
            logging.info(msg, exc_info=True)
One of the major issues that a lot of people are running into these days is that the amount of RAM available to Java applications has exploded, but most of the information about tuning Java GC is based on experience from the 32-bit era.
I recently spent a good deal of time researching GC for large-heap situations in order to avoid the dreaded "long pause". I watched this excellent presentation several times, and finally GC and the issues I've faced with it started making more sense.
I don't know that much about Hadoop, but I think you may be running into a situation where your young generation is too small. It's unfortunate, but most information about JVM GC tuning fails to emphasize that the best place for your objects to be GC'd is in the young generation. It takes almost no time at all to collect garbage at this point. I won't go into the details (watch the presentation if you want to know), but what happens is that if you don't have enough room in your young (new) generation, it fills up prematurely. This forces a collection, and some objects will be moved to the tenured (old) generation. Eventually the tenured generation fills up and will need to be collected too. If you have a lot of garbage in your tenured generation, this can be very slow, as the tenured collection algorithm is generally mark-sweep, which has a non-zero cost for collecting garbage.
I think you are using Hotspot. Here's a good reference for the various GC arguments for hotspot. JVM GC options
I would start by greatly increasing the size of the young generation. My assumption here is that a lot of short- to medium-lived objects are being created. What you want to avoid is having these promoted into the tenured generation. The way you do that is to extend the time they spend in the young generation. To accomplish that, you can either increase its size (so it takes longer to fill up) or increase the tenuring threshold (essentially the number of young collections the object will survive). The problem with the tenuring threshold is that it takes time to move the object around within the young generation. Increasing the size of the young generation is inefficient in terms of memory, but my guess is that you have plenty to spare.
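Concretely, the knobs for that are the young generation size and the tenuring threshold; something along these lines could be added to the region server JVM options above (the sizes are illustrative and need to fit whatever overall heap your region servers are given):

-Xmn4g -XX:MaxTenuringThreshold=10 -XX:+PrintTenuringDistribution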
I've used this solution with caching servers, and I have minor collections in the >100 ms range and infrequent (less than one a day) major collections generally under 0.5 s, with a heap around 4 GB. Our objects live either 5 minutes, 15 minutes, or 29 days.
Another thing you might want to consider is the G1 (garbage first) collector which was recently added (relatively speaking) to HotSpot.
I'm interested in how well this advice works for you. Good luck.
I am unsure whether there is a generic answer for this, but I was wondering what the normal Java GC pattern and Java heap space usage look like. I am testing my Java 1.6 application using JMeter. I am collecting JMX GC logs and plotting them with the JMeter JMX GC and Memory plugin extension. The GC pattern looks quite stable, with most GC operations taking 30-40 ms and the occasional one 90 ms. The memory consumption follows a saw-tooth pattern: the Java heap space usage grows constantly upwards, e.g. to 3 GB, and every 40 minutes the memory usage free-falls down to around 1 GB. The max-min delta, however, grows, so the sawtooth height constantly increases. Does it do a full GC every 40 minutes?
Most of your description is, in general, how the GC works. However, none of your specific observations, especially the numbers, hold for the general case.
To start with, each JVM has one or several GC implementations, and you can choose which one to use. Take the most widely used one, i.e. the Sun JVM (I like to call it that), and the common server GC pattern as an example.
Firstly, the memory is divided into four regions.
A young generation holds all of the recently created objects. When this generation is full, the GC does a stop-the-world collection: it stops your program from working, executes a black-gray-white (tri-color) marking algorithm to find the obsolete objects, and removes them. So this is your 30-40 ms.
If an object survives a certain number of GC rounds in the young gen, it is moved into a swap (survivor) generation. The swap generation holds the objects for another number of GCs, then moves them to the old generation. There are two swap spaces, which do a double-buffering kind of thing to help the young gen work faster. If the young gen dumps stuff into the swap gen and finds the swap gen mostly full, a GC happens on the swap gen and potentially moves the surviving objects to the old gen. This most likely accounts for your 90 ms, though I am not 100% sure how the swap gen works. Someone correct me if I am wrong.
All the objects that survive the swap gen are moved to the old generation. The old generation is only GC-ed when it's mostly full. In your case, that's every 40 minutes.
There is another "permanent gen", which is used to hold the bytecode and resources loaded from your jars.
The sizes of all these areas can be adjusted via JVM parameters.
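For example (Sun/HotSpot Java 6 flags; the values are only illustrative, not a recommendation):

-Xms2g -Xmx2g -Xmn512m -XX:SurvivorRatio=8 -XX:MaxPermSize=256m

Here -Xms/-Xmx set the total heap, -Xmn the young generation, -XX:SurvivorRatio the Eden-to-survivor ("swap") ratio, and -XX:MaxPermSize the permanent generation.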
You can try VisualVM, which will give you a dynamic view of how this works.
P.S. Not all JVMs/GCs work the same way. If you use the G1 collector, or JRockit, things might happen slightly differently, but the general idea holds.
Java GC works in terms of generations of objects. There are the young, tenured, and permanent generations. It seems like in your case every 30-40 ms the GC processes only the young generation (and transfers surviving objects into the tenured generation), and every 40 minutes it performs a full collection (which causes the stop-the-world pause). Note: it is triggered not by time, but by the percentage of used memory.
There are several JVM options which allow you to choose the generation sizes and the type of GC (there are several GC algorithms; in Java 1.6 the Serial GC is used by default on client-class machines, and you can select another one, e.g. with -XX:+UseConcMarkSweepGC), as well as other parameters of how the GC works.
You'd better find some good articles about the generations and the different types of GC (the algorithms are really different; some of them let you avoid long stop-the-world pauses almost entirely!).
Yes, most likely. Instead of guessing, you can use jstat to monitor your GCs.
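For example (the pid is a placeholder for your JVM's process id), this prints per-generation occupancy and GC counts every second; a jump in the FGC column at the 40-minute drop confirms it is a full collection:

jstat -gcutil <pid> 1000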
I suggest you use a memory profiler to check whether there is anything simple you can do to reduce the amount of garbage you are producing.
BTW, if you increase the size of the young generation, you can reduce how much garbage makes it into the tenured space, reducing the frequency of full collections. You may find you get less than one full collection per day if you tune it enough.
For a more extreme case, I have tuned a trading system down to less than one collection (minor or major) per day.