Java native memory leak with G1 and huge memory


We currently have a problem with a Java native memory leak. The server is quite big (40 CPUs, 128 GB of memory). The Java heap size is 64 GB, and we run a very memory-intensive application that reads a lot of data into strings with about 400 threads and discards them after a few minutes.
So the heap fills up very fast, but objects on the heap become obsolete and can be collected very fast, too. That is why we use G1: to avoid stop-the-world pauses lasting minutes.
That seems to work fine: the heap is big enough to run the application for days, and nothing is leaking there. Nevertheless, the Java process keeps growing over time until all 128 GB are used and the application crashes with an allocation failure.
I've read a lot about native Java memory leaks, including the glibc issue with the maximum number of arenas (we have wheezy with glibc 2.13, so setting MALLOC_ARENA_MAX=1 or 4 is not a possible fix here without a dist upgrade).
So we tried jemalloc, which gave us graphs for inuse-space and inuse-objects [graphs not included].
I don't understand what the issue is here. Does anyone have an idea?
If I set MALLOC_CONF="narenas:1" for jemalloc as an environment variable for the Tomcat process running our app, could it still somehow use the glibc malloc?
This is our G1 setup; maybe there is an issue here?
-XX:+UseCompressedOops
-XX:+UseNUMA
-XX:NewSize=6000m
-XX:MaxNewSize=6000m
-XX:NewRatio=3
-XX:SurvivorRatio=1
-XX:InitiatingHeapOccupancyPercent=55
-XX:MaxGCPauseMillis=1000
-XX:PermSize=64m
-XX:MaxPermSize=128m
-XX:+PrintCommandLineFlags
-XX:+PrintFlagsFinal
-XX:+PrintGC
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution
-XX:-UseAdaptiveSizePolicy
-XX:+UseG1GC
-XX:MaxDirectMemorySize=2g
-Xms65536m
-Xmx65536m
Thanks for your help!

We never called System.gc() explicitly, and in the meantime we stopped using G1, specifying nothing other than -Xms and -Xmx.
We therefore use nearly all of the 128 GB for the heap now. The Java process memory usage is high, but it has been constant for weeks. I'm sure this is a G1 issue, or at least a general GC issue. The only disadvantage of this "solution" is long GC pauses, but they decreased from up to 90 s to about 1-5 s after increasing the heap, which is OK for the benchmark we run on our servers.
Before that, I played around with the -XX:ParallelGCThreads option, which had a significant influence on the speed of the memory leak when decreased from 28 (the default for 40 CPUs) down to 1. The memory graphs of different instances using different values looked somewhat like a hand fan...

Related

G1GC causes gradual memory growth and full GC brings it down

I am running my Java application on CentOS 6 with OpenJDK version "1.8.0_232" using G1GC. The total heap usage grows gradually and causes the application to crash. When I take a heap dump of live objects,
the dump size is only 1.6 GB although my total used heap was 32 GB.
Command used for taking the dump: jmap -dump:live,format=b,file=/tmp/dump.hprof
I read somewhere that the jmap dump command triggers a full GC and releases unreachable heap, which is the reason for the smaller dump size. After triggering the dump command, my total heap usage came down, and then it started growing gradually again.
My JVM args : -XX:-AllowUserSignalHandlers -Xmx49000m -DFCGI_PORT=6654 -XX:+UseG1GC -XX:+UseStringDeduplication -XX:InitiatingHeapOccupancyPercent=55 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/xyz -XX:+PerfDisableSharedMem -Djava.io.tmpdir=/var/XXX/temp
Is there a better way to efficiently do full GC with G1?

This is because what is being presented to you is the committed memory, which is different from the used memory.
More recent versions of Java have improved the GC algorithms so that committed memory is released to the operating system more frequently.
Use Java Mission Control to see your memory in more detail (committed memory vs. used memory).
My recommendation is to use a newer version of Java. If that is not possible, lower the -Xmx value (for example, to 3 GB). You will notice that the JVM always approaches the defined limit.
https://openjdk.java.net/jeps/346
https://openjdk.java.net/jeps/351
https://blog.idrsolutions.com/2019/09/improved-garbage-collection-in-java-13/
https://www.slideshare.net/jelastic/choosing-right-garbage-collector-to-increase-efficiency-of-java-memory-usage
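The committed-versus-used distinction can also be checked from inside the process. The following is a minimal sketch using the standard java.lang.management API; the class name HeapCommitSketch and its helper methods are made up for illustration and are not part of the original answer.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapCommitSketch {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Formats used, committed and max heap; getMax() is -1 when undefined. */
    static String describe(MemoryUsage usage) {
        String max = usage.getMax() < 0 ? "undefined" : toMiB(usage.getMax()) + " MiB";
        return "used=" + toMiB(usage.getUsed()) + " MiB, committed="
                + toMiB(usage.getCommitted()) + " MiB, max=" + max;
    }

    public static void main(String[] args) {
        // "used" is what live and not-yet-collected objects occupy;
        // "committed" is what the JVM has actually claimed from the OS for the heap.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println(describe(heap));
    }
}
```

Printing this periodically should show "used" dropping after a collection while "committed" stays high, which is the pattern described above.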

The Java ZGC garbage collector uses a lot of memory

I built a simple application using Spring Boot. When I deploy it to a Linux server with the ZGC garbage collector, it uses a lot of memory. I tried to limit the maximum heap memory to 500 MB with -Xmx500m, but the Java program still used more than 1 GB. When I used the G1 collector, it only used 350 MB. I don't know why. Is this a bug in JDK 11, or is there a problem with my startup parameters?
Runtime environment:
operating system: CentOS Linux release 7.8.2003
JDK version: jdk11
springboot version: v2.3.0.RELEASE
Here is my Java startup command
java -Xms128m -Xmx500m \
-XX:+UnlockExperimentalVMOptions -XX:+UseZGC \
-jar app.jar
Here is a screenshot of the memory usage at run time
Heap memory usage
https://github.com/JoyfulAndSpeedyMan/assets/blob/master/2020-07-13%20201259.png?raw=true
System memory usage
https://github.com/JoyfulAndSpeedyMan/assets/blob/master/2020-07-13%20201357.png?raw=true
Here's what happens when you use the default garbage collector
Java startup command
java -Xms128m -Xmx500m \
-jar app.jar
Heap memory usage
https://github.com/JoyfulAndSpeedyMan/assets/blob/master/2020-07-13%20202442.png?raw=true
System memory usage
https://github.com/JoyfulAndSpeedyMan/assets/blob/master/2020-07-13%20202421.png?raw=true
By default, JDK 11 uses the G1 garbage collector. Theoretically, shouldn't G1 be more memory-intensive than ZGC? Why isn't that what I see? Did I misunderstand something? I'm a beginner with the JVM, so I don't understand why.

ZGC employs a technique known as colored pointers. The idea is to use some free bits in 64-bit pointers into the heap for embedded metadata. However, when dereferencing such pointers, these bits need to be masked, which implies some extra work for the JVM.
To avoid the overhead of masking pointers, ZGC uses a technique called multi-mapping. Multi-mapping is when multiple ranges of virtual memory are mapped to the same range of physical memory.
ZGC uses 3 views of Java heap ("marked0", "marked1", "remapped"), i.e. 3 different "colors" of heap pointers and 3 virtual memory mappings for the same heap.
As a consequence, the operating system may report 3x larger memory usage. For example, for a 512 MB heap, the reported committed memory may be as large as 1.5 GB, not counting memory besides the heap. Note: multi-mapping affects the reported used memory, but physically the heap will still use 512 MB in RAM. This sometimes leads to a funny effect that RSS of the process looks larger than the amount of physical RAM.
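To make the colored-pointer idea and the 3x arithmetic concrete, here is a toy sketch in Java. The bit layout (42 address bits followed by three color bits) and all names are assumptions chosen for illustration; this is not how the JVM exposes pointers, and real ZGC layouts differ between versions.

```java
final class ColoredPointerSketch {
    // Hypothetical layout: the low 42 bits hold the address,
    // the next three bits are the "colors" named in the text.
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;
    static final long MARKED0 = 1L << ADDRESS_BITS;
    static final long MARKED1 = 1L << (ADDRESS_BITS + 1);
    static final long REMAPPED = 1L << (ADDRESS_BITS + 2);

    /** Tags an address with one color bit. */
    static long colorize(long address, long color) {
        return (address & ADDRESS_MASK) | color;
    }

    /** The masking step that multi-mapping is meant to avoid on every dereference. */
    static long stripColor(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    /** Virtual memory counted when the same heap is mapped once per view. */
    static long mappedBytes(long heapBytes, int views) {
        return heapBytes * views;
    }

    public static void main(String[] args) {
        long address = 0x12345678L;
        long pointer = colorize(address, MARKED1);
        System.out.printf("colored=0x%x stripped=0x%x%n", pointer, stripColor(pointer));

        long heap = 512L * 1024 * 1024;
        long mib = mappedBytes(heap, 3) / (1024 * 1024);
        System.out.println(mib + " MiB of mappings for a 512 MiB heap");
    }
}
```

With multi-mapping, each colored value is itself a valid virtual address for the same physical page, so the stripColor step is unnecessary; the price is that the heap appears three times in the address space.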
See also:
ZGC: A Scalable Low-Latency Garbage Collector by Per Lidén
Understanding Low Latency JVM GCs by Jean Philippe Bempel

JVM uses much more than just the heap memory - read this excellent answer to understand JVM memory consumption better: Java using much more memory than heap size (or size correctly Docker memory limit)
You'll need to go beyond the heap inspection and use things like Native Memory Tracking to get a clearer picture.
I don't know what the particular issue with your application is, but ZGC is often mentioned as being good for large heaps.
It's also a brand new collector and got many changes recently - I'd upgrade to JDK 14 if you want to use it (see "Change Log" here: https://wiki.openjdk.java.net/display/zgc/Main)

This is a result of the throughput-latency-footprint tradeoff. When choosing between these 3 things, you can only pick 2.
ZGC is a concurrent GC with low pause times. Since you don't want to give up throughput either, you pay for latency and throughput with footprint. So there is nothing surprising about such high memory consumption.
G1 is not a low-pause collector, so you shift that tradeoff towards footprint: you get longer pause times but save some memory.

The amount of OS memory the JVM uses (i.e., "committed heap") depends on how often the GC runs and on whether it uncommits unneeded memory when the app starts to use less; both are tunable. Unfortunately, ZGC isn't (currently) as aggressive about this by default as G1, but both have tuning options that you can try.
P.S. As others have noted, the RES htop column is misleading, but the VisualVM chart shows the real picture.

High memory usage issues with G1 Garbage collector

We have been testing out G1 garbage collector recently with the following configuration:
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseG1GC -XX:MaxGCPauseMillis=1250 -XX:+PrintTenuringDistribution -Xloggc:${logdir}/gc-$(date +%Y_%m_%d-%H_%M).log -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics -XX:+PrintPromotionFailure -XX:+PrintAdaptiveSizePolicy -XX:+PrintHeapAtGC -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=15 -XX:ParallelGCThreads=8 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M
JAVA_OPTS_HEAP: -Xms16g -Xmx16g
We recently came across an issue where two Java processes were running with the above configuration on a box with 48 GB of RAM, and each process went on to consume around 20-22 GB of RAM (with a few small processes consuming the remaining memory). This filled the entire RAM and triggered swapping to disk, which finally led to an OOM and the process getting killed.
This is worrying because NMT does not report this memory usage in a meaningful way, nor do the GC logs give any clues about it. In the NMT stats, the application memory was under 16 GB and the metaspace usage was under 1 GB.
We tried setting -XX:MaxMetaspaceSize to 2G, but that didn't help either. The RAM usage seems to grow without bound when the process runs for days.
From the other questions, it does seem that the G1 garbage collector tends to consume more memory, but the swapping to disk is a worrying issue. Could someone please provide some pointers on how this issue could be resolved?

As this is too long for a comment, I am posting it as an answer.
There is a good read which explains why a Java process might consume more memory than -Xmx. Based on the information you provided, I believe this is also the reason in your case.
For G1 there is an OBE, Getting Started with the G1 Garbage Collector, with details about how G1GC works. Have a look at its Recommended Use Cases for G1. Maybe you would not benefit from using G1.
Cited from the OBE (Oracle By Example):
If you are using CMS or ParallelOldGC and your application is not experiencing long garbage collection pauses, it is fine to stay with your current collector.

Here you can find testing results of G1, Parallel, ConcMarkSweep, Serial and Shenandoah garbage collectors in terms of scaling and resource consumption, as well as some suggestions on what settings can be applied to improve results. So you can choose the most appropriate one for your project and reduce memory usage.

How to reduce Sun/Oracle JVM internal overhead?

This problem is specifically about Sun Java JVM running on Linux x86-64. I'm trying to figure out why the Sun JVM takes so much of system's physical memory even when I have set Heap and Non-Heap limits.
The program I'm running is Eclipse 3.7 with multiple plugins/features. The most used features are PDT, EGit and Mylyn. I'm starting the Eclipse with the following command line switches:
-nosplash -vmargs -Xincgc -Xms64m -Xmx200m -XX:NewSize=8m -XX:PermSize=80m -XX:MaxPermSize=150m -XX:MaxPermHeapExpansion=10m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseParNewGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=5 -XX:GCTimeRatio=49 -XX:MaxGCPauseMillis=50 -XX:GCPauseIntervalMillis=1000 -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+DoEscapeAnalysis -XX:+UseCompressedOops -XX:+AggressiveOpts -Dorg.eclipse.swt.internal.gtk.disablePrinting
Worth noting are especially the switches:
-Xms64m -Xmx200m -XX:NewSize=8m -XX:PermSize=80m -XX:MaxPermSize=150m
These switches should limit the JVM Heap to maximum of 200 MB and Non-Heap to 150 MB ("CMS Permanent generation" and "Code Cache" as labeled by JConsole). Logically the JVM should take total of 350 MB plus the internal overhead required by the JVM.
In reality, the JVM takes 544.6 MB for my current Eclipse process as computed by ps_mem.py (http://www.pixelbeat.org/scripts/ps_mem.py) which computes the real physical memory pages reserved by the Linux 2.6+ kernel. That's internal Sun JVM overhead of 35% or roughly 200MB!
Any hints about how to decrease this overhead?
Here's some additional info:
$ ps auxw
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
me 23440 2.4 14.4 1394144 558440 ? Sl Oct12 210:41 /usr/bin/java ...
And according to JConsole, the process has used 160 MB of heap and 151 MB of non-heap.
I'm not saying that I cannot afford using extra 200MB for running Eclipse, but if there's a way to reduce this waste, I'd rather use that 200MB for kernel block device buffers or file cache. In addition, I have similar experience with other Java programs -- perhaps I could reduce the overhead for all of them with similar tweaks.
Update: After posting the question, I found previous post to SO:
Why does the Sun JVM continue to consume ever more RSS memory even when the heap, etc sizes are stable?
It seems that I should use pmap to investigate the problem.
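Before reaching for pmap, a rough first comparison can be made from inside the process: the resident set size the kernel reports versus the heap and non-heap memory the JVM's management beans know about. This is a sketch under assumptions: it only works on Linux (it reads /proc/self/status), and the class name RssSketch and its helper are hypothetical.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

final class RssSketch {
    /** Returns the VmRSS value in kB from /proc/<pid>/status lines, or -1 if absent. */
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // The line looks like "VmRSS:   558440 kB".
                String[] parts = line.trim().split("\\s+");
                return Long.parseLong(parts[1]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status"); // Linux only
        List<String> lines = Files.exists(status)
                ? Files.readAllLines(status)
                : Collections.<String>emptyList();
        long rssKb = parseVmRssKb(lines);

        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long jvmKb = (memory.getHeapMemoryUsage().getCommitted()
                + memory.getNonHeapMemoryUsage().getCommitted()) / 1024;

        System.out.println("RSS (kB): " + rssKb);
        System.out.println("heap + non-heap committed (kB): " + jvmKb);
        if (rssKb >= 0) {
            // Rough indicator only: committed pages that were never touched
            // do not count toward RSS, so the difference can even be negative.
            System.out.println("difference (kB): " + (rssKb - jvmKb));
        }
    }
}
```

A large positive difference points at memory outside the JVM-managed pools (native libraries, thread stacks, malloc arenas, direct buffers), which is where pmap becomes useful.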

I think the reason for the high memory consumption of your Eclipse Environment is the use of SWT. SWT is a native graphic library living outside of the heap of the JVM, and to worsen the situation, the implementation on Linux is not really optimized.
I don't think there's really a chance to reduce the memory consumption of your eclipse environment concerning the memory outside the heap.

Eclipse is a memory and cpu hog. In addition to the Java class libraries all the low end GUI stuff is handled by native system calls so you will have a substantial "native" JNI library to execute the low level X term calls attached to your process.
Eclipse offers millions of useful features and lots of helpers to speed up your day-to-day programming tasks, but lean and mean it is not. Any reduction in memory or resources will probably result in a noticeable slowdown. It really depends on how much you value your time vs. your computer's memory.
If you want lean and mean, gvim and make are unbeatable. If you want code completion, automatic builds, etc., you must expect to pay for this with extra resources.

If I run the following program
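public class Main {
    // Prints a counter once per second for a minute so the process can be inspected with ps.
    public static void main(String... args) throws InterruptedException {
        for (int i = 0; i < 60; i++) {
            System.out.println("waiting " + i);
            Thread.sleep(1000);
        }
    }
}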
then ps auwx prints
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
500 13165 0.0 0.0 596680 13572 pts/2 Sl+ 13:54 0:00 java -Xms64m -Xmx200m -XX:NewSize=8m -XX:PermSize=80m -XX:MaxPermSize=150m -cp . Main
The amount of memory used is 13.5 MB. There are about 200 MB of shared libraries which count towards the VSZ size. The rest can be accounted for by the max heap and max perm gen, with some overhead for the thread stacks etc.
The problem doesn't appear to be with the JVM but the application running in it. Using additional shared libraries, direct memory and memory mapped files can increase the amount of memory used.
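Direct memory and memory-mapped files are two of the off-heap consumers mentioned above that the JVM can report on itself. A minimal sketch using the standard BufferPoolMXBean follows; the class name BufferPoolSketch is made up, and the pool name "direct" is what HotSpot uses, which I am assuming here.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

final class BufferPoolSketch {
    /** Bytes currently used by the "direct" buffer pool, or -1 if no such pool is reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) { // pool name as reported by HotSpot (assumption)
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Allocate 1 MiB off-heap so the direct pool is not empty.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024);

        List<BufferPoolMXBean> pools = ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            System.out.println(pool.getName() + ": " + pool.getCount() + " buffers, "
                    + pool.getMemoryUsed() + " bytes used");
        }
        System.out.println("holding a direct buffer of " + buffer.capacity() + " bytes");
    }
}
```

None of this memory shows up in the heap figures, yet all of it counts toward the RSS of the process.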
Given you can buy 16 GB for around $100, do you know this is actually a problem?

Full GC becoming very frequent

I've got a Java webapp running on one tomcat instance. During peak times the webapp serves around 30 pages per second and normally around 15.
My environment is:
O/S: SUSE Linux Enterprise Server 10 (x86_64)
RAM: 16GB
server: Tomcat 6.0.20
JVM: Java HotSpot(TM) 64-Bit Server VM 1.6.0_14
JVM options:
CATALINA_OPTS="-Xms512m -Xmx1024m -XX:PermSize=128m -XX:MaxPermSize=256m
-XX:+UseParallelGC
-Djava.awt.headless=true
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
JAVA_OPTS="-server"
After a couple of days of uptime the Full GC starts occurring more frequently and it becomes a serious problem to the application's availability. After a tomcat restart the problem goes away but, of course, returns after 5 to 10 or 30 days (not consistent).
The Full GC log before and after a restart is at http://pastebin.com/raw.php?i=4NtkNXmi
It shows a log before the restart at 6.6 days uptime where the app was suffering because Full GC needed 2.5 seconds and was happening every ~6 secs.
Then it shows a log just after the restart where Full GC only happened every 5-10 minutes.
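A Full GC of 2.5 seconds roughly every 6 seconds means the JVM spends on the order of 40% of its time collecting. The same kind of figure can be tracked from inside the JVM without parsing logs. The sketch below uses the standard GarbageCollectorMXBean; the class name GcOverheadSketch and the overhead helper are hypothetical, and the reported collection time is an approximation whose meaning varies by collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverheadSketch {
    /** Fraction of an interval spent in GC, e.g. 2500 ms out of every 6000 ms. */
    static double overhead(long gcMillis, long intervalMillis) {
        return intervalMillis <= 0 ? 0.0 : (double) gcMillis / intervalMillis;
    }

    public static void main(String[] args) {
        long totalGcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = Math.max(0, gc.getCollectionTime()); // -1 means undefined
            totalGcMillis += time;
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + time + " ms");
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead since start: %.1f%%%n",
                100 * overhead(totalGcMillis, uptimeMillis));
    }
}
```

Sampling these counters periodically and looking at the deltas, rather than the totals since startup, would show the overhead creeping up over days of uptime.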
I've got two dumps using jmap -dump:format=b,file=dump.hprof PID when the Full GCs were occurring (I'm not sure whether I took them exactly while a Full GC was running or between two Full GCs) and opened them in http://www.eclipse.org/mat/ but didn't get anything useful in Leak Suspects:
60MB: 1 instance of "org.hibernate.impl.SessionFactoryImpl" (I use hibernate with ehcache)
80MB: 1,024 instances of "org.apache.tomcat.util.threads.ThreadWithAttributes" (these are probably the 1024 workers of tomcat)
45MB: 37 instances of "net.sf.ehcache.store.compound.impl.MemoryOnlyStore" (these should be my ~37 cache regions in ehcache)
Note that I never get an OutOfMemoryError.
Any ideas on where should I look next?

When we had this issue, we eventually tracked it down to the young generation being too small. Although we had given the JVM plenty of RAM, the young generation wasn't given its fair share.
This meant that minor garbage collections happened more frequently and caused some young objects to be moved into the tenured generation, which in turn meant more major garbage collections.
Try using -XX:NewRatio with a fairly low value (say 2 or 3) and see if this helps.
More info can be found here.
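As a rough guide to what those values mean: NewRatio is the ratio of old to young generation, so the young generation gets about heap / (NewRatio + 1); SurvivorRatio is the ratio of eden to one survivor space, so each survivor space gets about young / (SurvivorRatio + 2). The sketch below just does this arithmetic. The class and method names are made up, and the real JVM applies alignment and, depending on the collector, adaptive sizing, so actual sizes will differ somewhat.

```java
final class GenerationSizingSketch {
    /** Approximate young generation size: heap / (NewRatio + 1). */
    static long youngMiB(long heapMiB, int newRatio) {
        return heapMiB / (newRatio + 1);
    }

    /** Approximate size of one survivor space: young / (SurvivorRatio + 2). */
    static long survivorMiB(long youngMiB, int survivorRatio) {
        return youngMiB / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heapMiB = 1024; // matches -Xmx1024m from the question
        for (int newRatio : new int[] {2, 3, 8}) {
            long young = youngMiB(heapMiB, newRatio);
            System.out.println("NewRatio=" + newRatio + " -> young ~" + young
                    + " MiB, old ~" + (heapMiB - young) + " MiB");
        }
    }
}
```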

I've switched from -Xmx1024m to -Xmx2048m and the problem went away. I now have 100 days of uptime.

Besides tuning the various JVM options, I would also suggest upgrading to a newer release of the VM, because later versions have a much better tuned garbage collector (even without trying the new experimental one).
Besides that, even if it is (partially) true that assigning more RAM to the JVM can increase the time required to perform GC, there is a tradeoff point somewhere between your current settings and using the whole 16 GB of memory, so you can try doubling all values to start:
-Xms1024m -Xmx2048m -XX:PermSize=256m -XX:MaxPermSize=512m
Regards
Massimo

What might be happening in your case is that you have a lot of objects that live a little longer than the NewGen life cycle. If the survivor space is too small, they go straight to the OldGen. -XX:+PrintTenuringDistribution could provide some insight. Your NewGen is large enough, so try decreasing SurvivorRatio.
Also, jconsole will probably provide more visual insight into what happens with your memory; try it.
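The per-generation numbers that jconsole plots come from the JVM's memory pool beans, which can also be read directly. This is a minimal sketch with the standard MemoryPoolMXBean API; the class name MemoryPoolSketch is hypothetical, and the pool names it prints depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class MemoryPoolSketch {
    /** Names of the pools that make up the Java heap (collector dependent). */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (pool.getType() != MemoryType.HEAP || usage == null) {
                continue;
            }
            System.out.println(pool.getName() + ": used=" + usage.getUsed()
                    + " committed=" + usage.getCommitted() + " max=" + usage.getMax());
        }
    }
}
```

Watching the survivor and old generation pools over time should show whether objects are being promoted early, as suspected above.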
