The Java ZGC garbage collector uses a lot of memory


I built a simple application using Spring Boot. When I deploy it to a Linux server with the ZGC garbage collector, it uses a lot of memory. I tried to limit the maximum heap to 500 MB with -Xmx500m, but the Java process still used more than 1 GB. With the G1 collector it only used 350 MB. I don't know why. Is this a bug in JDK 11, or is there a problem with my startup parameters?
#### Runtime environment
operating system: CentOS Linux release 7.8.2003
JDK version: jdk11
springboot version: v2.3.0.RELEASE
Here is my Java startup command
java -Xms128m -Xmx500m \
-XX:+UnlockExperimentalVMOptions -XX:+UseZGC \
-jar app.jar
Here is a screenshot of the memory usage at run time
Heap memory usage
https://github.com/JoyfulAndSpeedyMan/assets/blob/master/2020-07-13%20201259.png?raw=true
System memory usage
https://github.com/JoyfulAndSpeedyMan/assets/blob/master/2020-07-13%20201357.png?raw=true
Here's what happens when you use the default garbage collector
Java startup command
java -Xms128m -Xmx500m \
-jar app.jar
Heap memory usage
https://github.com/JoyfulAndSpeedyMan/assets/blob/master/2020-07-13%20202442.png?raw=true
System memory usage
https://github.com/JoyfulAndSpeedyMan/assets/blob/master/2020-07-13%20202421.png?raw=true
By default JDK 11 uses the G1 garbage collector. Theoretically, shouldn't G1 be more memory-intensive than ZGC? Why isn't that what I see? Did I misunderstand something? I'm a beginner with the JVM, so I don't understand why.

ZGC employs a technique known as colored pointers. The idea is to use some free bits in 64-bit pointers into the heap for embedded metadata. However, when dereferencing such pointers, these bits need to be masked, which implies some extra work for the JVM.
To avoid the overhead of masking pointers, ZGC uses a multi-mapping technique. Multi-mapping is when multiple ranges of virtual memory are mapped to the same range of physical memory.
ZGC uses 3 views of Java heap ("marked0", "marked1", "remapped"), i.e. 3 different "colors" of heap pointers and 3 virtual memory mappings for the same heap.
As a consequence, the operating system may report 3x larger memory usage. For example, for a 512 MB heap, the reported committed memory may be as large as 1.5 GB, not counting memory besides the heap. Note: multi-mapping affects the reported used memory, but physically the heap will still use 512 MB in RAM. This sometimes leads to a funny effect that RSS of the process looks larger than the amount of physical RAM.
See also:
ZGC: A Scalable Low-Latency Garbage Collector by Per Lidén
Understanding Low Latency JVM GCs by Jean Philippe Bempel
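To get a feel for what multi-mapping means at the OS level, here is a minimal sketch in plain Java. It is not how ZGC is implemented; it only maps the same region of a temporary file twice and shows that both views are backed by the same data. The class and method names are made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MultiMappingDemo {
    // Maps the same 4 KiB file region twice, writes a byte through the
    // first view and returns what is read back through the second view.
    static byte writeThroughOneReadThroughOther(byte value) {
        try {
            Path file = Files.createTempFile("multimap", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Two virtual address ranges, one underlying set of pages.
                MappedByteBuffer viewA = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                MappedByteBuffer viewB = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                viewA.put(0, value);   // write via the first mapping
                return viewB.get(0);   // read via the second mapping
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThroughOneReadThroughOther((byte) 42));
    }
}
```

A tool that simply adds up mapped ranges would count these 4 KiB twice, which is the same accounting effect described above for the three heap views.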

JVM uses much more than just the heap memory - read this excellent answer to understand JVM memory consumption better: Java using much more memory than heap size (or size correctly Docker memory limit)
You'll need to go beyond the heap inspection and use things like Native Memory Tracking to get a clearer picture.
I don't know what the particular issue with your application is, but ZGC is often described as a good fit for large heaps.
It's also a brand-new collector that has changed a lot recently. I'd upgrade to JDK 14 if you want to use it (see "Change Log" here: https://wiki.openjdk.java.net/display/zgc/Main)
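As a first step before Native Memory Tracking (enabled with -XX:NativeMemoryTracking=summary and read with jcmd <pid> VM.native_memory summary), you can ask the JVM itself how much heap and non-heap memory it has committed. The sketch below uses the standard java.lang.management API; the class and helper names are made up. Note that the "non-heap" figure here covers only JVM-managed pools such as Metaspace and the code cache, not thread stacks or GC-internal structures.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapVsNonHeap {
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    // Formats used and committed sizes of one memory area.
    static String describe(String label, MemoryUsage usage) {
        return String.format("%s: used=%d MiB, committed=%d MiB",
                label, toMiB(usage.getUsed()), toMiB(usage.getCommitted()));
    }

    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.println(describe("heap", mem.getHeapMemoryUsage()));
        System.out.println(describe("non-heap", mem.getNonHeapMemoryUsage()));
    }
}
```

If the sum of both lines is still far below what the OS reports, the remainder is memory that only native-level tools will show.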

This is a result of the throughput-latency-footprint tradeoff. When choosing between these 3 things, you can only pick 2.
ZGC is a concurrent GC with low pause times. Since it also tries not to give up throughput, it gets latency and throughput at the expense of footprint. So there is nothing surprising about such high memory consumption.
G1 does not aim for pause times as low as ZGC's, so it shifts the tradeoff towards footprint: you get longer pauses but save some memory.

The amount of OS memory the JVM uses (i.e., the "committed heap") depends on how often the GC runs and on whether it uncommits unneeded memory when the app starts to use less, which is tunable. Unfortunately, ZGC isn't (currently) as aggressive about this by default as G1, but both have some tuning options that you can try.
P.S. As others have noted, the RES htop column is misleading, but the VisualVM chart shows the real picture.

Related

Java - Odd memory consumption between x32 and x64

I've been profiling the x64 version of my application because its memory usage has been outrageously high. All of it seems to come from the JavaFX MediaPlayer, and I'm correctly releasing listeners and event handlers.
Here is the stark contrast.
The x32 version at start
And now the x64 version at start
The x32 version stays below 256 MB while the x64 version shoots over 1 GB; this is while both are left to play through their playlist.
All the code is the same.
JDK: jdk1.8.0_20
JRE: jre1.8.0_20
VM arguments on both
-XX:MinHeapFreeRatio=40 -XX:MaxHeapFreeRatio=70 -Xms3670k -Xmx256m -Dsun.java2d.noddraw=true -XX:+UseParallelGC
Same issue occurring on another x64 Java application
Is this a bug or am I overlooking something?

What you are seeing is the memory usage of the entire JVM running your process. The -Xmx256m setting only limits the maximum heap space available for your application to allocate (and the JVM would enforce that). Outside of heap space, the JVM can use additional memory for a host of other purposes (I am sure I will miss a few in the list below):
PermGen, which has now been replaced by the Metaspace. According to the documentation, there is no default limit for this:
-XX:MaxMetaspaceSize=size
Sets the maximum amount of native memory that can be allocated for class metadata. By default, the size is not limited. The amount of metadata for an application depends on the application itself, other running applications, and the amount of memory available on the system.
Stack space (memory used = number of threads * stack size). You can control the stack size with the -Xss parameter.
Off-heap space (either use of direct ByteBuffers in your code, or use of third-party libraries like EHCache which in turn use off-heap memory)
JNI code
GC (garbage collectors need their own memory, which is again not part of the heap and can vary greatly depending on the collector used and the application memory usage)
In your case, you are seeing the "almost doubling" of memory use, plus probably a more relaxed Metaspace allocation, when you move from a 32-bit to a 64-bit JVM. Using -XX:MaxMetaspaceSize=128m will probably bring the memory usage down to under 512 MB for the 64-bit JVM.
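Two of the items in the list above can be estimated from inside the process. The following sketch multiplies the live thread count by an assumed stack size (1 MiB is used here as an assumption; check your own -Xss setting) and sums what the NIO buffer pools currently hold. Class and method names are invented for the example.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class OffHeapEstimate {
    // Rough upper bound for stack memory: thread count times stack size.
    static long estimateStackBytes(int threadCount, long stackSizeBytes) {
        return threadCount * stackSizeBytes;
    }

    // Bytes currently held by NIO buffer pools ("direct" and "mapped").
    static long bufferPoolBytes() {
        long total = 0;
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            // getMemoryUsed() may return -1 when the value is unknown.
            total += Math.max(0, pool.getMemoryUsed());
        }
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(1 << 20); // 1 MiB off-heap
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long assumedStackSize = 1024L * 1024; // assumption, not read from the JVM
        System.out.println("threads: " + threads);
        System.out.println("estimated stack bytes: "
                + estimateStackBytes(threads, assumedStackSize));
        System.out.println("buffer pool bytes: " + bufferPoolBytes());
        System.out.println("direct buffer capacity: " + direct.capacity());
    }
}
```

None of these numbers are limited by -Xmx, which is why the process can be much larger than the configured heap.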

I don't know your application or how it is implemented.
One possible reason for such a surprising difference could be how much memory can be used before a garbage collection is performed. It is conceivable that a JVM on a 64-bit machine is given more memory than one on a 32-bit machine. The garbage collector could then run less often, so more garbage would stay allocated even when that is not really necessary or useful.

JVM performance with these garbage collection settings

I have an enterprise level Java application that serves a few thousand users per day. This is a JAXB web service on weblogic 10.3.6 (Java 1.6 JVM), using Hibernate to hit an Oracle database. It also calls other web services.
We have tuned it with the following GC settings on our production system:
-server -Xms2048m -Xmx2048m -XX:PermSize=512m -XX:MaxPermSize=512m
What is the effect of this GC sizing? The hardware has more than enough capacity to handle it.
I know that this sets the heap size and perm gen at a stable level. But what's the impact of that when you eventually have to do garbage collection?
To me it seems that it would make GC happen less frequently, but take longer when it does happen. Does that sound correct?

I would say: monitor the GC before deciding on the sizing, as you never know how the application will behave under load. Have a look at this link and this one; they have some good references about GC and tools for analyzing it.
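Besides GC logs (for example -verbose:gc), the JVM exposes basic GC counters through the standard management API, which is enough for a first look at how often collections happen and how long they take in total. A minimal sketch, with made-up class and method names:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    // One line per collector: how many collections ran and their total time.
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("%s: collections=%d, total time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summary());
    }
}
```

Sampling this periodically under realistic load, once with the current sizing and once with an alternative, gives a rough answer to the "less frequent but longer" question for your own workload.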

it would make GC happen less frequently, but take longer when it does happen
It might; it depends on your use case. You might even find that the GC is shorter in rare cases.
A 2 GB heap isn't that much and I would use up to 26 GB without worrying about heap size. Above this size memory accesses are a little slower or use more memory.

Setting -Xmx & -Xms and PermSize & MaxPermSize to equal sizes will stop the JVM from resizing the heaps based on your requirement. These resizes are expensive as they trigger a Full GC.
-server lets the JVM use the Server Compiler, which performs more aggressive optimizations when compiling your code to native machine instructions. Nowadays, though, any machine with 2 or more cores and 2 GB+ of memory will have the server compiler on by default.
Increasing the memory doesn't always fix a problem. Sometimes adding more memory will be an overhead.
If you need details regarding GC, you can try this link
The very reason to tune something is to improve your application's performance and thereby achieve your throughput and latency goals.

Java: Commandline parameter to release unused memory

In Bash, I use the command java -Xmx8192m -Xms512m -jar jarfile to start a Java process with an initial heap space of 512MB and maximum heap space of 8GB.
I like how the heap space increases based on demand, but once the heap space has been increased, it doesn't release although the process doesn't need the memory. How can I release the memory that isn't being used by the process?
Example: Process starts, and uses 600MB of memory. Heap space increases from 512MB to a little over 600MB. Process then drops down to 400MB RAM usage, but heap allocation stays at 600MB. How would I make the allocation stay near the RAM usage?

You cannot; it's simply not designed to work that way. Note that unused memory pages will simply be mapped out by your hardware, and so won't consume any real memory.

Generally you would not want the JVM to return memory to the OS and later claim it back, as neither operation is cheap.
There are a couple XX parameters that may or may not work with your preferred garbage collector, namely
-XX:MaxHeapFreeRatio=70 Maximum percentage of heap free after GC to avoid shrinking.
-XX:MinHeapFreeRatio=40 Minimum percentage of heap free after GC to avoid expansion.
Source
I believe you'd need a stop-the-world collector for them to be enforced.
Other JVMs may have their own parameters.
I'd normally have not replied but the amount of negative/false info ain't cool.
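To see what the two ratios mean in numbers, here is a small worked calculation. It is a simplified reading of the flag descriptions above (capacity bounds derived from the live data after a GC); real collectors apply further limits and heuristics, so treat the results as an approximation. The class and method names are made up.

```java
class HeapFreeRatio {
    // Smallest capacity that still leaves minFreePercent of the heap free.
    static double minCapacity(double usedMb, int minFreePercent) {
        return usedMb / (1.0 - minFreePercent / 100.0);
    }

    // Largest capacity allowed before more than maxFreePercent is free.
    static double maxCapacity(double usedMb, int maxFreePercent) {
        return usedMb / (1.0 - maxFreePercent / 100.0);
    }

    public static void main(String[] args) {
        double usedMb = 400; // live data after GC, as in the question
        System.out.println(String.format("40/70: keep capacity between %.0f and %.0f MB",
                minCapacity(usedMb, 40), maxCapacity(usedMb, 70)));
        System.out.println(String.format("5/25:  keep capacity between %.0f and %.0f MB",
                minCapacity(usedMb, 5), maxCapacity(usedMb, 25)));
    }
}
```

Under this reading, a 600 MB heap with 400 MB in use is only one third free, so with the 40/70 values there is no reason to shrink it at all; much lower ratios would be needed before shrinking is even considered.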

No, this is a necessary feature. I think the JVM in Android can probably do this, but I'm not sure.
But most of them, including all Java EE VMs, simply don't care about this.
This is not as simple as it seems: from the OS's point of view the VM is a process, which has a mapped memory region somewhere for its stack or data segment.
In most cases this needs to be a contiguous interval. From the OS's point of view, memory allocation and release happen through a system call that the process uses to ask the OS for a new segment limit.
What do you do if, for example, your JVM has 2 gigabytes of RAM and uses only 500 megabytes, but those 500 megabytes are scattered in ten-byte fragments across the 2 gigabytes? A memory release function would also need a defragmentation step, which would multiply the resource cost of GC runs.
As Java runs, and Java objects are constructed and then destroyed by the garbage collector, the free and allocated memory areas become scattered across the stack/data segment.
If we look at native OS processes instead of Java, the situation is the same: if you malloc() ten 1 MB blocks and then release the first 9, there is no way to give them back to the OS, although newer libraries and OS APIs have seen extensive development here. Of course, if you later allocate memory again, the allocation will be served from the just-freed regions.
My opinion is that even if this is somewhat costly and complex (and quite a lot of programming work), it is worth the price, and I think it doesn't reflect well on our collective programming culture that it hasn't been done for decades in everything, including the Java VMs.

Weird behavior of Java -Xmx on large amounts of ram

You can control the maximum heap size in java using the -Xmx option.
We are experiencing some weird behavior on Windows with this switch. We run some very beefy servers (think 196gb ram). Windows version is Windows Server 2008R2
Java version is 1.6.0_18, 64-Bit (obviously).
Anyway, we were having some weird bugs where processes were quitting with out of memory exceptions even though the process was using much less memory than specified by the -Xmx setting.
So we wrote simple program that would allocate a 1GB byte array each time one pressed the enter key, and initialize the byte array to random values (to prevent any memory compression etc).
Basically, what's happening is that if we run the program with -Xmx35000m (roughly 35 GB) we get an out of memory exception when we hit 25 GB of process space (using Windows Task Manager to measure). We hit this after allocating 24 GB worth of 1 GB blocks, BTW, so that checks out.
Simply specifying a larger value for -Xmx option makes the program work fine to larger amounts of ram.
So, what is going on? Is -Xmx just "off". BTW: We need to specify -Xmx55000m to get a 35 GB process space...
Any ideas on what is going on?
Is there a bug in the Windows JVM?
Is it safe to simply set the -Xmx option bigger, even though there is a disconnect between the -Xmx option and what is going on process wise?
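For reference, a test program like the one described above might look roughly like this. It is a reconstruction, not the original code: it allocates in a loop instead of waiting for the enter key, and the class name, method name and the cap of 64 blocks are invented for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class AllocateUntilFull {
    // Allocates up to maxBlocks arrays of blockSize bytes, fills each with
    // random data, keeps them reachable, and returns how many succeeded
    // before the limit or an OutOfMemoryError was reached.
    static int allocate(int blockSize, int maxBlocks) {
        List<byte[]> held = new ArrayList<>();
        Random random = new Random();
        try {
            while (held.size() < maxBlocks) {
                byte[] block = new byte[blockSize];
                random.nextBytes(block); // touch every byte of the block
                held.add(block);
            }
        } catch (OutOfMemoryError e) {
            // Expected once the heap is exhausted; report what we got.
        }
        return held.size();
    }

    public static void main(String[] args) {
        int oneGiB = 1 << 30;
        int blocks = allocate(oneGiB, 64);
        System.out.println("allocated " + blocks + " blocks of 1 GiB");
    }
}
```

Running it with different -Xmx values and comparing the printed block count with the configured maximum reproduces the measurement from the question.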

Theory #1
When you request a 35Gb heap using -Xmx35000m, what you are actually saying is that the total space used for the heap may be up to 35Gb. But the total space consists of the Tenured Object space (for objects that survive multiple GC cycles), the Eden space for newly created objects, and other spaces into which objects will be copied during garbage collection.
The issue is that some of the spaces are not and cannot be used for allocating new objects. So in effect, you "lose" a significant percent of your 35Gb to overheads.
There are various -XX options that can be used to tweak the sizes of the respective spaces, etc. You might try fiddling with them to see if they make a difference. Refer to this document for more information. (The commonly used GC tuning options are listed in section 8. The -XX:NewSize option looks promising ...)
Theory #2
This might be happening because you are allocating huge objects. IIRC, objects above a certain size can be allocated directly into the Tenured Object space. In your (highly artificial) benchmark, this might result in the JVM not putting stuff into the Eden space, and therefore being able to use less of the total heap space than is normal.
As an experiment, try changing your benchmark to allocate lots of small objects, and see if it manages to use more of the available space before OOME-ing.
Here are some other theories that I would discount:
"You are running into OS-imposed limits." I would discount this, since you said that you can get significantly greater memory utilization by increasing the -Xmx... setting.
"The Windows task manager is reporting bogus numbers." I would discount this because the numbers reported roughly match the 25Gb that you think your application had managed to allocate.
"You are losing space to other things; e.g. the permgen heap." AFAIK, the permgen heap size is controlled and accounted independently of the "normal" heaps. Other non-heap memory usage is either a constant (for the app) or dependent on the app doing specific things.
"You are suffering from heap fragmentation." All of the JVM garbage collectors are "copying collectors", and this family of collectors has the property that heap nodes are automatically compacted.
"JVM bug on Windows." Highly unlikely. There must be tens of thousands of 64bit Java on Windows installations that maximize the heap size. Someone else would have noticed ...
Finally, if you are NOT doing this because your application requires you to allocate memory in huge chunks, and hang onto it "for ever" ... there's a good chance that you are chasing shadows. A "normal" large-memory application doesn't do this kind of thing, and the JVM is tuned for normal applications ... not anomalous ones.
And if your application really does behave this way, the pragmatic solution is to just set the -Xmx... option larger, and only worry if you start running into OS-level issues.
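To check Theory #1 on a given JVM, you can list the individual heap spaces and their maximum sizes and compare them with what the runtime reports as usable. A sketch using the standard management API follows; pool names differ between collectors, and the class and method names here are made up.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class HeapPools {
    // Lists every heap pool (eden, survivor, old/tenured, ...) with its
    // maximum size, followed by the maximum the runtime says it can use.
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() is -1 when the pool has no defined maximum.
            sb.append(String.format("%s: max=%d bytes%n",
                    pool.getName(), usage.getMax()));
        }
        sb.append(String.format("Runtime.maxMemory()=%d bytes%n",
                Runtime.getRuntime().maxMemory()));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

With some collectors the value of Runtime.maxMemory() is smaller than the -Xmx setting, which would be consistent with part of the heap being reserved for copying rather than for new allocations.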

To get a feeling for what exactly you are measuring you should use some different tools:
the Windows Task Manager (I only know Windows XP, but I heard rumours that the Task Manager has improved since then.)
procexp and vmmap from Sysinternals
jconsole from the JDK (you are using the Sun/Oracle HotSpot JVM, aren't you?)
Now you should answer the following questions:
What does jconsole say about the used heap size? How does that differ from procexp?
Does the value from procexp change if you fill the byte arrays with non-zero numbers instead of keeping them at 0?

Did you try turning on verbose GC output to find out why the last allocation fails? Is it because the OS fails to allocate a heap beyond 25 GB for the native JVM process, or because the GC is hitting some sort of limit on the maximum memory it can manage? I would also recommend connecting to the command-line process with jconsole to see the status of the heap just before the allocation failure. Tools like the Sysinternals Process Explorer might also give better details about where the failure is occurring, if it is in the JVM process.
Since the process is dying at 25 GB and you have a generational collector, maybe the rest of the generations are consuming 10 GB. I would recommend installing JDK 1.6_u24 and using jvisualvm with the VisualGC plugin to see what the GC is doing; in particular, look at the size of all the generations to see how the 35 GB heap is being chopped up into different regions by the GC / VM memory manager.
See this link if you are not familiar with generational GC: http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#generation_sizing.total_heap

I assume this has to do with fragmenting the heap. The free memory is probably not available as a single contiguous free area and when you try to allocate a large block this fails because the requested memory cannot be allocated in a single piece.

The memory displayed by windows task manager is the total memory allocated to the process which includes memory for code, stack, perm gen and heap.
The memory you measure using your click program is the amount of heap jvm makes available to running jvm programs.
Naturally, the total memory allocated to the JVM by Windows should be greater than what the JVM makes available to your program as heap memory.

Eclipse release heap back to system

I'm using Eclipse 3.6 with latest Sun Java 6 on Linux (64 bit) with a larger number of large projects. In some special circumstances (SVN updates for example) Eclipse needs up to 1 GB heap. But most of the time it only needs 350 MB. When I enable the heap status panel then I see this most of the time:
350M of 878M
I start Eclipse with these settings: -Xms128m -Xmx1024m
So most of the time lots of MB are just wasted and are just used rarely when memory usage peaks for a short time. I don't like that at all and I want Eclipse to release the memory back to the system, so I can use it for other programs.
When Eclipse needs more memory while there is not enough free RAM, then Linux can swap out other running programs; I can live with that. I heard there is a -XX:MaxHeapFreeRatio option. But I never figured out what values I have to use so it works. No value I tried ever made a difference.
So how can I tell Eclipse (Or Java) to release unused heap?

Found a solution. I switched Java to use the G1 garbage collector and now the HeapFreeRatio parameters work as intended. So I use these options in eclipse.ini:
-XX:+UnlockExperimentalVMOptions
-XX:+UseG1GC
-XX:MinHeapFreeRatio=5
-XX:MaxHeapFreeRatio=25
Now when Eclipse eats up more than 1 GB of RAM for a complicated operation and switches back to 300 MB after garbage collection, the memory is actually released back to the operating system.
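If you want to check whether a given set of options really makes the JVM give heap back, a small probe like the following can help. It records the committed heap size before a burst of allocations, at the peak, and after dropping the data and requesting a GC. This is only a sketch with invented names; System.gc() is merely a request, and whether the last number goes down depends on the collector and flags in use.

```java
class CommittedHeapProbe {
    static long committedMiB() {
        return Runtime.getRuntime().totalMemory() / (1024 * 1024);
    }

    // Returns committed heap in MiB: {before, at peak, after GC request}.
    static long[] run(int megabytes) {
        long before = committedMiB();
        byte[][] blocks = new byte[megabytes][];
        for (int i = 0; i < megabytes; i++) {
            blocks[i] = new byte[1024 * 1024]; // hold 1 MiB per slot
        }
        long peak = committedMiB();
        blocks = null;  // make the data unreachable
        System.gc();    // a hint; the JVM may ignore or delay it
        long after = committedMiB();
        return new long[] {before, peak, after};
    }

    public static void main(String[] args) {
        long[] r = run(256);
        System.out.println("committed MiB before=" + r[0]
                + " peak=" + r[1] + " after=" + r[2]);
    }
}
```

Running it once with the default settings and once with the options above shows whether the "after" value drops back towards the "before" value on your JVM.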

You can go to Preferences -> General and check "Show heap status". This activates a nice view of your heap in the corner of Eclipse. Something like this:
If you click the trash bin, it will try to run garbage collection and return the memory.

Java's heap is nothing more than a big data structure managed within the JVM process' heap space. The two heaps are logically-separate entities even though they occupy the same memory.
The JVM is at the mercy of the host system's implementations of malloc(), which allocates memory from the system using brk(). On Linux systems (Solaris, too), memory allocated for the process heap is almost never returned, largely because it becomes fragmented and the heap must be contiguous. This means that memory allocated to the process will increase monotonically, and the only way to keep the size down is not to allocate it in the first place.
-Xms and -Xmx tell the JVM how to size the Java heap ahead of time, which causes it to allocate process memory. Java can garbage collect until the sun burns out, but that cleanup is internal to the JVM and the process memory backing it doesn't get returned.
Elaboration from comment below:
The standard way for a program written in C (notably the JVM running Eclipse for you) to allocate memory is to call malloc(3), which uses the OS-provided mechanism for allocating memory to the process and then managing individual allocations within those allocations. The details of how malloc() and free() work are implementation-specific.
On most flavors of Unix, a process gets exactly one data segment, which is a contiguous region of memory that has pointers to the start and end. The process can adjust the size of this segment by calling brk(2) and increasing the end pointer to allocate more memory or decreasing it to return it to the system. Only the end can be adjusted. This means that if your implementation of malloc() enlarges the data segment, the corresponding implementation of free() can't shrink it unless it determines that there's space at the end that's not being used. In practice, a humongous chunk of memory you allocated with malloc() rarely winds up at the very end of the data segment when you free() it, which is why processes tend to grow monotonically.
