I am running Tomcat on a 4-CPU, 32GB-memory, 64-bit machine (OS is CentOS 6.3). The Java options I start Tomcat with are -server -Xms1024m -Xmx1024m -XX:PermSize=512m -XX:MaxPermSize=512m
At the beginning, RES in top is just 810MB, and it keeps increasing. During this period I run jmap -J-d64 -histo pID to check the Java heap, and I think GC works fine, because the heap peaks at 510MB and drops to around 200MB after GC. But when RES in top hits 1.1g, CPU usage exceeds 100% and Tomcat hangs.
Using jstack pid to take a thread dump while CPU usage is at 100%, I see a thread named "VM Thread" eating almost 100% CPU. From what I found, this is the JVM's GC thread. So my questions are: why does RES keep growing when GC works fine, and how can I resolve this problem? Thanks.
Might be a permgen leak. If your heap stays at ~500MB and -XX:MaxPermSize is set to 512MB, a full permgen will put you at about 1GB of memory usage.
It could happen if you have lots (like, lots!) of JSPs being loaded late in the lifecycle of your program, or if you have used String.intern() a lot.
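As a rough, hypothetical illustration (not the asker's code): on older JVMs, where the interned-string pool lives in permgen, interning many distinct strings grows permgen even though the ordinary heap looks healthy after each GC:

    // Hypothetical sketch: each distinct interned string is pinned in the intern pool,
    // so permgen keeps growing while the regular heap still collects normally.
    public class InternFiller {
        public static void main(String[] args) {
            for (long i = 0; ; i++) {
                String s = ("session-key-" + i).intern();
                if (i % 100000 == 0) {
                    System.out.println("interned " + i + " strings, sample=" + s);
                }
            }
        }
    }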
Follow this thread for further investigation: How to dump Permgen?
And this thread for tuning the GC to sweep permgen: What does JVM flag CMSClassUnloadingEnabled actually do?
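For reference, the flag combination discussed in that thread (hedged: it only applies when you are actually running the CMS collector on a permgen-era JVM) looks something like this:

    -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled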
If the garbage collection thread is spinning at 100%, then it is likely trying to perform garbage collection but cannot collect any objects, so you are in a garbage-collection death spiral where it keeps running without being able to free any memory.
This may happen because you have a memory leak in your program, or because you are simply not giving the VM enough memory to handle the number of objects you load during regular use. It sounds like you have plenty of headroom to increase the heap size. That may only prolong the time before you hit the death spiral, or it may get you to a stable running state. You'll want to run a test for quite some time to make sure you're not just delaying the inevitable.
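As a sketch only (sizes picked for illustration on that 32GB box, not a recommendation), you could both raise the heap and ask the JVM to write a heap dump if it genuinely runs out, so the next hang leaves evidence to analyze:

    -server -Xms4096m -Xmx4096m -XX:PermSize=512m -XX:MaxPermSize=512m
    -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp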
Related
I have a Spring API that makes heavy use of memory, deployed on a Kubernetes cluster.
I configured the autoscaler (HPA) to use memory consumption as the scaling criterion. Running a load test, everything works well at scale-up time, but at scale-down time the memory does not go down, and consequently the pods that were created are not removed. If I run the tests again, new pods are created, but never removed.
Doing a local analysis with VisualVM, I believe the problem is related to the GC. Locally the GC works correctly during the test, but once the requests end it stops running, leaving garbage behind, and it only runs again after a long time. So I believe this leftover garbage is preventing the HPA from scaling down.
Does anyone have any tips on what may be causing this effect or something that I can try?
PS: In the profiler I have no indication of any memory leak, and when I run the GC manually, the leftover garbage is removed.
Here are some additional details:
Java Version: 11
Spring Version: 2.3
Kubernetes Version: 1.17
Docker Image: openjdk:11-jre-slim
HPA Requests Memory: 1Gi
HPA Limits Memory: 2Gi
HPA Memory Utilization Metrics: 80%
HPA Min Pods: 2
HPA Max Pods: 8
JVM OPS: -Xms256m -Xmx1G
Visual VM After Load test
New Relic Memory Resident After Load Test
There most likely isn't a memory leak.
The JVM requests memory from the operating system up to the limit set by the -Xmx... command-line option. After each major GC run, the JVM looks at the ratio of heap memory in use to the (current) heap size:
If the ratio is too close to 1 (i.e. the heap is too full), the JVM requests memory from the OS to make the heap larger. It does this "eagerly".
If the ratio is too close to 0 (i.e. the heap is too large), the JVM may shrink the heap and return some memory to the OS. It does this "reluctantly". Specifically, it may take a number of full GC runs before the JVM decides to release memory.
I think that what you are seeing is the effect of the JVM's heap sizing policy. If the JVM is idle, there won't be enough full GCs to trigger the JVM to shrink the heap, and memory won't be given back to the OS.
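If you do want the sizing policy to shrink the heap more willingly, one hedged option (values are illustrative, and G1, the default on Java 11, only honours these ratios during full GCs) is to tighten the free-ratio bounds and cap -Xmx closer to real usage:

    -Xms256m -Xmx768m -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40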
You could try to encourage the JVM to give memory back by calling System.gc() a few times. But running a full GC is CPU intensive. And if you do manage to get the JVM to shrink the heap, then expanding the heap again (for the next big request) will entail more full GCs.
So my advice would be: don't try that. Use some other criterion to trigger your autoscaling ... if that makes any sense.
The other thing to note is that a JVM plus application may use a significant amount of non-heap memory; e.g. the executable and shared native libraries, the native (C++) heap, Java thread stacks, the Java metaspace, and so on. None of that usage is constrained by the -Xmx option.
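If you want to see where that non-heap memory goes, one option (a sketch; app.jar is a placeholder, and NMT adds a small overhead) is the JVM's Native Memory Tracking:

    java -XX:NativeMemoryTracking=summary -Xms256m -Xmx1G -jar app.jar
    jcmd <pid> VM.native_memory summary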
I'm struggling to understand why my Java application is slowly consuming all the memory available to the pod, causing Kubernetes to mark the pod as out of memory. The JVM (OpenJDK 8) is started with the following arguments:
-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=2
I'm monitoring the memory used by the pod and also the JVM memory and was expecting to see some correlation e.g. after major garbage collection the pod memory used would also fall. However I don't see this. I've attached some graphs below:
Pod memory:
Total JVM memory
Detailed Breakdown of JVM (sorry for all the colours looking the same...thanks Kibana)
What I'm struggling with is: when there is a significant reduction in heap memory just before 16:00, why does the pod's memory not also fall?
It looks like you are creating a pod with a resource limit of 1GB Memory.
You are setting -XX:MaxRAMFraction=2, which means you are allocating 50% of the available memory to the JVM; that seems to match what you are graphing as Memory Limit.
The JVM then reserves around 80% of that, which is what you are graphing as Memory Consumed.
When you look at Memory Consumed you will not see internal garbage collection (as in your second graph), because memory freed by the GC is released back to the JVM but is still reserved by it.
Is it possible that there is a memory leak in your Java application? It may be causing more memory to be reserved over time, until the JVM limit (512MB) is reached and your pod gets OOM-killed.
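To make the arithmetic explicit (my own worked example, assuming the 1GB container limit shown in the graphs): -XX:MaxRAMFraction=2 gives the heap roughly 1GB / 2 = 512MB, and metaspace, thread stacks, the code cache and native buffers all come on top of that. On 8u191 or later, the percentage-based flags are usually easier to reason about, for example:

    -XX:+UseContainerSupport -XX:MaxRAMPercentage=50.0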
I am new to analyzing memory issues in Java, so pardon me if this question seems naive.
I have an application running with the following JVM parameters set:
-Xms3072m -Xmx3072m
-XX:MaxNewSize=1008m -XX:NewSize=1008m
-XX:PermSize=224m -XX:MaxPermSize=224m -XX:SurvivorRatio=6
I am using VisualVM to monitor the usage. Here is what I see:
The problem is that even when the application is not receiving any data to process, the used memory doesn't go down. When the application is started, the used space starts low (around 1GB) but grows while the application is running, and then the used memory never goes down.
My question is: why doesn't the used heap memory go down even when no major processing is happening in the application, and what configuration can be set to correct it?
My understanding is that if the application is not doing any processing, then the heap used should be low, while the heap memory available (or max heap) should remain the same (3GB in this case).
This is a totally normal trend. Even if you believe the application is idle, there are probably threads running that perform tasks and create objects which become unreferenced once the tasks are done. Those objects are eligible for the next GC, but as long as there is no minor/major GC they take up more and more room in your heap, so usage goes up until a GC is triggered; then you get back to the normal heap size, and so on. (A hypothetical sketch of this is shown below.)
An abnormal trend would be the same thing, except that after a GC the heap size would be higher than it was just after the previous GC, which is not the case here.
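As the hypothetical illustration promised above (not code from the question), even an "idle" application with a scheduled housekeeping task produces this sawtooth, because each run allocates temporary objects that sit in the heap until the next collection:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class IdleAllocator {
        public static void main(String[] args) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            // A periodic "housekeeping" task: each run allocates temporary objects
            // that become unreachable as soon as the run finishes.
            scheduler.scheduleAtFixedRate(() -> {
                byte[] scratch = new byte[1024 * 1024]; // 1MB of short-lived garbage per run
                scratch[0] = 1; // touch it so the allocation is not optimised away
            }, 0, 1, TimeUnit.SECONDS);
            // Used heap climbs between collections and drops back after each GC (the sawtooth).
        }
    }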
Your real question is more: what is my application doing when it is not receiving any data to process? For that, a thread dump should help: you can run jcmd to get the PID, then run jstack $pid to get the thread dump.
Here is an example of a typical trend in the case of a memory leak:
As you can see, the starting heap size has changed between two GCs: the new starting heap size is higher than the previous one, which may be due to a memory leak.
Problem
We are trying to find the culprit of a big memory leak in our web application. We have pretty limited experience with finding memory leaks, but we found out how to make a Java heap dump using jmap and analyze it in Eclipse MAT.
However, with our application using 56 of the 60GB of memory, the heap dump is only 16GB in size and appears even smaller in Eclipse MAT.
Context
Our server uses WildFly 8.2.0 on Ubuntu 14.04 for our Java application, whose process uses 95% of the available memory. When we made the dump, the buffers/cache used space was at 56GB.
We used the following command to create the dump: sudo -u {application user} jmap -dump:file=/mnt/heapdump/dump_prd.bin {pid}
The heap dump file size is 16.4GB, and when analyzing it with Eclipse MAT, it says there are around 1GB of live objects and ~14.8GB of unreachable objects (shallow heap).
EDIT: Here is some more info about the problem we see happening. We monitor our memory usage and see it grow and grow until there is ~300MB of free memory left. Then it stays around that amount until the process crashes, unfortunately without any error in the application log.
This makes us assume it is a hard OOM error because this only happens when the memory is near-depleted. We use the settings -Xms25000m -Xmx40000m for our JVM.
Question
Basically, we are wondering why the majority of our memory isn't captured in this dump. The top retained-size classes don't look too suspicious, so we are wondering whether we are doing something wrong that is heap-dump related.
When dumping its heap, the JVM will first run a garbage collection cycle to free any unreachable objects.
(See also: How can I take a heap dump on Java 5 without garbage collecting first?)
In my experience, in a true OutOfMemoryError where your application is simply demanding more heap space than is available, this GC is a fool's errand and the final heap dump will be roughly the size of the max heap.
When the heap dump is much smaller, that means the system was not truly out of memory, but perhaps had memory pressure. For example, there is the java.lang.OutOfMemoryError: GC overhead limit exceeded error, which means that the JVM may have been able to free enough memory to service some new allocation request, but it had to spend too much time collecting garbage.
It's also possible that you don't have a memory problem. What makes you think you do? You didn't mention anything about heap usage or an OutOfMemoryError. You've only mentioned the JVM's memory footprint on the operating system.
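One practical, hedged suggestion on top of that: have the JVM write the dump at the moment a real Java-level OOM happens, so you are not guessing afterwards (this will not fire if the process is being killed by the OS OOM killer rather than throwing OutOfMemoryError):

    -Xms25000m -Xmx40000m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/heapdump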
In my experience, a heap dump much smaller than the real memory used can be due to a leak in JNI code.
Even if you don't directly use any native code, certain libraries use it under the hood for speed.
In our case, it was Deflater and Inflater instances that were not properly ended.
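A hedged sketch of that pattern (my own example, not the original code): Deflater holds native zlib buffers that never show up in a heap dump, so forgetting to call end() leaks native memory until finalization eventually runs, if it ever does:

    import java.util.Arrays;
    import java.util.zip.Deflater;

    public class DeflaterExample {
        // Leaky: the native zlib buffer behind this Deflater is only reclaimed
        // when the object is finalized, which may be far too late.
        static byte[] compressLeaky(byte[] input) {
            Deflater deflater = new Deflater();
            deflater.setInput(input);
            deflater.finish();
            byte[] out = new byte[input.length * 2 + 64]; // simplistic sizing for a sketch
            int n = deflater.deflate(out);
            return Arrays.copyOf(out, n);
        }

        // Safe: always release the native resources explicitly.
        static byte[] compressSafe(byte[] input) {
            Deflater deflater = new Deflater();
            try {
                deflater.setInput(input);
                deflater.finish();
                byte[] out = new byte[input.length * 2 + 64];
                int n = deflater.deflate(out);
                return Arrays.copyOf(out, n);
            } finally {
                deflater.end();
            }
        }
    }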
I have an application deployed to production which sometimes throws an OutOfMemory exception due to some memory leak. It is running on a headless Ubuntu box to which I would prefer not to connect VisualVM, JConsole, etc. remotely. Is there a way to make the JVM do a GC (like in VisualVM, where you just click a button to do it)?
I would like to run jmap -histo:live <pid> and this GC command alternately to find out which objects survive a GC, which object counts are growing, etc. Right now I can see some unexpected object counts, but it is happening across a number of my domain objects, so I am not sure whether it is a delayed GC or a memory leak.
So in short, I am looking for a Linux command to run against a JVM PID to make it do a GC. Not System.gc().
The GC will aggressively try to clean up unreferenced objects as the heap gets full, so it's not a "delayed GC". I think you are on the right track: use jmap and get a heap dump, then analyze it to see which application objects are surviving that should not be. You may need to take a couple of heap dumps and compare them against each other.
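As a sketch of that workflow (file names are placeholders): note that jmap -histo:live forces a full GC before it reports, so the histogram only contains objects that survived a collection, which is exactly what you want when comparing snapshots for growth.

    jmap -histo:live <pid> > histo-1.txt
    # ... exercise the application for a while ...
    jmap -histo:live <pid> > histo-2.txt
    diff histo-1.txt histo-2.txt | less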
It's pretty hard to get a real memory leak in Java. If you're getting an out-of-memory error, it most likely means you're actually running out of memory. So to fix this, you need to find the references to unused objects and clear them manually, because otherwise the garbage collector can't free up the wasted memory.
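A hedged illustration of what "references to unused objects" usually looks like (my own example): a long-lived collection that nothing ever removes entries from, so the GC is never allowed to reclaim them:

    import java.util.HashMap;
    import java.util.Map;

    public class SessionCache {
        // Long-lived static map: entries put here stay reachable forever
        // unless something explicitly removes them, so the GC cannot free them.
        private static final Map<String, byte[]> CACHE = new HashMap<>();

        static void handleRequest(String sessionId) {
            CACHE.put(sessionId, new byte[64 * 1024]); // never removed -> grows without bound
        }

        // The fix: remove entries when they are no longer needed
        // (or use a bounded / expiring cache).
        static void endSession(String sessionId) {
            CACHE.remove(sessionId);
        }
    }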
When the JVM can't allocate any more memory, the garbage collector should automatically run.