JVM issues with a large in-memory object - java

I have a binary that loads a list of short strings on startup and stores it in memory as a map from string to protobuf (which also contains the string). (Not ideal, but that design is hard to change due to legacy issues.)
Recently that list has grown from ~2M to ~20M entries, causing the process to fail while constructing the map.
First I got OutOfMemoryError: Java heap space.
When I increased the heap size using -Xms and -Xmx, we ran into GC overhead limit exceeded.
It runs on a 64-bit Linux machine with 15 GB of available memory and the following JVM args (I increased the RAM from 10 GB to 15 GB and the heap flags from 6000M to 9000M):
-Xms9000M -Xmx9000M -XX:PermSize=512m -XX:MaxPermSize=2018m
This binary does a whole lot of things and serves live traffic, so I can't afford for it to get stuck occasionally.
Edit: I eventually did the obvious thing, which is fixing the code (changing from HashMap to ImmutableSet) and adding more RAM (-Xmx11000M).

I'm looking for a temporary solution, if that's possible, until we have a more scalable one.
First, you need to figure out if the "OOME: GC overhead limit exceeded" is due to the heap being:
too small ... causing the JVM to do repeated Full GCs, or
too large ... causing the JVM to thrash the virtual memory when a Full GC is run.
You should be able to distinguish these two cases by turning on and examining the GC logs, and using OS-level monitoring tools to check for excessive paging loads. (When checking the paging levels, also check that the problem isn't due to competition for RAM between your JVM and another memory-hungry application.)
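On the Java 6/7-era HotSpot JVM implied by the PermGen flags above, GC logging can be turned on with flags along these lines (the log path is just an illustration):
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/myapp-gc.log
Frequent back-to-back Full GC entries that reclaim almost nothing suggest the heap is too small; long collections combined with heavy paging in vmstat or top suggest the heap is too large for the available RAM.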
If the heap is too small, try making it bigger. If it is too big, make it smaller. If your system is showing both symptoms ... then you have a big problem.
You should also check that "compressed oops" is enabled for your JVM, as that will reduce your JVM's memory footprint. The -XshowSettings option lists the settings in effect when the JVM starts. Use -XX:+UseCompressedOops to enable compressed oops if they are disabled.
(You will probably find that compressed oops are enabled by default, but it is worth checking. This would be an easy fix ... )
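One way to check on a HotSpot JVM is to dump the final flag values (the grep is just a convenience):
java -XX:+PrintFlagsFinal -version | grep UseCompressedOops
If it prints false, add -XX:+UseCompressedOops to the startup flags. Compressed oops only apply to heaps up to roughly 32 GB, so a 9 GB heap qualifies.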
If none of the above work, then your only quick fix is to get more RAM.
But obviously, the real solution is to reengineer the code so that you don't need a huge (and increasing over time) in-memory data structure.
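In the meantime, as a stop-gap in the spirit of the OP's own edit, pre-sizing the map (and freezing it once loading is done) avoids the repeated rehashing and temporary garbage that a default-sized HashMap generates while 20M entries are inserted. A minimal sketch, where MyProto and loadProtos() are hypothetical stand-ins for the real protobuf type and loader:
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: MyProto and loadProtos() are hypothetical stand-ins.
Map<String, MyProto> buildIndex(List<MyProto> protos) {
    // Capacity chosen so the map never rehashes at the default 0.75 load factor.
    Map<String, MyProto> byKey = new HashMap<String, MyProto>((int) (protos.size() / 0.75f) + 1);
    for (MyProto proto : protos) {
        byKey.put(proto.getKey(), proto);
    }
    return Collections.unmodifiableMap(byKey);
}
This only reduces the construction overhead; it does not change the steady-state footprint of 20M entries.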

Related

Java Heap Dump: How to find the objects/classes that are taking memory (1. io.netty.buffer.ByteBufUtil, 2. byte[] array)

I found that the memory (RAM) consumption of one of my Spring Boot projects is increasing day by day. When I uploaded the jar file to the AWS server, it was taking 582 MB of RAM (the maximum allocated RAM is 1500 MB), but the usage grows by 50-100 MB each day, and today, after 5 days, it is at 835 MB. Right now the project has 100-150 users with normal usage of the REST APIs.
Because of this increase in RAM usage, the application has gone down a couple of times with the following error (found in the logs):
Exception in thread "http-nio-3384-ClientPoller" java.lang.OutOfMemoryError: Java heap space
To resolve this, I learned that with a Java heap dump I can find the objects/classes that are taking the memory. So I created a heap dump with jmap on the command line and uploaded it to Heap Hero and the Eclipse Memory Analyzer Tool. In both of them I found the following:
1. The total wasted memory is 64.69 MB (73%).
2. Of this, 34.06 MB is taken by byte[] arrays and LinkedHashMap[] entries, which I have never used anywhere in my project. I searched for them in my project but found nothing.
3. The following two large objects take 32 MB and 20 MB respectively.
1. Java Static io.netty.buffer.ByteBufUtil.DEFAULT_ALLOCATOR
2. Java Static com.mysql.cj.jdbc.AbandonedConnectionCleanupThread.connectionFinalizerPhantomRefs
So I tried to find this netty.buffer in my project, but I couldn't find anything that matched netty or buffer.
Now my question is: how can I reduce this memory leak, or how can I find the exact objects/classes/variables consuming the memory, so that I can reduce the heap usage?
I know some experts will ask for the source code or something similar, but I believe that from the heap dump we can find the memory leak or the live objects that are in memory. I am looking for that option, or anything else that reduces this heap usage!
I have been working on this issue for the past 3 weeks. Any help would be appreciated.
Thank you!
Start by enabling the JVM native memory tracker to get an idea of which part of the memory is increasing, by adding the flag -XX:NativeMemoryTracking=summary. There is some performance overhead according to the documentation (5-10%), but if this isn't an issue I would recommend running the JVM with this flag enabled even in production.
Then you can check the values using jcmd <PID> VM.native_memory (there's a good writeup in this answer: Java native memory usage)
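Roughly, the workflow looks like this (the PID and jar name are placeholders):
java -XX:NativeMemoryTracking=summary -jar myapp.jar
jcmd <PID> VM.native_memory summary
jcmd <PID> VM.native_memory baseline
jcmd <PID> VM.native_memory summary.diff
Running the baseline and the diff a few hours apart shows which category (Java heap, thread stacks, internal, etc.) is actually growing.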
If there is indeed a big chunk of native memory allocated, it's likely this is allocated by Netty.
How do you run your application in AWS? If it's running in a Docker image, you might have stumbled upon this issue: What would cause a java process to greatly exceed the Xmx or Xss limit?
If this is the case, you may need to set the environment variable MALLOC_ARENA_MAX if your application uses native memory (which Netty does) and runs on a server with a large number of cores. It's perfectly possible that the JVM allocates this memory for Netty but doesn't see any reason to release it, so it appears to only keep growing.
If you want to control how much native memory can be allocated by Netty, you can use the JVM flag -XX:MaxDirectMemorySize for this (I believe the default is the same value as -Xmx) and lower it in case your application doesn't require that much direct memory.
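For example (the values are purely illustrative and would need to be validated against your own workload):
export MALLOC_ARENA_MAX=2
java -Xms1024m -Xmx1024m -XX:MaxDirectMemorySize=256m -jar myapp.jar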
JVM memory tuning is a complex process, and it becomes even more complex when native memory is involved - as the linked answer shows, it's not as easy as simply setting the -Xms and -Xmx flags and expecting that no more memory will be used.
A heap dump alone is not enough to detect memory leaks.
You need to look at the difference between two consecutive heap snapshots, both taken after forcing a GC.
Or you need a profiling tool that can give the generation count for each class.
Then you should look only at your domain objects (not generic objects like byte arrays or strings, etc.) that survived the GC and carried over from the old snapshot to the new one.
Or, if you are using a profiling tool, look for old domain objects that are still alive and keep growing across many generations.
Objects that live for many generations and keep growing are still referenced somewhere, and the GC is not able to reclaim them. However, living for many generations alone is not enough to indicate a leak, because cached or static objects may stay around for many generations. The other important factor is that they keep growing.
After you have detected which objects are being leaked, you can use the heap dump to analyse those objects and find what references them.
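One way to take the two snapshots, assuming jmap can be run against the live process - the live option forces a full GC before the dump, which is exactly what this comparison needs:
jmap -dump:live,format=b,file=heap-before.hprof <PID>
(let the application run, and presumably leak, for a while)
jmap -dump:live,format=b,file=heap-after.hprof <PID>
Loading both files into Eclipse MAT and comparing the histograms then shows which of your own classes grew between the snapshots.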

Memory running outside of Xmx and Xms

I have run into an issue where a Java application I wrote is causing hardware performance problems. The problem (I'm fairly certain) is that a few of the machines I'm running the application on only have 1 GB of memory. When I start my Java application, I'm setting the heap size to -Xms512m -Xmx1024m.
My first question: is my assumption correct that this will obviously cause performance problems, because I'm allocating all of the machine's memory to the Java heap?
This leads to another question. I'm running jconsole against the app and monitoring its memory usage. What I'm seeing is that the app consumes about 30 MB at startup, climbs to about 150 MB, and then the garbage collector runs and it drops back down to 30 MB. What I'm also seeing, using top on the pid, is that the application starts at about 6% memory and then slowly climbs to about 20%. I do not understand this. Why would it only reach 20% memory usage when I'm allocating 1 GB to it? Shouldn't it go to 100%? Also, why is it using that much memory (20%) when the app never appears to use more than 150 MB?
I think it's pretty obvious I need to adjust my -Xms and -Xmx and that should resolve the issue, but I'm trying to understand better what exactly is happening.
Two possibilities for the memory use:
Your app just does not use that much memory
Or
Your app does not use that much memory fast enough.
What happens:
The garbage collector has several points where it will execute:
Just scheduled: it will clean up easy-to-remove objects
Full collection: this runs when you hit the configured memory limits.
If option 1, the generally much lower-impact quick collection, can keep your memory use under control, the JVM will not run a full collection unless the GC options are set to run one on a schedule.
With your application I would start by setting lower -Xmx/-Xms values, so that more resources are guaranteed to be left for the OS and perhaps some paging is prevented.
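On a 1 GB machine, something along these lines leaves headroom for the OS and other processes (the values are illustrative, not a recommendation):
java -Xms128m -Xmx512m -jar myapp.jar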

Weird behavior of Java -Xmx on large amounts of RAM

You can control the maximum heap size in java using the -Xmx option.
We are experiencing some weird behavior on Windows with this switch. We run some very beefy servers (think 196 GB of RAM). The Windows version is Windows Server 2008 R2.
The Java version is 1.6.0_18, 64-bit (obviously).
Anyway, we were having some weird bugs where processes were quitting with out-of-memory errors even though the process was using much less memory than specified by the -Xmx setting.
So we wrote a simple program that allocates a 1 GB byte array each time you press the Enter key and initializes the byte array to random values (to prevent any memory compression, etc.).
Basically, what's happening is that if we run the program with -Xmx35000m (roughly 35 GB) we get an out-of-memory error when we hit 25 GB of process space (measured with the Windows Task Manager). We hit this after allocating 24 GB worth of 1 GB blocks, BTW, so that checks out.
Simply specifying a larger value for the -Xmx option lets the program get to larger amounts of RAM before failing.
So, what is going on? Is -Xmx just "off"? BTW, we need to specify -Xmx55000m to get a 35 GB process space...
Any ideas on what is going on?
Is there a bug in the Windows JVM?
Is it safe to simply set the -Xmx option bigger, even though there is a disconnect between the -Xmx option and what is going on process-wise?
Theory #1
When you request a 35 GB heap using -Xmx35000m, what you are actually saying is that the total space used for the heap may grow to 35 GB. But the total space consists of the tenured object space (for objects that survive multiple GC cycles), the Eden space for newly created objects, and other spaces into which objects are copied during garbage collection.
The issue is that some of these spaces are not and cannot be used for allocating new objects. So in effect, you "lose" a significant percentage of your 35 GB to overheads.
There are various -XX options that can be used to tweak the sizes of the respective spaces, etc. You might try fiddling with them to see if they make a difference. Refer to this document for more information. (The commonly used GC tuning options are listed in section 8. The -XX:NewSize option looks promising ...)
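As an illustration of the kind of tweaking meant here (the sizes are made up and would need to be tested; AllocTest stands for the benchmark class):
java -Xmx35000m -XX:NewSize=2g -XX:MaxNewSize=2g -XX:SurvivorRatio=8 -XX:+PrintGCDetails AllocTest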
Theory #2
This might be happening because you are allocating huge objects. IIRC, objects above a certain size can be allocated directly into the Tenured Object space. In your (highly artificial) benchmark, this might result in the JVM not putting stuff into the Eden space, and therefore being able to use less of the total heap space than is normal.
As an experiment, try changing your benchmark to allocate lots of small objects, and see if it manages to use more of the available space before OOME-ing.
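A sketch of such a variant, allocating 1 MB blocks instead of 1 GB blocks (purely illustrative; the class name and block size are arbitrary):
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SmallBlockAlloc {
    public static void main(String[] args) {
        List<byte[]> blocks = new ArrayList<byte[]>();
        Random random = new Random();
        try {
            while (true) {
                byte[] block = new byte[1024 * 1024]; // 1 MB instead of 1 GB
                random.nextBytes(block);              // touch the memory so it is really committed
                blocks.add(block);
                if (blocks.size() % 1024 == 0) {
                    System.out.println("Allocated " + (blocks.size() / 1024) + " GB");
                }
            }
        } catch (OutOfMemoryError e) {
            int allocatedGb = blocks.size() / 1024;
            blocks.clear(); // release everything so the println below has room to run
            System.out.println("OOME after roughly " + allocatedGb + " GB");
        }
    }
}
If this variant gets noticeably closer to the -Xmx value before failing, that supports the huge-object theory.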
Here are some other theories that I would discount:
"You are running into OS-imposed limits." I would discount this, since you said that you can get significantly greater memory utilization by increasing the -Xmx... setting.
"The Windows task manager is reporting bogus numbers." I would discount this because the numbers reported roughly match the 25Gb that you think your application had managed to allocate.
"You are losing space to other things; e.g. the permgen heap." AFAIK, the permgen heap size is controlled and accounted independently of the "normal" heaps. Other non-heap memory usage is either a constant (for the app) or dependent on the app doing specific things.
"You are suffering from heap fragmentation." All of the JVM garbage collectors are "copying collectors", and this family of collectors has the property that heap nodes are automatically compacted.
"JVM bug on Windows." Highly unlikely. There must be tens of thousands of 64bit Java on Windows installations that maximize the heap size. Someone else would have noticed ...
Finally, if you are NOT doing this because your application requires you to allocate memory in huge chunks, and hang onto it "for ever" ... there's a good chance that you are chasing shadows. A "normal" large-memory application doesn't do this kind of thing, and the JVM is tuned for normal applications ... not anomalous ones.
And if your application really does behave this way, the pragmatic solution is to just set the -Xmx... option larger, and only worry if you start running into OS-level issues.
To get a feeling for what exactly you are measuring you should use some different tools:
the Windows Task Manager (I only know Windows XP, but I heard rumours that the Task Manager has improved since then.)
procexp and vmmap from Sysinternals
jconsole from the JVM (you are using the Sun/Oracle HotSpot JVM, aren't you?)
Now you should answer the following questions:
What does jconsole say about the used heap size? How does that differ from procexp?
Does the value from procexp change if you fill the byte arrays with non-zero numbers instead of keeping them at 0?
Did you try turning on verbose GC output to find out why the last allocation fails? Is it because the OS refuses to allocate heap beyond 25 GB to the native JVM process, or is it because the GC is hitting some sort of limit on the maximum memory it can manage? I would also recommend connecting to the process with jconsole to see the state of the heap just before the allocation failure. Tools like the Sysinternals Process Explorer might also give better detail about where the failure occurs, if it is inside the JVM process.
Since the process is dying at 25 GB and you have a generational collector, maybe the rest of the generations are consuming the other 10 GB. I would recommend installing JDK 1.6_u24 and using jvisualvm with the VisualGC plugin to see what the GC is doing; in particular, look at the sizes of all the generations to see how the 35 GB heap is being carved up into different regions by the GC / VM memory manager.
See this link if you are not familiar with generational GC: http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#generation_sizing.total_heap
I assume this has to do with heap fragmentation. The free memory is probably not available as a single contiguous free area, and when you try to allocate a large block the allocation fails because the requested memory cannot be provided in a single piece.
The memory displayed by the Windows Task Manager is the total memory allocated to the process, which includes memory for code, stack, perm gen, and heap.
The memory you measure with your test program is the amount of heap the JVM makes available to running Java programs.
Naturally, the total memory Windows allocates to the JVM should be greater than what the JVM makes available to your program as heap memory.

Can Sun JVM handle gigantic heap sizes without problems, and how?

I have heard several people claim that you cannot scale the JVM heap size up. I've heard claims of the practical limit being 4 gigabytes (I heard an IBM consultant say that), 10 gigabytes, 32 gigabytes, and so on... I simply cannot believe any of those numbers and have been wondering about the issue for a while now.
So, I have a three-part question that I hope someone with experience can answer:
Given the following case how would you tune the heap and GC settings?
Would there be noticeable hiccups (JVM pauses, etc.) that end users would notice?
Should this really still work? I think it should.
The case:
64 bit platform
64 cores
64 gigabytes of memory
The application server is client-facing (i.e. a JBoss/Tomcat web application server) - complete JVM pauses would probably be noticed by end users
Sun JVM, probably 1.5
To prove I am not asking you guys to do my homework, this is what I came up with:
-XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:+UnlockDiagnosticVMOptions -XX:-EliminateZeroing -Xmn768m -Xmx55000m
CMS should reduce the number of pauses, although it comes with overhead. The other CMS settings seem to default automatically based on the number of CPUs, so they look sane to me. The rest that I added are extras that might generally help or hurt performance, and they should probably be tested.
Definitely.
I think it's going to be difficult for anybody to give you anything more than general advice, without having further knowledge of your application.
What I would suggest is that you use VisualGC (or the VisualGC plugin for VisualVM) to actually look at what the garbage collection is doing when your app is running. Once you have a greater understanding of how the GC is working alongside your application, it'll be far easier to tune it.
#1. Given the following case how would you tune the heap and GC settings?
First, having 64 gigabytes of memory doesn't imply that you have to use them all for one JVM. Actually, it rather means you can run many of them. Then, it is impossible to answer your question without any access to your machine and application to measure and analyse things (knowing what your application is doing isn't enough). And no, I'm not asking to get access to your environment :)
#2. Would there be noticeable hickups (pauses of JVM etc) that would be noticed by the end users?
The goal of tuning is to find a good compromise between the frequency and the duration of (major) GCs. With a ~55 GB heap, GC won't be frequent but will take noticeable time, for sure (the bigger the heap, the longer the major GC). Using a parallel or concurrent garbage collector will help on multiprocessor systems but won't entirely solve this issue. Why do you need ~55 GB (this is mega ultra huge for a webapp, IMO)? That's my question. I'd rather run many clustered JVMs to handle the load if required (at some point, the database will become the bottleneck anyway with a data-oriented application).
#3. Should this really still work? I think it should.
Hmm... not sure I get the question. What is "this"? Instantiating a JVM with a big heap? Yes, it should. Is it equivalent to running several JVMs? No, certainly not.
PS: 4G is the maximum theoretical heap limit for the 32-bit JVM running on a 64-bit operating system (see Why can't I get a larger heap with the 32-bit JVM?)
PPS: On 64-bit VMs, you have 64 bits of addressability to work with resulting in a maximum Java heap size limited only by the amount of physical memory and swap space your system provides. (see How large a heap can I create using a 64-bit VM?)
Obviously the heap size is not unlimited, and the larger the heap, the more time your JVM will eventually spend on GC. Though it is possible to set the heap size quite high on a 64-bit JVM, I still don't think it's really practical. The better advice here is to run several JVMs with the same parameters, i.e. a cluster of JBoss/Tomcat nodes on the same physical machine; you will get better throughput.
EDIT: Your GC behavior also depends on the make-up of your heap. If you have a lot of short-lived objects and each request to the server creates a lot of those, then your GC will collect a lot of garbage very often, and on a large heap this will result in longer pauses. If you have very many long-lived objects (e.g. you cache most of your data in memory) and the number of short-lived objects is not that big, then having a bigger heap size is OK.
As Chris Rice already wrote, I wouldn't expect any obvious problems with the GC for heap sizes up to 32-64 GB, although there may of course be some aspect of your application logic that causes problems.
Not directly related to GC, but I would still recommend you to perform a realistic load test on your production system. I used to work on a project, where we had a similar setup (relatively large, clustered JBoss/Tomcat setup to serve a public web application) and without exaggeration, JBoss is not behaving very well under high load or with a high number of concurrent calls if you are using EJBs. JBoss is spending a lot of time in synchronized blocks when accessing and managing the EJB instance pools and if you opt for a cluster, it will even wait for intra-cluster network communication within these synchronized blocks. Be especially aware of poorly performing state replication, if you are using SFSBs.
Just to add some more switches I would use by default: -Xms55g can help reduce the ramp-up time because it frees Java from having to grow the heap from a smaller initial size and also allows better internal initial sizing of the memory areas.
Additionally, we have had good experience with NewSize giving a large young generation to get rid of short-term garbage: -XX:NewSize=1g. Most webapps create a lot of short-lived garbage that never survives request processing, so you can even make that bigger. With -Xms55g the VM reserves a large chunk already; maybe downsizing can help.
-Xincgc helps to clean the young generation incrementally and return the CPU to the user threads more often.
-XX:CMSInitiatingOccupancyFraction=70 - if you really fill all that memory, this starts the CMS garbage collection earlier.
-XX:+CMSIncrementalMode puts the CMS into incremental mode to return the CPU to the user threads more often.
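Put together, the command line being suggested looks roughly like this (the trailing ... stands for the rest of your existing options, and every flag should be verified against the jstat output mentioned next):
java -Xms55g -Xmx55g -XX:NewSize=1g -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=70 ...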
Attach to the process with jstat -gc -h 10 <pid> 1s and watch the GC working.
Will you really fill up the memory? I assume that 64 CPUs worth of request processing might even be able to work with less memory. What do you store in there?
Depending on your GC pause analysis, you may wish to enable incremental mode, whereby a long pause can be broken up over a period of time.
I have found that memory architecture plays a part at large memory sizes. Applications in general don't perform as well if they use more than one memory bank. The JVM appears to suffer as well, especially the GC, which has to sweep the whole heap.
If your application doesn't fit into one memory bank, it has to pull in memory which is not local to one processor and use memory local to another processor.
On Linux you can run numactl --hardware to see the layout of processors and memory banks.
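If the heap does fit into a single bank, one option (assuming numactl is installed) is to pin the JVM to one node so that all of its memory stays local; the node number and heap size here are illustrative:
numactl --hardware
numactl --cpunodebind=0 --membind=0 java -Xmx24g -jar myapp.jar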

Which heap size do you prefer?

I know there is no "right" heap size, but which heap size do you use in your applications (application type, jdk, os)?
The JVM Options -Xms (initial/minimum) and -Xmx (maximum) allow for controlling the heap size. What settings make sense under which circumstances? When are the defaults appropriate?
You have to try your application and see how it performs. For example, I always used to run IDEA out of the box, until I got this new job where I work on a huge monolithic project. IDEA was running very slowly and regularly throwing out-of-memory errors when compiling the full project.
The first thing I did was ramp the heap up to 1 GB. This got rid of the out-of-memory issues, but it was still slow. I also noticed IDEA was regularly freezing for 10 seconds or so, after which the used memory was cut in half only to ramp up again, and that pointed at garbage collection. I now use it with -Xms512m and -Xmx768m, but I also added -Xincgc to activate incremental garbage collection.
As a result, I've got my old IDEA back: it runs smoothly, doesn't freeze anymore, and never uses more than 600 MB of heap.
For your application you have to use a similar approach: try to determine the typical memory usage and tune your heap so the application runs well in those conditions. But also let advanced users tune the settings, to address out-of-the-ordinary data loads.
It depends on the application type. A desktop application is much different than a web application. An application server is much different than a standalone application.
It also depends on the JVM that you are using. JDK 5 and later JDK 6 include enhancements that help you understand how to tune your application.
Heap size is important, but it's also important to know how it interacts with the garbage collector.
JDK1.4 Garbage Collector Tuning
JDK5 Garbage Collector Tuning
JDK6 Garbage Collector Tuning
Actually, I always considered it very strange that Java limits the heap size. A native application can usually use as much heap as it wants, until it runs out of virtual address space. The only reason to limit the heap in Java seems to be the garbage collector, which has a certain kind of "laziness" and may not garbage collect objects unless there is a necessity to do so. That means that if you choose a heap that is too big, your app constantly uses more memory than is really necessary.
However, Sun has improved the GC a lot over the years, and to emulate the behavior of a native C app, I would set the initial heap size to 32 MB (for small programs) or 64 MB (for bigger ones) and the maximum to something between 1-2 GB. If your app really needs over 1 GB of memory, it is most likely broken (unless you deal with data objects that large), but I see no reason why your app should be killed just because it goes over a certain heap size.
Of course, this is referring to normal PCs. If you write Java code for mobile phones or other limited devices, you should probably adapt the initial and maximum heap size to the limitations of that device.
Typically I try not to use heaps larger than 1 GB.
It will cost you in major garbage collections.
Sometimes it is better to split your application across a few JVMs on the same machine rather than use large heap sizes.
A major collection with a large heap size can take >10 minutes (on applications with untuned GC).
This is entirely dependent on your application and any hardware limitations you may have. There is no one size fits all.
jmap can be used to have a look at what heap you are actually using and is a good starting point for right-sizing the heap.
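For instance, a class histogram of only the live objects (which triggers a full GC first) gives a quick lower bound on how much heap the application actually needs, and -heap shows the per-generation configuration and usage on pre-JDK 9 HotSpot:
jmap -histo:live <PID>
jmap -heap <PID>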
You need to spend quite some time in JConsole or VisualVM to get a clear picture of what the plateau memory usage is. Wait until everything is stable and you see the characteristic sawtooth curve of heap memory usage. The peaks should be at 70-80% of your heap, depending on which garbage collector you use.
Most garbage collectors trigger full GCs when heap usage reaches a certain percentage. This percentage is from 60% to 80% of max heap, depending on what strategy is involved.
1.3 GB for a heavy GUI application.
Unfortunately, on Linux the JVM seems to reserve 1.3 GB of virtual memory up front in that situation, which looks bad even if it's not needed (and causes a lot of confused grumbling from users).
On my most memory intensive app:
-Xms250M -Xmx1500M -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC
