I have a machine with 10 GB of RAM and I am running 6 Java processes with the -Xmx option set to 2 GB. The probability of all 6 processes running simultaneously and each consuming the full 2 GB is very low, but I still want to understand the worst case.
What happens when all 6 processes consume a little less than 2 GB at the same instant, so that the JVMs have not yet started garbage collection, but the sum of the memory held by the 6 processes exceeds the available RAM?
Will this crash the server, or will it slow down the processing?
You should expect that each JVM could use more than 2 GB, because the heap is just one memory region. You also have
shared libraries
thread stacks
direct memory
native memory used by shared libraries
perm gen.
This means that setting a maximum heap of 2 GB doesn't mean your process will use at most 2 GB.
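You can see this for yourself by comparing the process's resident set size (RSS) with the heap limit; with -Xmx2g the RSS will typically settle somewhere above 2 GB once the heap is in full use. A rough sketch for Linux (MyApp is just a placeholder for your main class):
# find the PID of the Java process (MyApp is a placeholder name)
pid=$(pgrep -f MyApp)
# RSS and virtual size, both in KB; RSS is what competes for your 10 GB of RAM
ps -o rss=,vsz= -p "$pid"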
Your processes should perform well until they get to the point where some of the heap has been swapped out and a GC is performed. A GC assumes random access to the whole heap, and at that point your system could start swapping like mad. If you have an SSD for swap, your system is likely to stop, or almost stop, for very long periods of time. If you have Windows (which I have found is worse than Linux in this regard) and an HDD, you might not get control of the machine back and have to power cycle it.
I would suggest either reducing the heap to, say, 1.5 GB at most, or buying more memory. You can get 8 GB for about $100.
Your machine will start swapping. As long as each java process uses only a small part of the memory it has allocated, you won't notice the effect, but if they all garbage collect at the same time, accessing all of their memory, your hard disk will have 100% utilization and the machine will "feel" very, very slow.
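If you suspect you are already in that situation, a quick check on Linux is whether any swap is actually in use:
# the Swap row shows how much has been pushed out to disk;
# a non-zero, growing "used" column during GC pauses confirms the diagnosis
free -h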
Related:
Java using much more memory than heap size (or size correctly Docker memory limit)
Growing Resident Size Set in JVM
My Java service is running on a host with 16 GB of RAM, with -Xms and -Xmx set to 8 GB.
The host is running a few other processes.
I noticed that my service is consuming more memory over time.
I ran this command ps aux | awk '{print $6/1024 " MB\t\t" $11}' | sort -n on the host and recorded the memory usage by my java service.
When the service started, it used about 8 GB of memory (as -Xms and -Xmx are set to 8 GB), but after a week it was using about 9 GB or more; it consumed roughly 100 MB more memory per day.
I took a heap dump, restarted my service, and took another heap dump. I compared the two dumps and there was not much difference in heap usage: the dumps show that the service used about 1.3 GB before the restart and about 1.1 GB after.
From the process memory usage, my service is consuming more memory over time but that's not reported in the heap dump. How do I identify the increase in the memory usage in my service?
I set -Xms and -Xmx to 8 GB on a host with 16 GB of RAM. Did I set the min/max heap too high (50% of the total memory on the host)? Would that cause any issues?
OK, so you have told the JVM that it can use up to 8 GB for the heap, and you are observing a total memory usage increasing from 1.1 GB to 1.3 GB. That's not actually an indication of a problem per se. Certainly, the JVM is not using anywhere near as much memory as you have said it can use.
The second thing to note is that it is unclear how you are measuring memory usage. You should be aware that a JVM uses a lot of memory that is NOT Java heap memory. This includes:
The memory used by the java executable itself.
Memory used to hold native libraries.
Memory used to hold class metadata and bytecodes (in "metaspace") and JIT-compiled native code (in the code cache)
Thread stacks
Off-heap memory allocations requested by (typically) native code.
Memory mapped files and shared memory segments.
Some of this usage is reported (if you use the right tools).
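One of those tools is Native Memory Tracking, which breaks the non-heap usage down by category and can diff against a baseline, which is handy for a "grows ~100 MB per day" problem. A minimal sketch, assuming a HotSpot JVM (the jar name is just a placeholder, and NMT adds a little overhead):
# start the service with native memory tracking enabled
java -XX:NativeMemoryTracking=summary -Xms8g -Xmx8g -jar my-service.jar
# take a baseline, then diff against it hours or days later
jcmd <pid> VM.native_memory baseline
jcmd <pid> VM.native_memory summary.diff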
The third thing is that the actual memory used by the Java heap can vary a lot. The GC typically works by copying live objects from one "space" to another, so it needs a fair amount of free space to do this. Once it has finished a run, the GC looks at the ratio of free space to used space; if that ratio is too small, it requests more memory from the OS. As a consequence, there can be substantial step increases in total memory usage even though the actual usage (in non-garbage objects) is only increasing gradually. This is quite likely for a JVM that has only started recently, due to various "warm-up" effects.
Finally, the evidence you have presented does not say (to me!) that there is no memory leak. I think you need to take the heap dumps further apart. I would suggest taking one dump 2 hours after startup, and the second one 2 or more hours later. That would give you enough "leakage" to show up in a comparison of dumps.
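For the dumps themselves, something along these lines should work (the paths are just examples; ":live" forces a full GC first so the comparison only contains reachable objects):
# roughly 2 hours after startup
jmap -dump:live,format=b,file=/tmp/after-2h.hprof <pid>
# 2 or more hours later; compare the two dumps in MAT or VisualVM
jmap -dump:live,format=b,file=/tmp/after-4h.hprof <pid>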
From the process memory usage, my service is consuming more memory over time but that's not reported in the heap dump. How do I identify the increase in the memory usage in my service?
I don't think you need to do that. I think that the increase from 1.1GB to 1.3GB in overall memory usage is a red herring.
Instead, I think you should focus on the memory leak that the other evidence is pointing to. See my suggestion above.
Did I set the min/max heap too high (50% of the total memory on the host)? Would that cause any issues?
Well ... a larger heap is going to show more pronounced performance degradation when the heap gets full. The flip side is that a larger heap takes longer to fill up ... assuming that you have a memory leak ... which means it could take longer to diagnose the problem, or to be sure that you have fixed it.
But the flip side of the flip side is that this might not be a memory leak at all. It could also be your application or a 3rd-party library caching things. A properly implemented cache could use a lot of memory, but if the heap gets too close to full, it should respond by breaking links [1] and evicting cached data. Hence, not a memory leak ... hypothetically.
[1] Or if you use SoftReferences, the GC will break them for you.
This is related to my previous question.
I set -Xms to 512M and -Xmx to 6G for one Java process, and I have three such processes.
My total RAM is 32 GB, of which 2 GB is always occupied.
I executed the free command to make sure that a minimum of 27 GB was free. But my jobs require only 18 GB at most at any time.
It was running fine. Each job occupied around 4 to 5 GB but used around 3 to 4 GB. I understand that -Xmx doesn't mean the process will always occupy 6 GB.
When another process X was started on the same server by another user, it occupied 14 GB. Then one of my processes failed.
I understand that I need to increase RAM or schedule the colliding jobs so they don't overlap.
The question is: how can I force my job to always reserve 6 GB, and why does it throw a "GC limit reached" error in this case?
I used VisualVM to monitor them, and jstat as well.
Any advice is welcome.
Simple answer: -Xmx is not a hard limit on the JVM's total memory. It only limits the heap available to Java code inside the JVM. Lower your -Xmx and you may stabilize the process memory at a size that suits you.
Long answer: the JVM is a complex machine. Think of it as an OS for your Java code. The virtual machine needs extra memory for its own housekeeping (e.g. GC metadata), for thread stacks, for "off-heap" memory (e.g. memory allocated by native code through JNI, buffers), etc.
-Xmx only limits the heap size for objects: the memory that's dealt with directly in your Java code. Everything else is not accounted for by this setting.
There's a newer JVM setting, -XX:MaxRAM, that tries to keep the entire process memory within that limit.
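As a quick sanity check of what the ergonomics derive from it, you can ask the JVM to print its final flag values (recent HotSpot JVMs; by default the maximum heap comes out at roughly 1/4 of MaxRAM):
# pretend the machine has 4 GB and see what heap size the JVM picks
java -XX:MaxRAM=4g -XX:+PrintFlagsFinal -version | grep -i maxheapsize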
From your other question:
It is multi-threaded: 100 reader and 100 writer threads. Each one has its own connection to the database.
Keep in mind that the OS's I/O buffers also need memory to do their work.
If you have over 200 threads, you also pay the price: N * (stack size), plus approximately N * (TLAB size) reserved in the young generation for each thread (dynamically resizable):
java -Xss1024k -XX:+PrintFlagsFinal 2> /dev/null | grep -i tlab
size_t MinTLABSize = 2048
intx ThreadStackSize = 1024
Approximately half a gigabyte just for this (and probably more)!
Thread Stack Size (in Kbytes). (0 means use default stack size)
[Sparc: 512; Solaris x86: 320 (was 256 prior in 5.0 and earlier); Sparc 64 bit: 1024; Linux amd64: 1024 (was 0 in 5.0 and earlier); all others 0.] - Java HotSpot VM Options; Linux x86 JDK source
In short: -Xss (stack size) defaults depend on the VM and OS environment.
Thread Local Allocation Buffers are more intricate; they help against allocation contention/resource locking. For an explanation of the setting and of how TLABs work, see: TLAB allocation, and TLABs and Heap Parsability.
Further reading: "Native Memory Tracking" and Q: "Java using much more memory than heap size"
why does it throw a "GC limit reached" error in this case?
"GC overhead limit exceeded". In short: each GC cycle reclaimed too little memory (by default the error is thrown when the JVM spends over 98% of its time in GC while recovering less than 2% of the heap) and the ergonomics decided to abort. Your process needs more memory.
When another process X was started on the same server by another user, it occupied 14 GB. Then one of my processes failed.
Another point: when running multiple large-memory processes back-to-back, consider this:
java -Xms28g -Xmx28g <...>;
# above process finishes
java -Xms28g -Xmx28g <...>; # crashes, can't allocate enough memory
When the first process finishes, the OS needs some time to zero out the memory deallocated by the ending process before it can give those physical memory regions to the second process. This may take a while, and until then you cannot start another "big" process that immediately asks for the full 28 GB of heap (observed on WinNT 6.1). This can be worked around with:
Reduce -Xms so the allocation happens later in the second process's lifetime (sketched below)
Reduce overall -Xmx heap
Delay the start of the second process
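For example, the first workaround keeps the 28 GB cap but avoids committing it all at startup:
# heap grows on demand instead of being committed up front
java -Xms512m -Xmx28g <...>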
Imagine I have a 64-bit machine with a total amount of memory (physical + virtual) equal to 6 GB. Now, what will happen when I run 5 applications with -Xmx2048m at the same time? They won't fail on start, since this is a 64-bit OS, but what will happen when they all need to use the 2 GB of memory which I've set for them?
Question: Is it possible that there will be some memory leaks or something? What will happen? What are possible consequences of doing that?
I've read those questions: this and this, but they don't actually answer my question, since I want to run more than one application; each one on its own doesn't exceed the memory limit, but all together they do.
What you will experience is increased swapping during major garbage collections and thus increased GC pause-times.
When used memory exceeds physical memory (well, in fact even before that) modern OSs will write some of the less-used memory to disk. With a JVM this will most probably be some parts of the heap's tenured generation. When a major GC occurs it will have to touch all of the heap so it has to swap back in all of the pages it has offloaded to disk resulting in heavy IO-activity and increased CPU-load.
With multiple JVMs that have few major GCs this might work out with only slightly increased pause times, since one JVM's heap should easily fit into physical memory; but with one JVM's heap exceeding physical memory, or simultaneous major GCs from several JVMs, this can result in lots of swapping of memory pages in and out ("thrashing"), and GCs might take a really long time (up to several minutes).
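On Linux you can check whether the long pauses actually coincide with swap activity; a simple way, assuming vmstat is available:
# watch the si/so columns (pages swapped in/out per second);
# large values during a major GC mean the heap is being paged back in
vmstat 5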
-Xmx merely reserves virtual address space, and it is virtual, not physical.
How this reservation behaves depends on the OS:
Windows: the limit is defined by swap + physical memory. The exception is large pages (which need to be enabled with a group policy setting), which are limited by physical RAM (swap cannot be used).
Linux behaviour is more complicated (it depends on vm.overcommit_memory and related sysctls, and on various flags passed to the mmap syscall), and some of this (but not all) can be controlled by JVM configuration flags. The behaviour can be:
-Xms can exceed total RAM + swap
-Xmx is capped by the available physical RAM
I am running a Java application on a Linux cluster with SLURM as the resource manager. To run my application I have to tell SLURM how much memory I will need, and SLURM runs my application in a kind of VM with the specified amount of memory. To tell my Java application how much memory it can use, I use the "-Xmx##g" parameter, which I set 1 GB lower than the amount I requested from SLURM.
My problem is that I am exceeding the amount of memory I requested from SLURM and it terminates my application. It seems that the JVM uses about 1 GB of memory on top of the heap, probably for things like GC and so on.
Is there a possibility to restrict the total size of the JVM, or at least to tame it?
Cheers,
Markus
The maximum heap setting only limits the maximum heap. There are other memory regions which you have not limited (a sketch of flags that cap some of them follows this list), such as
thread stacks
perm gen
shared libraries
native memory used by libraries
direct memory
memory mapped files.
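Several of these regions can be capped individually. A hedged sketch of common HotSpot flags (the sizes and the MyApp class name are only placeholders; on Java 7 and earlier use -XX:MaxPermSize instead of -XX:MaxMetaspaceSize):
# -Xss                    : per-thread stack size
# -XX:MaxMetaspaceSize    : class metadata (the perm gen replacement)
# -XX:MaxDirectMemorySize : direct (off-heap) ByteBuffer allocations
java -Xmx4g -Xss512k -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=256m MyApp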
If you want to limit the over all memory usage you need to be clear about whether you are limiting virtual memory or resident memory. Often monitoring tools make the mistake of monitoring virtual memory which shows a surprising lack of understanding of how applications work, or even why you monitor an application in the first place.
You want to monitor resident memory usage which means you need to know how much memory your application uses over time apart from the heap, then work out how much heap you can have plus some margin for error.
To tell my Java application how much memory it can use, I use the "-Xmx##g" parameter, which I set 1 GB lower than the amount I requested from SLURM.
As a guess, I would start with half a gigabyte, -Xmx512m, watch what the peak resident memory is, and increase the heap if you find there is always a few hundred MB of headroom.
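On Linux the kernel already records the peak resident set size for you, which is exactly the number you want for this sizing exercise:
# VmHWM is the process's peak resident set size ("high water mark")
grep VmHWM /proc/<pid>/status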
BTW 1 GB of memory doesn't cost that much these days (as little as $5). Your time could be worth much more than the resources you are trying to save.
I am currently working on a project where I need an in-memory structure for my map task. I have made some calculations and I can say that I don't need more than 600 MB of memory for each map task.
But the thing is that after a while I run into Java heap space problems or the GC overhead limit. I don't know how this can be possible.
Here are some more details. I have a system with two quad-core CPUs and 12 GB of RAM, which means I can have up to 8 map tasks running at the same time. I am building a tree, so I have an iterative algorithm that runs a map-reduce job for every tree level. The algorithm works fine for small datasets, but for a medium dataset it has heap space problems: it reaches a certain tree level and then runs out of heap space or hits GC overhead problems. At that point I did some calculations and saw that each task needs no more than 100 MB of memory, so with 8 tasks I am using about 800 MB of memory. I don't know what is going on. I even updated my hadoop-env.sh file with these lines:
export HADOOP_HEAPSIZE=8000
export HADOOP_OPTS=-XX:+UseParallelGC
What is the problem? Do these lines even override the Java options for my system? Using the parallel GC is something I saw on the internet, and it was recommended for systems with multiple cores.
Edit: here are some observations after monitoring heap space and total memory.
I consume about 3500 MB of RAM when running 6 tasks at the same time. That means that the jobtracker, tasktracker, namenode, datanode, secondary namenode, my operating system and the 6 tasks all together use about 3500 MB of RAM, which is a very reasonable amount. So why do I get a GC overhead limit error?
I follow the same algorithm for every tree level; the only thing that changes is the number of nodes per level. Having many nodes in a tree level does not add much overhead to my algorithm. So why can't the GC work well?
If your maximum heap size hasn't been changed, it will default to 1/4 of main memory, i.e. about 3 GB, and with some overhead for non-heap usage the total could be about 3.5 GB.
I suggest you try
export HADOOP_OPTS="-XX:+UseParallelGC -Xmx8g"
to set the maximum memory to 8 GB.
By default the maximum heap size is 1/4 of main memory (unless you are running a 32-bit JVM on Windows). So if your maximum heap setting is being ignored, the limit will still be 3 GB.
Whether you use one GC or another, it won't make much difference to when you run out of memory.
I suggest you take a heap dump with -XX:+HeapDumpOnOutOfMemoryError and open it in a profiler, e.g. VisualVM, to see why it's using so much memory.
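A sketch of how that could be added to the same HADOOP_OPTS variable you are already setting (the dump path is just an example):
export HADOOP_OPTS="-XX:+UseParallelGC -Xmx8g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/mapred-oom.hprof"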