I have a Java application which uses 10 threads. Each thread opens a Matlab session by using the Java Matlabcontrol library. I'm running the application on a cluster running CentOS 6.
The used physical memory (Max Memory) for the whole application is around 5 GB (as expected), but the reserved memory (Max Swap) is around 80 GB, which is far too high. Here is a short description from the cluster wiki:
A note on terminology: in LSF the Max Swap is the memory allocated by
an application and the Max Memory is the memory physically used (i.e.,
it is actually written to). As such, Max Swap > Max Memory. In most
applications Max Swap is about 10–20% higher than Max Memory
I think the problem is Java (or perhaps a mix of Java and Matlab). Java tends to allocate about 50% of the physically available memory on a compute node by default. A Java process assumes it can use all the resources available on the system it is running on. That is also the reason why it starts several hundred threads (although my application only uses 11 threads). It sees 24 cores and lots of memory even though the batch system reserves only 11 cores for the job.
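A quick way to confirm what the JVM believes it has is to print its view of the machine from inside the batch job (a minimal sketch; the class name is just for illustration):

    public class JvmResources {
        public static void main(String[] args) {
            // What the JVM thinks it can use on this node, regardless of the LSF reservation
            System.out.println("cores:    " + Runtime.getRuntime().availableProcessors());
            System.out.println("max heap: " + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
        }
    }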
Is there a workaround for this issue?
Edit: I've just found the following line in the Matlabcontrol documentation:
When running outside MATLAB, the proxy makes use of multiple
internally managed threads. When the proxy becomes disconnected from
MATLAB it notifies its disconnection listeners and then terminates all
threads it was using internally. A proxy may disconnect from MATLAB
without exiting MATLAB by calling disconnect().
This explains why so many threads are created, but it does not explain the large amount of reserved memory.
Edit2: Setting the MALLOC_ARENA_MAX=4 environment variable brought the amount of reserved memory down to 30 GB. What value of MALLOC_ARENA_MAX should I choose, and are there other tuning possibilities?
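If I understand the glibc defaults correctly (this is an assumption on my part): on 64-bit Linux, glibc can create up to 8 malloc arenas per core, and each arena reserves 64 MB of address space. On a 24-core node that is up to 24 × 8 × 64 MB ≈ 12 GB of reserved-but-mostly-unused memory per process; MALLOC_ARENA_MAX=4 caps this at 4 × 64 MB = 256 MB. Values between 1 and 4 seem common, trading a little allocator concurrency for less reserved memory.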
This is a tricky one and is a little hard to explain but I will give it a shot to see if anyone out there has had a similar issue + fix.
Quick background:
Running a large Java Spring app on Tomcat in a Docker container. The other containers are simple: one for a JMS queue and the other for MySQL. I run on Windows and have given Docker as much CPU as I have (and memory too). I have set JAVA_OPTS for Catalina to max out memory, as well as memory limits in my docker-compose, but the issue seems to be CPU-related.
When the app is idle it normally sits around 103% CPU (8 cores, 800% max). There is a process we use which (via a thread pool) runs some workers to go out and run some code. On my local host (no Docker in between) it runs very fast, spitting out logs at a good clip.
Problem:
When running in Docker and watching docker stats -a, I can see the CPU start to ramp up when this process begins. Meanwhile, in the logs, everything is flying by as expected while the CPU grows and grows. It gets close to 700% and then it seems to stall, but it doesn't die. When it hits this threshold I see the CPU drop drastically to < 5%, where it stays for a little while. At this time the logs stop printing, so I assume nothing is happening. Eventually it kicks back in, goes back to ~120%, and continues its process like nothing happened, sometimes re-spiking to ~400%.
What I am trying:
I have played around with the memory settings with no success, but it seems more like a CPU issue. I know Java in Docker is a bit wonky, but I have given it all the room I can on my beefy dev box, where locally this process runs without a hitch. I find it odd that the CPU spikes and then drops, but the container itself doesn't die or restart. Has anyone seen a similar issue, or know some ways to further attack this CPU issue with Docker?
Thanks.
There is a resource-allocation issue with the JVM in containers: it reads the overall system metrics instead of the container metrics. In Java 7 and 8, JVM ergonomics use the system's (instance's) metrics, such as the number of cores and the amount of memory, instead of the Docker-allocated resources (cores and memory). As a result, the JVM initializes a number of parameters based on that core count and memory, as below.
JVM memory footprint:
- Perm/metaspace
- JIT bytecode
- Heap size (by JVM ergonomics, ¼ of instance memory)
CPU:
- Number of JIT compiler threads
- Number of garbage collection threads
- Number of threads in the common fork-join pool
Therefore, the containers tend to become unresponsive due to high CPU usage, or get terminated by an OOM kill. The reason is that the JVM ignores the container's cgroups and namespaces, which are what actually limit its memory and CPU cycles, and so it tends to grab the instance's resources instead of staying within the Docker-allocated ones.
Example
Assume two containers are running on a 4-core instance with 8 GB of memory, and that each container is started with a hard limit of 1 GB of memory and 2048 CPU shares. Each container sees all 4 cores, and each JVM sizes its memory, JIT compiler threads and GC threads from those figures. That is, the JVM uses the instance's total core count (4) to initialize the default thread counts described earlier. Accordingly, the JVM metrics of the two containers will be as below.
- 4 × 2 JIT compiler threads
- 4 × 2 garbage collection threads
- 2 GB heap size × 2 (¼ of the instance's full memory instead of the Docker-allocated memory)
In terms of Memory
As per the above example, the JVM will gradually grow its heap usage, because it sees a 2 GB maximum heap size, a quarter of the instance memory (8 GB). Once the memory usage of a container reaches its hard limit of 1 GB, the container is terminated by an OOM kill.
In terms of CPU
As per the above example, each JVM is initialized with 4 garbage collection threads and 4 JIT compiler threads, yet Docker allocates only 2048 CPU shares. This leads to high CPU usage, more context switching and an unresponsive container, and finally the container is terminated due to high CPU.
Solution
Basically, there are two OS-level mechanisms, cgroups and namespaces, that handle this kind of isolation. However, Java 7 and 8 do not honor cgroups and namespaces. Releases from JDK 8u131 onwards can be made to respect the cgroup memory limit via JVM parameters (-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap). However, this only provides a solution for the memory issue and does nothing about the CPU-set issue.
With OpenJDK 9, the JVM automatically detects CPU sets. Especially under orchestration, you can additionally override the default thread counts to match the container's CPU allocation using JVM flags (-XX:ParallelGCThreads, -XX:ConcGCThreads).
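For illustration only, a launch line along these lines (the thread counts and jar name are assumptions and would have to match the actual container limits) combines the experimental cgroup flag with explicit GC thread counts:

    # hypothetical values for a container limited to roughly 1 GB and 2 CPUs
    java -XX:+UnlockExperimentalVMOptions \
         -XX:+UseCGroupMemoryLimitForHeap \
         -XX:ParallelGCThreads=2 \
         -XX:ConcGCThreads=1 \
         -jar app.jar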
We have a customer that uses WebSphere 7.0 on RedHat Linux Server 5.6 (Tikanga) with IBM JVM 1.6.
When we look at the OS reports for memory usage, we see very high numbers, and at some point the OS starts to use swap due to a lack of memory.
On the other hand, JConsole graphs show perfectly normal memory behavior: heap size increases until GC is invoked as expected, then drops to ~30%, in regular cycles. Non-heap is as expected and very constant in size.
Does anyone have an idea what this extra native memory usage can be attributed to?
I would check that you are looking at resident memory and not virtual memory (the latter can be very high).
If you swap, even slightly, this can cause the JVM to halt for very long periods of time during a GC. If your application is not locking up for seconds or minutes, it probably isn't swapping (another program could be).
If your program really is using native memory, this is most likely due to a native library you have loaded. Having a look at /proc/{pid}/maps may give you a clue, but more likely you will have to check which native libraries you are loading.
Note: if you have lots of threads, the stack space for all these threads can add up. I would try to keep thread counts to a minimum if you can, but I have seen JVMs with many thousands, and this can chew up native memory. GUI components can also use native memory, but I assume you don't have any of those.
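As a rough way to see whether thread stacks are the culprit, you can count live threads from inside the JVM and multiply by the stack size (a minimal sketch; the 1 MB per-thread figure is an assumption, use whatever -Xss is actually set to):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class ThreadStackEstimate {
        public static void main(String[] args) {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();
            int live = threads.getThreadCount();   // live threads in this JVM
            int stackMb = 1;                       // assumed -Xss value of 1 MB; adjust to your setting
            System.out.println(live + " live threads, roughly "
                    + (live * stackMb) + " MB reserved for stacks");
        }
    }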
I have a Java SE desktop application which uses a lot of memory (1.1 GB would be desired). All target machines (Win 7, Win Vista) have plenty of physical memory (at least 4 GB, most of them have more). There is also enough free memory.
Now, when the machines have some uptime and a lot of programs have been started and terminated, the memory becomes fragmented (this is what I assume). This leads to the following error when the JVM is started:
JVM creation failed
Error occurred during initialization of VM
Could not reserve enough space for object heap
Even closing all running programs doesn't help in such a situation (even though Task Manager and other tools report enough free memory). The only thing that helps is to reboot the machine and start the Java application as one of the first programs launched.
As far as I've investigated, the Oracle VM requires one contiguous chunk of memory.
Is there any other way to assign about 1.1 GB of heap to my Java application when this amount is available but may be fragmented?
I start my JVM with the following arguments:
-J-client -J-Xss2m -J-Xms512m -J-Xmx1100m -J-XX:PermSize=64m -J-Dsun.zip.disableMemoryMapping=true
Is there any other way to assign about 1.1 GB of heap to my Java application when this amount is available but may be fragmented?
Use an OS which doesn't get fragmented virtual memory, e.g. 64-bit Windows or any version of UNIX.
BTW, it is hard for me to imagine how this is possible in the first place, but I know it to be the case. Each process has its own virtual address space, so its arrangement of virtual memory shouldn't depend on anything which is already running or has run before.
I believe it might be a hangover from the MS-DOS TSR days. Loaded shared libraries are given absolute addresses (towards the end of the 2 GB signed address space; the high half is reserved for the OS and the last 512 MB for the BIOS), meaning they must use the same address range in every program that uses them. Over time the maximum usable address is determined by the lowest shared library loaded or used (I don't know which, but I suspect the lowest loaded).
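To put rough numbers on it (assuming a 32-bit JVM, which the -J-client flag above suggests): the process gets about 2 GB of user address space, and -Xmx1100m plus the permanent generation needs on the order of 1.2 GB of that reserved as one contiguous block. A single DLL mapped near the middle of the free range can cut the largest contiguous block below that, even though the total free address space is far larger.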
Using Oracle Java 1.7.0_05 on Ubuntu Linux 3.2.0-25-virtual, on an Amazon EC2 instance with 7.5 GB of memory, we start three instances of java, each using the switch -Xmx2000m.
We use the default Ubuntu EC2 AMI configuration of no swap space.
After running these instances for some weeks, one of them freezes -- possibly out of memory. But my question isn't about finding our memory leak.
When we try to restart the app, Java gives us a message that it cannot allocate the 2000 MB of memory. We solved the problem by rebooting the server.
In other words, 2000 + 2000 + 2000 > 7500?
We have seen this issue twice, and I'm sorry to report we don't have good diagnostics. How could we run out of space with only two remaining Java processes, each using a max of 2000 MB? How should we proceed to diagnose this problem the next time it occurs? I wish I had a "free -h" output, taken while we could not start the program, to show here.
TIA.
-Xmx sets the maximum size of the JVM heap, not the maximum size of the Java process, which allocates more memory besides the heap available to the application: its own memory, the permanent generation, whatever is allocated inside JNI libraries, etc.
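As a rough illustration (the overhead figures are assumptions, not measurements): each process may need its 2000 MB of heap plus a few hundred MB for the permanent generation, thread stacks, code cache and native allocations, say ~2.4 GB in total. Three such processes come to ~7.2 GB of the 7.5 GB available, leaving almost nothing for the OS and other processes on a machine with no swap.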
There may be other processes using memory, which is why the JVM cannot be started with 2 GB. If you really need that much memory for each of 3 Java processes and you only have 7.5 GB in total, you might want to change your EC2 configuration to have more memory. You're leaving only 1.5 GB for everything else, including the kernel, Oracle, etc.
I made a server-side application that uses 18 MB of non-heap and around 6 MB of heap out of a max of 30 MB. I set the heap limits with -Xms and -Xmx. The problem is that when I run the program on Ubuntu Server it takes around 170 MB instead of 18 + 30, or at most 100 MB. Does anyone know how to limit the VM to only 100 MB?
The JVM uses heap plus other memory such as thread stacks and shared libraries. The shared libraries can be relatively large, but they don't use real memory unless they are actually used. If you run several JVMs the libraries are shared between them, but you cannot see this in the process information.
In a modern PC, 1 GB of memory costs around $100, so reducing every last MB may not be seen as being as important as it used to be.
In response to your comment
I have made some tests with JConsole and VisualVM: Xmx 40 MB, Xms 25 MB. The problem is that I am restricted to 512 MB since it's a VPS and I can't pay for more right now. The other thing is that with 100 MB each I could have at least 3 processes running.
The problem is, you are going about it the wrong way. Don't try to get your VM super small so you can run 3 VMs. Combine everything into one VM. If you have 512 MB of memory, then make one VM with 256 MB of heap and let it do everything. You can have tens or hundreds of threads in a single VM. In most cases, this will perform better and use less total memory than trying to run many small VMs.
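A minimal sketch of that idea (the pool size and the dummy task are assumptions; the point is simply that one JVM hosts all the workers):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class SingleJvmWorkers {
        public static void main(String[] args) {
            // One JVM, one pool sized to the available cores, many tasks
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);
            for (int i = 0; i < 100; i++) {
                final int task = i;
                pool.submit(() -> System.out.println("worker handling task " + task));
            }
            pool.shutdown();
        }
    }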