I want to decrease the memory footprint of a Java application in order to reduce swapping. I've been thinking about decreasing the stack size (the -Xss parameter) for this purpose, but I'm not sure how stack memory is allocated, or whether the default 512k (for a 32-bit OS) per thread always sits in resident memory regardless of how much of it is actually used.
Will decreasing the stack size reduce swapping?
Update: Please don't suggest to profile the application - it is already done.
How many threads are you running? Even with a huge number of threads and a very generous stack size (say, 10k threads and 256KB stack size), that's only roughly 2.5GB of stack space.
You say you are running on a 32-bit JVM, so I assume this is a relatively small system. You have a few options:
Switch to a 64-bit JVM. Now you have tons of address space and the stack size should be inconsequential.
Your machine is too small. If that much stack is a problem for your 10k+ threads, you are running too "big" an application on too "small" a machine. Do less in software or buy more hardware.
Reduce your thread count.
The problem is actually elsewhere and you are barking up the wrong tree.
Yes, of course it will. The stack follows the LIFO (last in, first out) rule: less stack, less swap.
How much memory are you using and how much do you need to save?
Since the stack is only 512K per thread, you would need 200 threads before the total reaches a value that might be worth saving (100MB).
Since stack memory is used very often, I would consider it a poor candidate for being swapped out. Or are you dealing with a memory-constrained environment?
As far as I know, in Java the thread stack size depends on the JVM and OS architecture and by default (unless -Xss is set) varies between 256k and 1m. Is there a way or a tool to see the total stack size consumed by all currently running threads in runtime, the way I can see Heap Size or Metaspace Size using JVisualVM from the JDK package? I understand that this value can be calculated as thread stack size * number of currently running threads; however, it would be great to monitor this value in runtime.
You can try JProfiler.
If you want to see it live in action before trying it yourself, check this
Also, your idea that thread stack size varies between 256k and 1m is absolutely correct.
In JDK 8, the HotSpot installation comes with a feature named Native Memory Tracking (NMT), which is disabled by default. To enable it, use:
-XX:NativeMemoryTracking=[off|detail|summary]
After enabling NMT, you can examine the memory footprint taken by either Thread or Thread Stack using:
jcmd <pid> VM.native_memory [summary | detail | baseline | summary.diff | detail.diff | shutdown] [scale= KB | MB | GB]
First of all, the default stack size is platform specific, but that does not mean that the platform specific default "varies". My recollection is that the defaults have not changed (for any given "architecture") for a long time. (Probably since Java 5, if not earlier. But don't quote me!)
You also can't use the default to accurately determine how much stack memory has actually been allocated, since:
The platform default can be overridden via the -Xss command line option, as you noted.
A non-default stack size can be specified when each Thread object is created.
The stack is only actually allocated when the Thread is started. (And it is deallocated when the Thread terminates.)
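The second point above can be illustrated with the four-argument Thread constructor. This is a minimal sketch: the class name, method name and the 256KB figure are made up for illustration, and the JVM treats the requested size only as a hint that it may round or ignore.

```java
final class StackSizeHint {
    /**
     * Runs the task on a thread created with the requested stack size.
     * The size is only a hint; the JVM may round it or ignore it.
     */
    static boolean runWithStack(long stackSizeBytes, Runnable task) {
        // Thread(ThreadGroup, Runnable, String, long): the last argument is the stack size in bytes.
        Thread worker = new Thread(null, task, "small-stack-worker", stackSizeBytes);
        // Per the answer above, the stack itself is only allocated once the thread is started.
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        boolean finished = runWithStack(256 * 1024L,
                () -> System.out.println("running on " + Thread.currentThread().getName()));
        System.out.println("finished: " + finished);
    }
}
```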
So if you wanted to measure the actual allocated stack memory, you would need to iterate over all of the threads and find their actual stack sizes. The first should be relatively straightforward: traverse the ThreadGroup tree. (I don't think you can guarantee that you will see all threads in a traversal, but that shouldn't matter.) The second is more difficult, since there is no getter for the private Thread.stackSize field ... and that field only records the parameter that the application supplied to the Thread constructor.
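The traversal part might look roughly like the sketch below. The class and method names are hypothetical, and the result is a best-effort snapshot: threads can start or die while the enumeration is running.

```java
import java.util.Arrays;

final class ThreadTreeWalk {
    /** Returns a best-effort snapshot of the live threads reachable from the root ThreadGroup. */
    static Thread[] liveThreads() {
        // Walk up from the current thread's group to the root of the ThreadGroup tree.
        ThreadGroup root = Thread.currentThread().getThreadGroup();
        while (root.getParent() != null) {
            root = root.getParent();
        }
        // activeCount() is only an estimate, so leave headroom and grow the buffer if it fills up.
        Thread[] buffer = new Thread[root.activeCount() + 16];
        int found = root.enumerate(buffer, true); // true = include subgroups
        while (found == buffer.length) {
            buffer = new Thread[buffer.length * 2];
            found = root.enumerate(buffer, true);
        }
        return Arrays.copyOf(buffer, found);
    }

    public static void main(String[] args) {
        for (Thread t : liveThreads()) {
            System.out.println(t.getName() + " (group: " + t.getThreadGroup() + ")");
        }
    }
}
```

Note that this only finds the threads; as the answer says, it does not reveal each thread's stack size.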
However, since typical applications just use the default stack size, counting the threads and multiplying by the default size will typically give a good estimate for the total thread stack usage.
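The count-and-multiply estimate can be sketched with the standard ThreadMXBean. The class name is hypothetical and the 1MB stack size is an assumption; substitute your platform default or your -Xss value.

```java
import java.lang.management.ManagementFactory;

final class StackEstimate {
    /** Estimated stack memory: thread count times the stack size per thread. */
    static long estimateBytes(int threadCount, long stackSizeBytes) {
        return Math.multiplyExact((long) threadCount, stackSizeBytes);
    }

    public static void main(String[] args) {
        // Live thread count as reported by the JVM (daemon and non-daemon threads).
        int live = ManagementFactory.getThreadMXBean().getThreadCount();
        // ASSUMPTION: 1MB per thread; this is not read from the JVM.
        long assumedStackBytes = 1024L * 1024L;
        long estimate = estimateBytes(live, assumedStackBytes);
        System.out.println(live + " threads, about " + (estimate / (1024 * 1024)) + " MB of stack");
    }
}
```

This estimates the allocated (reserved) stack space, not how much of each stack is actually in use.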
It may also be possible to infer a JVM's allocated stack memory by examining the process's memory segments using the methodology of Andrei Pangin's stackmem script. (Note that this is Linux specific, and that it relies on the JVM requesting individual memory segments from the OS for thread stacks.)
On the other hand, if you wanted to know the amount of stack space currently used (not just allocated), that would be difficult to get from within the application. And if you wanted to get it via an agent, I suspect that you would need to freeze the JVM first. That wouldn't be acceptable for regular monitoring.
But the bottom line is that getting the information will be (at least!) non-trivial and (IMO) probably not worth the effort. There is not a lot that you can do [1] with an accurate measure of allocated stack space that you can't already do by looking at thread counts and multiplying ...
... it would be great to monitor this value in runtime.
Not convinced :-)
[1] - There are two possible reasons for wanting to know how much stack memory is used: you need to optimize, or curiosity. In the former case, knowing how much stack memory is used doesn't tell you directly how much ought to be used. To determine the latter, you actually need to determine whether you have too many threads, or whether those threads' stacks need to be as big as they currently are. Reducing stack memory usage "on principle" or because someone says it is "best practice" could get you into trouble.
A. If I execute a huge simulation program with -Xmx100000m (~100GB) I see some spikes in the used heap (~30 GB). Those spikes increase the heap size and decrease the memory that can be used by other programs. I would like to limit the heap size to the size that is actually required to run the program without memory exceptions.
B. If I execute my simulation program with -Xmx10000m (~10GB) I am able to limit the used heap size (~ 7 GB). The total heap size is less, too (of course). I do not get out of memory exceptions in the first phase of the program that is shown in the VisualVM figures (about 16 minutes).
I naively expected that if I increased -Xmx from 10GB (B) to 100GB (A), the used heap would stay about the same and Java would only use more memory in order to avoid out of memory exceptions. However, the behavior seems to be different. I guess that Java works this way in order to improve performance.
An explanation for the large used heap in A might be that the growth behavior of hash maps is different if -Xmx is larger. Does -Xmx have an effect on the load factor?
In the phase of the program where a lot of mini spikes exist (see for example B at 12:06) instead of a few large ones (A), some Java streams are processed. Does the memory allocation for stream processing automatically adapt to the -Xmx value? (There is still some memory left that could be used to have fewer mini spikes at 12:06 in B.)
If not, what might be the reasons for the larger used heap in A?
How can I tell Java to keep the used heap low if possible (like in the curves for B) but to take more memory if an out of memory exception could occur (allow it to temporarily switch to A)? Could this be done by tuning some garbage collection properties?
Edit
As stated by the answer below, the profile can be altered by garbage collection parameters. Applying -Xmx100000m -XX:MaxGCPauseMillis=1000 adapts the profile from A to consume less memory (~ 20 GB used) and more time (~ 22 min).
I would like to limit the heap size to the size that is actually required to run the program without memory exceptions.
You do not actually want to do that, because it would make your program extremely slow. Providing only the amount equivalent to the application's peak footprint means that every single allocation would trigger a garbage collection while the application is near the maximum.
I guess that Java works this way in order to improve performance.
Indeed.
The JVM has several goals, in descending order:
pause times (latency)
allocation throughput
footprint
If you want to prioritize footprint over other goals you have to relax the other ones.
Set -XX:MaxGCPauseMillis=18446744073709551615. This is the default for the parallel collector, but G1 has a 200ms default.
Configure it to keep less breathing room.
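When relaxing the pause goal to save footprint, it helps to check what the change costs in collection time. The sketch below reads the standard GarbageCollectorMXBean counters; the class and method names are hypothetical, and the collector names printed depend on which GC the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcCost {
    /** Sums the accumulated collection time (ms) over all collectors, skipping undefined (-1) values. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total GC time: " + totalGcMillis() + " ms");
    }
}
```

Calling something like this at the end of a run under each set of flags gives a rough view of how much extra GC time a smaller footprint costs.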
BACKGROUND
I recently wrote a java application that consumes a specified amount of MB. I am doing this purposefully to see how another Java application reacts to specific RAM loads (I am sure there are tools for this purpose, but this was the fastest). The memory consumer app is very simple. I enter the number of MB I want to consume and create a vector of that many bytes. I also have a reset button that removes the elements of the vector and prompts for a new number of bytes.
QUESTION
I noticed that the heap size of the java process never reduces once the vector is cleared. I tried clear(), but the heap remains the same size. It seems like the heap grows with the elements, but even though the elements are removed the size remains. Is there a way in java code to reduce heap size? Is there a detail about the java heap that I am missing? I feel like this is an important question because if I wanted to keep a low memory footprint in any java application, I would need a way to keep the heap size from growing or at least not large for long lengths of time.
Try triggering a garbage collection by calling System.gc().
This might help you - When does System.gc() do anything
Calling GC extensively is not recommended.
You should set the max heap size with the -Xmx option and watch the memory allocation of your app. Also use weak references for objects that have a short lifecycle, so that the GC can remove them automatically.
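The behaviour described in the question can be reproduced with a small experiment along these lines. It is a sketch with hypothetical names and a 32MB allocation chosen arbitrarily. System.gc() is only a request, and whether the committed heap shrinks afterwards depends on the collector and its settings, so no particular output is claimed.

```java
import java.util.ArrayList;
import java.util.List;

final class HeapAfterClear {
    /** Bytes currently occupied by objects: committed heap minus the free part of it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    private static void report(String label) {
        long mb = 1024L * 1024L;
        System.out.println(label + ": used=" + (usedBytes() / mb) + " MB, committed="
                + (Runtime.getRuntime().totalMemory() / mb) + " MB");
    }

    public static void main(String[] args) {
        List<byte[]> blocks = new ArrayList<>();
        for (int i = 0; i < 32; i++) {
            blocks.add(new byte[1024 * 1024]); // allocate 32 blocks of 1MB each
        }
        report("after allocating");

        blocks.clear();  // drop the references so the blocks become garbage
        System.gc();     // a request only; the JVM is free to ignore it
        report("after clear + gc");
    }
}
```

The "used" figure is what clear() and a collection can reduce; the "committed" figure is the heap size that the question observed staying the same.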
I was reading an article on handling Out Of Memory error conditions in Java (and on JBoss platform) and I saw this suggestion to reduce the size of the threadstack.
How would reducing the size of the threadstack help with a max memory error condition?
When Java creates a new thread, it pre-allocates a fixed-size block of memory for that thread's stack. By reducing the size of that memory block, you can avoid running out of memory, especially if you have lots of threads - the memory saving is the reduction in stack size times the number of threads.
The downside of doing this is that you increase the chance of a Stack Overflow error.
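A rough way to see this trade-off is to measure how deep a trivial recursion gets on threads created with different stack size hints. The sketch below uses hypothetical names. The depths it reports vary with the JVM, the platform and JIT compilation, and the JVM may round or ignore the requested size.

```java
final class StackDepthProbe {
    private int depth;

    /** Recurses until the thread's stack is exhausted, counting the frames. */
    private void descend() {
        depth++;
        descend();
    }

    /** Returns the recursion depth reached on a thread created with the given stack size hint. */
    static int maxDepth(long stackSizeBytes) {
        StackDepthProbe probe = new StackDepthProbe();
        Thread t = new Thread(null, () -> {
            try {
                probe.descend();
            } catch (StackOverflowError expected) {
                // The stack is exhausted; probe.depth holds the depth that was reached.
            }
        }, "depth-probe", stackSizeBytes);
        t.start();
        try {
            t.join(); // join() also makes the worker's writes to depth visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        System.out.println("256KB hint: depth " + maxDepth(256 * 1024L));
        System.out.println("2MB hint:   depth " + maxDepth(2 * 1024 * 1024L));
    }
}
```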
Note that the thread stacks are created outside of the JVM heap, so even if there's plenty of memory available in the heap, you can still fail to create a thread stack due to running out of memory (or running out of address space, as Tom Hawtin correctly points out).
The problem exists on 32-bit JVMs, where address space can get exhausted. Reducing the maximum stack size will not normally decrease the amount of memory actually allocated. Consider 8k threads with 256kB reserved for each stack, or 1k threads with 2MB each: that's 31 bits of address space (2GB) gone right there.
The problem all but disappears with 64-bit JVMs (although the actual amount of memory will increase a bit because references are twice as big). Alternatively, use of non-blocking APIs can remove the need for quite so many threads.
There are N threads in a process, and M bytes of memory is allocated for each thread stack. Total memory allocated for stack usage is N x M.
You can reduce total memory consumed by the stack by reducing the number of threads (N), or reducing the memory allocated for each thread (M).
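The N x M arithmetic, and the saving from shrinking M, can be written out as a small worked example. The names are hypothetical, and the figures in main are the ones quoted earlier in this thread (8k threads at 256kB and 1k threads at 2MB), plus an arbitrary 1000-thread case.

```java
final class StackBudget {
    /** Total stack reservation: N threads times M bytes per stack. */
    static long totalBytes(long threads, long bytesPerStack) {
        return Math.multiplyExact(threads, bytesPerStack);
    }

    /** Reservation saved by shrinking each stack from oldBytes to newBytes. */
    static long savedBytes(long threads, long oldBytes, long newBytes) {
        return totalBytes(threads, oldBytes) - totalBytes(threads, newBytes);
    }

    public static void main(String[] args) {
        long kb = 1024L;
        long mb = 1024L * kb;
        System.out.println("8192 x 256KB = " + totalBytes(8192, 256 * kb) / mb + " MB");
        System.out.println("1024 x 2MB   = " + totalBytes(1024, 2 * mb) / mb + " MB");
        System.out.println("1000 threads, 512KB -> 256KB saves "
                + savedBytes(1000, 512 * kb, 256 * kb) / mb + " MB");
    }
}
```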
Often a thread won't use all of the stack. It's pre-allocated "in case" it will be needed later, but if the thread doesn't use a deep call path, or doesn't use recursion, it may not need all of the stack space allocated on its behalf.
Finding the optimal stack size can be an art.
I would try other things (such as changing the survivor ratio or the size of the space allocated for class definitions) before trying to change the thread stack size. It is hard to get right, and thus very easy to get a stack overflow error (which is just as fatal as an out of memory error).
I've never gotten this right even after careful examination. But then again, I might never have encountered a web application/container combination that could be fine-tuned by changing its thread stack size. I've had much better (and non-fatal) results modifying the survivor ratio. But that has been my work experience. In different workplaces and applications, YMMV.
It is not possible to increase the maximum size of Java's heap after the VM has started. What are the technical reasons for this? Do the garbage collection algorithms depend on having a fixed amount of memory to work with? Or is it for security reasons, to prevent a Java application from DOS'ing other applications on the system by consuming all available memory?
In Sun's JVM, last I knew, the entire heap must be allocated in a contiguous address space. I imagine that for large heap values, it's pretty hard to add to your address space after startup while ensuring it stays contiguous. You probably need to get it at startup, or not at all. Thus, it is fixed.
Even if it isn't all used immediately, the address space for the entire heap is reserved at startup. If it cannot reserve a large enough contiguous block of address space for the value of -Xmx that you pass it, it will fail to start. This is why it's tough to allocate >1.4GB heaps on 32-bit Windows - because it's hard to find contiguous address space in that size or larger, since some DLLs like to load in certain places, fragmenting the address space. This isn't really an issue when you go 64-bit, since there is so much more address space.
This is almost certainly for performance reasons. I could not find a terrific link detailing this further, but here is a pretty good quote from Peter Kessler (full link - be sure to read the comments) that I found when searching. I believe he works on the JVM at Sun.
The reason we need a contiguous memory region for the heap is that we have a bunch of side data structures that are indexed by (scaled) offsets from the start of the heap. For example, we track object reference updates with a "card mark array" that has one byte for each 512 bytes of heap. When we store a reference in the heap we have to mark the corresponding byte in the card mark array. We right shift the destination address of the store and use that to index the card mark array. Fun addressing arithmetic games you can't do in Java that you get to (have to :-) play in C++.
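The addressing arithmetic in the quote can be modelled as a toy in Java. This is only an illustration with simulated addresses and hypothetical names, not HotSpot's actual code. The one-byte-per-512-bytes ratio comes from the quote; the value used to mark a card is arbitrary.

```java
final class CardTableSketch {
    /** 2^9 = 512 bytes of heap are covered by each card, as in the quote. */
    static final int CARD_SHIFT = 9;

    private final long heapBase; // simulated start address of the contiguous heap
    private final byte[] cards;  // one byte per 512 bytes of heap

    CardTableSketch(long heapBase, long heapSizeBytes) {
        this.heapBase = heapBase;
        this.cards = new byte[(int) ((heapSizeBytes + 511) >>> CARD_SHIFT)];
    }

    /** Scaled offset from the start of the heap: this only works because the heap is contiguous. */
    int cardIndex(long address) {
        return (int) ((address - heapBase) >>> CARD_SHIFT);
    }

    /** Called when a reference is stored at destinationAddress: mark the card covering it. */
    void recordReferenceStore(long destinationAddress) {
        cards[cardIndex(destinationAddress)] = 1;
    }

    boolean isMarked(long address) {
        return cards[cardIndex(address)] != 0;
    }

    public static void main(String[] args) {
        long base = 0x10000000L;
        CardTableSketch table = new CardTableSketch(base, 1 << 20);
        table.recordReferenceStore(base + 1000);
        System.out.println("card for offset 1000: " + table.cardIndex(base + 1000));
        System.out.println("offset 1023 marked: " + table.isMarked(base + 1023));
        System.out.println("offset 1024 marked: " + table.isMarked(base + 1024));
    }
}
```

A single shift and an array index is all it takes to find the card, which is why a heap made of several disjoint regions would complicate this scheme.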
This was in 2004 - I'm not sure what's changed since then, but I am pretty sure it still holds. If you use a tool like Process Explorer, you can see that the virtual size (add the virtual size and private size memory columns) of the Java application includes the total heap size (plus other required space, no doubt) from the point of startup, even though the memory 'used' by the process will be nowhere near that until the heap starts to fill up...
Historically there has been a reason for this limitation, which was not to allow applets in the browser to eat up all of the user's memory. The Microsoft VM, which never had such a limitation, actually allowed this, which could lead to a kind of denial of service attack against the user's computer. It was only a year ago that Sun introduced in the 1.6.0 Update 10 VM a way to let applets specify how much memory they want (limited to a certain fixed share of the physical memory) instead of always limiting them to 64MB even on computers that have 8GB or more available.
Now that the JVM has evolved, it should have been possible to get rid of this limitation when the VM is not running inside a browser, but Sun obviously never considered it a high priority issue, even though numerous bug reports have been filed asking to finally allow the heap to grow.
I think the short, snarky, answer is because Sun hasn't found it worth the time and cost to develop.
The most compelling use case for such a feature is on the desktop, IMO, and Java has always been a disaster on the desktop when it comes to the mechanics of launching the JVM. I suspect that those who think the most about those issues tend to focus on the server side and view any other details as best left to native wrappers. It is an unfortunate decision, but it should just be one of the decision points when deciding on the right platform for an application.
My gut feel is that it has to do with memory management with respect to the other applications running on the operating system.
If you set the maximum heap size to, for example, the amount of RAM on the box you effectively let the VM decide how much memory it requires (up to this limit). The problem with this is that the VM could effectively cripple the machine it is running on because it will take over all the memory on the box before it decides that it needs to garbage collect.
When you specify max heap size, what you're saying to the VM is, you are allowed to use this amount of memory before you need to start garbage collecting. You cannot have more because if you take more then the other applications running on the box will slow down and you will start swapping to the disk if you use more than this.
Also be aware that there are two values with respect to memory, namely "current heap size" and "max heap size". The current heap size is how much memory the heap is currently using; if the VM requires more, it can resize the heap, but it cannot resize the heap above the value of the maximum heap size.
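Both values can be read at runtime through the standard MemoryMXBean. In this sketch (class and method names are hypothetical), "committed" corresponds to the current heap size and "max" to the maximum heap size; getMax() may return -1 if no maximum is defined.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapSizes {
    /** Snapshot of the heap: used, committed (current size) and max. */
    static MemoryUsage snapshot() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = snapshot();
        long mb = 1024L * 1024L;
        System.out.println("used:      " + heap.getUsed() / mb + " MB");
        System.out.println("committed: " + heap.getCommitted() / mb + " MB (current heap size)");
        // getMax() is -1 when the maximum is undefined.
        System.out.println("max:       " + (heap.getMax() < 0 ? "undefined" : heap.getMax() / mb + " MB"));
    }
}
```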
From IBM's performance tuning tips (so may not be directly applicable to Sun's VMs)
The Java heap parameters influence the behavior of garbage collection. Increasing the heap size supports more object creation. Because a large heap takes longer to fill, the application runs longer before a garbage collection occurs. However, a larger heap also takes longer to compact and causes garbage collection to take longer.
The JVM has thresholds it uses to manage the JVM's storage. When the thresholds are reached, the garbage collector gets invoked to free up unused storage. Therefore, garbage collection can cause significant degradation of Java performance. Before changing the initial and maximum heap sizes, you should consider the following information:
In the majority of cases you should set the maximum JVM heap size to a value higher than the initial JVM heap size. This allows for the JVM to operate efficiently during normal, steady state periods within the confines of the initial heap but also to operate effectively during periods of high transaction volume by expanding the heap up to the maximum JVM heap size. In some rare cases where absolute optimal performance is required you might want to specify the same value for both the initial and maximum heap size. This will eliminate some overhead that occurs when the JVM needs to expand or contract the size of the JVM heap. Make sure the region is large enough to hold the specified JVM heap.
Beware of making the Initial Heap Size too large. While a large heap size initially improves performance by delaying garbage collection, a large heap size ultimately affects response time when garbage collection eventually kicks in because the collection process takes more time.
So, I guess the reason that you can't change the value at runtime is that it may not help: either you have enough space in your heap or you don't. Once you run out, a GC cycle will be triggered. If that doesn't free up the space, you're stuffed anyway. You'd need to catch the OutOfMemoryError, increase the heap size, and then retry your calculation, hoping that this time you have enough memory.
In general the VM won't use the maximum heap size unless you need it, so if you think you might need to expand the memory at runtime, you could just specify a large maximum heap size.
I admit that's all a bit unsatisfying, and seems a bit lazy, since I can imagine a reasonable garbage collection strategy which would increase the heap size when GC fails to free enough space. Whether my imagination translates to a high performance GC implementation is another matter though ;)