In other words -- with 1MB stacks (-Xss1m), do you actually bump your RSS by 1M every time you create a thread, or do you just consume 1MB of VSZ, plus a few actual pages top and/or bottom?
In other, other words, on 64b systems, and assuming it does the right thing (just map), is there any real downside to large (say, 10MB) "just-in-case" stacks?
As this answer already says: The Java VM will allocate the memory for the whole stack every time you create a new thread.
That means it depends on your OS and its virtual memory subsystem what happens next.
As far as I know, in Java the thread stack size depends on the JVM and the OS architecture, and by default (unless -Xss is set) it varies between 256k and 1m. Is there a way or tool to see the total stack size consumed by all currently running threads at runtime, the way I can see the Heap Size or Metaspace Size using JVisualVM from the JDK package? I understand that this value can be calculated as thread stack size * number of currently running threads, but it would be great to monitor this value at runtime.
You can try JProfiler.
If you want to see it live in action before trying it yourself, check this
Also, your idea that thread stack size varies between 256k and 1m is absolutely correct.
In JDK 8, the HotSpot installation comes with a feature named Native Memory Tracking (disabled by default). To enable it, use:
-XX:NativeMemoryTracking=[off|detail|summary]
After enabling NMT, you can examine the memory footprint taken by either Thread or Thread Stack using:
jcmd <pid> VM.native_memory [summary | detail | baseline | summary.diff | detail.diff | shutdown] [scale=KB|MB|GB]
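For example (a minimal sketch; app.jar is a placeholder for your own application), start the JVM with tracking enabled and then query it with jcmd:

    java -XX:NativeMemoryTracking=summary -jar app.jar
    jcmd <pid> VM.native_memory summary scale=MB

The summary output includes a Thread category that covers the thread stacks.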
First of all, the default stack size is platform specific, but that does not mean that the platform specific default "varies". My recollection is that the defaults have not changed (for any given "architecture") for a long time. (Probably since Java 5, if not earlier. But don't quote me!)
You also can't use the default to accurately determine how much stack memory has actually been allocated, since:
The platform default can be overridden via the -Xss command line option, as you noted.
A non-default stack size can be specified when each Thread object is created.
The stack is only actually allocated when the Thread is started. (And it is deallocated when the Thread terminates.)
So if you wanted to measure the actual allocated stack memory, you would need to iterate over all of the threads and find their actual stack sizes. The first part should be relatively straightforward: traverse the ThreadGroup tree. (I don't think you can guarantee that you will see all threads in a traversal, but that shouldn't matter.) The second part is more difficult, since there is no getter for the private Thread.stackSize field ... and that field only records the parameter that the application supplied to the Thread constructor.
However, since typical applications just use the default stack size, counting the threads and multiplying by the default size will typically give a good estimate for the total thread stack usage.
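As a rough illustration of that estimate (a sketch only: it assumes every thread uses the default -Xss size, taken here as 1MB, and ignores threads created with an explicit stackSize):

    // Estimate total allocated stack memory as thread count * default stack size.
    public class StackEstimate {
        public static void main(String[] args) {
            long defaultStackBytes = 1024 * 1024; // assumed -Xss1m default
            int liveThreads = Thread.getAllStackTraces().size(); // all live threads
            System.out.printf("~%d threads * 1MB = ~%d MB of stack%n",
                    liveThreads, liveThreads * defaultStackBytes / (1024 * 1024));
        }
    }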
It may also be possible to infer a JVM's allocated stack memory by examining the process's memory segments using the methodology of Andrei Pangin's stackmem script. (Note that this is Linux specific, and that it relies on the JVM requesting individual memory segments from the OS for thread stacks.)
On the other hand, if you wanted to know the amount of stack space currently used (not just allocated), that would be difficult to get from within the application. And if you wanted to get it via an agent, I suspect that you would need to freeze the JVM first. That wouldn't be acceptable for regular monitoring.
But the bottom line is that getting the information will be (at least!) non-trivial and (IMO) probably not worth the effort. There is not a lot that you can do¹ with an accurate measure of allocated stack space that you can't already do by looking at thread counts and multiplying ...
... it would be great to monitor this value in runtime.
Not convinced :-)
¹ There are two possible reasons for wanting to know how much stack memory is used: you need to optimize, or you are curious. In the optimization case, knowing how much stack memory is used doesn't directly tell you how much ought to be used. To determine that, you actually need to work out whether you have too many threads, or whether those threads' stacks need to be as big as they currently are. Reducing stack memory usage "on principle", or because someone says it is "best practice", could get you into trouble.
I see only one disadvantage of this: you can get a StackOverflow :) Why not use only the heap?
In Java, C, and C++, the parameters to functions are passed on the stack, and the plain variables inside function bodies are created on the stack.
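In Java terms (a minimal illustration; a JIT compiler may optimize locals differently, e.g. via scalar replacement):

    class Locals {
        static void example() {
            int x = 42;              // primitive local: lives in this thread's stack frame
            int[] data = new int[8]; // the array object itself lives on the heap;
                                     // only the reference 'data' is on the stack
        }
    }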
As far as I know, the stack is limited per thread and has some default size, but a relatively low one: 1-8 MB.
Why not use the heap instead of the stack? Both are in memory; the OS just makes a separation: from address A to B is heap, and from C to D is stack.
Consider variadic arguments: the caller says there are 10 variables of 4 bytes each. If you read an 11th, you may read some "memory trash" - maybe exactly what you want for hacking - or maybe you get a segmentation fault, if the OS detects you as a bad boy. :) So security can't be a reason to use the stack.
Performance is one of many reasons: memory in the stack is trivial to book-keep; it has no holes; it can be mapped directly into the cache; it is attached on a per-thread basis.
In contrast, memory in the heap is, well, a heap of stuff; it is more difficult to book-keep; it can have holes.
Check out this answer (excellent, in my opinion) explaining some other differences.
Others have already mentioned that the stack can be faster due to simplicity of incrementing/decrementing the stack pointer. This is, however, quite a ways from the whole story.
First of all, if you're using a garbage collector that compacts the heap (i.e., most modern collectors), allocation on the heap isn't much different from allocation on the stack. You simply keep a pointer to the boundary between allocated and free memory, and to allocate some space you just move that pointer, just as you would on the stack. Objects with extremely short lives (like the locals in most functions) cost next to nothing in a GC cycle too. Keeping a live object accessible takes (a little) work, but an object that's no longer accessible normally involves next to no work.
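A minimal sketch of that bump-pointer idea (illustrative only; real collectors allocate per-thread buffers and handle alignment and object headers, none of which is shown here):

    // Allocating from a compacted heap region is just "advance the boundary
    // pointer" - the same constant-time operation as pushing onto a stack.
    class BumpAllocator {
        private final byte[] region = new byte[1 << 20]; // 1MB contiguous region
        private int top = 0;                             // boundary: allocated | free

        int allocate(int size) {
            if (top + size > region.length) throw new OutOfMemoryError("region full");
            int offset = top;
            top += size; // the entire cost of an allocation
            return offset;
        }
    }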
There is, however, often still a substantial advantage to using the stack for most variables. Many typical programs tend to run for fairly extended periods of time using nearly constant amounts of stack space. They enter one function, create some variables, use them for a while, pop them off the stack, then repeat the same cycle in another function.
This means most of the memory toward the top of the stack is almost always in the cache. Most function calls are re-using memory that was just vacated by the previous function call. By reusing the same memory continuously, you end up with considerably better cache usage.
By contrast, when you allocate items on the heap, you typically end up allocating separate space for nearly every item. Your cache is in a constant state of "churn", throwing away the memory for objects you're no longer using to make space for newly allocated ones. Unless you use a minuscule heap, the chances of re-using an address while it's still in the cache are nearly nonexistent.
I'm sure this is answered a million times online, but...
Because you don't want every method call to be a memory allocation (slow). So, you pre-allocate your stack.
Some more reasons listed here (including security).
The answer is that you get holes when you allocate and de-allocate on the heap. This means that it gets more and more difficult to allocate memory since the places that are available are different sizes. The stack only reserves what is needed and gives it all back when you get out of scope. No hassle.
If everything were on the stack, then each time you passed those values on, they would have to be copied. The stack, however, unlike the heap, doesn't need to be cleverly managed - items on the heap require garbage collection.
So they work in two different ways that suit two different uses. The stack is a quick and lightweight home for values to be held for a short time whereas the heap allows you to pass objects around without copying them.
Neither stack nor heap is perfect for every scenario - that is why they both exist.
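A quick Java illustration of that trade-off (a sketch; the method names are made up):

    class PassingDemo {
        // Passing an object: only the reference is copied, so the callee sees
        // (and can mutate) the same heap object - its contents are never copied.
        static void appendMark(StringBuilder sb) { sb.append("!"); }

        // Passing a primitive: the value itself is copied into the callee's
        // stack frame, so the caller's variable is untouched.
        static void bump(int n) { n++; }
    }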
Using the heap requires "requesting" a bit of memory from the heap, using new or some similar function. Then, when it's finished with, you delete it again. This is very useful for variables that are long-lived and/or take up quite a bit of space (or take up an "unknown at compile time" amount of space - for example, if you read a string into a variable from a file, you don't necessarily know how much space it needs, and it's REALLY annoying to get a message from the program saying "String too large on line X in file Y").
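The "unknown at compile time" case, in Java terms (a small sketch; data.txt is a placeholder):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    class ReadFile {
        static byte[] load() throws IOException {
            // The file's size is only known at runtime, so the bytes must live
            // on the heap; no fixed-size stack buffer could hold them safely.
            return Files.readAllBytes(Path.of("data.txt"));
        }
    }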
On the other hand, the stack is "free" both when it comes to allocating and de-allocating (technically, any function that uses stack space will need one extra instruction to allocate that stack space, but compared to the several hundred or thousand that a call to new will involve, it's not noticeable). Of course, class objects will still have to have their respective constructors called, which may take almost any amount of time to complete, but that is true regardless of how/where the storage is allocated.
Assume a multithreaded application scenario, in which every thread acquires some data (one or more files) from the network, performs some processing and then saves the results on the hard disk of the hosting machine.
In such a scenario, there is always the possibility that the disk space is exhausted, leading to unexpected service behavior (e.g., a system crash).
To avoid a case like that, it would be helpful if Java provided a means of reserving hard disk space, but, as verified in an earlier question, such an option is not available and even if it were, it could lead to inefficient allocation (e.g., in the case of a decompressing application, which does not know beforehand the total size of the decompressed data).
So, an alternative could be to make "virtual disk space reservations", e.g. by keeping in memory a static registry of the free space and having each thread request capacity from the registry before proceeding.
Are there any better alternatives, or improvements to this approach?
Is there any (preferably open source) Java library that implements such functionality?
An abstract way to implement this might be to use a constant or user-supplied value for how much disk space the multi-threaded application is allowed to use. Save it in a variable with synchronized get and set methods: a thread requests as much space as it needs (but no more than is available), that amount is subtracted from the total so other threads see a decreased 'disk space', and once a thread has finished and its data is deleted, the amount is re-added so the used 'disk space' becomes available to other threads again.
EDIT:
it could lead to inefficient allocation (e.g., in the case of a decompressing application, which does not know beforehand the total size of the decompressed data).
If this occurs and the thread 'sees' (through a constant check while extracting the file) that it has reached its limit of 'disk space', it could then request more space if available, or be put back into a queue until the needed space has been freed up by other threads. A minimal sketch of such a registry follows.
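One way to sketch that registry in Java (an illustration only; the class name DiskSpaceRegistry is made up, and capacity is tracked in whole megabytes for simplicity) is to back it with a counting semaphore, whose blocking acquire gives you the "wait until space is freed" queue for free:

    import java.util.concurrent.Semaphore;

    class DiskSpaceRegistry {
        private final Semaphore freeMegabytes;

        DiskSpaceRegistry(int budgetMb) {
            this.freeMegabytes = new Semaphore(budgetMb, true); // fair = FIFO waiting
        }

        // Blocks the calling thread until the requested capacity is available.
        void reserve(int mb) throws InterruptedException {
            freeMegabytes.acquire(mb);
        }

        // Called once the thread's data has been deleted or moved off the disk.
        void release(int mb) {
            freeMegabytes.release(mb);
        }
    }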
Couldn't you also simply have an alert that triggers when more than XX% of the given partition is used up? That way your admin has time to go in and remove/copy data off, or add additional storage to that mount point.
I want to decrease the memory footprint of a Java application in order to decrease swapping. I've been thinking about decreasing the stack size (the -Xss parameter) for this purpose, but I'm not sure how stack memory is allocated and whether the default 512k (for a 32-bit OS) per thread always sits in resident memory regardless of how much of it is actually used.
Will decreasing stack memory lead to decrease of swapping?
Update: Please don't suggest to profile the application - it is already done.
How many threads are you running? Even with a huge number of threads and a very generous stack size (say, 10k threads and 256KB stacks), that's only ~2.5GB of stack space.
You say you are running on a 32bit JVM, so I assume this is a relatively small system. You have a few options:
- Switch to a 64-bit JVM. Now you have tons of address space and the stack size should be inconsequential.
- Your machine is too small. If ~2.5GB of stack is a problem for your 10k+ threads, you are running too "big" an application on too "small" a machine. Do less in software or buy more hardware.
- Reduce your thread count.
- The problem is actually elsewhere and you are barking up the wrong tree.
Yes, it will, of course. The stack follows the LIFO rule (last in, first out): less stack, less swap.
How much memory are you using and how much do you need to save?
Since the stack is only 512K per thread, you would need 200 threads before the total reaches an amount that might be worth saving (100MB).
Since stack memory is used 'very often', I would consider it a bad target for being swapped out. Unless you are dealing with a memory-constrained environment?
I've written a simple application that works with a database. My program has a table to show data from the database. When I try to expand the frame, the program fails with an OutOfMemory error, but if I don't do this, it works well.
I start my program with the -Xmx4m parameter. Does it really need more than 4 megabytes to be in the expanded state?
Another question: if I run Java VisualVM, I see a saw-edged chart of my program's heap usage, while other programs using the Java VM (such as NetBeans) have more rectilinear charts. Why is my program's heap usage so unstable even when it does nothing (only waiting for the user to push a button)?
You may want to try setting this value to generate a detailed heap dump to show you exactly what is going on.
-XX:+HeapDumpOnOutOfMemoryError
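For example (a sketch; app.jar and the dump path are placeholders for your own application and location):

    java -Xmx4m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/app.hprof -jar app.jar

The resulting .hprof file can then be opened in a heap analyzer such as VisualVM.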
A typical "small" Java desktop application in 2011 is going to run with ~64-128MB. Unless you have a really pressing need, I would start by leaving it set to the default (i.e. no setting).
If you are trying to do something different (e.g. run this on an Android device), you are going to need to get very comfortable with profiling (and you should probably post with that tag).
Keep in mind that your 100-record cache (~12 bytes) may be (probably is) double that if you are storing character data (Java uses UTF-16 internally).
RE: the "unstability", the JVM is going handling memory usage for you, and will perform garbage collection according to whatever algos it chooses (these have changed dramatically over the years). The graphing may just be an artifact of the tool and the sample period. The performance in a desktop app is affected by a huge number of factors.
As an example, we once had a huge memory "leak" that only showed up in one automated test but never showed up in normal real world usage. Turned out the test left the mouse hovering over a tool tip which included the name of the open file, which in turn had a set of references back to the entire (huge) project. Wiggling the mouse a few pixels got rid of the tooltip, which meant that the references all cleared up and the garbage collector took out the trash.
Moral of the story? You need to capture the exact heap dump at time of the out-of-memory and review it very carefully.
Why would you set your maximum heap size to 4 megabytes? Java is often memory intensive, so setting it at such a ridiculously low level is a recipe for disaster.
It also depends on how many objects are being created and destroyed by your code, and on how the underlying Swing (I am assuming) components create and destroy the elements they use for drawing each time a component is redrawn.
Look at the CellRenderer code and this will show you why objects are being created and destroyed often, and why the garbage collector does such a wonderful job.
Try playing with the -Xmx setting and see how the charts flatten out. I would expect -Xmx64m or -Xmx128m to be suitable (although the amount of data coming out of your database will obviously be an important contributing factor).
You may need more than 4MB for a GUI with an expanded screen if you are using a double buffer, since this will generate multiple images of the UI. It does this to show them quickly on the screen. Usually this is done assuming you have lots and lots of memory.
The sawtooth memory pattern is due to something being done and then garbage collected. This may be a repaint operation or another timer. Is there a timer in your code that checks some process or value for changes? Or have you added code to an object's repaint or some other process?
I think 4mb is too small for anything except a trivial program - for example lots of GUI libraries (Swing included) will need to allocate temporary working space for graphics that alone may exceed that amount.
If you want to avoid out of memory errors but also want to avoid over-allocating memory to the JVM, I'd recommend setting a large maximum heap size and a small initial heap size.
- Xmx (the maximum heap size) should generally be quite large, e.g. 256mb.
- Xms (the initial heap size) can be much smaller; 4mb should work, though remember that if the application needs more than this, there will be a temporary performance hit while the heap is resized.
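Putting that together on the command line (app.jar is a placeholder for your own application):

    java -Xms4m -Xmx256m -jar app.jar

The heap starts at 4MB and the JVM grows it on demand, up to the 256MB ceiling.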