Computing method call stack size for checking StackOverflowException - java

This morning I answered a question related to StackOverflowException, in which the person asked when a stack overflow occurs.
See this link: Simplest ways to cause stack overflow in C#, C++ and Java
So my question is: is there any way to compute the size of the method call stack dynamically in our program, and then check before calling a method whether the call stack has room for it, in order to prevent a StackOverflowException?
I am a Java person, so I am looking for a Java answer, but I would also welcome an explanation of the concept that is not tied to any particular programming language.

The total memory addressable by a JVM is about 2-4GB for a 32-bit JVM and the square of this for a 64-bit JVM (about 4-16EB). The JVM splits its memory into:
Heap Memory (allocation controlled via JVM options -Xms and -Xmx)
constructed object and array instances
static classes and array data (including contained object/array instances)
thread instances (object instances, runtime data & metadata including thread object monitor lock references)
Non-Heap Memory
aggregate stack memory
per-thread stack memory (per-thread allocation controlled via JVM option -Xss): method call frames, arguments, return values, locally declared primitives & references to objects
static constants (primitives)
String instance pool
java code: loaded classes and metadata
JVM internal-use memory (JVM code and data structures)
See http://docs.oracle.com/javase/7/docs/api/java/lang/management/MemoryMXBean.html and http://www.yourkit.com/docs/kb/sizes.jsp
Is there any method by which we can compute the method call stacks size dynamically in our program
There's no standard method included in Java SE/Java EE to obtain the actual per-thread stack memory usage.
There are standard methods to obtain the aggregate non-heap memory: MemoryMXBean.getNonHeapMemoryUsage(). Referring to this doesn't allow you to make dynamic in-code decisions to avoid a StackOverflowError.
There are standard methods to obtain the call stack without its memory usage: Thread.getStackTrace(), ThreadMXBean.getThreadInfo() and ThreadInfo.getStackTrace().
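As a rough illustration of the standard methods listed above, here is a minimal sketch. The class and method names are made up for this example; only the java.lang.management and Thread calls are standard API. Note that none of these calls tells you how many bytes of a thread's stack are in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical demo class; the names are illustrative only.
class StackIntrospectionDemo {

    // JVM-wide non-heap usage in bytes. It says nothing about one thread's stack.
    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        return nonHeap.getUsed();
    }

    // Number of frames on the calling thread's stack (a frame count, not bytes).
    static int currentFrameCount() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Frame count for an arbitrary live thread, obtained through ThreadMXBean.
    static int frameCountOf(long threadId) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ThreadInfo info = threads.getThreadInfo(threadId, Integer.MAX_VALUE);
        return info == null ? 0 : info.getStackTrace().length;
    }

    public static void main(String[] args) {
        System.out.println("non-heap bytes used: " + nonHeapUsedBytes());
        System.out.println("frames on this thread: " + currentFrameCount());
        System.out.println("frames via ThreadMXBean: "
                + frameCountOf(Thread.currentThread().getId()));
    }
}
```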
I recommend that you don't do what you suggest in the question because:
You can't do it without some complex JVM-specific API that instruments/introspects dynamic thread stack memory usage - where will you find such an API?
The per-thread stack normally consumes a tiny amount of memory relative to the entire JVM, so it is usually easy to assign enough to suit your algorithm (e.g. the default per-thread stack size is typically 1MB or less, depending on the JVM and platform, whilst 2GB of memory might have been budgeted for the entire JVM).
It would be very limited in power: if your logic actually needed to call a method, but you couldn't due to insufficient memory, then your program would be broken at that point. A StackOverflowError would actually be the best response.
What you are trying to do could be an anti-pattern.
A "correct" approach would be to specify program requirements, specify required runtime environment (including minimum/needed memory!), and design your program accordingly for optimal performance and memory usage.
An anti-pattern is to not think about these things appropriately during design and development and just imagine some runtime introspection magic could cover for this. There may exist some (rare!) high-performance-demanding apps which need to drastically rearrange the algorithm at runtime to exactly match the discovered resources - but this is complex, ugly & expensive.
And even then, it would probably be better to drive dynamic algorithm changes at a macro-level from the "-Xss" parameter, rather than at a micro-level from the exact stack memory consumption at a location in code.
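Besides the JVM-wide "-Xss" option, the standard Thread constructor accepts a requested stack size for a single thread, which is one way to "assign enough to suit your algorithm". The sketch below uses made-up names (BigStackDemo, runWithStack) and an arbitrary 64MB request. The Thread documentation describes the stack size as a suggestion that some platforms may ignore, so treat this as an assumption to verify on your JVM.

```java
// Hypothetical demo: run a deep recursion on a thread that asks for a larger stack.
class BigStackDemo {

    // Plain recursion: descends n frames and returns n.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    // Runs depth(n) on a new thread that requests stackBytes of stack space.
    static int runWithStack(int n, long stackBytes) {
        int[] result = new int[1];
        Thread worker = new Thread(null, () -> result[0] = depth(n), "big-stack", stackBytes);
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // 64MB is an arbitrary illustrative value, not a recommendation.
        System.out.println(runWithStack(100_000, 64L * 1024 * 1024));
    }
}
```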

I hope I am guessing what you are really asking. At first I thought you were asking how many calls deep your call was about to be. In other words, I thought you wanted to know how likely you were to trigger this exception, based on your current method circumstances. Then I decided you really wanted to find out how much stack depth you have to play with. In that case, there is another Stack Overflow question that seems to address this: What is the maximum depth of the java call stack?
This tells you how to set the stack size as a java command line parameter (to java, not your program).
Either way, I'd like to point out that stack overflow has mainly happened to me when I had an endless recursion. I had written methods (by mistake, of course) that called themselves, and were meant to stop when the problem got solved, but somehow the termination condition was never reached. This puts the method invocation onto the stack over and over until the max is exceeded. Not what I had in mind.
I hope that helps.
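To get a feel for "how much stack depth you have to play with", one rough experiment is to recurse until the JVM gives up and count the frames. This is only a sketch with made-up names. The number it reports depends on the stack size setting, the frame size of the method, and JIT compilation, so it can differ from run to run.

```java
// Hypothetical probe: counts how many nested calls fit before a StackOverflowError.
class MaxDepthProbe {
    private static int depth;

    // Deliberately has no termination condition, like the buggy methods described above.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the JVM throws StackOverflowError and reports how deep it got.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Expected: by the time we get here the stack has unwound again.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measure());
    }
}
```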

As far as I am aware, the stack limit in Java is quite abstract and not intended for measuring. In fact, I suspect that the stack size would vary from machine to machine, based on several factors such as memory.
I've never gotten a program to throw a stack overflow exception except for infinite loops / recursion. I'm scratching my head trying to figure out how it would even be possible to throw a stack overflow exception without an infinite loop. If your program is calling that many methods, then it is likely creating objects at the same time, and you are much more likely to receive an OutOfMemoryError than a stack overflow exception without an infinite loop.
In fact, what the heck would be the point of a stack limit that could limit your ability to function properly? Java has memory limits to take care of you going overboard with resources. The purpose of stack overflow is to catch loops/recursion that have run amok and need to be caught.
The point I'm trying to make is: if stack overflow exceptions plague your unit testing, you ought to check those loops/recursive functions for some out of control behavior. The call stack is very, very long and I doubt you've reached it naturally.

Well, you can use something like what exists in C with the Microsoft C++ compiler:
a specific function (I don't remember the name) that is called automatically at the start and end of each function.
With it, you can count the number of calls and sub-calls by incrementing a global counter after the start hook and decrementing it before the end hook.
Similarly, with Microsoft .NET, you can insert function calls that increment and decrement your global counter on each call. It is designed around the JIT.
You can also use a NoSQL database to store your calls.
Another option is to use a logging system that automatically traces your calls.
Also, when your call stack is full, it is sometimes caused by a recursive function. With a few lines of code and an object, you can record how far each function has propagated on each call.
That solution can also be used to detect one special thing in any function: "who is calling me?"
Finally, since Java is compiled to bytecode, you can locate the bytecode of a function call and insert one function call before it and another after it in order to maintain your own custom stack.
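The increment/decrement counter idea can be sketched by hand in Java, without any bytecode tooling. Everything here (DepthGuard, MAX_DEPTH, sumTo) is a hypothetical name chosen for illustration, and the limit of 1,000 is arbitrary. A real tool would insert the enter/exit calls automatically.

```java
// Hypothetical hand-instrumented call-depth counter.
class DepthGuard {
    // Arbitrary illustrative limit; it is not derived from the real stack size.
    static final int MAX_DEPTH = 1_000;

    // One counter per thread, because every thread has its own stack.
    private static final ThreadLocal<int[]> DEPTH = ThreadLocal.withInitial(() -> new int[1]);

    // Called at the start of an instrumented method.
    static void enter() {
        int[] d = DEPTH.get();
        if (d[0] >= MAX_DEPTH) {
            throw new IllegalStateException("call depth limit reached: " + MAX_DEPTH);
        }
        d[0]++;
    }

    // Called at the end of an instrumented method.
    static void exit() {
        DEPTH.get()[0]--;
    }

    static int current() {
        return DEPTH.get()[0];
    }

    // Example of an instrumented recursive method: sums 0..n.
    static long sumTo(int n) {
        enter();
        try {
            return n == 0 ? 0 : n + sumTo(n - 1);
        } finally {
            exit(); // runs on normal return and when an exception unwinds the frame
        }
    }

    public static void main(String[] args) {
        System.out.println(sumTo(500));
        try {
            sumTo(5_000);
        } catch (IllegalStateException e) {
            System.out.println("stopped early: " + e.getMessage());
        }
    }
}
```

The guard fails with an ordinary exception at a depth you choose, which is easier to handle than a StackOverflowError, but it counts calls, not bytes.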

Related


Display available stack memory in Java

I am trying to display the consumed stack space during the execution of a recursive program, such as an inorder traversal of a tree. Is there a way to print the available stack space in Java? I know that the available heap memory can be displayed using Runtime.getRuntime().freeMemory().
Java doesn't give you access to this kind of information, for various reasons:
keeping track of that information might actually make some kinds of optimizations impossible
giving an exact number at a certain point might slow down the execution (due to a potentially necessary de-optimization).
the exact amount of available space left might vary over time (for example if stack space were dynamically allocated from a shared space and not fixed).
Any value you might get is very likely to only be of limited value (as you can't know for sure how many bytes a given method invocation needs in various scenarios).
The closest you can get is to take a current stack trace (for example using Thread.getStackTrace()) and check the returned array's length to know how many stack frames are in use (i.e. how deep the stack already is).
But even that operation is likely to be somewhat costly.
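For the inorder traversal from the question, the frame-count approach could look roughly like the sketch below. The tree, the class name and the helper methods are invented for the example. It records the deepest frame count seen, not bytes of stack, and calling getStackTrace() on every node would be slow on a large tree.

```java
// Hypothetical demo: track the deepest stack (in frames) seen during an inorder traversal.
class InorderDepthDemo {

    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Largest number of frames observed during the last traversal.
    static int maxFrames;

    static void inorder(Node node, StringBuilder out) {
        if (node == null) return;
        // Frame count of the current thread, including this call and getStackTrace itself.
        maxFrames = Math.max(maxFrames, Thread.currentThread().getStackTrace().length);
        inorder(node.left, out);
        out.append(node.value).append(' ');
        inorder(node.right, out);
    }

    // Plain unbalanced binary search tree insertion.
    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.value) root.left = insert(root.left, value);
        else root.right = insert(root.right, value);
        return root;
    }

    // Builds a tree from the values and returns the inorder output.
    static String traverse(int... values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        maxFrames = 0;
        StringBuilder out = new StringBuilder();
        inorder(root, out);
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(traverse(4, 2, 6, 1, 3, 5, 7));
        System.out.println("deepest stack seen (frames): " + maxFrames);
    }
}
```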

How does Java store primitive types in RAM? [duplicate]

This is NOT about whether primitives go to the stack or heap, it's about where they get saved in the actual physical RAM.
Take a simple example:
int a = 5;
I know 5 gets stored into a memory block.
My area of interest is where does the variable 'a' get stored?
Related Sub-questions: Where does it happen where 'a' gets associated to the memory block that contains the primitive value of 5? Is there another memory block created to hold 'a'? But that will seem as though a is a pointer to an object, but it's a primitive type involved here.
To expound on Do Java primitives go on the Stack or the Heap? -
Let's say you have a function foo():
void foo() {
    int a = 5;
    System.out.println(a);
}
Then when the compiler compiles that function, it'll create bytecode instructions that leave 4 bytes of room on the stack whenever that function is called. The name 'a' is only useful to you - to the compiler, it just creates a spot for it, remembers where that spot is, and everywhere where it wants to use the value of 'a' it instead inserts references to the memory location it reserved for that value.
If you're not sure how the stack works, it works like this: every program has at least one thread, and every thread has exactly one stack. The stack is a contiguous block of memory (that can also grow if needed). Initially the stack is empty, until the first function in your program is called. Then, when your function is called, your function allocates room on the stack for itself, for all of its local variables, for its return value etc.
When your function main calls another function foo, here's one example of what could happen (there are a couple of simplifying white lies here):
main wants to pass parameters to foo. It pushes those values onto the top of the stack in such a way that foo will know exactly where they will be put (main and foo will pass parameters in a consistent way).
main pushes the address of where program execution should return to after foo is done. This increments the stack pointer.
main calls foo.
When foo starts, it sees that the stack is currently at address X
foo wants to allocate 3 int variables on the stack, so it needs 12 bytes.
foo will use X + 0 for the first int, X + 4 for the second int, X + 8 for the third.
The compiler can compute this at compile time, and the compiler can rely on the value of the stack pointer register (ESP on x86 system), and so the assembly code it writes out does stuff like "store 0 in the address ESP + 0", "store 1 into the address ESP + 4" etc.
The parameters that main pushed on the stack before calling foo can also be accessed by foo by computing some offset from the stack pointer.
foo knows how many parameters it takes (say 3) so it knows that, say, X - 8 is the first one, X - 12 is the second one, and X - 16 is the third one.
So now that foo has room on the stack to do its work, it does so and finishes
Right before main called foo, main wrote its return address on the stack before incrementing the stack pointer.
foo looks up the address to return to - say that address is stored at ESP - 4 - foo looks at that spot on the stack, finds the return address there, and jumps to the return address.
Now the rest of the code in main continues to run and we've made a full round trip.
Note that each time a function is called, it can do whatever it wants with the memory pointed to by the current stack pointer and everything after it. Each time a function makes room on the stack for itself, it increments the stack pointer before calling other functions to make sure that everybody knows where they can use the stack for themselves.
I know this explanation blurs the line between x86 and java a little bit, but I hope it helps to illustrate how the hardware actually works.
Now, this only covers 'the stack'. The stack exists for each thread in the program and captures the state of the chain of function calls between each function running on that thread. However, a program can have several threads, and so each thread has its own independent stack.
What happens when two function calls want to deal with the same piece of memory, regardless of what thread they're on or where they are in the stack?
This is where the heap comes in. Typically (but not always) one program has exactly one heap. The heap is called a heap because, well, it's just a big ol heap of memory.
To use memory in the heap, you have to call allocation routines - routines that find unused space and give it to you, and routines that let you return space you allocated but are no longer using. The memory allocator gets big pages of memory from the operating system, and then hands out individual little bits to whatever needs it. It keeps track of what the OS has given to it, and out of that, what it has given out to the rest of the program. When the program asks for heap memory, it looks for the smallest chunk of memory that it has available that fits the need, marks that chunk as being allocated, and hands it back to the rest of the program. If it doesn't have any more free chunks, it could ask the operating system for more pages of memory and allocate out of there (up until some limit).
In languages like C, those memory allocation routines I mentioned are usually called malloc() to ask for memory and free() to return it.
Java on the other hand doesn't have explicit memory management like C does, instead it has a garbage collector - you allocate whatever memory you want, and then when you're done, you just stop using it. The Java runtime environment will keep track of what memory you've allocated, and will scan your program to find out if you're not using all of your allocations any more and will automatically deallocate those chunks.
So now that we know that memory is allocated on the heap or the stack, what happens when I create a private variable in a class?
public class Test {
private int balance;
...
}
Where does that memory come from? The answer is the heap. You have some code that creates a new Test object - Test myTest = new Test(). Calling the java new operator causes a new instance of Test to be allocated on the heap. Your variable myTest stores the address to that allocation. balance is then just some offset from that address - probably 0 actually.
The answer at the very bottom is all just .. accounting.
...
The white lies I spoke about? Let's address a few of those.
Java is first of all a computer model. When you compile your program to bytecode, you're compiling to a completely made-up computer architecture that doesn't have registers or assembly instructions like a common CPU does. Java, .NET and a few others use a stack-based virtual machine instead of a register-based machine (like x86 processors). The reason is that stack-based machines are easier to reason about, so it's easier to build tools that manipulate that code, which is especially important for tools that compile that code to machine code that will actually run on common processors.
The stack pointer for a given thread typically starts at some very high address and then grows down, instead of up, at least on most x86 computers. That said, since that's a machine detail, it's not actually Java's problem to worry about (Java has its own made-up machine model to worry about; it's the Just In Time compiler's job to worry about translating that to your actual CPU).
I mentioned briefly how parameters are passed between functions, saying stuff like "parameter A is stored at ESP - 8, parameter B is stored at ESP - 12" etc. This is generally called the "calling convention", and there are more than a few of them. On x86-32, registers are scarce, and so many calling conventions pass all parameters on the stack. This has some tradeoffs, particularly that accessing those parameters might mean a trip to RAM (though cache might mitigate that). x86-64 has a lot more named registers, which means that the most common calling conventions pass the first few parameters in registers, which presumably improves speed. Additionally, since the Java JIT is the only thing that generates machine code for the entire process (excepting native calls), it can choose to pass parameters using any convention it wants.
I mentioned how when you declare a variable in some function, the memory for that variable comes from the stack - that's not always true, and it's really up to the whims of the environment's runtime to decide where to get that memory from. In C#/.NET's case, the memory for that variable could come from the heap if the variable is used as part of a closure - this is called "heap promotion". Most languages deal with closures by creating hidden classes. So what often happens is that the method-local variables that are involved in closures are rewritten to be members of some hidden class, and when that method is invoked, it instead allocates a new instance of that class on the heap and stores its address on the stack; all references to that originally-local variable then go through that heap reference.
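The hidden-class trick can be imitated by hand in Java, where a lambda may only capture effectively final locals. In the sketch below, Captured is a hand-written stand-in for the class a compiler would generate; it and the other names are hypothetical. The counter state lives in a heap object, and only the reference to it is held by the method's local variable and by the lambda.

```java
import java.util.function.IntSupplier;

// Hypothetical demo of "heap promotion" done manually.
class ClosureHeapDemo {

    // Stand-in for a compiler-generated hidden class holding the captured variable.
    static final class Captured {
        int count;
    }

    static IntSupplier makeCounter() {
        Captured box = new Captured(); // the object is on the heap; 'box' is just a reference
        return () -> ++box.count;      // the lambda captures the reference, not a stack slot
    }

    public static void main(String[] args) {
        IntSupplier counter = makeCounter();
        counter.getAsInt();
        // The state survives after makeCounter() has returned and its frame is gone.
        System.out.println(counter.getAsInt());
    }
}
```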
I think I get your point: you do not mean to ask whether the data is stored on the heap or the stack. I had the same puzzle about this!
The question you asked depends heavily on the programming language and on how the operating system deals with processes and variables.
It is very interesting, because when I was at university studying C and C++, I ran into the same question as you. After reading some assembly code compiled by GCC, I gained a little understanding of this. Let's discuss it; if you see any problem, please comment so that I can learn more about it.
In my opinion, the variable name is not stored anywhere; only the variable's value is stored. In assembly code there are no real variable names, apart from short names such as register names; every so-called variable is just an offset from the stack or the heap.
I take this as a hint for my own learning: since assembly deals with variable names in this way, other languages might use the same strategy.
They just store an offset to the real place that holds the data.
Let us take an example: say the variable named a is placed at address #1000 and the type of a is integer, so in memory we have
addr type value
#1000 int 5
where #1000 is the offset at which the real data is stored.
As you can see, the data is placed directly at that offset.
In my understanding of processes, every variable is replaced by the "address" of that "variable" when the process starts, which means the CPU only deals with "addresses" that have already been allocated in memory.
Let us review this procedure again. Suppose you have defined
int a=5; print(a);
After compilation, the program is transformed into another format (this is all my imagination):
stack:0-4 int 5
print stack:0-4
When the process is actually executing, I think the memory will look like this:
#2000 4 5 //allocate 4 bytes from #2000, and put 5 into it
print #2000 4 //read 4 bytes from #2000, then print
Since the process's memory is allocated by the operating system, #2000 is the offset that stands for this variable name, which means the name is replaced by just a memory address; the program then reads the data 5 from this address and executes the print command.
RETHINK
After finishing this write-up, I realized it may be rather hard for other people to picture. We can discuss it if there is any problem or mistake in what I have written.

btrace test the memory used by calling a function

Using btrace, I want to test how much heap my function used, so I wrote:
The code above is a sample of the btrace script I used.
After running my function twice I got two different results:
As the picture shows, the heap cost differs: one is twice as much as the other.
You cannot tell how much memory a certain method needs by diffing the JVM heap usage before and after the method has been invoked. Too many things go on in the system while the method is executing, and the numbers you are obtaining represent the memory allocated by the JVM from the OS - the results will tell you nothing.
If you want something at least remotely usable you should take a heap dump before and after the method invocation (Sys.Memory.dumpHeap(fileName)) and use a heapwalker to diff those two. You will still get quite a lot of noise there, but it is much better than relying on the memory allocated by the OS.
The most precise memory tracking would consist of capturing allocation info of all new instances created during the method invocation and directly connected to that invocation - e.g. created in the invoked method, in all methods invoked from the tracked one recursively and also in all runnables spawned anywhere in the tracked method call tree recursively. Getting this done might be a bit tricky but it is perfectly achievable with BTrace.
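This is not BTrace, but a much cruder alternative sometimes used for a quick estimate: HotSpot-based JVMs expose a per-thread allocation counter through com.sun.management.ThreadMXBean. The sketch below assumes such a JVM and returns -1 where the counter is unavailable. The class and method names are made up, and the figure only covers allocations made on the calling thread, not runnables spawned elsewhere.

```java
import java.lang.management.ManagementFactory;

// Hypothetical probe using the HotSpot-specific per-thread allocation counter.
class AllocationProbe {
    // Keeps the allocated object reachable so the allocation is not optimized away.
    static volatile Object sink;

    // Bytes allocated by the current thread while task runs, or -1 if unsupported.
    static long allocatedBy(Runnable task) {
        java.lang.management.ThreadMXBean standard = ManagementFactory.getThreadMXBean();
        if (!(standard instanceof com.sun.management.ThreadMXBean)) {
            return -1; // not a JVM that offers the extended bean
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) standard;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(id);
        task.run();
        return bean.getThreadAllocatedBytes(id) - before;
    }

    public static void main(String[] args) {
        long bytes = allocatedBy(() -> sink = new byte[1 << 20]);
        System.out.println("allocated during call (bytes): " + bytes);
    }
}
```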

Strange profiling results: definitely non-bottleneck method pops up

I'm profiling a program using sampling profiling in YourKit and JProfiler, and also "manually" (I launch it and press Ctrl-Break several times to get thread dumps).
All three methods give me extremely strange results: some tens of percents of time spent in a 3-line method that does not even do any allocation or synchronization and doesn't have loops etc. Moreover, after I made this method into a NOP and even removed its invocation completely, the observable program performance didn't change at all (although it got a negligible memory leak, since it was a method for freeing a cheap resource).
I'm thinking that this might be because of the constraints that JVM puts on the moments at which a thread's stacktrace may be taken, and it somehow turns out that in my program it is exactly the moments where this method is invoked, although there is absolutely nothing special about it or the context in which it is invoked.
What can be the explanation for this phenomenon?
What are the aforementioned constraints?
What further measurements can I take to clarify the situation?
To my mind, these results only show that this method gets called a huge number of times. Since its code is quite small, and it may be called as a leaf of the invocation tree, its impact on your profiling results seems negligible. However, I have seen that kind of weird result many times.
Some 3rd party libraries cause the heap dumps to go completely haywire due to unexpected usage patterns, for example if cglib is used, it will mask away the actual cause of the issues and instead show a lot of Proxy objects (if I remember correctly) filling up the VM instead.
So in short, code generation and reflection may cause the stats to go wrong.
When you did the Ctrl-Break several times and got thread dumps, I'm curious what you saw. The call stacks are, to my mind, the most useful information. If your 3-liner is on the stack a large % of time, then you can see why by looking at where it is called from, and where that is called from, etc. Those call sites are just as responsible for the time being spent as the 3-liner is.
If the stack traces seem to make no sense, it may be because they are being delayed until after something completes. If that is so, look up the stack to see what has just completed, because the break could have occurred within that.
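The manual thread-dump approach can also be approximated in code by sampling a thread's stack a few times and counting which methods show up anywhere on it. This is only a sketch with invented names (PoorMansSampler, hotLoop), and the samples are taken at JVM safepoints, so it is subject to the same constraints the question asks about.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sampler: counts in how many samples each method appears on the stack.
class PoorMansSampler {
    static volatile boolean running;
    static volatile long sinkValue;

    // Stand-in for the code being profiled.
    static void hotLoop() {
        long x = 0;
        while (running) {
            x += System.nanoTime() & 1;
        }
        sinkValue = x;
    }

    static Map<String, Integer> sample(Thread target, int samples, long pauseMillis)
            throws InterruptedException {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            Set<String> seen = new HashSet<>();
            for (StackTraceElement frame : target.getStackTrace()) {
                seen.add(frame.getClassName() + "." + frame.getMethodName());
            }
            // Count each method once per sample, wherever it sits on the stack.
            for (String method : seen) {
                hits.merge(method, 1, Integer::sum);
            }
            Thread.sleep(pauseMillis);
        }
        return hits;
    }

    // Starts the worker, samples it 20 times, then stops it.
    static Map<String, Integer> demo() {
        running = true;
        Thread worker = new Thread(PoorMansSampler::hotLoop, "worker");
        worker.start();
        try {
            return sample(worker, 20, 5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sampling interrupted", e);
        } finally {
            running = false;
        }
    }

    public static void main(String[] args) {
        demo().forEach((method, count) -> System.out.println(count + "/20  " + method));
    }
}
```

A method that appears in most samples is on the stack most of the time; its callers appear alongside it, which is the information the answer above suggests looking at.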
