This is the context of my program.
A function has 50% chance to do nothing, 50% to call itself twice.
What is the probability that the program will finish?
I wrote this piece of code, and it works great apparently. The answer which may not be obvious to everyone is that this program has 100% chance to finish. But there is a StackOverflowError (how convenient ;) ) when I run this program, occuring in Math.Random(). Could someone point to me where does it come from, and tell me if maybe my code is wrong?
static int bestDepth =0;
static int numberOfPrograms =0;
#Test
public void testProba(){
for(int i = 0; i <1000; i++){
long time = System.currentTimeMillis();
bestDepth = 0;
numberOfPrograms = 0;
loop(0);
LOGGER.info("Best depth:"+ bestDepth +" in "+(System.currentTimeMillis()-time)+"ms");
}
}
public boolean loop(int depth){
numberOfPrograms++;
if(depth> bestDepth){
bestDepth = depth;
}
if(proba()){
return true;
}
else{
return loop(depth + 1) && loop(depth + 1);
}
}
public boolean proba(){
return Math.random()>0.5;
}
.
java.lang.StackOverflowError
at java.util.Random.nextDouble(Random.java:394)
at java.lang.Math.random(Math.java:695)
.
I suspect the stack and the amount of function in it is limited, but I don't really see the problem here.
Any advice or clue are obviously welcome.
Fabien
EDIT: Thanks for your answers, I ran it with java -Xss4m and it worked great.
Whenever a function is called or a non-static variable is created, the stack is used to place and reserve space for it.
Now, it seems that you are recursively calling the loop function. This places the arguments in the stack, along with the code segment and the return address. This means that a lot of information is being placed on the stack.
However, the stack is limited. The CPU has built-in mechanics that protect against issues where data is pushed into the stack, and eventually override the code itself (as the stack grows down). This is called a General Protection Fault. When that general protection fault happens, the OS notifies the currently running task. Thus, originating the Stackoverflow.
This seems to be happening in Math.random().
In order to handle your problem, I suggest you to increase the stack size using the -Xss option of Java.
As you said, the loop function recursively calls itself. Now, tail recursive calls can be rewritten to loops by the compiler, and not occupy any stack space (this is called the tail call optimization, TCO). Unfortunately, java compiler does not do that. And also your loop is not tail-recursive. Your options here are:
Increase the stack size, as suggested by the other answers. Note that this will just defer the problem further in time: no matter how large your stack is, its size is still finite. You just need a longer chain of recursive calls to break out of the space limit.
Rewrite the function in terms of loops
Use a language, which has a compiler that performs TCO
You will still need to rewrite the function to be tail-recursive
Or rewrite it with trampolines (only minor changes are needed). A good paper, explaining trampolines and generalizing them further is called "Stackless Scala with Free Monads".
To illustrate the point in 3.2, here's how the rewritten function would look like:
def loop(depth: Int): Trampoline[Boolean] = {
numberOfPrograms = numberOfPrograms + 1
if(depth > bestDepth) {
bestDepth = depth
}
if(proba()) done(true)
else for {
r1 <- loop(depth + 1)
r2 <- loop(depth + 1)
} yield r1 && r2
}
And initial call would be loop(0).run.
Increasing the stack-size is a nice temporary fix. However, as proved by this post, though the loop() function is guaranteed to return eventually, the average stack-depth required by loop() is infinite. Thus, no matter how much you increase the stack by, your program will eventually run out of memory and crash.
There is nothing we can do to prevent this for certain; we always need to encode the stack in memory somehow, and we'll never have infinite memory. However, there is a way to reduce the amount of memory you're using by about 2 orders of magnitude. This should give your program a significantly higher chance of returning, rather than crashing.
We can do this by noticing that, at each layer in the stack, there's really only one piece of information we need to run your program: the piece that tells us if we need to call loop() again or not after returning. Thus, we can emulate the recursion using a stack of bits. Each emulated stack-frame will require only one bit of memory (right now it requires 64-96 times that, depending on whether you're running in 32- or 64-bit).
The code would look something like this (though I don't have a Java compiler right now so I can't test it):
static int bestDepth = 0;
static int numLoopCalls = 0;
public void emulateLoop() {
//Our fake stack. We'll push a 1 when this point on the stack needs a second call to loop() made yet, a 0 if it doesn't
BitSet fakeStack = new BitSet();
long currentDepth = 0;
numLoopCalls = 0;
while(currentDepth >= 0)
{
numLoopCalls++;
if(proba()) {
//"return" from the current function, going up the callstack until we hit a point that we need to "call loop()"" a second time
fakeStack.clear(currentDepth);
while(!fakeStack.get(currentDepth))
{
currentDepth--;
if(currentDepth < 0)
{
return;
}
}
//At this point, we've hit a point where loop() needs to be called a second time.
//Mark it as called, and call it
fakeStack.clear(currentDepth);
currentDepth++;
}
else {
//Need to call loop() twice, so we push a 1 and continue the while-loop
fakeStack.set(currentDepth);
currentDepth++;
if(currentDepth > bestDepth)
{
bestDepth = currentDepth;
}
}
}
}
This will probably be slightly slower, but it will use about 1/100th the memory. Note that the BitSet is stored on the heap, so there is no longer any need to increase the stack-size to run this. If anything, you'll want to increase the heap-size.
The downside of recursion is that it starts filling up your stack which will eventually cause a stack overflow if your recursion is too deep. If you want to ensure that the test ends you can increase your stack size using the answers given in the follow Stackoverflow thread:
How to increase to Java stack size?
Related
public class Factorial {
int factR(int n){
int result;
if(n==1)return 1;
result=factR(n-1)*n;
System.out.println("Recursion"+result);
return result;
}
I know that this method will have the output of
Recursion2
Recursion6
Recursion24
Recursion120
Recursive120
However, my question is how does java store the past values for the factorial? It also appears as if java decides to multiply the values from the bottom up. What is the process by which this occurs? It it due to how java stores memory in its stack?
http://www.programmerinterview.com/index.php/recursion/explanation-of-recursion/
The values are stored on Java's call stack. It's in reverse because of how this recursive function is defined. You're getting n, then multiplying it by the value from the same function for n-1 and so on, and so on, until it reaches 1 and just returns 1 at that level. So, for 5, it would be 5 * 4 * 3 * 2 * 1. Answer is the same regardless of the direction of multiplication.
You can see how this works by writing a program that will break the stack and give you a StackOverflowError. You cannot store infinite state on the call stack!
public class StackTest {
public static void main(String[] args) {
run(1);
}
private static void run(int index) {
System.out.println("Index: " + index);
run(++index);
}
}
It actually isn't storing 'past values' at all. It stores the state of the program in the stack, with a frame for each method call containing data such as the current line the program is on. But there is only one value for the variable result at any time, for the current method on top of the stack. That gets returned and used to compute result in the frame that called this, and so on backwards, hence the bottom up behaviour you see.
One way to make this less confusing is to take recursion out of the picture temporarily. Suppose Java did not support recursion, and methods were only allowed to call other, different methods. If you wanted to still take a similar approach, one crude way would be to copy paste the factR method into multiple distinct but similar methods, something like:
int fact1(int n){
int result;
if(n==1)return 1;
// Here's the difference: call the 'next' method
result=fact2(n-1)*n;
System.out.println("Recursion"+result);
return result;
}
Similarly define a fact2 which calls fact3 and so on, although eventually you have to stop defining new methods and just hope that the last one doesn't get called. This would be a horrible program but it should be very obvious how it works as there's nothing magical. With some thought you can realise that factR is essentially doing the same thing. Then you can see that Java doesn't 'decide' to multiply the values bottom up: the observed behaviour is the only logical possibility given your code.
well i am trying to understand you,
if someone call likewise then
factR(3) it's recursive process so obviously java uses Stack for maintaining work flow,
NOTE : please see below procedural task step by step and again note
where it get back after current task complete.
result=factR(2)*3 // again call where n=2
-> result=factR(1)*2 // again call where n=1
-> now n=1 so that it will return 1
-> result=1*2 // after return it will become 6
print "Recursion2" // print remaning stuff
return 2;
result=2*3 // after return it will become 6
print "Recursion3" // print remaning stuff
return 3
I have a nested loop which iterates over all combinations of two elements from an array. However, if the sum of the two values is too large, I want to skip to the next x.
Here's the Java code snippet:
/* Let array be an array of integers
* and size be equal to its length.
*/
for (int a = 0; a < size; a++)
{
int x = array[a];
for (int b = 0; b < size(); b++)
{
int y = array[b];
if ((x + y) < MAX)
{
// do stuff with x and y
}
else
{
// x + y is too big; skip to next x
break;
}
}
}
This works exactly as expected.
However, if I replace the break statement with b = size;, it surprisingly runs about 20% faster. Note that by setting b = size;, the inner for conditional becomes false and execution continues to the next iteration of the outer a loop.
Why would this happen? It seems like break should be faster, as I would have thought it saves an assignment, jump, and compare. Though clearly it does not.
Why would this happen? It seems like break should be faster ...
IMO, the most likely explanation is some kind of JVM warmup effect, especial since the overall times (120ms versus 74ms) are so small. If you wrapped that loop in another one, so that you could perform the time measurements repeatedly in the same run, this anomaly is likely to go away.
(Just increasing the array sizes isn't necessarily going to help. The best way to be sure that you have accounted for JVM warmup anomalies it to use a benchmarking framework; e.g. Caliper. But, failing that, put the "snippet" into a method and call it repeatedly.)
... as I would have thought it saves an assignment, jump, and compare. Though clearly it does not.
It is not clear at all. Your Java code gets compiled to bytecodes by javac (or your IDE). When you run the code, it starts out interpreting the bytecodes, and then after a bit they are compiled to native code by the JIT compiler:
The JIT compilation takes time that is (probably) included in your time measurements ... and one source of warmup anomalies.
The code produced by the JIT compiler is influenced by statistics gathered while interpreting. One of the things that is typically measured is whether branches (e.g. if tests) go one way or the other. This is used to make branch predictions ... which if correct make the test-and-branch instruction sequences a lot faster.
My problem is that I usually get a java.lang.StackOverflowError when I use recursion.
My question is - why does recursion cause stackoverflow so much more than loops do, and is there any good way of using recursion to avoid stack overflow?
This is an attempt to solve problem 107, it works well for their example but runs out of stack space for the problem it self.
//-1 16 12 21 -1 -1 -1 16 -1 -1 17 20 -1 -1 12 -1 -1 28 -1 31 -1 21 17 28 -1 18 19 23 -1 20 -1 18 -1 -1 11 -1 -1 31 19 -1 -1 27 -1 -1 -1 23 11 27 -1
public class tries
{
public static int n=7,min=Integer.MAX_VALUE;
public static boolean[][] wasHere=new boolean[n][60000];
public static void main(String[] args)
{
int[] lines=new int[n]; Arrays.fill(lines, -1000); lines[0]=0;
int[][] networkMatrix=new int[n][n];
Scanner reader=new Scanner(System.in);
int sum=0;
for(int k=0; k<n; k++)
{
for(int r=0; r<n; r++)
{
networkMatrix[k][r]=reader.nextInt();
if(networkMatrix[k][r]!=-1) sum+=networkMatrix[k][r];
Arrays.fill(wasHere[k], false);
}
}
recursive(lines,networkMatrix,0,0);
System.out.println((sum/2)-min);
}
public static void recursive(int[] lines, int[][] networkMatrix, int row,int lastRow)
{
wasHere[row][value((int)use.sumArr(lines))]=true;
if(min<sum(lines)) return;
if(isAllNotMinus1000(lines)) min=sum(lines);
int[][] copyOfMatrix=new int[n][n];
int[] copyOfLines;
for(int i=0; i<n; i++)
{
copyOfLines=Arrays.copyOf(lines, lines.length);
for(int k=0; k<n; k++) copyOfMatrix[k]=Arrays.copyOf(networkMatrix[k], networkMatrix[k].length);
if(i!=0&©OfMatrix[i][row]!=0) copyOfLines[i]=copyOfMatrix[i][row];
copyOfMatrix[i][row]=0; copyOfMatrix[row][i]=0;
if(networkMatrix[row][i]==-1) continue;
if(wasHere[i][value((int)use.sumArr(copyOfLines))]) continue;
if(min<sum(copyOfLines)) continue;
recursive(copyOfLines,copyOfMatrix,i,row);
}
}
public static boolean isAllNotMinus1000(int[] lines)
{
for(int i=0; i<lines.length; i++) {if(lines[i]==-1000) return false;}
return true;
}
public static int value(int n)
{
if(n<0) return (60000+n);
return n;
}
public static int sum(int[] arr)
{
int sum=0;
for(int i=0; i<arr.length; i++)
{
if(arr[i]==-1000) continue;
sum+=arr[i];
}
return sum;
}
}
why does recursion cause stackoverflow so much more than loops do
Because each recursive call uses some space on the stack. If your recursion is too deep, then it will result in StackOverflow, depending upon the maximum allowed depth in the stack.
When using recursion, you should be very careful and make sure that you provide a base case. A base case in recursion is the condition based on which the recursion ends, and the stack starts to unwind. This is the major reason of recursion causing StackOverflow error. If it doesn't find any base case, it will go into an infinite recursion, which will certainly result in error, as Stack is finite only.
In most cases, a stack overflow occurs because a recursive method was ill-defined, with a non-existent or unreachable ending condition, which causes the stack memory space to be exhausted. A correctly written recursion should not produce a stack overflow.
However, there are situations where a method can produce a stack overflow even if it was correctly implemented. For instance:
A fast-growing (say, exponential) recursion. For example: the naive recursive implementation of the Fibonacci function
A very big input data, that will eventually cause the stack space to be exhausted
Bottom line: it all depends on the particular case, it's impossible to generalize regarding what causes a stack overflow.
Each recursive call uses some space on the stack (to house anything specific to that one call, such as arguments, local variables, etc.). Thus, if you make too many recursive calls (either by not correctly providing a base case or just by trying to do too many recursive calls), then there is not enough room to provide space for it all, and you end up with a StackOverflow.
The reason why loops do not have this problem is that each iteration of a loop does not use its own unique space (i.e. if I loop n times, I don't need extra space to do the n+1st loop).
The reason why the recursion causes stack overflow is because we fail to establish when the recursion should stop, and thus the function/method will keep calling itself "forever" (until it causes the error). You will have the same problem even if you are using loops, if you have something as the following:
bool flag = true;
while (flag == true){
count++;
}
Since flag will always be true, the while loop will never stop until it gives you the stack overflow error.
Every level of recursion that you go down, you are add state information to the runtime stack. This information is stored in an activation record and contains information like which variables are in scope and what their values are. Loops do not have extra activation records each time you loop so they take less memory.
In certain situations your recursion may go deep enough that it causes the stack to overflow but there are ways to help prevent this from happening. When working with recursion, I usually follow this format:
public obj MyMethod(string params) {
if (base-case) {
do something...
} else {
do something else...
obj result = MyMethod(parameters here);
do something else if needed..
}
}
Recursion can be super effective and do things that loops cannot. Sometimes you just get to a point where recursion is the obvious decision. What makes you a good programmer is being able to use it when it is not completely obvoius.
When properly used, recursion will not produce a StackOverflowError. If it does, then your base case is not being triggered, and the method keeps calling itself ad infinitum. Every method call that does not complete remains on the stack, and eventually it overflows.
But loops don't involve method calls by themselves, so nothing builds up on the stack and a StackOverflowError does not result.
Every time you call a method, you consume a "frame" from the stack, this frame is not released until the method returns, it doesn't happen the same with loops.
recursion causes stack overflow cause all the previous calls are in memory. so your method calls itself with new parameters, then that again calls itself. so all these calls stack up and normally can run out of memory.
loops store the results normally in some variables and call the methods which is like a new fresh call to methods, after each call, the caller methods ends and returns results.
As in my opinion, getting error as StackOverFlow in Recursion due to :
not implemented the recursion correctly which results in infinite recursion, so check out the base case, etc.
If your input is large, it preferred to use Tail Recursion to avoid StackOverflow.
Here for loop is used inside the recursive function. When the recursive function is called, for(int i=0; i<n; i++) the value of i is initialized to zero, as it calls itself, the value of i will again be initialized to zero and it conitues infintely. This will lead you to Stack overflow error.
Solution: Avoid for loop inside recursive function; instead go for while or do-while and initialize the value of i outside recursive function
I have a variable that gets read and updated thousands of times a second. It needs to be reset regularly. But "half" the time, the value is already the reset value. Is it a good idea to check the value first (to see if it needs resetting) before resetting (a write operaion), or I should just reset it regardless? The main goal is to optimize the code for performance.
To illustrate:
Random r = new Random();
int val = Integer.MAX_VALUE;
for (int i=0; i<100000000; i++) {
if (i % 2 == 0)
val = Integer.MAX_VALUE;
else
val = r.nextInt();
if (val != Integer.MAX_VALUE) //skip check?
val = Integer.MAX_VALUE;
}
I tried to use the above program to test the 2 scenarios (by un/commenting the 2nd "if" line), but any difference is masked by the natural variance of the run duration time.
Thanks.
Don't check it.
It's more execution steps = more cycles = more time.
As an aside, you are breaking one of the basic software golden rules: "Don't optimise early". Unless you have hard evidence that this piece if code is a performance problem, you shouldn't be looking at it. (Note that doesn't mean you code without performance in mind, you still follow normal best practice, but you don't add any special code whose only purpose is "performance related")
The check has no actual performance impact. We'd be talking about a single clock cycle or something, which is usually not relevant in a Java program (as hard-core number crunching usually isn't done in Java).
Instead, base the decision on readability. Think of the maintainer who's going to change this piece of code five years on.
In the case of your example, using my rationale, I would skip the check.
Most likely the JIT will optimise the code away because it doesn't do anything.
Rather than worrying about performance, it is usually better to worry about what it
simpler to understand
cleaner to implement
In both cases, you might remove the code as it doesn't do anything useful and it could make the code faster as well.
Even if it did make the code a little slower it would be very small compared to the cost of calling r.nextInt() which is not cheap.
I've been reading about non-blocking approaches for some time. Here is a piece of code for so called lock-free counter.
public class CasCounter {
private SimulatedCAS value;
public int getValue() {
return value.get();
}
public int increment() {
int v;
do {
v = value.get();
}
while (v != value.compareAndSwap(v, v + 1));
return v + 1;
}
}
I was just wondering about this loop:
do {
v = value.get();
}
while (v != value.compareAndSwap(v, v + 1));
People say:
So it tries again, and again, until all other threads trying to change the value have done so. This is lock free as no lock is used, but not blocking free as it may have to try again (which is rare) more than once (very rare).
My question is:
How can they be so sure about that? As for me I can't see any reason why this loop can't be infinite, unless JVM has some special mechanisms to solve this.
The loop can be infinite (since it can generate starvation for your thread), but the likelihood for that happening is very small. In order for you to get starvation you need some other thread succeeding in changing the value that you want to update between your read and your store and for that to happen repeatedly.
It would be possible to write code to trigger starvation but for real programs it would be unlikely to happen.
The compare and swap is usually used when you don't think you will have write conflicts very often. Say there is a 50% chance of "miss" when you update, then there is a 25% chance that you will miss in two loops and less than 0.1% chance that no update would succeed in 10 loops. For real world examples, a 50% miss rate is very high (basically not doing anything than updating), and as the miss rate is reduces, to say 1% then the risk of not succeeding in two tries is only 0.01% and in 3 tries 0.0001%.
The usage is similar to the following problem
Set a variable a to 0 and have two threads updating it with a = a+1 a million times each concurrently. At the end a could have any answer between 1000000 (every other update was lost due to overwrite) and 2000000 (no update was overwritten).
The closer to 2000000 you get the more likely the CAS usage is to work since that mean that quite often the CAS would see the expected value and be able to set with the new value.
Edit: I think I have a satisfactory answer now. The bit that confused me was the 'v != compareAndSwap'. In the actual code, CAS returns true if the value is equal to the compared expression. Thus, even if the first thread is interrupted between get and CAS, the second thread will succeed the swap and exit the method, so the first thread will be able to do the CAS.
Of course, it is possible that if two threads call this method an infinite number of times, one of them will not get the chance to run the CAS at all, especially if it has a lower priority, but this is one of the risks of unfair locking (the probability is very low however). As I've said, a queue mechanism would be able to solve this problem.
Sorry for the initial wrong assumptions.