Understanding Recursion logic

Understanding Recursion logic - java

I really need your help for me to understand recursion properly. I can understand basic recursions and their logic like fibonacci
int factorial(int n)
if(n <=1)
return n
else
return(n*factorial(n-1))
That's easy the function keep calling factorial until n becomes zero and finally multiply all the results. But recursions like tree traversal is hard for me to understand
void inorderTraverse(Node* head)
if(head!=NULL){
inorderTraverse(head->left)
cout << head-> data
inorderTraverse(head->right)
}
}
Here I lost the logic how does this function goes if first recursion call will go back to function how can it goes to cout line or how can it show right child data. I really need your help.
Thank you

A binary search tree in alphabetical order:
B
/ \
A C
inorderTraverse(&A) will first descend to A and print it (recursively printing any subtree), then print B, then descend to C and print it (recursively printing any subtree).
So in this case, A B C.
For a more complicated tree:
D
/ \
B E
/ \
A C
This will be printed as A B C D E. Notice how the original tree is on the left of D, so is printed in its entirety first; the problem is reduced to a smaller instance of the starting problem. This is the essence of recursion. Next D is printed, then everything on the right of D (which is just E).
Note that in this implementation, the nodes don't know about their parent. The recursion means this information is stored on the call stack. You could replace the whole thing with an iterative implementation, but you would need some stack-like data structure to keep track of moving back up through the tree.

Inorder traversal says you need to traverse as Left-Root-Right.So for one level it is fine we print in left-root-right format. But With the level increases you need to makesure your algorithm traverse in the same way.So you need to print the leftSubtree first then the root of that subTree and then the right subTree at each level.
The Recursive code inorderTraverse(head->left) tells till the node is not null go to its leftside of the tree.Once it reaches the end it prints the left node then print the Root node of that subTree and wahtever operation u performed on leftSubTree you need to perform the same on Right subTree that's why you write inorderTraverse(head->right). Start debugging by creating 3level trees. Happy Learning.

Try to imagine binary tree and then start traversing it from root. You always go left. If there is no more lefts then you go right and after that you just go up. And you will finish back in root (from right side).
It is similar as going thought maze. You can choose that you will always go to left. (you will always touch left wall). At the end you will finish in exit or back in entrance if there isn't another exit.
In this code is important that you have two recursive calls in body. First is for left subtree and second is for right subtree. When you finish one function returns you back to node where you started.

Binary search trees have the property that for every node, the left subtree contains values that are smaller than the current node's value, and the right subtree contains values that are larger.
Thus, for a given node to yield the values in its subtree in-order the function needs to:
Handle the values less than the current value;
Handle its value;
Handle the values greater than the current value.
If you think of your function initially as a black box that deals with a subtree, then the recursion magically appears
Apply the function to the left subtree;
Deal with the current value;
Apply the function to the right subtree.
Basically, the trick is to initially think of the function as a shorthand way to invoke an operation, without worrying about how it might be accomplished. When you think of it abstractly like that, and you note that the current problem's solution can be achieved by applying that same functionality to a subset of the current problem, you have a recursion.
Similarly, post-order traversal boils down to:
Deal with all my children (left subtree, then right subtree, or vice-versa if you're feeling contrary);
Now I can deal with myself.

Related

Find the pairings such that the sum of the weights is minimized?

When solving the Chinese postman problem (route inspection problem), how can we find the pairings (between odd vertices) such that the sum of the weights is minimized?
This is the most crucial step in the algorithm that successfully solves the Chinese Postman Problem for a non-Eulerian Graph. Though it is easy to implement on paper, but I am facing difficulty in implementing in Java.
I was thinking about ways to find all possible pairs but if one runs the first loop over all the odd vertices and the next loop for all the other possible pairs. This will only give one pair, to find all other pairs you would need another two loops and so on. This is rather strange as one will be 'looping over loops' in a crude sense. Is there a better way to resolve this problem.
I have read about the Edmonds-Jonhson algorithm, but I don't understand the motivation behind constructing a bipartite graph. And I have also read Chinese Postman Problem: finding best connections between odd-degree nodes, but the author does not explain how to implement a brute-force algorithm.
Also the following question: How should I generate the partitions / pairs for the Chinese Postman problem? has been asked previously by a user of Stack overflow., but a reply to the post gives a python implementation of the code. I am not familiar with python and I would request any community member to rewrite the code in Java or if possible explain the algorithm.
Thank You.

Economical recursion
These tuples normally are called edges, aren't they?
You need a recursion.
0. Create main stack of edge lists.
1. take all edges into a current edge list. Null the found edge stack.
2. take a next current edge for the current edge list and add it in the found edge stack.
3. Create the next edge list from the current edge list. push the current edge list into the main stack. Make next edge list current.
4. Clean current edge list from all adjacent to current edge and from current edge.
5. If the current edge list is not empty, loop to 2.
6. Remember the current state of found edge stack - it is the next result set of edges that you need.
7. Pop the the found edge stack into current edge. Pop the main stack into current edge list. If stacks are empty, return. Repeat until current edge has a next edge after it.
8. loop to 2.
As a result, you have all possible sets of edges and you never need to check "if I had seen the same set in the different order?"

It's actually fairly simple when you wrap your head around it. I'm just sharing some code here in the hope it will help the next person!
The function below returns all the valid odd vertex combinations that one then needs to check for the shortest one.
private static ObjectArrayList<ObjectArrayList<IntArrayList>> getOddVertexCombinations(IntArrayList oddVertices,
ObjectArrayList<IntArrayList> buffer){
ObjectArrayList<ObjectArrayList<IntArrayList>> toReturn = new ObjectArrayList<>();
if (oddVertices.isEmpty()) {
toReturn.add(buffer.clone());
} else {
int first = oddVertices.removeInt(0);
for (int c = 0; c < oddVertices.size(); c++) {
int second = oddVertices.removeInt(c);
buffer.add(new IntArrayList(new int[]{first, second}));
toReturn.addAll(getOddVertexCombinations(oddVertices, buffer));
buffer.pop();
oddVertices.add(c, second);
}
oddVertices.add(0, first);
}
return toReturn;
}

Count number of nodes within given range in AVL tree

Suppose I have an AVL tree of distinct integers. I need to determine the number of nodes which lie in the interval [a, b) where a < b. Note that [a, b) is supplied by the user and hence I do not know beforehand what the value of a and b are. Also, a and b may not be present in the tree at all. For example, if I have a tree containing the integers {1, 2, 4, 5, 6, 7} then the user should expect an answer of 3 if he supplies the interval [3, 7).
A naive implementation would be to traverse every node and increment the count by 1 every time a node is found in the given interval. But this would have a worst case time complexity of O(n), as it is possible for every single integer in the tree to be within the given range. I need a faster algorithm, and after doing some research I found that it requires storing a size statistic in every node so that the rank of any given node can be easily computed.
I would like to do something like rank(b) - rank(a), but the problem is that a and b may not exist in the tree. In the above example, rank(7) would return 6 but rank(3) will not return any meaningful value.
Can anyone offer suggestions as to how I can address this issue? Also, I know that there is another similar question on this website, but that one involves C++ while this one involves Java. Also, I could not find a satisfactory answer there.

I've implemented a stack based tree iterator for an AVL tree some (long) time ago. It should work for your case like this:
create an array "treestack" which holds structs for traversal info. The struct just needs a bool "visited", and a pointer to your node type. The array can be of static size e.g. hold 64 info elements (one for each level of your tree, so this 64 will mean your tree contains max 4G nodes)
change the search method of your AVL tree to put the root node at treestack[0] when you begin with the search, and put all other nodes on top of the treestack as you follow the left and right child nodes during your search. Edit: Note that an unsuccessful search will result in your treestack having a node with the next smaller or next higher value, which is exactly what you want (just skip counting in case it's smaller, we still have abvalid iterator start oath).
You've now a path un your treestack which you can use for subsequent in-order traversal to find the next higher values. In-order traversal using the stack works like this:
start at the last element in treestack, keep a treeindex which is initially = arrayindex of the last item.
When there is a right node, and it is not marked visited: try follow ONE right of the current node, then ENDLESS to the left of any following nodes. Wherever you stop (so also if there are no left nodes but a single right one) is the next higher value. Put all nodes at the end of your tree stack as you follow them and inc your treeindex as you follow paths. And Mark the choosen final node with the next higher value as visited. Increase your node counter+1.
now to find subsequent higher values, ascend in the tree by taking treeindex-1 in your treestack as current node, an repeat the above step to find the next node with higher value.
if there is no right child node the current node, and the node is not marked visited: mark as visited, and inc node counter by 1
you're done when you either reach the root node with treeindex 0 or your node containing max value.
I hope it helps.

Instead of
rank(b) - rank(a)
what I would do is
rank(X) - rank(Y)
X is the very first node having value > b.
Y is the very first node having value >= a.

Why store the points in a binary tree?

This question covers a software algorithm, from On topic
I am working on an interview question from Amazon Software Question,
specifically "Given a set of points (x,y) and an integer "n", return n number of points which are close to the origin"
Here is the sample high level psuedocode answer to this question, from Sample Answer
Step 1: Design a class called point which has three fields - int x, int y, int distance
Step 2: For all the points given, find the distance between them and origin
Step 3: Store the values in a binary tree
Step 4: Heap sort
Step 5: print the first n values from the binary tree
I agree with steps 1 and 2 because it makes sense in terms of object-oriented design to have one software bundle of data, Point, encapsulate away the fields of x, y and distance.Ensapsulation
Can someone explain the design decisions from 3 to 5?
Here's how I would do steps of 3 to 5
Step 3: Store all the points in an array
Step 4: Sort the array with respect to distance(I use some build in sort here like Arrays.Sort
Step 5: With the array sorted in ascending order, I print off the first n values
Why the author of that response use a more complicated data structure, binary tree and not something simpler like an array that I used? I know what a binary tree is - hierarchical data structure of nodes with two pointers. In his algorithm, would you have to use a BST?

First, I would not say that having Point(x, y, distance) is good design or encapsulation. distance is not really part of a point, it can be computed from x and y. In term of design, I would certainly have a function, i.e. a static method from Point or an helper class Points.
double distance(Point a, Point b)
Then for the specific question, I actually agree with your solution, to put the data in an array, sort this array and then extract the N first.
What the example may be hinted at is that the heapsort actually often uses a binary tree structure inside the array to be sorted as explained here :
The heap is often placed in an array with the layout of a complete binary tree.
Of course, if the distance to the origin is not stored in the Point, for performance reason, it had to be put with the corresponding Point object in the array, or any information that will allow to get the Point object from the sorted distance (reference, index), e.g.
List<Pair<Long, Point>> distancesToOrigin = new ArrayList<>();
to be sorted with a Comparator<Pair<Long, Point>>

It is not necessary to use BST. However, it is a good practice to use BST when needing a structure that is self-sorted. I do not see the need to both use BST and heapsort it (somehow). You could use just BST and retrieve the first n points. You could also use an array, sort it and use the first n points.
If you want to sort an array of type Point, you could implement the interface Comparable (Point would imolement that interface) and overload the default method.
You never have to choose any data structures, but by determining the needs you have, you would also easily determine the optimum structure.

The approach described in this post is more complex than needed for such a question. As you noted, simple sorting by distance will suffice. However, to help explain your confusion about what your sample answer author was trying to get at, maybe consider the k nearest neighbors problem which can be solved with a k-d tree, a structure that applies space partitioning to the k-d dataset. For 2-dimensional space, that is indeed a binary tree. This tree is inherently sorted and doesn't need any "heap sorting."
It should be noted that building the k-d tree will take O(n log n), and is only worth the cost if you need to do repeated nearest neighbor searches on the structure. If you only need to perform one search to find k nearest neighbors from the origin, it can be done with a naive O(n) search.
How to build a k-d tree, straight from Wiki:
One adds a new point to a k-d tree in the same way as one adds an element to any other search tree. First, traverse the tree, starting from the root and moving to either the left or the right child depending on whether the point to be inserted is on the "left" or "right" side of the splitting plane. Once you get to the node under which the child should be located, add the new point as either the left or right child of the leaf node, again depending on which side of the node's splitting plane contains the new node.
Adding points in this manner can cause the tree to become unbalanced, leading to decreased tree performance. The rate of tree performance degradation is dependent upon the spatial distribution of tree points being added, and the number of points added in relation to the tree size. If a tree becomes too unbalanced, it may need to be re-balanced to restore the performance of queries that rely on the tree balancing, such as nearest neighbour searching.
Once have have built the tree, you can find k nearest neighbors to some point (the origin in your case) in O(k log n) time.
Straight from Wiki:
Searching for a nearest neighbour in a k-d tree proceeds as follows:
Starting with the root node, the algorithm moves down the tree recursively, in the same way that it would if the search point were being inserted (i.e. it goes left or right depending on whether the point is lesser than or greater than the current node in the split dimension).
Once the algorithm reaches a leaf node, it saves that node point as the "current best"
The algorithm unwinds the recursion of the tree, performing the following steps at each node:
If the current node is closer than the current best, then it becomes the current best.
The algorithm checks whether there could be any points on the other side of the splitting plane that are closer to the search point than the current best. In concept, this is done by intersecting the splitting hyperplane with a hypersphere around the search point that has a radius equal to the current nearest distance. Since the hyperplanes are all axis-aligned this is implemented as a simple comparison to see whether the difference between the splitting coordinate of the search point and current node is lesser than the distance (overall coordinates) from the search point to the current best.
If the hypersphere crosses the plane, there could be nearer points on the other side of the plane, so the algorithm must move down the other branch of the tree from the current node looking for closer points, following the same recursive process as the entire search.
If the hypersphere doesn't intersect the splitting plane, then the algorithm continues walking up the tree, and the entire branch on the other side of that node is eliminated.
When the algorithm finishes this process for the root node, then the search is complete.
This is a pretty tricky algorithm that I would hate to need to describe as an interview question! Fortunately the general case here is more complex than is needed, as you pointed out in your post. But I believe this approach may be close to what your (wrong) sample answer was trying to describe.

Why is this called backtracking?

I have read in Wikipedia and have also Googled it,
but I cannot figure out what "Backtracking Algorithm" means.
I saw this solution from "Cracking the Code Interviews"
and wonder why is this a backtracking algorithm?

Backtracking is a form of recursion, at times.
This boolean based algorithm is being faced with a choice, then making that choice and then being presented with a new set of choices after that initial choice.
Conceptually, you start at the root of a tree; the tree probably has some good leaves and some bad leaves, though it may be that the leaves are all good or all bad. You want to get to a good leaf. At each node, beginning with the root, you choose one of its children to move to, and you keep this up until you get to a leaf.(See image below)
Explanation of Example:
Starting at Root, your options are A and B. You choose A.
At A, your options are C and D. You choose C.
C is bad. Go back to A.
At A, you have already tried C, and it failed. Try D.
D is bad. Go back to A.
At A, you have no options left to try. Go back to Root.
At Root, you have already tried A. Try B.
At B, your options are E and F. Try E.
E is good. Congratulations!
Source: upenn.edu

"Backtracking" is a term that occurs in enumerating algorithms.
You built a "solution" (that is a structure where every variable is assigned a value).
It is however possible that during construction, you realize that the solution is not successful (does not satisfy certain constraints), then you backtrack: you undo certain assignments of values to variables in order to reassign them.
Example:
Based on your example you want to construct a path in a 2D grid. So you start generating paths from (0,0). For instance:
(0,0)
(0,0) (1,0) go right
(0,0) (1,0) (1,1) go up
(0,0) (1,0) (1,1) (0,1) go left
(0,0) (1,0) (1,1) (0,1) (0,0) go down
Oops, visiting a cell a second time, this is not a path anymore
Backtrack: remove the last cell from the path
(0,0) (1,0) (1,1) (0,1)
(0,0) (1,0) (1,1) (0,1) (1,1) go right
Oops, visiting a cell a second time, this is not a path anymore
Backtrack: remove the last cell from the path
....

From Wikipedia:
Backtracking is a general algorithm for finding all (or some) solutions to some computational problem, that incrementally builds candidates to the solutions, and abandons each partial candidate c ("backtracks") as soon as it determines that c cannot possibly be completed to a valid solution.
Backtracking is easily implemented as a recursive algorithm. You look for the solution of a problem of size n by looking for solutions of size n - 1 and so on. If the smaller solution doesn't work you discard it.
That's basically what the code above is doing: it returns true in the base case, otherwise it 'tries' the right path or the left path discarding the solution that doesn't work.
Since the code above it's recursive, it might not be clear where the "backtracking" comes into play, but what the algorithm actually does is building a solution from a partial one, where the smallest possible solution is handled at line 5 in your example. A non recursive version of the algorithm would have to start from the smallest solution and build from there.

I cannot figure out what "backtracking algorithm" means.
An algorithm is "back-tracking" when it tries a solution, and on failure, returns to a simpler solution as the basis for new attempts.
In this implementation,
current_path.remove(p)
goes back along the path when the current path does not succeed so that a caller can try a different variant of the path that led to current_path.

Indeed, the word "back" in the term "backtracking" could sometimes be confusing when a backtracking solution "keeps going forward" as in my solution of the classic N queens problem:
/**
* Given *starting* row, try all columns.
* Recurse into subsequent rows if can put.
* When reached last row (stopper), increment count if put successfully.
*
* By recursing into all rows (of a given single column), an entire placement is tried.
* Backtracking is the avoidance of recursion as soon as "cannot put"...
* (eliminating current col's placement and proceeding to the next col).
*/
int countQueenPlacements(int row) { // queen# is also queen's row (y axis)
int count = 0;
for (int col=1; col<=N; col++) { // try all columns for each row
if (canPutQueen(col, row)) {
putQueen(col, row);
count += (row == N) ? print(board, ++solutionNum) : countQueenPlacements(row+1);
}
}
return count;
}
Note that my comment defines Backtracking as the avoidance of recursion as soon as "cannot put" -- but this is not entirely full. Backtracking in this solution could also mean that once a proper placement is found, the recursion stack unwinds (or backtracks).

Backtracking basically means trying all possible options. It's usually the naive, inefficient solutions to problems.
In your example solution, that's exactly what's going on - you simply try out all possible paths, recursively:
You try each possible direction; if you found a successful path - good. if not - backtrack and try another direction.

better way to find depth of a node in tree

I want to compute depth of a tree
I have written code below
My question is that :
what is the order of this code? Is it O(n) [n is number of tree Nodes]
Is there any other way which you think is better and faster?
Thanks in advance
public int height(Node n)
{
if(n == null)
return 0;
return 1 + Math.max(height(n.left), height(n.right));
}

It's O(n), regardless of what type of tree you have, since you're visiting every single node to establish the maximum depth.
The only more efficient way is to have extra information about the tree stored somewhere. If it's balanced, you know the maximum height based on the number of nodes.
Alternatively, you can cache the information. Have two extra variables depth and dirty and initially set dirty to true:
When a caller requests the depth and dirty is true, call your function to work it out but the store that into depth and set dirty to false.
When a caller requests the depth and dirty is false, just return depth.
Whenever the structure of the tree is changed (insert or delete node), set dirty to true.

The order of your function is O(n).
Store the height in each node as you create it, and maintain that height as you add, remove, and balance the tree.

Well, I see that your question asks for depth and the code is to find the height of the node.
I am not really sure, what you want height or depth. If you require the code for depth, this is not the way. See below:
Depth (n) = 1 + Depth(P(n))
is the recursive definition of depth.
For height, what you wrote is correct.
Check Tree Operations , I have written it for most of the operations on a tree with their asymptotic analysis

Yes, your code is O(n), because in the worst case you call height on the root node (and then you have to touch every node to find the longest path to a leaf).
TTS's code is the same as yours, and I don't think there is a faster way to do it.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.