Height of binary search tree in constant time

I need to find the height of a binary search tree in O(1) time. The only way I could think of to do this is to put a check in the add and remove methods that increments a global counter. Is there any other way?

O(1) time suggests that you should already have the height when it is requested.
The best way is to keep the correct value up to date whenever a node is added or deleted. You are doing it the right way; however, it adds work to addition and deletion.
You can do it in a number of ways, such as keeping the depth value along with each node in the tree.
class Node {
    int depth;
    Object value;
}
Node lowestNode;
I can think of storing a reference to the max-depth node in an object and using that as the depth. Whenever you add a new element, you can compute its depth from its parent and replace the max-depth node if the new element is deeper.
If you are deleting the max-depth node, then replace it with its parent; otherwise, correct the depth of all elements recursively along the tree.
The height of the tree is then lowestNode.depth.

Storing an attribute for the height and updating it whenever you add or remove should be the most reasonable solution.
If the tree is guaranteed to be balanced (e.g. AVL), you can estimate the height from the number of elements in the tree (you have to keep the number of elements, though :P). For an AVL tree the element count only bounds the height; it does not always pin down one exact value.
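As a rough sketch of the "store the height and update it on add" idea, the class below caches a height in every node and recomputes it on the way back up from an insertion, so that height() is a single field read. The class name HeightTrackingBst and its methods are made up for illustration, heights are counted in nodes (an empty tree has height 0), and removal is left out; it would need the same recomputation along the affected path.

```java
// Sketch only: a plain (unbalanced) BST whose nodes cache their subtree height.
// HeightTrackingBst and its members are illustrative names, not from the posts above.
class HeightTrackingBst {
    private static final class Node {
        final int key;
        int height = 1;      // height of the subtree rooted here, counted in nodes
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    // O(1): the value was already maintained by add().
    int height() {
        return heightOf(root);
    }

    void add(int key) {
        root = add(root, key);
    }

    private Node add(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = add(node.left, key);
        else if (key > node.key) node.right = add(node.right, key);
        // Duplicates are ignored. Refresh the cached height on the way back up.
        node.height = 1 + Math.max(heightOf(node.left), heightOf(node.right));
        return node;
    }

    private static int heightOf(Node node) {
        return node == null ? 0 : node.height;
    }

    public static void main(String[] args) {
        HeightTrackingBst tree = new HeightTrackingBst();
        for (int key : new int[] {5, 3, 8, 1, 4, 9, 10}) tree.add(key);
        System.out.println(tree.height()); // prints 4
    }
}
```

The cost moves into add(), which still walks one root-to-leaf path, so insertion keeps its usual complexity and only the height query becomes constant time.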

Related

Do not understand the solution for the Binary Tree Maximum Path Sum problem

The website GeeksforGeeks has presented a solution for the problem of Maximum path sum for a binary tree. The question is as follows:
Given a binary tree, find the maximum path sum. The path may start and
end at any node in the tree.
The core of the solution is as follows:
int findMaxUtil(Node node, Res res)
{
    if (node == null)
        return 0;
    // l and r store maximum path sum going through left and
    // right child of root respectively
    int l = findMaxUtil(node.left, res);
    int r = findMaxUtil(node.right, res);
    // Max path for parent call of root. This path must
    // include at-most one child of root
    int max_single = Math.max(Math.max(l, r) + node.data,
                              node.data);
    // Max Top represents the sum when the Node under
    // consideration is the root of the maxsum path and no
    // ancestors of root are there in max sum path
    int max_top = Math.max(max_single, l + r + node.data);
    // Store the Maximum Result.
    res.val = Math.max(res.val, max_top);
    return max_single;
}
int findMaxSum() {
    return findMaxSum(root);
}
// Returns maximum path sum in tree with given root
int findMaxSum(Node node) {
    // Initialize result
    // int res2 = Integer.MIN_VALUE;
    Res res = new Res();
    res.val = Integer.MIN_VALUE;
    // Compute and return result
    findMaxUtil(node, res);
    return res.val;
}
Res has the following definition:
class Res {
    public int val;
}
I am confused about the reasoning behind these lines of code:
int max_single = Math.max(Math.max(l, r) + node.data, node.data);
int max_top = Math.max(max_single, l + r + node.data);
res.val = Math.max(res.val, max_top);
return max_single;
I believe the code above follows this logic but I do not understand WHY this logic is correct or valid:
For each node there can be four ways that the max path goes through
the node:
Node only
Max path through Left Child + Node
Max path through Right Child + Node
Max path through Left Child + Node + Max path through Right Child
In particular, I do not understand why max_single is being returned in the function findMaxUtil when the variable res.val contains the answer we are interested in. The following reason is given on the website but I do not understand it:
An important thing to note is, root of every subtree need to return
maximum path sum such that at most one child of root is involved.
Could someone provide an explanation for this step of the solution?

In particular, I do not understand why max_single is being returned in the function findMaxUtil when the variable res.val contains the answer we are interested in.
The problem is that findMaxUtil() really does two things: it returns the largest sum of a path that starts at the root of the subtree it's applied to and descends from there, and it updates a variable that keeps track of the largest sum encountered so far. There's a comment to that effect in the original code, but you edited it out in your question, perhaps for brevity:
// This function returns overall maximum path sum in 'res'
// And returns max path sum going through root.
int findMaxUtil(Node node, Res res)
Because Java passes parameters by value, but every object variable in Java implicitly references the actual object, it's easy to miss the fact that the Res that's passed in the res parameter may be changed by this function. And that's exactly what happens in the lines you asked about:
int max_single = Math.max(Math.max(l, r) + node.data, node.data);
int max_top = Math.max(max_single, l + r + node.data);
res.val = Math.max(res.val, max_top);
return max_single;
That first line finds the maximum of the node itself or the node plus the greatest subtree, and that result is the max path sum going through root. Returning that value on the last line is one thing that this function does. The second and third lines look at that value and consider whether either it or the path that includes both children is larger than any previously seen path, and if so, it updates res, which is the other thing this function does. Keep in mind that res is some object that exists outside the method, so changes to it persist until the recursion stops and findMaxSum(Node), which started the whole thing, returns the res.val.
So, getting back to the question at the top, the reason that the findMaxUtil returns max_single is that it uses that value to recursively determine the max path through each subtree. The value in res is also updated so that findMaxSum(Node) can use it.

You're missing the role of res.val. The algorithm explores the whole tree, with res.val holding the maximum path sum found so far. At each step it recurses into the children and updates res.val if it finds a path sum higher than the one already stored.
Proof:
Assume the algorithm works for trees of height n. A tree of height n+1 has a root and 2 subtrees of height at most n. Also assume that findMaxUtil works for heights i<=n and returns the maximum path sum starting at the root of each subtree.
So the maximum path sum in the tree of height n+1 is calculated from the following candidates:
1. findMaxUtil(subtree1)
2. findMaxUtil(subtree2)
3. findMaxUtil(subtree1)+root.data
4. findMaxUtil(subtree2)+root.data
5. findMaxUtil(subtree1)+findMaxUtil(subtree2)+root.data
6. res.val
And finally the result is the maximum of items 1 to 6, together with root.data on its own; that maximum is what ends up in res.val.

Honestly I think the description on that website is very unclear. I'll try to convince you of the reasoning behind the algorithm as best I can.
We have a binary tree, with values at the nodes:
And we are looking for a path in that tree, a chain of connected nodes.
As it's a directed tree, any nonempty path consists of a lowest-depth node (i.e. the node in the path that is closest to the root of the tree), a path of zero or more nodes descending to the left of the lowest-depth node, and a path of zero or more nodes descending to the right of the lowest-depth node. In particular, somewhere in the tree there is a node that is the lowest-depth node in the maximum path. (Indeed, there might be more than one such path tied for equal value, and they might each have their own distinct lowest-depth node. That's fine. As long as there's at least one, that's what matters.)
(I've used "highest" in the diagram but I mean "lowest-depth". To be clear, any time I use "depth" or "descending" I'm talking about position in the tree. Any time I use "maximum" I'm talking about the value of a node or the sum of values of nodes in a path.)
So if we can find its lowest-depth node, we know the maximum value path is composed of the node itself, a sub-path of zero or more nodes descending from (and including) its left child, and a sub-path of zero or more nodes descending from (and including) its right child. It's a small step to conclude that the left and right descending paths must be the maximum value such descending path on each side. (If this isn't obvious, consider that whatever other path you picked, you could increase the total value by instead picking the maximum value descending path on that side.) If either or both of those paths would have a negative value then we just don't include any nodes at all on the negative side(s).
So we have a separate subproblem - given a subtree, what is the value of the maximum value path descending through its root? Well, it might just be the root itself, if all the paths rooted at its children have negative sum, or if it has no children. Otherwise it is the root plus the maximum value descending path of either of those rooted at its children. This subproblem could easily be answered on its own, but to avoid repeated traversals and redoing work we'll combine them both into one traversal of the tree.
Going back to the main problem, we know that some node is the lowest-depth node in the maximum value path. We're not even particularly concerned with knowing when we visit it - we're just going to recursively visit every node and find the maximum value path that has that node as its lowest-depth node, assured that at some point we will visit the one we want. At each node we calculate both the maximum value path starting at that point and descending within the subtree (max_single) and the maximum value path for which this node is the lowest-depth node in the path (max_top). The latter is found by taking the node and "gluing on" zero, one or both of the maximum descending-only paths through its children. (Since max_single is already the maximum value path descending from zero or one of the children, the only extra thing we need to consider is the path that goes through both children.) By calculating max_top at every node and keeping the largest value found in res.val, we guarantee that we will have found the largest of all values by the time we have finished traversing the tree. At every node we return max_single to use in the parent's calculations. And at the end of the algorithm we just pull out the answer from res.val.
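To make the two roles concrete, here is a small self-contained sketch that runs the same max_single / max_top logic on a hand-built tree. The class name MaxPathSumDemo, the field best (standing in for res.val) and the sample tree are illustrative choices, not part of the original code.

```java
// Sketch only: same logic as findMaxUtil, with the running maximum kept in a field.
// MaxPathSumDemo, descend and best are illustrative names.
class MaxPathSumDemo {
    static final class Node {
        final int data;
        final Node left, right;
        Node(int data, Node left, Node right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    private int best; // plays the role of res.val

    int maxPathSum(Node root) {
        best = Integer.MIN_VALUE;
        descend(root);
        return best;
    }

    // Returns the best sum of a path that starts at node and only goes downward,
    // which is the only kind of path a parent can extend.
    private int descend(Node node) {
        if (node == null) return 0;
        int l = descend(node.left);
        int r = descend(node.right);
        int maxSingle = Math.max(Math.max(l, r) + node.data, node.data);
        // Path that bends at this node; it cannot be extended upward.
        int maxTop = Math.max(maxSingle, l + r + node.data);
        best = Math.max(best, maxTop);
        return maxSingle;
    }

    static Node sampleTree() {
        Node leftSubtree = new Node(2, new Node(20, null, null), new Node(1, null, null));
        Node negative = new Node(-25, new Node(3, null, null), new Node(4, null, null));
        Node rightSubtree = new Node(10, null, negative);
        return new Node(10, leftSubtree, rightSubtree);
    }

    public static void main(String[] args) {
        // Best path is 20 -> 2 -> 10 -> 10, which bends at the root.
        System.out.println(new MaxPathSumDemo().maxPathSum(sampleTree())); // prints 42
    }
}
```

At the root of this sample, the returned value is 32 (root plus the left descending path), while the recorded maximum is 42 (both sides glued on), which is exactly the difference between what a parent may reuse and what counts as an answer.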

Count number of nodes within given range in AVL tree

Suppose I have an AVL tree of distinct integers. I need to determine the number of nodes which lie in the interval [a, b) where a < b. Note that [a, b) is supplied by the user and hence I do not know beforehand what the value of a and b are. Also, a and b may not be present in the tree at all. For example, if I have a tree containing the integers {1, 2, 4, 5, 6, 7} then the user should expect an answer of 3 if he supplies the interval [3, 7).
A naive implementation would be to traverse every node and increment the count by 1 every time a node is found in the given interval. But this would have a worst case time complexity of O(n), as it is possible for every single integer in the tree to be within the given range. I need a faster algorithm, and after doing some research I found that it requires storing a size statistic in every node so that the rank of any given node can be easily computed.
I would like to do something like rank(b) - rank(a), but the problem is that a and b may not exist in the tree. In the above example, rank(7) would return 6 but rank(3) will not return any meaningful value.
Can anyone offer suggestions as to how I can address this issue? Also, I know that there is another similar question on this website, but that one involves C++ while this one involves Java. Also, I could not find a satisfactory answer there.

I implemented a stack-based tree iterator for an AVL tree some (long) time ago. It should work for your case like this:
Create an array "treestack" which holds structs with traversal info. The struct just needs a bool "visited" and a pointer to your node type. The array can be of static size, e.g. hold 64 info elements (one for each level of your tree, so 64 means your tree contains at most 4G nodes).
Change the search method of your AVL tree to put the root node at treestack[0] when you begin the search, and push every other node on top of the treestack as you follow the left and right child nodes during the search. Edit: Note that an unsuccessful search leaves a node with the next smaller or next higher value on top of your treestack, which is exactly what you want (just skip counting it if it's smaller; we still have a valid iterator start path).
You now have a path in your treestack which you can use for a subsequent in-order traversal to find the next higher values. In-order traversal using the stack works like this:
Start at the last element in treestack, and keep a treeindex which is initially the array index of the last item.
When there is a right node and it is not marked visited: follow ONE right link from the current node, then keep going left as far as possible through the following nodes. Wherever you stop (also if there are no left nodes, just the single right one) is the next higher value. Put all nodes at the end of your treestack as you follow them, and increment your treeindex as you follow paths. Mark the chosen final node with the next higher value as visited, and increase your node counter by 1.
Now, to find subsequent higher values, ascend in the tree by taking treeindex-1 in your treestack as the current node, and repeat the above step to find the next node with a higher value.
If the current node has no right child and is not marked visited: mark it as visited and increment the node counter by 1.
You're done when you reach either the root node with treeindex 0 or the node containing the max value.
I hope it helps.

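Instead of
rank(b) - rank(a)
what I would do is
rank(X) - rank(Y)
X is the very first node having value >= b (b itself is excluded from [a, b)).
Y is the very first node having value >= a.
If the tree has no such node, treat its rank as one more than the number of nodes.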
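One way to sidestep the "a or b is not in the tree" problem is to count the keys strictly less than a value instead of looking up a node's rank; that count is the rank of the first node >= the value, minus one. The sketch below assumes each node stores its subtree size, as the question suggests. SizedBst, countLess and countInRange are made-up names, and AVL rotations are omitted for brevity (a rotation would only need to recompute size for the nodes it touches).

```java
// Sketch only: a BST with subtree sizes; rebalancing is deliberately left out.
// SizedBst, countLess and countInRange are illustrative names.
class SizedBst {
    private static final class Node {
        final int key;
        int size = 1;        // number of nodes in the subtree rooted here
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    void add(int key) {
        root = add(root, key);
    }

    private Node add(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = add(node.left, key);
        else if (key > node.key) node.right = add(node.right, key);
        node.size = 1 + sizeOf(node.left) + sizeOf(node.right);
        return node;
    }

    private static int sizeOf(Node node) {
        return node == null ? 0 : node.size;
    }

    // Number of keys strictly less than x; x does not have to be in the tree.
    // Walks a single root-to-leaf path, so it costs O(height).
    int countLess(int x) {
        int count = 0;
        Node node = root;
        while (node != null) {
            if (x <= node.key) {
                node = node.left;
            } else {
                count += 1 + sizeOf(node.left); // this node and its whole left subtree
                node = node.right;
            }
        }
        return count;
    }

    // Number of keys in the half-open interval [a, b).
    int countInRange(int a, int b) {
        return countLess(b) - countLess(a);
    }

    public static void main(String[] args) {
        SizedBst tree = new SizedBst();
        for (int key : new int[] {4, 2, 6, 1, 5, 7}) tree.add(key);
        System.out.println(tree.countInRange(3, 7)); // prints 3
    }
}
```

With a balanced tree the two descents make the whole query O(log n), regardless of how many keys fall inside the interval.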

better way to find depth of a node in tree

I want to compute the depth of a tree.
I have written the code below.
My questions are:
What is the order of this code? Is it O(n) (where n is the number of tree nodes)?
Is there any other way which you think is better and faster?
Thanks in advance
public int height(Node n)
{
    if (n == null)
        return 0;
    return 1 + Math.max(height(n.left), height(n.right));
}

It's O(n), regardless of what type of tree you have, since you're visiting every single node to establish the maximum depth.
The only more efficient way is to have extra information about the tree stored somewhere. If it's balanced, you know the maximum height based on the number of nodes.
Alternatively, you can cache the information. Have two extra variables depth and dirty and initially set dirty to true:
When a caller requests the depth and dirty is true, call your function to work it out, but then store the result in depth and set dirty to false.
When a caller requests the depth and dirty is false, just return depth.
Whenever the structure of the tree is changed (insert or delete node), set dirty to true.
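The depth/dirty caching described above could look roughly like the following. This is a minimal sketch: CachedDepthTree is a made-up class, the tree is a plain unbalanced BST with insertion only, and the recomputations counter exists purely to make the caching observable.

```java
// Sketch only: lazy height caching with a dirty flag.
// CachedDepthTree and recomputations are illustrative names.
class CachedDepthTree {
    private static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;
    private int depth;            // last computed value, valid only while dirty is false
    private boolean dirty = true; // set whenever the structure may have changed
    int recomputations = 0;       // only here to show how often the O(n) walk runs

    void insert(int key) {
        root = insert(root, key);
        dirty = true;
    }

    private Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        return node;
    }

    // O(n) the first time after a change, O(1) afterwards.
    int depth() {
        if (dirty) {
            depth = height(root);
            dirty = false;
            recomputations++;
        }
        return depth;
    }

    private static int height(Node n) {
        if (n == null) return 0;
        return 1 + Math.max(height(n.left), height(n.right));
    }
}
```

This pays off when the depth is queried many times between modifications; if every query is preceded by an insert, it degenerates to the original O(n) walk.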

The order of your function is O(n).
Store the height in each node as you create it, and maintain that height as you add, remove, and balance the tree.

Well, I see that your question asks for depth, but the code finds the height of the node.
I am not really sure which you want, height or depth. If you require the code for depth, this is not the way. See below:
Depth(n) = 1 + Depth(P(n))
is the recursive definition of depth, where P(n) is the parent of n.
For height, what you wrote is correct.
Check Tree Operations; I have written up most of the operations on a tree there, with their asymptotic analysis.
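For contrast with the height function, the recursive depth definition can be sketched as below. It assumes each node keeps a reference to its parent and that the root has depth 0; neither assumption is stated in the answer, and DepthByParent is a made-up name.

```java
// Sketch only: depth computed by walking up parent links.
// Assumes parent pointers exist and that the root has depth 0.
class DepthByParent {
    static final class Node {
        final Node parent; // null for the root
        Node(Node parent) { this.parent = parent; }
    }

    // Depth(n) = 1 + Depth(P(n)), with the root as the base case.
    static int depth(Node n) {
        return n.parent == null ? 0 : 1 + depth(n.parent);
    }
}
```

Depth looks upward from one node and costs time proportional to that node's depth, whereas height looks downward and has to visit the whole subtree.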

Yes, your code is O(n), because in the worst case you call height on the root node (and then you have to touch every node to find the longest path to a leaf).
TTS's code is the same as yours, and I don't think there is a faster way to do it.

how to link elements with a certain probability inversely proportional to a variable

I have two array lists and I would like to link elements from the first list to elements of the second list. Elements have a property, say A.
The condition is: an element of the first list with a high value of element.getA() prefers to link with an element of the second list with a low value of A.
I understand that to select an element according to a biased probability I can calculate the cumulative probabilities and then do something like this: Selecting nodes with probability proportional to trust
Let's see if this is clearer: think about the preferential attachment mechanism. In that case, a node links to another node with a probability which increases with the degree of the chosen node. I simply would like to hack preferential attachment and bias the probability for a node to link to another node not only on a property of the second node, but also on a property of the first node. And I want this to be inverse, so that small nodes prefer to link to big nodes and big nodes prefer to link to small nodes.
Best regards,
Simone

[edited]
For each pair, calculate the difference (or absolute difference, or squared difference), then use that difference as the weight when selecting one pair.
Remove pairs that are no longer valid and repeat.
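A rough sketch of that procedure follows, using the absolute difference of the A values as the weight and cumulative sums for the weighted draw. It assumes that "no longer valid" means each element may be linked at most once, which the answer does not spell out; InversePairLinker and link are made-up names, and plain double arrays stand in for the element.getA() values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch only: repeatedly draw one (i, j) pair with probability proportional to
// |first[i] - second[j]|, then drop every pair that reuses i or j.
// Assumption: each element can take part in at most one link.
class InversePairLinker {
    static List<int[]> link(double[] first, double[] second, Random rng) {
        List<int[]> candidates = new ArrayList<>();
        for (int i = 0; i < first.length; i++) {
            for (int j = 0; j < second.length; j++) {
                candidates.add(new int[] {i, j});
            }
        }
        List<int[]> links = new ArrayList<>();
        while (!candidates.isEmpty()) {
            double total = 0;
            for (int[] p : candidates) total += weight(first, second, p);
            int[] chosen = candidates.get(candidates.size() - 1);
            if (total == 0) {
                // All remaining differences are zero: fall back to a uniform choice.
                chosen = candidates.get(rng.nextInt(candidates.size()));
            } else {
                double target = rng.nextDouble() * total;
                double cumulative = 0;
                for (int[] p : candidates) {
                    cumulative += weight(first, second, p);
                    if (target < cumulative) {
                        chosen = p;
                        break;
                    }
                }
            }
            links.add(chosen);
            final int usedI = chosen[0];
            final int usedJ = chosen[1];
            candidates.removeIf(p -> p[0] == usedI || p[1] == usedJ);
        }
        return links;
    }

    private static double weight(double[] first, double[] second, int[] pair) {
        return Math.abs(first[pair[0]] - second[pair[1]]);
    }
}
```

Swapping the body of weight for a squared difference makes the preference for dissimilar partners stronger; the rest of the loop stays the same.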

Is this list a Binary Search Tree?

Is the following list a BST or not?
list:{2,5,3,8,6}
Is there a way I can determine this?
Consider that my list will have 100000 elements.

Quick answer is: no. BST is a tree and you have a list.
Long answer is: it may be possible to encode a tree as a list, in which case, whether your list can be decoded into a tree or not will depend on the encoding algorithm. See BST wiki page for more details http://en.wikipedia.org/wiki/Binary_search_tree
In fact your list may be an encoded version of a BST. If you read elements from left to right, pushing them onto a stack, and whenever your stack has 3 elements do:
parent = pop()
right = pop()
left = pop()
push new Node(parent, left, right)
then you should get a valid BST. But I am only speculating here.
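To check that speculation rather than take it on faith, the stack procedure can be written out and followed by an ordinary BST validity test. This is a sketch under the answer's own assumed encoding; ListToBstDemo, decode and isBst are made-up names, and decode returns null when the list does not collapse into a single tree.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: decode a list with the speculative stack scheme, then verify the result.
class ListToBstDemo {
    static final class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Pushes each element as a leaf; whenever three entries are stacked,
    // the newest becomes the parent of the two below it.
    static Node decode(int[] list) {
        Deque<Node> stack = new ArrayDeque<>();
        for (int v : list) {
            stack.push(new Node(v, null, null));
            if (stack.size() == 3) {
                Node parent = stack.pop();
                Node right = stack.pop();
                Node left = stack.pop();
                stack.push(new Node(parent.value, left, right));
            }
        }
        return stack.size() == 1 ? stack.pop() : null;
    }

    // Standard bounds check: every value must lie strictly between lo and hi.
    static boolean isBst(Node n, long lo, long hi) {
        if (n == null) return true;
        if (n.value <= lo || n.value >= hi) return false;
        return isBst(n.left, lo, n.value) && isBst(n.right, n.value, hi);
    }

    static boolean decodesToBst(int[] list) {
        Node root = decode(list);
        return root != null && isBst(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(decodesToBst(new int[] {2, 5, 3, 8, 6})); // prints true
    }
}
```

For {2,5,3,8,6} the decoded tree has 6 at the root, the subtree 3 (with children 2 and 5) on the left and 8 on the right. Both steps are linear in the list length, so 100000 elements are not a problem.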

You have a list; you need to construct a BST from this list.
A BST has the following properties:
1- Each node has at most two children
2- For each node, every value in its left subtree is smaller than the node's value
3- For each node, every value in its right subtree is greater than the node's value
A BST must stay valid, i.e. while inserting nodes into a BST, the code must respect the above 3 conditions.
Searching in a balanced BST is an O(log n) operation, because each search step divides the search space into two halves and chooses one of them.
There is a case where search will take O(N) time.
Consider the following:
node = {1,2,3,4,5}
If we build the BST by inserting this node set in order, it will be right-aligned, meaning every next node ends up in the right subtree; here, if we want to search for an item, we need to traverse the whole right subtree like a linked list.
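The effect of insertion order on the shape of a plain, non-rebalancing BST can be seen with a few lines of code. SkewedBstDemo and its methods are made-up names for illustration.

```java
// Sketch only: the same keys inserted in two different orders into a plain BST.
class SkewedBstDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        return node;
    }

    static Node build(int... keys) {
        Node root = null;
        for (int key : keys) root = insert(root, key);
        return root;
    }

    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    public static void main(String[] args) {
        System.out.println(height(build(1, 2, 3, 4, 5))); // prints 5: a chain of right children
        System.out.println(height(build(3, 1, 4, 2, 5))); // prints 3
    }
}
```

A search follows one root-to-leaf path, so its cost tracks the height: 5 steps in the worst case for the sorted insertion order versus 3 for the mixed one.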
