Suppose I have an AVL tree of distinct integers. I need to determine the number of nodes which lie in the interval [a, b) where a < b. Note that [a, b) is supplied by the user and hence I do not know beforehand what the values of a and b are. Also, a and b may not be present in the tree at all. For example, if I have a tree containing the integers {1, 2, 4, 5, 6, 7} then the user should expect an answer of 3 if he supplies the interval [3, 7).
A naive implementation would be to traverse every node and increment the count by 1 every time a node is found in the given interval. But this would have a worst case time complexity of O(n), as it is possible for every single integer in the tree to be within the given range. I need a faster algorithm, and after doing some research I found that it requires storing a size statistic in every node so that the rank of any given node can be easily computed.
I would like to do something like rank(b) - rank(a), but the problem is that a and b may not exist in the tree. In the above example, rank(7) would return 6 but rank(3) will not return any meaningful value.
Can anyone offer suggestions as to how I can address this issue? I know that there is a similar question on this website, but that one involves C++ while this one involves Java, and I could not find a satisfactory answer there.
I've implemented a stack based tree iterator for an AVL tree some (long) time ago. It should work for your case like this:
Create an array "treestack" which holds structs for traversal info. The struct just needs a bool "visited" and a pointer to your node type. The array can be of static size, e.g. 64 info elements (one for each level of your tree; an AVL tree with 4G nodes is still well under 64 levels deep).
Change the search method of your AVL tree to put the root node at treestack[0] when you begin the search, and push every other node you pass onto the top of the treestack as you follow left and right child links during the search. Edit: Note that an unsuccessful search will leave a node with the next smaller or next larger value on top of your treestack, which is exactly what you want (just skip counting it if it is smaller; we still have a valid iterator start path).
You now have a path in your treestack which you can use for a subsequent in-order traversal to find the next higher values. In-order traversal using the stack works like this:
Start at the last element in the treestack, and keep a treeindex which is initially the array index of that last item.
When the current node has a right child that is not marked visited: follow ONE step to the right of the current node, then keep going left through the following nodes for as long as possible. Wherever you stop (so also if there are no left children, only that single right one) is the node with the next higher value. Push every node you pass onto the end of your treestack and increment your treeindex as you follow the path. Mark the chosen final node (the one with the next higher value) as visited, and increase your node counter by 1.
Now, to find subsequent higher values, ascend in the tree by taking treeindex-1 in your treestack as the current node, and repeat the above step to find the next node with a higher value.
If the current node has no right child and is not marked visited: mark it as visited and increase the node counter by 1.
You're done when you either reach the root node at treeindex 0 or the node containing the maximum value (for the interval [a, b) you would also stop as soon as you reach a value >= b).
I hope it helps.
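For reference, here is a condensed Java sketch of this idea (the Node class and every name below are mine, just for illustration). It uses java.util.ArrayDeque instead of the fixed-size struct array and drops the visited flag; instead, each counted node's right-hand path is pushed so the in-order walk can continue:

class StackRangeCount {
    static class Node {
        int key;
        Node left, right;
    }

    // Counts the keys in [a, b) with an explicit-stack in-order walk that
    // starts at the first key >= a, following the steps described above.
    static int countInRange(Node root, int a, int b) {
        java.util.ArrayDeque<Node> stack = new java.util.ArrayDeque<>();
        // Search phase: keep only the nodes on the search path whose key is >= a.
        // A node smaller than a (and its whole left subtree) can never be in range.
        Node cur = root;
        while (cur != null) {
            if (cur.key >= a) {
                stack.push(cur);
                cur = cur.left;
            } else {
                cur = cur.right;
            }
        }
        // Traversal phase: ordinary stack-based in-order traversal from here on.
        int count = 0;
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            if (node.key >= b) break;          // keys only grow from here, so stop
            count++;
            for (Node r = node.right; r != null; r = r.left) {
                stack.push(r);                 // the next higher keys live down this path
            }
        }
        return count;
    }
}

The work done is proportional to the tree height plus the number of keys that actually fall inside [a, b).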
Instead of
rank(b) - rank(a)
what I would do is
rank(X) - rank(Y)
X is the very first node having value >= b (>= rather than >, since b itself is excluded from [a, b)).
Y is the very first node having value >= a.
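To make this concrete, here is a minimal Java sketch, assuming every node keeps a size field for its subtree (the class and method names are mine, not from any particular library). Here "rank" is taken to mean the number of keys strictly less than a given value, which is well defined even when that value is not present in the tree:

class RangeCount {
    static class Node {
        int key;
        int size = 1;              // number of nodes in this subtree, maintained on insert/delete
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int size(Node n) { return n == null ? 0 : n.size; }

    // Number of keys strictly less than v; works whether or not v is in the tree.
    static int countLess(Node root, int v) {
        int count = 0;
        Node cur = root;
        while (cur != null) {
            if (v <= cur.key) {
                cur = cur.left;                // cur.key and its right subtree are >= v
            } else {
                count += 1 + size(cur.left);   // cur.key and its whole left subtree are < v
                cur = cur.right;
            }
        }
        return count;
    }

    // Keys in [a, b): everything below b minus everything below a.
    static int countInRange(Node root, int a, int b) {
        return countLess(root, b) - countLess(root, a);
    }
}

On the example tree {1, 2, 4, 5, 6, 7}, countLess(7) = 5 and countLess(3) = 2, so countInRange returns 3 for [3, 7), even though 3 is not in the tree.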
I really need your help to understand recursion properly. I can understand basic recursion and its logic, like factorial:
int factorial(int n) {
    if (n <= 1)
        return n;
    else
        return n * factorial(n - 1);
}
That's easy: the function keeps calling factorial until n gets down to 1 and finally multiplies all the results together. But recursion like tree traversal is hard for me to understand:
void inorderTraverse(Node* head) {
    if (head != NULL) {
        inorderTraverse(head->left);
        cout << head->data;
        inorderTraverse(head->right);
    }
}
Here is where I lose the logic: if the first recursive call goes back into the function, how does it ever reach the cout line, and how does it ever show the right child's data? I really need your help.
Thank you
A binary search tree in alphabetical order:
B
/ \
A C
inorderTraverse(&B) will first descend to A and print it (recursively printing any subtree), then print B, then descend to C and print it (recursively printing any subtree).
So in this case, A B C.
For a more complicated tree:
D
/ \
B E
/ \
A C
This will be printed as A B C D E. Notice how the original tree is on the left of D, so is printed in its entirety first; the problem is reduced to a smaller instance of the starting problem. This is the essence of recursion. Next D is printed, then everything on the right of D (which is just E).
Note that in this implementation, the nodes don't know about their parent. The recursion means this information is stored on the call stack. You could replace the whole thing with an iterative implementation, but you would need some stack-like data structure to keep track of moving back up through the tree.
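For illustration, here is one possible iterative version, sketched in Java rather than C++ (the Node class and names are mine). The explicit stack holds exactly the "where do I go back up to" information that the call stack holds in the recursive version:

class IterativeInorder {
    static class Node {
        int data;
        Node left, right;
    }

    static void inorderTraverse(Node head) {
        java.util.ArrayDeque<Node> stack = new java.util.ArrayDeque<>();
        Node cur = head;
        while (cur != null || !stack.isEmpty()) {
            while (cur != null) {              // keep descending left, remembering the path
                stack.push(cur);
                cur = cur.left;
            }
            cur = stack.pop();                 // the leftmost node not yet visited
            System.out.println(cur.data);      // "visit" it
            cur = cur.right;                   // then repeat the whole dance on its right subtree
        }
    }
}

Each node is pushed once on the way down and popped once when it is its turn to be printed, which is the iterative counterpart of the recursion unwinding back up through the tree.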
In-order traversal means you traverse Left-Root-Right. For a single level that is easy: you print in left-root-right order. But as the number of levels increases you need to make sure your algorithm traverses the same way, so at every level you print the left subtree first, then the root of that subtree, and then the right subtree.
The recursive call inorderTraverse(head->left) says: as long as the node is not null, keep going down the left side of the tree. Once it reaches the end, it prints that leftmost node, then prints the root node of that subtree, and whatever you did on the left subtree you do the same way on the right subtree; that is why you write inorderTraverse(head->right). Start debugging by creating three-level trees. Happy learning.
Try to imagine a binary tree and then start traversing it from the root. You always go left. If there are no more lefts, then you go right, and after that you just go up. You will finish back at the root (coming from its right side).
It is similar to going through a maze. You can decide that you will always go left (you always keep touching the left wall). At the end you will come out at the exit, or back at the entrance if there isn't another exit.
In this code it is important that there are two recursive calls in the body. The first is for the left subtree and the second is for the right subtree. When one of them finishes, the function returns you back to the node where you started.
Binary search trees have the property that for every node, the left subtree contains values that are smaller than the current node's value, and the right subtree contains values that are larger.
Thus, for a given node to yield the values in its subtree in-order the function needs to:
Handle the values less than the current value;
Handle its value;
Handle the values greater than the current value.
If you think of your function initially as a black box that deals with a subtree, then the recursion magically appears:
Apply the function to the left subtree;
Deal with the current value;
Apply the function to the right subtree.
Basically, the trick is to initially think of the function as a shorthand way to invoke an operation, without worrying about how it might be accomplished. When you think of it abstractly like that, and you note that the current problem's solution can be achieved by applying that same functionality to a subset of the current problem, you have a recursion.
Similarly, post-order traversal boils down to:
Deal with all my children (left subtree, then right subtree, or vice-versa if you're feeling contrary);
Now I can deal with myself.
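As a tiny illustration, here is a post-order sketch in Java (the Node class is just a stand-in):

class PostorderSketch {
    static class Node {
        int key;
        Node left, right;
    }

    // Deal with both subtrees first, then with the node itself.
    static void postorder(Node node) {
        if (node == null) return;
        postorder(node.left);             // all of the left children
        postorder(node.right);            // all of the right children
        System.out.println(node.key);     // now the node itself
    }
}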
Assume the height of the BST is h.
If we want to delete a node with two children, then what would be the time complexity of the process.
I know that in a normal binary search tree, the time complexity for deletion is O(h): O(n) in the worst case and O(log n) in the best case. But since we are replacing the key of the node being deleted with the minimum key of its right subtree, it will take extra time to find that minimum key.
So does anybody know how to explain the time complexity in this situation?
Source Wikipedia :
Deletion
There are three possible cases to consider:
Deleting a leaf (node with no children): Deleting a leaf is easy, as we can simply remove it from the tree.
Deleting a node with one child: Remove the node and replace it with its child.
Deleting a node with two children: Call the node to be deleted N. Do not delete N. Instead, choose either its in-order successor node or its in-order predecessor node, R. Copy the value of R to N, then recursively call delete on R until reaching one of the first two cases. If you choose the in-order successor of a node (its right subtree is not NIL, since in our present case the node has 2 children), then its in-order successor is the node with the least value in its right subtree, which has at most one child, so deleting it falls into one of the first 2 cases.
(The article illustrates this with a figure: to delete a node with two children from a binary search tree, first the rightmost node in the left subtree, the in-order predecessor, is identified. Its value is copied into the node being deleted. The in-order predecessor can then be easily deleted because it has at most one child. The same method works symmetrically using the in-order successor.)
Consistently using the in-order successor or the in-order predecessor for every instance of the two-child case can lead to an unbalanced tree, so some implementations select one or the other at different times.
Runtime analysis:
Although this operation does not always traverse the tree down to a leaf, this is always a possibility; thus in the worst case it requires time proportional to the height of the tree. It does not require more even when the node has two children, since it still follows a single path and does not visit any node twice. The pointer adjustments themselves, in all three cases, take constant time.
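For concreteness, here is a hedged Java sketch of deletion in a plain, unbalanced BST (my own illustration, not the Wikipedia code), with the two-children case handled via the in-order successor:

class BstDelete {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Deletes 'key' from the subtree rooted at 'node' and returns the new subtree root.
    static Node delete(Node node, int key) {
        if (node == null) return null;
        if (key < node.key) {
            node.left = delete(node.left, key);
        } else if (key > node.key) {
            node.right = delete(node.right, key);
        } else if (node.left == null) {                    // no children, or only a right child
            return node.right;
        } else if (node.right == null) {                   // only a left child
            return node.left;
        } else {                                           // two children
            Node succ = node.right;
            while (succ.left != null) succ = succ.left;    // smallest key in the right subtree
            node.key = succ.key;                           // copy the successor's value
            node.right = delete(node.right, succ.key);     // then delete the successor (<= 1 child)
        }
        return node;
    }
}

Note how the two-children branch only descends further into the right subtree, so the whole operation still follows a single root-to-leaf path, which is the O(h) bound discussed above.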
Useful Links :
http://en.wikipedia.org/wiki/Binary_search_tree#Deletion
http://cse.iitkgp.ac.in/~pb/algo-1-pb-10.pdf
I need to find the height of a binary search tree in O(1) time. The only way I could think of to do this is to put a check in the add and remove methods, incrementing a global counter. Is there any other way?
O(1) time suggests that you should already have the height when it is requested.
The best way is to keep/update the correct value whenever a node is added or deleted. You are doing it the right way; however, it adds a little work to addition and deletion.
You can do it in a number of ways, for example by keeping a depth value along with each node in the tree:
class Node {
    int depth;       // depth of this node below the root
    Object value;
}
Node lowestNode;
I can think of storing a reference to the maximum-depth node (lowestNode above) and using its depth. So whenever you add a new element, you can work out the element's depth from its parent's depth and replace the max-depth node if the new element is deeper.
If you are deleting the max-depth node, then replace the reference with its parent; otherwise correct the depths of the affected elements recursively along the tree.
The height of the tree is then lowestNode.depth.
Storing an attribute for the height and updating it when you do add/remove should be the most reasonable solution.
If the tree is guaranteed to be balanced (e.g. AVL), you can derive the height from the number of elements in the tree (you have to keep the element count, though :P).
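As a sketch of that first suggestion (every name below is made up for illustration): keep a height field in each node and refresh it on the way back up from an insert. An AVL tree maintains exactly this field for its rebalancing anyway, so asking the root for its height is O(1); deletion would refresh the heights along its search path in the same manner:

class HeightTrackingBst {
    static class Node {
        int key;
        int height = 1;            // height of the subtree rooted at this node
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    int height() {                 // O(1): just read the cached value at the root
        return root == null ? 0 : root.height;
    }

    void add(int key) { root = insert(root, key); }

    private static Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        node.height = 1 + Math.max(h(node.left), h(node.right));   // refresh on the way back up
        return node;
    }

    private static int h(Node n) { return n == null ? 0 : n.height; }
}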
I'm faced with a problem which requires a Queue data structure supporting fast k-th largest element finding.
The requirements of this data structure are as follows:
The elements in the queue are not necessarily integers, but they must be comparable to each other, i.e. we can tell which one is greater when we compare two elements (they can be equal as well).
The data structure must support enqueue (adds the element at the tail) and dequeue (removes the element at the head).
It can quickly find the k-th largest element in the queue; please note k is not a constant.
You can assume that operations enqueue , dequeue and k-th largest element finding all occur with the same frequency.
My idea is to use a modified balanced binary search tree. The tree is the same as an ordinary balanced binary search tree except that every node x is augmented with another field n(x), which denotes the number of nodes contained in the subtree rooted at x. The aforementioned operations are supported as follows:
For simplicity assume that all elements are distinct.
Enqueue(x): x is first inserted into the tree; suppose the corresponding tree node is t. We append the pair (x, pointer to t) to the queue.
Dequeue: suppose (e1, node1) is the element at the head, where node1 is the pointer into the tree corresponding to e1. We delete node1 from the tree and remove (e1, node1) from the queue.
K-th largest element finding: suppose the root node is r, and let n(right) be the size of its right subtree (0 if there is none). We compare K with n(right); three cases may happen:
if K <= n(right), we find the K-th largest element in the right subtree of r;
if K > n(right) + 1, we find the (K - n(right) - 1)-th largest element in the left subtree of r;
otherwise (K = n(right) + 1), r is the node we want.
The time complexity of each of the three operations is O(log N), where N is the number of elements currently in the queue.
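To make the selection step concrete, here is a rough Java sketch (the field names are only for illustration, and maintaining the size field during insert/delete/rebalance is omitted):

class KthLargest {
    static class Node {
        int key;
        int size;                  // number of nodes in this subtree
        Node left, right;
    }

    static int size(Node n) { return n == null ? 0 : n.size; }

    // Returns the k-th largest key (1-based); assumes 1 <= k <= size(root).
    static int kthLargest(Node root, int k) {
        Node cur = root;
        while (cur != null) {
            int r = size(cur.right);
            if (k <= r) {
                cur = cur.right;           // the answer lies among the larger keys
            } else if (k == r + 1) {
                return cur.key;            // the current node is exactly the k-th largest
            } else {
                k -= r + 1;                // skip the right subtree and the current node
                cur = cur.left;
            }
        }
        throw new IllegalArgumentException("k is out of range");
    }
}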
How can I speed up the operations mentioned above? With what data structures and how?
Note: you cannot achieve better than O(log n) for all of them; at best you get to choose which op you care about the most. (Otherwise, you could sort in O(n) by feeding the array to the DS and querying the 1st, 2nd, 3rd, ..., nth elements.)
Using a skip list instead of a balanced BST as the sorted structure can reduce dequeue complexity to O(1) average case. It does not affect the complexity of any other op.
To remove from a skip list, all you need to do is get to the element using the pointer from the head of the queue, then follow the links up and remove each one. The expected number of nodes that need to be deleted is 1 + 1/2 + 1/4 + ... = 2.
Finding the K-th element can be achieved in O(log K) by starting from the leftmost node (and not the root) and making your way up until you find you have "more children than needed", then treating the node you just found as the root, just like the algorithm in the question. Though it is better in asymptotic complexity, the constant factor is about double.
I found an interesting paper:
Sliding-Window Top-k Queries on Uncertain Streams published in VLDB 2008 and cited by 71.
https://www.cse.ust.hk/~yike/wtopk.pdf
VLDB is among the best conferences in the database research area, and the number of citations suggests the data structure actually works.
The paper looks pretty difficult, but if you really need to improve your data structure, I suggest you read this paper or the papers in its reference list.
You can also use a finger tree.
For example, a priority queue can be implemented by labeling the internal nodes with the minimum priority of their children in the tree, and an indexed list/array can be implemented by labeling nodes with the count of the leaves below them. Finger trees can provide amortized O(1) cons, reversing, cdr, O(log n) append and split; and can be adapted to be indexed or ordered sequences.
Also note that being a purely functional structure makes this a good choice for concurrent usage.
Is the following list a BST or not?
list:{2,5,3,8,6}
Is there a way I can determine this?
Consider that my list will have 100000 elements.
The quick answer is: no. A BST is a tree, and what you have is a list.
The long answer is: it may be possible to encode a tree as a list, in which case whether your list can be decoded into a tree will depend on the encoding algorithm. See the BST wiki page for more details: http://en.wikipedia.org/wiki/Binary_search_tree
In fact your list may be an encoded version of a BST. If you read the elements from left to right, pushing them onto a stack, and whenever your stack has 3 elements you do:
parent = pop()
right = pop()
left = pop()
push new Node(parent, left, right)
then you should get a valid BST. But I am only speculating here.
You have a list; you need to construct a BST from this list.
A BST has the following properties:
1- Each node has at most two children
2- For each node, every value in its left subtree is smaller than the node's value
3- For each node, every value in its right subtree is greater than the node's value
A BST does not have to be balanced; while inserting nodes into a BST, the code must only respect the above ordering conditions.
Searching in a balanced BST is an O(log n) operation because each search step divides the search space into two halves and descends into one of them.
There is a case where search will take O(N) time.
Consider the following:
node = {1,2,3,4,5}
If we build the BST from this node set by inserting the values in order, it will be right-aligned, meaning every next node ends up in the right subtree. If we then want to search for an item, we may need to traverse the whole right spine like a linked list.
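A small Java sketch (illustrative only) of that degenerate case: inserting already-sorted keys into a plain, unbalanced BST produces a right-leaning chain whose height equals the number of keys.

class DegenerateBstDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = insert(node.left, key);
        else node.right = insert(node.right, key);
        return node;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int key : new int[]{1, 2, 3, 4, 5}) {
            root = insert(root, key);           // sorted input: every key goes to the right
        }
        int chainLength = 0;
        for (Node cur = root; cur != null; cur = cur.right) {
            chainLength++;                      // walk the right spine only
        }
        System.out.println("height = " + chainLength);   // prints 5: the tree is a linked list
    }
}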