Is the following list a BST or not?
list:{2,5,3,8,6}
Is there a way I can determine this?
Consider that my list will have 100000 elements.
Quick answer is: no. BST is a tree and you have a list.
Long answer is: it may be possible to encode a tree as a list, in which case, whether your list can be decoded into a tree or not will depend on the encoding algorithm. See BST wiki page for more details http://en.wikipedia.org/wiki/Binary_search_tree
In fact, your list may be an encoded version of a BST. If you read the elements from left to right, pushing them onto a stack, and whenever your stack has 3 elements do:
parent = pop()
right = pop()
left = pop()
push new Node(parent, left, right)
then you should get a valid BST. But I am only speculating here.
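The speculation above can be checked with a short sketch. It is only an illustration under assumptions: the `Node` class, `as_node`, `decode`, and `is_bst` are hypothetical names, and the three-element rule is applied exactly as written (the last element pushed becomes the parent).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None


def as_node(item):
    # Wrap bare values in a leaf node; pass existing subtrees through.
    return item if isinstance(item, Node) else Node(item)


def decode(values):
    """Fold a list into a tree using the three-element stack rule."""
    stack = []
    for v in values:
        stack.append(v)
        if len(stack) == 3:
            parent = as_node(stack.pop())
            right = as_node(stack.pop())
            left = as_node(stack.pop())
            parent.left, parent.right = left, right
            stack.append(parent)
    return stack


def is_bst(node, lo=float("-inf"), hi=float("inf")):
    """Check that every key lies strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    if not (lo < node.value < hi):
        return False
    return is_bst(node.left, lo, node.value) and is_bst(node.right, node.value, hi)


stack = decode([2, 5, 3, 8, 6])
root = stack[0]
print(len(stack), root.value, is_bst(root))  # 1 6 True
```

For this particular list the result is a single tree rooted at 6 that passes the BST check. A list such as {1, 2, 3} decoded the same way does not, so the outcome depends on the input.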
You have a list, so you first need to construct a BST from it.
A BST has the following properties:
1- Each node has at most two children (a node with none is a leaf)
2- For each node, every key in its left subtree is smaller than the node's value
3- For each node, every key in its right subtree is greater than the node's value
A BST does not have to be balanced, but code that inserts nodes into a BST must respect the above 3 conditions.
Searching in a balanced BST is an O(log n) operation, because each search step divides the search space into two halves and chooses one of them.
There is a case where search will take O(N) time.
Consider the following:
node = {1,2,3,4,5}
If we build the BST by inserting this node set in order, it will be right-aligned: every next node goes into the right subtree. To search for an item here, we need to traverse the whole right spine like a linked list.
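A minimal sketch of the degenerate case, assuming plain unbalanced insertion (the `Node`, `insert`, `height`, and `build` names are made up for illustration):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain (unbalanced) BST insertion, iterative; duplicates are ignored."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                break
            cur = cur.left
        elif key > cur.key:
            if cur.right is None:
                cur.right = Node(key)
                break
            cur = cur.right
        else:
            break
    return root


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


print(height(build([1, 2, 3, 4, 5])))  # 5: every node hangs off the right
print(height(build([3, 1, 4, 2, 5])))  # 3: same keys, friendlier order
```

The same five keys give a height of 5 or 3 depending only on insertion order, which is why self-balancing variants such as AVL trees exist.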
Assume we have a tree of nodes (Huffman-tree) that hold string values in them. How would I go about traversing the tree and spitting out an index of a specific node if I had a tree like this? The numbers I drew inside the circles would be the index I want (especially 12 or 13).
Note: Due to miscommunication, I will repeat: the #'s I wrote inside the circles are not the values that the nodes hold. They are the index of that node. My problem was that I couldn't find the index, since the trees are structured weirdly - not a classic BST tree, and the values inside aren't numerical.
Edit: I redrew the image to make my question more clear.
Either way, I figured it out. I'll write up the answer after my finals.
The tree you are showing is not a binary search tree. The central property of a binary search tree that allows it to be searched efficiently is that the left descendants of a node are smaller, and the right descendants bigger than the node itself (in terms of the index value).
If you have a proper binary search tree, you can find a node with given index by comparing with nodes and following the corresponding branch, starting with the root.
Suppose I have an AVL tree of distinct integers. I need to determine the number of nodes which lie in the interval [a, b) where a < b. Note that [a, b) is supplied by the user and hence I do not know beforehand what the values of a and b are. Also, a and b may not be present in the tree at all. For example, if I have a tree containing the integers {1, 2, 4, 5, 6, 7} then the user should expect an answer of 3 if he supplies the interval [3, 7).
A naive implementation would be to traverse every node and increment the count by 1 every time a node is found in the given interval. But this would have a worst case time complexity of O(n), as it is possible for every single integer in the tree to be within the given range. I need a faster algorithm, and after doing some research I found that it requires storing a size statistic in every node so that the rank of any given node can be easily computed.
I would like to do something like rank(b) - rank(a), but the problem is that a and b may not exist in the tree. In the above example, rank(7) would return 6 but rank(3) will not return any meaningful value.
Can anyone offer suggestions as to how I can address this issue? Also, I know that there is another similar question on this website, but that one involves C++ while this one involves Java. Also, I could not find a satisfactory answer there.
I've implemented a stack based tree iterator for an AVL tree some (long) time ago. It should work for your case like this:
Create an array "treestack" which holds structs for traversal info. Each struct just needs a bool "visited" and a pointer to your node type. The array can be of static size, e.g. hold 64 info elements (one for each level of your tree, so this 64 means your tree can contain at most about 4G nodes).
Change the search method of your AVL tree to put the root node at treestack[0] when you begin the search, and put all other nodes on top of the treestack as you follow the left and right child nodes during your search. Edit: Note that an unsuccessful search will leave your treestack ending at a node with the next smaller or next higher value, which is exactly what you want (just skip counting it if it's smaller; we still have a valid iterator start path).
You now have a path in your treestack which you can use for subsequent in-order traversal to find the next higher values. In-order traversal using the stack works like this:
Start at the last element in treestack, and keep a treeindex which is initially the array index of that last item.
When there is a right node and it is not marked visited: follow ONE right link from the current node, then keep going left through the following nodes for as long as possible. Wherever you stop (also if there are no left nodes but just the single right one) is the next higher value. Put all nodes at the end of your treestack as you follow them, and increment your treeindex as you follow paths. Mark the chosen final node with the next higher value as visited. Increase your node counter by 1.
Now, to find subsequent higher values, ascend in the tree by taking treeindex-1 in your treestack as the current node, and repeat the above step to find the next node with a higher value.
If there is no right child node of the current node, and the node is not marked visited: mark it as visited, and increment the node counter by 1.
You're done when you either reach the root node with treeindex 0 or the node containing the max value.
I hope it helps.
Instead of
rank(b) - rank(a)
what I would do is
rank(X) - rank(Y)
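X is the very first node having value >= b (the interval is half-open, so b itself must not be counted).
Y is the very first node having value >= a.
If no such node exists, use n + 1 as its rank, where n is the number of nodes in the tree. In the example [3, 7), X is 7 with rank 6 and Y is 4 with rank 3, which gives 6 - 3 = 3.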
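One way to make the rank idea work when a and b are absent is to count the keys strictly below a bound instead of looking up a node. The sketch below assumes a size field in every node, as the question describes. For brevity it uses a plain unbalanced insert rather than AVL rotations, and all names (`Node`, `size`, `insert`, `count_less`, `count_in_range`) are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # number of nodes in the subtree rooted here


def size(node):
    return node.size if node else 0


def insert(node, key):
    """Unbalanced insert that maintains sizes; assumes distinct keys."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    node.size = 1 + size(node.left) + size(node.right)
    return node


def count_less(node, x):
    """Number of keys strictly smaller than x; x need not be in the tree."""
    count = 0
    while node is not None:
        if x <= node.key:
            node = node.left
        else:
            # node and its whole left subtree are smaller than x
            count += size(node.left) + 1
            node = node.right
    return count


def count_in_range(root, a, b):
    """Number of keys k with a <= k < b."""
    return count_less(root, b) - count_less(root, a)


root = None
for k in [4, 2, 6, 1, 5, 7]:
    root = insert(root, k)
print(count_in_range(root, 3, 7))  # 3 (the keys 4, 5, 6)
```

Each call to `count_less` follows a single root-to-leaf path, so on a balanced tree the whole query is O(log n).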
Assume the height of the BST is h.
If we want to delete a node with two children, what would be the time complexity of the process?
I know that in a normal binary search tree, the time complexity of deletion is O(h): O(n) in the worst case and O(log n) in the best case. But since we replace the key of the node being deleted with the minimum key of its right subtree, it will take more time to find that minimum key.
So does anybody know how to explain the time complexity in this situation?
Source: Wikipedia
Deletion
There are three possible cases to consider:
Deleting a leaf (node with no children): Deleting a leaf is easy, as we can simply remove it from the tree.
Deleting a node with one child: Remove the node and replace it with its child.
Deleting a node with two children: Call the node to be deleted N. Do not delete N. Instead, choose either its in-order successor node or its in-order predecessor node, R. Copy the value of R to N, then recursively call delete on R until reaching one of the first two cases. If you choose the in-order successor: since the right subtree is not NIL (in our present case the node has 2 children), the in-order successor is the node with the least value in the right subtree. That node has at most one subtree, so deleting it falls into one of the first two cases.
Deleting a node with two children from a binary search tree. First the rightmost node in the left subtree, the inorder predecessor 6, is identified. Its value is copied into the node being deleted. The inorder predecessor can then be easily deleted because it has at most one child. The same method works symmetrically using the inorder successor labelled 9.
Consistently using the in-order successor or the in-order predecessor for every instance of the two-child case can lead to an unbalanced tree, so some implementations select one or the other at different times.
Runtime analysis:
Although this operation does not always traverse the tree down to a leaf, this is always a possibility; thus in the worst case it requires time proportional to the height of the tree. It does not require more even when the node has two children, since it still follows a single path and does not visit any node twice. Hence the pointer adjustments in all three cases need constant time.
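The three cases can be sketched as follows. This is an illustration under assumptions, not the Wikipedia code: the `Node`, `delete`, and `inorder` names are hypothetical, the in-order successor is always chosen, and for brevity the successor path is walked twice (once to find it, once to remove it), which is still O(h).

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def delete(node, key):
    """Remove key from the subtree rooted at node and return the new subtree root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        # Cases 1 and 2: zero or one child, splice the node out.
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        # Case 3: two children. Walk to the in-order successor ...
        succ = node.right
        while succ.left is not None:
            succ = succ.left
        # ... copy its key up, then delete it from the right subtree.
        node.key = succ.key
        node.right = delete(node.right, succ.key)
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = Node(7, Node(3, Node(1), Node(5)), Node(10, Node(9), Node(12)))
root = delete(root, 7)
print(root.key, inorder(root))  # 9 [1, 3, 5, 9, 10, 12]
```

Finding the successor only continues downward from the node being deleted, so the search for the node plus the search for its successor together cover at most one root-to-leaf path.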
Useful Links :
http://en.wikipedia.org/wiki/Binary_search_tree#Deletion
http://cse.iitkgp.ac.in/~pb/algo-1-pb-10.pdf
I'm faced with a problem which requires a Queue data structure supporting fast k-th largest element finding.
The requirements of this data structure are as follows:
The elements in the queue are not necessarily integers, but they must be comparable to each other, i.e. we can tell which one is greater when we compare two elements (they can be equal as well).
The data structure must support enqueue (adds the element at the tail) and dequeue (removes the element at the head).
It can quickly find the k-th largest element in the queue. Please note that k is not a constant.
You can assume that operations enqueue , dequeue and k-th largest element finding all occur with the same frequency.
My idea is to use a modified balanced binary search tree. The tree is the same as ordinary balanced binary search tree except that every nodei is augmented with another field ni, ni denotes the number of nodes contained in the subtree with root nodei. The aforementioned operations are supported as follows:
For simplicity assume that all elements are distinct.
Enqueue(x): x is first inserted into the tree, suppose the corresponding node is nodet, we append pair(x,pointer to nodet) to the queue.
Dequeue: suppose (e1, node1) is the element at the head, node1 is the pointer into the tree corresponding to e1. We delete node1 from the tree and remove (e1, node1) from the queue.
K-th largest element finding: suppose the root node is noderoot and its two children are nodeleft and noderight (suppose they all exist). We compare K with the subtree sizes, and three cases may happen (ranks here are counted from the left subtree; swap left and right to count from the other end):
if K <= nleft, we find the K-th largest element in the left subtree of noderoot;
if K > nroot-nright, we find the (K-nroot+nright)-th largest element in the right subtree of noderoot;
otherwise noderoot is the node we want.
The time complexity of each of the three operations is O(log N), where N is the number of elements currently in the queue.
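The selection step can be sketched like this. It assumes the size field described above and a hand-built, already balanced tree. The names `Node`, `size`, `kth_smallest`, and `kth_largest` are hypothetical, and insertion, deletion, and rebalancing are left out.

```python
def size(node):
    return node.size if node else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        # number of nodes in the subtree rooted here
        self.size = 1 + size(left) + size(right)


def kth_smallest(node, k):
    """Return the k-th smallest key (1-based) using subtree sizes."""
    if not 1 <= k <= size(node):
        raise IndexError("k out of range")
    while True:
        n_left = size(node.left)
        if k <= n_left:
            node = node.left          # answer lies in the left subtree
        elif k == n_left + 1:
            return node.key           # the current node is the answer
        else:
            k -= n_left + 1           # skip the left subtree and this node
            node = node.right


def kth_largest(node, k):
    """The k-th largest is the (n - k + 1)-th smallest."""
    return kth_smallest(node, size(node) - k + 1)


root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(kth_smallest(root, 3), kth_largest(root, 2))  # 3 6
```

The loop descends one level per iteration, so on a balanced tree it takes O(log N) steps.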
How can I speed up the operations mentioned above? With what data structures and how?
Note: you cannot achieve better than O(log n) for all operations; at best you need to choose which op you care about the most. (Otherwise, you could sort faster than O(n log n) by feeding the array to the DS and querying the 1st, 2nd, 3rd, ... nth elements.)
Using a skip list instead of a balanced BST as the sorted structure can reduce dequeue complexity to O(1) average case. It does not affect the complexity of any other op.
To remove from a skip list, all you need to do is get to the element using the pointer from the head of the queue, then follow the links up and remove each one. The expected number of nodes that need to be deleted is 1 + 1/2 + 1/4 + ... = 2.
Find K-th can be achieved in O(log K) by starting from the leftmost node (and not the root) and making your way up until you find you have "more sons than needed", and then treating the just-found node as the root, just like the algorithm in the question. Though it is better in asymptotic complexity, the constant factor is double.
I found an interesting paper:
Sliding-Window Top-k Queries on Uncertain Streams published in VLDB 2008 and cited by 71.
https://www.cse.ust.hk/~yike/wtopk.pdf
VLDB is one of the top conferences in the database research area, and the number of citations suggests the data structure actually works.
The paper looks pretty difficult, but if you really need to improve your data structure, I suggest reading this paper or the papers in its reference list.
You can also use a finger tree.
For example, a priority queue can be implemented by labeling the internal nodes by the minimum priority of its children in the tree, or an indexed list/array can be implemented with a labeling of nodes by the count of the leaves in their children. Finger trees can provide amortized O(1) cons, reversing, cdr, O(log n) append and split; and can be adapted to be indexed or ordered sequences.
Also note that being a purely functional structure makes this a good choice for concurrent usage.
I've been working with Binary Search Trees in my spare time, and I want to be able to delete nodes from a tree.
In order to get this to work, I need to find the maximum value. How do you go about doing that? Pseudo-code or hints would be appreciated. I'm stuck and not exactly sure how to even begin this.
A binary search tree has the following properties:
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
Both the left and right subtrees must also be binary search trees.
With that definition in mind, it should be very easy to find the max.
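Because larger keys always sit in the right subtree, the maximum is the rightmost node. A minimal sketch, assuming a simple `Node` class with `key`, `left`, and `right` fields (hypothetical names):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def find_max(root):
    """Return the node holding the largest key by following right links."""
    if root is None:
        raise ValueError("empty tree has no maximum")
    node = root
    while node.right is not None:
        node = node.right
    return node


root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14, Node(13))))
print(find_max(root).key)  # 14
```

The same loop with `left` instead of `right` finds the minimum, which is the piece deletion needs when it looks for the in-order successor in the right subtree.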
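A simple piece of pseudocode would be this. It is independent of binary search, I think: it just scans every item, so it takes O(n) and does not use the tree structure.
maxi = first item of array   // starting from 0 would fail if all items are negative
foreach (array as item)      // or any other loop
    if item > maxi then maxi = item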