I am trying to create a tree from a source that provides: the 2 nodes to be added to the tree, and the node which these 2 news nodes should be added. To find where this node is in the tree, I used a inorder traversal which takes O(n). So if there was n number of nodes to be added in the tree, will the creation of the whole tree be O(n^2). My constraint is that it should only take O(n) to create the tree.
Looking up a node in a binary tree is O(log(n)) because the tree has log(n) levels (each level holds twice as much as the level above it). Therefore to create/insert n elements into a binary tree it's O(nlog(n)).
You could keep references to each node of the tree in a HashMap [1], to get O(1) access to each node instead of the O(log(n)) which is typical of trees. That would make it possible to build the tree in O(n) time, because that HashMap lets you jump directly to a node instead of traversing there from the tree's root node.
[1] The key would be whatever the source uses for uniquely identifying the nodes (I'm assuming it to be an integer or string). The value would be a reference to the node in the tree. Note that the tree implementation must make all its nodes public, so you will probably need to write the tree yourself or find a suitable library (JDK's trees such as TreeMap keep their internal structure private, so they won't cut it).
for Binary search tree time complexity will be O(nlogn) when the elements are not sorted and sorted it takes O(n^2). It is because to to insert one element in a sorted list in a BST O(n) time is taken so for n elements O(n^2) and for a balanced or almost balanced binary search tree max time for insertion is logn so for n elements it is nlogn
Related
How can I write an algorithm that finds all multiple duplicate values in Binary Search Tree when you are allowed to add duplicate values either on the left subtree or right subtree after applying balanced tree algorithm to an unbalanced tree?
Normally in a tree search, you stop when you have found the desired element. In this case, you keep the recursion going when the values match. Instead of returning the node found, you return a count that is accumulated in the recursive calls.
What is the time complexity of inserting a node in a sorted linked list in Java? Is there an algorithm with complexity less than O(n)?
If all you have is a linked links and you're starting from the head, in the worst case you have to iterate over the entire list to find the insertion point. This gives O(n) worst-case time.
Something like a skiplist could give O(log n) insertion. However, that's a different data structure to what you're asking about (and so are trees etc).
My tree is represented by its edges and the root node. The edge list is undirected.
char[][] edges =new char[][]{
new char[]{'D','B'},
new char[]{'A','C'},
new char[]{'B','A'}
};
char root='A';
The tree is
A
B C
D
How do I do depth first traversal on this tree? What is the time complexity?
I know time complexity of depth first traversal on linked nodes is O(n). But if the tree is represented by edges, I feel the time complexity is O(n^2). Am I wrong?
Giving code is appreciated, although I know it looks like homework assignment..
The general template behind DFS looks something like this:
function DFS(node) {
if (!node.visited) {
node.visited = true;
for (each edge {node, v}) {
DFS(v);
}
}
}
If you have your edges represented as a list of all the edges in the graph, then you could implement the for loop by iterating across all the edges in the graph and, every time you find one with the current node as its source, following the edge to its endpoint and running a DFS from there. If you do this, then you'll do O(m) work per node in the graph (here, m is the number of edges), so the runtime will be O(mn), since you'll do this at most once per node in the graph. In a tree, the number of edges is always O(n), so for a tree the runtime is O(n2).
That said, if you have a tree and there are only n edges, you can speed this up in a bunch of ways. First, you could consider doing an O(n log n) preprocessing step to sort the array of edges. Then, you can find all the edges leaving a given node by doing a binary search to find the first edge leaving the node, then iterating across the edges starting there to find just the edges leaving the node. This improves the runtime quite a bit: you do O(log n) work per node for the binary search, and then every edge gets visited only once. This means that the runtime is O(n log n). Since you've mentioned that the edges are undirected, you'll actually need to create two different copies of the edges array - one that's the original one, and one with the edges reversed - and should sort each one independently. The fact that DFS marks visited nodes along the way means that you don't need to do any extra bookkeeping here to figure out which direction you should go at each step, and this doesn't change the overall time complexity, though it does increase the space usage.
Alternatively, you could use a hashing-based solution. Before doing the DFS, iterate across the edges and convert them into a hash table whose keys are the nodes and whose values are lists of the edges leaving that node. This will take expected time O(n). You can then implement the "for each edge" step quite efficiently by just doing a hash table lookup to find the edges in question. This reduces the time to (expected) O(n), though the space usage goes up to O(n) as well. Since your edges are undirected, as you populate the table, just be sure to insert the edge in each direction.
Assume the height of the BST is h.
If we want to delete a node with two children, then what would be the time complexity of the process.
I know that in a normal binary tree, the time complexity for deletion is O(h); O(n) worst case and O(logn) best case. But since we are replacing the key of the deleting node by the minimum node of right sub tree of it, it will take more time to find the minimum key.
So does anybody know how to explain the time complexity in this situation?
Source Wikipedia :
Deletion
There are three possible cases to consider:
Deleting a leaf (node with no children): Deleting a leaf is easy, as we can simply remove it from the tree.
Deleting a node with one child: Remove the node and replace it with its child.
Deleting a node with two children: Call the node to be deleted N. Do not delete N. Instead, choose either its in-order successor node or its in-order predecessor node, R. Copy the value of R to N, then recursively call delete on R until reaching one of the first two cases. If you choose in-order successor of a node, as right sub tree is not NIL( Our present case is node has 2 children), then its in-order successor is node with least value in its right sub tree, which will have at a maximum of 1 sub tree, so deleting it would fall in one of first 2 cases.
Deleting a node with two children from a binary search tree. First the rightmost node in the left subtree, the inorder predecessor 6, is identified. Its value is copied into the node being deleted. The inorder predecessor can then be easily deleted because it has at most one child. The same method works symmetrically using the inorder successor labelled 9.
Consistently using the in-order successor or the in-order predecessor for every instance of the two-child case can lead to an unbalanced tree, so some implementations select one or the other at different times.
Runtime analysis:
Although this operation does not always traverse the tree down to a leaf, this is always a possibility; thus in the worst case it requires time proportional to the height of the tree. It does not require more even when the node has two children, since it still follows a single path and does not visit any node twice. Hence the pointer adjustments in all three cases need constant time.
Useful Links :
http://en.wikipedia.org/wiki/Binary_search_tree#Deletion
http://cse.iitkgp.ac.in/~pb/algo-1-pb-10.pdf
I'm faced with a problem which requires a Queue data structure supporting fast k-th largest element finding.
The requirements of this data structure are as follows:
The elements in the queue are not necessarily integers, but they must be comparable to each other, i.e we can tell which one is greater when we compare two elements(they can be equal as well).
The data structure must support enqueue(adds the element at the tail) and dequeue(removes the element at the head).
It can quickly find the k-th largest element in the queue, pls note k is not a constant.
You can assume that operations enqueue , dequeue and k-th largest element finding all occur with the same frequency.
My idea is to use a modified balanced binary search tree. The tree is the same as ordinary balanced binary search tree except that every nodei is augmented with another field ni, ni denotes the number of nodes contained in the subtree with root nodei. The aforementioned operations are supported as follows:
For simplicity assume that all elements are distinct.
Enqueue(x): x is first inserted into the tree, suppose the corresponding node is nodet, we append pair(x,pointer to nodet) to the queue.
Dequeue: suppose (e1, node1) is the element at the head, node1 is the pointer into the tree corresponding to e1. We delete node1 from the tree and remove (e1, node1) from the queue.
K-th largest element finding: suppose root node is noderoot, its two children are nodeleft and noderight(suppose they all exist), we compare K with nroot , three cases may happen:
if K< nleft we find the K-th largest element in the left subtree of nroot;
if K>nroot-nright we find the (K-nroot+nright)-th largest element in the right subtree of nroot;
otherwise nroot is the node we want.
The time complexity of all the three operations are O(logN) , where N is the number of elements currently in the queue.
How can I speed up the operations mentioned above? With what data structures and how?
Note - you cannot achieve better then O(logn) for all, at best you need to "chose" which op you care for the most. (Otherwise, you could sort in O(n) by feeding the array to the DS, and querying 1st, 2nd, 3rd, ... nth elements)
Using a skip list instead of a Balanced BST as the sorted structure
can reduce dequeue complexity to O(1) average case. It does
not affect complexity of any other op.
To remove from a skip list - all you need to do is to get to the element using the pointer from the head of the queue, and follow the links up and remove each. The expected number of nodes needed to be deleted is 1 + 1/2 + 1/4 + ... = 2.
find Kth can be achieved in O(logK) by starting from the leftest node (and not the root) and making your way up until you find you have "more sons then needed", and then treat the just found node as the root just like the algorithm in the question. Though it is better in asymptotic complexity - the constant factor is double.
I found an interesting paper:
Sliding-Window Top-k Queries on Uncertain Streams published in VLDB 2008 and cited by 71.
https://www.cse.ust.hk/~yike/wtopk.pdf
VLDB is the best conference in database research area, and the number of citations proves the data structure actually works.
The paper looks pretty difficult, but if you really need improve your data structure, I suggest you to read this paper or papers in the reference page of this paper.
You can also use a finger tree.
For example, a priority queue can be implemented by labeling the internal nodes by the minimum priority of its children in the tree, or an indexed list/array can be implemented with a labeling of nodes by the count of the leaves in their children. Finger trees can provide amortized O(1) cons, reversing, cdr, O(log n) append and split; and can be adapted to be indexed or ordered sequences.
Also note that being a purely functional structure makes this a good choice for concurrent usage.