How to most efficiently find multiple duplicate values in Binary Search Tree

How to most efficiently find multiple duplicate values in Binary Search Tree - java

How can I write an algorithm that finds all multiple duplicate values in Binary Search Tree when you are allowed to add duplicate values either on the left subtree or right subtree after applying balanced tree algorithm to an unbalanced tree?

Normally in a tree search, you stop when you have found the desired element. In this case, you keep the recursion going when the values match. Instead of returning the node found, you return a count that is accumulated in the recursive calls.

Related

Finding Index of a Node in a Huffman Tree

Assume we have a tree of nodes (Huffman-tree) that hold string values in them. How would I go about traversing the tree and spitting out an index of a specific node if I had a tree like this? The numbers I drew inside the circles would be the index I want (especially 12 or 13).
Note: Due to miscommunication, I will repeat: the #'s I wrote inside the circles are not the values that the nodes hold. They are the index of that node. My problem was that I couldn't find the index, since the trees are structured weirdly - not a classic BST tree, and the values inside aren't numerical.
Edit: I redrew the image to make my question more clear.
Either way, I figured it out. I'll write up the answer after my finals.

The tree you are showing is not a binary search tree. The central property of a binary search tree that allows it to be searched efficiently is that the left descendants of a node are smaller, and the right descendants bigger than the node itself (in terms of the index value).
If you have a proper binary search tree, you can find a node with given index by comparing with nodes and following the corresponding branch, starting with the root.

Binary Tree traversal maze

I'm creating a binary tree maze. The tree has 8 leaves and the goal is to traverse the tree and find "food" at one or more of the leaves. At each node, the participant can either chose the left or right node to go to next. Or, it can traverse both, but at some cost (maybe 1 time step versus 2 by choosing on or the other). If it reaches a leaf with no food, it has to backtrack and remake its decision. This is eventually going to turn into an evolutionary algorithm where the strategies are stored and evolved over multiple generations. What is the most efficient way to store the path traversed (so the participant may backtrack if food is not found)?

There are many ways to approach this. One thing that comes to mind is to have a boolean flag at each node, and if the node is visited return and store the node's index or key value in this array.
Example:
1. Start tree traversal
2. User picks direction(right/left)
3. flag which node was visited
4. store node's index or key in an array

Queue data structure supporting fast k-th largest element finding

I'm faced with a problem which requires a Queue data structure supporting fast k-th largest element finding.
The requirements of this data structure are as follows:
The elements in the queue are not necessarily integers, but they must be comparable to each other, i.e we can tell which one is greater when we compare two elements(they can be equal as well).
The data structure must support enqueue(adds the element at the tail) and dequeue(removes the element at the head).
It can quickly find the k-th largest element in the queue, pls note k is not a constant.
You can assume that operations enqueue , dequeue and k-th largest element finding all occur with the same frequency.
My idea is to use a modified balanced binary search tree. The tree is the same as ordinary balanced binary search tree except that every nodei is augmented with another field ni, ni denotes the number of nodes contained in the subtree with root nodei. The aforementioned operations are supported as follows:
For simplicity assume that all elements are distinct.
Enqueue(x): x is first inserted into the tree, suppose the corresponding node is nodet, we append pair(x,pointer to nodet) to the queue.
Dequeue: suppose (e1, node1) is the element at the head, node1 is the pointer into the tree corresponding to e1. We delete node1 from the tree and remove (e1, node1) from the queue.
K-th largest element finding: suppose root node is noderoot, its two children are nodeleft and noderight(suppose they all exist), we compare K with nroot , three cases may happen:
if K< nleft we find the K-th largest element in the left subtree of nroot;
if K>nroot-nright we find the (K-nroot+nright)-th largest element in the right subtree of nroot;
otherwise nroot is the node we want.
The time complexity of all the three operations are O(logN) , where N is the number of elements currently in the queue.
How can I speed up the operations mentioned above? With what data structures and how?

Note - you cannot achieve better then O(logn) for all, at best you need to "chose" which op you care for the most. (Otherwise, you could sort in O(n) by feeding the array to the DS, and querying 1st, 2nd, 3rd, ... nth elements)
Using a skip list instead of a Balanced BST as the sorted structure
can reduce dequeue complexity to O(1) average case. It does
not affect complexity of any other op.
To remove from a skip list - all you need to do is to get to the element using the pointer from the head of the queue, and follow the links up and remove each. The expected number of nodes needed to be deleted is 1 + 1/2 + 1/4 + ... = 2.
find Kth can be achieved in O(logK) by starting from the leftest node (and not the root) and making your way up until you find you have "more sons then needed", and then treat the just found node as the root just like the algorithm in the question. Though it is better in asymptotic complexity - the constant factor is double.

I found an interesting paper:
Sliding-Window Top-k Queries on Uncertain Streams published in VLDB 2008 and cited by 71.
https://www.cse.ust.hk/~yike/wtopk.pdf
VLDB is the best conference in database research area, and the number of citations proves the data structure actually works.
The paper looks pretty difficult, but if you really need improve your data structure, I suggest you to read this paper or papers in the reference page of this paper.

You can also use a finger tree.
For example, a priority queue can be implemented by labeling the internal nodes by the minimum priority of its children in the tree, or an indexed list/array can be implemented with a labeling of nodes by the count of the leaves in their children. Finger trees can provide amortized O(1) cons, reversing, cdr, O(log n) append and split; and can be adapted to be indexed or ordered sequences.
Also note that being a purely functional structure makes this a good choice for concurrent usage.

Binary Search Trees, how do you find maximum?

I've been working with Binary Search Trees in my spare time, and I want to be able to delete nodes from a tree.
In order to get this to work, I need to find the maximum value. How do you go about doing that? Pseudo-code or hints would be appreciated. I'm stuck and not exactly sure how to even begin this.

A binary search tree has the following properties:
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
Both the left and right subtrees must also be binary search trees.
With that definition in mind, it should be very easy to find the max.

A simple pseudocode would be this. It it is indepandant to binary search I think.
int maxi = 0
foreach(array as item) // or any other loop
if item>maxi then maxi = item

Time Complexity of Creating a Binary Tree

I am trying to create a tree from a source that provides: the 2 nodes to be added to the tree, and the node which these 2 news nodes should be added. To find where this node is in the tree, I used a inorder traversal which takes O(n). So if there was n number of nodes to be added in the tree, will the creation of the whole tree be O(n^2). My constraint is that it should only take O(n) to create the tree.

Looking up a node in a binary tree is O(log(n)) because the tree has log(n) levels (each level holds twice as much as the level above it). Therefore to create/insert n elements into a binary tree it's O(nlog(n)).

You could keep references to each node of the tree in a HashMap [1], to get O(1) access to each node instead of the O(log(n)) which is typical of trees. That would make it possible to build the tree in O(n) time, because that HashMap lets you jump directly to a node instead of traversing there from the tree's root node.
[1] The key would be whatever the source uses for uniquely identifying the nodes (I'm assuming it to be an integer or string). The value would be a reference to the node in the tree. Note that the tree implementation must make all its nodes public, so you will probably need to write the tree yourself or find a suitable library (JDK's trees such as TreeMap keep their internal structure private, so they won't cut it).

for Binary search tree time complexity will be O(nlogn) when the elements are not sorted and sorted it takes O(n^2). It is because to to insert one element in a sorted list in a BST O(n) time is taken so for n elements O(n^2) and for a balanced or almost balanced binary search tree max time for insertion is logn so for n elements it is nlogn

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.