Can someone show me the basic structure of a 2-3-4 balanced search tree node in Java? I am not sure how to represent a 3-key node. Should I use an array of size three, or should I have left, middle and right entry fields (where an entry is a key + value object)?
And how do the left and right children work out? In a binary search tree the left and right children belong to the node as a whole, but in a 2-3-4 tree they seem to belong to each key. So should a 2-3-4 tree node itself be an object that holds 3 binary tree nodes instead of holding 3 entries?
The "traditional OO" option of having separate classes for the 4 node variants (0 for leaf, 2, 3 or 4 children otherwise) is awkward, because references to the node would then have to be updated, leading to having to store a back reference to the parent and updating the parent every time the number of children changed.
Instead, it seems far more simple and performant to break the "state affecting behaviour rather than subclass" anti-pattern and use LinkedLists for both the values and the child nodes:
import java.util.LinkedList;

public class Node<T> {
    LinkedList<Node<T>> children = new LinkedList<>();
    LinkedList<T> values = new LinkedList<>();

    public Node(T value) {
        values.add(value); // there is always at least one value
    }

    // other methods to find, insert, delete, split, etc.
}
Using a LinkedList gives you cheap insertion and removal at either end (and with at most 3 values and 4 children per node, every list operation is effectively constant time), the ability to walk both lists in step via their iterators' next(), which is convenient when you need to iterate over values and children simultaneously, and generally a more convenient API for the balancing operations.
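To make the "walk both lists in step" point concrete, here is a hedged sketch of a method that could live inside Node: it picks the child to descend into during a search. The names childFor and cmp are made up, it assumes the node is internal and keeps children.size() == values.size() + 1, and it needs java.util.Iterator and java.util.Comparator imports.

    // Hedged sketch: pick the child to descend into for 'key'.
    // Assumes this node is internal (children is non-empty) and that
    // children.size() == values.size() + 1. The "key already in this node"
    // equality check is omitted for brevity.
    Node<T> childFor(T key, Comparator<? super T> cmp) {
        Iterator<Node<T>> childIt = children.iterator();
        for (T v : values) {
            Node<T> child = childIt.next();   // the child just to the left of v
            if (cmp.compare(key, v) < 0) {
                return child;                 // key is smaller than v, descend left of it
            }
        }
        return childIt.next();                // key is larger than every value: rightmost child
    }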
Is it possible to convert a linked list to a binary search tree (BST), and also keep a link between them so that the current element points to the same node in both the linked list and the BST?
linked list :
3 , 5 ,6 ,1, 2 ,0,4
BST :
3
/ \
1 5
/ \ / \
0 2 4 6
When the current pointer of the binary search tree points to 1, it should also point to 1 in the linked list.
Absolutely, you can use the same nodes in both data structures. The nodes would then each have the two links of their "linked-list-ness" and the two children of their "BST-ness".
So as a quick example, a node might have fields like this:
class Node {
    public Node next, prev; // for the linked list
    public Node left, right; // for the BST
    // put some data in here too
}
It takes some care to use those nodes, because it is easy to get the two data structures "out of sync", for example if you remove a node from one but forget to remove it from the other. Then again, maybe that is what you want. But other things stay the same: if you do rotations in the BST you do not have to touch the linked-list fields at all, and similarly if you swap two nodes in the linked list the BST fields are not affected. So in those cases the code is the same as it would be with a normal kind of node that is part of only one data structure.
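As a minimal sketch of the rotation point (using the Node fields above; rotateLeft is just an illustrative name): a left rotation rearranges only the BST fields, so next/prev, and therefore the list order, are never touched.

    static Node rotateLeft(Node x) {
        Node y = x.right;   // y moves up to become the subtree root
        x.right = y.left;   // y's old left subtree becomes x's right subtree
        y.left = x;         // x becomes y's left child
        return y;           // caller links y into x's former parent
        // note: x.next, x.prev, y.next, y.prev are untouched
    }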
Of course this trick does not apply to the standard library implementations.
You can absolutely have that, as harold pointed out above. And as he also points out, the two structures can drift out of sync if your implementation has bugs.
So, you could do this instead:
// YourBST.java
import java.util.List;

public class YourBST<T> {
    public List<T> toList() { /* convert this tree to a list */ }
}

// YourList.java
public class YourList<T> {
    public YourBST<T> toBST() { /* build a BST from this list */ }
}
And then, when modifying one or the other, just call the conversion method to rebuild the other data structure:
YourBST<String> bst = new YourBST<String>();
YourList<String> list = new YourList<String>();
modify(bst); //modify your BST in some way
list = bst.toList();
modify(list); //modify your List in some way
bst = list.toBST();
Edit: if you indeed want the same object to have prev/next references as well as left/right references, that still requires the node class as Harold described above. This solution simply fixes the "out of sync" problem.
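For what it's worth, toList() can be as simple as an in-order walk. Here is a hedged sketch, assuming YourBST keeps a root field and a Node class with value, left and right fields (all of these names are assumptions, not part of the answer above), plus java.util.ArrayList and java.util.List imports:

    // Inside YourBST<T>: collect the values in sorted (in-order) order.
    public List<T> toList() {
        List<T> out = new ArrayList<>();
        inOrder(root, out);
        return out;
    }

    private void inOrder(Node<T> node, List<T> out) {
        if (node == null) return;
        inOrder(node.left, out);
        out.add(node.value);
        inOrder(node.right, out);
    }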
I'm creating a binary tree maze. The tree has 8 leaves and the goal is to traverse the tree and find "food" at one or more of the leaves. At each node, the participant can either choose the left or the right node to go to next, or traverse both, but at some cost (maybe 1 time step versus 2 by choosing one or the other). If it reaches a leaf with no food, it has to backtrack and remake its decision. This is eventually going to turn into an evolutionary algorithm where the strategies are stored and evolved over multiple generations. What is the most efficient way to store the path traversed (so the participant may backtrack if food is not found)?
There are many ways to approach this. One that comes to mind is to keep a boolean flag at each node and, as a node is visited, record its index or key in an array (or on a stack) so the participant can backtrack; a sketch follows the steps below.
Example:
1. Start tree traversal
2. User picks a direction (left/right)
3. flag which node was visited
4. store node's index or key in an array
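A minimal sketch of steps 1-4, assuming hypothetical MazeNode and Participant classes (none of these names come from the question): the visited flag lives on the node, and the path is kept on a stack, so backtracking is just a pop. Requires java.util.ArrayDeque and java.util.Deque.

    class MazeNode {
        MazeNode left, right;
        boolean hasFood;
        boolean visited;    // set when the participant first reaches this node
    }

    class Participant {
        final Deque<MazeNode> path = new ArrayDeque<>();  // nodes on the current path

        void enter(MazeNode node) {
            node.visited = true;
            path.push(node);
        }

        MazeNode backtrack() {
            path.pop();          // leave the dead end
            return path.peek();  // parent node to re-decide from
        }
    }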
Suppose I have an AVL tree of distinct integers. I need to determine the number of nodes which lie in the interval [a, b) where a < b. Note that [a, b) is supplied by the user and hence I do not know beforehand what the value of a and b are. Also, a and b may not be present in the tree at all. For example, if I have a tree containing the integers {1, 2, 4, 5, 6, 7} then the user should expect an answer of 3 if he supplies the interval [3, 7).
A naive implementation would be to traverse every node and increment the count by 1 every time a node is found in the given interval. But this would have a worst case time complexity of O(n), as it is possible for every single integer in the tree to be within the given range. I need a faster algorithm, and after doing some research I found that it requires storing a size statistic in every node so that the rank of any given node can be easily computed.
I would like to do something like rank(b) - rank(a), but the problem is that a and b may not exist in the tree. In the above example, rank(7) would return 6 but rank(3) will not return any meaningful value.
Can anyone offer suggestions as to how I can address this issue? I know there is a similar question on this website, but that one involves C++ while this one involves Java, and I could not find a satisfactory answer there anyway.
I implemented a stack-based tree iterator for an AVL tree some (long) time ago. It should work for your case like this:
Create an array "treestack" which holds structs for traversal info. The struct just needs a bool "visited" and a reference to your node type. The array can be of static size, e.g. 64 entries (one for each level of your tree, so 64 levels is more than enough for a tree containing 4G nodes).
Change the search method of your AVL tree to put the root node at treestack[0] when you begin the search, and to push every other node onto the treestack as you follow the left and right child links during the search. Edit: Note that an unsuccessful search will leave your treestack topped by the node with the next smaller or next higher value, which is exactly what you want (just skip counting if it is smaller; we still have a valid iterator start path).
You now have a path in your treestack which you can use for a subsequent in-order traversal to find the next higher values. In-order traversal using the stack works like this:
Start at the last element in the treestack, and keep a treeindex which is initially the array index of that last item.
When the current node has a right child and it is not marked visited: follow ONE step to the right, then keep following left children as far as possible. Wherever you stop (also if there are no left children but only that single right one) is the node with the next higher value. Push every node onto the treestack as you follow the path, incrementing your treeindex as you go. Mark the final node (the one with the next higher value) as visited and increase your node counter by 1.
To find subsequent higher values, ascend in the tree by taking treeindex - 1 in your treestack as the current node, and repeat the step above to find the next node with a higher value.
If the current node has no right child and is not marked visited: mark it as visited and increase the node counter by 1.
You are done when you reach the root node with treeindex 0, the node containing the maximum value, or (for your interval query) a node whose value is >= b.
I hope it helps.
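For reference, here is a compact Java sketch of the same stack-based idea. It uses java.util.ArrayDeque instead of a fixed array, and the Node/key names are illustrative, not from the original iterator: descend while pushing every node whose key is >= a, then pop in in-order, pushing the left spine of each right subtree, and stop once a key reaches b. This runs in O(h + k), where k is the number of keys counted.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Counts the keys in [a, b) of a BST/AVL tree whose nodes have an int key
    // and left/right children (assumed field names).
    static int countInRange(Node root, int a, int b) {
        Deque<Node> stack = new ArrayDeque<>();
        // Descend as in a search for 'a', stacking every node whose key is >= a;
        // the top of the stack ends up being the smallest key >= a.
        for (Node cur = root; cur != null; ) {
            if (cur.key >= a) { stack.push(cur); cur = cur.left; }
            else              { cur = cur.right; }
        }
        int count = 0;
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            if (node.key >= b) break;          // keys only grow from here on
            count++;
            // next in-order nodes: the left spine of node's right subtree
            for (Node r = node.right; r != null; r = r.left) {
                stack.push(r);
            }
        }
        return count;
    }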
Instead of
rank(b) - rank(a)
what I would do is
rank(X) - rank(Y)
where X is the very first node having value >= b (>= rather than >, because the interval is half-open) and Y is the very first node having value >= a.
If X (or Y) does not exist, treat its rank as n + 1, where n is the number of nodes. With a subtree-size field in every node, both ranks can be computed in O(log n) even when a and b are not present in the tree, and the difference is exactly the number of keys in [a, b).
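A minimal sketch of that idea, assuming each AVL node carries a size field kept up to date on insert, delete and rotation (the field and method names here are illustrative): countLess(root, x) returns the number of keys strictly below x in O(h), so the answer for [a, b) is simply countLess(root, b) - countLess(root, a).

    class Node {
        int key;
        int size = 1;        // number of nodes in this subtree
        Node left, right;
    }

    static int size(Node n) {
        return n == null ? 0 : n.size;
    }

    // Number of keys strictly less than x; works whether or not x is in the tree.
    static int countLess(Node n, int x) {
        if (n == null) return 0;
        if (n.key < x) {
            // the whole left subtree and this node are < x; recurse right
            return size(n.left) + 1 + countLess(n.right, x);
        }
        return countLess(n.left, x);  // only part of the left subtree can be < x
    }

For the example tree {1, 2, 4, 5, 6, 7} and the interval [3, 7), countLess(root, 7) = 5 and countLess(root, 3) = 2, giving the expected answer 3.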
This falls under "a software algorithm" from https://stackoverflow.com/help/on-topic
This is from an interview question http://www.glassdoor.com/Interview/Yelp-Software-Engineering-Intern-Interview-Questions-EI_IE43314.0,4_KO5,32_IP2.htm,
particularly "performance of binary tree if implemented thru array or linkedlist"
How would you go about implementing a binary tree via an array or a linked list?
The way I was taught to do it was with a linked node type of structure that has two references, left and right, like this (from https://courses.cs.washington.edu/courses/cse143/12wi/lectures/02-22/programs/IntTreeNode.java):
public class IntTreeNode {
    public int data;
    public IntTreeNode left;
    public IntTreeNode right;

    public IntTreeNode(int data) {
        this(data, null, null);
    }

    public IntTreeNode(int data, IntTreeNode left, IntTreeNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}
And then in the actual binary tree
public class IntTree {
    IntTreeNode overallRoot;

    public IntTree() {
        overallRoot = null;
    }
    ....
}
How would you go about this if you were just using an array or a linked list (one pointer)?
But anyway, this is supposed to be a quick-fire question. Even if you didn't implement the tree, which you aren't supposed to, how would you analyze its performance? Doesn't the performance depend on the state of the tree, for example whether it is a BST? For a BST, find would be O(log n) because you're cutting off half the tree each time.
How would you analyze performance based on these two implementations right away?
I'm not sure if I understood correctly, but this is what I thought of.
Basically, you can store the nodes in the tree as elements of an array/list.
For arrays, think of something like this:
public class Node {
    public int data;
    public int left;
    public int right;
    ...
}
Your tree would be an array of Nodes (Node[] tree), such that the root would be the first element tree[0].
Every element refers to its left and right children as indices in the array.
For example, tree[ tree[0].left ] would be the left child of the root.
A left value of -1 could indicate that the node does not have a left child; similarly for right.
For example, consider the following tree:
5
/ \
2 8
\ / \
3 6 9
Suppose you have initially allocated 10 elements in your array.
Since you have fewer than 10 nodes in the tree, some of them will be null.
Here is what it could look like:
(I am representing each Node as a (data,left,right) tuple)
{ (5,1,2) , (2,-1,4) , (8,5,3) , (9,-1,-1) , (3,-1,-1) , (6,-1,-1) , null , null , null , null }
Thus for the node (8,5,3), you can tell that its left child is the sixth element (node (6,-1,-1)) and its right child is the fourth element (node (9,-1,-1)).
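To show how those indices are followed in practice, here is a small hedged sketch (the method name is made up) of an in-order walk over that array representation:

    static void inOrder(Node[] tree, int i) {
        if (i == -1 || tree[i] == null) return;  // -1 marks a missing child
        inOrder(tree, tree[i].left);
        System.out.println(tree[i].data);
        inOrder(tree, tree[i].right);
    }
    // inOrder(tree, 0) on the example array prints 2 3 5 6 8 9 (sorted, since it is a BST).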
The performance of the insertion/deletion functions could vary depending on your precise implementation.
A similar idea can hold for linked lists (but remember that they do not have random access: finding the i-th element requires traversing the list, element by element).
Hope this helps.
When analyzing algorithms like this, you want to look at what type of binary tree it is (balanced vs. unbalanced), plus the three factors regarding space/time complexity:
Insertion
Deletion
Search
Comparing linked list vs. array implementations of binary trees, we see the following:
Linked list insertions and deletions are much less expensive than in arrays (think of the array element shifts you have to do to carry out those two operations).
Linked lists offer flexible size, while arrays do not; you have to handle array expansion when the data does not fit within the initial array size.
Arrays offer random access, while linked lists do not; e.g. when dealing with an array implementation of a full or complete binary tree, we can easily compute the indices of any node in the tree (see the index sketch below).
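To illustrate that last bullet, a tiny sketch of the usual index arithmetic for a complete binary tree stored in a flat array with the root at index 0 (this convention is an assumption, not something from the interview question):

    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }
    static int parent(int i)     { return (i - 1) / 2; }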
That said, for binary search trees specifically, a linked (node-based) implementation is usually the better fit, because access follows the BST ordering rule (each node's value is greater than everything in its left subtree and less than everything in its right subtree). With that structure, insertion, deletion and search have average complexity O(log n), provided the tree is balanced. If the binary search tree is not balanced, the complexity degrades to O(n) for all operations - the worst case scenario.
Assume the height of the BST is h.
If we want to delete a node with two children, then what would be the time complexity of the process.
I know that in a normal binary search tree the time complexity for deletion is O(h): O(n) in the worst case and O(log n) in the best case. But since we are replacing the key of the node being deleted with the minimum node of its right subtree, it will take additional time to find that minimum key.
So does anybody know how to explain the time complexity in this situation?
Source: Wikipedia
Deletion
There are three possible cases to consider:
Deleting a leaf (node with no children): Deleting a leaf is easy, as we can simply remove it from the tree.
Deleting a node with one child: Remove the node and replace it with its child.
Deleting a node with two children: Call the node to be deleted N. Do not delete N. Instead, choose either its in-order successor or its in-order predecessor, R. Copy the value of R into N, then recursively call delete on R until one of the first two cases applies. If you choose the in-order successor, then since the right subtree is not NIL (our node has two children), the successor is the node with the least value in that right subtree; it has at most one subtree of its own, so deleting it falls into one of the first two cases.
(Describing the accompanying Wikipedia figure:) to delete a node with two children from a binary search tree, first the rightmost node in the left subtree, the in-order predecessor (6 in the figure), is identified. Its value is copied into the node being deleted. The in-order predecessor can then be easily deleted because it has at most one child. The same method works symmetrically using the in-order successor (labelled 9).
Consistently using the in-order successor or the in-order predecessor for every instance of the two-child case can lead to an unbalanced tree, so some implementations select one or the other at different times.
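A minimal sketch of those three cases for a plain (unbalanced) binary search tree with int keys; the method and field names are illustrative, and the AVL rebalancing that would follow each structural change is omitted:

    static Node delete(Node root, int key) {
        if (root == null) return null;
        if (key < root.key) {
            root.left = delete(root.left, key);
        } else if (key > root.key) {
            root.right = delete(root.right, key);
        } else if (root.left == null) {
            return root.right;              // leaf or only a right child
        } else if (root.right == null) {
            return root.left;               // only a left child
        } else {
            // two children: copy the in-order successor's key, then delete that successor
            Node succ = root.right;
            while (succ.left != null) succ = succ.left;
            root.key = succ.key;
            root.right = delete(root.right, succ.key);
        }
        return root;
    }

Note that the two-children case still follows a single root-to-leaf path: finding the successor and deleting it both continue down the same branch, which is why the overall cost stays O(h).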
Runtime analysis:
Although this operation does not always traverse the tree down to a leaf, this is always a possibility; thus in the worst case it requires time proportional to the height of the tree. It does not require more even when the node has two children, since it still follows a single path and does not visit any node twice. Hence the pointer adjustments in all three cases need constant time.
Useful Links :
http://en.wikipedia.org/wiki/Binary_search_tree#Deletion
http://cse.iitkgp.ac.in/~pb/algo-1-pb-10.pdf