I'm looking to return the nth largest data value in the subtree rooted at this node in a BST that can have duplicate values.
Right now I have this, but it doesn't seem to work:
static int get(BSTNode node, int n) {
if ( node == null || n < 0)
return 0;
if ( n == 0)
return node.data;
get(node.right, n--);
get(node.left, n--);
return 0;
}
This is my Node class
private int data; // key
private int dataCount; // no. of keys with equal value stored at this node
private int treeSize; // no. of keys stored in the subtree rooted at this node
private BSTNode leftChild;
private BSTNode rightChild;
I have implemented an iterative solution.
Idea
The basic idea is to always go as far to the right as possible, since the largest values are there, for as long as the nodes have not yet been counted against n, and to save all nodes along the path on a stack.
If we can't go to the right anymore and we have not yet counted this node against n, we have found the largest node in the tree that we have not accounted for. In that case we account for that value by getting the node from the stack, decrementing n and adding the node to the set of nodes we have already accounted for. To find the next largest value we check whether the current node has a left subtree; if it has, we continue there. If it does not, we go back to the parent node, which we do by getting the next value from the stack. This parent node is then automatically the next largest value.
Here is an illustration of the two scenarios that are possible at the largest value that has not been accounted for.
We're at the largest node that has not been accounted for and it has no left node, so we need to go back to the parent (using the stack) which will be the next largest node.
     Parent
     /    \
    /      \
   x     largest
Alternatively, there is a left subtree, which we need to examine first.
     Parent
     /    \
    /      \
   x     largest
           /
          /
    left subtree
       /   \
      /     \
Implementation
/**
* Finds the n-th largest key in a given subtree.
* @param root root node to start from
* @param n n-th largest key to get; n <= 1 results in the largest key being returned
* @return the node holding the n-th largest key (the largest if n <= 1), or null if there is no such node
*/
public Node findNthLargestKey(Node root, int n){
if(root == null) return null;
if(n < 1) n = 1;
// early out: if you know how many nodes you have in the tree return if n is bigger than that
// based on number of nodes compared to n some assumptions could be made to speed up algorithm
// e.g. n == number of nodes -> get to left-most node and return it
// if n is close to number of nodes also probably start in left branch instead of all the way on the right side at the biggest node
var stack = new Stack<Node>();
// remember nodes that have been visited and have been counted against n
var done = new HashSet<Integer>();
// start at root node
stack.add(root);
// continue as long as the stack is not empty; if it is empty, n was too big and the n-th largest value could not be found
while (!stack.empty()){
// pop next value from the stack (will be root in first iteration)
var current = stack.pop();
// always try to go as far to the right as possible to get the biggest value that has not yet been counted against n
while (current != null && !done.contains(current.getKey())){
stack.add(current);
current = current.getRight();
}
// if we are here we've found the biggest value that has not yet been counted against n
var prev = stack.pop();
// if we have found the nth biggest value return
if(--n == 0){
return prev;
}
// otherwise mark this node as done and counted against n
done.add(prev.getKey());
// if node has a left successor, visit it first as this node has no right successors that have not been counted against n already
if(prev.getLeft() != null) stack.add(prev.getLeft());
}
// n-th largest value was not found (n was too big)
return null;
}
My Node looks like this, with getters and setters defined of course. The implementation will also work for your node, as the number of keys with the same value is irrelevant for finding the n-th largest node. And even if it were relevant, the same algorithm would work: you would decrement n by the number of keys with the same value, and the return condition would need to be adjusted to n <= 0.
public class Node {
private int key;
private Node right;
private Node left;
private Object anyData;
public Node(int key) {
this(key, null);
}
public Node(int key, Object anyData) {
this.key = key;
this.anyData = anyData;
this.left = null;
this.right = null;
}
}
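The adjustment for duplicate counts described above could be sketched roughly as follows. This is only an illustration under assumptions: `CNode`, its `count` field (standing in for `dataCount` from the question), `insert` and `findNthLargest` are hypothetical names, and the traversal is a plain reverse in-order walk with an explicit stack rather than the `done` set used in `findNthLargestKey`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CountedNthLargest {
    // Hypothetical node type: 'count' is the number of equal keys stored at this node.
    static final class CNode {
        final int key;
        int count = 1;
        CNode left, right;
        CNode(int key) { this.key = key; }
    }

    // Plain BST insert that bumps 'count' instead of creating a node for a repeated key.
    static CNode insert(CNode root, int key) {
        if (root == null) return new CNode(key);
        if (key == root.key) root.count++;
        else if (key < root.key) root.left = insert(root.left, key);
        else root.right = insert(root.right, key);
        return root;
    }

    // Returns the node holding the n-th largest key (1-based, duplicates counted),
    // or null if n is larger than the number of stored keys.
    static CNode findNthLargest(CNode root, int n) {
        if (n < 1) n = 1; // same convention as findNthLargestKey
        Deque<CNode> stack = new ArrayDeque<>();
        CNode current = root;
        while (current != null || !stack.isEmpty()) {
            // go as far to the right as possible
            while (current != null) {
                stack.push(current);
                current = current.right;
            }
            CNode node = stack.pop();
            n -= node.count;          // decrement by the multiplicity, not by 1
            if (n <= 0) return node;  // adjusted return condition
            current = node.left;      // continue in the left subtree, if any
        }
        return null;
    }
}
```

With the 20 test values used in this answer, where 148 occurs twice, both the 5th and the 6th largest key would be found at the node holding 148.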
Test
I've tested my implementation against random trees and the results have always been correct. The Test class below only starts from the root node, so that every node in the tree takes part in the test. I have additionally run some tests for smaller subtrees and with n greater than the number of nodes in the tree, which must always return null (not found).
public class Test {
public static void main(String[] args){
// values to insert into the tree
int[] curVals = fillArrayRand(20, 1, 200);
// Create tree
BinarySearchTree tree = new BinarySearchTree();
System.out.println("Tree has been created.");
// fill tree
for (var cur: curVals) {
tree.insertIter(new Node(cur));
}
// print values in order of insertion, first value will be the root value
System.out.println("Tree was filled with the following values: %s".formatted(Arrays.toString(curVals)));
// print tree using inorder traversal
tree.printRec(Traversal.INORDER);
var sorted = Arrays.stream(curVals).sorted().distinct().toArray();
// always start at root node; which is the first node that gets inserted
// find() returns a given node by key
var startNode = tree.find(curVals[0]);
// now loop over sorted values (sorted in ascending order -> the (sorted.length - i)-th largest is at index i of the sorted array)
for (int i = 0; i < sorted.length; i++) {
var result = tree.findNthLargestKey(startNode, sorted.length - i);
// if key in i-th position of sorted array is the same as the key of result => success
// if result is null, the node was not found (should not happen here as sorted.length - i is never > sorted.length)
System.out.printf("#%d largest value:\t%d (expected)\t-\t%s (result)\t", sorted.length - i, sorted[i], result == null ? "not found": result.getKey());
if (result != null && sorted[i] == result.getKey()) {
System.out.println("SUCCESS");
} else System.out.println("FAILED");
}
}
public static int[] fillArrayRand(int size, int randStart, int randEnd){
int[] randArray = new int[size];
for(int i = 0; i < size; i++){
randArray[i] = (int)( (randEnd - randStart) * Math.random() + randStart);
}
return randArray;
}
}
Expected output
Tree has been created.
Tree was filled with the following values: [148, 65, 18, 168, 8, 148, 194, 186, 114, 22, 102, 51, 123, 169, 68, 118, 37, 18, 26, 18]
((((n,8,n),18,(n,22,(((n,26,n),37,n),51,n))),65,(((n,68,n),102,n),114,((n,118,n),123,n))),148,(n,168,(((n,169,n),186,n),194,n)))
#17 largest value: 8 (expected) - 8 (result) SUCCESS
#16 largest value: 18 (expected) - 18 (result) SUCCESS
#15 largest value: 22 (expected) - 22 (result) SUCCESS
#14 largest value: 26 (expected) - 26 (result) SUCCESS
#13 largest value: 37 (expected) - 37 (result) SUCCESS
#12 largest value: 51 (expected) - 51 (result) SUCCESS
#11 largest value: 65 (expected) - 65 (result) SUCCESS
#10 largest value: 68 (expected) - 68 (result) SUCCESS
#9 largest value: 102 (expected) - 102 (result) SUCCESS
#8 largest value: 114 (expected) - 114 (result) SUCCESS
#7 largest value: 118 (expected) - 118 (result) SUCCESS
#6 largest value: 123 (expected) - 123 (result) SUCCESS
#5 largest value: 148 (expected) - 148 (result) SUCCESS
#4 largest value: 168 (expected) - 168 (result) SUCCESS
#3 largest value: 169 (expected) - 169 (result) SUCCESS
#2 largest value: 186 (expected) - 186 (result) SUCCESS
#1 largest value: 194 (expected) - 194 (result) SUCCESS
Note: the line with all the parentheses is the output of the inorder traversal, printed as (left node, parent, right node), where n means null, i.e. no node. The first value that gets inserted is the root node, so it is best to start reading the output from there.
Correctness
It should be possible to prove, using a loop invariant and induction, that the algorithm is correct and produces the expected result for every valid binary search tree and input. The loop invariant (informally) would be: after iteration i of the outer while loop, we have found the i-th largest node in the tree, for 1 <= i <= n. I have not done that here, but with this idea it should be straightforward.
Effectiveness
I have not done a full complexity analysis, but the best case is clearly O(1), e.g. when the root node holds the largest value and we ask for the largest value. In the worst case it is O(N), where N is the number of nodes in the tree, no matter which node we search for. The algorithm could certainly be improved for some inputs, e.g. if n is close to N, meaning we are searching for a small value. In that case it is faster to start from the left-most node, which is the smallest, and search for the (N - n + 1)-th smallest value. Implementing something like this could considerably improve the average-case runtime.
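A rough sketch of that "start from the nearer end" idea is shown below, assuming the caller already knows the number of nodes. With N nodes and distinct keys, the n-th largest key is the (N - n + 1)-th smallest. The names `TNode`, `kth` and `nthLargest` are hypothetical, and the sketch uses a plain stack-based traversal rather than the `done` set from the implementation above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TwoSidedSearch {
    static final class TNode {
        final int key;
        TNode left, right;
        TNode(int key) { this.key = key; }
    }

    static TNode insert(TNode root, int key) {
        if (root == null) return new TNode(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root; // repeated keys are ignored in this sketch
    }

    // k-th node (1-based) of the in-order walk (fromLeft == true)
    // or of the reverse in-order walk (fromLeft == false).
    static TNode kth(TNode root, int k, boolean fromLeft) {
        Deque<TNode> stack = new ArrayDeque<>();
        TNode current = root;
        while (current != null || !stack.isEmpty()) {
            while (current != null) {
                stack.push(current);
                current = fromLeft ? current.left : current.right;
            }
            TNode node = stack.pop();
            if (--k == 0) return node;
            current = fromLeft ? node.right : node.left;
        }
        return null;
    }

    // 'size' is the number of nodes in the tree; it is assumed to be known by the caller.
    static TNode nthLargest(TNode root, int n, int size) {
        if (n < 1) n = 1;
        if (n > size) return null; // early out: n is larger than the tree
        // pick the end of the tree that is closer to the target
        return (n > size / 2) ? kth(root, size - n + 1, true) : kth(root, n, false);
    }
}
```

Either direction still has to descend to one end of the tree first, so the gain is in how many nodes are counted after that, not in the descent itself.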
When dealing with recursive data structures, you can often expect some recursive code to work on them. This case is not an exception either, just the recursive parts will need some dirtiness.
Let's use the 3-element BSTs for designing, labeled with insertion-order:
123     132     213,231   312     321
1       1          2        3       3
 \       \        / \      /       /
  2       3      1   3    1       2
   \     /                 \     /
    3   2                   2   1
Finding the largest element is easy:
just go to the right as long as you can
You will bump into 3, no matter what level it is.
Finding the second largest element is the revealing part:
going to the right as long as it's possible still seems to be a good start
then we look at where we are:
if it's a leaf node, return to the parent, and that will be the one (123,213,231 cases)
if there's a left-child, check that one, but as the 312 case in particular shows, "checking" the left-child actually means step 1, so again go to the right as long as it's possible.
The recursion is more or less found, and these really are the steps we need for larger cases too. It can also be seen that while we are going to the right, the "nth-ness" of the next number we will check doesn't change. It starts changing only when we step to the left (132,312,321) or return to a previous level (123,213,231).
The tricky part is that we have to track this counter somehow. We have found the answer when it reaches 0 (so starting this algorithm with n=0 finds the largest element), and after that (when n goes negative) each level just returns the value it got from the recursion.
First, here is a JavaScript PoC using a few dirty hacks: a member variable that doesn't exist yet can still be checked (the if(this.left) things), and the counter is a one-element array so that it can be modified across the recursive calls. The method is called as root.nthback([i]), where [i] is an array literal. Also, the method doesn't bother returning anything when the element doesn't exist, which produces the two undefineds at the end of the test output. These shortcuts will be addressed in the Java variant at the end of this post.
The example input was simply taken from the other answer: besides being readily available, it contains some repeated values too.
const init = [148, 65, 18, 168, 8, 148, 194, 186, 114, 22, 102, 51, 123, 169, 68, 118, 37, 18, 26, 18];
class Node {
nthback(n) {
if (this.right) {
let res = this.right.nthback(n);
if (n[0] < 0)
return res;
}
if (n[0]-- === 0)
return this.value;
if (this.left) {
let res = this.left.nthback(n);
if (n[0] < 0)
return res;
}
}
constructor() {
this.value = NaN;
}
add(value) {
if (isNaN(this.value)) {
this.value = value;
} else if (value < this.value) {
if (!this.left)
this.left = new Node;
this.left.add(value);
} else {
if (!this.right)
this.right = new Node;
this.right.add(value);
}
}
walk() {
let result = "";
if (this.left)
result = this.left.walk() + ",";
result += this.value;
if (this.right)
result += "," + this.right.walk();
return result;
}
}
const root = new Node;
for (const value of init)
root.add(value);
console.log(root.walk());
for (let i = 0; i < 22; i++)
console.log(root.nthback([i]));
So the actual magic is quite short and also symmetric:
nthback(n) {
if (this.right) { // 1
const res = this.right.nthback(n); // 2
if (n[0] < 0) // 3
return res;
}
if (n[0]-- === 0) // 4
return this.value;
if (this.left) { // 5
const res = this.left.nthback(n);
if (n[0] < 0)
return res;
}
}
If there is something on the right (1), it has to be checked (2), and if the counter is negative afterwards (3), the result we got back is the actual result of the entire call, so we pass it back.
If we are still in the method, (4) is where we check if the counter is exactly 0, because then this node has the actual result, which we can return. It's worth remembering that the n[0]-- part decrements the counter regardless of the outcome of the comparison. So if n[0] was 0 initially, it will become -1 and we return this.value. If it was something else, it just gets decremented.
(5) does the (1)-(2)-(3) part for the left branch. And the hidden JavaScript thing is that we don't have to return anything. But the (4)-(5) parts will change when using your complete structure anyway.
Adding subtree-size allows early return if the incoming counter is simply larger than the size of the entire subtree: we just decrement the counter by the size, and return, the element is somewhere else. And this also means that when we don't return, the result is in the subtree, so we check the possible right branch, then ourselves, and if we are still inside the method, we don't even have to check if we have a left branch, because we do have it for sure, and it does contain the result, also for sure. So (5) will simply become a direct return this.left.nthback(n);. Which is quite a simplification.
Tracking multiplicity affects (4): instead of checking for 0, we will have to check if the counter is less than the multiplicity, and also, instead of decrementing the counter by 1, we have to subtract the actual multiplicity from it.
const init = [148, 65, 18, 168, 8, 148, 194, 186, 114, 22, 102, 51, 123, 169, 68, 118, 37, 18, 26, 18];
class Node {
nthback(n) {
if (this.size <= n[0]) {
n[0] -= this.size;
return 0;
}
if (this.right) {
let res = this.right.nthback(n);
if (n[0] < 0)
return res;
}
if (n[0] < this.count) {
n[0] -= this.count;
return this.value;
}
n[0] -= this.count;
return this.left.nthback(n);
}
constructor() {
this.value = NaN;
this.size = 0;
}
add(value) {
this.size++;
if (isNaN(this.value)) {
this.value = value;
this.count = 1;
} else if (value === this.value) {
this.count++;
} else if (value < this.value) {
if (!this.left)
this.left = new Node;
this.left.add(value);
} else {
if (!this.right)
this.right = new Node;
this.right.add(value);
}
}
walk() {
let result = "";
if (this.left)
result = this.left.walk() + ",";
result += this.value;
if (this.count > 1)
result += "x" + this.count;
result += "(" + this.size + ")";
if (this.right)
result += "," + this.right.walk();
return result;
}
}
const root = new Node;
for (const value of init)
root.add(value);
console.log(root.walk());
for (let i = 0; i < 22; i++)
console.log(root.nthback([i]));
So the final JavaScript variant could look like this:
nthback(n) {
if (this.size <= n[0]) { // 1
n[0] -= this.size;
return 0;
}
if (this.right) { // 2
let res = this.right.nthback(n);
if (n[0] < 0)
return res;
}
if (n[0] < this.count) { // 3
n[0] -= this.count; // 4
return this.value;
}
n[0] -= this.count; // 4
return this.left.nthback(n); // 5
}
Subtree-skipping happens in (1): by comparing the subtree size and the target count, we can immediately tell if the result is in this subtree or somewhere else. While JavaScript would allow a simple return; here, a 0 is produced instead, as it seems to be desired in the question.
The next step (2) is unchanged from the previous variant: it looks exactly the same and does exactly the same.
(3) had to be taken apart. While the post-decrement operator was helpful previously, there's no post--= operator, so it happens for both outcomes of the comparison (4). Also, we don't compare === 0 as before, but we compare < this.count instead. By the way, we could have done that in the previous case too if we really wanted to, there this.count was always 1, so < 1 could have done the thing (the counter is never negative at this point, that can only happen in the left/right returns). In fact if we really-really wanted to, we could do the subtraction prior to the comparison, and instead of the current 0<=n fact and n<count check, we could shift the comparison downwards, knowing that now -count<=n and checking for n<0.
(5) became the one-liner as promised. If we reach this point, the result is unconditionally inside the left-branch, which unconditionally exists.
Making it Java
The simplest part is the array-trick: there could be a public method to call, accepting an actual number, and then it could wrap it into an array, and call a private method doing the actual job. Also changed the variable names to the ones you have:
public int nthback(int n) {
return nthback(new int[] { n });
}
private int nthback(int[] n) {
if (treeSize <= n[0]) {
n[0] -= treeSize;
return 0;
}
if (rightChild != null) {
int res = rightChild.nthback(n);
if (n[0] < 0)
return res;
}
if (n[0] < dataCount) {
n[0] -= dataCount;
return data;
}
n[0] -= dataCount;
return leftChild.nthback(n);
}
As you have private members, these methods have to reside in the same source file, and then they could just reside directly inside class BSTNode anyway.
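To try the Java variant end to end, a self-contained sketch of such a class could look like the one below. The constructor and the `add` method are assumptions, ported from the JavaScript `add` above, since the question does not show how the tree is built. The private `nthback` uses the "subtract first, then check for a negative counter" form mentioned earlier, which should behave the same as the version above.

```java
class BSTNode {
    private int data;            // key
    private int dataCount;       // no. of keys with equal value stored at this node
    private int treeSize;        // no. of keys stored in the subtree rooted at this node
    private BSTNode leftChild;
    private BSTNode rightChild;

    // Assumed constructor: a node always starts out holding one key.
    public BSTNode(int value) {
        data = value;
        dataCount = 1;
        treeSize = 1;
    }

    // Assumed insertion routine, ported from the JavaScript add():
    // keeps dataCount and treeSize up to date on the way down.
    public void add(int value) {
        treeSize++;
        if (value == data) {
            dataCount++;
        } else if (value < data) {
            if (leftChild == null) leftChild = new BSTNode(value);
            else leftChild.add(value);
        } else {
            if (rightChild == null) rightChild = new BSTNode(value);
            else rightChild.add(value);
        }
    }

    // n is 0-based: n == 0 asks for the largest key. Returns 0 if there is no such key.
    public int nthback(int n) {
        return nthback(new int[] { n });
    }

    private int nthback(int[] n) {
        if (treeSize <= n[0]) {      // result is not in this subtree: skip it entirely
            n[0] -= treeSize;
            return 0;
        }
        if (rightChild != null) {
            int res = rightChild.nthback(n);
            if (n[0] < 0)            // found somewhere on the right
                return res;
        }
        n[0] -= dataCount;           // subtract first ...
        if (n[0] < 0)                // ... a negative counter means this node holds the result
            return data;
        return leftChild.nthback(n); // otherwise it must be in the left subtree
    }
}
```

Built from the example input, `nthback(0)` should give 194, `nthback(4)` and `nthback(5)` should both give 148, and `nthback(20)` should give 0 because only 20 keys are stored.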
I am currently trying to write a program that takes an array and makes it into a heap.
I am trying to implement a siftDown method that will take the elements of the array and make them into a heap, but I am just not getting the output I want and I'm not sure why:
public void makeHeap(int[] arr)
{
for(int i = 0; i <= arr.length; i++)//for each element in the array, we we check its children and sift it down
{
siftDown(arr, i);
}
}
//insert a new root in the correct position by comparing the top element and swapping it with its largest child.
public void siftDown(int[]heap, int i)
{
int c = i * 2; //grab the children of the current index we are at.
if(c == 0) // 0*2 is 0 so we make it 1 so it will register the first nodes parents
{
c+=1;
}
if(c >= heap.length || c+1 >= heap.length) // so we dont go off the end of the array
{
return;
}
if(heap[c] < heap[c + 1]) //Is child 1 less than child 2?
{
c += 1; //if it is we want c to be the greater child to eventually move upwards
}
if(heap[i] < heap[c])//If the parent we have just gotten is smaller than the child defined last loop, then we swap the two. We then call sift down again to compare child and parent.
{
heap = swap(heap,i,c);//swap the values
siftDown(heap, c);//call again to compare the new root with its children.
}
}
Here is my swap method:
public int[] swap(int[]heap, int i, int c)
{
//capture the two values in variables
int ele1 = heap[i];
int ele2 = heap[c];
heap[i] = ele2;//change heap i to heap c
heap[c] = ele1;//change heap c to heap i
return heap;
}
The starting numbers are: 4,7,9,6,3,1,5. The output I want to get is 9,7,5,6,3,1,4, but I only seem to be able to get 9,7,4,6,3,1,5. It seems that once 4 gets sifted down after being replaced by 9, the algorithm goes out of whack and believes that 3 and 1 are its children when 1 and 5 should be.
Thank you for your help!
int c = i * 2; //grab the children of the current index we are at.
if(c == 0) // 0*2 is 0 so we make it 1 so it will register the first nodes parents
{
c+=1;
}
This looks wrong to me.
The heap indices for children should be a general formula that works for all cases.
Given parentIdx in 0-based array,
leftChildIdx = parentIdx * 2 + 1
rightChildIdx = parentIdx * 2 + 2
This means children of 0 are 1 and 2, children of 1 are 3 and 4, children of 2 are 5 and 6, etc. It works.
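Your code places the children of 1 at 2 and 3, and the children of 2 at 4 and 5, which is clearly wrong. In your example, the 4 that lands at index 2 is compared with 3 and 1 (indices 4 and 5) instead of 1 and 5 (indices 5 and 6).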
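A minimal sketch of a siftDown using these index formulas might look like this. Two details go beyond what is stated above and are assumptions on my part: each child is bounds-checked separately, so a node with only a left child is still handled, and makeHeap walks from the last parent back to the root, which is the usual bottom-up way to build a heap. The class name `HeapSketch` is hypothetical.

```java
class HeapSketch {
    // Restores the max-heap property for the subtree rooted at index i (0-based array).
    static void siftDown(int[] heap, int i) {
        int left = 2 * i + 1;
        int right = 2 * i + 2;
        int largest = i;
        // check each child on its own, so a lone left child is not skipped
        if (left < heap.length && heap[left] > heap[largest]) largest = left;
        if (right < heap.length && heap[right] > heap[largest]) largest = right;
        if (largest != i) {
            int tmp = heap[i];
            heap[i] = heap[largest];
            heap[largest] = tmp;
            siftDown(heap, largest); // keep sifting the moved value down
        }
    }

    // Builds a max-heap in place, from the last parent up to the root.
    static void makeHeap(int[] arr) {
        for (int i = arr.length / 2 - 1; i >= 0; i--) {
            siftDown(arr, i);
        }
    }
}
```

For the input 4,7,9,6,3,1,5 this produces 9,7,5,6,3,1,4, which is the output asked for in the question.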
In this implementation of a priority queue I have to use recursion to determine whether the array that I receive as a parameter of the method isMaxHeap is a max heap.
I want to be sure that I cover all the cases that could make this succeed or fail.
static boolean isMaxHeap(int[] H, int idx) {
if(2*idx == (H.length -1) && H[2 * idx] <= H[idx])
return true;
if(2*idx > (H.length -1))
return true;
if ((2*idx+1) == (H.length -1) && H[2 * idx] <= H[idx] && H[2 * idx + 1] <= H[idx])
return true;
if((2*idx+1) > (H.length -1))
return true;
if (H[2 * idx] <= H[idx] && H[2 * idx + 1] <= H[idx])
return isMaxHeap(H, (idx + 1));
else
return false;
}
Could you help me?
Your code is hard to follow because you do so many calculations in your conditionals. So it's hard to say whether it would actually work. Also, what you've written is basically a recursive implementation of a for loop. That is, you check nodes 1, 2, 3, 4, 5, etc.
While that can work, you end up using O(n) stack space. If you have a very large heap (say, several hundred thousand items), you run the risk of overflowing the stack.
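For comparison, if recursion were not a requirement, the same node-by-node check could be written as a plain loop that needs no stack space at all. This is only a sketch, assuming the root is stored at index 1 and H[0] is unused, so that the parent of index c is c / 2; the class name `HeapCheckLoop` is hypothetical.

```java
class HeapCheckLoop {
    // Iterative max-heap check for a 1-based layout: root at H[1], H[0] unused.
    static boolean isMaxHeap(int[] H) {
        // every node except the root must not be larger than its parent
        for (int child = 2; child < H.length; child++) {
            if (H[child] > H[child / 2]) {
                return false;
            }
        }
        return true;
    }
}
```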
A more common way to implement this recursively is to do a depth-first traversal of the tree. That is, you follow the left child all the way to a leaf, then go up one level and check that node's right child, and its left children all the way to a leaf, etc. So, given this heap:
1
2 3
4 5 6 7
8 9 10 11 12 13 14 15
You would check nodes in this order: [1, 2, 4, 8, 9, 5, 10, 11, 3, 6, 12, 13, 7, 14, 15]
Doing it that way simplifies your code and also limits your stack depth to O(log n), meaning that even with a million nodes your stack depth doesn't exceed 20.
Since you're using 2*idx and 2*idx+1 to find the children, I'm assuming that your array is set up so that your root node is at index 1. If the root is at index 0, then those calculations would be: 2*idx+1 and 2*idx+2.
static boolean isMaxHeap(int[] H, int idx)
{
// Check for going off the end of the array
if (idx >= H.length)
{
return true;
}
// Check the left child.
int leftChild = 2*idx;
if (leftChild < H.length)
{
if (H[leftChild] > H[idx])
return false;
if (!isMaxHeap(H, leftChild))
return false;
}
// Check the right child.
int rightChild = 2*idx + 1;
if (rightChild < H.length)
{
if (H[rightChild] > H[idx])
return false;
return isMaxHeap(H, rightChild);
}
return true;
}
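If the root is stored at index 0 instead, the same depth-first check could be sketched as below, with the child indices changed to 2*idx+1 and 2*idx+2 as mentioned above. The class name `HeapCheckZeroBased` is hypothetical, and the call is expected to start with idx = 0.

```java
class HeapCheckZeroBased {
    // Depth-first max-heap check for a 0-based layout: root at H[0].
    static boolean isMaxHeap(int[] H, int idx) {
        // going off the end of the array: nothing left to violate the heap property
        if (idx >= H.length) {
            return true;
        }
        // check the left child and its whole subtree
        int leftChild = 2 * idx + 1;
        if (leftChild < H.length) {
            if (H[leftChild] > H[idx]) return false;
            if (!isMaxHeap(H, leftChild)) return false;
        }
        // check the right child and its whole subtree
        int rightChild = 2 * idx + 2;
        if (rightChild < H.length) {
            if (H[rightChild] > H[idx]) return false;
            return isMaxHeap(H, rightChild);
        }
        return true;
    }
}
```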
I was trying to solve a problem in BST where the question was "Check if all leaves are at same level"
In this question, I need to keep incrementing the level per stack call but also maintain a value across all calls, the maximum level. Along with this, I need to return a boolean with the result.
It is simple enough to solve: I need to keep going down the tree. I did this:
int maxlevel = 0;
public boolean allAtSameLevel(Node root, int level){
if(root== null){
return false;
}
if(root.left== null && root.right == null){
if(maxlevel == 0){
maxlevel = level;
}
return(level== maxlevel);
}
return allAtSameLevel(root.left, level+1) && allAtSameLevel(root.right, level+1) ;
}
My problem is that for a value that needs to be shared, I have to maintain an instance variable in java. Is there a better way to do this? My confusion is that since, it's going to go all the way to the right leaf first and then go up, passing the value won't help. Any ideas?
The trick is to communicate the depth of the subtree back to the higher level of the stack, but doing so only when both right and left depths are the same. You can do it with a simple recursive function:
int eqDepth(Node n) {
if (n == null) return 0; // This is an empty subtree (below a leaf), its depth is zero
int dLeft = eqDepth(n.left); // Make two recursive calls
int dRight = eqDepth(n.right);
// If one of the depths is negative, or the depths are different,
// report it by returning negative 1:
if (dLeft < 0 || dRight < 0 || dLeft != dRight) return -1;
return 1+dLeft; // It's the same as dRight
}
With this function in hand, you can code allAtSameLevel in one line:
public boolean allAtSameLevel(Node root) {
return eqDepth(root) >= 0;
}
Here is a demo on ideone. It starts off with a tree that is missing one node and gets a -1:
      a
    /   \
   b     c
  / \   / \
 d   e f   -
then adds the missing node
      a
    /   \
   b     c
  / \   / \
 d   e f   g
and gets a positive result.
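The demo can be reproduced locally with a sketch like the following. Note that eqDepth treats an empty subtree as depth zero, so a node with exactly one child is reported as -1 even when all actual leaves (d, e and f in the first tree) sit on the same level. In case a missing child should simply be ignored instead, the sketch also includes a hypothetical variant, `leafDepth`; that variant, the sentinel names `MISMATCH` and `EMPTY`, and the small `Node` class are assumptions and not part of the original demo.

```java
class LeafLevels {
    static final class Node {
        Node left, right;
        Node(Node left, Node right) { this.left = left; this.right = right; }
    }

    static Node leaf() { return new Node(null, null); }

    // As in the answer: an empty subtree has depth 0, -1 signals unequal depths.
    static int eqDepth(Node n) {
        if (n == null) return 0;
        int dLeft = eqDepth(n.left);
        int dRight = eqDepth(n.right);
        if (dLeft < 0 || dRight < 0 || dLeft != dRight) return -1;
        return 1 + dLeft;
    }

    static final int MISMATCH = -1; // leaves found on different levels
    static final int EMPTY = -2;    // no node here, so nothing to compare

    // Hypothetical variant: returns the common depth of all leaves below n
    // (a leaf itself has depth 0) and ignores missing children.
    static int leafDepth(Node n) {
        if (n == null) return EMPTY;
        int dLeft = leafDepth(n.left);
        int dRight = leafDepth(n.right);
        if (dLeft == MISMATCH || dRight == MISMATCH) return MISMATCH;
        if (dLeft == EMPTY && dRight == EMPTY) return 0; // n is a leaf
        if (dLeft == EMPTY) return 1 + dRight;           // only a right child
        if (dRight == EMPTY) return 1 + dLeft;           // only a left child
        return dLeft == dRight ? 1 + dLeft : MISMATCH;
    }

    static boolean allLeavesAtSameLevel(Node root) {
        return leafDepth(root) != MISMATCH;
    }
}
```

For the first tree above, eqDepth gives -1 while allLeavesAtSameLevel gives true; for the second tree both report success. Which behaviour is wanted depends on how the original problem defines a leaf level.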