B+ Tree split bug - java

I want to be up front: what I am about to talk about is homework. We are supposed to implement a B+ tree. I've got it most of the way there, but I have a problem when a node splits. Specifically, when a non-leaf node (excluding the root) splits, I lose my far-right pointer.
For example if the tree was
|3 5|
|1 2| |4| |5 6|
I lose the pointer to |5 6|. So when I search for those values I cannot find them or when I go to add a value that would follow that path I get a null pointer exception.
Anyway, I would normally just paste my code here but, unfortunately, my school has developed a problem with cheating, and since the program is due soon I am sure a lot of my classmates are scouring the internet for the code. The last thing I want is for someone to rip off my code.
If anyone wouldn't mind looking at the code I will gladly send it to you to check out. Once again it is in Java and is pretty lengthy.
Thanks in advance.
Here is the code. On a side note, when I clear offsets and keys I use int and long MAX_VALUE so that when I sort, those cleared values go to the end of the node. The Split class is just a dumb idea from earlier that I need to fix. It consists of a node, an offset, and a key. Originally I was thinking that I might need to return an offset and key that weren't in the split node. I then realized that was dumb and all I would ever need to return was the new node itself.
public void add (int key, long offset) throws IOException
{
if (root != null) //start search of where to add the book
{
SplitBucket split = add(root, key, offset); //recursive call
if (split != null) //root has split
{
long newRootOffset;
//make new root and have it point to old root and the split node
BookNode newRoot = new BookNode();
newRoot.changeCurrentChildren(1);
newRoot.setChildKey(0, split.key);
newRoot.setChildOffset(0, root.getMyOffset());
newRoot.setChildOffset(1, split.offset);
newRoot.setChildOffset(2,
root.getChildOffset(Constants.childSize -1));
newRoot.setNode(0, root);
newRoot.setNode(1, split.node);
newRoot.setNode(2, root.getNode(Constants.childSize - 1));
io.setBookNode(root.getMyOffset(), root);
newRootOffset = io.insertNewNode(newRoot);
io.setRoot(newRootOffset);
root = newRoot;
}
}
else //empty tree so create root and add
{
long rootOffset = Long.MAX_VALUE;
root = new BookNode();
root.setChildKey(0, key);
root.setChildOffset(0, offset);
root.changeCurrentChildren(1);
root.switchLeaf(true);
rootOffset = io.insertNewNode(root);
io.setRoot(rootOffset);
root.setMyOffset(rootOffset);
}
}
/**
*
* @param current current BookNode
* @param key Isbn to add
* @param offset offset of Book to add
* @return BookNode if a split occurs
* @throws IOException
*/
private SplitBucket add (BookNode current, int key, long offset)
throws IOException
{
if (current.isLeaf()) // at the bottom level
{
//room to add
if (current.getCurrentChildren() < Constants.childSize - 1)
{
//add the offset and key to the end of the node.
//sort the node and rewrite to file
current.setChildOffset(current.getCurrentChildren(), offset);
current.setChildKey(current.getCurrentChildren(), key);
current.changeCurrentChildren(1);
current.sortKeysAndOffsets();
io.setBookNode(current.getMyOffset(), current);
return null;
}
else //not enough room must split
{ //add offset and key to end of node and sort
current.setChildKey(current.getCurrentChildren(), key);
current.setChildOffset(current.getCurrentChildren(), offset);
current.changeCurrentChildren(1);
current.sortKeysAndOffsets();
int start = current.getCurrentChildren() / 2;
long newNodeOffset =Long.MAX_VALUE;
SplitBucket bucket = new SplitBucket();
BookNode newNode = new BookNode();
newNode.switchLeaf(true);
for(int i = start; i < Constants.childSize; i++)
{
//new node will hold the larger split values
newNode.setChildKey(i - start, current.getChildKey(i));
newNode.setChildOffset(i - start, current.getChildOffset(i));
newNode.setNode(i - start, current.getNode(i));
newNode.changeCurrentChildren(1);
current.setChildKey(i, Integer.MAX_VALUE);
current.setChildOffset(i, Long.MAX_VALUE);
current.setNode(i, null);
current.changeCurrentChildren(-1);
}
//since sorted prior to for loop all data
//needs not to be sorted again
newNode.sortKeysAndOffsets();
current.sortKeysAndOffsets();
//Transferring pre-split nodes 'next' pointer to new node
newNode.setChildOffset(Constants.childSize,
current.getChildOffset(Constants.childSize));
newNode.setNode(Constants.childSize,
current.getNode(Constants.childSize));
newNodeOffset = io.insertNewNode(newNode);
newNode.setMyOffset(newNodeOffset);
current.setChildOffset(Constants.childSize, newNodeOffset);
current.setNode(Constants.childSize, newNode);
io.setBookNode(current.getMyOffset(), current);
bucket.key = newNode.getChildKey(0);
bucket.offset = newNode.getMyOffset();
bucket.node = newNode;
return bucket;
}
}
else //not at a leaf
{
int index = 0;
//find pointer index to follow
while (index < current.getCurrentChildren()
&& key >= current.getChildKey(index))
{
index++;
}
//recursive call
SplitBucket bucket = add(current.getNode(index), key, offset);
if(bucket != null) //split occurred
{
//bucket not full so add here
if(current.getCurrentChildren() < Constants.childSize)
{
current.setChildKey(current.getCurrentChildren(), bucket.key);
current.setChildOffset(current.getCurrentChildren(),
bucket.offset);
current.setNode(current.getCurrentChildren(), bucket.node);
current.changeCurrentChildren(1);
current.sortKeysAndOffsets();
io.setBookNode(current.getMyOffset(), current);
bucket = null;
}
else //bucket is full so split
{
int start = current.getCurrentChildren() / 2;
long newNodeOffset = Long.MAX_VALUE;
BookNode newNode = new BookNode();
for(int i = start; i < Constants.childSize; i++)
{
//larger keys go to the new node
newNode.setChildKey(i - start, current.getChildKey(i));
newNode.setChildOffset(i - start,
current.getChildOffset(i));
newNode.setNode(i - start, current.getNode(i));
newNode.changeCurrentChildren(1);
current.setChildKey(i, Integer.MAX_VALUE);
current.setChildOffset(i, Long.MAX_VALUE);
current.setNode(i, null);
current.changeCurrentChildren(-1);
}
if(bucket.key > newNode.getChildKey(0)) //goes in new bucket
{
newNode.setChildKey(newNode.getCurrentChildren(),
bucket.key);
newNode.setChildOffset(newNode.getCurrentChildren(),
bucket.offset);
newNode.setNode(newNode.getCurrentChildren(),
bucket.node);
newNode.changeCurrentChildren(1);
newNode.sortKeysAndOffsets();
}
else //goes in old bucket
{
current.setChildKey(current.getCurrentChildren(),
bucket.key);
current.setChildOffset(current.getCurrentChildren(),
bucket.offset);
current.setNode(current.getCurrentChildren(),
bucket.node);
current.changeCurrentChildren(1);
current.sortKeysAndOffsets();
}
//may not need this line and next
newNode.setChildOffset(newNode.getCurrentChildren(),
current.getChildOffset(Constants.childSize));
newNode.setNode(newNode.getCurrentChildren(),
current.getNode(Constants.childSize));
newNodeOffset = io.insertNewNode(newNode);
newNode.setMyOffset(newNodeOffset);
io.setBookNode(current.getMyOffset(), current);
bucket = new SplitBucket();
//return middle key value of split node
bucket.key = newNode.getChildKey(
newNode.getCurrentChildren() /2);
bucket.offset = newNode.getMyOffset();
bucket.node = newNode;
return bucket;
}
}
}
return null;
}

Write a test case, or a 'main' method, for the case that fails. Then you can set a breakpoint and debug just that situation.
Put logging in your code to output the important, decisive information about what it is doing, so you can see where it goes wrong.
Don't log uninteresting stuff: log the API calls, which nodes are being created or updated, and which key ranges are being split. Log what really tells you what's going on.
If you don't like logging, you can step through in a debugger. It's not as efficient or productive as using logging to debug and engineer your code, though.
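As a starting point for such a test, here is a minimal in-memory sketch of an internal-node split. It is not the asker's code: `InternalSplitSketch`, `InternalNode`, `Split`, `splitInternal` and `wellFormed` are hypothetical names, and the file I/O is left out. It shows the invariant that is worth logging or asserting after every split: an internal node with k keys must have k + 1 children, so the far-right child has to be carried over to the new node.

```java
import java.util.ArrayList;
import java.util.List;

class InternalSplitSketch {
    // Hypothetical in-memory internal node: k keys always go with k + 1 children.
    static class InternalNode {
        final List<Integer> keys = new ArrayList<>();
        final List<Object> children = new ArrayList<>();
    }

    // Result of a split: the key pushed up to the parent and the new right sibling.
    static class Split {
        final int promotedKey;
        final InternalNode right;

        Split(int promotedKey, InternalNode right) {
            this.promotedKey = promotedKey;
            this.right = right;
        }
    }

    // Splits an over-full internal node in place. The middle key moves up
    // to the parent and is stored in neither half.
    static Split splitInternal(InternalNode node) {
        int mid = node.keys.size() / 2;
        int promoted = node.keys.get(mid);
        InternalNode right = new InternalNode();
        // Keys after the middle go right. Children after the middle go right
        // too, including the far-right child.
        right.keys.addAll(node.keys.subList(mid + 1, node.keys.size()));
        right.children.addAll(node.children.subList(mid + 1, node.children.size()));
        // Trim the left node: keep keys[0, mid) and children[0, mid].
        node.keys.subList(mid, node.keys.size()).clear();
        node.children.subList(mid + 1, node.children.size()).clear();
        return new Split(promoted, right);
    }

    // The invariant to check (or log) after every split.
    static boolean wellFormed(InternalNode n) {
        return n.children.size() == n.keys.size() + 1;
    }
}
```

A failing `wellFormed` check on either half right after a split would point directly at the loop that moves keys and children.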

Linked List sorting issue

YES, this is a homework project.
That being said, I'm looking to learn from my mistakes rather than just have someone do it for me.
My project is a word frequency list - I accept a text file (or website URL) and count the:
- Number of unique words, and
- How many times they appear.
All methods are provided for me except for one: the insert(E word) method, where the argument is a generic type word.
The word is stored in a Node (Linked List project) that also has a 'count' value, which is the value representing the number of times the word appears in the text being read.
What this method has to do is the following:
1. If the argument is already in the list, increment the count of that element. I have done this part.
2. If the argument is not found in the list, append it to the list. I also have done this part.
3. Sort the list by descending count value, i.e. highest -> lowest count.
3.5. If two elements have the same count value, they are sorted by the dictionary order of their word.
I am VERY unfamiliar with Linked Lists, so as such I am running into a lot of NullPointerExceptions. This is my current insert method:
public void insert(E word){
if(word.equals("")){
return;
}
if(first == null){//if list is null (no elements)
/*Node item = new Node(word);
first = item;*/
first = new Node(word);
}
else{//first != null
Node itemToAdd = new Node(word);
boolean inList = false;
for(Node x = first; x != null; x=x.next){
if (x.key.equals(word)){// if word is found in list
x.count++;//incr
inList = true;//found in list
break;//get out of for
}//end IF
if(x.next == null && inList == false){//if end of list && not found
x.next = itemToAdd;//add to end of list
break;
}//end IF
}//end FOR
//EVERYTHING ABOVE THIS LINE WORKS.
if (!isSorted()){
countSort();
}
}//end ELSE
}//end method
My isSorted() method:
public boolean isSorted(){
for(Node copy = first; copy.next != null; copy = copy.next){
if (copy.count < copy.next.count){
return false;
}
}
return true;
}
and last but not least, the part where I'm struggling, the sort method:
public void countSort(){
for (Node x = first, p = x.next; p != null; x=x.next, p=p.next){
// x will start at the first Node, P will always be 1 node ahead of X.
if(x == first && (x.count < p.count)){
Node oldfirst = first;
x.next = p.next;
first = p;
first.next = oldfirst;
break;
}
if (x.count < p.count){
//copy.next == x.
Node oldfirst = first;
oldfirst.next = first.next;
x.next = p.next;
first = p;
first.next = oldfirst;
break;
}
if (x.count == p.count){
if(x.toString().charAt(0) < p.toString().charAt(0)){
//[x]->[p]->[q]
Node oldfirst = first;
x.next = p.next;
first = p;
first.next = oldfirst;
break;
}
}
}
}
Here is the output of my insert method when called by the classes/methods given to me:
Elapsed time:0.084
(the,60)
(of,49)
(a,39)
(is,46)
(to,36)
(and,31)
(can,9)
(in,19)
(more,7)
(thing,7)
(violent,3)
(things,3)
(from,9)
(collected,1)
(quotes,1)
(albert,1)
(einstein,2)
(any,2)
(intelligent,1)
(fool,1)
(make,1)
(bigger,1)
(complex,1)
(it,11)
(takes,1)
(touch,1)
(genius,1)
(lot,1)
(courage,1)
(move,1)
(opposite,1)
(direction,1)
(imagination,1)
(important,5)
(than,3)
(knowledge,3)
(gravitation,1)
(not,17)
(responsible,1)
(for,14)
(people,2)
(falling,1)
(love,2)
(i,13)
(want,1)
(know,3)
(god,4)
(s,8)
(thoughts,2)
(rest,2)
(are,11)
(details,2)
(hardest,1)
(world,7)
(understand,3)
(income,1)
(tax,1)
(reality,3)
(merely,1)
(an,7)
(illusion,2)
(albeit,1)
(very,3)
(persistent,2)
(one,12)
(only,7)
(real,1)
(valuable,1)
(intuition,1)
(person,1)
(starts,1)
(live,2)
(when,3)
(he,11)
(outside,1)
(himself,4)
(am,1)
(convinced,1)
(that,14)
(does,5)
(play,2)
(dice,1)
(subtle,1)
(but,8)
(malicious,1)
(weakness,2)
(attitude,1)
(becomes,1)
(character,1)
(never,3)
(think,1)
(future,2)
(comes,1)
(soon,1)
(enough,1)
(eternal,1)
(mystery,1)
(its,4)
(comprehensibility,1)
(sometimes,1)
My initial idea has been to try and loop the if(!isSorted()){ countSort();} part to just repeatedly run until it's sorted, but I seem to run into an infinite loop when doing that. I've tried following my professor's lecture notes, but unfortunately he posted the previous lecture's notes twice so I'm at a loss.
I'm not sure if it's worth mentioning, but they provided me an iterator with methods hasNext() and next() - how can I use this as well? I can't imagine they'd provide it if it were useless.
Where am I going wrong?
You are close. First the function to compare the items is not complete, so isSorted() could yield wrong results (if the count is the same but the words are in wrong order). This is also used to sort, so it's best to extract a method for the comparison:
// returns a value < 0 if a < b, a value > 0 if a > b and 0 if a == b
public int compare(Node a, Node b) {
if (a.count == b.count) {
return a.word.compareTo(b.word);
// case-insensitive: a.word.toLowerCase().compareTo(b.word.toLowerCase())
} else {
return a.count - b.count;
}
}
Or simplified which is enough in your case:
public boolean correctOrder(Node a, Node b) {
if (a.count > b.count)
return true;
else if (a.count < b.count)
return false;
else
return a.word.compareTo(b.word) <= 0;
}
For the sort you seem to have chosen bubble sort, but you are missing the outer part:
boolean change;
do {
change = false;
Node oldX = null;
// your for:
for (Node x = first; x.next != null; x = x.next) {
if (!correctOrder(x, x.next)) {
// swap x and x.next, if oldX == null then x == first
change = true;
}
oldX = x;
}
} while (change);
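The skeleton above leaves the swap itself open. Here is one way it could be filled in, as a sketch under assumptions: `WordListSketch` and its `Node` (with a `String word`, an `int count` and a `next` link) are stand-ins for the classes in the question, not the real ones. The inner loop is written as a `while` so that the current node is not advanced after a swap, which avoids running off the end of the list.

```java
class WordListSketch {
    // Hypothetical node type mirroring the question: a word, its count, a next link.
    static class Node {
        final String word;
        int count;
        Node next;

        Node(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    Node first;

    // True when a may stay in front of b: higher count first, ties alphabetical.
    static boolean correctOrder(Node a, Node b) {
        if (a.count != b.count) {
            return a.count > b.count;
        }
        return a.word.compareTo(b.word) <= 0;
    }

    // Bubble sort by relinking nodes; repeats passes until one makes no swap.
    void countSort() {
        boolean change;
        do {
            change = false;
            Node prev = null;
            Node x = first;
            while (x != null && x.next != null) {
                Node y = x.next;
                if (!correctOrder(x, y)) {
                    // Unlink y and put it in front of x.
                    x.next = y.next;
                    y.next = x;
                    if (prev == null) {
                        first = y;      // x was the head, so y becomes the head
                    } else {
                        prev.next = y;
                    }
                    prev = y;           // y now precedes x; x stays the current node
                    change = true;
                } else {
                    prev = x;
                    x = y;
                }
            }
        } while (change);
    }
}
```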
We could use the Java standard library or a more efficient sort algorithm, but judging from the exercise, the performance of the sort is of no concern yet; you first need to grasp the basic concepts.
Looking at your code, it seems to me that two things can be done:
First, you can make use of the Comparable interface. I assume you wrote the Node class, so you can have it implement Comparable. Java will then require you to provide the compareTo method, and all you need to do is specify in that method that "I want to compare according to your counts and I want it to be in ascending order."
**Edit(1): By the way, I forgot to mention it before, but after you implement your compareTo method, you can use Collections.sort(list) if the elements are kept in a java.util.List such as LinkedList, and it will be done.
The second solution that came to mind is to sort during the countSort() operation by moving every node into another list, inserting each one at its sorted position, and afterwards adding them all back to the real list. The technique I mean is: keep walking towards the end of the list until you find a Node whose count is smaller than the count of the Node being added. I hope that isn't confusing; this way you get a clearer, less complicated method. To be clear, here is the procedure again:
Look the next
If (next is null), add it //You are at the end.
else{
if (count is smaller than current count), add it there
else, keep moving to the next Node. //while can be used for that.
}
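The procedure above amounts to an insertion sort on the linked list. A minimal sketch, assuming a hypothetical `Node` with `word`, `count` and `next` fields (the class and method names here are illustrative, not from the assignment):

```java
class InsertionSortSketch {
    // Hypothetical node type: a word, its count, and a link to the next node.
    static class Node {
        final String word;
        int count;
        Node next;

        Node(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // True when a must come strictly before b: higher count first, ties alphabetical.
    static boolean before(Node a, Node b) {
        return a.count > b.count
                || (a.count == b.count && a.word.compareTo(b.word) < 0);
    }

    // Inserts node into the already-sorted list starting at head; returns the new head.
    static Node insertSorted(Node head, Node node) {
        if (head == null || before(node, head)) {
            node.next = head;
            return node;
        }
        Node cur = head;
        // Keep moving until the next node should come after the one being added.
        while (cur.next != null && !before(node, cur.next)) {
            cur = cur.next;
        }
        node.next = cur.next;
        cur.next = node;
        return head;
    }

    // Moves every node of the old list into a new sorted list and returns its head.
    static Node sort(Node head) {
        Node sorted = null;
        while (head != null) {
            Node next = head.next;   // remember the rest before relinking
            sorted = insertSorted(sorted, head);
            head = next;
        }
        return sorted;
    }
}
```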

Binary search in Java - learning it "my way"

So I'm trying to teach myself how to implement a binary search in Java, as the topic might have given away, but am having some trouble.
See, I tend to be a little stubborn, and I'd rather not just copy some implementation off the internet.
In order to teach myself this, I created a very (VERY) rough little class which looks as follows:
public class bSearch{
/**
* @param args
*/
public static void main(String[] args) {
int one = 1;
int two = 2;
int three = 3;
int four = 4;
int five = 5;
int six = 6;
ArrayList tab = new ArrayList();
tab.add(one);
tab.add(two);
tab.add(three);
tab.add(four);
tab.add(five);
tab.add(six);
System.out.println(bSearch(tab, 53));
}
@SuppressWarnings({ "rawtypes", "unchecked" })
public static int bSearch(ArrayList tab, int key) {
if (tab.size() == 0)
return 0;
if ((int) tab.get(tab.size() / 2) == key)
return key;
ArrayList smallerThanKey = new ArrayList();
ArrayList largerThanKey = new ArrayList();
for (int i = 0; i < (tab.size() + 1) / 2; i++) {
smallerThanKey.add(tab.get(i));
}
System.out.println("Smaller array = " + smallerThanKey);
for (int i = (tab.size() + 1) / 2; i < tab.size(); i++) {
largerThanKey.add(tab.get(i));
}
System.out.println("Larger array = " + largerThanKey);
if (key < (int) tab.get(tab.size() / 2)) {
bSearch(smallerThanKey, key);
} else {
bSearch(largerThanKey, key);
}
return key;
}
}
As you can see, it's pretty far from beautiful, but it's clear enough for a noobie like myself to understand, anyway.
Now, here's the problem; when I feed it a number that is in the ArrayList, it feeds the number back to me (hurray!), but when I feed it a number that's not in the ArrayList, it still feeds me my number back to me (boo!).
I have a feeling my error is very minor, but I just can't see it.
Or am I all wrong, and there is some larger fundamental error?
Your help is deeply appreciated!
UPDATE
Thanks for all the constructive comments and answers! Several of you gave many helpful pointers in the right direction. +1 for everyone who bumped me along the right path.
By following the advice you gave, mostly relating to my recursions not ending properly, I added a few return statements, as follows;
if (key < (int) tab.get(tab.size() / 2)) {
return bSearch(smallerThanKey, key);
} else {
return bSearch(largerThanKey, key);
}
Now, what this does is one step closer to what I want to achieve.
I now get 0 if the number is nowhere to be found, and the number itself if it is to be found. Thus progress is being made!
However, it does not work if I have it search for a negative number or zero (not that I know why I should, but just throwing that out there).
Is there a fix for this, or am I barking up the wrong tree in questioning?
EDIT
Just as a quick solution to the exact question you're asking: you need to change the last few lines to the following
if (key < (int) tab.get(tab.size() / 2)) {
return bSearch(smallerThanKey, key);
} else {
return bSearch(largerThanKey, key);
}
}
Having said that, let me point out a few more issues that I see here:
(a) You can use generics. That is, use ArrayList<Integer> rather than just ArrayList; this will save you from all those casts.
(b) Instead of returning the value that you found, you'd be better off returning the index in the ArrayList where the value is located, or -1 if it was not found. Here's why: returning the key gives the caller very little new information, since the caller already knows what the key is. If you return the index of the key, you let the caller know whether the key was found and, if it was, where in the list it resides.
(c) You are essentially copying the entire list each time you go into bSearch(): you copy roughly half of the list into smallerThanKey and (roughly) half into largerThanKey. This means that the complexity of this implementation is not O(log n) but instead O(n).
(EDIT #2)
Summarizing points (a), (b), (c) here's how one could write that method:
public static int bSearch(ArrayList<Integer> tab, int key) {
return bSearch(tab, 0, tab.size(), key);
}
public static int bSearch(ArrayList<Integer> tab, int begin, int end, int key) {
int size = end - begin;
if (size <= 0)
return -1;
int midPoint = (begin + end) / 2;
int midValue = tab.get(midPoint);
if (midValue == key)
return midPoint;
if (key < midValue) {
return bSearch(tab, begin, midPoint, key);
} else {
return bSearch(tab, midPoint + 1, end, key);
}
}
As you can see, I added a second method that takes begin and end parameters. These parameters tell the method which part of the list it should look at. This is much cheaper than creating a new list and copying elements to it. Instead, the recursive function just uses the same list object and simply calls itself with new begin and end values.
The return value is now the index of the key inside the list (or -1 if not found).
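For comparison, the same begin/end idea can also be written as a loop instead of a recursion. This is a sketch of a variant, not the code from the answer; the class name `IterativeSearchSketch` is made up for the example, and it assumes the list is sorted in ascending order and offers fast random access (as ArrayList does).

```java
import java.util.List;

class IterativeSearchSketch {
    // Returns the index of key in the sorted list, or -1 if it is absent.
    static int bSearch(List<Integer> tab, int key) {
        int begin = 0;
        int end = tab.size();               // exclusive upper bound
        while (begin < end) {
            int mid = (begin + end) >>> 1;  // midpoint without int overflow
            int midValue = tab.get(mid);
            if (midValue == key) {
                return mid;
            }
            if (key < midValue) {
                end = mid;                  // continue in the left half
            } else {
                begin = mid + 1;            // continue in the right half
            }
        }
        return -1;
    }
}
```

Because "not found" is reported as -1 rather than 0, searching for zero or a negative key no longer collides with the failure value.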
Your recursion does not end properly. At the end of the method you recursively call the bSearch method for the left or right part of the array. At that point you need to return the search result of the recursive calls.
The idea of the binary search is: If your current node is not the key, look at the left if the value of the current node is bigger than the key or look at the right if it is smaller. So after looking there you need to return the search result from there.
if (key < (int) tab.get(tab.size() / 2)) {
return bSearch(smallerThanKey, key);
} else {
return bSearch(largerThanKey, key);
}
As a side remark, have a look at System.arraycopy and it is always a good idea to not suppress warnings.
I think the issue is here:
if (key < (int) tab.get(tab.size() / 2)) {
bSearch(smallerThanKey, key);
} else {
bSearch(largerThanKey, key);
}
return key;
You're just throwing away the result of your recursive call to bSearch and returning key. So it isn't really much of a surprise you get back whatever number you feed into the method.
Remember how binary search is supposed to work -- if the value isn't in the middle, return the result of searching in the left/right half of the array. So you need to do something with those recursive calls....
And with binary search, you really should be more concerned about finding the location of whatever you're looking for, not its value -- you know that already! So what you think was the binary search working right was a bit mistaken -- searching for 1 should have returned 0 -- the index/location of 1.
Also, you shouldn't need to deal with copying arrays and such -- that's an operation that is unnecessary for searches. Just use parameters to indicate where to begin/end searching.

Why is my probability so far off when using this Binary Tree?

It's basically just an implementation of the Huffman Coding algorithm, but when I check the probability of the end BinaryTree (the only item in the queue left) it's grossly high.
// Make a BinaryTree for each item in CharOccurrences and add as an entry in initialQueue
for (int i = 0; i < charOccurrences.size(); i++) {
BinaryTree<CharProfile> bTree = new BinaryTree<CharProfile>();
bTree.makeRoot(charOccurrences.get(i));
initialQueue.add(bTree);
}
// Create the BinaryTree that we're adding to the resultQueue
BinaryTree<CharProfile> treeMerge = new BinaryTree<CharProfile>();
// Create the CharProfile that will hold the probability of the two merged trees
CharProfile data;
while (!initialQueue.isEmpty()) {
// Check if the resultQueue is empty, in which case we only need to look at initialQueue
if (resultQueue.isEmpty()) {
treeMerge.setLeft(initialQueue.remove());
treeMerge.setRight(initialQueue.remove());
// Set treeMerge's data to be the sum of its two child trees' probabilities with a null char value
data = new CharProfile('\0');
data.setProbability(treeMerge.getLeft().getData().getProbability() + treeMerge.getRight().getData().getProbability());
treeMerge.setData(data);
}
else {
// Set the left part of treeMerge to the lowest of the front of the two queues
if (initialQueue.peek().getData().getProbability() <= resultQueue.peek().getData().getProbability()) {
treeMerge.setLeft(initialQueue.remove());
}
else {
treeMerge.setLeft(resultQueue.remove());
}
if (!initialQueue.isEmpty()) {
// Set the right part of treeMerge to the lowest of the front of the two queues
if (initialQueue.peek().getData().getProbability() <= resultQueue.peek().getData().getProbability()) {
treeMerge.setRight(initialQueue.remove());
}
else {
treeMerge.setRight(resultQueue.remove());
}
}
// In the case that initialQueue is now empty (as a result of just dequeuing the last element), simply make the right tree resultQueue's head
else {
treeMerge.setRight(resultQueue.remove());
}
// Set treeMerge's data to be the sum of its two child trees' probabilities with a null char value
data = new CharProfile('\0');
data.setProbability(treeMerge.getLeft().getData().getProbability() + treeMerge.getRight().getData().getProbability());
treeMerge.setData(data);
}
// Add the new tree we create to the resultQueue
resultQueue.add(treeMerge);
}
if (resultQueue.size() > 1) {
while (resultQueue.size() != 1) {
treeMerge.setLeft(resultQueue.remove());
treeMerge.setRight(resultQueue.remove());
data = new CharProfile('\0');
data.setProbability(treeMerge.getLeft().getData().getProbability() + treeMerge.getRight().getData().getProbability());
treeMerge.setData(data);
resultQueue.add(treeMerge);
}
}
I then have this at the end:
System.out.println("\nProbability of end tree: "
+ resultQueue.peek().getData().getProbability());
Which gives me:
Probability of end tree: 42728.31718061674
Move the following lines inside the while loop:
// Create the BinaryTree that we're adding to the resultQueue
BinaryTree<CharProfile> treeMerge = new BinaryTree<CharProfile>();
Otherwise, one iteration adds treeMerge to resultQueue, and the next one may do treeMerge.setLeft(resultQueue.remove());, which makes treeMerge a child of itself...
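To illustrate the point in isolation, here is a small sketch that creates a fresh parent object for every merge. It deliberately swaps the question's two-queue scheme for a single `java.util.PriorityQueue`, and `MergeSketch` and `Tree` are hypothetical stand-ins for `BinaryTree<CharProfile>`. With a new node per iteration, no tree can end up as its own child, and the root's probability is simply the sum of the leaf probabilities.

```java
import java.util.PriorityQueue;

class MergeSketch {
    // Hypothetical stand-in for BinaryTree<CharProfile>: a probability and two children.
    static class Tree {
        final double probability;
        final Tree left;
        final Tree right;

        Tree(double probability, Tree left, Tree right) {
            this.probability = probability;
            this.left = left;
            this.right = right;
        }
    }

    // Repeatedly merges the two least probable trees; returns null for empty input.
    static Tree build(double[] probabilities) {
        PriorityQueue<Tree> queue =
                new PriorityQueue<>((a, b) -> Double.compare(a.probability, b.probability));
        for (double p : probabilities) {
            queue.add(new Tree(p, null, null));
        }
        while (queue.size() > 1) {
            Tree left = queue.remove();
            Tree right = queue.remove();
            // A new parent object is created inside the loop for every merge.
            queue.add(new Tree(left.probability + right.probability, left, right));
        }
        return queue.peek();
    }
}
```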

Level Order tree Traversal for a generic tree, displaying the tree level by level

I would like to display the tree structure level by level. My current code does a BFS (level order traversal), but I cannot get the output to display the tree structure like a tree.
See the current output and the expected output below.
My idea was to use some kind of count to iterate over elements from the same level in the queue.
How can I do so?
Original Code without this function can be found in the link below in case someone needs the entire implementation else just look at the displayBFS function below.
Level Order traversal of a generic tree(n-ary tree) in java
Thanks!
void displayBFS(NaryTreeNode n)
{
Queue<NaryTreeNode> q = new LinkedList<NaryTreeNode>();
System.out.println(n.data);
while(n!=null)
{
for(NaryTreeNode x:n.nary_list)
{
q.add(x);
System.out.print(x.data + " ");
}
n=q.poll();
System.out.println();
}
}
Current Tree Structure for reference:
root(100)
/ | \
90 50 70
/ \
20 30 200 300
Current Output:
100
90 50 70
20 30
200 300
Expected Output
100
90 50 70
20 30 200 300
Also, I posted a logic issue with the same function earlier. Since that was answered and the current question relates to a different problem, I posted a new question. Is this approach okay, or should I edit the earlier question instead of opening a new one?
You only need to keep track of the current level and the next level.
static void displayBFS(NaryTreeNode root) {
int curlevel = 1;
int nextlevel = 0;
LinkedList<NaryTreeNode> queue = new LinkedList<NaryTreeNode>();
queue.add(root);
while(!queue.isEmpty()) {
NaryTreeNode node = queue.remove(0);
if (curlevel == 0) {
System.out.println();
curlevel = nextlevel;
nextlevel = 0;
}
for(NaryTreeNode n : node.nary_list) {
queue.addLast(n);
nextlevel++;
}
curlevel--;
System.out.print(node.data + " ");
}
}
When you switch levels, copy nextlevel into curlevel and reset nextlevel. I prefer the simplicity of this over keeping a whole separate queue.
I had this question in a Microsoft interview last week... it didn't go so well for me over the phone. Good on you for studying it.
The simplest solution I know of to this problem is to use a sentinel. The queue is initialized with the root node followed by the sentinel, and then you loop through the queue:
- remove the front element
- if it is the sentinel:
  - we're at the end of a level, so we can end the output line
  - if the queue is not empty, push the sentinel back onto the queue at the end.
- if it is not the sentinel:
  - print it out
  - push all its children onto the queue.
I don't do Java, but I have some C++ code for depth-aware BFS, which I stripped down to do this printing task:
void show_tree_by_levels(std::ostream& os, Node* tree) {
Node* sentinel = new Node;
std::deque<Node*> queue{tree, sentinel};
while (true) {
Node* here = queue.front();
queue.pop_front();
if (here == sentinel) {
os << std::endl;
if (queue.empty())
break;
else
queue.push_back(sentinel);
} else {
for (Node* child = here->child; child; child = child->sibling)
queue.push_back(child);
os << here->value << ' ';
}
}
}
Note that I prefer to use a two-pointer solution (first_child/next_sibling), because it usually works out to be simpler than embedded lists. YMMV.
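The same sentinel idea can be sketched in Java. This is an illustration under assumptions: `SentinelBfsSketch` is a made-up class, its `NaryTreeNode` only mirrors the `data` and `nary_list` fields used in the question, and the method returns the text instead of printing it so the result is easy to check.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class SentinelBfsSketch {
    // Hypothetical node mirroring the question's fields.
    static class NaryTreeNode {
        final int data;
        final List<NaryTreeNode> nary_list = new ArrayList<>();

        NaryTreeNode(int data) {
            this.data = data;
        }
    }

    // Returns one line of text per tree level.
    static String levels(NaryTreeNode root) {
        StringBuilder out = new StringBuilder();
        if (root == null) {
            return "";
        }
        // The sentinel is only ever compared by identity; its data is never printed.
        NaryTreeNode sentinel = new NaryTreeNode(0);
        Deque<NaryTreeNode> queue = new ArrayDeque<>();
        queue.add(root);
        queue.add(sentinel);
        while (!queue.isEmpty()) {
            NaryTreeNode here = queue.remove();
            if (here == sentinel) {
                out.append('\n');            // end of the current level
                if (!queue.isEmpty()) {
                    queue.add(sentinel);     // marks the end of the next level
                }
            } else {
                out.append(here.data).append(' ');
                queue.addAll(here.nary_list);
            }
        }
        return out.toString();
    }
}
```

Calling `System.out.print(SentinelBfsSketch.levels(root))` would then print each level on its own line.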
Use another queue to indicate depth.
The below code is not tested, but it should give you the idea (the sep variable is introduced to avoid trailing white-spaces):
void displayBFS(NaryTreeNode n) {
Queue<NaryTreeNode> q = new LinkedList<NaryTreeNode>();
Queue<Integer> depth = new LinkedList<Integer>();
q.add(n);
depth.add(0);
String sep = "";
int oldDepth = 0;
while(!q.isEmpty()) {
NaryTreeNode currN = q.poll();
int currDepth = depth.poll();
if (currDepth > oldDepth) {
System.out.println();
oldDepth = currDepth;
sep = "";
}
System.out.print(sep + currN.data);
sep = " ";
for(NaryTreeNode x : currN.nary_list) {
q.add(x);
depth.add(currDepth + 1);
}
}
}
For my taste this approach is more self-explanatory compared to other ways one could do it.
I think we need three more variables: numInCurrentLevel to keep track of the number of elements in the current level, indexInCurrentLevel to count the elements visited so far in the current level, and numInNextLevel to keep track of the number of elements in the next level. The code is below:
static void displayBFS(NaryTreeNode root) {
Queue<NaryTreeNode> q = new LinkedList<NaryTreeNode>();
q.add(root);
int numInCurrentLevel = 1;
int numInNextLevel = 0;
int indexInCurrentLevel=0;
while(!q.isEmpty()) {
NaryTreeNode node = q.poll();
System.out.print(node.data + " ");
indexInCurrentLevel++;
for(NaryTreeNode n : node.nary_list) {
q.add(n);
numInNextLevel++;
}
//finish traversal in current level
if(indexInCurrentLevel==numInCurrentLevel) {
System.out.println();
numInCurrentLevel=numInNextLevel;
numInNextLevel=0;
indexInCurrentLevel=0;
}
}
}
Hope it helps, I am not so familiar with java programming.
import queue

def printLevelWiseTree(tree):
    q = queue.Queue()
    if tree is None:
        return None
    q.put(tree)
    while not q.empty():
        c = q.get()
        # print the node, then its children separated by commas
        print(c.data, end=":")
        for i in range(len(c.children)):
            if i != len(c.children) - 1:
                print(c.children[i].data, end=",")
            else:
                print(c.children[i].data, end="")
            q.put(c.children[i])
        print()

Huffman Tree Encoding

My Huffman tree which I had asked about earlier has another problem! Here is the code:
package huffman;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.PriorityQueue;
import java.util.Scanner;
public class Huffman {
public ArrayList<Frequency> fileReader(String file)
{
ArrayList<Frequency> al = new ArrayList<Frequency>();
Scanner s;
try {
s = new Scanner(new FileReader(file)).useDelimiter("");
while (s.hasNext())
{
boolean found = false;
int i = 0;
String temp = s.next();
while(!found)
{
if(al.size() == i && !found)
{
found = true;
al.add(new Frequency(temp, 1));
}
else if(temp.equals(al.get(i).getString()))
{
int tempNum = al.get(i).getFreq() + 1;
al.get(i).setFreq(tempNum);
found = true;
}
i++;
}
}
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return al;
}
public Frequency buildTree(ArrayList<Frequency> al)
{
Frequency r = al.get(1);
PriorityQueue<Frequency> pq = new PriorityQueue<Frequency>();
for(int i = 0; i < al.size(); i++)
{
pq.add(al.get(i));
}
/*while(pq.size() > 0)
{
System.out.println(pq.remove().getString());
}*/
for(int i = 0; i < al.size() - 1; i++)
{
Frequency p = pq.remove();
Frequency q = pq.remove();
int temp = p.getFreq() + q.getFreq();
r = new Frequency(null, temp);
r.left = p;
r.right = q;
pq.add(r); // put in the correct place in the priority queue
}
pq.remove(); // leave the priority queue empty
return(r); // this is the root of the tree built
}
public void inOrder(Frequency top)
{
if(top == null)
{
return;
}
else
{
inOrder(top.left);
System.out.print(top.getString() +", ");
inOrder(top.right);
return;
}
}
public void printFreq(ArrayList<Frequency> al)
{
for(int i = 0; i < al.size(); i++)
{
System.out.println(al.get(i).getString() + "; " + al.get(i).getFreq());
}
}
}
What I need to do now is create a method that searches the tree for the binary code (011001 etc.) of a specific character. What is the best way to do this? I thought maybe I would do a normal search through the tree as if it were an AVL tree, going right if the value is bigger or left if it's smaller.
But the nodes don't hold ints, doubles, etc.; they hold objects that contain a character as a string, or null to signify that the node is not a leaf but an internal node. The other option would be an in-order traversal to find the leaf I'm looking for, but then how would I determine how many times I went right or left to reach the character?
package huffman;
public class Frequency implements Comparable {
private String s;
private int n;
public Frequency left;
public Frequency right;
Frequency(String s, int n)
{
this.s = s;
this.n = n;
}
public String getString()
{
return s;
}
public int getFreq()
{
return n;
}
public void setFreq(int n)
{
this.n = n;
}
@Override
public int compareTo(Object arg0) {
Frequency other = (Frequency)arg0;
return n < other.n ? -1 : (n == other.n ? 0 : 1);
}
}
What I'm trying to do is find the binary code that leads to each character. So if I were trying to encode aabbbcccc, how would I create a string holding the binary code for a, where going left is 0 and going right is 1?
What has me confused is that you can't tell where anything is: the tree is obviously unbalanced, and there is no way to determine whether a character is to the right or left of where you are. So you have to search through the whole tree, and if you get to a leaf that isn't what you are looking for, you have to backtrack to an earlier node to reach the other leaves.
Traverse the Huffman tree nodes to build a map like {'a': "1001", 'b': "10001"} etc. You can then use this map to look up the binary code for a specific character.
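Here is a rough sketch of that traversal in Java, in case it helps. It assumes left is 0 and right is 1, as in the question. The Node class is a hypothetical stand-in for the question's Frequency class (a leaf carries a symbol, an internal node has null), and the names HuffmanCodes, buildCodes and encode are made up for this example. The "0" code for a one-symbol tree is also just an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class HuffmanCodes {
    // Hypothetical stand-in for the question's Frequency class:
    // leaves carry a symbol, internal nodes have symbol == null.
    static class Node {
        final String symbol;
        Node left, right;
        Node(String symbol) { this.symbol = symbol; }
        Node(Node left, Node right) { this.symbol = null; this.left = left; this.right = right; }
    }

    // Depth-first walk that records the path taken to every leaf.
    static Map<String, String> buildCodes(Node root) {
        Map<String, String> codes = new HashMap<>();
        if (root != null) {
            collect(root, new StringBuilder(), codes);
        }
        return codes;
    }

    private static void collect(Node node, StringBuilder path, Map<String, String> codes) {
        if (node.left == null && node.right == null) {
            // Assumption: a tree with a single symbol gets the one-bit code "0".
            codes.put(node.symbol, path.length() == 0 ? "0" : path.toString());
            return;
        }
        if (node.left != null) {
            path.append('0');                  // going left is 0
            collect(node.left, path, codes);
            path.setLength(path.length() - 1); // backtrack
        }
        if (node.right != null) {
            path.append('1');                  // going right is 1
            collect(node.right, path, codes);
            path.setLength(path.length() - 1); // backtrack
        }
    }

    // Encodes text one character at a time using the map built above.
    static String encode(String text, Map<String, String> codes) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            out.append(codes.get(String.valueOf(text.charAt(i))));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Tree for "aabbbcccc": a(2) and b(3) are merged first, then joined with c(4).
        Node root = new Node(new Node("c"), new Node(new Node("a"), new Node("b")));
        Map<String, String> codes = buildCodes(root);
        System.out.println(encode("aabbbcccc", codes)); // prints 10101111110000
    }
}
```

The backtracking happens automatically through the recursion, so there is no need to track how many times you went left or right by hand.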
If you need to do the reverse (decoding), just handle it as a state machine:
state = huffman_root
for each bit
    state = state.children[bit]
    if (state.type == 'leaf')
        output(state.data)
        state = huffman_root
Honestly, I didn't look into your code. It ought to be pretty obvious what to do with the fancy tree.
Remember, if you have 1001 as a code, you will never have a 10010 or 10011, because no code is a prefix of another. So your basic method looks like this (in pseudocode):
if (thisNode is a leaf) return thisNode.value
if (next bit of input is 0) return search(thisNode.left, remaining input)
else return search(thisNode.right, remaining input)
I didn't read your program to figure out how to integrate it, but that's a key element of Huffman coding in a nutshell.
Try something like this - you're trying to find token. So if you wanted to find the String for "10010", you'd do search(root,"10010")
String search(Frequency top, String token) {
    return search(top, token, 0);
}
// depending on your tree, you may have to switch top.left and top.right
String search(Frequency top, String token, int depth) {
    if (top == null) return "NOT FOUND"; // fell off the tree
    if (top.left == null && top.right == null) // leaf: every bit must be used up
        return depth == token.length() ? top.getString() : "NOT FOUND";
    if (depth == token.length()) return "NOT FOUND"; // bits ran out at an internal node
    if (token.charAt(depth) == '0') return search(top.left, token, depth + 1);
    else return search(top.right, token, depth + 1);
}
I considered two options when I was having a go at a Huffman encoding tree.
Option 1: use a pointer-based binary tree. I coded most of this and then felt that, to trace up the tree from the leaf to find an encoding, I needed parent pointers. Otherwise, as mentioned in this post, you have to search the tree, which does not give you the encoding straight away. The disadvantage of the pointer-based tree is that I need 3 pointers for every node in the tree, which I thought was too much. The code to follow the pointers is simple, but more complicated than in option 2.
Option 2: use an array-based tree to represent the encoding tree that you will use on the fly to encode and decode. If you want the encoding of a character, you find the character in the array. Pretty straightforward: I use a table, so I land right on the leaf. Then I trace up to the root, which is at index 1 in the array. I compute (current_index / 2) to get the parent. If the child index is parent * 2 it is a left child, otherwise (parent * 2 + 1) it is a right child.
Option 2 was pretty easy to code up, although the array can have empty slots. I thought it was better in performance than a pointer-based tree. Besides, identifying the root and leaf is now a matter of indices rather than object type. ;) This will also be very useful if you have to send your tree!?
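To make option 2 concrete, here is a minimal sketch of how the trace from leaf to root could look. It is not my original code: the class name ArrayHuffmanTree, the method codeFor and the String[] layout are assumptions made for illustration. The root sits at index 1, the children of index i are at 2*i and 2*i + 1, and a left step counts as 0.

```java
import java.util.HashMap;
import java.util.Map;

class ArrayHuffmanTree {
    // Lookup table from symbol to the array index of its leaf.
    // Leaves hold their symbol in the array; internal nodes and unused slots hold null.
    private final Map<String, Integer> leafIndex = new HashMap<>();

    ArrayHuffmanTree(String[] tree) {
        // Index 0 is unused so that parent(i) == i / 2 works for every node.
        for (int i = 1; i < tree.length; i++) {
            if (tree[i] != null) {
                leafIndex.put(tree[i], i);
            }
        }
    }

    // Walks from the leaf up to the root (index 1), collecting one bit per step.
    String codeFor(String symbol) {
        Integer index = leafIndex.get(symbol);
        if (index == null) {
            return null; // symbol is not in the tree
        }
        StringBuilder code = new StringBuilder();
        for (int i = index; i > 1; i /= 2) {
            // An even index is a left child (0), an odd index is a right child (1).
            code.append(i % 2 == 0 ? '0' : '1');
        }
        // Bits were collected leaf-to-root, so reverse them into root-to-leaf order.
        return code.reverse().toString();
    }

    public static void main(String[] args) {
        // Index 1 is the root, 2 is leaf "c", 3 is internal, 6 is leaf "a", 7 is leaf "b".
        String[] slots = {null, null, "c", null, null, null, "a", "b"};
        ArrayHuffmanTree tree = new ArrayHuffmanTree(slots);
        System.out.println(tree.codeFor("a")); // prints 10
    }
}
```

The empty slots at indices 4 and 5 are the wasted space mentioned above. A very unbalanced tree wastes more of it.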
Also, you don't call search(root, 10110) while decoding the Huffman code. You just walk the tree following the encoded bitstream, taking a left or right based on each bit, and when you reach a leaf, you output the character and start again from the root.
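The walk through the bitstream could be sketched roughly like this. The Node class is again a hypothetical stand-in for the question's Frequency class, the names HuffmanDecoder and decode are made up, and the sketch assumes 0 means left, 1 means right, and a tree with at least two symbols.

```java
class HuffmanDecoder {
    // Hypothetical stand-in for the question's Frequency class:
    // leaves carry a symbol, internal nodes have symbol == null.
    static class Node {
        final String symbol;
        Node left, right;
        Node(String symbol) { this.symbol = symbol; }
        Node(Node left, Node right) { this.symbol = null; this.left = left; this.right = right; }
    }

    // Follows one branch per bit and emits a symbol each time a leaf is reached.
    static String decode(Node root, String bits) {
        StringBuilder out = new StringBuilder();
        Node state = root;
        for (int i = 0; i < bits.length(); i++) {
            state = (bits.charAt(i) == '0') ? state.left : state.right;
            if (state == null) {
                throw new IllegalArgumentException("bit sequence leaves the tree at index " + i);
            }
            if (state.left == null && state.right == null) {
                out.append(state.symbol); // reached a leaf: output it
                state = root;             // and start over for the next code
            }
        }
        if (state != root) {
            throw new IllegalArgumentException("input ends in the middle of a code");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Tree for "aabbbcccc": c = 0, a = 10, b = 11.
        Node root = new Node(new Node("c"), new Node(new Node("a"), new Node("b")));
        System.out.println(decode(root, "10101111110000")); // prints aabbbcccc
    }
}
```

No searching or backtracking is needed here, because each bit tells you exactly which child to move to.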
Hope this was helpful.
Harisankar Krishna Swamy (example)
I guess your homework is either done or very late by now, but maybe this will help someone else.
It's actually pretty simple. You create a tree where 0 goes right and 1 goes left. Reading the stream will navigate you through the tree. When you hit a leaf, you found a letter and start over from the beginning. Like glowcoder said, you will never have a letter on a non-leaf node. The tree also covers every possible sequence of bits. So navigating in this way always works no matter the encoded input.
I had an assignment to write a Huffman encoder/decoder just like you a while ago, and I wrote a blog post with the code in Java and a longer explanation: http://www.byteauthor.com/2010/09/huffman-coding-in-java/
PS. Most of the explanation is about serializing the Huffman tree with the least possible number of bits, but the encoding/decoding algorithms are pretty straightforward in the sources.
Here's a Scala implementation: http://blog.flotsam.nl/2011/10/huffman-coding-done-in-scala.html
