BST in Java -- search Nodes by String

I am creating a binary search tree using Nodes with MovieInfo objects as keys. MovieInfo objects are objects with three fields: ID, fullName, and shortName. The binary search tree will store information on an input text file containing every movie listed on IMDB. The insert would be based on an ID randomly associated with each movie. The search features would have a user input a String and pull up the other data of the object (shortName, ID, etc).
Now -- the errors I am having are with the findExact method.
First, no matter what my input is -- I get the message "Current Node is null", which means my starting currentNode is ALWAYS null. The other issue is that I am not sure how to make sure my root node (the first currentNode searched for in the tree) is initialized properly. I have a feeling this might have something to do with my problem.
The other problem might lie in how I am inserting the nodes in the first place.
And for reference, the way I am testing this code/running in the text input file is IndexTester.java.
UPDATE: Okay. Everything works now. The only issue I am having now is that my findExact method seems to change the ID field of the MovieInfo class. So my search does not end up working. For example, if I input:
1568482 2 White Kids with Problems 2 White Kids with Problems
1568487 Disorient Disorient
1568488 DIY Playgrounds DIY Playgrounds
and search "disorient," the return is "1568488 Disorient Disorient" (with the ID for the "DIY Playgrounds" object). Additionally, since the ID is taken...DIY Playgrounds can't be searched for successfully. The insertion method works, but the findExact method is giving me issues. Any ideas as to what might be causing this change in the ID?
The MovieInfo class (separate .java file) -- can't be edited
public class MovieInfo {
public String shortName;
public String fullName;
static int ID;
public String key;
public MovieInfo(int id, String s, String f) {
ID = id;
shortName = s;
fullName = f;
}
}
BSTIndex.java -- the class in which I create the internal BST
import java.util.*;
import java.io.*;
public class BSTIndex extends MovieInfo {
public Node root;
public class Node{
public MovieInfo movie;
public Node left, right;
public Node(MovieInfo movie)
{
this.movie = movie;
//this.val = val;
//this.N = N;
}
}
public BSTIndex()
/**
* constructor to initialize the internal binary search tree.
* The data element should be an object of the type MovieInfo, described above.
*/
{
super(0, "", "");
}
public MovieInfo findExact(String key) {
return findExact(root, key);
}
private MovieInfo findExact(Node x, String key) {
if (x == null) return null;
int cmp = key.compareToIgnoreCase(x.movie.fullName);
if (cmp < 0) return findExact(x.left, key);
else if (cmp > 0) return findExact(x.right, key);
else return x.movie;
}
public void insert(MovieInfo data)
{
if (data == null) {return; }
root = insert(root, data);
}
public Node insert(Node x, MovieInfo data)
{
if (x == null) return new Node(data);
int cmp = data.ID - x.movie.ID;
if (cmp > 0 ) x.right = insert(x.right, data);
else if (cmp < 0) x.left = insert(x.left, data);
else if (cmp == 0) x.movie = data;
return x;
}
}
IndexTester.java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class IndexTester {
// Test program
public static void main( String [ ] args ) throws FileNotFoundException
{
BSTIndex t = new BSTIndex( );
Scanner scan = new Scanner(new FileInputStream(args[0]));
long start = System.currentTimeMillis();
int i=0;
while(scan.hasNext()){
String line = scan.nextLine();
String [] fields = line.split("\t");
int id = Integer.parseInt(fields[0].trim());
String shortName = fields[1].trim();
String fullName = fields[2].trim();
MovieInfo info = new MovieInfo(id, shortName, fullName);
t.insert(info);
i++;
if(i % 10000 == 0){
System.out.println("Inserted " + i + " records.");
}
}
long end = System.currentTimeMillis();
System.out.println("Index building complete. Inserted " + i +
" records.Elapsed time = " + (end - start )/1000 + " seconds.");
Scanner input = new Scanner(System.in);
System.out.println("Enter search string, end in a '*' for prefix search. q to quit");
while(input.hasNext()){
String search = input.nextLine().trim();
if(search.equals("q")) break;
if(search.indexOf('*')>0){
//call prefix search.
MovieInfo item = t.findPrefix(search);
if(item==null) System.out.println("Not Found");
else System.out.println(item.ID + " " + item.shortName + " " + item.fullName);
}
else{
//call exact search, modify to return MovieInfo
MovieInfo item = t.findExact(search);
if(item==null) System.out.println("Not Found");
else System.out.println(item.ID + " " + item.shortName + " " + item.fullName);
}
}
}
}

The class BSTIndex should not extend MovieInfo; preferably MovieInfo should extend Node.
OK, since MovieInfo cannot be modified, I would populate a MovieInfo object with the data and set it as the node value of an extended Node class.
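public class BSTNode extends Node {
    private BSTNode left, right;
    private MovieInfo val;
    // Constructor used by setLeft/setRight below.
    // Assumes Node has a no-argument constructor.
    public BSTNode(MovieInfo val){
        this.val = val;
    }
    public void setVal(MovieInfo val){
        this.val = val;
    }
    public void setLeft(MovieInfo m){
        left = new BSTNode(m);
    }
    public void setRight(MovieInfo m){
        right = new BSTNode(m);
    }
    //override some of the Node methods
}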
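Two further things may be worth checking, assuming the classes are exactly as posted. First, MovieInfo declares ID as static, so the field is shared by every instance and each constructor call overwrites it; that would explain every result printing the last ID read. Second, insert orders the tree by ID while findExact descends by fullName, and a binary search only works when both use the same ordering. The following is a minimal sketch, not the assignment's required design: Movie and NameIndex are hypothetical stand-ins, with a per-instance id and a tree ordered case-insensitively by fullName.

```java
// Hypothetical stand-in for MovieInfo with a per-instance (non-static) id.
class Movie {
    final int id;
    final String shortName;
    final String fullName;

    Movie(int id, String shortName, String fullName) {
        this.id = id;
        this.shortName = shortName;
        this.fullName = fullName;
    }
}

// Hypothetical index ordered by fullName (case-insensitive), so that
// findExact follows the same ordering that insert used.
class NameIndex {
    private static final class Node {
        Movie movie;
        Node left, right;
        Node(Movie movie) { this.movie = movie; }
    }

    private Node root;

    void insert(Movie m) {
        if (m != null) root = insert(root, m);
    }

    private Node insert(Node x, Movie m) {
        if (x == null) return new Node(m);
        int cmp = m.fullName.compareToIgnoreCase(x.movie.fullName);
        if (cmp < 0) x.left = insert(x.left, m);
        else if (cmp > 0) x.right = insert(x.right, m);
        else x.movie = m; // same name: keep the newest record
        return x;
    }

    // Iterative descent; returns null when no movie has this full name.
    Movie findExact(String name) {
        Node x = root;
        while (x != null) {
            int cmp = name.compareToIgnoreCase(x.movie.fullName);
            if (cmp == 0) return x.movie;
            x = (cmp < 0) ? x.left : x.right;
        }
        return null;
    }
}
```

If MovieInfo truly cannot be changed, the same idea applies to the key: whatever field the search compares on is the field the insert has to order by.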

Related

Implementing a Circular Single Linked List using a scanner

This is a given activity where we should create a method for inserting a node and it uses a scanner for its input. So far, I can input 3 objects from the list but here comes the problem when I try to add another one:
The list looks like 1,2,3, and when I try to add another element it becomes 1,2,4, but what I want is 1,2,3,4.
I deeply appreciate your help in advance.
Here is the main method:
import java.util.Scanner;
public class MySinglyLinkedCircularListMain {
public static void main(String[] args) throws ListOverflowException {
Scanner keyboard = new Scanner(System.in);
MySinglyLinkedCircularList<Node> singlyLinkedCircularList = new MySinglyLinkedCircularList<>();
while (true) {
System.out.println("+---------------------------------------------------+");
System.out.println("| Select the Number to be executed: |\n" +
"| 1) Insert an element |\n" +
"| 2) Delete an element from the list |\n" +
"| 3) Get an element from the list |\n" +
"| 4) Search an element in the list |\n" +
"| 5) Number of elements in the list |\n" +
"| 6) Show the elements in the list |");
System.out.print("+---------------------------------------------------+ \n");
System.out.print("Input your choice: ");
int intInput = keyboard.nextInt();
if (intInput == 1) {
singlyLinkedCircularList.insert(new Node(singlyLinkedCircularList));
} else if (intInput == 2) {
singlyLinkedCircularList.delete(new Node(singlyLinkedCircularList));
} else if (intInput == 3) {
singlyLinkedCircularList.getElement(new Node(singlyLinkedCircularList));
} else if (intInput == 4) {
singlyLinkedCircularList.search(new Node(singlyLinkedCircularList));
} else if (intInput == 5){
System.out.println("The current capacity of the single circular linked list is "
+ singlyLinkedCircularList.getSize());
} else if (intInput == 6) {
singlyLinkedCircularList.showAllElements();
System.out.println();
}
}
}
}
Here is the node class
public class Node<T> {
T data;
Node<T> next;
public Node(T data) {
this.data = data;
next = null;
}
public T getData() {
return data;
}
public void setData(T data) {
this.data = data;
}
public void setNext(Node<T> node) {
next = node;
}
public Node<T> getNext() {
return next;
}
}
Here's the class for the insert method
import java.util.NoSuchElementException;
import java.util.Scanner;
public class MySinglyLinkedCircularList<E> implements MyList<E> {
Scanner keyboard = new Scanner(System.in);
int size;
Node<E> startNode;
Node<E> endNode;
public MySinglyLinkedCircularList() {
size = 0;
startNode = endNode = null;
}
public int getSize() {
return size;
}
public void insert(E data) throws ListOverflowException {
System.out.print("Input the element you want: ");
data = (E) keyboard.next();
Node<E> newNode = new Node(data);
if (startNode == null) {
startNode = endNode = newNode;
startNode.next = startNode;
size++;
System.out.println("Element " + startNode.getData() + " has been stored in position "
+ getSize() + " and is now referenced itself");
} else {
Node<E> addNode = endNode;
for (int i = 0; i < getSize(); i++) {
addNode = addNode.getNext();
}
addNode.next = newNode;
newNode.next = startNode;
size++;
}
}
Here's the show elements method
public void showAllElements() {
Node<E> showNode = startNode;
int i = 0;
System.out.print("Here are the current elements: ");
while (i<getSize()) {
System.out.print(showNode.getData() + " ");
showNode = showNode.getNext();
i++;
}
System.out.print(showNode.getData());
}
Here's the interface
public interface MyList<E> {
public int getSize();
public void insert(E data) throws ListOverflowException;
public E getElement(E data) throws NoSuchElementException;
public boolean delete(E data); // returns false if the data is not deleted in the list
public boolean search(E data);
public void showAllElements();
}
This is problematic:
public void insert(E data) throws ListOverflowException {
System.out.print("Input the element you want: ");
data = (E) keyboard.next(); // ***** here *****
Your method accepts an E data parameter, insert(E data), and yet promptly discards the value held by the parameter in the second line of the method, data = (E) keyboard.next();, replacing it with Scanner input that shouldn't even be there. The MySinglyLinkedCircularList class should not have a Scanner object, nor should it take user input; it should handle the circular list's logic, and that's it. It shouldn't even have println statements, except perhaps temporary ones that you may use during development for debugging purposes only. Keep all UI code (user interface code -- Scanner input and println statements) together in the UI class.
Note, that I don't know if this is the cause of your full error, and to check for that will require more intense debugging, debugging that I urge you to do more of yourself, since it is both your responsibility to debug first, and it is a necessary and useful skill that only gets better with use. If you're not sure how to go about doing this, then please check out How to debug small programs. It won't solve your direct problem, but it will give you steps that you can follow that should help you solve it yourself, or even if that is not successful, then at least help you to better isolate your problem so that your question can be more focused and easier to answer.
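To illustrate the separation described above, here is a minimal sketch of a circular list whose insert takes the element as a parameter and does no I/O. CircularList and its members are hypothetical names, not the classes from the question. It also advances the tail reference on every insert; the posted insert never reassigns endNode, which may (this is an assumption, not a verified diagnosis) be related to the 1,2,4 symptom.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, trimmed-down circular list: no Scanner, no printing.
class CircularList<E> {
    private static final class Node<E> {
        final E data;
        Node<E> next;
        Node(E data) { this.data = data; }
    }

    private Node<E> head; // first element
    private Node<E> tail; // last element; tail.next always points to head
    private int size;

    // Appends data after the current tail and keeps the ring closed.
    void insert(E data) {
        Node<E> node = new Node<>(data);
        if (head == null) {
            head = tail = node;
        } else {
            tail.next = node;
            tail = node; // advance the tail to the newly added node
        }
        tail.next = head;
        size++;
    }

    int getSize() { return size; }

    // Returns the elements in order, visiting each node exactly once.
    List<E> toList() {
        List<E> out = new ArrayList<>();
        Node<E> cur = head;
        for (int i = 0; i < size; i++) {
            out.add(cur.data);
            cur = cur.next;
        }
        return out;
    }
}
```

The UI class would then read the value with its own Scanner and call insert(value), and print whatever toList() returns.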

Wrong Implementation of Binary Search Tree's search Function

I have a binary search tree and I think one of my methods is working incorrectly. My program splits the lines read from a file into words, removes the special characters, and then transfers these words to the data structure in alphabetical order. If a word has already been inserted, its frequency is increased instead. While checking the output of my program, I saw something like this.
MY OUTPUT:
Readed Line: sun-meal //After some operation it is separated into "sun" and "meal"
String inserted.
String inserted.
Readed Line: sun-oil //After some operation it is separated into "sun" and "oil"
String inserted.
String inserted. //Error is here.
TRUE OUTPUT SHOULD BE:
Readed Line: sun-meal //After some operation it is separated into "sun" and "meal"
String inserted.
String inserted.
Readed Line: sun-oil //After some operation it is separated into "sun" and "oil"
String inserted.
Repeated String. Frequency +1. //It should be like that.
I will share my source code but what I want to know is what am I doing wrong? Why is "sun" inserted 2 times?
TreeDriver Class:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class TreeDriver
{
public static void main(String [] args) throws FileNotFoundException {
Tree stTree = new Tree();
TreeNode compareNode;
Scanner scan = new Scanner(new File(args[0]));
while (scan.hasNextLine()) {
String data = scan.nextLine();
System.out.println("Readed Line: "+data);
String[] convertedData = data.replaceAll("[^a-zA-Z ]", " ").toLowerCase().split("\\s+");
int y = 0;
try {
while(convertedData[y] != null){
String st = convertedData[y];
if (st.contains(" ")) {
}
else{
compareNode = Tree.search(stTree.getRoot(), st);
if (compareNode != null) {
compareNode.upFreq();
System.out.println("\tRepeated String. Frequency +1.");
} else {
stTree.insert(st);
System.out.println("\tString inserted.");
}
y++;
}
}
}
catch(Exception ignored) {
}
}
scan.close();
}
}
TreeNode Class
public class TreeNode
{
private int freq; //frequency of the String in the Node
private String stValue;
private TreeNode left;
private TreeNode right;
public TreeNode(String st)
{
stValue = st;
left = null;
right = null;
freq = 1;
}
public void add(String st)
{
if (left == null)
{
left = new TreeNode(st);
}
else if (right == null)
{
right = new TreeNode(st);
}
else
{
if(countNodes(left) <= countNodes(right))
{
left.add(st);
}
else
{
right.add(st);
}
}
}
//Count the nodes in the binary tree to which root points.
public static int countNodes( TreeNode root ) {
if ( root == null )
// The tree is empty. It contains no nodes.
return 0;
else {
// Start by counting the root.
int count = 1;
// Add the number of nodes in the left subtree.
count += countNodes(root.getLeft());
// Add the number of nodes in the right subtree.
count += countNodes(root.getRight());
return count; // Return the total.
}
}
public TreeNode getLeft(){
return left;
}
public TreeNode getRight(){
return right;
}
public String getString()
{
return stValue;
}
public void upFreq()
{
freq = freq + 1;
}
public int getFreq()
{
return freq;
}
}
Tree Class:
public class Tree
{
private TreeNode root;
public Tree()
{
root = null;
}
public boolean isEmpty()
{
return root == null;
}
public void insert(String st)
{
if (isEmpty())
{
root = new TreeNode(st);
}
else
{
root.add(st);
}
}
public TreeNode getRoot()
{
return root;
}
public static TreeNode search(TreeNode root, String st)
{
if(root == null)
{
return null;
}
else if(st.equals(root.getString()))
{
return root;
}
else
{ if (root.getLeft() != null)
return search(root.getLeft(), st);
else
return search(root.getRight(), st);
}
}
public TreeNode found(TreeNode root)
{
return root;
}
public static void preorderPrint(TreeNode root)
{
if ( root != null )
{
System.out.print( root.getString() + " " ); // Print the root item.
preorderPrint( root.getLeft() ); // Print items in left subtree.
preorderPrint( root.getRight() ); // Print items in right subtree.
}
}
}
Can you please help me find the problem?
Indeed, your search function is wrong:
if (root.getLeft() != null)
return search(root.getLeft(), st);
else
return search(root.getRight(), st);
You descend into the right child only when the left child is null, whereas you should search both subtrees.
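A minimal sketch of a search that visits both subtrees is shown below. WordNode and WordSearch are hypothetical stand-ins for the TreeNode and Tree classes in the question. Because add() in the question places new words by subtree size rather than by alphabetical order, the search cannot pick a side by comparing strings and has to try the left subtree and then fall back to the right one.

```java
// Hypothetical minimal node, standing in for the question's TreeNode.
class WordNode {
    final String value;
    WordNode left, right;
    WordNode(String value) { this.value = value; }
}

class WordSearch {
    // Looks in the left subtree first and falls back to the right subtree
    // when the left one does not contain the word.
    static WordNode search(WordNode root, String st) {
        if (root == null) return null;
        if (st.equals(root.value)) return root;
        WordNode found = search(root.left, st);
        return (found != null) ? found : search(root.right, st);
    }
}
```

This is a linear scan of the tree. If the tree were kept in alphabetical order on insert, the search could instead compare st with the node's value and descend into one side only.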

Radix(Trie) Tree implementation for Customer search in Java

I am working on a project and need to search the data of millions of customers. I want to implement a radix (trie) search algorithm. I have read about and implemented a radix tree for a simple collection of strings, but here I have a collection of customers and want to search it by name or by mobile number.
Customer Class:
public class Customer {
String name;
String mobileNumer;
public Customer (String name, String phoneNumer) {
this.name = name;
this.mobileNumer = phoneNumer;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getPhoneNumer() {
return mobileNumer;
}
public void setPhoneNumer(String phoneNumer) {
this.mobileNumer = phoneNumer;
}
}
RadixNode Class:
import java.util.HashMap;
import java.util.Map;
class RadixNode {
private final Map<Character, RadixNode> child = new HashMap<>();
private final Map<Customer, RadixNode> mobileNum = new HashMap<>();
private boolean endOfWord;
Map<Character, RadixNode> getChild() {
return child;
}
Map<Customer, RadixNode> getChildPhoneDir() {
return mobileNum;
}
boolean isEndOfWord() {
return endOfWord;
}
void setEndOfWord(boolean endOfWord) {
this.endOfWord = endOfWord;
}
}
Radix Class:
class Radix {
private RadixNode root;
Radix() {
root = new RadixNode();
}
void insert(String word) {
RadixNode current = root;
for (int i = 0; i < word.length(); i++) {
current = current.getChild().computeIfAbsent(word.charAt(i), c -> new RadixNode());
}
current.setEndOfWord(true);
}
void insert(Customer word) {
RadixNode current = root;
System.out.println("==========================================");
System.out.println(word.mobileNumer.length());
for (int i = 0; i < word.mobileNumer.length(); i++) {
current = current.getChildPhoneDir().computeIfAbsent(word.mobileNumer.charAt(i), c -> new RadixNode());
System.out.println(current);
}
current.setEndOfWord(true);
}
boolean delete(String word) {
return delete(root, word, 0);
}
boolean containsNode(String word) {
RadixNode current = root;
for (int i = 0; i < word.length(); i++) {
char ch = word.charAt(i);
RadixNode node = current.getChild().get(ch);
if (node == null) {
return false;
}
current = node;
}
return current.isEndOfWord();
}
boolean isEmpty() {
return root == null;
}
private boolean delete(RadixNode current, String word, int index) {
if (index == word.length()) {
if (!current.isEndOfWord()) {
return false;
}
current.setEndOfWord(false);
return current.getChild().isEmpty();
}
char ch = word.charAt(index);
RadixNode node = current.getChild().get(ch);
if (node == null) {
return false;
}
boolean shouldDeleteCurrentNode = delete(node, word, index + 1) && !node.isEndOfWord();
if (shouldDeleteCurrentNode) {
current.getChild().remove(ch);
return current.getChild().isEmpty();
}
return false;
}
public void displayContactsUtil(RadixNode curNode, String prefix)
{
// Check if the string 'prefix' ends at this Node
// If yes then display the string found so far
if (curNode.isEndOfWord())
System.out.println(prefix);
// Find all the adjacent Nodes to the current
// Node and then call the function recursively
// This is similar to performing DFS on a graph
for (char i = 'a'; i <= 'z'; i++)
{
RadixNode nextNode = curNode.getChild().get(i);
if (nextNode != null)
{
displayContactsUtil(nextNode, prefix + i);
}
}
}
public boolean displayContacts(String str)
{
RadixNode prevNode = root;
// 'flag' denotes whether the string entered
// so far is present in the Contact List
String prefix = "";
int len = str.length();
// Display the contact List for string formed
// after entering every character
int i;
for (i = 0; i < len; i++)
{
// 'str' stores the string entered so far
prefix += str.charAt(i);
// Get the last character entered
char lastChar = prefix.charAt(i);
// Find the Node corresponding to the last
// character of 'str' which is pointed by
// prevNode of the Trie
RadixNode curNode = prevNode.getChild().get(lastChar);
// If nothing found, then break the loop as
// no more prefixes are going to be present.
if (curNode == null)
{
System.out.println("No Results Found for \"" + prefix + "\"");
i++;
break;
}
// If present in trie then display all
// the contacts with given prefix.
System.out.println("Suggestions based on \"" + prefix + "\" are");
displayContactsUtil(curNode, prefix);
// Change prevNode for next prefix
prevNode = curNode;
}
for ( ; i < len; i++)
{
prefix += str.charAt(i);
System.out.println("No Results Found for \"" + prefix + "\"");
}
return true;
}
public void displayContactsUtil(RadixNode curNode, String prefix, boolean isPhoneNumber)
{
// Check if the string 'prefix' ends at this Node
// If yes then display the string found so far
if (curNode.isEndOfWord())
System.out.println(prefix);
// Find all the adjacent Nodes to the current
// Node and then call the function recursively
// This is similar to performing DFS on a graph
for (char i = '0'; i <= '9'; i++)
{
RadixNode nextNode = curNode.getChildPhoneDir().get(i);
if (nextNode != null)
{
displayContactsUtil(nextNode, prefix + i);
}
}
}
public boolean displayContacts(String str, boolean isPhoneNumber)
{
RadixNode prevNode = root;
// 'flag' denotes whether the string entered
// so far is present in the Contact List
String prefix = "";
int len = str.length();
// Display the contact List for string formed
// after entering every character
int i;
for (i = 0; i < len; i++)
{
// 'str' stores the string entered so far
prefix += str.charAt(i);
// Get the last character entered
char lastChar = prefix.charAt(i);
// Find the Node corresponding to the last
// character of 'str' which is pointed by
// prevNode of the Trie
RadixNode curNode = prevNode.getChildPhoneDir().get(lastChar);
// If nothing found, then break the loop as
// no more prefixes are going to be present.
if (curNode == null)
{
System.out.println("No Results Found for \"" + prefix + "\"");
i++;
break;
}
// If present in trie then display all
// the contacts with given prefix.
System.out.println("Suggestions based on \"" + prefix + "\" are");
displayContactsUtil(curNode, prefix, isPhoneNumber);
// Change prevNode for next prefix
prevNode = curNode;
}
for ( ; i < len; i++)
{
prefix += str.charAt(i);
System.out.println("No Results Found for \"" + prefix + "\"");
}
return true;
}
}
I have tried to search in a collection but got stuck. Any help / suggestion would be appreciated.
I propose two ways of doing it.
First way: with a single trie.
It is possible to store all you need in a single trie. Your Customer class is fine, and here is a possible RadixNode implementation.
I assume that there cannot be two customers with the same name, or with the same phone number. If that is not the case (for instance, people with the same name and different phone numbers), tell me in a comment and I'll edit.
The important thing to understand is that if you want two different ways of finding a customer and you use a single trie, each customer will appear twice in your trie: once at the end of the path corresponding to the name, and once at the end of the path corresponding to the phone number.
import java.util.HashMap;
import java.util.Map;
class RadixNode {
    private Map<Character, RadixNode> children;
    private Customer customer;
    public RadixNode(){
        // Map is an interface, so a concrete HashMap is instantiated
        this.children = new HashMap<>();
        this.customer = null;
    }
    Map<Character, RadixNode> getChildren() {
        return children;
    }
    boolean hasCustomer() {
        return this.customer != null;
    }
    Customer getCustomer() {
        return customer;
    }
    void setCustomer(Customer customer) {
        this.customer = customer;
    }
}
As you can see, there is only one map storing the node's children. That is because we can treat a phone number as a string of digits, so this trie will store every customer twice: once under the name, once under the phone number.
Now let's see an insert function. Your trie will need a root node; let's call it root.
public void insert(RadixNode root, Customer customer){
    insert_with_name(root, customer, 0);
    insert_with_phone_nb(root, customer, 0);
}
public void insert_with_name(RadixNode node, Customer customer, int idx){
    if (idx == customer.getName().length()){
        // End of the name reached: attach the customer to this node
        node.setCustomer(customer);
    } else {
        Character current_char = customer.getName().charAt(idx);
        if (!node.getChildren().containsKey(current_char)){
            // No child for this character yet: create it
            RadixNode new_child = new RadixNode();
            node.getChildren().put(current_char, new_child);
        }
        insert_with_name(node.getChildren().get(current_char), customer, idx+1);
    }
}
The insert_with_phone_nb() method is similar. This will work as long as people have unique names and unique phone numbers, and nobody's name can be someone else's phone number.
As you can see, the method is recursive. I advise you to build your trie structure (and, generally, everything based on tree structures) recursively, as it makes for simpler and generally cleaner code.
The search function is almost a copy-paste of the insert function:
public Customer search_by_name(RadixNode node, String name, int idx){
    // returns null if there is no user going by that name
    if (idx == name.length()){
        return node.getCustomer();
    } else {
        Character current_char = name.charAt(idx);
        if (!node.getChildren().containsKey(current_char)){
            return null;
        } else {
            return search_by_name(node.getChildren().get(current_char), name, idx+1);
        }
    }
}
Second way: with 2 tries
The principle is the same, all you have to do is reuse the code above, but keep two distinct root nodes, each of them will build a trie (one for names, one for phone numbers).
The only difference will be the insert function (as it will call insert_with_name and insert_with_phone_nb with 2 different roots), and the search function which will have to search in the right trie as well.
public void insert(RadixNode root_name_trie, RadixNode root_phone_trie, Customer customer){
insert_with_name(root_name_trie, customer, 0);
insert_with_phone_nb(root_phone_trie, customer, 0);
}
Edit: After a comment clarifying that there might be customers with the same name, here is an alternative implementation that allows a RadixNode to hold references to several Customer objects.
Replace the Customer customer attribute in RadixNode with, for example, a Vector<Customer>. The methods will have to be modified accordingly, of course, and a search by name will then return a vector of customers (possibly empty), since the search can now lead to several results.
In your case, I'd go for a single trie containing vectors of customers. That way you get both a search by name and a search by phone (treat the number as a String), with a single data structure to maintain.
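The single-trie variant with several customers per node could look roughly like the sketch below. It is only an illustration under stated assumptions: Contact and ContactTrie are hypothetical names standing in for Customer and the trie class, an ArrayList is used in place of the Vector mentioned above, and the traversal is written iteratively for brevity rather than recursively.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Customer class.
class Contact {
    final String name;
    final String phone;
    Contact(String name, String phone) { this.name = name; this.phone = phone; }
}

// Single trie indexing each contact twice: by name and by phone number.
class ContactTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        final List<Contact> contacts = new ArrayList<>(); // all contacts whose key ends here
    }

    private final Node root = new Node();

    void insert(Contact c) {
        insertKey(c.name, c);
        insertKey(c.phone, c);
    }

    private void insertKey(String key, Contact c) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            cur = cur.children.computeIfAbsent(key.charAt(i), ch -> new Node());
        }
        cur.contacts.add(c);
    }

    // Exact-match lookup by name or phone; returns an empty list when nothing matches.
    List<Contact> search(String key) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            cur = cur.children.get(key.charAt(i));
            if (cur == null) return Collections.emptyList();
        }
        return Collections.unmodifiableList(cur.contacts);
    }
}
```

With this layout, two customers sharing a name simply end up in the same node's list, while their distinct phone numbers lead to separate nodes.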

Getting all possible paths in a tree structure

I need to traverse a tree to get all possible paths. The problem with my code is that I get only the first path!
Example:
In the figure, there are 2 paths to handle: 1-2-3-4-5-6 and 1-2-3-7-8, but I couldn't get both; I have only retrieved 1-2-3-4-5-6!
my code:
In main:
for (String key : synset.keySet()) { // looping in a hash of Concept and it's ID
System.out.println("\nConcept: " + key + " " + synset.get(key));
List<Concept> ancts = myOntology.getConceptAncestors(myOntology.getConceptFromConceptID(synset.get(key))); // this function retreives the root of any node.
for (int i = 0; i < ancts.size(); i++) {
System.out.print(ancts.get(i).getConceptId() + " # ");
System.out.print(getChilds(ancts.get(i).getConceptId()) + " -> "); // here, the recursive function is needed to navigate into childs..
}
System.out.println("");
}
Rec. function:
public static String getChilds(String conId)
{
List<Concept> childs = myOntology.getDirectChildren(myOntology.getConceptFromConceptID(conId)); // get all childs of a node
if(childs.size() > 0)
{
for (int i = 0; i < childs.size(); i++) {
System.out.print( childs.size() + " ~ " + childs.get(i).getConceptId() + " -> ");
return getChilds(childs.get(i).getConceptId());
}
}
else
return "NULL";
return "final";
}
I didn't really see enough of your code to use the classes that you have defined. So I went for writing my own working solution.
In the following code, the problem is solved using recursion:
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
public class TreeNode {
private String id;
private TreeNode parent;
private List<TreeNode> children;
public TreeNode(String id) {
this.id = id;
this.children = new LinkedList<>();
}
public void addChild(TreeNode child) {
this.children.add(child);
child.setParent(this);
}
public List<TreeNode> getChildren() {
return Collections.unmodifiableList(this.children);
}
private void setParent(TreeNode parent) {
this.parent = parent;
}
public TreeNode getParent() {
return this.parent;
}
public String getId() {
return this.id;
}
}
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
public class TreePaths {
private static List<List<TreeNode>> getPaths0(TreeNode pos) {
List<List<TreeNode>> retLists = new ArrayList<>();
if(pos.getChildren().size() == 0) {
List<TreeNode> leafList = new LinkedList<>();
leafList.add(pos);
retLists.add(leafList);
} else {
for (TreeNode node : pos.getChildren()) {
List<List<TreeNode>> nodeLists = getPaths0(node);
for (List<TreeNode> nodeList : nodeLists) {
nodeList.add(0, pos);
retLists.add(nodeList);
}
}
}
return retLists;
}
public static List<List<TreeNode>> getPaths(TreeNode head) {
if(head == null) {
return new ArrayList<>();
} else {
return getPaths0(head);
}
}
}
To use the above code, a tree must be constructed using the TreeNode class. Start off by creating a head TreeNode, then add child nodes to it as required. The head is then submitted to the TreePaths getPaths static function.
After getPaths checks for null, the internal getPaths0 function will be called. Here we follow a depth first approach by trying to get to all leaf nodes as soon as possible. Once a leaf node is found, a List only containing this leaf node will be created and returned inside the list collection. The parent of this leaf node will then be added to the beginning of the list, which will again be put into a list collection. This will happen for all children of the parent.
In the end, all possible paths will end up in a single structure. This function can be tested as follows:
import java.util.List;
import org.junit.Before;
import org.junit.Test;
public class TreePathsTest {
TreeNode[] nodes = new TreeNode[10];
@Before
public void init() {
int count = 0;
for(TreeNode child : nodes) {
nodes[count] = new TreeNode(String.valueOf(count));
count++;
}
}
/*
* 0 - 1 - 3
* - 4
* - 2 - 5
* - 6
* - 7 - 8
* - 9
*/
private void constructBasicTree() {
nodes[0].addChild(nodes[1]);
nodes[0].addChild(nodes[2]);
nodes[1].addChild(nodes[3]);
nodes[1].addChild(nodes[4]);
nodes[2].addChild(nodes[5]);
nodes[2].addChild(nodes[6]);
nodes[2].addChild(nodes[7]);
nodes[7].addChild(nodes[8]);
nodes[7].addChild(nodes[9]);
}
@Test
public void testPaths() {
constructBasicTree();
List<List<TreeNode>> lists = TreePaths.getPaths(nodes[0]);
for(List<TreeNode> list : lists) {
for(int count = 0; count < list.size(); count++) {
System.out.print(list.get(count).getId());
if(count != list.size() - 1) {
System.out.print("-");
}
}
System.out.println();
}
}
}
This will print out:
0-1-3
0-1-4
0-2-5
0-2-6
0-2-7-8
0-2-7-9
Note: The above is enough for manual testing, but the test function should be modified to do proper assertions for proper automated unit testing.
Maybe the problem is in this code segment in getChilds():
for (int i = 0; i < childs.size(); i++) {
System.out.print( childs.size() + " ~ " + childs.get(i).getConceptId() + " -> ");
return getChilds(childs.get(i).getConceptId());
}
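The for loop has no effect: the return statement exits the method during the first iteration, so it always returns getChilds(childs.get(0).getConceptId()) and the remaining children are never visited.
This is probably not what you want.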
One simple way.
All you need is a tree traversal and a little bit of custom code.
Keep a list called tempPath. You can pass it as an argument or make it a global variable.
Do a depth-first tree traversal (e.g. preorder). Whenever you reach a node, add it to the end of tempPath, and when you are done with that node, remove it from the end of tempPath.
Whenever you encounter a leaf, tempPath contains one full path from the root to that leaf. You can either print it or copy the list into another data structure.
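The tempPath idea described above can be sketched as follows. PathNode and PathCollector are hypothetical names, and each path is copied out as a dash-joined string when a leaf is reached; this is one possible reading of the approach, not the only one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node with any number of children.
class PathNode {
    final String id;
    final List<PathNode> children = new ArrayList<>();
    PathNode(String id) { this.id = id; }
    PathNode add(PathNode child) { children.add(child); return this; }
}

class PathCollector {
    // Returns every root-to-leaf path as a string such as "1-2-3-7-8".
    static List<String> allPaths(PathNode root) {
        List<String> result = new ArrayList<>();
        if (root != null) walk(root, new ArrayList<>(), result);
        return result;
    }

    private static void walk(PathNode node, List<String> tempPath, List<String> result) {
        tempPath.add(node.id);                        // entering the node
        if (node.children.isEmpty()) {
            result.add(String.join("-", tempPath));   // leaf: copy the current path
        } else {
            for (PathNode child : node.children) {
                walk(child, tempPath, result);        // visit every child, not just the first
            }
        }
        tempPath.remove(tempPath.size() - 1);         // leaving the node (backtrack)
    }
}
```

Because a single tempPath list is shared and restored on the way back up, only the finished paths are copied, which keeps the extra memory proportional to the tree depth plus the output.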

Java Trie Matching using Iterator

I have an assignment which involves creating a Trie of company names (read from a file) and then reading a news article input and counting the number of times a company name from the Trie occurs in the article.
I have coded a pretty standard Trie structure, however for the assignment it made more sense to have the TrieNodes hold the full word rather than just each character.
To make things more complicated, each company name from the file has one "primary name" and can have multiple "secondary names". For example: Microsoft Corporation, Microsoft, Xbox - where the first name is always the primary.
The assignment requires that I count all matches in the article for any of the company names, but only return the company's primary name when printing the results. Because of this, my TrieNode has the String primeName datafield, along with the standard isEnd bool. However, in my case, isEnd represents whether or not the specified node and its parent(s) form a full company name.
For example, with the article input "Microsoft Corporation just released a new Xbox console." I would need to return something along the lines of "Microsoft:2" because both Microsoft Corporation and Xbox share the same primary company name which is Microsoft.
I am using an iterator in the getHits() method but when I do find a hit, I need to look at the next word in the array to make sure it is not a continuation before I decide whether to stop or continue. The problem is that calling iter.next() doesn't just "peek" the next value but it moves forward, essentially causing me to skip words.
For example, if you look at the below code and my example, after "Best" gets a hit, it should see that "Buy" is a child and the next time it loops get a match on "Buy", but since I already call iter.next() to look at "Buy" within the While loop, the next iteration entirely skips "Buy". Is there some way I can simply peek at the next iter value within the While loop without actually moving to it? Also, any improvements to this code are greatly appreciated! I am sure there are many places where I sloppily implemented something.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.*;
public class BuildTrie {
// Class Methods
public static void main(String[] args) throws IOException {
Trie Companies = new Trie();
String filename = "companies.dat";
try {
BufferedReader reader = new BufferedReader(new FileReader(filename));
String line;
while ((line = reader.readLine()) != null) {
// Split line by tab character
String[] aliases = line.replaceAll("\\p{P}", "").split("\t");
// Loop over each "alias" of specific company
for (int n = 0; n < aliases.length; n++) {
String[] name = aliases[n].split(" ");
// Insert each alias into Trie with index 0 as primary
Companies.insert(name, aliases[0]);
}
}
reader.close();
} catch (Exception e) {
System.err.format("Exception occurred trying to read '%s'.", filename);
e.printStackTrace();
}
/*System.out.println("Article Input: ");
try (BufferedReader input = new BufferedReader(new InputStreamReader(System.in))) {
String line;
while ((line = input.readLine()) != null) {
if (".".equals(line)) break;
String[] items = line.trim().replaceAll("\\p{P}", "").split("\\s+");
for (int i = 0; i < items.length; i++) {
Companies.words.add(items[i]);
//System.out.println(items[i]);
}
}
}*/
Companies.articleAdd("The");
Companies.articleAdd("company");
Companies.articleAdd("Best");
Companies.articleAdd("Buy");
Companies.articleAdd("sell");
Companies.articleAdd("Xbox");
Companies.getHits();
}
}
// Trie node, which stores a whole word and its children in a HashMap
class TrieNode {
// Data Fields
private String word;
HashMap<String,TrieNode> children;
boolean bIsEnd;
private String primary = "";
// Constructors
public TrieNode() {
children = new HashMap<>();
bIsEnd = false;
}
public TrieNode(String st, String prime) {
word = st;
children = new HashMap<>();
bIsEnd = false;
primary = prime;
}
// Trie Node Methods
public HashMap<String,TrieNode> getChildren() {
return children;
}
public String getValue() {
return word;
}
public void setIsEnd(boolean val) {
bIsEnd = val;
}
public boolean isEnd() {
return bIsEnd;
}
public String getPrime() {
return primary;
}
}
class Trie {
private ArrayList<String> article = new ArrayList<String>();
private HashMap<String,Integer> hits = new HashMap<String,Integer>();
// Constructor
public Trie() {
root = new TrieNode();
}
// Insert article text
public void articleAdd(String word) {
article.add(word);
}
// Method to insert a new company name to Trie
public void insert(String[] names, String prime) {
// Find length of the given name
int length = names.length;
//TrieNode currNode = root;
HashMap<String,TrieNode> children = root.children;
// Traverse through all words of given name
for( int i=0; i<length; i++)
{
String name = names[i];
System.out.println("Iter: " + name);
TrieNode t;
// If there is already a child for current word of given name
if( children.containsKey(name))
t = children.get(name);
else // Else create a child
{
System.out.println("Inserting node " + name + " prime is " + prime);
t = new TrieNode(name, prime);
children.put( name, t );
}
children = t.getChildren();
int j = names.length-1;
if(i==j){
t.setIsEnd(true);
System.out.println("WordEnd");
}
}
}
public void getHits() {
// String[] articleArr = article.toArray(new String[0]);
// Initialize reference to traverse through Trie
// TrieNode crawl = root;
// int level, prevMatch = 0;
Iterator<String> iter = article.iterator();
TrieNode currNode = root;
while (iter.hasNext()) {
String word = iter.next();
System.out.println("Iter: " + word);
// HashMap of current node's children
HashMap<String,TrieNode> child = currNode.getChildren();
// If hit in currNode's children
if (child.containsKey(word)) {
System.out.println("Node exists: " + word);
// Update currNode to be node that matched
currNode = child.get(word);
System.out.println(currNode.isEnd());
String next = "";
// If currNode is leaf and next node has no match in children, we're done
if (iter.hasNext()) {next = iter.next();}
if (currNode.isEnd() && !child.containsKey(next)) {
System.out.println("Matched word: " + word);
System.out.println("Primary: " + currNode.getPrime());
currNode = root;
} else {
// Else next node is continuation
}
} else {
// Else ignore next word and reset
currNode = root;
}
}
}
private TrieNode root;
}
I think instead of using while and iter.next() you can use a for loop, as below:
for (String word : article) {
// process word here
}
That way you never call iter.next() yourself, so you are not explicitly moving to the next item of your list.
If this is not what you meant, please clarify.
Thanks,
Nghia
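On the question of peeking without advancing: java.util.Iterator has no peek method, so one option is a small wrapper that buffers a single element. The sketch below is a minimal version under that assumption; PeekableIterator and PeekDemo are hypothetical names, not JDK classes. (Guava ships a comparable PeekingIterator via Iterators.peekingIterator, and with a List you can also use ListIterator.previous() or plain indexing.)

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of the JDK: buffers one element so it can be
// inspected before it is consumed.
class PeekableIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private T buffered;
    private boolean hasBuffered;

    PeekableIterator(Iterator<T> inner) {
        this.inner = inner;
    }

    @Override
    public boolean hasNext() {
        return hasBuffered || inner.hasNext();
    }

    @Override
    public T next() {
        if (hasBuffered) {
            // Hand out the element that peek() already pulled from the inner iterator.
            hasBuffered = false;
            T value = buffered;
            buffered = null;
            return value;
        }
        return inner.next();
    }

    // Returns the next element without consuming it.
    public T peek() {
        if (!hasBuffered) {
            if (!inner.hasNext()) {
                throw new NoSuchElementException();
            }
            buffered = inner.next();
            hasBuffered = true;
        }
        return buffered;
    }
}

class PeekDemo {
    public static void main(String[] args) {
        List<String> article = Arrays.asList("The", "company", "Best", "Buy");
        PeekableIterator<String> iter = new PeekableIterator<>(article.iterator());
        while (iter.hasNext()) {
            String word = iter.next();
            // Look ahead without skipping: "Buy" is still returned by the next call to next().
            String next = iter.hasNext() ? iter.peek() : "";
            System.out.println(word + " (next: " + next + ")");
        }
    }
}
```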
I opted for an index-based for loop instead of the while loop, and tweaked some of the logic to get it working. For those interested, the new code is below, along with an example of the "companies.dat" file (which is what populates the Trie). The stdin is any text excerpt that ends with a "." on its own line.
Companies.dat (the aliases on each line are separated by tab characters, which show up as spaces here):
Microsoft Corporation Microsoft Xbox
Apple Computer Apple Mac
Best Buy
Dell
BuildTrie:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.*;
public class BuildTrie {
// Class Methods
public static void main(String[] args) throws IOException {
Trie Companies = new Trie();
String filename = "companies.dat";
try {
BufferedReader reader = new BufferedReader(new FileReader(filename));
String line;
while ((line = reader.readLine()) != null) {
// Split line by tab character
String[] aliases = line.replaceAll("\\p{P}", "").split("\t");
// Loop over each "alias" of specific company
for (int n = 0; n < aliases.length; n++) {
String[] name = aliases[n].split(" ");
// Insert each alias into Trie with index 0 as primary
Companies.insert(name, aliases[0]);
}
}
reader.close();
} catch (Exception e) {
System.err.format("Exception occurred trying to read '%s'.", filename);
e.printStackTrace();
}
System.out.println("Article Input: ");
try (BufferedReader input = new BufferedReader(new InputStreamReader(System.in))) {
String line;
while ((line = input.readLine()) != null) {
if (".".equals(line)) break;
String[] items = line.trim().replaceAll("\\p{P}", "").split("\\s+");
for (int i = 0; i < items.length; i++) {
Companies.articleAdd(items[i]);
}
}
}
Companies.getHits();
}
}
// Trie node, which stores a whole word and its children in a HashMap
class TrieNode {
// Data Fields
private String word;
HashMap<String,TrieNode> children;
boolean bIsEnd;
private String primary = "";
// Constructors
public TrieNode() {
children = new HashMap<>();
bIsEnd = false;
}
public TrieNode(String st, String prime) {
word = st;
children = new HashMap<>();
bIsEnd = false;
primary = prime;
}
// Trie Node Methods
public HashMap<String,TrieNode> getChildren() {
return children;
}
public String getValue() {
return word;
}
public void setIsEnd(boolean val) {
bIsEnd = val;
}
public boolean isEnd() {
return bIsEnd;
}
public String getPrime() {
return primary;
}
}
class Trie {
private ArrayList<String> article = new ArrayList<String>();
private HashMap<String,Integer> hits = new HashMap<String,Integer>();
// Constructor
public Trie() {
root = new TrieNode();
}
// Insert article text
public void articleAdd(String word) {
article.add(word);
}
// Method to insert a new company name to Trie
public void insert(String[] names, String prime) {
// Find length of the given name
int length = names.length;
HashMap<String,TrieNode> children = root.children;
// Traverse through all words of given name
for( int i=0; i<length; i++)
{
String name = names[i];
TrieNode t;
// If there is already a child for current word of given name
if( children.containsKey(name))
t = children.get(name);
else // Else create a child
{
t = new TrieNode(name, prime);
children.put( name, t );
}
children = t.getChildren();
int j = names.length-1;
if(i==j){
t.setIsEnd(true);
}
}
}
public void getHits() {
// Initialize reference to traverse through Trie
TrieNode currNode = root;
for (int i=0; i < article.size(); i++) {
String word = article.get(i);
System.out.println("Searching: " + word);
// HashMap of current node's children
HashMap<String, TrieNode> child = currNode.getChildren();
// If hit in currNode's children
if (child.containsKey(word)) {
System.out.println("Node exists: " + word);
// Update currNode to be node that matched
currNode = child.get(word);
child = currNode.getChildren();
System.out.println("isEnd?: " + currNode.isEnd());
String next = "";
if (i+1 < article.size()) {
next = article.get(i+1);
}
// If currNode ends a company name and the next word is not a continuation, we're done
if (currNode.isEnd() && !child.containsKey(next)) {
System.out.println("Primary of match: " + currNode.getPrime());
currNode = root;
}
} else {
// Else ignore next word and reset
System.out.println("No match.");
currNode = root;
}
}
}
private TrieNode root;
}
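Two things are left open in the code above: the hits map is declared but never filled, and when a partial match fails (for example "Best" followed by "Xbox") the loop resets to root without re-testing the current word, so "Xbox" is skipped. A possible way to handle both, sketched under my own assumptions, is to restart a longest-match scan at each position and count by primary name. CountingTrie, Node and countHits below are hypothetical names, not the classes from the post.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified trie for illustration; it is not the Trie class above.
class CountingTrie {
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        String primary; // non-null only if a company name ends at this node
    }

    private final Node root = new Node();

    // Inserts one alias (already split into words) under its primary name.
    void insert(String[] words, String primary) {
        Node node = root;
        for (String w : words) {
            node = node.children.computeIfAbsent(w, k -> new Node());
        }
        node.primary = primary;
    }

    // Counts matches per primary name, taking the longest alias at each position.
    Map<String, Integer> countHits(List<String> article) {
        Map<String, Integer> hits = new LinkedHashMap<>();
        int i = 0;
        while (i < article.size()) {
            Node node = root;
            String matched = null;
            int matchedEnd = i;
            for (int j = i; j < article.size(); j++) {
                node = node.children.get(article.get(j));
                if (node == null) {
                    break;                      // no continuation in the trie
                }
                if (node.primary != null) {     // remember the longest full name so far
                    matched = node.primary;
                    matchedEnd = j + 1;
                }
            }
            if (matched != null) {
                hits.merge(matched, 1, Integer::sum);
                i = matchedEnd;                 // continue after the matched words
            } else {
                i++;                            // retry from the very next word
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        CountingTrie trie = new CountingTrie();
        trie.insert(new String[] {"Microsoft", "Corporation"}, "Microsoft Corporation");
        trie.insert(new String[] {"Microsoft"}, "Microsoft Corporation");
        trie.insert(new String[] {"Xbox"}, "Microsoft Corporation");
        trie.insert(new String[] {"Best", "Buy"}, "Best Buy");
        List<String> article = Arrays.asList("Microsoft", "Corporation", "just", "released",
                "a", "new", "Xbox", "console", "Best", "Buy");
        System.out.println(trie.countHits(article)); // prints {Microsoft Corporation=2, Best Buy=1}
    }
}
```

Because the scan restarts at i + 1 after a failed partial match, a word that could not extend one name still gets a chance to start another.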
