How to implement LinkedList Bag ADT to create a spellchecker

Fully develop the classes for the Linked Implementation of the ADT Bag (i.e., LinkedBag, Node).
Test your classes well (call all methods) before you proceed.
Spell checker
Write a spell checker that:
a. Reads in words from an external file.
b. Tests each word against a dictionary (an instance of your LinkedBag class) of correctly spelled words (if the word is found in the dictionary then it’s spelled correctly).
I have already developed the classes for the linked implementation of the ADT Bag and have also written spellchecker.java.
I need help understanding how to use my LinkedBag class and my words.txt file for this problem.
Dictionary.java
package Bags;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public class Dictionary {
private final int M = 1319; // prime number of buckets
final private Bucket[] array;
public Dictionary() {
array = new Bucket[M];
for (int i = 0; i < M; i++) {
array[i] = new Bucket();
}
}
private int hash(String key) {
return (key.hashCode() & 0x7fffffff) % M;
}
// call hash() to decide which bucket to put it in, do it.
public void add(String key) {
array[hash(key)].put(key);
}
// call hash() to find what bucket it's in, get it from that bucket.
public boolean contains(String input) {
input = input.toLowerCase();
return array[hash(input)].get(input);
}
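public void build(String filePath) {
// try-with-resources closes the reader even if reading fails
try (BufferedReader reader = new BufferedReader(new FileReader(filePath))) {
String line;
while ((line = reader.readLine()) != null) {
// contains() lower-cases its input, so store entries lower-cased too
add(line.trim().toLowerCase());
}
} catch (IOException ioe) {
ioe.printStackTrace();
}
}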
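// Returns num entries picked from random buckets (assumes the dictionary is not empty).
public String[] getRandomEntries(int num) {
String[] toRet = new String[num];
for (int i = 0; i < num; i++) {
// pick a random bucket, then walk a random number of nodes into it;
// the parentheses matter: (int) Math.random() on its own is always 0
Bucket.Node n = array[(int) (Math.random() * M)].first;
if (n == null) { // empty bucket: retry this slot
i--;
continue;
}
int rand = (int) (Math.random() * Math.sqrt(num));
for (int j = 0; j < rand && n.next != null; j++)
n = n.next;
toRet[i] = n.word;
}
return toRet;
}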
class Bucket {
private Node first;
public boolean get(String in) { // returns true if the key exists
Node next = first;
while (next != null) {
if (next.word.equals(in)) {
return true;
}
next = next.next;
}
return false;
}
public void put(String key) {
for (Node curr = first; curr != null; curr = curr.next) {
if (key.equals(curr.word)) {
return; // search hit: return
}
}
first = new Node(key, first); // search miss: add new node
}
class Node {
String word;
Node next;
public Node(String key, Node next) {
this.word = key;
this.next = next;
}
}
} // end of Bucket
} // end of Dictionary
My problem is that I am calling hash() to decide which bucket each word goes in and doing the whole lookup through the hash table. Instead, I want to use an instance of my LinkedBag class to hold the words from my words.txt file and report whether each input word matches an entry or not.
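A minimal sketch of the bag-based approach, under stated assumptions: the asker's LinkedBag is not shown, so the LinkedBag below is a hypothetical stand-in with only add() and contains(), and the dictionary is assumed to hold one word per line. SpellChecker and its method names are also made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the asker's LinkedBag; only add() and contains() are assumed.
class LinkedBag<T> {
    private static class Node<T> {
        final T data;
        final Node<T> next;
        Node(T data, Node<T> next) { this.data = data; this.next = next; }
    }

    private Node<T> first;

    // Adds a new entry at the front of the chain.
    public void add(T entry) {
        first = new Node<>(entry, first);
    }

    // Walks the chain and reports whether any node holds an equal entry.
    public boolean contains(T entry) {
        for (Node<T> n = first; n != null; n = n.next) {
            if (n.data.equals(entry)) return true;
        }
        return false;
    }
}

class SpellChecker {
    // Loads one word per line into the bag, lower-cased so lookups ignore case.
    static LinkedBag<String> loadDictionary(Reader source) {
        LinkedBag<String> dictionary = new LinkedBag<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim().toLowerCase();
                if (!word.isEmpty()) dictionary.add(word);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return dictionary;
    }

    // A word counts as correctly spelled when the bag contains it.
    static boolean isSpelledCorrectly(LinkedBag<String> dictionary, String word) {
        return dictionary.contains(word.toLowerCase());
    }

    public static void main(String[] args) {
        // A StringReader stands in for the words.txt file so the sketch runs as-is.
        LinkedBag<String> dictionary = loadDictionary(new StringReader("apple\nbanana\ncherry\n"));
        for (String word : new String[] {"Apple", "bananna"}) {
            boolean ok = isSpelledCorrectly(dictionary, word);
            System.out.println(word + (ok ? " is spelled correctly" : " is misspelled"));
        }
    }
}
```

To read the real dictionary, the StringReader would be replaced with a FileReader on words.txt, which additionally requires handling FileNotFoundException. Each lookup walks the whole chain, so this is slower than the hash table above, but it matches the assignment's requirement of testing words against a LinkedBag.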

Related

How to speed up Depth First Search method?

I'm trying to do a Depth First Search of my graph, and something is slowing it down quite a lot and I'm not sure what.
Here is my Bag code:
import java.util.Iterator;
import java.util.NoSuchElementException;
public class Bag<Item> implements Iterable<Item> {
private Node<Item> first; // beginning of bag
private Node<Item> end;
private int n; // number of elements in bag
public int label;
public int edges;
public static class Node<Item> {
private Item item;
private Node<Item> next;
public int label;
public int edges;
}
public Bag() {
first = null; // empty bag initialized
end = null;
n = 0;
}
public void add(Item item) {
if (n==0) {
Node<Item> head = new Node<Item>(); // if bag is empty
first = head;
end = head;
head.item = item; // new node both first and end of bag
edges++;
n++;
}
else {
Node<Item> oldlast = end; // old last assigned to end of node
Node<Item> last = new Node<Item>();
last.item = item;
oldlast.next = last; // new node added after old last
end = last;
n++; // size increased
edges++;
}
}
public int size() {
return n; // number of items in the bag; Graph.depthFirstSearch calls this
}
public Iterator<Item> iterator() {
return new LinkedIterator(first); // returns an iterator over the items in this bag
}
public class LinkedIterator implements Iterator<Item> {
private Node<Item> current;
public LinkedIterator(Node<Item> first) {
current = first; // iterator starts at head of bag
}
public boolean hasNext() { return current != null; }
public void remove() { throw new UnsupportedOperationException(); }
public Item next() {
if (!hasNext()) throw new NoSuchElementException(); // if there is next item, current is moved to next
Item item = current.item;
current = current.next;
return item; // item is returned
}
}
}
Here is my driver:
import java.util.ArrayList;
import java.util.Random;
public class Driver {
public static ArrayList<Integer> randomNum(int howMany) {
ArrayList<Integer> numbers = new ArrayList<Integer>(howMany);
Random randomGenerator = new Random();
while (numbers.size() < howMany) {
int rand_int = randomGenerator.nextInt(10000);
if (!numbers.contains(rand_int)) {
numbers.add(rand_int);
}
}
return numbers;
}
public static void main(String[] args) {
ArrayList<Integer> num = randomNum(100);
Graph G = new Graph(num);
System.out.println("The length of longest path for this sequence with graph is: " + G.dfsStart(num));
}
}
From the driver I send an ArrayList of random integers to my dfsStart method, which looks at all the different paths from each starting node in my graph. My depthFirstSearch method calls getAdjList for each starting node to find its neighbors using my Bag adj, and then works its way down each path before backtracking.
Here is my Graph code, containing my longest path method:
import java.util.ArrayList;
import java.util.NoSuchElementException;
public class Graph {
public final int V; // initializing variables and data structures
public Bag<Integer>[] adj;
public int longestPath;
public Graph(ArrayList<Integer> numbers) {
try {
longestPath = 0;
this.V = numbers.size();
adj = (Bag<Integer>[]) new Bag[V]; // bag initialized
for (int v = 0; v < V; v++) {
adj[v] = new Bag<Integer>();
}
for (int i = 0; i < V; i++) {
adj[i].label = numbers.get(i);
int j = (i + 1);
while (j < numbers.size()) {
if (numbers.get(i) < numbers.get(j)) {
addEdge(i, numbers.get(j));
}
j++;
}
}
}
catch (NoSuchElementException e) {
throw new IllegalArgumentException("invalid input format in Graph constructor", e);
}
}
public void addEdge(int index, int num) {
adj[index].add(num);
}
public int getIndex(int num) {
for (int i = 0; i < adj.length; i++) {
if (adj[i].label == num) {
return i;
}
}
return -1;
}
public Bag<Integer> getAdjList(int source) {
Bag<Integer> adjList = null;
for (Bag<Integer> list : adj) {
if (list.label == source) {
adjList = list;
break;
}
}
return adjList;
}
public int dfsStart(ArrayList<Integer> numbers) {
for (int i=0;i<numbers.size();i++) {
// Print all paths from current node
depthFirstSearch(numbers.get(i),new ArrayList<>(300));
}
return longestPath;
}
public void depthFirstSearch(int src, ArrayList<Integer> current) {
current.add(src);
Bag<Integer> srcAdj = getAdjList(src);
if (srcAdj.size() == 0) {
// Leaf node
// Print this path
longestPath = Math.max(longestPath, current.size());
}
for (int links : srcAdj) {
depthFirstSearch(links, current);
}
current.remove(current.size()-1);
}
}
I believe the suggestion below helped get rid of the error, but it is still unbelievably slow when trying to find the longest path in a graph of more than 150 vertices.

Even for a small dense graph there can be many unique paths from a source node. For the input [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25] there are 16777216 unique paths across all starting nodes, so you can expect an OutOfMemoryError for bigger inputs. One way around this is to update longestPath as soon as a path is found instead of adding the path to a list.
Change this:
addtoCount(current.size());
to
longestPath = Math.max(longestPath, current.size());
Make sure longestPath is global and initialized to 0 before every test case.
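Even with that change, the search still walks the same sub-paths exponentially many times. As a sketch of one further option that is not part of the answer above: edges only run from a number to a later, larger number, so the graph has no cycles and the longest path starting at each index can be cached. The class and method names here are hypothetical, and the edge rule is copied from the Graph constructor in the question.

```java
import java.util.Arrays;
import java.util.List;

class LongestPathMemo {
    // Returns the number of vertices on the longest path, where an edge i -> j
    // exists when i < j and numbers.get(i) < numbers.get(j).
    static int longestPath(List<Integer> numbers) {
        int[] best = new int[numbers.size()]; // 0 means "not computed yet"
        int longest = 0;
        for (int i = 0; i < numbers.size(); i++) {
            longest = Math.max(longest, dfs(i, numbers, best));
        }
        return longest;
    }

    // Longest path that starts at index i; each index is fully explored only once.
    private static int dfs(int i, List<Integer> numbers, int[] best) {
        if (best[i] != 0) return best[i];
        int longest = 1; // the path consisting of just this vertex
        for (int j = i + 1; j < numbers.size(); j++) {
            if (numbers.get(i) < numbers.get(j)) {
                longest = Math.max(longest, 1 + dfs(j, numbers, best));
            }
        }
        best[i] = longest;
        return longest;
    }

    public static void main(String[] args) {
        System.out.println(longestPath(Arrays.asList(3, 1, 4, 1, 5, 9, 2, 6))); // prints 4
    }
}
```

With the cache, the work is quadratic in the number of vertices instead of growing with the number of distinct paths.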

Well, I do not know Java, but that is an awful lot of code for something as simple as a depth-first search.
In C++ it is done like this:
void cPathFinder::depthFirst(
int v)
{
// initialize visited flag for each node in graph
myPath.clear();
myPath.resize(nodeCount(), 0);
// start recursive search from starting node
depthRecurse(v);
}
void cPathFinder::depthRecurse(
int v )
{
// remember this node has been visited
myPath[v] = 1;
// look for new adjacent nodes
for (int w : adjacent(v))
if (!myPath[w])
{
// search from new node
depthRecurse(w);
}
}

Did I write the copy constructor for my Linked List program correctly?

I am working on a project for my Data Structures class that asks me to write a class to implement a linked list of ints.
Use an inner class for the Node.
Include the methods below.
Write a tester to enable you to test all of the methods with whatever data you want in any order.
I have to create three different constructors, one of which is a copy constructor. My code so far is below, but I'm not sure I wrote this constructor correctly. I have also written addToFront, one of the many methods I need to implement in this project. Can someone tell me what I need to write for the copy constructor? I've tried looking it up, but the examples I found don't match what I'm trying to write.
public class LinkedListOfInts {
Node head;
private class Node {
int value;
Node nextNode;
public Node(int value, Node nextNode) {
this.value = value;
this.nextNode = nextNode;
}
}
public LinkedListOfInts() {
}
public LinkedListOfInts(LinkedListOfInts other) {
}
public void addToFront(int x) {
head = new Node(x, head);
}
public String toString() {
String result = " ";
for (Node ptr = head; ptr != null; ptr = ptr.nextNode)
result += ptr.value + " ";
return result;
}
public static void main(String[] args) {
LinkedListOfInts list = new LinkedListOfInts();
for (int i = 0; i < 15; i++)
list.addToFront(i);
System.out.println(list);
}
}

You can iterate over the nodes of the other list and sequentially create new tail nodes based on their values.
public LinkedListOfInts(LinkedListOfInts other) {
Node tail = null;
for(Node n = other.head; n != null; n = n.nextNode){
if(tail == null) this.head = tail = new Node(n.value, null);
else {
tail.nextNode = new Node(n.value, null);
tail = tail.nextNode;
}
}
}
// ...
public static void main(String[] args) {
LinkedListOfInts list = new LinkedListOfInts();
for (int i = 0; i < 15; i++)
list.addToFront(i);
LinkedListOfInts copy = new LinkedListOfInts(list);
System.out.println(list);
System.out.println(copy);
}

Searching and deleting values in a circular linked list

I am trying to make an application that will loop through a circular linked list. As it does so, it will use another linked list of index values, and it will use these values to delete from the circular linked list.
I have it set up now where it should fetch the index value to be deleted from my random linked list via runRandomList() method. It then uses the rotate() method to loop through the circular linked list and deletes the value from it. It will then add the deleted value to "deletedLinked list". Then, control should return back to runRandomList() method and it should feed the rotate() method the next value from the random linked list. The circular linked list should begin traversing where it left off. It should keep track of the count and node it is on. The count should reset to 0 when it reaches the first node, so it can properly keep track of which index it is on.
Unfortunately, this is not happening. I have been trying different things for the last few days; as the code stands right now, it enters an infinite loop. The issue appears to be in the rotate method.
This is the rotate method code. My thought was the counter would advance until it matches the index input. If it reaches the first node, the counter would reset to 0 and then start to increment again until it reaches the index value.
private void rotate(int x)
{
while(counter <= x)
{
if(p == names.first)
{
counter = 0;
}
p = p.next;
counter++;
}
deleteList.add((String) p.value);
names.remove(x);
}
This is my linked list class:
public class List<T>{
/*
helper class, creates nodes
*/
public class Node {
T value;
Node next;
/*
Inner class constructors
*/
public Node(T value, Node next)
{
this.value = value;
this.next = next;
}
private Node(T value)
{
this.value = value;
}
}
/*
Outer class constructor
*/
Node first;
Node last;
public int size()
{
return size(first);
}
private int size(Node list)
{
if(list == null)
return 0;
else if(list == last)
return 1;
else
{
int size = size(list.next) + 1;
return size;
}
}
public void add(T value)
{
first = add(value, first);
}
private Node add(T value, Node list)
{
if(list == null)
{
last = new Node(value);
return last;
}
else
list.next = add(value, list.next);
return list;
}
public void setCircularList()
{
last.next = first;
}
public void show()
{
Node e = first;
while (e != null)
{
System.out.println(e.value);
e = e.next;
}
}
@Override
public String toString()
{
StringBuilder strBuilder = new StringBuilder();
// Use p to walk down the linked list
Node p = first;
while (p != null)
{
strBuilder.append(p.value + "\n");
p = p.next;
}
return strBuilder.toString();
}
public boolean isEmpty()
{
boolean result = isEmpty(first);
return result;
}
private boolean isEmpty(Node first)
{
return first == null;
}
public class RemovalResult
{
Node node; // The node removed from the list
Node list; // The list remaining after the removal
RemovalResult(Node remNode, Node remList)
{
node = remNode;
list = remList;
}
}
/**
The remove method removes the element at an index.
@param index The index of the element to remove.
@return The element removed.
@exception IndexOutOfBoundsException When index is
out of bounds.
*/
public T remove(int index)
{
// Pass the job on to the recursive version
RemovalResult remRes = remove(index, first);
T element = remRes.node.value; // Element to return
first = remRes.list; // Remaining list
return element;
}
/**
The private remove method recursively removes
the node at the given index from a list.
@param index The position of the node to remove.
@param list The list from which to remove a node.
@return The result of removing the node from the list.
@exception IndexOutOfBoundsException When index is
out of bounds.
*/
private RemovalResult remove(int index, Node list)
{
if (index < 0 || index >= size())
{
String message = String.valueOf(index);
throw new IndexOutOfBoundsException(message);
}
if (index == 0)
{
// Remove the first node on list
RemovalResult remRes;
remRes = new RemovalResult(list, list.next);
list.next = null;
return remRes;
}
// Recursively remove the element at index-1 in the tail
RemovalResult remRes;
remRes = remove(index-1, list.next);
// Replace the tail with the results and return
// after modifying the list part of RemovalResult
list.next = remRes.list;
remRes.list = list;
return remRes;
}
}
This contains the main(), runRandomList(), and rotate() methods.
public class lottery {
private int suitors;
private List<String> names;
private List<Integer> random;
private List<String> deleteList = new List<>();
private int counter;
private Node p;
public lottery(int suitors, List<String> names, List<Integer> random)
{
this.suitors = suitors;
this.names = names;
this.random = random;
p = names.first;
}
public void start()
{
//Set names list to circular
names.setCircularList();
runRandomList(random);
}
public void runRandomList(List<Integer> random)
{
Node i = random.first;
while(i != null)
{
rotate((int) i.value, counter, p);
i = i.next;
}
}
public List getDeleteList()
{
return deleteList;
}
private void rotate(int x, int count, Node p)
{
Node i = p;
while(count <= x)
{
if(i == names.first)
{
count = 0;
}
i = i.next;
count++;
}
deleteList.add((String) i.value);
names.remove(x);
p = i;
counter = count;
}
public static void main(String[] args)
{
List<String> namesList = new List<>();
namesList.add("a");
namesList.add("b");
namesList.add("c");
namesList.add("d");
namesList.add("e");
namesList.add("f");
List<Integer> randomList = new List<>();
randomList.add(3);
randomList.add(1);
randomList.add(5);
randomList.add(4);
randomList.add(0);
lottery obj = new lottery(6, namesList, randomList);
obj.start();
System.out.println(obj.getDeleteList());
}
}

As I suspected, the problem was in the rotate method. This is the solution:
private void rotate(int x, int count)
{
while(count != x)
{
p = p.next;
count++;
if(count == x)
{
deleteList.add((String)p.value);
counter = x;
}
if(count >= suitors)
{
for (int j = 0; j < x ; j++)
{
p = p.next;
}
deleteList.add((String)p.value);
counter = x;
count = x;
}
}
}
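For comparison, here is a minimal sketch of one reading of the requirement: keep a cursor in the ring, advance it a given number of steps, remove the node it lands on, and continue from there on the next call. The Ring class and removeAfter method are hypothetical and do not reproduce the counter-reset behaviour of the code above.

```java
import java.util.ArrayList;
import java.util.List;

class Ring {
    private static class Node {
        final String value;
        Node next;
        Node(String value) { this.value = value; }
    }

    private Node beforeCursor; // node just before the cursor; null when the ring is empty
    private int size;

    // Inserts a value just before the cursor; on a fresh ring this keeps insertion order.
    void add(String value) {
        Node node = new Node(value);
        if (beforeCursor == null) {
            node.next = node; // a single node points at itself
        } else {
            node.next = beforeCursor.next;
            beforeCursor.next = node;
        }
        beforeCursor = node;
        size++;
    }

    // Moves the cursor forward by `steps` nodes, removes the node it lands on,
    // and leaves the cursor on the node that followed the removed one.
    String removeAfter(int steps) {
        if (size == 0) throw new IllegalStateException("ring is empty");
        for (int i = 0; i < steps % size; i++) {
            beforeCursor = beforeCursor.next;
        }
        Node removed = beforeCursor.next;
        if (size == 1) {
            beforeCursor = null;
        } else {
            beforeCursor.next = removed.next; // unlink without breaking the circle
        }
        size--;
        return removed.value;
    }

    public static void main(String[] args) {
        Ring ring = new Ring();
        for (String name : new String[] {"a", "b", "c", "d", "e", "f"}) ring.add(name);
        List<String> deleted = new ArrayList<>();
        for (int steps : new int[] {3, 1, 5, 4, 0}) deleted.add(ring.removeAfter(steps));
        System.out.println(deleted); // prints [d, f, b, e, a]
    }
}
```

Tracking the node before the cursor is what makes removal a constant-time unlink; there is no need to call an index-based remove that walks the list from the start.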

Java Trie Matching using Iterator

I have an assignment which involves creating a Trie of company names (read from a file) and then reading a news article input and counting the number of times a company name from the Trie occurs in the article.
I have coded a pretty standard Trie structure, however for the assignment it made more sense to have the TrieNodes hold the full word rather than just each character.
To make things more complicated, each company name from the file has one "primary name" and can have multiple "secondary names". For example: Microsoft Corporation, Microsoft, Xbox - where the first name is always the primary.
The assignment requires that I count all matches in the article for any of the company names, but only return the company's primary name when printing the results. Because of this, my TrieNode has the String primeName datafield, along with the standard isEnd bool. However, in my case, isEnd represents whether or not the specified node and its parent(s) form a full company name.
For example, with the article input "Microsoft Corporation just released a new Xbox console." I would need to return something along the lines of "Microsoft:2" because both Microsoft Corporation and Xbox share the same primary company name which is Microsoft.
I am using an iterator in the getHits() method but when I do find a hit, I need to look at the next word in the array to make sure it is not a continuation before I decide whether to stop or continue. The problem is that calling iter.next() doesn't just "peek" the next value but it moves forward, essentially causing me to skip words.
For example, if you look at the below code and my example, after "Best" gets a hit, it should see that "Buy" is a child and the next time it loops get a match on "Buy", but since I already call iter.next() to look at "Buy" within the While loop, the next iteration entirely skips "Buy". Is there some way I can simply peek at the next iter value within the While loop without actually moving to it? Also, any improvements to this code are greatly appreciated! I am sure there are many places where I sloppily implemented something.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.*;
public class BuildTrie {
// Class Methods
public static void main(String[] args) throws IOException {
Trie Companies = new Trie();
String filename = "companies.dat";
try {
BufferedReader reader = new BufferedReader(new FileReader(filename));
String line;
while ((line = reader.readLine()) != null) {
// Split line by tab character
String[] aliases = line.replaceAll("\\p{P}", "").split("\t");
// Loop over each "alias" of specific company
for (int n = 0; n < aliases.length; n++) {
String[] name = aliases[n].split(" ");
// Insert each alias into Trie with index 0 as primary
Companies.insert(name, aliases[0]);
}
}
reader.close();
} catch (Exception e) {
System.err.format("Exception occurred trying to read '%s'.", filename);
e.printStackTrace();
}
/*System.out.println("Article Input: ");
try (BufferedReader input = new BufferedReader(new InputStreamReader(System.in))) {
String line;
while ((line = input.readLine()) != null) {
if (".".equals(line)) break;
String[] items = line.trim().replaceAll("\\p{P}", "").split("\\s+");
for (int i = 0; i < items.length; i++) {
Companies.words.add(items[i]);
//System.out.println(items[i]);
}
}
}*/
Companies.articleAdd("The");
Companies.articleAdd("company");
Companies.articleAdd("Best");
Companies.articleAdd("Buy");
Companies.articleAdd("sell");
Companies.articleAdd("Xbox");
Companies.getHits();
}
}
// Trie Node, which stores a whole word and its children in a HashMap
class TrieNode {
// Data Fields
private String word;
HashMap<String,TrieNode> children;
boolean bIsEnd;
private String primary = "";
// Constructors
public TrieNode() {
children = new HashMap<>();
bIsEnd = false;
}
public TrieNode(String st, String prime) {
word = st;
children = new HashMap<>();
bIsEnd = false;
primary = prime;
}
// Trie Node Methods
public HashMap<String,TrieNode> getChildren() {
return children;
}
public String getValue() {
return word;
}
public void setIsEnd(boolean val) {
bIsEnd = val;
}
public boolean isEnd() {
return bIsEnd;
}
public String getPrime() {
return primary;
}
}
class Trie {
private ArrayList<String> article = new ArrayList<String>();
private HashMap<String,Integer> hits = new HashMap<String,Integer>();
// Constructor
public Trie() {
root = new TrieNode();
}
// Insert article text
public void articleAdd(String word) {
article.add(word);
}
// Method to insert a new company name to Trie
public void insert(String[] names, String prime) {
// Find length of the given name
int length = names.length;
//TrieNode currNode = root;
HashMap<String,TrieNode> children = root.children;
// Traverse through all words of given name
for( int i=0; i<length; i++)
{
String name = names[i];
System.out.println("Iter: " + name);
TrieNode t;
// If there is already a child for current word of given name
if( children.containsKey(name))
t = children.get(name);
else // Else create a child
{
System.out.println("Inserting node " + name + " prime is " + prime);
t = new TrieNode(name, prime);
children.put( name, t );
}
children = t.getChildren();
int j = names.length-1;
if(i==j){
t.setIsEnd(true);
System.out.println("WordEnd");
}
}
}
public void getHits() {
// String[] articleArr = article.toArray(new String[0]);
// Initialize reference to traverse through Trie
// TrieNode crawl = root;
// int level, prevMatch = 0;
Iterator<String> iter = article.iterator();
TrieNode currNode = root;
while (iter.hasNext()) {
String word = iter.next();
System.out.println("Iter: " + word);
// HashMap of current node's children
HashMap<String,TrieNode> child = currNode.getChildren();
// If hit in currNode's children
if (child.containsKey(word)) {
System.out.println("Node exists: " + word);
// Update currNode to be node that matched
currNode = child.get(word);
System.out.println(currNode.isEnd());
String next = "";
// If currNode ends a name and the next word has no match in children, we're done
if (iter.hasNext()) {next = iter.next();}
if (currNode.isEnd() && !child.containsKey(next)) {
System.out.println("Matched word: " + word);
System.out.println("Primary: " + currNode.getPrime());
currNode = root;
} else {
// Else next node is continuation
}
} else {
// Else ignore next word and reset
currNode = root;
}
}
}
private TrieNode root;
}

I think that instead of using while and iter.next(), you can use a for loop as below:
for (Map.Entry entry : article.entrySet()) {
String word = entry.getKey();
}
So you are not really moving to the next items of your hashmap.
If this is not your point, please clarify.
Thanks,
Nghia

I opted for an index-based for loop instead of the while loop, and tweaked some of the logic to get it working. For those interested, the new code is below, along with an example of the "companies.dat" file that populates the Trie. The standard input is any text excerpt that ends with a "." on its own line.
Companies.dat (names on a line are separated by tabs, primary name first):
Microsoft Corporation	Microsoft	Xbox
Apple Computer	Apple	Mac
Best Buy
Dell
TrieBuilder:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.*;
public class BuildTrie {
// Class Methods
public static void main(String[] args) throws IOException {
Trie Companies = new Trie();
String filename = "companies.dat";
try {
BufferedReader reader = new BufferedReader(new FileReader(filename));
String line;
while ((line = reader.readLine()) != null) {
// Split line by tab character
String[] aliases = line.replaceAll("\\p{P}", "").split("\t");
// Loop over each "alias" of specific company
for (int n = 0; n < aliases.length; n++) {
String[] name = aliases[n].split(" ");
// Insert each alias into Trie with index 0 as primary
Companies.insert(name, aliases[0]);
}
}
reader.close();
} catch (Exception e) {
System.err.format("Exception occurred trying to read '%s'.", filename);
e.printStackTrace();
}
System.out.println("Article Input: ");
try (BufferedReader input = new BufferedReader(new InputStreamReader(System.in))) {
String line;
while ((line = input.readLine()) != null) {
if (".".equals(line)) break;
String[] items = line.trim().replaceAll("\\p{P}", "").split("\\s+");
for (int i = 0; i < items.length; i++) {
Companies.articleAdd(items[i]);
}
}
}
Companies.getHits();
}
}
// Trie Node, which stores a whole word and its children in a HashMap
class TrieNode {
// Data Fields
private String word;
HashMap<String,TrieNode> children;
boolean bIsEnd;
private String primary = "";
// Constructors
public TrieNode() {
children = new HashMap<>();
bIsEnd = false;
}
public TrieNode(String st, String prime) {
word = st;
children = new HashMap<>();
bIsEnd = false;
primary = prime;
}
// Trie Node Methods
public HashMap<String,TrieNode> getChildren() {
return children;
}
public String getValue() {
return word;
}
public void setIsEnd(boolean val) {
bIsEnd = val;
}
public boolean isEnd() {
return bIsEnd;
}
public String getPrime() {
return primary;
}
}
class Trie {
private ArrayList<String> article = new ArrayList<String>();
private HashMap<String,Integer> hits = new HashMap<String,Integer>();
// Constructor
public Trie() {
root = new TrieNode();
}
// Insert article text
public void articleAdd(String word) {
article.add(word);
}
// Method to insert a new company name to Trie
public void insert(String[] names, String prime) {
// Find length of the given name
int length = names.length;
HashMap<String,TrieNode> children = root.children;
// Traverse through all words of given name
for( int i=0; i<length; i++)
{
String name = names[i];
TrieNode t;
// If there is already a child for current word of given name
if( children.containsKey(name))
t = children.get(name);
else // Else create a child
{
t = new TrieNode(name, prime);
children.put( name, t );
}
children = t.getChildren();
int j = names.length-1;
if(i==j){
t.setIsEnd(true);
}
}
}
public void getHits() {
// Initialize reference to traverse through Trie
TrieNode currNode = root;
for (int i=0; i < article.size(); i++) {
String word = article.get(i);
System.out.println("Searching: " + word);
// HashMap of current node's children
HashMap<String, TrieNode> child = currNode.getChildren();
// If hit in currNode's children
if (child.containsKey(word)) {
System.out.println("Node exists: " + word);
// Update currNode to be node that matched
currNode = child.get(word);
child = currNode.getChildren();
System.out.println("isEnd?: " + currNode.isEnd());
String next = "";
if (i+1 < article.size()) {
next = article.get(i+1);
}
// If currNode ends a name and the next word has no match in children, we're done
if (currNode.isEnd() && !child.containsKey(next)) {
System.out.println("Primary of match: " + currNode.getPrime());
currNode = root;
}
} else {
// Else ignore next word and reset
System.out.println("No match.");
currNode = root;
}
}
}
private TrieNode root;
}
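For readers who would rather keep the iterator-based loop from the question, a look-ahead wrapper is another option. The sketch below is a hypothetical helper written for this example, not part of the code above; it buffers one element so that peek() can return the upcoming word without consuming it. Guava offers a similar facility through Iterators.peekingIterator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper: wraps any Iterator and buffers one element so it can be peeked.
class PeekingIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private T buffered;
    private boolean hasBuffered;

    PeekingIterator(Iterator<T> inner) { this.inner = inner; }

    @Override
    public boolean hasNext() { return hasBuffered || inner.hasNext(); }

    // Returns the next element without consuming it.
    public T peek() {
        if (!hasBuffered) {
            if (!inner.hasNext()) throw new NoSuchElementException();
            buffered = inner.next();
            hasBuffered = true;
        }
        return buffered;
    }

    // Returns the next element and consumes it, clearing the buffer.
    @Override
    public T next() {
        T result = peek();
        buffered = null;
        hasBuffered = false;
        return result;
    }
}

class PeekDemo {
    public static void main(String[] args) {
        PeekingIterator<String> iter =
                new PeekingIterator<>(Arrays.asList("Best", "Buy", "sell").iterator());
        while (iter.hasNext()) {
            String word = iter.next();
            // Looking ahead does not skip "Buy" on the following iteration.
            String upcoming = iter.hasNext() ? iter.peek() : "";
            System.out.println(word + " (next: " + upcoming + ")");
        }
    }
}
```

In getHits(), the article iterator would be wrapped once and the look-ahead call to iter.next() replaced with iter.peek().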

Where do I find a standard Trie based map implementation in Java? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 2 years ago.
I have a Java program that stores a lot of mappings from Strings to various objects.
Right now, my options are either to rely on hashing (via HashMap) or on binary searches (via TreeMap). I am wondering if there is an efficient and standard trie-based map implementation in a popular and quality collections library?
I've written my own in the past, but I'd rather go with something standard, if available.
Quick clarification: While my question is general, in the current project I am dealing with a lot of data that is indexed by fully-qualified class name or method signature. Thus, there are many shared prefixes.

You might want to look at the Trie implementation that Limewire is contributing to Google Guava.

There is no trie data structure in the core Java libraries.
This may be because tries are usually designed to store character strings, while Java data structures are more general, usually holding any Object (defining equality and a hash operation), though they are sometimes limited to Comparable objects (defining an order). There's no common abstraction for "a sequence of symbols," although CharSequence is suitable for character strings, and I suppose you could do something with Iterable for other types of symbols.
Here's another point to consider: when trying to implement a conventional trie in Java, you are quickly confronted with the fact that Java supports Unicode. To have any sort of space efficiency, you have to restrict the strings in your trie to some subset of symbols, or abandon the conventional approach of storing child nodes in an array indexed by symbol. This might be another reason why tries are not considered general-purpose enough for inclusion in the core library, and something to watch out for if you implement your own or use a third-party library.

Apache Commons Collections v4.0 now supports trie structures.
See the org.apache.commons.collections4.trie package info for more information. In particular, check the PatriciaTrie class:
Implementation of a PATRICIA Trie (Practical Algorithm to Retrieve Information Coded in Alphanumeric).
A PATRICIA Trie is a compressed Trie. Instead of storing all data at the edges of the Trie (and having empty internal nodes), PATRICIA stores data in every node. This allows for very efficient traversal, insert, delete, predecessor, successor, prefix, range, and select(Object) operations. All operations are performed at worst in O(K) time, where K is the number of bits in the largest item in the tree. In practice, operations actually take O(A(K)) time, where A(K) is the average number of bits of all items in the tree.

Also check out concurrent-trees. They support both Radix and Suffix trees and are designed for high concurrency environments.

I wrote and published a simple and fast implementation here.

What you need is org.apache.commons.collections.FastTreeMap, I think.

Below is a basic HashMap implementation of a Trie. Some people might find this useful...
import java.util.HashMap;
class Trie {
HashMap<Character, HashMap> root;
public Trie() {
root = new HashMap<Character, HashMap>();
}
public void addWord(String word) {
HashMap<Character, HashMap> node = root;
for (int i = 0; i < word.length(); i++) {
Character currentLetter = word.charAt(i);
if (node.containsKey(currentLetter) == false) {
node.put(currentLetter, new HashMap<Character, HashMap>());
}
node = node.get(currentLetter);
}
}
public boolean containsPrefix(String word) {
HashMap<Character, HashMap> node = root;
for (int i = 0; i < word.length(); i++) {
Character currentLetter = word.charAt(i);
if (node.containsKey(currentLetter)) {
node = node.get(currentLetter);
} else {
return false;
}
}
return true;
}
}
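The trie above only answers prefix queries: after addWord("abc") it cannot tell whether "ab" was added as a word or is merely a prefix. A sketch of one way to support both, using a hypothetical WordTrie class with an end-of-word flag on each node:

```java
import java.util.HashMap;
import java.util.Map;

class WordTrie {
    private static class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true when some added word ends at this node
    }

    private final Node root = new Node();

    // Creates any missing nodes along the path and marks the last one as a word end.
    public void addWord(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    public boolean containsPrefix(String prefix) {
        return find(prefix) != null;
    }

    public boolean containsWord(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    // Follows the characters of s from the root; returns null if the path breaks off.
    private Node find(String s) {
        Node node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            node = node.children.get(s.charAt(i));
        }
        return node;
    }

    public static void main(String[] args) {
        WordTrie trie = new WordTrie();
        trie.addWord("abc");
        System.out.println(trie.containsPrefix("ab")); // prints true
        System.out.println(trie.containsWord("ab"));   // prints false
        System.out.println(trie.containsWord("abc"));  // prints true
    }
}
```

Using a typed Node class instead of raw nested HashMaps also avoids the unchecked-conversion warnings of the version above.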

Apache's commons collections:
org.apache.commons.collections4.trie.PatriciaTrie

You can try the Completely Java library, it features a PatriciaTrie implementation. The API is small and easy to get started, and it's available in the Maven central repository.

You might look at this TopCoder one as well (registration required...).

If you require a sorted map, then tries are worthwhile.
If you don't, then a hash map is better.
A hash map with string keys can be improved over the standard Java implementation:
Array hash map
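
To illustrate the trade-off, here is a rough sketch using only java.util. Note the swap: TreeSet (a red-black tree, not a trie) stands in for the sorted option and HashSet for the unsorted one, just to show which kind of query each supports. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustration only: a sorted structure answers ordered prefix queries,
// a hashed one only answers exact-match lookups.
class SortedVsHashedDemo {
    // Words starting with prefix, as a range view over the sorted set.
    // Assumes no stored word contains the char '\uffff' right after the prefix.
    static SortedSet<String> withPrefix(TreeSet<String> words, String prefix) {
        return words.subSet(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        TreeSet<String> sorted = new TreeSet<>();
        Set<String> hashed = new HashSet<>();
        for (String w : new String[] {"cat", "car", "dog", "cart"}) {
            sorted.add(w);
            hashed.add(w);
        }
        // Exact lookup: both structures can do this.
        System.out.println(hashed.contains("dog"));
        // Ordered prefix query: only the sorted structure supports it directly.
        System.out.println(withPrefix(sorted, "ca"));
    }
}
```

If all you need is "is this word in the dictionary", the hashed lookup is enough. The range query is where a sorted structure (or a trie) earns its keep.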

If you're not worried about pulling in the Scala library, you can use this space efficient implementation I wrote of a burst trie.
https://github.com/nbauernfeind/scala-burst-trie

Here is my implementation; you can get it via: GitHub - MyTrie.java
/* usage:
MyTrie trie = new MyTrie();
trie.insert("abcde");
trie.insert("abc");
trie.insert("sadas");
trie.insert("abc");
trie.insert("wqwqd");
System.out.println(trie.contains("abc"));
System.out.println(trie.contains("abcd"));
System.out.println(trie.contains("abcdefg"));
System.out.println(trie.contains("ab"));
System.out.println(trie.getWordCount("abc"));
System.out.println(trie.getAllDistinctWords());
*/
import java.util.*;
public class MyTrie {
private class Node {
public int[] next = new int[26];
public int wordCount;
public Node() {
for(int i=0;i<26;i++) {
next[i] = NULL;
}
wordCount = 0;
}
}
private int curr;
private Node[] nodes;
private List<String> allDistinctWords;
public final static int NULL = -1;
public MyTrie() {
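// Fixed capacity: at most 100000 nodes (root included); inserting beyond that throws ArrayIndexOutOfBoundsException.
nodes = new Node[100000];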
nodes[0] = new Node();
curr = 1;
}
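// Maps 'a'..'z' to 0..25. Only lowercase ASCII letters are supported; any other character yields an index outside next[].
private int getIndex(char c) {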
return (int)(c - 'a');
}
private void depthSearchWord(int x, String currWord) {
for(int i=0;i<26;i++) {
int p = nodes[x].next[i];
if(p != NULL) {
String word = currWord + (char)(i + 'a');
if(nodes[p].wordCount > 0) {
allDistinctWords.add(word);
}
depthSearchWord(p, word);
}
}
}
public List<String> getAllDistinctWords() {
allDistinctWords = new ArrayList<String>();
depthSearchWord(0, "");
return allDistinctWords;
}
public int getWordCount(String str) {
int len = str.length();
int p = 0;
for(int i=0;i<len;i++) {
int j = getIndex(str.charAt(i));
if(nodes[p].next[j] == NULL) {
return 0;
}
p = nodes[p].next[j];
}
return nodes[p].wordCount;
}
public boolean contains(String str) {
int len = str.length();
int p = 0;
for(int i=0;i<len;i++) {
int j = getIndex(str.charAt(i));
if(nodes[p].next[j] == NULL) {
return false;
}
p = nodes[p].next[j];
}
return nodes[p].wordCount > 0;
}
public void insert(String str) {
int len = str.length();
int p = 0;
for(int i=0;i<len;i++) {
int j = getIndex(str.charAt(i));
if(nodes[p].next[j] == NULL) {
nodes[curr] = new Node();
nodes[p].next[j] = curr;
curr++;
}
p = nodes[p].next[j];
}
nodes[p].wordCount++;
}
}

I have just tried my own concurrent trie implementation. It is not based on characters; it is based on the hash code. Still, we can use it with a map of maps keyed by each char's hash code.
You can test this using the code at https://github.com/skanagavelu/TrieHashMap/blob/master/src/TrieMapPerformanceTest.java
and https://github.com/skanagavelu/TrieHashMap/blob/master/src/TrieMapValidationTest.java
import java.util.concurrent.atomic.AtomicReferenceArray;
public class TrieMap {
public static int SIZEOFEDGE = 4;
public static int OSIZE = 5000;
}
abstract class Node {
public Node getLink(String key, int hash, int level){
throw new UnsupportedOperationException();
}
public Node createLink(int hash, int level, String key, String val) {
throw new UnsupportedOperationException();
}
public Node removeLink(String key, int hash, int level){
throw new UnsupportedOperationException();
}
}
class Vertex extends Node {
String key;
volatile String val;
volatile Vertex next;
public Vertex(String key, String val) {
this.key = key;
this.val = val;
}
@Override
public boolean equals(Object obj) {
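if (!(obj instanceof Vertex)) {
return false; // also covers null; avoids a ClassCastException on foreign types
}
Vertex v = (Vertex) obj;
return this.key.equals(v.key);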
}
@Override
public int hashCode() {
return key.hashCode();
}
@Override
public String toString() {
return key +"#"+key.hashCode();
}
}
class Edge extends Node {
volatile AtomicReferenceArray<Node> array; //This is needed to ensure array elements are volatile
public Edge(int size) {
array = new AtomicReferenceArray<Node>(8);
}
@Override
public Node getLink(String key, int hash, int level){
int index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
Node returnVal = array.get(index);
for(;;) {
if(returnVal == null) {
return null;
}
else if((returnVal instanceof Vertex)) {
Vertex node = (Vertex) returnVal;
for(;node != null; node = node.next) {
if(node.key.equals(key)) {
return node;
}
}
return null;
} else { //instanceof Edge
level = level + 1;
index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
Edge e = (Edge) returnVal;
returnVal = e.array.get(index);
}
}
}
@Override
public Node createLink(int hash, int level, String key, String val) { //Remove size
for(;;) { //Repeat the work on the current node, since some other thread modified this node
int index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
Node nodeAtIndex = array.get(index);
if ( nodeAtIndex == null) {
Vertex newV = new Vertex(key, val);
boolean result = array.compareAndSet(index, null, newV);
if(result == Boolean.TRUE) {
return newV;
}
//continue; since new node is inserted by other thread, hence repeat it.
}
else if(nodeAtIndex instanceof Vertex) {
Vertex vrtexAtIndex = (Vertex) nodeAtIndex;
int newIndex = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, vrtexAtIndex.hashCode(), level+1);
int newIndex1 = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level+1);
Edge edge = new Edge(Base10ToBaseX.Base.BASE8.getLevelZeroMask()+1);
if(newIndex != newIndex1) {
Vertex newV = new Vertex(key, val);
edge.array.set(newIndex, vrtexAtIndex);
edge.array.set(newIndex1, newV);
boolean result = array.compareAndSet(index, vrtexAtIndex, edge); //REPLACE vertex to edge
if(result == Boolean.TRUE) {
return newV;
}
//continue; since vrtexAtIndex may be removed or changed to Edge already.
} else if(vrtexAtIndex.key.hashCode() == hash) {//vrtex.hash == hash) { HERE newIndex == newIndex1
synchronized (vrtexAtIndex) {
boolean result = array.compareAndSet(index, vrtexAtIndex, vrtexAtIndex); //Double check this vertex is not removed.
if(result == Boolean.TRUE) {
Vertex prevV = vrtexAtIndex;
for(;vrtexAtIndex != null; vrtexAtIndex = vrtexAtIndex.next) {
prevV = vrtexAtIndex; // prevV is used to handle when vrtexAtIndex reached NULL
if(vrtexAtIndex.key.equals(key)){
vrtexAtIndex.val = val;
return vrtexAtIndex;
}
}
Vertex newV = new Vertex(key, val);
prevV.next = newV; // Within SYNCHRONIZATION since prevV.next may be added with some other.
return newV;
}
//Continue; vrtexAtIndex got changed
}
} else { //HERE newIndex == newIndex1 BUT vrtex.hash != hash
edge.array.set(newIndex, vrtexAtIndex);
boolean result = array.compareAndSet(index, vrtexAtIndex, edge); //REPLACE vertex to edge
if(result == Boolean.TRUE) {
return edge.createLink(hash, (level + 1), key, val);
}
}
}
else { //instanceof Edge
return nodeAtIndex.createLink(hash, (level + 1), key, val);
}
}
}
@Override
public Node removeLink(String key, int hash, int level){
for(;;) {
int index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
Node returnVal = array.get(index);
if(returnVal == null) {
return null;
}
else if((returnVal instanceof Vertex)) {
synchronized (returnVal) {
Vertex node = (Vertex) returnVal;
if(node.next == null) {
if(node.key.equals(key)) {
boolean result = array.compareAndSet(index, node, null);
if(result == Boolean.TRUE) {
return node;
}
continue; //Vertex may be changed to Edge
}
return null; //Nothing found; This is not the same vertex we are looking for. Here hashcode is same but key is different.
} else {
if(node.key.equals(key)) { //Removing the first node in the link
boolean result = array.compareAndSet(index, node, node.next);
if(result == Boolean.TRUE) {
return node;
}
continue; //Vertex(node) may be changed to Edge, so try again.
}
Vertex prevV = node; // prevV is used to handle when vrtexAtIndex is found and to be removed from its previous
node = node.next;
for(;node != null; prevV = node, node = node.next) {
if(node.key.equals(key)) {
prevV.next = node.next; //Removing other than first node in the link
return node;
}
}
return null; //Nothing found in the linked list.
}
}
} else { //instanceof Edge
return returnVal.removeLink(key, hash, (level + 1));
}
}
}
}
class Base10ToBaseX {
public static enum Base {
/**
* A Java int is always 32 bits wide, regardless of the machine.
* Those bits can be split into groups of 1, 2, 3, 4, 8 or 16 bits, one group per trie level.
*/
BASE2(1,1,32), BASE4(3,2,16), BASE8(7,3,11)/* OCTAL*/, /*BASE10(3,2),*/
BASE16(15, 4, 8){
public String getFormattedValue(int val){
switch(val) {
case 10:
return "A";
case 11:
return "B";
case 12:
return "C";
case 13:
return "D";
case 14:
return "E";
case 15:
return "F";
default:
return "" + val;
}
}
}, /*BASE32(31,5,1),*/ BASE256(255, 8, 4), /*BASE512(511,9),*/ Base65536(65535, 16, 2);
private int LEVEL_0_MASK;
private int LEVEL_1_ROTATION;
private int MAX_ROTATION;
Base(int levelZeroMask, int levelOneRotation, int maxPossibleRotation) {
this.LEVEL_0_MASK = levelZeroMask;
this.LEVEL_1_ROTATION = levelOneRotation;
this.MAX_ROTATION = maxPossibleRotation;
}
int getLevelZeroMask(){
return LEVEL_0_MASK;
}
int getLevelOneRotation(){
return LEVEL_1_ROTATION;
}
int getMaxRotation(){
return MAX_ROTATION;
}
String getFormattedValue(int val){
return "" + val;
}
}
public static int getBaseXValueOnAtLevel(Base base, int on, int level) {
if(level > base.getMaxRotation() || level < 1) {
return 0; //INVALID Input
}
int rotation = base.getLevelOneRotation();
int mask = base.getLevelZeroMask();
if(level > 1) {
rotation = (level-1) * rotation;
mask = mask << rotation;
} else {
rotation = 0;
}
return (on & mask) >>> rotation;
}
}
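
To make the indexing a bit more concrete: at each level, getBaseXValueOnAtLevel picks one group of bits out of the hash. For BASE8 that is three bits per level (mask 7, shifted by 3 * (level - 1)), so each Edge has 8 slots. Below is a standalone sketch of the same arithmetic. The class and method names are hypothetical, and it hard-codes the BASE8 case only.

```java
// Standalone illustration of the BASE8 bit slicing used by the hash-based trie above.
class HashDigitsDemo {
    // Returns the 3-bit group of hash used at the given 1-based level.
    // Mirrors the original's behaviour of returning 0 for levels outside 1..11.
    static int octalDigitAt(int hash, int level) {
        if (level < 1 || level > 11) {
            return 0;
        }
        int shift = 3 * (level - 1);
        return (hash >>> shift) & 7;
    }

    public static void main(String[] args) {
        int hash = 0563; // octal literal, chosen so the slots are easy to read off
        for (int level = 1; level <= 4; level++) {
            System.out.println("level " + level + " -> slot " + octalDigitAt(hash, level));
        }
    }
}
```

For the octal value 0563 this gives slot 3 at level 1, slot 6 at level 2, slot 5 at level 3 and slot 0 from level 4 on, i.e. the octal digits read from right to left. Two keys only end up in the same Vertex chain when their hashes agree on every group.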
