I was trying to solve this problem from LeetCode, and the prompt looks like this:
Given a binary search tree, write a function kthSmallest to find the kth smallest element in it.
https://leetcode.com/problems/kth-smallest-element-in-a-bst/description/
class Solution {
    public int kthSmallest(TreeNode root, int k) {
        TreeNode curr = new TreeNode(0);
        ArrayList<TreeNode> res = new ArrayList<TreeNode>();
        res = inOrder(root);
        if (res != null) {
            curr = res.get(k);
            return curr.val;
        }
        return -1; // if not found
    }
    public ArrayList<TreeNode> inOrder(TreeNode root) { // the value of the nodes would be in increasing order
        ArrayList<TreeNode> list = new ArrayList<TreeNode>();
        if (root == null) {
            return list;
        }
        list.addAll(inOrder(root.left));
        list.addAll(inOrder(root));
        list.addAll(inOrder(root.right));
        return list;
    }
}
However, the system gave me a "memory limit exceeded" error. Is my logic flawed, or is there any way I could fix my code? Thanks in advance!
Your logic is probably fine; since you're getting MLE, it's mainly an efficiency issue. You're allocating extra space (a placeholder node plus a list of every node in the tree), which we don't need for this problem.
This'll pass in Java:
public final class Solution {
    private static int res = 0;
    private static int count = 0;

    public final int kthSmallest(final TreeNode root, final int k) {
        count = k;
        inorder(root);
        return res;
    }

    private final void inorder(final TreeNode node) {
        if (node.left != null)
            inorder(node.left);
        count--;
        if (count == 0) {
            res = node.val;
            return;
        }
        if (node.right != null)
            inorder(node.right);
    }
}
And here is a Python version, if you're interested, similarly using an inorder traversal:
class Solution:
    def kthSmallest(self, root, k):
        def inorder(node):
            if not node:
                return
            inorder(node.left)
            self.k -= 1
            if self.k == 0:
                self.res = node.val
                return
            inorder(node.right)
        self.k, self.res = k, None
        inorder(root)
        return self.res
References
For additional details, please see the Discussion Board, where you can find plenty of well-explained accepted solutions in a variety of languages, including low-complexity algorithms and asymptotic runtime/memory analysis.
Brute-force algorithms usually get accepted for easy questions. For medium and hard questions, brute-force algorithms mostly fail with Time Limit Exceeded (TLE) and, less often, Memory Limit Exceeded (MLE) errors.
Two weeks ago I finished an implementation of a splay tree that supports basic operations like insert, delete, find key, and obtaining the sum of a range of elements of the tree. You can find this implementation here as a reference for this question or out of curiosity.
As an extra task (it's optional and past due; I'm solving it not for a grade but because I believe it's a useful data structure that isn't easy to just Google), I was asked to implement a Rope data structure to manipulate strings. For example, if the string is "warlock" and the keys given are 0 2 2, then the result is "lowarck": 0 2 selects the substring "war", "lock" is what's left after removing "war", and you insert "war" after the 2nd character of what's left, so it becomes "lo" + "war" + "ck".
This is just one query, but there can be up to 100k queries on a 300k-character string, so a naive solution wouldn't work.
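Just to pin down the semantics of a single query (this is not a solution; each query here costs O(n) in the string length, which is exactly why the naive approach is too slow), one "cut i..j and reinsert after position k" operation in plain Java might look like this:
// Naive illustration of one query, for reference only.
static String cutAndPaste(String s, int i, int j, int k) {
    String cut = s.substring(i, j + 1);                    // characters i..j inclusive
    String rest = s.substring(0, i) + s.substring(j + 1);  // the string with the cut removed
    return rest.substring(0, k) + cut + rest.substring(k); // reinsert after the k-th character
}
// cutAndPaste("warlock", 0, 2, 2) returns "lowarck"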
My first doubt is about initializing the tree (for those who have read the gist: I'll use Node instead of Vertex to make it easier to understand).
This is the Node class and the NodePair class:
class Node {
char key;
int size;
Node left;
Node right;
Node parent;
Node(char key, int size, Node left, Node right, Node parent) {
this.key = key;
this.size = size;
this.left = left;
this.right = right;
this.parent = parent;
}
}
class NodePair {
Node left;
Node right;
NodePair() {
}
NodePair(Node left, Node right) {
this.left = left;
this.right = right;
}
}
After that, I create the tree this way:
StringBuffer sb = new StringBuffer(br.readLine());
Node left = null;
for (int i = 0; i < sb.length(); i++) {
    root = new Node(sb.charAt(i), i + 1, left, null, null);
    if (i != sb.length() - 1) {
        left = root;
    }
}
This creates a very unbalanced tree where the first char of the string (as node.key) has node.size 1 and is the leftmost child; and the last char of the string is the root with size=sb.length().
I am not completely sure if this is correctly initialized. I did an inorder traversal print with key and size and I got the whole string in size order, which is what I expected.
After that, I modified the update method from this:
void update(Node v) {
if (v == null) return;
v.sum = v.key + (v.left != null ? v.left.sum : 0) + (v.right != null ? v.right.sum : 0);
if (v.left != null) {
v.left.parent = v;
}
if (v.right != null) {
v.right.parent = v;
}
}
To this (based on CLRS Chapter 14.1):
void update(Node v) {
if (v == null) return;
v.size = 1 + (v.left != null ? v.left.size : 0) + (v.right != null ? v.right.size : 0);
if (v.left != null) {
v.left.parent = v;
}
if (v.right != null) {
v.right.parent = v;
}
}
Then the find method, from the original:
NodePair find(Node root, int key) {
Node v = root;
Node last = root;
Node next = null;
while (v != null) {
if (v.key >= key && (next == null || v.key < next.key)) {
next = v;
}
last = v;
if (v.key == key) {
break;
}
if (v.key < key) {
v = v.right;
} else {
v = v.left;
}
}
root = splay(last);
return new NodePair(next, root);
}
to this (based on OS-SELECT, the order-statistic selection from CLRS Chapter 14.1):
Node find(Node r, int k){
int s = r.left.size + 1;
if (k==s) return r;
else if (k < s) {
return find(r.left,k);
}
return find(r.right,k-s);
}
However, I already have a problem at this point since, as you can see, the original find returns a NodePair while this method returns a Node.
The instructors' explanation of the NodePair is:
Returns a pair of the result and the new root. If found, result is a
pointer to the node with the given key. Otherwise, result is a pointer
to the node with the smallest bigger key (next value in the order). If
the key is bigger than all keys in the tree, then result is null.
This complicates my split method, since it uses the find method to look for the node to split.
Besides this, I'm getting a NullPointerException in this find method, and from other students I understand that to avoid another error we should use a non-recursive version. So basically I need to implement a non-recursive version of OS-SELECT that returns a NodePair like the previous find method, or modify my split method, which is:
NodePair split(Node root, int key) {
NodePair result = new NodePair();
NodePair findAndRoot = find(root, key);
root = findAndRoot.right;
result.right = findAndRoot.left;
if (result.right == null) {
result.left = root;
return result;
}
result.right = splay(result.right);
result.left = result.right.left;
result.right.left = null;
if (result.left != null) {
result.left.parent = null;
}
update(result.left);
update(result.right);
return result;
}
As you can see, the result of the find method is assigned to the NodePair findAndRoot.
I believe that, besides converting OS-SELECT to a non-recursive version, my main problem is understanding the way NodePair is used by the previous find and split methods.
Finally, this is my implementation of the method to receive the tree and keys and manipulate them:
Node stringManip(Node v, int i, int j, int k){
Node left;
Node right;
NodePair middleRight =split(v,j+1);
left=middleRight.left;
right=middleRight.right;
NodePair leftMiddle = split(left,i);
Node start = leftMiddle.left;
Node substr = leftMiddle.right;
Node tmp = merge(start, right);
NodePair pairString = split(tmp,k+1);
Node fLeft = pairString.left;
Node fRight = pairString.right;
root = merge(merge(fLeft,substr),fRight);
root = splay(root);
update(root);
return root;
}
As you can probably tell from my code, I'm a beginner: I started learning to program only five months ago and picked Java. From the information I've gathered, this type of data structure is more at the intermediate-to-expert level (I'm already surprised I was capable of implementing the previous splay tree).
So please, consider my beginner level in your answer.
P.S.: Here's a pseudocode version of the non-recursive OS-SELECT; it still throws a NullPointerException:
OS-SELECT(x, i)
    while x != null
        r <- size[left[x]] + 1
        if i = r
            then return x
        elseif i < r
            x = left[x]
        else
            x = right[x]
            i = i - r
I know you can solve this question iteratively by incrementing a counter each time you pass a node in the linked list, and also adding each node's data to an ArrayList. Once you hit the tail of the linked list, just subtract the Nth term from the total number of elements in the ArrayList and you can return the answer. However, how would someone do this using recursion? Is it possible? If so, please show the code to show your genius :).
Note: I know you cannot return two values in Java (but in C/C++, you can play with pointers :])
Edit: This was a simple question I found online, but I added the recursion piece to make it a challenge for myself, and I've come to suspect it may be impossible in Java.
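For reference, a minimal sketch of the iterative counter-and-list approach described in the question might look like this (the Node class and names here are illustrative, not anyone's actual code):
import java.util.ArrayList;
import java.util.List;

class IterativeNthFromLast {
    static class Node {
        int data;
        Node next;
        Node(int data, Node next) { this.data = data; this.next = next; }
    }

    // Collect every node while walking the list, then index from the back:
    // for a 1-based n, the nth-from-last node sits at position size - n.
    static Node nthFromLast(Node head, int n) {
        List<Node> nodes = new ArrayList<>();
        for (Node cur = head; cur != null; cur = cur.next) {
            nodes.add(cur);
        }
        int index = nodes.size() - n;
        return (index >= 0 && index < nodes.size()) ? nodes.get(index) : null;
    }
}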
The trick is to do the work after the recursion. The array in the private method is basically used as a reference to a mutable integer.
class Node {
Node next;
int data;
public Node findNthFromLast(int n) {
return findNthFromLast(new int[] {n});
}
private Node findNthFromLast(int[] r) {
Node result = next == null ? null : next.findNthFromLast(r);
return r[0]-- == 0 ? this : result;
}
}
As a general rule, anything that can be done with loops can also be done with recursion in any reasonable language. The elegance of the solution may be wildly different. Here is a fairly Java-idiomatic version. I've omitted the usual accessor functions for brevity.
The idea here is to recur to the end of the list and increment a counter as the recursion unwinds. When the counter reaches the desired value, return that node. Otherwise return null. The non-null value is just returned all the way to the top. Once down the list, once up. Minimal arguments. No disrespect to Adam intended, but I think this is rather simpler.
NB: OP's statement about Java being able to return only one value is true, but since that value can be any object, you can return an object with fields or array elements as you choose. That wasn't needed here, however.
public class Test {
public void run() {
Node node = null;
// Build a list of 10 nodes. The last is #1
for (int i = 1; i <= 10; i++) {
node = new Node(i, node);
}
// Print from 1st last to 10th last.
for (int i = 1; i <= 10; i++) {
System.out.println(i + "th last node=" + node.nThFromLast(i).data);
}
}
public static void main(String[] args) {
new Test().run();
}
}
class Node {
int data; // Node data
Node next; // Next node or null if this is last
Node(int data, Node next) {
this.data = data;
this.next = next;
}
// A context for finding nth last list element.
private static class NthLastFinder {
int n, fromLast = 1;
NthLastFinder(int n) {
this.n = n;
}
Node find(Node node) {
if (node.next != null) {
Node rtn = find(node.next);
if (rtn != null) {
return rtn;
}
fromLast++;
}
return fromLast == n ? node : null;
}
}
Node nThFromLast(int n) {
return new NthLastFinder(n).find(this);
}
}
Okay, I think this should do the trick. It's in C++, but it should be easy to translate to Java. I also haven't tested it.
Node *NToLastHelper(Node *behind, Node *current, int n) {
// If n is not yet 0, keep advancing the current node
// to get it n "ahead" of behind.
if (n != 0) {
return NToLastHelper(behind, current->next, n - 1);
}
// Since we now know current is n ahead of behind, if it is null
// the behind must be n from the end.
if (current->next == nullptr) {
return behind;
}
// Otherwise we need to keep going.
return NToLastHelper(behind->next, current->next, n);
}
Node *NToLast(Node *node, int n) {
// Call the helper function from the head node.
return NToLastHelper(node, node, n);
}
edit: If you want to return the value of the last node, you can just change it to:
int NToLast(Node *node, int n) {
// Call the helper function from the head node.
return NToLastHelper(node, node, n)->val;
}
This code will fail badly if node is null.
The recursion function:
int n_to_end(Node *no, int n, Node **res)
{
if(no->next == NULL)
{
if(n==0)
*res = no;
return 0;
}
else
{
int tmp = 1 + n_to_end(no->next, n, res);
if(tmp == n)
*res = no;
return tmp;
}
}
The wrapper function:
Node *n_end(Node *no, int n)
{
Node *res;
res = NULL;
int m = n_to_end(no, n, &res);
if(m < n)
{
printf("max possible n should be smaller than or equal to: %d\n", m);
}
return res;
}
The calling function:
int main()
{
List list;
list.append(3);
list.append(5);
list.append(2);
list.append(2);
list.append(1);
list.append(1);
list.append(2);
list.append(2);
Node * nth = n_end(list.head, 6);
if(nth!=NULL)
printf("value is: %d\n", nth->val);
}
This code has been tested with different inputs. Although it's a C++ version, you should be able to figure out the logic :)
This is a popular interview question and the only article I can find on the topic is one from TopCoder. Unfortunately for me, it looks overly complicated from an interview answer's perspective.
Isn't there a simpler way of doing this other than plotting the path to both nodes and deducing the ancestor? (This is a popular answer, but there's a variation of the interview question asking for a constant space answer).
A simplistic (but much less involved) version could simply be the following (I'm a .NET guy and my Java is a bit rusty, so please excuse the syntax, but I don't think you'll have to adjust too much). This is what I threw together.
class Program
{
static void Main(string[] args)
{
Node node1 = new Node { Number = 1 };
Node node2 = new Node { Number = 2, Parent = node1 };
Node node3 = new Node { Number = 3, Parent = node1 };
Node node4 = new Node { Number = 4, Parent = node1 };
Node node5 = new Node { Number = 5, Parent = node3 };
Node node6 = new Node { Number = 6, Parent = node3 };
Node node7 = new Node { Number = 7, Parent = node3 };
Node node8 = new Node { Number = 8, Parent = node6 };
Node node9 = new Node { Number = 9, Parent = node6 };
Node node10 = new Node { Number = 10, Parent = node7 };
Node node11 = new Node { Number = 11, Parent = node7 };
Node node12 = new Node { Number = 12, Parent = node10 };
Node node13 = new Node { Number = 13, Parent = node10 };
Node commonAncestor = FindLowestCommonAncestor(node9, node12);
Console.WriteLine(commonAncestor.Number);
Console.ReadLine();
}
public class Node
{
public int Number { get; set; }
public Node Parent { get; set; }
public int CalculateNodeHeight()
{
return CalculateNodeHeight(this);
}
private int CalculateNodeHeight(Node node)
{
if (node.Parent == null)
{
return 1;
}
return CalculateNodeHeight(node.Parent) + 1;
}
}
public static Node FindLowestCommonAncestor(Node node1, Node node2)
{
int nodeLevel1 = node1.CalculateNodeHeight();
int nodeLevel2 = node2.CalculateNodeHeight();
while (nodeLevel1 > 0 && nodeLevel2 > 0)
{
if (nodeLevel1 > nodeLevel2)
{
node1 = node1.Parent;
nodeLevel1--;
}
else if (nodeLevel2 > nodeLevel1)
{
node2 = node2.Parent;
nodeLevel2--;
}
else
{
if (node1 == node2)
{
return node1;
}
node1 = node1.Parent;
node2 = node2.Parent;
nodeLevel1--;
nodeLevel2--;
}
}
return null;
}
}
Constant-space answer (although not necessarily efficient): have a function findItemInPath(int index, int searchId, Node root), then iterate from 0 up to the depth of the tree, finding the 0-th item, 1st item, etc. in both search paths. When you find an i such that the function returns the same result for both nodes, but not for i+1, the i-th item in the path is the lowest common ancestor.
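A minimal sketch of that idea, assuming a binary search tree with distinct int values and both target values present, so each search path can be recomputed from the root by comparisons (class and method names here are illustrative):
class BstNode {
    int value;
    BstNode left, right;

    BstNode(int value) { this.value = value; }
}

class PathLca {
    // Returns the index-th node on the search path from root to searchValue,
    // or null if that path is shorter than index.
    static BstNode findItemInPath(int index, int searchValue, BstNode root) {
        BstNode cur = root;
        for (int step = 0; step < index && cur != null; step++) {
            if (cur.value == searchValue) {
                return null; // the path ended before reaching this index
            }
            cur = searchValue < cur.value ? cur.left : cur.right;
        }
        return cur;
    }

    // Walk both search paths in lockstep; the last index at which they agree
    // is the lowest common ancestor.
    static BstNode lowestCommonAncestor(BstNode root, int a, int b) {
        BstNode lca = null;
        for (int i = 0; ; i++) {
            BstNode onPathToA = findItemInPath(i, a, root);
            BstNode onPathToB = findItemInPath(i, b, root);
            if (onPathToA == null || onPathToB == null || onPathToA != onPathToB) {
                return lca;
            }
            lca = onPathToA;
        }
    }
}
Each findItemInPath call re-walks the path from the root, so this trades time (roughly O(h^2) for a tree of height h) for O(1) extra space, which matches the "not necessarily efficient" caveat.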
The main reason why the article's solutions are more complicated is that it is dealing with a two-stage problem (preprocessing and then queries), while from your question it sounds like you're only doing one query, so preprocessing doesn't make sense. It's also dealing with arbitrary trees rather than binary trees.
The best answer will certainly depend on details about the tree. For many kinds of trees, the time complexity is going to be O(h) where h is the tree's height. If you've got pointers to parent nodes, then the easy "constant-space" answer is, as in Mirko's solution, to find both nodes' height and compare ancestors of the same height. Note that this works for any tree with parent links, binary or no. We can improve on Mirko's solution by making the height function iterative and by separating the "get to the same depth" loops from the main loop:
int height(Node n){
int h=-1;
while(n!=null){h++;n=n.parent;}
return h;
}
Node LCA(Node n1, Node n2){
int discrepancy=height(n1)-height(n2);
while(discrepancy>0) {n1=n1.parent;discrepancy--;}
while(discrepancy<0) {n2=n2.parent;discrepancy++;}
while(n1!=n2){n1=n1.parent;n2=n2.parent;}
return n1;
}
The quotation marks around "constant-space" are because in general we need O(log(h)) space to store the heights and the difference between them (say, 3 BigIntegers). But if you're dealing with trees with heights too large to stuff in a long, you likely have other problems to worry about that are more pressing than storing a couple nodes' heights.
If you have a BST, then you can easily take a common ancestor (usually starting with the root) and check its children to see whether either of them is also a common ancestor:
Node LCA(Node n1, Node n2, Node CA){
while(true){
if(n1.val<CA.val & n2.val<CA.val) CA=CA.left;
else if (n1.val>CA.val & n2.val>CA.val) CA=CA.right;
else return CA;
}
}
As Philip JF mentioned, this same idea can be used in any tree for a constant-space algorithm, but for a general tree doing it this way will be really slow since figuring out repeatedly whether CA.left or CA.right is a common ancestor will repeat a lot of work, so you'd normally prefer to use more space to save some time. The main way to make that tradeoff would be basically the algorithm you've mentioned (storing the path from root).
It matters what kind of tree you are using. You can always tell if a node is the ancestor of another node in constant space, and the top node is always a common ancestor, so getting the Lowest Common Ancestor in constant space just requires iterating your way down. On a binary search tree this is pretty easy to do fast, but it will work on any tree.
Many different trade-offs are relevant for this problem, and the type of tree matters. The problem tends to be much easier if you have pointers to parent nodes, and not just to children (Mirko's code uses this).
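A sketch of the "iterate your way down" idea for a plain binary tree (no ordering, no parent pointers), assuming both target nodes are in the tree; the names are illustrative. The contains() helper is recursive, so it uses stack space; a strictly constant-space version would need an iterative containment check, but the idea (and the repeated work noted above) is the same:
class TreeLca {
    static class Node {
        Node left, right;
    }

    static boolean contains(Node root, Node target) {
        if (root == null) return false;
        return root == target || contains(root.left, target) || contains(root.right, target);
    }

    static Node lowestCommonAncestor(Node root, Node a, Node b) {
        Node cur = root; // the root is always a common ancestor
        while (cur != null) {
            if (cur == a || cur == b) return cur;   // one target is an ancestor of the other
            boolean aInLeft = contains(cur.left, a);
            boolean bInLeft = contains(cur.left, b);
            if (aInLeft != bInLeft) return cur;     // the targets split here: cur is the LCA
            cur = aInLeft ? cur.left : cur.right;   // both on the same side: descend
        }
        return null; // a or b was not found in the tree
    }
}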
See also:
http://en.wikipedia.org/wiki/Lowest_common_ancestor
The obvious solution, which uses log(n) space (n is the number of nodes), is the algorithm you mentioned. Here's an implementation. In the worst case it takes O(n) time (imagine that one of the nodes you are searching the common ancestor for is the last node).
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication2
{
class Node
{
private static int counter = 0;
private Node left = null;
private Node right = null;
public int id = counter++;
static Node constructTreeAux(int depth)
{
if (depth == 0)
return null;
Node newNode = new Node();
newNode.left = constructTree(depth - 1);
newNode.right = constructTree(depth - 1);
return newNode;
}
public static Node constructTree(int depth)
{
if (depth == 0)
return null;
Node root = new Node();
root.left = constructTreeAux(depth - 1);
root.right = constructTreeAux(depth - 1);
return root;
}
private List<Node> findPathAux(List<Node> pathSoFar, int searchId)
{
if (this.id == searchId)
{
if (pathSoFar == null)
pathSoFar = new List<Node>();
pathSoFar.Add(this);
return pathSoFar;
}
if (left != null)
{
List<Node> result = left.findPathAux(null, searchId);
if (result != null)
{
result.Add(this);
return result;
}
}
if (right != null)
{
List<Node> result = right.findPathAux(null, searchId);
if (result != null)
{
result.Add(this);
return result;
}
}
return null;
}
public static void printPath(List<Node> path)
{
if (path == null)
{
Console.Out.WriteLine(" empty path ");
return;
}
Console.Out.Write("[");
for (int i = 0; i < path.Count; i++)
Console.Out.Write(path[i] + " ");
Console.Out.WriteLine("]");
}
public override string ToString()
{
return id.ToString();
}
/// <summary>
/// Returns null if no common ancestor, the lowest common ancestor otherwise.
/// </summary>
public Node findCommonAncestor(int id1, int id2)
{
List<Node> path1 = findPathAux(null, id1);
if (path1 == null)
return null;
path1 = path1.Reverse<Node>().ToList<Node>();
List<Node> path2 = findPathAux(null, id2);
if (path2 == null)
return null;
path2 = path2.Reverse<Node>().ToList<Node>();
Node commonAncestor = this;
int n = path1.Count < path2.Count? path1.Count : path2.Count;
printPath(path1);
printPath(path2);
for (int i = 0; i < n; i++)
{
if (path1[i].id == path2[i].id)
commonAncestor = path1[i];
else
return commonAncestor;
}
return commonAncestor;
}
private void printTreeAux(int depth)
{
for (int i = 0; i < depth; i++)
Console.Write(" ");
Console.WriteLine(id);
if (left != null)
left.printTreeAux(depth + 1);
if (right != null)
right.printTreeAux(depth + 1);
}
public void printTree()
{
printTreeAux(0);
}
public static void testAux(out Node root, out Node commonAncestor, out int id1, out int id2)
{
Random gen = new Random();
int startid = counter;
root = constructTree(5);
int endid = counter;
int offset = gen.Next(endid - startid);
id1 = startid + offset;
offset = gen.Next(endid - startid);
id2 = startid + offset;
commonAncestor = root.findCommonAncestor(id1, id2);
}
public static void test1()
{
Node root = null, commonAncestor = null;
int id1 = 0, id2 = 0;
testAux(out root, out commonAncestor, out id1, out id2);
root.printTree();
commonAncestor = root.findCommonAncestor(id1, id2);
if (commonAncestor == null)
Console.WriteLine("Couldn't find common ancestor for " + id1 + " and " + id2);
else
Console.WriteLine("Common ancestor for " + id1 + " and " + id2 + " is " + commonAncestor.id);
}
}
}
The bottom-up approach described here runs in O(n) time and uses no extra space beyond the recursion stack:
http://www.leetcode.com/2011/07/lowest-common-ancestor-of-a-binary-tree-part-i.html
Node *LCA(Node *root, Node *p, Node *q) {
if (!root) return NULL;
if (root == p || root == q) return root;
Node *L = LCA(root->left, p, q);
Node *R = LCA(root->right, p, q);
if (L && R) return root; // if p and q are on both sides
return L ? L : R; // either one of p,q is on one side OR p,q is not in L&R subtrees
}
I have a Java program that stores a lot of mappings from Strings to various objects.
Right now, my options are either to rely on hashing (via HashMap) or on binary searches (via TreeMap). I am wondering if there is an efficient and standard trie-based map implementation in a popular and quality collections library?
I've written my own in the past, but I'd rather go with something standard, if available.
Quick clarification: While my question is general, in the current project I am dealing with a lot of data that is indexed by fully-qualified class name or method signature. Thus, there are many shared prefixes.
You might want to look at the Trie implementation that LimeWire is contributing to Google Guava.
There is no trie data structure in the core Java libraries.
This may be because tries are usually designed to store character strings, while Java data structures are more general, usually holding any Object (defining equality and a hash operation), though they are sometimes limited to Comparable objects (defining an order). There's no common abstraction for "a sequence of symbols," although CharSequence is suitable for character strings, and I suppose you could do something with Iterable for other types of symbols.
Here's another point to consider: when trying to implement a conventional trie in Java, you are quickly confronted with the fact that Java supports Unicode. To have any sort of space efficiency, you have to restrict the strings in your trie to some subset of symbols, or abandon the conventional approach of storing child nodes in an array indexed by symbol. This might be another reason why tries are not considered general-purpose enough for inclusion in the core library, and something to watch out for if you implement your own or use a third-party library.
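As a purely illustrative sketch of that tradeoff (not from any library), a conventional array-indexed node versus a map-based node might look like this:
import java.util.HashMap;
import java.util.Map;

class ArrayTrieNode {
    // Conventional layout: children indexed directly by symbol. Compact and
    // fast, but only practical for a small fixed alphabet (here 'a'..'z').
    ArrayTrieNode[] children = new ArrayTrieNode[26];
    boolean isWord;
}

class MapTrieNode {
    // General layout: children keyed by the actual character, so any Unicode
    // character works, at the cost of per-node map overhead.
    Map<Character, MapTrieNode> children = new HashMap<>();
    boolean isWord;
}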
Apache Commons Collections v4.0 now supports trie structures.
See the org.apache.commons.collections4.trie package info for more information. In particular, check the PatriciaTrie class:
Implementation of a PATRICIA Trie (Practical Algorithm to Retrieve Information Coded in Alphanumeric).
A PATRICIA Trie is a compressed Trie. Instead of storing all data at the edges of the Trie (and having empty internal nodes), PATRICIA stores data in every node. This allows for very efficient traversal, insert, delete, predecessor, successor, prefix, range, and select(Object) operations. All operations are performed at worst in O(K) time, where K is the number of bits in the largest item in the tree. In practice, operations actually take O(A(K)) time, where A(K) is the average number of bits of all items in the tree.
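A small usage sketch (hedged; check the commons-collections4 Javadoc for the exact API details). PatriciaTrie implements SortedMap<String, V> and adds prefix queries, which suits keys with long shared prefixes such as fully-qualified class names:
import java.util.SortedMap;
import org.apache.commons.collections4.trie.PatriciaTrie;

public class PatriciaTrieDemo {
    public static void main(String[] args) {
        PatriciaTrie<Integer> trie = new PatriciaTrie<>();
        trie.put("com.example.Foo", 1);
        trie.put("com.example.Bar", 2);
        trie.put("org.other.Baz", 3);

        // Every entry whose key starts with "com.example"
        SortedMap<String, Integer> matches = trie.prefixMap("com.example");
        System.out.println(matches.keySet()); // [com.example.Bar, com.example.Foo]
    }
}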
Also check out concurrent-trees. They support both Radix and Suffix trees and are designed for high concurrency environments.
I wrote and published a simple and fast implementation here.
What you need is org.apache.commons.collections.FastTreeMap, I think.
Below is a basic HashMap implementation of a Trie. Some people might find this useful...
class Trie {
HashMap<Character, HashMap> root;
public Trie() {
root = new HashMap<Character, HashMap>();
}
public void addWord(String word) {
HashMap<Character, HashMap> node = root;
for (int i = 0; i < word.length(); i++) {
Character currentLetter = word.charAt(i);
if (node.containsKey(currentLetter) == false) {
node.put(currentLetter, new HashMap<Character, HashMap>());
}
node = node.get(currentLetter);
}
}
public boolean containsPrefix(String word) {
HashMap<Character, HashMap> node = root;
for (int i = 0; i < word.length(); i++) {
Character currentLetter = word.charAt(i);
if (node.containsKey(currentLetter)) {
node = node.get(currentLetter);
} else {
return false;
}
}
return true;
}
}
Apache's commons collections:
org.apache.commons.collections4.trie.PatriciaTrie
You can try the Completely Java library; it features a PatriciaTrie implementation. The API is small and easy to get started with, and it's available in the Maven Central repository.
You might look at this TopCoder one as well (registration required...).
If you require a sorted map, then tries are worthwhile.
If you don't, then a hash map is better.
A hash map with string keys can be improved over the standard Java implementation:
Array hash map
If you're not worried about pulling in the Scala library, you can use this space-efficient implementation of a burst trie that I wrote.
https://github.com/nbauernfeind/scala-burst-trie
Here is my implementation; you can find it on GitHub: MyTrie.java
/* usage:
MyTrie trie = new MyTrie();
trie.insert("abcde");
trie.insert("abc");
trie.insert("sadas");
trie.insert("abc");
trie.insert("wqwqd");
System.out.println(trie.contains("abc"));
System.out.println(trie.contains("abcd"));
System.out.println(trie.contains("abcdefg"));
System.out.println(trie.contains("ab"));
System.out.println(trie.getWordCount("abc"));
System.out.println(trie.getAllDistinctWords());
*/
import java.util.*;
public class MyTrie {
private class Node {
public int[] next = new int[26];
public int wordCount;
public Node() {
for(int i=0;i<26;i++) {
next[i] = NULL;
}
wordCount = 0;
}
}
private int curr;
private Node[] nodes;
private List<String> allDistinctWords;
public final static int NULL = -1;
public MyTrie() {
nodes = new Node[100000];
nodes[0] = new Node();
curr = 1;
}
private int getIndex(char c) {
return (int)(c - 'a');
}
private void depthSearchWord(int x, String currWord) {
for(int i=0;i<26;i++) {
int p = nodes[x].next[i];
if(p != NULL) {
String word = currWord + (char)(i + 'a');
if(nodes[p].wordCount > 0) {
allDistinctWords.add(word);
}
depthSearchWord(p, word);
}
}
}
public List<String> getAllDistinctWords() {
allDistinctWords = new ArrayList<String>();
depthSearchWord(0, "");
return allDistinctWords;
}
public int getWordCount(String str) {
int len = str.length();
int p = 0;
for(int i=0;i<len;i++) {
int j = getIndex(str.charAt(i));
if(nodes[p].next[j] == NULL) {
return 0;
}
p = nodes[p].next[j];
}
return nodes[p].wordCount;
}
public boolean contains(String str) {
int len = str.length();
int p = 0;
for(int i=0;i<len;i++) {
int j = getIndex(str.charAt(i));
if(nodes[p].next[j] == NULL) {
return false;
}
p = nodes[p].next[j];
}
return nodes[p].wordCount > 0;
}
public void insert(String str) {
int len = str.length();
int p = 0;
for(int i=0;i<len;i++) {
int j = getIndex(str.charAt(i));
if(nodes[p].next[j] == NULL) {
nodes[curr] = new Node();
nodes[p].next[j] = curr;
curr++;
}
p = nodes[p].next[j];
}
nodes[p].wordCount++;
}
}
I have just tried my own concurrent trie implementation, but it is not based on characters; it is based on hash codes. We can still use it by keeping a map of maps keyed on each character's hash code.
You can test this using the code at: https://github.com/skanagavelu/TrieHashMap/blob/master/src/TrieMapPerformanceTest.java
https://github.com/skanagavelu/TrieHashMap/blob/master/src/TrieMapValidationTest.java
import java.util.concurrent.atomic.AtomicReferenceArray;
public class TrieMap {
public static int SIZEOFEDGE = 4;
public static int OSIZE = 5000;
}
abstract class Node {
public Node getLink(String key, int hash, int level){
throw new UnsupportedOperationException();
}
public Node createLink(int hash, int level, String key, String val) {
throw new UnsupportedOperationException();
}
public Node removeLink(String key, int hash, int level){
throw new UnsupportedOperationException();
}
}
class Vertex extends Node {
String key;
volatile String val;
volatile Vertex next;
public Vertex(String key, String val) {
this.key = key;
this.val = val;
}
@Override
public boolean equals(Object obj) {
Vertex v = (Vertex) obj;
return this.key.equals(v.key);
}
@Override
public int hashCode() {
return key.hashCode();
}
@Override
public String toString() {
return key +"#"+key.hashCode();
}
}
class Edge extends Node {
volatile AtomicReferenceArray<Node> array; //This is needed to ensure array elements are volatile
public Edge(int size) {
array = new AtomicReferenceArray<Node>(8);
}
@Override
public Node getLink(String key, int hash, int level){
int index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
Node returnVal = array.get(index);
for(;;) {
if(returnVal == null) {
return null;
}
else if((returnVal instanceof Vertex)) {
Vertex node = (Vertex) returnVal;
for(;node != null; node = node.next) {
if(node.key.equals(key)) {
return node;
}
}
return null;
} else { //instanceof Edge
level = level + 1;
index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
Edge e = (Edge) returnVal;
returnVal = e.array.get(index);
}
}
}
@Override
public Node createLink(int hash, int level, String key, String val) { //Remove size
for(;;) { //Repeat the work on the current node, since some other thread modified this node
int index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
Node nodeAtIndex = array.get(index);
if ( nodeAtIndex == null) {
Vertex newV = new Vertex(key, val);
boolean result = array.compareAndSet(index, null, newV);
if(result == Boolean.TRUE) {
return newV;
}
//continue; since new node is inserted by other thread, hence repeat it.
}
else if(nodeAtIndex instanceof Vertex) {
Vertex vrtexAtIndex = (Vertex) nodeAtIndex;
int newIndex = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, vrtexAtIndex.hashCode(), level+1);
int newIndex1 = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level+1);
Edge edge = new Edge(Base10ToBaseX.Base.BASE8.getLevelZeroMask()+1);
if(newIndex != newIndex1) {
Vertex newV = new Vertex(key, val);
edge.array.set(newIndex, vrtexAtIndex);
edge.array.set(newIndex1, newV);
boolean result = array.compareAndSet(index, vrtexAtIndex, edge); //REPLACE vertex to edge
if(result == Boolean.TRUE) {
return newV;
}
//continue; since vrtexAtIndex may be removed or changed to Edge already.
} else if(vrtexAtIndex.key.hashCode() == hash) {//vrtex.hash == hash) { HERE newIndex == newIndex1
synchronized (vrtexAtIndex) {
boolean result = array.compareAndSet(index, vrtexAtIndex, vrtexAtIndex); //Double check this vertex is not removed.
if(result == Boolean.TRUE) {
Vertex prevV = vrtexAtIndex;
for(;vrtexAtIndex != null; vrtexAtIndex = vrtexAtIndex.next) {
prevV = vrtexAtIndex; // prevV is used to handle when vrtexAtIndex reached NULL
if(vrtexAtIndex.key.equals(key)){
vrtexAtIndex.val = val;
return vrtexAtIndex;
}
}
Vertex newV = new Vertex(key, val);
prevV.next = newV; // Within SYNCHRONIZATION since prevV.next may be added with some other.
return newV;
}
//Continue; vrtexAtIndex got changed
}
} else { //HERE newIndex == newIndex1 BUT vrtex.hash != hash
edge.array.set(newIndex, vrtexAtIndex);
boolean result = array.compareAndSet(index, vrtexAtIndex, edge); //REPLACE vertex to edge
if(result == Boolean.TRUE) {
return edge.createLink(hash, (level + 1), key, val);
}
}
}
else { //instanceof Edge
return nodeAtIndex.createLink(hash, (level + 1), key, val);
}
}
}
@Override
public Node removeLink(String key, int hash, int level){
for(;;) {
int index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
Node returnVal = array.get(index);
if(returnVal == null) {
return null;
}
else if((returnVal instanceof Vertex)) {
synchronized (returnVal) {
Vertex node = (Vertex) returnVal;
if(node.next == null) {
if(node.key.equals(key)) {
boolean result = array.compareAndSet(index, node, null);
if(result == Boolean.TRUE) {
return node;
}
continue; //Vertex may be changed to Edge
}
return null; //Nothing found; This is not the same vertex we are looking for. Here hashcode is same but key is different.
} else {
if(node.key.equals(key)) { //Removing the first node in the link
boolean result = array.compareAndSet(index, node, node.next);
if(result == Boolean.TRUE) {
return node;
}
continue; //Vertex(node) may be changed to Edge, so try again.
}
Vertex prevV = node; // prevV is used to handle when vrtexAtIndex is found and to be removed from its previous
node = node.next;
for(;node != null; prevV = node, node = node.next) {
if(node.key.equals(key)) {
prevV.next = node.next; //Removing other than first node in the link
return node;
}
}
return null; //Nothing found in the linked list.
}
}
} else { //instanceof Edge
return returnVal.removeLink(key, hash, (level + 1));
}
}
}
}
class Base10ToBaseX {
public static enum Base {
/**
* An Integer is represented in 32 bits on a 32-bit machine.
* We can split those bits into groups of 1, 2, 4, 8, or 16 bits
*/
BASE2(1,1,32), BASE4(3,2,16), BASE8(7,3,11)/* OCTAL*/, /*BASE10(3,2),*/
BASE16(15, 4, 8){
public String getFormattedValue(int val){
switch(val) {
case 10:
return "A";
case 11:
return "B";
case 12:
return "C";
case 13:
return "D";
case 14:
return "E";
case 15:
return "F";
default:
return "" + val;
}
}
}, /*BASE32(31,5,1),*/ BASE256(255, 8, 4), /*BASE512(511,9),*/ Base65536(65535, 16, 2);
private int LEVEL_0_MASK;
private int LEVEL_1_ROTATION;
private int MAX_ROTATION;
Base(int levelZeroMask, int levelOneRotation, int maxPossibleRotation) {
this.LEVEL_0_MASK = levelZeroMask;
this.LEVEL_1_ROTATION = levelOneRotation;
this.MAX_ROTATION = maxPossibleRotation;
}
int getLevelZeroMask(){
return LEVEL_0_MASK;
}
int getLevelOneRotation(){
return LEVEL_1_ROTATION;
}
int getMaxRotation(){
return MAX_ROTATION;
}
String getFormattedValue(int val){
return "" + val;
}
}
public static int getBaseXValueOnAtLevel(Base base, int on, int level) {
if(level > base.getMaxRotation() || level < 1) {
return 0; //INVALID Input
}
int rotation = base.getLevelOneRotation();
int mask = base.getLevelZeroMask();
if(level > 1) {
rotation = (level-1) * rotation;
mask = mask << rotation;
} else {
rotation = 0;
}
return (on & mask) >>> rotation;
}
}