Why use a tree structure to support searching?

For example, the BST class uses a tree structure:
public class BST<Key extends Comparable<Key>,Value> {
private Node root;
public class Node {
private int n;
private Node left;
private Node right;
private Key key;
private Value val;
}
public void put(Key key, Value val) { /* ... */ }
public Value get(Key key) { /* ... */ }
}
In this structure, nodes are connected through the left and right fields, and insertion and search are supported by the put and get methods.
I'm interested in knowing:
how are the nodes created?
why use a tree structure rather than an array or some other simple structure?

The key advantage of a tree structure for searching is efficiency. You can reach the node matching the key you are searching for much more quickly than by iterating through an unsorted array (for example). If you have a balanced binary tree (i.e. the depth of the tree to the left and right of any node differs by no more than 1) of 1024 items, then you need to step through only about 10 nodes (log2 1024 = 10) to find the key. There are special cases for arrays (such as sorted or evenly distributed values) that can be searched just as efficiently, but trees are very good for a wide variety of situations which involve many more searches than inserts.
On the other hand, inserting values into trees can be inefficient, particularly due to the need to rebalance the tree to allow efficient searching (which is their main purpose). This makes the put method relatively complicated (depending on which balancing algorithm you use).
The get method is pretty trivial:
public Value get(Key key) {
Node node = root;
while (node != null) {
int compare = key.compareTo(node.key);
if (compare > 0)
node = node.right;
else if (compare < 0)
node = node.left;
else
return node.val;
}
return null;
}
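To put a number on the 1024-item case, here is a minimal sketch that builds a height-balanced tree over the keys 0..n-1 and counts how many nodes the same search loop visits in the worst case. The class SearchCostDemo and its Node type are made up for this illustration (they are not the BST class from the question), and the keys are plain ints to keep it short.

```java
final class SearchCostDemo {
    // Hypothetical minimal node type, not the BST.Node from the question.
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Builds a height-balanced tree holding the keys lo..hi (inclusive).
    static Node build(int lo, int hi) {
        if (lo > hi) return null;
        int mid = lo + (hi - lo) / 2;
        Node n = new Node(mid);
        n.left = build(lo, mid - 1);
        n.right = build(mid + 1, hi);
        return n;
    }

    // Same loop as get(), but returns the number of nodes visited.
    static int nodesVisited(Node root, int key) {
        int visited = 0;
        Node node = root;
        while (node != null) {
            visited++;
            if (key > node.key) node = node.right;
            else if (key < node.key) node = node.left;
            else return visited;
        }
        return visited;
    }

    // Worst case over every key stored in a balanced tree of n keys.
    static int worstCase(int n) {
        Node root = build(0, n - 1);
        int worst = 0;
        for (int k = 0; k < n; k++) {
            worst = Math.max(worst, nodesVisited(root, k));
        }
        return worst;
    }
}
```

Under these assumptions worstCase(1023) is 10 and worstCase(1024) is 11, compared with up to 1024 comparisons for a linear scan of an unsorted array of the same size.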

This is very well explained here and here. To summarize - this is how the nodes are created.
/**
* Inserts the key-value pair into the symbol table, overwriting the old value
* with the new value if the key is already in the symbol table.
* If the value is <tt>null</tt>, this effectively deletes the key from the symbol table.
*
* @param key the key
* @param val the value
* @throws NullPointerException if <tt>key</tt> is <tt>null</tt>
*/
public void put(Key key, Value val) {
if (val == null) {
delete(key);
return;
}
root = put(root, key, val);
assert check();
}
private Node put(Node x, Key key, Value val) {
if (x == null) return new Node(key, val, 1);
int cmp = key.compareTo(x.key);
if (cmp < 0) x.left = put(x.left, key, val);
else if (cmp > 0) x.right = put(x.right, key, val);
else x.val = val;
x.N = 1 + size(x.left) + size(x.right);
return x;
}
If the key already exists, its value is replaced. If the key is new, the tree is searched recursively until the search ends at a null link below a leaf, and the new node is attached there. It's important to note that the final position of the node is the place where a search would have found it if it were already in the tree, which is what preserves the BST property.
Also, due to the properties of a binary search tree, the search operation is O(h), where h is the height of the tree; this is O(log n) for n nodes when the tree is balanced, but can degrade to O(n) when it is not.
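The dependence of search cost on the height of the tree can be seen with a small sketch. It assumes a key-only BST with int keys and no rebalancing; UnbalancedBstDemo and its methods are hypothetical names for this illustration, not part of the code quoted above.

```java
final class UnbalancedBstDemo {
    // Hypothetical key-only node, just enough to measure tree height.
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Plain recursive BST insert with no rebalancing.
    static Node put(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key) x.left = put(x.left, key);
        else if (key > x.key) x.right = put(x.right, key);
        return x; // equal keys: nothing to add in this key-only sketch
    }

    // Number of nodes on the longest path from x down to a leaf.
    static int height(Node x) {
        return x == null ? 0 : 1 + Math.max(height(x.left), height(x.right));
    }

    // Inserts the keys in the given order and reports the resulting height.
    static int heightAfterInserting(int... keys) {
        Node root = null;
        for (int k : keys) {
            root = put(root, k);
        }
        return height(root);
    }
}
```

Inserting 1, 2, 3, 4, 5, 6, 7 in sorted order produces a chain of height 7, so a search can visit every node. Inserting the same keys as 4, 2, 6, 1, 3, 5, 7 produces a tree of height 3.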

Related

How to iterate through a list of nodes which might have sub-lists of nodes (unknown depth levels)

I have a list of nodes, and each node might have a list of subNodes (the number of levels are unknown):
class Node {
int score;
boolean selected;
List<Node> subNodes;
}
Here's what a hypothetical structure might look like:
NODE
+ NODE
+ NODE
+ NODE
+ NODE
+ NODE
+ NODE
+ NODE
+ NODE
+ NODE
+ NODE
+ NODE
Combinations are just countless. I need a way to sum NODE.score for all those nodes that have NODE.selected set to true, possibly using Java 8 features. Any hints would be really appreciated.

Something like:
public int recursiveTotal(final Node node) {
//node not select, don't count the node or any of its subnodes
if (!node.selected) {
return 0;
}
//no subnodes, only node score counts
if (node.subNodes.isEmpty()) {
return node.score;
}
//node has subnodes, recursively count subnode score + parent node score
int totalScore = node.score;
for (final Node subNode : node.subNodes) {
totalScore += recursiveTotal(subNode);
}
return totalScore;
}
Coded using stackoverflow as an IDE, no guarantee against compilation errors ;)

Create a recursive method in your Node class which returns a stream of nodes concatenating a stream of the parent node and the sub nodes:
class Node {
int score;
boolean selected;
List<Node> subNodes;
public Stream<Node> streamNodes() {
return Stream.concat(Stream.of(this), subNodes.stream().flatMap(Node::streamNodes));
}
}
and use it like below to stream over your list:
List<Node> myNodes = //your list
int sum = myNodes.stream()
.flatMap(Node::streamNodes)
.filter(Node::isSelected)
.mapToInt(Node::getScore)
.sum();
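The snippet above relies on isSelected() and getScore() accessors that the Node class as posted does not declare. A self-contained sketch of the same idea follows; the class name ScoreNode, its constructor, and the add() helper are assumptions made only so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

final class ScoreNode {
    private final int score;
    private final boolean selected;
    private final List<ScoreNode> subNodes = new ArrayList<>();

    ScoreNode(int score, boolean selected) {
        this.score = score;
        this.selected = selected;
    }

    int getScore() { return score; }

    boolean isSelected() { return selected; }

    // Hypothetical helper: appends a child and returns this node for chaining.
    ScoreNode add(ScoreNode child) {
        subNodes.add(child);
        return this;
    }

    // This node followed by all of its descendants, depth first.
    Stream<ScoreNode> streamNodes() {
        return Stream.concat(
                Stream.of(this),
                subNodes.stream().flatMap(ScoreNode::streamNodes));
    }

    // Sums the score of every selected node reachable from the given roots.
    static int sumSelected(List<ScoreNode> roots) {
        return roots.stream()
                .flatMap(ScoreNode::streamNodes)
                .filter(ScoreNode::isSelected)
                .mapToInt(ScoreNode::getScore)
                .sum();
    }
}
```

Note that this counts a selected node even when its parent is not selected, whereas recursiveTotal in the earlier answer skips the whole subtree of an unselected node. Which behaviour is wanted depends on the requirements.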

TL;DR
Judging by the structure you've provided, each Node in your List is the root of an N-ary tree data structure (I assume that there are no cycles).
To get the required data, we can use one of the classic tree-traversal algorithms. When the average depth is lower than the average width, depth-first search is more suitable because it is more space-efficient; in the opposite situation, breadth-first search is the better choice. I'll go with DFS.
It's easier to come up with a recursive implementation, so I'll start with it. But recursion in Java risks a StackOverflowError on very deep trees, so we will then proceed with a couple of improvements.
Streams + recursion
You can create a helper-method responsible for flattening the nodes which would be called from the stream.
List<Node> nodes = // initializing the list
long totalScore = nodes.stream()
.flatMap(node -> flatten(node).stream())
.filter(Node::isSelected)
.mapToLong(Node::getScore)
.sum();
Recursive auxiliary method:
public static List<Node> flatten(Node node) {
if (node.getSubNodes().isEmpty()) {
return List.of(node);
}
List<Node> result = new ArrayList<>();
result.add(node);
node.getSubNodes().forEach(n -> result.addAll(flatten(n)));
return result;
}
No recursion
To avoid a StackOverflowError, the method flatten() can be implemented without recursion by iteratively pushing nodes onto, and polling them from, a stack (represented by an ArrayDeque).
public static List<Node> flatten(Node node) {
List<Node> result = new ArrayList<>();
Deque<Node> stack = new ArrayDeque<>();
stack.add(node);
while (!stack.isEmpty()) {
Node current = stack.poll();
result.add(current);
current.getSubNodes().forEach(stack::push);
}
return result;
}
No recursion & No intermediate data allocation
Allocating intermediate lists of nodes, many of which will never contribute to the result, is wasteful.
Instead, we can make the auxiliary method responsible for calculating the total score, i.e. the sum of the scores of all selected nodes in the tree.
For that, we need to check isSelected() while traversing the tree.
List<Node> nodes = // initializing the list
long totalScore = nodes.stream()
.mapToLong(node -> getScore(node))
.sum();
public static long getScore(Node node) {
long total = 0;
Deque<Node> stack = new ArrayDeque<>();
stack.push(node);
while (!stack.isEmpty()) {
Node current = stack.poll();
if (current.isSelected()) total += current.getScore();
current.getSubNodes().forEach(stack::push);
}
return total;
}

trouble understanding implementation of hash table with chaining

I'm studying a hash table with chaining in Java by reading its implementation. My trouble is with the get() method. An index is computed as key.hashCode() % table.length. Assume that the table size is 10 and key.hashCode() is 124, so the index is 4. The for-each loop over table[index] then starts from table[4]. AFAIK the index is incremented one by one: 4, 5, 6, 7, and so on. But what about indices 0, 1, 2, 3? Are they checked? (I think not.) Isn't it possible for the key to be at one of those indices? (I think so.) The other issue is that there are null checks, but initially nothing assigns null to a key or value. So how can the checks work? Is null assigned as soon as private LinkedList<Entry<K, V>>[] table is declared?
// Data Structures: Abstraction and Design Using Java, Koffman, Wolfgang
package KW.CH07;
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
/**
* Hash table implementation using chaining.
* @param <K> The key type
* @param <V> The value type
* @author Koffman and Wolfgang
**/
public class HashtableChain<K, V>
// Insert solution to programming project 7, chapter -1 here
implements KWHashMap<K, V> {
/** The table */
private LinkedList<Entry<K, V>>[] table;
/** The number of keys */
private int numKeys;
/** The capacity */
private static final int CAPACITY = 101;
/** The maximum load factor */
private static final double LOAD_THRESHOLD = 3.0;
// Note this is equivalent to java.util.AbstractMap.SimpleEntry
/** Contains key-value pairs for a hash table.
@param <K> the key type
@param <V> the value type
*/
public static class Entry<K, V>
// Insert solution to programming project 6, chapter -1 here
{
/** The key */
private final K key;
/** The value */
private V value;
/**
* Creates a new key-value pair.
* @param key The key
* @param value The value
*/
public Entry(K key, V value) {
this.key = key;
this.value = value;
}
/**
* Retrieves the key.
* @return The key
*/
@Override
public K getKey() {
return key;
}
/**
* Retrieves the value.
* @return The value
*/
@Override
public V getValue() {
return value;
}
/**
* Sets the value.
* @param val The new value
* @return The old value
*/
@Override
public V setValue(V val) {
V oldVal = value;
value = val;
return oldVal;
}
// Insert solution to programming exercise 3, section 4, chapter 7 here
}
// Constructor
public HashtableChain() {
table = new LinkedList[CAPACITY];
}
// Constructor for test purposes
HashtableChain(int capacity) {
table = new LinkedList[capacity];
}
/**
* Method get for class HashtableChain.
* @param key The key being sought
* @return The value associated with this key if found;
* otherwise, null
*/
@Override
public V get(Object key) {
int index = key.hashCode() % table.length;
if (index < 0) {
index += table.length;
}
if (table[index] == null) {
return null; // key is not in the table.
}
// Search the list at table[index] to find the key.
for (Entry<K, V> nextItem : table[index]) {
if (nextItem.getKey().equals(key)) {
return nextItem.getValue();
}
}
// assert: key is not in the table.
return null;
}
/**
* Method put for class HashtableChain.
* @post This key-value pair is inserted in the
* table and numKeys is incremented. If the key is already
* in the table, its value is changed to the argument
* value and numKeys is not changed.
* @param key The key of item being inserted
* @param value The value for this key
* @return The old value associated with this key if
* found; otherwise, null
*/
@Override
public V put(K key, V value) {
int index = key.hashCode() % table.length;
if (index < 0) {
index += table.length;
}
if (table[index] == null) {
// Create a new linked list at table[index].
table[index] = new LinkedList<>();
}
// Search the list at table[index] to find the key.
for (Entry<K, V> nextItem : table[index]) {
// If the search is successful, replace the old value.
if (nextItem.getKey().equals(key)) {
// Replace value for this key.
V oldVal = nextItem.getValue();
nextItem.setValue(value);
return oldVal;
}
}
// assert: key is not in the table, add new item.
table[index].addFirst(new Entry<>(key, value));
numKeys++;
if (numKeys > (LOAD_THRESHOLD * table.length)) {
rehash();
}
return null;
}
/** Returns true if empty
@return true if empty
*/
@Override
public boolean isEmpty() {
return numKeys == 0;
}
}

Assume that the table size is 10 and key.hashCode() is 124, so the index is 4. The for-each loop over table[index] then starts from table[4].
Correct.
there are null checks, but initially nothing assigns null to a key or value. So how can the checks work?
When an array of objects is created (here by new LinkedList[CAPACITY] in the constructor), every element is initialized to null.
the index is incremented one by one: 4, 5, 6, 7, and so on. But what about indices 0, 1, 2, 3? Are they checked? (I think not.) Isn't it possible for the key to be at one of those indices? (I think so.)
There seems to be a misunderstanding here. The for-each loop iterates over the entries of the one LinkedList stored at table[index]; it never moves on to table[5], table[6], and so on. Think of the data structure like this (with data already added to it):
table:
[0] -> null
[1] -> LinkedList -> item 1 -> item 2 -> item 3
[2] -> LinkedList -> item 1
[3] -> null
[4] -> LinkedList -> item 1 -> item 2
[5] -> LinkedList -> item 1 -> item 2 -> item 3 -> item 4
[6] -> null
Another important point is that the hash code for a given key should not change, so for a given table length it will always map to the same index in the table.
So say we call get with a key whose hash code maps it to 3; then we know that it's not in the table:
if (table[index] == null) {
return null; // key is not in the table.
}
If another key comes in that maps to 1, now we need to iterate over the LinkedList:
// LinkedList<Entry<K, V>> list = table[index]
for (Entry<K, V> nextItem : table[index]) {
// iterate over item 1, item 2, item 3 until we find one that is equal.
if (nextItem.getKey().equals(key)) {
return nextItem.getValue();
}
}
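Both points (the index arithmetic and the initial nulls) can be checked in isolation. The sketch below copies the remainder-and-fix-up arithmetic from get() and put(); the class name BucketIndexDemo and the String element type are assumptions made for the illustration.

```java
import java.util.LinkedList;

final class BucketIndexDemo {
    // Same arithmetic as get() and put(): take the remainder,
    // then shift a negative result back into the range 0..tableLength-1.
    static int bucketIndex(int hashCode, int tableLength) {
        int index = hashCode % tableLength;
        if (index < 0) {
            index += tableLength;
        }
        return index;
    }

    // Creates an empty bucket array. Java sets every slot to null,
    // so no explicit null assignment is needed before the null checks.
    @SuppressWarnings("unchecked")
    static LinkedList<String>[] newTable(int capacity) {
        return (LinkedList<String>[]) new LinkedList[capacity];
    }
}
```

With a table length of 10, a hash code of 124 gives index 4 and a hash code of -7 gives index 3, which is the same result Math.floorMod(-7, 10) returns. Every slot of newTable(10) is null until put() creates a list there.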

I think you aren't quite visualizing your hash table correctly. There are two equally good simple implementations of a hash table.
Method 1 uses linked lists: An array (well, Vector, actually) of linked lists.
Given a "key", you derive a hash value for that key(*). You take the remainder of that hash value relative to the current size of the vector, let's call that "x". Then you sequentially search the linked list that vector[x] points to for a match to your key.
(*) You hope that the hash values will be reasonably well-distributed. There are complex algorithms for doing this. Let's hope the hashCode() implementation of your key class does a good job of this.
Method 2 avoids linked lists: you create a Vector and compute an index into the Vector (as above). Then you look at Vector.get(x). If that's the key you want, you return the corresponding value. Let's assume it's not. Then you look at Vector.get(x+1), Vector.get(x+2), etc., wrapping around to index 0 when you reach the end. Eventually, one of the following three things will happen:
a) You find the key you are looking for. Then you return the corresponding value.
b) you find an empty entry (key == null). Return null or whatever value you have chosen to mean "this isn't the droid you're looking for".
c) you have examined every entry in the Vector. Again, return null or whatever.
Checking for (c) is a precaution, so that if the hash table happens to be full you won't loop forever. If the hash table is about to be full (you can keep a count of how many entries have been used), you should reallocate a bigger hash table. Ideally, you want to keep the hash table sparse enough that you never get anywhere near searching the whole table: that would defeat the whole purpose of a hash table, which is that you can search it in much less than linear time, ideally in constant time (that is, the number of comparisons is <= a small constant). I would suggest that you allocate a Vector that is at least 10x the number of entries you expect to put in it.
The word "chaining" in your question refers to the first type of hash table, which is what the code you posted implements. The index-by-index scan (4, 5, 6, 7, ...) that you describe is how the second type works.
Btw, 10 is a poor choice for the size of a hash table that indexes by remainder. A prime size is usually recommended there, because it spreads poorly distributed hash codes more evenly.
Hope this helps.
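Method 2 (usually called open addressing with linear probing) can be sketched as follows. This is only an illustration under simplifying assumptions: String keys, Integer values, plain arrays instead of a Vector, a fixed capacity, and no removal or resizing. The class name LinearProbingMap is made up.

```java
final class LinearProbingMap {
    private final String[] keys;
    private final Integer[] values;

    LinearProbingMap(int capacity) {
        keys = new String[capacity];
        values = new Integer[capacity];
    }

    // Non-negative remainder of the hash code, i.e. the "x" from the text.
    private int indexFor(String key) {
        return Math.floorMod(key.hashCode(), keys.length);
    }

    // Stores the pair; returns false if the table is full and the key is absent.
    boolean put(String key, int value) {
        int start = indexFor(key);
        for (int step = 0; step < keys.length; step++) {
            int i = (start + step) % keys.length; // x, x+1, x+2, ... with wrap-around
            if (keys[i] == null) {                // free slot: insert here
                keys[i] = key;
                values[i] = value;
                return true;
            }
            if (keys[i].equals(key)) {            // existing key: overwrite
                values[i] = value;
                return true;
            }
        }
        return false;
    }

    // Returns the value for key, or null if the key is not present.
    Integer get(String key) {
        int start = indexFor(key);
        for (int step = 0; step < keys.length; step++) {
            int i = (start + step) % keys.length;
            if (keys[i] == null) return null;          // case (b): empty entry
            if (keys[i].equals(key)) return values[i]; // case (a): found
        }
        return null;                                   // case (c): every entry examined
    }
}
```

The strings "Aa" and "BB" have the same hashCode() in Java, so they start probing at the same index; the second one inserted lands in the next free slot and get() still finds both.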

How can HashMap ensure the same index when a duplicate key is added with a different `tab.length`?

The following piece of code is used to add an element to a HashMap (from the Android 5.1.1 source tree). I'm very confused by this statement: int index = hash & (tab.length - 1);. How can this map ensure the same index when a duplicate key is added with a different tab.length?
For example, assume that we have a new, empty HashMap hMap. First, we add the pair ("1","1") to it; assume tab.length equals 1 at this time. Then we add many pairs to this map; assume tab.length now equals "x". Now we add a duplicate pair ("1","1") to it. Notice that tab.length has changed, so the value of int index = hash & (tab.length - 1); may also have changed.
/**
* Maps the specified key to the specified value.
*
* @param key
* the key.
* @param value
* the value.
* @return the value of any previous mapping with the specified key or
* {@code null} if there was no such mapping.
*/
@Override public V put(K key, V value) {
if (key == null) {
return putValueForNullKey(value);
}
int hash = Collections.secondaryHash(key);
HashMapEntry<K, V>[] tab = table;
int index = hash & (tab.length - 1);
for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
if (e.hash == hash && key.equals(e.key)) {
preModify(e);
V oldValue = e.value;
e.value = value;
return oldValue;
}
}
// No entry for (non-null) key is present; create one
modCount++;
if (size++ > threshold) {
tab = doubleCapacity();
index = hash & (tab.length - 1);
}
addNewEntry(key, value, hash, index);
return null;
}

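When the table needs to be rebuilt (in doubleCapacity()), the index of every existing entry is first recomputed against the new length and the entry is moved to its new bucket, so the stored positions follow the change in the table's length. A later put or get with the same key computes its index against the same new length and therefore looks in the right bucket.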
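A small sketch of that idea, assuming a table whose length is a power of two. It is not the Android doubleCapacity() code: ResizeIndexDemo, indexFor and rebuild are hypothetical names, and the buckets store bare hash values instead of entries.

```java
import java.util.ArrayList;
import java.util.List;

final class ResizeIndexDemo {
    // Same formula as in put(): valid when tableLength is a power of two.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    // Creates a table of newLength buckets and re-places every stored hash
    // at the index computed against the new length.
    static List<List<Integer>> rebuild(List<List<Integer>> oldTable, int newLength) {
        List<List<Integer>> newTable = new ArrayList<>();
        for (int i = 0; i < newLength; i++) {
            newTable.add(new ArrayList<>());
        }
        for (List<Integer> bucket : oldTable) {
            for (int hash : bucket) {
                newTable.get(indexFor(hash, newLength)).add(hash);
            }
        }
        return newTable;
    }
}
```

For a hash of 21, the index is 1 in a table of length 4 and 5 in a table of length 8. After rebuilding from length 4 to length 8, the value sits in bucket 5, which is exactly where a lookup against the new length will search.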

Java pass by value and recursion

I have simple code which prints the path to a specific node in a tree. My implementation using Java String is as below
//using strings
public static void getPathS(Node node,String path,int key){
if (node == null) {
return;
} else if(node.data == key) {
System.out.println(path+" "+key);
}
getPathS(node.left,path+" "+node.data,key);
getPathS(node.right,path+" "+node.data,key);
}
Suppose there is a tree as given below.
If I call getPathS with key 3, the above implementation will print
1 34 3 //path from root to the element
If I implement same method using ArrayList as below
public static List getPath(Node node, List<Integer> path, int key) {
if (node == null) {
//1 . path = new ArrayList<Integer>();
path = new ArrayList<Integer>();
// 2. or tried path.clear() -- it should clear the path
//return path;
return null;
} else if (node.data == key) {
path.add(node.data);
return path;
}
path.add(node.data);
return nonNull(getPath(node.left, path, key), getPath(node.right, path, key));
}
private List nonNull(List path1, List path2) {
if (path1 != null)
return path1;
if(path2 !=null )
return path2;
return null;
}
// class Node { Node left, Node right , int data; };
//Code to call getPath
Node node = new Node(1);
node.left = new Node(2);
node.left.left = new Node(4);
node.right = new Node(34);
node.right.right = new Node(3);
System.out.println(getPath(node, new ArrayList(), 3));
In the second implementation, I tried two approaches for the case where we reach a null node. With the 1st approach, where I assign a new ArrayList to path, it prints all the elements, i.e.
[1, 2, 4, 34, 3]
If I use path.clear(), it only prints last element i.e. element to be searched.
How can we make sure ArrayList will work as String in recursion ?

The problem here is that you don't consider failure for both branches with your call to nonNull().
Here is a correction that takes into account that possibility, and removes the data of the current node if we failed to find the key in its children.
public static List<Integer> getPath(Node node, List<Integer> path, int key) {
if (node == null) {
return null;
} else if (node.data == key) {
path.add(node.data);
return path;
}
path.add(node.data);
// path is unchanged if nothing is found in left children
if (getPath(node.left, path, key) != null || getPath(node.right, path, key) != null) {
// found in one branch or the other
return path;
}
// not found in either branch, remove our data
path.remove(path.size() - 1);
return null;
}
Of course, it looks like we're manipulating different lists, but there is only one: the one provided as argument the first time. That's why data should be removed from it. You need to be clear about your arguments.
A cleaner solution, that emphasizes the fact that there is one list only.
/**
* Appends to the specified list all keys from {@code node} to the {@link Node} containing the
* specified {@code key}. If the key is not found in the specified node's children, the list is
* guaranteed to be unchanged. If the key is found among the children, then the specified list
* will contain the new elements (in addition to the old ones).
*
* @param node
* the node to start at
* @param path
* the current path to append data to
* @param key
* the key to stop at
* @return true if the key was found among the specified node's children, false otherwise
*/
public static boolean getPath(Node node, List<Integer> path, int key) {
if (node == null) {
// leaf reached, and the key was not found
return false;
}
// add data to the path
path.add(node.data);
// the OR is lazy here, so we treat everything in the given order
// if getPath failed on the left children, path is unchanged and used for right children
if (node.data == key || getPath(node.left, path, key) || getPath(node.right, path, key)) {
// the key is found in the current node, its left children, or its right children
return true;
}
// not found in either branch, remove our data
path.remove(path.size() - 1);
return false;
}
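Note that I didn't use path.remove(node.data). With an int argument that call resolves to remove(int index) and removes by position; and even path.remove(Integer.valueOf(node.data)) would be wrong, because there could be more than one node with that data, and the first one would be removed instead of the last.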
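The difference between the three removal calls can be seen on a small list. The class RemoveOverloadDemo and its method names are made up for this illustration; each method works on a copy so the results can be compared side by side.

```java
import java.util.ArrayList;
import java.util.List;

final class RemoveOverloadDemo {
    // What the corrected getPath does: drop the most recently added element.
    static List<Integer> removeLast(List<Integer> source) {
        List<Integer> path = new ArrayList<>(source);
        path.remove(path.size() - 1);
        return path;
    }

    // remove(Object): drops the FIRST element equal to value.
    static List<Integer> removeByValue(List<Integer> source, int value) {
        List<Integer> path = new ArrayList<>(source);
        path.remove(Integer.valueOf(value));
        return path;
    }

    // remove(int): an int argument is treated as an index, not as a value.
    static List<Integer> removeWithIntArgument(List<Integer> source, int argument) {
        List<Integer> path = new ArrayList<>(source);
        path.remove(argument);
        return path;
    }
}
```

Starting from [3, 7, 3], removeLast gives [3, 7], removeByValue with 3 gives [7, 3], and removeWithIntArgument with 1 gives [3, 3]. Only the first keeps the path consistent when backtracking.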

Using String as BST key value

I have a binary search tree which stores objects. For inserting objects into it I am using an int value as a key. I get that value by calling a method from the object like this:
public class Tree
{
// The root node of the tree which is null;
private Node root;
private double largest;
private Node insert (Node tree, Element d)
{
if (tree == null) return new Node(d);
else if (d.getPlaceInTable() < tree.data.getPlaceInTable()) tree.left = insert (tree.left, d);
else if (d.getPlaceInTable() > tree.data.getPlaceInTable()) tree.right = insert (tree.right, d);
return tree;
}
public void insert (Element d)
{
root = insert (root, d);
}
}
But what if I want to use the Element's name, which is a String, as the key? How can I do it? Should I use the compareTo() method? I know how to compare string1.compareTo(string2), but I really don't have any idea how I can use it in this case. If you have any suggestions, I would really appreciate them.

Yes, String implements Comparable, so you can do
d.getName().compareTo(tree.data.getName()) < 0 for the left node and
d.getName().compareTo(tree.data.getName()) >= 0 for the right node
Also note that your original code does not insert anything into the tree when the keys are equal.
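Put together, an insert keyed on the name might look like the sketch below. The Element class here is a hypothetical stand-in with only a getName() method, NameTree and namesInOrder() are made-up names, and elements with equal names are sent to the right subtree as suggested above.

```java
import java.util.ArrayList;
import java.util.List;

final class NameTree {
    // Hypothetical stand-in for the question's Element class.
    static final class Element {
        private final String name;
        Element(String name) { this.name = name; }
        String getName() { return name; }
    }

    private static final class Node {
        final Element data;
        Node left, right;
        Node(Element data) { this.data = data; }
    }

    private Node root;

    void insert(Element d) {
        root = insert(root, d);
    }

    private Node insert(Node tree, Element d) {
        if (tree == null) return new Node(d);
        // compareTo is negative when d's name sorts before the node's name.
        int cmp = d.getName().compareTo(tree.data.getName());
        if (cmp < 0) tree.left = insert(tree.left, d);
        else tree.right = insert(tree.right, d); // cmp >= 0: equal names go right
        return tree;
    }

    // In-order traversal: returns the names in ascending compareTo order.
    List<String> namesInOrder() {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<String> out) {
        if (n == null) return;
        collect(n.left, out);
        out.add(n.data.getName());
        collect(n.right, out);
    }
}
```

String.compareTo is case-sensitive and compares character codes, so "Zebra" sorts before "apple". If that is not wanted, compareToIgnoreCase is one alternative.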
