How do I parallelise this DFS? - java

I have a binary tree where each node has a value of 0 or 1, and each path from the root to a leaf represents a binary string of a certain length. The aim of the program is to find all possible binary strings (i.e. all possible paths from root to leaf). Now I want to parallelise it so that it can use multiple cores. I assume I need to somehow split up the workload at the branch nodes, but I have no idea where to begin. I am looking at the ForkJoin functionality, but I don't know how to split up the work and then combine the results.
public class Tree{
Node root;
int levels;
Tree(int v){
root = new Node(v);
levels = 1;
}
Tree(){
root = null;
levels = 0;
}
public static void main(String[] args){
Tree tree = new Tree(0);
populate(tree, tree.root, tree.levels);
tree.printPaths(tree.root);
}
public static void populate(Tree t, Node n, int levels){
levels++;
if(levels >6){
n.left = null;
n.right = null;
}
else{
t.levels = levels;
n.left = new Node(0);
n.right = new Node(1);
populate(t, n.left, levels);
populate(t, n.right, levels);
}
}
void printPaths(Node node)
{
int path[] = new int[1000];
printPathsRecur(node, path, 0);
}
void printPathsRecur(Node node, int path[], int pathLen)
{
if (node == null)
return;
/* append this node to the path array */
path[pathLen] = node.value;
pathLen++;
/* it's a leaf, so print the path that led to here */
if (node.left == null && node.right == null)
printArray(path, pathLen);
else
{
/* otherwise try both subtrees */
printPathsRecur(node.left, path, pathLen);
printPathsRecur(node.right, path, pathLen);
}
}
/* Utility function that prints out an array on a line. */
void printArray(int ints[], int len)
{
int i;
for (i = 0; i < len; i++)
{
System.out.print(ints[i] + " ");
}
System.out.println("");
}
}

You can use Thread Pools to efficiently split the load between separate threads.
One possible approach:
ExecutorService service = Executors.newFixedThreadPool(8);
Runnable recursiveRunnable = new Runnable() {
@Override
public void run() {
//your recursive code goes here (for every new branch you have a runnable (recommended to have a custom class implementing Runnable))
}
};
service.execute(recursiveRunnable);
However, this approach is no longer a strict depth-first search, since you enumerate the sub-branches at your current position before finishing the first one. As I understand it, DFS visits nodes in a strictly sequential order and is therefore not entirely parallelizable (feel free to correct me in the comments, though).
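Since the question mentions ForkJoin, here is a minimal sketch of how the work could be split per subtree with a RecursiveTask, with the partial results combined on join. The Node class, the build helper, and the names PathTask and allPaths are assumptions made for this illustration, not code from the question. The traversal is no longer a sequential DFS, but joining the left result before the right one keeps the output in DFS order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ParallelPaths {

    // Minimal stand-in for the question's Node class (assumed shape).
    static class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Each task collects every root-to-leaf string below one node.
    static class PathTask extends RecursiveTask<List<String>> {
        private final Node node;
        private final String prefix;

        PathTask(Node node, String prefix) {
            this.node = node;
            this.prefix = prefix;
        }

        @Override
        protected List<String> compute() {
            List<String> result = new ArrayList<>();
            if (node == null) {
                return result;
            }
            String path = prefix + node.value;
            if (node.left == null && node.right == null) {
                result.add(path); // leaf: one complete binary string
                return result;
            }
            PathTask leftTask = new PathTask(node.left, path);
            PathTask rightTask = new PathTask(node.right, path);
            leftTask.fork();                               // left subtree runs asynchronously
            List<String> rightPaths = rightTask.compute(); // right subtree runs in this thread
            result.addAll(leftTask.join());                // left results first keeps DFS order
            result.addAll(rightPaths);
            return result;
        }
    }

    // Builds a complete tree with the given number of levels (hypothetical helper).
    static Node build(int value, int levels) {
        Node n = new Node(value);
        if (levels > 1) {
            n.left = build(0, levels - 1);
            n.right = build(1, levels - 1);
        }
        return n;
    }

    static List<String> allPaths(Node root) {
        return ForkJoinPool.commonPool().invoke(new PathTask(root, ""));
    }

    public static void main(String[] args) {
        allPaths(build(0, 3)).forEach(System.out::println);
    }
}
```

Forking one task per node is probably too fine-grained for a tree this small; a common refinement is to fork only near the root and fall back to plain recursion below some depth.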

Related

Print out and sort values inside LinkedList

I wrote a recursive backtracking algorithm for the so-called "Coin Change Problem". I store the coin values (int) in a self-written LinkedList ("ll"), and each of those LinkedLists is stored inside one master LinkedList ("ll_total"). Now, when I try to print out the LinkedLists inside the master LinkedList, all I get is "LinkedList@1e88b3c". Can somebody tell me how to modify the code in order to print out the coin values properly?
I would also like the algorithm to choose the LinkedList with the fewest values stored inside, as it would represent the optimal coin combination for the "coin change problem".
import java.util.Scanner;
public class CoinChange_Backtracking {
static int[] coins = {3, 2, 1};
static int index_coins = 0;
static int counter = 0;
static LinkedList ll = new LinkedList();
static LinkedList ll_total = new LinkedList();
public static void main(String[] args) {
Scanner myInput = new Scanner(System.in);
int amount;
System.out.println("Put in the amount of money: ");
amount = myInput.nextInt();
if (amount < 101) {
//Start recursion and display result.
recursiveFunction(coins, amount, index_coins, ll);
ll_total.show_ll();
} else {
System.out.println("The value must be less than 100!");
}
}
public static LinkedList recursiveFunction(int[] coins, int amount, int index_coins, LinkedList ll) {
//The current coin is being omitted. (If index_coins + 1 is still within range.)
if ((index_coins + 1) < coins.length) {
ll = recursiveFunction(coins, amount, index_coins + 1, ll);
ll_total.insert_ll(ll);
for (int i = 0; i < counter; i++) {
ll.deleteAt(0);
}
counter = 0;
}
//The current coin is not being omitted. (If there is still some change left and value of change isn't lower than value of current coin.)
if (amount != 0) {
if (amount >= coins[index_coins]) {
ll.insert(coins[index_coins]);
counter++;
ll = recursiveFunction(coins, amount - coins[index_coins], index_coins, ll);
}
}
return ll;
}
}
public class LinkedList {
Node head;
public void insert(int data) {
Node node = new Node();
node.data = data;
node.next = null;
if (head == null) {
head = node;
} else {
Node n = head;
while(n.next != null) {
n = n.next;
}
n.next = node;
}
}
public void insert_ll(LinkedList ll) {
Node node = new Node();
node.ll = ll;
node.next = null;
if (head == null) {
head = node;
} else {
Node n = head;
while(n.next != null) {
n = n.next;
}
n.next = node;
}
}
public void deleteAt(int index) {
if(index == 0) {
head = head.next;
} else {
Node n = head;
Node n1 = null;
for (int i = 0; i < index - 1; i++) {
n = n.next;
}
n1 = n.next;
n.next = n1.next;
n1 = null;
}
}
public void show() {
Node node = head;
while(node.next != null) {
System.out.println(node.data);
node = node.next;
}
System.out.println(node.data);
}
public void show_ll() {
Node node = head;
while(node.next != null) {
System.out.println(node.ll);
node = node.next;
}
System.out.println(node.ll);
}
//A toString method I tried to implement. Causes an array error.
/*
public String toString() {
Node n = head.next;
String temp = "";
while (n != null) {
temp = temp + n.data + " ";
n = n.next;
}
return temp;
}
*/
}
public class Node {
int data;
LinkedList ll;
Node next;
}
To answer your question: you are printing the linked list object itself, see System.out.println(node.ll);. Your LinkedList class does not override toString(), so Java falls back to Object.toString(), which prints the class name and a hash code.
There are several ways to do it right. One approach is to question why you use Node and LinkedList the way you do. A node can hold a linked list and a linked list can hold a node; I believe this is not really what you wanted. Maybe you can make it work, but from a design point of view, in my experience, it is not good. I find it confusing and it's a great source of bugs.
I'll list some points that caught my eye (or that my IDE caught for me).
You are not closing the Scanner object. Just close it at the end of the program or use try-with-resources.
As mentioned before, you have a linked list that has a node and a node that has a linked list. You are not using that correctly in your program. I recommend reviewing that approach. It is error-prone.
Also, simply use the LinkedList of the Java library unless you have a good reason not to. It works fine and offers all you need.
You use many static, global (within the scope of the package) variables. In this case I think you can avoid that. coins does not need to be passed as a parameter every time. It should be an immutable object, since it is not supposed to change.
...
And I am not sure whether it is a backtracking algorithm. It is certainly tree-recursive. This is just a side note.
I'd like to propose a solution that looks similar to yours. I'd probably do it differently myself, but that version would take more time to understand. I have tried to adopt your style, which I hope helps, and I simplified the program.
In order to print the result, simply write a helper function.
The linked list is an object. You have to make a copy of the list every time you recurse so that each call works on its own object. Otherwise you modify the same object while recursing along different paths.
You can simply use a list of lists: a global list of lists (within package scope), and a working list that you copy every time you recurse. When you reach a good base case, you add the working list to the global list. Otherwise, just ignore it.
import java.util.LinkedList;
import java.util.Scanner;
public class CoinChangeBacktracking {
static final int[] COINS = {3, 2, 1};
static final LinkedList<LinkedList<Integer>> changes = new LinkedList<>();
public static void main(String[] args) {
Scanner myInput = new Scanner(System.in);
int amount;
System.out.println("Put in the amount of money: ");
amount = myInput.nextInt();
if (amount < 101) {
// Start recursion and display result.
recursiveFunction(amount, 0, new LinkedList<>());
print(changes);
} else {
System.out.println("The value must be at most 100!");
}
myInput.close();
}
static void recursiveFunction(int amount, int index,
LinkedList<Integer> list) {
// exact change, so add it to the solution
if (amount == 0) {
changes.add(list);
return;
}
// no exact change possible
if (amount < 0 || index >= COINS.length) {
return;
}
// explore change of amount without current coin
recursiveFunction(amount, index + 1, new LinkedList<>(list));
// consider current coin for change and keep exploring
list.add(COINS[index]);
recursiveFunction(amount - COINS[index], index, new LinkedList<>(list));
}
static void print(LinkedList<LinkedList<Integer>> ll) {
for (LinkedList<Integer> list : ll) {
for (Integer n : list) {
System.out.print(n + ", ");
}
System.out.println();
}
}
}
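The question also asks for the combination with the fewest coins. One possible way to pick it from a list of lists is Collections.min with a comparator on list size. This is a sketch only: the class name FewestCoins, the method fewest, and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

public class FewestCoins {

    // Returns the combination that uses the fewest coins.
    // Collections.min throws NoSuchElementException if 'changes' is empty.
    static LinkedList<Integer> fewest(List<LinkedList<Integer>> changes) {
        return Collections.min(changes, (a, b) -> Integer.compare(a.size(), b.size()));
    }

    public static void main(String[] args) {
        // Hand-written sample: three ways to change 5 with coins {3, 2, 1}.
        List<LinkedList<Integer>> changes = new ArrayList<>();
        changes.add(new LinkedList<>(Arrays.asList(1, 1, 1, 1, 1)));
        changes.add(new LinkedList<>(Arrays.asList(2, 2, 1)));
        changes.add(new LinkedList<>(Arrays.asList(3, 2)));
        System.out.println(fewest(changes)); // [3, 2]
    }
}
```

If several combinations share the minimum length, Collections.min returns the first one it encounters.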

How to perform different basic traversals of graphs?

I am trying to perform an iterative breadth first traversal, iterative depth first traversal, and recursive depth first traversal of a given graph (using an adjacency matrix).
In its current state, my program outputs various wrong answers.
Here are some examples.
I am expecting
From Node A
DFS (iterative): A B H C D E I F G
DFS (recursive): A B H C D E I F G
BFS (iterative): A B D I H C E F G
but am instead getting
From Node A
DFS (iterative): A I D B H C F G E
DFS (recursive): A B H C F D E I G
BFS (iterative): A B D I H C E F G
I'm unsure whether the problem lies in my implementation of the traversals or in some other part of the program. To be more specific, I'm not sure whether my implementation of the connectNode or getNeighbors method is causing the incorrect output, or whether it is my implementation of the traversals.
EDIT: Neighbors are supposed to be chosen in ascending order, if that's important. Perhaps this is part of the problem?
EDIT2: I added the new line of code, thanks to @HylianPikachu's suggestion. I now get full answers, but they are still not in the correct order.
EDIT3: I added the code to make it so the root node is checked as visited for bfs and recursive dfs. I think. I should also note that I was given parts of this code and told to fill in the rest. The use of the stack and queue are what I was told to use, even though there might be better options.
EDIT4: Added what was suggested, and now the iterative BFS works and gets the correct result. However, both DFS searches still do not work. I modified the results of the program above to show this.
import java.util.*;
public class GraphM {
public Node rootNode;
public List<Node> nodes = new ArrayList<Node>(); // nodes in graph
public int[][] adjMatrix; // adjacency Matrix
public void setRootNode(Node n) {
rootNode = n;
}
public Node getRootNode() {
return rootNode;
}
public void addNode(Node n) {
nodes.add(n);
}
// This method connects two nodes
public void connectNode(Node src, Node dst) {
if(adjMatrix == null) {
adjMatrix = new int[nodes.size()][nodes.size()];
}
adjMatrix[nodes.indexOf(src)][nodes.indexOf(dst)] = 1;
adjMatrix[nodes.indexOf(dst)][nodes.indexOf(src)] = 1;
}
// Helper method to get one unvisited node from a given node n.
private Node getUnvisitedChildNode(Node n) {
int index = nodes.indexOf(n);
int size = adjMatrix.length;
for (int j = 0; j < size; j++)
if (adjMatrix[index][j] == 1 && ((Node) nodes.get(j)).visited == false)
return nodes.get(j);
return null;
}
// get all neighboring nodes of node n.
public List<Node> getNeighbors(Node n) {
List<Node> neighbors = new ArrayList<Node>();
for(int i = 0; i < nodes.size(); i ++) {
if (adjMatrix[nodes.indexOf(n)][i] == 1) {
neighbors.add(nodes.get(i));
}
Collections.sort(neighbors);
}
return neighbors;
}
// Helper methods for clearing visited property of node
private void reset() {
for (Node n : nodes)
n.visited = false;
}
// Helper methods for printing the node label
private void printNode(Node n) {
System.out.print(n.label + " ");
}
// BFS traversal (iterative version)
public void bfs() {
Queue<Node> queue = new LinkedList<Node>();
queue.add(rootNode);
while(!queue.isEmpty()) {
Node node = queue.poll();
printNode(node);
node.visited = true;
List<Node> neighbors = getNeighbors(node);
for ( int i = 0; i < neighbors.size(); i ++) {
Node n = neighbors.get(i);
if (n != null && n.visited != true) {
queue.add(n);
n.visited = true;
}
}
}
}
// DFS traversal (iterative version)
public void dfs() {
Stack<Node> stack = new Stack<Node>();
stack.add(rootNode);
while(!stack.isEmpty()){
Node node = stack.pop();
if(node.visited != true) {
printNode(node);
node.visited = true;
}
List<Node> neighbors = getNeighbors(node);
for (int i = 0; i < neighbors.size(); i++) {
Node n = neighbors.get(i);
if(n != null && n.visited != true) {
stack.add(n);
}
}
}
}
// DFS traversal (recursive version)
public void dfs(Node n) {
printNode(n);
n.visited = true;
List<Node> neighbors = getNeighbors(n);
for (int i = 0; i < neighbors.size(); i ++) {
Node node = neighbors.get(i);
if(node != null && node.visited != true) {
dfs(node);
}
}
}
// A simple Node class
static class Node implements Comparable<Node> {
public char label;
public boolean visited = false;
public Node(char label) {
this.label = label;
}
public int compareTo(Node node) {
return Character.compare(this.label, node.label);
}
}
// Test everything
public static void main(String[] args) {
Node n0 = new Node('A');
Node n1 = new Node('B');
Node n2 = new Node('C');
Node n3 = new Node('D');
Node n4 = new Node('E');
Node n5 = new Node('F');
Node n6 = new Node('G');
Node n7 = new Node('H');
Node n8 = new Node('I');
// Create the graph (by adding nodes and edges between nodes)
GraphM g = new GraphM();
g.addNode(n0);
g.addNode(n1);
g.addNode(n2);
g.addNode(n3);
g.addNode(n4);
g.addNode(n5);
g.addNode(n6);
g.addNode(n7);
g.addNode(n8);
g.connectNode(n0, n1);
g.connectNode(n0, n3);
g.connectNode(n0, n8);
g.connectNode(n1, n7);
g.connectNode(n2, n7);
g.connectNode(n2, n3);
g.connectNode(n3, n4);
g.connectNode(n4, n8);
g.connectNode(n5, n6);
g.connectNode(n5, n2);
// Perform the DFS and BFS traversal of the graph
for (Node n : g.nodes) {
g.setRootNode(n);
System.out.print("From node ");
g.printNode(n);
System.out.print("\nDFS (iterative): ");
g.dfs();
g.reset();
System.out.print("\nDFS (recursive): ");
g.dfs(g.getRootNode());
g.reset();
System.out.print("\nBFS (iterative): ");
g.bfs();
g.reset();
System.out.println("\n");
}
}
}
So, we already covered the first part of your question, but I'll restate it here for those who follow. Whenever working with graphs and an adjacency matrix, probably the best way to initialize elements in the array is "both ways."
Instead of just using the following, which would require a specific vertex be listed first in order to find the neighbors:
adjMatrix[nodes.indexOf(src)][nodes.indexOf(dst)] = 1;
Use this, which leads to searches that are agnostic of the vertex order:
adjMatrix[nodes.indexOf(src)][nodes.indexOf(dst)] = 1;
adjMatrix[nodes.indexOf(dst)][nodes.indexOf(src)] = 1;
Now, for ordering. You want the vertices to be outputted in order from "least" letter to "greatest" letter. We'll address each one of your data structures individually.
In BFS (iterative), you use a Queue. Queues are "first in, first out." In other words, the element that was least recently added to the Queue will be outputted first whenever you call queue.poll(). Thus, you need to add your nodes from least to greatest.
In DFS (iterative), you use a Stack. Stacks are "last in, first out." In other words, the element that was most recently added to the Stack will be outputted first whenever you call stack.pop(). Thus, you need to add your nodes from greatest to least.
In DFS (recursive), you use a List. Lists have no "in-out" ordering per se, as we can poll them in whatever order we want, but the easiest thing to do would just be to sort the List from least to greatest and output them in order.
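The effect of the two insertion orders described above can be checked in isolation. The following is a small sketch with made-up names (OrderDemo, drainQueue, drainStack); it uses ArrayDeque for both the queue and the stack rather than the LinkedList and Stack classes from the question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Queue;

public class OrderDemo {

    // FIFO: add in ascending order, and poll() returns ascending order.
    static List<Character> drainQueue(List<Character> labels) {
        List<Character> sorted = new ArrayList<>(labels);
        Collections.sort(sorted);
        Queue<Character> queue = new ArrayDeque<>(sorted);
        List<Character> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());
        }
        return out;
    }

    // LIFO: push in descending order, and pop() returns ascending order.
    static List<Character> drainStack(List<Character> labels) {
        List<Character> sorted = new ArrayList<>(labels);
        Collections.sort(sorted, Collections.reverseOrder());
        Deque<Character> stack = new ArrayDeque<>();
        for (Character c : sorted) {
            stack.push(c);
        }
        List<Character> out = new ArrayList<>();
        while (!stack.isEmpty()) {
            out.add(stack.pop());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Character> neighbors = Arrays.asList('I', 'B', 'D');
        System.out.println(drainQueue(neighbors)); // [B, D, I]
        System.out.println(drainStack(neighbors)); // [B, D, I]
    }
}
```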
With this in mind, we need to introduce a protocol for sorting the neighbors. All three traversals use getNeighbors(), so we'll sort the returned List immediately after we call that function. Lists can be ordered with Collections.sort(List l) from java.util.Collections, but we first need to modify your Node class so Java knows how to sort Nodes. For the details of what I'm doing, look up the Comparable interface; this post is already getting way longer than I intended, so I'm just going to show the code here and let the interested explore further themselves.
You would first tweak your Node class by implementing Comparable<Node> and adding the compareTo() function.
static class Node implements Comparable<Node>{
public char label;
public boolean visited = false;
public Node(char label) {
this.label = label;
}
@Override
public int compareTo(Node that) {
return Character.compare(this.label, that.label);
}
}
Then, in the cases in which we want to order the List from least to greatest, we can use Collections.sort(neighbors). When we want it from greatest to least, we can use Collections.sort(neighbors, Collections.reverseOrder()). Our final code will look like this:
// BFS traversal (iterative version)
public void bfs() {
Queue<Node> queue = new LinkedList<Node>();
queue.add(rootNode);
while(!queue.isEmpty()) {
Node node = queue.poll();
printNode(node);
node.visited = true;
List<Node> neighbors = getNeighbors(node);
//NEW CODE: Sort our neighbors List!
Collections.sort(neighbors);
for ( int i = 0; i < neighbors.size(); i ++) {
Node n = neighbors.get(i);
if (n != null && n.visited != true) {
queue.add(n);
n.visited = true;
}
}
}
}
// DFS traversal (iterative version)
public void dfs() {
Stack<Node> stack = new Stack<Node>();
stack.add(rootNode);
while(!stack.isEmpty()){
Node node = stack.pop();
if(node.visited != true) {
printNode(node);
node.visited = true;
}
List<Node> neighbors = getNeighbors(node);
//NEW CODE: Sort our neighbors List in reverse order!
Collections.sort(neighbors, Collections.reverseOrder());
for (int i = 0; i < neighbors.size(); i++) {
Node n = neighbors.get(i);
if(n != null && n.visited != true) {
stack.add(n);
}
}
}
}
// DFS traversal (recursive version)
public void dfs(Node n) {
printNode(n);
n.visited = true;
List<Node> neighbors = getNeighbors(n);
//NEW CODE: Sort our neighbors List!
Collections.sort(neighbors);
for (int i = 0; i < neighbors.size(); i ++) {
Node node = neighbors.get(i);
if(node != null && node.visited != true) {
dfs(node);
}
}
}
I would suggest splitting up your problem into smaller parts.
If you want to write a class for an undirected graph, first do that and test it a bit.
If you want to see whether you can implement traversal, make sure your graph works first. You can also use Guava, which lets you use MutableGraph (and lots more). Here is how to install it in case you're using IntelliJ and here is how to use graphs from Guava.
Also remember to use a debugger to find out where your code goes wrong.

Binary tree searching with strings

This is my code so far to find all of the nodes, which are represented by char characters. I've been able to work out how to handle the root node's "(" and ")" using the helper method. However, I'm having a mental block when trying to integrate the helper method into the treeProcessor. In the treeProcessor method I'm trying to delve further into the binary tree, so that the end goal is to print out the root node and subtrees.
public static ArrayList<String> paths = new ArrayList<>();
public static void main(String[] args)
{
String tree = "(a(b()())(c()()))";
//replace this line with a call to the treeProcessor
System.out.println(Arrays.toString(treeBreakdownHelper(tree)));//a, (b()()), (c()())
System.out.println();
System.out.println(paths);//prints every path found
}
//recursive method
public static void treeProcessor(String tree, String path)
{
//breakdown tree
//update path
//check if current element is leaf/last element
//if it is, add to ArrayList
//if not last element, run processor again on subtrees that are not empty
}
//valid tree:
//(a()())
//(a(b()())(c()()))
//helper method
public static String[] treeBreakdownHelper(String tree)
{
String[] temp = new String[3];
//0 = root
//1 = left tree
//2 = right tree
tree = tree.substring(1, tree.length()-1);
//System.out.println(tree);//test removal of outer parens
temp[0] = "" + tree.charAt(0);
tree = tree.substring(1);
//System.out.println(tree);//test removal of root node
int openCount = 0;
int middle = 0;
for(int i = 0; i < tree.length(); i++)
{
//System.out.println(openCount);
if(tree.charAt(i) == '(')
{
openCount++;
}
else if(tree.charAt(i) == ')')
{
openCount--;
}
if(openCount == 0)
{
middle = i;
break;
}
}
//System.out.println(middle);
//System.out.println(tree.substring(0,middle+1));
temp[1] = tree.substring(0,middle+1);
//System.out.println(tree.substring(middle+1));
temp[2] = tree.substring(middle+1);
return temp;
}
}
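The comments inside treeProcessor could be filled in roughly as follows. This is only a sketch under the assumptions that an empty subtree is always written as "()" and that a "path" means the node labels from the root down to a leaf; the class name TreePaths is made up, and the helper is a condensed version of the one in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class TreePaths {

    static final List<String> paths = new ArrayList<>();

    // Same splitting idea as the question's helper: [root, left subtree, right subtree].
    static String[] treeBreakdownHelper(String tree) {
        String inner = tree.substring(1, tree.length() - 1); // drop the outer parentheses
        String root = inner.substring(0, 1);
        String rest = inner.substring(1);
        int open = 0;
        int middle = 0;
        for (int i = 0; i < rest.length(); i++) {
            if (rest.charAt(i) == '(') {
                open++;
            } else if (rest.charAt(i) == ')') {
                open--;
            }
            if (open == 0) { // the left subtree's parentheses are balanced here
                middle = i;
                break;
            }
        }
        return new String[] { root, rest.substring(0, middle + 1), rest.substring(middle + 1) };
    }

    // Recursive method: extends the path by the current root and descends into non-empty subtrees.
    static void treeProcessor(String tree, String path) {
        if (tree.equals("()")) {
            return; // empty subtree, nothing to record
        }
        String[] parts = treeBreakdownHelper(tree);
        String newPath = path + parts[0];
        boolean leftEmpty = parts[1].equals("()");
        boolean rightEmpty = parts[2].equals("()");
        if (leftEmpty && rightEmpty) {
            paths.add(newPath); // leaf reached: the path is complete
            return;
        }
        if (!leftEmpty) {
            treeProcessor(parts[1], newPath);
        }
        if (!rightEmpty) {
            treeProcessor(parts[2], newPath);
        }
    }

    public static void main(String[] args) {
        treeProcessor("(a(b()())(c()()))", "");
        System.out.println(paths); // [ab, ac]
    }
}
```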

Converting a recursive method (where the recursion is done inside a loop) into an iterative method

I have a recursive algorithm that I use to iterate over a hierarchical data structure, but unfortunately with some data, the hierarchical structure is so deep that I'm getting a StackOverflowError. I've seen this happen with a depth of about 150ish nodes, while the data could potentially grow to much further than that. For context, this code will run in limited environments and changing the JVM stack size is not an option, and the data structure is a given and represents different file systems with directories and files.
To work around the stack overflow, I've tried to convert the algorithm into an iterative one. It's not something I've had to do before, so I started from some examples showing how to do this with a simple recursion, but I'm not sure how to apply this to recursion inside a loop. I've found a way to do it that seems to work, but the code is rather insane.
Here is a simplified version of my original recursive method:
private CacheEntry sumUpAndCacheChildren(Node node) {
final CacheEntry entry = getCacheEntry(node);
if (entryIsValid(entry))
return entry;
Node[] children = node.listChildren();
long size = 0;
if (children != null) {
for (Node child : children) {
if (child.hasChildren()) {
size += sumUpAndCacheChildren(child).size;
} else {
size += child.size();
}
}
}
return putInCache(node, size);
}
Each leaf node has a size, while the size for any ancestor node is considered to be the size of all of its descendants. I want to know this size for each node, so the size is aggregated and cached for every node.
Here is the iterative version:
private CacheEntry sumUpAndCacheChildren(Node initialNode) {
class StackFrame {
final Node node;
Node[] children;
// Local vars
long size;
// Tracking stack frame state
int stage;
int loopIndex;
StackFrame(Node node) {
this.node = node;
this.children = null;
this.size = 0;
this.stage = 0;
this.loopIndex = 0;
}
}
final Stack<StackFrame> stack = new Stack<StackFrame>();
stack.push(new StackFrame(initialNode));
CacheEntry retValue = getCacheEntry(initialNode);
outer:
while (!stack.isEmpty()) {
final StackFrame frame = stack.peek();
final Node node = frame.node;
switch(frame.stage) {
case 0: {
final CacheEntry entry = getCacheEntry(node);
if (entryIsValid(entry)) {
retValue = entry;
stack.pop();
continue;
}
frame.children = node.listChildren();
frame.stage = frame.children != null ? 1 : 3;
} break;
case 1: {
for (int i = frame.loopIndex; i < frame.children.length; ++i) {
frame.loopIndex = i;
final Node child = frame.children[i];
if (child.hasChildren()) {
stack.push(new StackFrame(child));
frame.stage = 2; // Accumulate results once all the child stacks have been calculated.
frame.loopIndex++; // Make sure we restart the for loop at the next iteration the next time around.
continue outer;
} else {
frame.size += child.size();
}
}
frame.stage = 3;
} break;
case 2: {
// Accumulate results
frame.size += retValue.size;
frame.stage = 1; // Continue the for loop
} break;
case 3: {
retValue = putInCache(node, frame.size);
stack.pop();
continue;
}
}
}
return retValue;
}
This just feels more insane than it needs to be, and it would be painful to have to do this in all the places in the code where I recurse into the children and do different ops on them. What techniques could I use to make it easier to do recursion when I'm aggregating at each level and doing that in a for-loop over the children?
EDIT:
I was able to greatly simplify things with the help of the answers below. The code is now nearly as concise as the original recursive version. Now, I just need to apply the same principles everywhere else where I'm recursing over the same data structure.
Since you're dealing with a tree structure and wish to compute cumulative sizes, try a DFS while tracking the parent of each node. I assume here that you cannot change or subclass Node and I kept all the function signatures you used.
private class SizedNode {
public long cumulativeSize;
public Node node;
public SizedNode parent;
public SizedNode(SizedNode parent, Node node) {
this.node = node;
this.parent = parent;
}
public long getSize() {
if (node.hasChildren()) {
return cumulativeSize;
}
else {
return node.size();
}
}
}
private void sumUpAndCacheChildren(Node start)
{
Stack<SizedNode> nodeStack = new Stack<SizedNode>();
// Let's start with the beginning node.
nodeStack.push(new SizedNode(null, start));
// Loop as long as we've got nodes to process
while (!nodeStack.isEmpty()) {
// Take a look at the top node
SizedNode sizedNode = nodeStack.peek();
CacheEntry entry = getCacheEntry(sizedNode.node);
if (entryIsValid(entry)) {
// It's cached already, so we have computed its size
nodeStack.pop();
// Add the size to the parent, if applicable.
if (sizedNode.parent != null) {
sizedNode.parent.cumulativeSize += sizedNode.getSize();
// If the parent's now the top guy, we're done with it so let's cache it
if (sizedNode.parent == nodeStack.peek()) {
putInCache(sizedNode.parent.node, sizedNode.parent.getSize());
}
}
}
else {
// Not cached.
if (sizedNode.node.hasChildren()) {
// It's got a bunch of children.
// We can't compute the size yet, so just add the kids to the stack.
Node[] children = sizedNode.node.listChildren();
if (children != null) {
for (Node child : children) {
nodeStack.push(new SizedNode(sizedNode, child));
}
}
}
else {
// It's a leaf node. Let's cache it.
putInCache(sizedNode.node, sizedNode.node.size());
}
}
}
}
You're basically doing a post-order iterative traversal of an N-ary tree; you can try searching for that for more detailed examples.
In very rough pseudocode:
Node currentNode;
Stack<Node> pathToCurrent;
Stack<Integer> sizesInStack;
Stack<Integer> indexInNode;
pathToCurrent.push(rootNode);
sizesInStack.push(0);
indexInNode.push(0);
current = rootNode;
currentSize = 0;
currentIndex = 0;
while (current != null) {
if (current.children != null && currentIndex < current.children.size) {
//process the next node
nextChild = current.children[currentIndex];
pathToCurrent.push(current);
sizesInStack.push(currentSize);
indexInNode.push(currentIndex);
current = nextChild;
currentSize = 0;
currentIndex = 0;
} else {
//this node is a leaf, or we've handled all its children
//put our size into the cache, then pop off the stack and set up for the next child of our parent
currentSize += current.size();
putInCache(current, currentSize);
current = pathToCurrent.pop(); //If pop throws an exception on empty stack, handle it here and exit the loop
currentSize = currentSize + sizesInStack.pop();
currentIndex = 1 + indexInNode.pop();
}
}
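For comparison, here is a runnable sketch of the same post-order idea with a single stack of explicit frames. The Node class below is a hypothetical stand-in for the question's type, and a plain HashMap stands in for the cache; neither is taken from the original code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IterativeSizes {

    // Hypothetical stand-in for the question's Node type.
    static class Node {
        final String name;
        final long ownSize; // non-zero only for leaves in this sketch
        final List<Node> children = new ArrayList<>();
        Node(String name, long ownSize) { this.name = name; this.ownSize = ownSize; }
        Node add(Node child) { children.add(child); return this; }
    }

    // One explicit "stack frame": a node, the index of its next child, and the running total.
    private static class Frame {
        final Node node;
        int nextChild = 0;
        long total;
        Frame(Node node) { this.node = node; this.total = node.ownSize; }
    }

    // Post-order traversal without recursion; returns the cumulative size of every node.
    static Map<Node, Long> cumulativeSizes(Node root) {
        Map<Node, Long> cache = new HashMap<>();
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame(root));
        while (!stack.isEmpty()) {
            Frame frame = stack.peek();
            if (frame.nextChild < frame.node.children.size()) {
                // Descend into the next unprocessed child.
                stack.push(new Frame(frame.node.children.get(frame.nextChild++)));
            } else {
                // All children are done: record the total and hand it to the parent.
                stack.pop();
                cache.put(frame.node, frame.total);
                if (!stack.isEmpty()) {
                    stack.peek().total += frame.total;
                }
            }
        }
        return cache;
    }

    public static void main(String[] args) {
        Node b = new Node("b", 0).add(new Node("c", 4)).add(new Node("d", 5));
        Node root = new Node("root", 0).add(new Node("a", 3)).add(b);
        System.out.println(cumulativeSizes(root).get(root)); // 12
    }
}
```

Because the frames live on the heap, the depth of the tree is limited by available memory rather than by the thread's call stack.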
OK, I'm going to explain it in plain words since I don't want to code right now:
Acquire the topmost level of elements and write it into a list
LOOP BEGIN
count the elements on this level and add them to your counter
acquire the list of children from your current list, store it separately
delete the list of current elements
write the list of children to where the list of the current elements was
LOOP END
You simply need to put a boolean into the loop header and set it to false once the list of children has no elements anymore. I hope I was able to express myself correctly; feel free to ask questions or ask for clarification.
Each iteration gets slower if the data structure keeps "folding out", because every level can then hold more elements than the one before. It's rather inefficient and I'm quite sure someone can come up with an optimization, but it avoids deep recursion and won't produce a stack overflow. It may still produce an OutOfMemoryError for very large datasets, but since only one level is held at any time, that is quite unrealistic, I guess.
After adapting @Marius's answer to my use case, I came up with this:
class SizedNode {
final Node node;
final SizedNode parent;
long size;
boolean needsCaching;
SizedNode(Node node, SizedNode parent) {
this.parent = parent;
this.node = node;
}
}
private CacheEntry sumUpAndCacheChildren(Node start) {
final Stack<SizedNode> stack = new Stack<SizedNode>();
stack.push(new SizedNode(start, null));
CacheEntry returnValue = getCacheEntry(start);
while (!stack.isEmpty()) {
final SizedNode sizedNode = stack.pop();
final CacheEntry entry = getCacheEntry(sizedNode.node);
if (sizedNode.needsCaching) {
// We finished processing all children, and now we're done with this node.
if (sizedNode.parent != null) {
sizedNode.parent.size += sizedNode.size;
}
returnValue = putInCache(sizedNode.node, sizedNode.size);
} else if (entryIsValid(entry)) {
if (sizedNode.parent != null) {
sizedNode.parent.size += entry.size;
}
returnValue = entry;
} else {
// The next time we see this node again, it will be after we process all of its children.
sizedNode.needsCaching = true;
stack.push(sizedNode);
for (Node child : sizedNode.node.listChildren()) {
if (child.hasChildren()) {
stack.push(new SizedNode(child, sizedNode));
} else {
sizedNode.size += child.size();
}
}
}
}
return returnValue;
}
This is much better than the crazy mess I came up with on my first pass. Just goes to show that you really have to think about transforming the algorithm so that it also makes sense as an iterative approach. Thanks all for the help!

I get different behaviour in Java depending on HOW I initialize the data structures

I implemented an algorithm in Java. I coded two versions:
one where I initialized the data structures in the constructor,
and
one where I parsed a textfile and initialized the data structure from the input
The strange thing is that I got different behaviour from the two versions, and can hardly understand how.
Why do I get different behaviour?
The algorithm is the first part of a depth-first search. A set of nodes should be visited, and each node printed only once. In the version that reads from a text file, the first node is printed twice. The program uses recursion.
Here is the output; the code is below. The first four lines print the data structures; after that, each first visit of a node is printed along with a counter. The counter should only reach 2, not 3.
Output, when read from textfile:
>java GraphStart ex1.txt
Node 1
Node 2
Edge: Node 1 -- Node 2
Edge: Node 2 -- Node 1
Start on Node 1
Node 1 Counter: 1
Node 2 Counter: 2
Node 1 Counter: 3
Output, when initialized in constructor:
Node 1
Node 2
Edge: Node 1 -- Node 2
Edge: Node 2 -- Node 1
Start on Node 1
Node 1 Counter: 1
Node 2 Counter: 2
Depth-First Search - initialized in constructor:
public class DepthFirstSearch {
private final static LinkedList<Node> nodes = new LinkedList<Node>();
private static LinkedList[] edges = new LinkedList[0];
public DepthFirstSearch() {
Node node1 = new Node(1);
Node node2 = new Node(2);
nodes.add(node1);
nodes.add(node2);
edges = Arrays.copyOf(edges, 1);
edges[0] = new LinkedList<Edge>();
edges[0].add(new Edge(node1, node2));
edges = Arrays.copyOf(edges, 2);
edges[1] = new LinkedList<Edge>();
edges[1].add(new Edge(node2, node1));
DFS.startDFS(nodes, edges);
}
public static void main(String[] args) {
new DepthFirstSearch();
}
}
Depth-First Search - initialized from textfile:
public class GraphStart {
private final static LinkedList<Node> nodes = new LinkedList<Node>();
private static LinkedList[] edges = new LinkedList[0];
public GraphStart(String fileName) {
scanFile(fileName);
DFS.startDFS(nodes, edges);
}
// Parse a textfile with even number of integers
// Add the nodes and edges to the datastructures
private static void scanFile(String filename) {
try {
Scanner sc = new Scanner(new File(filename));
while(sc.hasNextInt()){
Node startNode = new Node(sc.nextInt());
if(sc.hasNextInt()) {
Node endNode = new Node(sc.nextInt());
if(!nodes.contains(startNode)){
nodes.add(startNode);
//EDIT
System.out.println("Added " + startNode);
// Grow the Edge-array and initialize the content
if(edges.length < startNode.getNr())
edges = Arrays.copyOf(edges, startNode.getNr());
edges[startNode.getNr()-1] = new LinkedList<Edge>();
}
if(!nodes.contains(endNode)){
nodes.add(endNode);
//EDIT
System.out.println("Added " + endNode);
// Grow the Edge-array and initialize the content
if(edges.length < endNode.getNr())
edges = Arrays.copyOf(edges, endNode.getNr());
edges[endNode.getNr()-1] = new LinkedList<Edge>();
}
// Add the Edge
edges[startNode.getNr()-1].add(new Edge(startNode, endNode));
}
}
} catch (FileNotFoundException e) {
System.out.println("Can not find the file:" + filename);
System.exit(0);
}
}
public static void main(String[] args) {
if(args.length==1) {
new GraphStart(args[0]);
} else {
System.out.println("Wrong argument. <filename>");
}
}
}
Textfile for input:
1 2
2 1
It represents the Edge from Node 1 to Node 2, and the Edge from Node 2 to Node 1.
The algorithm is implemented in a class with only static methods, used by both versions.
DFS - the algorithm:
public class DFS {
private static int counter = 0;
private static LinkedList<Node> nodes;
private static LinkedList[] edges;
public static void startDFS(LinkedList<Node> ns, LinkedList[] es) {
nodes = ns;
edges = es;
/* Print the data structures */
printList(nodes);
printEdges(edges);
for(Node n : nodes) {
if(!n.isVisited()) {
System.out.println("\nStart on "+n);
dfs(n);
}
}
}
private static void dfs(Node n) {
counter++;
n.visit();
System.out.println(n + " Counter: " + counter);
for(Object o : edges[n.getNr()-1]) {
if(!((Edge)o).getEnd().isVisited()) {
dfs(((Edge)o).getEnd());
}
}
}
private static void printList(LinkedList<?> list) {
for(Object obj : list)
System.out.println(obj);
}
private static void printEdges(LinkedList[] edges) {
for(LinkedList list : edges) {
System.out.print("Edge: ");
for(Object o : list) {
System.out.print(o);
}
System.out.println("");
}
}
}
EDIT: Added code listings of Node and Edge.
Node:
public class Node {
private final int nr;
private boolean visited = false;
public Node(int nr) {
this.nr = nr;
}
public int getNr() { return nr; }
public boolean isVisited() { return visited; }
public void visit() { visited = true; }
@Override
public boolean equals(Object obj) {
if(obj instanceof Node)
return ((Node)obj).getNr() == nr;
else
return false;
}
@Override
public String toString() {
return "Node " + nr;
}
}
Edge:
public class Edge {
private final Node startNode;
private final Node endNode;
public Edge(Node start, Node end) {
this.startNode = start;
this.endNode = end;
}
public Node getStart() { return startNode; }
public Node getEnd() { return endNode; }
public String toString() {
return startNode + " " +
"--" + " " +
endNode;
}
}
Sorry for the very long code listings. I tried to isolate my problem and also show a runnable program.
Without seeing the code for Node, my guess is that it isn't implementing hashCode() and equals() or that these aren't implemented correctly.
So for example:
if(!nodes.contains(startNode)){
nodes.add(startNode);
This will do the containment check with reference equality (==) instead of anything logical. So the fact that you've created three different node instances will not be resolved, even though two are "the same".
...and that's why the static method version works because you only have two node instances.
Edit: the above was a good guess, but reading deeper into the code I think it has to do with the fact that visit state is kept right on the nodes instead of in a separate visited collection. You have three node instances in your graph even if only two are in the nodes list. One of the edges is pointing to the third node instance (the other one with a '1'). Since visit() was never called on that one (it was called on the first '1' instance), isVisited() will likely return false (can't say for sure because I don't know your Node implementation).
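To illustrate the point about keeping visit state outside the nodes, here is a minimal sketch. It is not the asker's code: the class name VisitedSetDfs and its methods are made up for illustration, and it assumes a node is fully identified by its integer number. Visited numbers are tracked in a HashSet, so a second Node object for the same number could not be visited twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical example class, not part of the question's code.
public class VisitedSetDfs {
    // Adjacency lists keyed by node number instead of by Node instance.
    private final Map<Integer, List<Integer>> adjacency = new HashMap<>();

    public void addEdge(int from, int to) {
        adjacency.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        // Make sure the end node has an (initially empty) list as well.
        adjacency.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Returns the node numbers in first-visit order, starting from 'start'.
    public List<Integer> dfs(int start) {
        List<Integer> order = new ArrayList<>();
        visit(start, new HashSet<>(), order);
        return order;
    }

    private void visit(int nr, Set<Integer> visited, List<Integer> order) {
        // Set.add returns false if the number was already present.
        if (!visited.add(nr)) {
            return;
        }
        order.add(nr);
        List<Integer> neighbours = adjacency.getOrDefault(nr, Collections.emptyList());
        for (int next : neighbours) {
            visit(next, visited, order);
        }
    }

    public static void main(String[] args) {
        VisitedSetDfs graph = new VisitedSetDfs();
        graph.addEdge(1, 2); // same edges as the input file "1 2" / "2 1"
        graph.addEdge(2, 1);
        System.out.println(graph.dfs(1)); // prints [1, 2]
    }
}
```

With the two edges from the question's input file, each node number is visited exactly once, which matches the output of the constructor version.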
You did not show the implementation for Node, but I would guess that you did not override equals() for it. This will lead nodes.contains(node) to return false and more nodes to be added to the collection than wanted. (The file reading loop creates a fresh startNode and endNode every time through the loop.)
Your constructor version simply uses 2 unique nodes, which gives the different result.
Implementing Node.equals() will probably solve your issue.
Your scanFile() method creates three nodes - two containing the integer 1, and one containing the integer 2.
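One possible way to avoid the duplicate instances, sketched under the assumption that a node is fully identified by its number: look each number up in a map while parsing and reuse the Node that is already there. CanonicalNodes, nodeFor and the simplified Node below are hypothetical names, and the input is parsed from a String rather than a file to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Hypothetical example class, not part of the question's code.
public class CanonicalNodes {
    // Simplified stand-in for the question's Node class.
    public static final class Node {
        public final int nr;
        public boolean visited = false;
        Node(int nr) { this.nr = nr; }
    }

    // Returns the single Node for 'nr', creating it on first use.
    static Node nodeFor(Map<Integer, Node> nodesByNr, int nr) {
        return nodesByNr.computeIfAbsent(nr, k -> new Node(k));
    }

    // Parses pairs of integers into edges; each edge is a {start, end} pair.
    public static List<Node[]> parse(String text) {
        Map<Integer, Node> nodesByNr = new HashMap<>();
        List<Node[]> edges = new ArrayList<>();
        Scanner sc = new Scanner(text);
        while (sc.hasNextInt()) {
            int start = sc.nextInt();
            if (!sc.hasNextInt()) {
                break; // odd number of integers: ignore the dangling one
            }
            int end = sc.nextInt();
            edges.add(new Node[] { nodeFor(nodesByNr, start), nodeFor(nodesByNr, end) });
        }
        return edges;
    }

    public static void main(String[] args) {
        List<Node[]> edges = parse("1 2\n2 1");
        // The start of the first edge and the end of the second edge
        // are now the very same object, so marking one visited marks both.
        System.out.println(edges.get(0)[0] == edges.get(1)[1]); // prints true
    }
}
```

Because both edges now point at the same Node object for number 1, a visited flag stored on the node would be seen from either edge.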
