automation of data format conversion to parent child format

automation of data format conversion to parent child format - java

This is an Excel sheet in which only a single column is filled in each row; the column a value sits in gives its depth in the hierarchy.
(Explanation: all CITY categories fall under V21, all Handset categories fall under CITYJ, and so on.)
V21
    CITYR
    CITYJ
        HandsetS
        HandsetHW
        HandsetHA
            LOWER_AGE<=20
            LOWER_AGE>20
                SMS_COUNT<=0
                    RECHARGE_MRP<=122
                    RECHARGE_MRP>122
                SMS_COUNT>0
I need to convert this to a two-column format, with the parent category in the first column and the child category in the second.
The output sheet would therefore be:
V21 CITYR
V21 CITYJ
CITYJ HandsetS
CITYJ HandsetHW
CITYJ HandsetHA
HandsetHA LOWER_AGE<=20
HandsetHA LOWER_AGE>20
LOWER_AGE>20 SMS_COUNT<=0
SMS_COUNT<=0 RECHARGE_MRP<=122
SMS_COUNT<=0 RECHARGE_MRP>122
LOWER_AGE>20 SMS_COUNT>0
The data set is huge, so I can't do this manually. How can I automate it?

The task has three parts, so I'd like to know which one you are asking for help with:
Reading excel sheet data into Java
Manipulating data
Writing data back into the excel sheet.
You said that the data sheet is large and cannot be pulled into memory as a whole. Can I ask how many top-level elements you have, i.e. how many V21s? If there is just one, how many CITYR/CITYJ entries do you have?
--
Adding some source code from my previous answer about how to manipulate the data. I gave it an indented input file (4 spaces correspond to one column in your Excel sheet), and the following code printed the pairs out neatly. Note that the level == 1 branch is left empty: if you think your JVM is holding too many objects, you could write out and clear the entries at that point :)
package com.ekanathk;
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Stack;
import java.util.logging.Logger;
import org.junit.Test;
class Entry {
    private String input;
    private int level;
    public Entry(String input, int level) {
        this.input = input;
        this.level = level;
    }
    public String getInput() {
        return input;
    }
    public int getLevel() {
        return level;
    }
    @Override
    public String toString() {
        return "Entry [input=" + input + ", level=" + level + "]";
    }
}
public class Tester {
    private static final Logger logger = Logger.getLogger(Tester.class.getName());
    @Test
    public void testSomething() throws Exception {
        InputStream is = Thread.currentThread().getContextClassLoader().getResourceAsStream("samplecsv.txt");
        BufferedReader b = new BufferedReader(new InputStreamReader(is));
        String input = null;
        List<String[]> entries = new ArrayList<String[]>();
        // The stack holds the chain of ancestors of the current line.
        Stack<Entry> stack = new Stack<Entry>();
        stack.push(new Entry("ROOT", -1));
        while ((input = b.readLine()) != null) {
            int level = whatIsTheLevel(input);
            input = input.trim();
            logger.info("input = " + input + " at level " + level);
            Entry entry = new Entry(input, level);
            if (level == 1) {
                // periodically clear out the map and write it to another excel sheet
            }
            // Pop everything at the same or a deeper level, so that stepping
            // back up several levels at once still finds the right parent.
            while (stack.peek().getLevel() >= entry.getLevel()) {
                stack.pop();
            }
            Entry parent = stack.peek();
            logger.info("parent = " + parent);
            entries.add(new String[]{parent.getInput(), entry.getInput()});
            stack.push(entry);
        }
        for (String[] entry : entries) {
            System.out.println(Arrays.toString(entry));
        }
    }
    // Level = number of leading spaces divided by 4.
    private int whatIsTheLevel(String input) {
        int numberOfSpaces = 0;
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) != ' ') {
                return numberOfSpaces / 4;
            } else {
                numberOfSpaces++;
            }
        }
        return numberOfSpaces / 4;
    }
}
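For readers who want to try the stack idea without JUnit or a classpath resource, here is a minimal, self-contained sketch. It assumes the sheet has already been exported to text lines in which each level is indented by 4 spaces; the names ParentChildPairs, levelOf and toPairs are made up for this illustration. Unlike the code above, it emits no pair for top-level entries, which matches the output asked for in the question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class ParentChildPairs {
    // Assumption: one level of nesting is represented by 4 leading spaces.
    static int levelOf(String line) {
        int spaces = 0;
        while (spaces < line.length() && line.charAt(spaces) == ' ') {
            spaces++;
        }
        return spaces / 4;
    }

    // Returns {parent, child} pairs; top-level entries have no parent and produce no pair.
    static List<String[]> toPairs(List<String> lines) {
        List<String[]> pairs = new ArrayList<>();
        Deque<String> names = new ArrayDeque<>();    // ancestors of the current line
        Deque<Integer> levels = new ArrayDeque<>();  // their levels, kept in step with names
        for (String line : lines) {
            String value = line.trim();
            if (value.isEmpty()) {
                continue;
            }
            int level = levelOf(line);
            // Drop siblings and deeper entries until the top is a real ancestor.
            while (!levels.isEmpty() && levels.peek() >= level) {
                levels.pop();
                names.pop();
            }
            if (!names.isEmpty()) {
                pairs.add(new String[] { names.peek(), value });
            }
            names.push(value);
            levels.push(level);
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "V21",
                "    CITYR",
                "    CITYJ",
                "        HandsetS",
                "            LOWER_AGE>20",
                "        HandsetHA");
        for (String[] pair : toPairs(lines)) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

Because only the current chain of ancestors is kept on the stack, memory use depends on the nesting depth rather than on the number of rows, provided the pairs are written out as they are produced instead of being collected in a list.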

This assumes that your file is small enough to fit in memory; even a 10 MB file should be fine.
It has two parts:
DataTransformer, which does all the required transformation of the data
TreeNode, a simple custom tree data structure
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
public class DataTransformer {
    public static void main(String[] args) throws IOException {
        InputStream in = DataTransformer.class
                .getResourceAsStream("source_data.tab");
        BufferedReader br = new BufferedReader(
                new InputStreamReader(in));
        String line;
        // The artificial root sits above every real level.
        TreeNode root = new TreeNode("ROOT", Integer.MIN_VALUE);
        TreeNode currentNode = root;
        while ((line = br.readLine()) != null) {
            int level = getLevel(line);
            String value = line.trim();
            TreeNode nextNode = new TreeNode(value, level);
            relateNextNode(currentNode, nextNode);
            currentNode = nextNode;
        }
        printAll(root);
    }
    // The level of a line is the number of leading tabs.
    public static int getLevel(String line) {
        final char TAB = '\t';
        int numberOfTabs = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) != TAB) {
                break;
            }
            numberOfTabs++;
        }
        return numberOfTabs;
    }
    // Attach nextNode under the nearest ancestor that has a smaller level.
    public static void relateNextNode(
            TreeNode currentNode, TreeNode nextNode) {
        if (currentNode.getLevel() < nextNode.getLevel()) {
            currentNode.addChild(nextNode);
        } else {
            relateNextNode(currentNode.getParent(), nextNode);
        }
    }
    // Print "parent<TAB>child" for every node whose parent is not the artificial root.
    public static void printAll(TreeNode node) {
        if (!node.isRoot() && !node.getParent().isRoot()) {
            System.out.println(node);
        }
        for (TreeNode childNode : node.getChildren()) {
            printAll(childNode);
        }
    }
}
class TreeNode implements Serializable {
    private static final long serialVersionUID = 1L;
    private TreeNode parent;
    private List<TreeNode> children = new ArrayList<TreeNode>();
    private String value;
    private int level;
    public TreeNode(String value, int level) {
        this.value = value;
        this.level = level;
    }
    public void addChild(TreeNode child) {
        child.parent = this;
        this.children.add(child);
    }
    public void addSibbling(TreeNode sibbling) {
        TreeNode parent = this.parent;
        parent.addChild(sibbling);
    }
    public TreeNode getParent() {
        return parent;
    }
    public List<TreeNode> getChildren() {
        return children;
    }
    public String getValue() {
        return value;
    }
    public int getLevel() {
        return level;
    }
    public boolean isRoot() {
        return this.parent == null;
    }
    @Override
    public String toString() {
        String str;
        if (this.parent != null) {
            str = this.parent.value + '\t' + this.value;
        } else {
            str = this.value;
        }
        return str;
    }
}


Why saving the char '�' to a file saves it as '?'?

I learned about Huffman coding and tried to apply it. So I made a very basic text reader that can only open and save files, and wrote a decorator that can be used to compress the text before saving (using Huffman coding).
There was a bug that I couldn't find, and after a lot of debugging I figured out that when I compress the text, the character � may end up in the compressed text. For example, the text ',-.:BCINSabcdefghiklmnoprstuvwy gets compressed to 앐낧淧翵�ဌ䤺큕㈀.
I figured out that the bug lies in the saving function. When I save the compressed text, it changes every occurrence of � to ?. For example, when saving 앐낧淧翵�ဌ䤺큕㈀, I get 앐낧淧翵?ဌ䤺큕㈀.
When I try to read the saved file to decompress it, I get a different string so the decompression fails.
What makes it more difficult is that the saving function works fine on its own, but not when it is used in my code. The function looks like this:
public void save() throws IOException {
FileWriter fileWriter = new FileWriter(this.filename);
fileWriter.write(this.text);
fileWriter.close();
}
It's confusing that this.text at the moment of saving is 앐낧淧翵�ဌ䤺큕㈀ yet it saves it as 앐낧淧翵?ဌ䤺큕㈀.
As I said before, the function works fine on its own but doesn't work in my code. I couldn't do anything more than strip my code down as much as possible and put it here. In any case, if you put a breakpoint at FileEditor::save, you'll find that this.text at the moment of saving is 앐낧淧翵�ဌ䤺큕㈀ while the content of the file is 앐낧淧翵?ဌ䤺큕㈀.
Code:
FileEditor is right below Main.
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.PriorityQueue;
import java.util.TreeMap;
import static pack.BitsManipulator.CHAR_SIZE_IN_BITS;
public class Main {
public static void main(String[] args) throws IOException {
String text = " ',-.:BCINSabcdefghiklmnoprstuvwy";
FileEditor fileEditor2 = new FileEditor("file.txt");
HuffmanDecorator compressor = new HuffmanDecorator(fileEditor2);
compressor.setText(text);
System.out.println(compressor.getText());
compressor.save();
}
}
class FileEditor implements BasicFileEditor {
private String filename;
private String text;
public FileEditor(String filename) throws IOException {
this.filename = filename;
File file = new File(filename);
StringBuilder builder = new StringBuilder();
if (!file.createNewFile()) {
FileReader reader = new FileReader(file);
int ch;
while ((ch = reader.read()) != -1)
builder.append((char) ch);
}
this.text = builder.toString();
}
@Override
public String getText() {
return text;
}
@Override
public void setText(String text) {
this.text = text;
}
@Override
public void save() throws IOException {
FileWriter fileWriter = new FileWriter(this.filename);
fileWriter.write(this.text);
fileWriter.close();
}
}
interface BasicFileEditor {
String getText();
void setText(String text);
void save() throws IOException;
}
abstract class FileEditorDecorator implements BasicFileEditor {
FileEditor fileEditor;
public FileEditorDecorator(FileEditor fileEditor) {
this.fileEditor = fileEditor;
}
@Override
public String getText() {
return fileEditor.getText();
}
@Override
public void setText(String text) {
fileEditor.setText(text);
}
@Override
public void save() throws IOException {
String oldText = getText();
setText(getModifiedText());
fileEditor.save();
setText(oldText);
}
protected abstract String getModifiedText();
}
class HuffmanDecorator extends FileEditorDecorator {
public HuffmanDecorator(FileEditor fileEditor) {
super(fileEditor);
}
@Override
protected String getModifiedText() {
HuffmanCodingCompressor compressor = new HuffmanCodingCompressor(getText());
return compressor.getCompressedText();
}
}
class HuffmanCodingCompressor {
String text;
public HuffmanCodingCompressor(String text) {
this.text = text;
}
public String getCompressedText() {
EncodingBuilder builder = new EncodingBuilder(text);
return builder.getCompressedText();
}
}
class Node implements Comparable<Node> {
public Node left;
public Node right;
public int value;
public Character character;
public Node(Node left, Node right, int value) {
this(left, right, value, null);
}
public Node(Node left, Node right, int value, Character character) {
this.left = left;
this.right = right;
this.character = character;
this.value = value;
}
@Override
public int compareTo(Node o) {
return this.value - o.value;
}
public boolean isLeafNode() {
return left == null && right == null;
}
Node getLeft() {
if (left == null)
left = new Node(null, null, 0);
return left;
}
Node getRight() {
if (right == null)
right = new Node(null, null, 0);
return right;
}
}
class EncodingBuilder {
private String text;
private Node encodingTree;
private TreeMap<Character, String> encodingTable;
public EncodingBuilder(String text) {
this.text = text;
buildEncodingTree();
buildEncodingTableFromTree(encodingTree);
}
private void buildEncodingTableFromTree(Node encodingTree) {
encodingTable = new TreeMap<>();
buildEncodingTableFromTreeHelper(encodingTree, new StringBuilder());
}
public void buildEncodingTableFromTreeHelper(Node root, StringBuilder key) {
if (root == null)
return;
if (root.isLeafNode()) {
encodingTable.put(root.character, key.toString());
} else {
key.append('0');
buildEncodingTableFromTreeHelper(root.left, key);
key.deleteCharAt(key.length() - 1);
key.append('1');
buildEncodingTableFromTreeHelper(root.right, key);
key.deleteCharAt(key.length() - 1);
}
}
public void buildEncodingTree() {
TreeMap<Character, Integer> freqArray = new TreeMap<>();
for (int i = 0; i < text.length(); i++) {
// improve here.
char c = text.charAt(i);
if (freqArray.containsKey(c)) {
Integer freq = freqArray.get(c) + 1;
freqArray.put(c, freq);
} else {
freqArray.put(c, 1);
}
}
PriorityQueue<Node> queue = new PriorityQueue<>();
for (Character c : freqArray.keySet())
queue.add(new Node(null, null, freqArray.get(c), c));
if (queue.size() == 1)
queue.add(new Node(null, null, 0, '\0'));
while (queue.size() > 1) {
Node n1 = queue.poll();
Node n2 = queue.poll();
queue.add(new Node(n1, n2, n1.value + n2.value));
}
encodingTree = queue.poll();
}
public String getCompressedTextInBits() {
StringBuilder bits = new StringBuilder();
for (int i = 0; i < text.length(); i++)
bits.append(encodingTable.get(text.charAt(i)));
return bits.toString();
}
public String getCompressedText() {
String compressedInBits = getCompressedTextInBits();
int remainder = compressedInBits.length() % CHAR_SIZE_IN_BITS;
int paddingNeededToBeDivisibleByCharSize = CHAR_SIZE_IN_BITS - remainder;
String compressed = BitsManipulator.convertBitsToText(compressedInBits + "0".repeat(paddingNeededToBeDivisibleByCharSize));
return compressed;
}
}
class BitsManipulator {
public static final int CHAR_SIZE_IN_BITS = 16;
public static int bitsInStringToInt(String bits) {
int result = 0;
for (int i = 0; i < bits.length(); i++) {
result *= 2;
result += bits.charAt(i) - '0';
}
return result;
}
public static String convertBitsToText(String bits) {
if (bits.length() % CHAR_SIZE_IN_BITS != 0)
throw new NumberOfBitsNotDivisibleBySizeOfCharException();
StringBuilder result = new StringBuilder();
for (int i = 0; i < bits.length(); i += CHAR_SIZE_IN_BITS)
result.append(asciiInBitsToChar(bits.substring(i, i + CHAR_SIZE_IN_BITS)));
return result.toString();
}
public static char asciiInBitsToChar(String bits) {
return (char) bitsInStringToInt(bits);
}
public static class NumberOfBitsNotDivisibleBySizeOfCharException extends RuntimeException {
}
}

� is the Unicode replacement character, U+FFFD. If you encode it in a non-Unicode encoding, it gets converted to a regular question mark: non-Unicode encodings can't represent every Unicode character, so as a "safety" measure everything that can't be encoded is converted to a question mark.
You seem to be confused about the difference between binary data and text data, which leads you to look at compressed data as if it were Korean text instead of binary data. You need to store (and inspect) the data as bytes, not as chars or Strings.
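To illustrate what storing the data as bytes could look like, here is a minimal sketch. It is not the asker's code: the class BitPacking and its methods packBits and unpackBits are hypothetical helpers that pack a string of '0'/'1' characters (such as the output of getCompressedTextInBits) into a byte array, which is then written and read back with java.nio.file.Files, so no character encoding is involved at any point.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BitPacking {
    // Packs a string of '0'/'1' characters into bytes, most significant bit first.
    // The last byte is padded with zero bits if the length is not a multiple of 8.
    static byte[] packBits(String bits) {
        byte[] out = new byte[(bits.length() + 7) / 8];
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                out[i / 8] |= 1 << (7 - i % 8);
            }
        }
        return out;
    }

    // Expands bytes back into a '0'/'1' string, including any padding bits.
    static String unpackBits(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 8);
        for (byte b : data) {
            for (int bit = 7; bit >= 0; bit--) {
                sb.append(((b >> bit) & 1) == 1 ? '1' : '0');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String bits = "1101100000000001";              // made-up compressed bit string
        Path file = Files.createTempFile("huffman", ".bin");
        Files.write(file, packBits(bits));             // raw bytes, no charset involved
        byte[] readBack = Files.readAllBytes(file);
        System.out.println(unpackBits(readBack).equals(bits));
        Files.delete(file);
    }
}
```

A real format would also need to record how many padding bits were added (or the original bit count), so that the decoder does not interpret the padding as data.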

ANTLR4 parse tree to DOT using DOTGenerator

How do I use DOTGenerator to convert a parse tree to DOT/graphviz format in ANTLR4?
I found this related question, but the only answer uses TreeViewer to display the tree in a JPanel, and that's not what I'm after. This other question is exactly what I need, but it didn't get answered. Everything else I stumbled upon relates to DOTTreeGenerator from ANTLR3 and isn't helpful.
I'm using Java with the ANTLR4 plugin for IntelliJ.

I have a small project with all kinds of utility methods for debugging and testing ANTLR4 grammars. I haven't found the time to write proper documentation for it so that I can put it on GitHub, but here is the part of it that is responsible for creating a DOT file from a parse tree.
Stick it all in a single file called Main.java (and of course generate the lexer and parser for Expression.g4), and you will see a DOT string being printed to your console:
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.ParseTree;
import java.util.*;
public class Main {
public static void main(String[] args) {
/*
// Expression.g4
grammar Expression;
expression
: '-' expression
| expression ('*' | '/') expression
| expression ('+' | '-') expression
| '(' expression ')'
| NUMBER
| VARIABLE
;
NUMBER
: [0-9]+ ( '.' [0-9]+ )?
;
VARIABLE
: [a-zA-Z] [a-zA-Z0-9]+
;
SPACE
: [ \t\r\n] -> skip
;
*/
String source = "3 + 42 * (PI - 3.14159)";
ExpressionLexer lexer = new ExpressionLexer(CharStreams.fromString(source));
ExpressionParser parser = new ExpressionParser(new CommonTokenStream(lexer));
SimpleTree tree = new SimpleTree.Builder()
.withParser(parser)
.withParseTree(parser.expression())
.withDisplaySymbolicName(false)
.build();
DotOptions options = new DotOptions.Builder()
.withParameters(" labelloc=\"t\";\n label=\"Expression Tree\";\n\n")
.withLexerRuleShape("circle")
.build();
System.out.println(new DotTreeRepresentation().display(tree, options));
}
}
class DotTreeRepresentation {
public String display(SimpleTree tree) {
return display(tree, DotOptions.DEFAULT);
}
public String display(SimpleTree tree, DotOptions options) {
return display(new InOrderTraversal().traverse(tree), options);
}
public String display(List<SimpleTree.Node> nodes, DotOptions options) {
StringBuilder builder = new StringBuilder("graph tree {\n\n");
Map<SimpleTree.Node, String> nodeNameMap = new HashMap<>();
int nodeCount = 0;
if (options.parameters != null) {
builder.append(options.parameters);
}
for (SimpleTree.Node node : nodes) {
nodeCount++;
String nodeName = String.format("node_%s", nodeCount);
nodeNameMap.put(node, nodeName);
builder.append(String.format(" %s [label=\"%s\", shape=%s];\n",
nodeName,
node.getLabel().replace("\"", "\\\""),
node.isTokenNode() ? options.lexerRuleShape : options.parserRuleShape));
}
builder.append("\n");
for (SimpleTree.Node node : nodes) {
String name = nodeNameMap.get(node);
for (SimpleTree.Node child : node.getChildren()) {
String childName = nodeNameMap.get(child);
builder.append(" ").append(name).append(" -- ").append(childName).append("\n");
}
}
return builder.append("}\n").toString();
}
}
class InOrderTraversal {
public List<SimpleTree.Node> traverse(SimpleTree tree) {
if (tree == null)
throw new IllegalArgumentException("tree == null");
List<SimpleTree.Node> nodes = new ArrayList<>();
traverse(tree.root, nodes);
return nodes;
}
private void traverse(SimpleTree.Node node, List<SimpleTree.Node> nodes) {
if (node.hasChildren()) {
traverse(node.getChildren().get(0), nodes);
}
nodes.add(node);
for (int i = 1; i < node.getChildCount(); i++) {
traverse(node.getChild(i), nodes);
}
}
}
class DotOptions {
public static final DotOptions DEFAULT = new DotOptions.Builder().build();
public static final String DEFAULT_PARAMETERS = null;
public static final String DEFAULT_LEXER_RULE_SHAPE = "box";
public static final String DEFAULT_PARSER_RULE_SHAPE = "ellipse";
public static class Builder {
private String parameters = DEFAULT_PARAMETERS;
private String lexerRuleShape = DEFAULT_LEXER_RULE_SHAPE;
private String parserRuleShape = DEFAULT_PARSER_RULE_SHAPE;
public DotOptions.Builder withParameters(String parameters) {
this.parameters = parameters;
return this;
}
public DotOptions.Builder withLexerRuleShape(String lexerRuleShape) {
this.lexerRuleShape = lexerRuleShape;
return this;
}
public DotOptions.Builder withParserRuleShape(String parserRuleShape) {
this.parserRuleShape = parserRuleShape;
return this;
}
public DotOptions build() {
if (lexerRuleShape == null)
throw new IllegalStateException("lexerRuleShape == null");
if (parserRuleShape == null)
throw new IllegalStateException("parserRuleShape == null");
return new DotOptions(parameters, lexerRuleShape, parserRuleShape);
}
}
public final String parameters;
public final String lexerRuleShape;
public final String parserRuleShape;
private DotOptions(String parameters, String lexerRuleShape, String parserRuleShape) {
this.parameters = parameters;
this.lexerRuleShape = lexerRuleShape;
this.parserRuleShape = parserRuleShape;
}
}
class SimpleTree {
public static class Builder {
private Parser parser = null;
private ParseTree parseTree = null;
private Set<Integer> ignoredTokenTypes = new HashSet<>();
private boolean displaySymbolicName = true;
public SimpleTree build() {
if (parser == null) {
throw new IllegalStateException("parser == null");
}
if (parseTree == null) {
throw new IllegalStateException("parseTree == null");
}
return new SimpleTree(parser, parseTree, ignoredTokenTypes, displaySymbolicName);
}
public SimpleTree.Builder withParser(Parser parser) {
this.parser = parser;
return this;
}
public SimpleTree.Builder withParseTree(ParseTree parseTree) {
this.parseTree = parseTree;
return this;
}
public SimpleTree.Builder withIgnoredTokenTypes(Integer... ignoredTokenTypes) {
this.ignoredTokenTypes = new HashSet<>(Arrays.asList(ignoredTokenTypes));
return this;
}
public SimpleTree.Builder withDisplaySymbolicName(boolean displaySymbolicName) {
this.displaySymbolicName = displaySymbolicName;
return this;
}
}
public final SimpleTree.Node root;
private SimpleTree(Parser parser, ParseTree parseTree, Set<Integer> ignoredTokenTypes, boolean displaySymbolicName) {
this.root = new SimpleTree.Node(parser, parseTree, ignoredTokenTypes, displaySymbolicName);
}
public SimpleTree(SimpleTree.Node root) {
if (root == null)
throw new IllegalArgumentException("root == null");
this.root = root;
}
public SimpleTree copy() {
return new SimpleTree(root.copy());
}
public String toLispTree() {
StringBuilder builder = new StringBuilder();
toLispTree(this.root, builder);
return builder.toString().trim();
}
private void toLispTree(SimpleTree.Node node, StringBuilder builder) {
if (node.isLeaf()) {
builder.append(node.getLabel()).append(" ");
}
else {
builder.append("(").append(node.label).append(" ");
for (SimpleTree.Node child : node.children) {
toLispTree(child, builder);
}
builder.append(") ");
}
}
@Override
public String toString() {
return String.format("%s", this.root);
}
public static class Node {
protected String label;
protected int level;
protected boolean isTokenNode;
protected List<SimpleTree.Node> children;
Node(Parser parser, ParseTree parseTree, Set<Integer> ignoredTokenTypes, boolean displaySymbolicName) {
this(parser.getRuleNames()[((RuleContext)parseTree).getRuleIndex()], 0, false);
traverse(parseTree, this, parser, ignoredTokenTypes, displaySymbolicName);
}
public Node(String label, int level, boolean isTokenNode) {
this.label = label;
this.level = level;
this.isTokenNode = isTokenNode;
this.children = new ArrayList<>();
}
public void replaceWith(SimpleTree.Node node) {
this.label = node.label;
this.level = node.level;
this.isTokenNode = node.isTokenNode;
this.children.remove(node);
this.children.addAll(node.children);
}
public SimpleTree.Node copy() {
SimpleTree.Node copy = new SimpleTree.Node(this.label, this.level, this.isTokenNode);
for (SimpleTree.Node child : this.children) {
copy.children.add(child.copy());
}
return copy;
}
public void normalizeLevels(int level) {
this.level = level;
for (SimpleTree.Node child : children) {
child.normalizeLevels(level + 1);
}
}
public boolean hasChildren() {
return !children.isEmpty();
}
public boolean isLeaf() {
return !hasChildren();
}
public int getChildCount() {
return children.size();
}
public SimpleTree.Node getChild(int index) {
return children.get(index);
}
public int getLevel() {
return level;
}
public String getLabel() {
return label;
}
public boolean isTokenNode() {
return isTokenNode;
}
public List<SimpleTree.Node> getChildren() {
return new ArrayList<>(children);
}
private void traverse(ParseTree parseTree, SimpleTree.Node parent, Parser parser, Set<Integer> ignoredTokenTypes, boolean displaySymbolicName) {
List<SimpleTree.ParseTreeParent> todo = new ArrayList<>();
for (int i = 0; i < parseTree.getChildCount(); i++) {
ParseTree child = parseTree.getChild(i);
if (child.getPayload() instanceof CommonToken) {
CommonToken token = (CommonToken) child.getPayload();
if (!ignoredTokenTypes.contains(token.getType())) {
String tempText = displaySymbolicName ?
String.format("%s: '%s'",
parser.getVocabulary().getSymbolicName(token.getType()),
token.getText()
.replace("\r", "\\r")
.replace("\n", "\\n")
.replace("\t", "\\t")
.replace("'", "\\'")) :
String.format("%s",
token.getText()
.replace("\r", "\\r")
.replace("\n", "\\n")
.replace("\t", "\\t"));
if (parent.label == null) {
parent.label = tempText;
}
else {
parent.children.add(new SimpleTree.Node(tempText, parent.level + 1, true));
}
}
}
else {
SimpleTree.Node node = new SimpleTree.Node(parser.getRuleNames()[((RuleContext)child).getRuleIndex()], parent.level + 1, false);
parent.children.add(node);
todo.add(new SimpleTree.ParseTreeParent(child, node));
}
}
for (SimpleTree.ParseTreeParent wrapper : todo) {
traverse(wrapper.parseTree, wrapper.parent, parser, ignoredTokenTypes, displaySymbolicName);
}
}
@Override
public String toString() {
return String.format("{label=%s, level=%s, isTokenNode=%s, children=%s}", label, level, isTokenNode, children);
}
}
private static class ParseTreeParent {
final ParseTree parseTree;
final SimpleTree.Node parent;
private ParseTreeParent(ParseTree parseTree, SimpleTree.Node parent) {
this.parseTree = parseTree;
this.parent = parent;
}
}
}
And if you paste the output in a DOT viewer, you will get this:

You may also want to look at alternatives. DOT graphs aren't the prettiest of the possible graph representations. Maybe you would like an SVG graph instead? If so, have a look at the ANTLR4 grammar extension for Visual Studio Code, which generates and exports an SVG graphic at the click of a mouse button (and you can style it with your own CSS).

Remove method in a binary tree

I'm trying to create a method that removes nodes from a binary tree, but I'm having a problem. The method seems to be OK, but I have another method that prints all the nodes, and when I call it after "deleting" a specific node, it prints all of them, including the one I've already deleted.
public class BinaryTree
{
Node root;
Node n;
private class Node
{
public Node f; //father
public Node right;
public Node left;
public int key; // key
public String Student;
public int Mark;
public Node(int key)
{
right = null;
left = null;
f = null;
Student = null;
Mark = 0;
}
}
public void remove()
{
System.out.println("");
System.out.println("Which student do you want to delete? Write down his ID.");
int id = Genio.getInteger();
n = new Node(id);
Node temporal = root;
if(root == null)
{
System.out.println("This tree is empty");
}
else
{
while(temporal != null)
{
n.f = temporal;
if(n.key == temporal.key)
{
if(n.f.right == null && n.f.left == null)
{
n = null;
temporal = null;
}
}
else if(n.key >= temporal.key)
{
temporal = temporal.right;
}
else
{
temporal = temporal.left;
}
}
}
}
}

What is a better method to sort strings alphabetically in a linked list that is reading in lines from a text file?

public class ContactList {
private ContactNode head;
private ContactNode last;
public ContactNode current;
public ContactList value;
public ContactList() {}
public void addNode(ContactNode input) {
if (this.head == null) {
this.head = input;
this.last = input;
} else last.setNext(input);
input.setPrev(last);
this.last = input;
}
public void traverse() {
System.out.println();
current = this.head;
while (current != null) {
System.out.print(current.getName() + " ");
System.out.println("");
current = current.getNext();
}
System.out.println();
}
public void insertNewFirstNode(String value) {
ContactNode newNode = new ContactNode(value);
head = newNode;
if (last == null) {
last = head;
}
}
public void sort() {
ContactList sorted = new ContactList();
current = head;
while (current != null) {
int index = 0;
if ((current.getName() != null)) {
index = this.current.getName().compareTo(current.getName());
if (index == 1) {
sorted.insertNewFirstNode(current.getName());
}
current = current.getNext();
} else if ((current != null)) {
System.out.print(sorted + "\n");
}
}
} // end contactList
Main Method:
import java.util.Scanner;
import java.io.FileReader;
import java.io.FileNotFoundException;
public class ContactMain {
public static void main(String[] args) {
try {
FileReader filepath = new FileReader("data1.txt");
Scanner k = new Scanner(filepath);
ContactList myList = new ContactList();
while (k.hasNextLine()) {
String i = k.nextLine();
myList.addNode(new ContactNode(i));
}
myList.traverse();
myList.sort();
myList.traverse();
} catch (FileNotFoundException e) {
System.out.println("File Not Found. ");
}
}
}
Node Class:
public class ContactNode {
private String name;
public int index;
private ContactNode prev;
public ContactNode next;
ContactNode(String a) {
name = a;
index = 0;
next = null;
prev = null;
}
public ContactNode getNext() {
return next;
}
public ContactNode getPrev() {
return prev;
}
public String getName() {
return name;
}
public int getIndex() {
return index;
}
public void setNext(ContactNode newnext) {
next = newnext;
}
public void setPrev(ContactNode newprevious) {
prev = newprevious;
}
public void setName(String a) {
name = a;
}
public void setIndex(int b) {
index = b;
}
}
I am making a program for fun that reads contact information from a text file and puts it into a linked list. I want to create a sort() method that sorts the nodes alphabetically by name. I've done a good amount of research, but my method only prints output like ContactList@282c0dbe, once for each line in my text file.

What is ContactList@282c0dbe?
It is the class name, followed by an at sign and the object's hash code in hexadecimal. All classes in Java inherit from the Object class, directly or indirectly. The Object class has some basic methods such as clone(), toString() and equals(). The default toString() method in Object returns "class name @ hash code".
What is the solution?
You need to override the toString method in the ContactList class so that it gives you clear information about the object in a readable format.
The merits of overriding toString:
It helps the programmer with logging and debugging a Java program.
Since toString is defined in java.lang.Object and its default implementation does not give valuable information, it is good practice to override it in subclasses.
@Override
public String toString(){
// I assume name is the only field in class test
return name + " " + index;
}
For sorting, you should implement the Comparator interface, since your objects do not have a natural ordering. More precisely, a Comparator lets you define an ordering that is controlled from outside the class, and it can also be used to override a class's default ordering.
read more about Comparator interface
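As a rough illustration of the Comparator approach, here is a small sketch. It does not use the question's ContactList: Contact is a simplified, hypothetical stand-in for ContactNode, and the contacts are held in an ArrayList so that the standard List.sort can be used. The case-insensitive ordering is an assumption about what "alphabetically" should mean here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ContactSortSketch {
    // Hypothetical stand-in for the question's ContactNode; only the name matters here.
    static class Contact {
        private final String name;

        Contact(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }

        // Without this override, printing would show something like Contact@1b6d3586.
        @Override
        public String toString() {
            return name;
        }
    }

    // External ordering: compare contacts by name, ignoring case.
    static final Comparator<Contact> BY_NAME =
            Comparator.comparing(Contact::getName, String.CASE_INSENSITIVE_ORDER);

    // Returns a sorted copy and leaves the original list untouched.
    static List<Contact> sortedCopy(List<Contact> contacts) {
        List<Contact> copy = new ArrayList<>(contacts);
        copy.sort(BY_NAME);
        return copy;
    }

    public static void main(String[] args) {
        List<Contact> contacts = new ArrayList<>();
        contacts.add(new Contact("Zoe"));
        contacts.add(new Contact("adam"));
        contacts.add(new Contact("Maria"));
        System.out.println(sortedCopy(contacts)); // prints [adam, Maria, Zoe]
    }
}
```

With a hand-written linked list, one option is to copy the names into such a list, sort it, and rebuild the nodes in the sorted order; the same Comparator could also drive an insertion sort over the nodes directly.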

You need a custom Comparator for sorting, and to pretty-print your list you need to implement toString() in the ContactList class.

I'm getting an OutOfMemoryError: Java heap space exception

I am currently trying to take in a text file and read each word in the file into a binary tree. The specific error I get is:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
The text file I am reading into the project was given to me by the professor for the assignment, so I know it should not be causing any memory problems. I have never dealt with this type of exception before and don't know where to start. Please help. Here is my code:
public class Tester {
public static void main(String[] args) throws FileNotFoundException {
Tester run = new Tester();
run.it();
}
public void it() throws FileNotFoundException {
BTree theTree = new BTree();
String str = this.readInFile();
String [] firstWords = this.breakIntoWords(str);
String [] finalWords = this.removeNullValues(firstWords);
for(int i = 0; i < finalWords.length; i++) {
theTree.add(finalWords[i]);
}
theTree.print();
}
public String readInFile() throws FileNotFoundException {
String myFile = "";
int numWords = 0;
Scanner myScan = new Scanner(new File("Dracula.txt"));
while(myScan.hasNext() == true) {
myFile += myScan.nextLine() + " ";
}
return myFile;
}
public String [] breakIntoWords(String myFile) {
String[] words = new String[myFile.length()];
String nextWord = "";
int position = 0;
int i = 0;
while(myFile.length() > position) {
char next = myFile.charAt(position);
next = Character.toLowerCase(next);
// First trim beginning
while (((next < 'a') || (next > 'z')) && !Character.isDigit(next)) {
position++;
next = myFile.charAt(position);
next = Character.toLowerCase(next);
}
// Now pull only letters or numbers until we hit a space
while(!Character.isWhitespace(next)) {
if (Character.isLetterOrDigit(next)) {
nextWord += myFile.charAt(position);
}
position++;
next = myFile.charAt(position);
}
words [i] = nextWord;
i++;
}
return words;
}
public String[] removeNullValues(String[] myWords) {
String[] justMyWords = new String[myWords.length];
for (int i = 0; i < myWords.length; i++) {
if (myWords[i] != null) {
justMyWords[i] = myWords[i];
}
}
return justMyWords;
}
}
Here's my B-tree class:
public class BTree {
private BTNode root;
private int nodeCount;
public boolean add(String word) {
BTNode myNode = new BTNode(word);
if(root == null) {
root = myNode;
nodeCount++;
return true;
}
if(findNode(word)) {
int tmp = myNode.getNumInstance();
tmp++;
myNode.setNumInstance(tmp);
return false;
}
BTNode temp = root;
while(temp != null) {
if(word.compareTo(temp.getMyWord()) < 0) {
if(temp.getRightChild() == null) {
temp.setLeftChild(myNode);
nodeCount++;
return true;
} else {
temp = temp.getRightChild();
}
} else {
if(temp.getLeftChild() == null) {
temp.setLeftChild(myNode);
nodeCount++;
return true;
} else {
temp = temp.getLeftChild();
}
}
}
return false;
}
public boolean findNode(String word) {
return mySearch(root, word);
}
private boolean mySearch(BTNode root, String word) {
if (root == null) {
return false;
}
if ((root.getMyWord().compareTo(word) < 0)) {
return true;
} else {
if (word.compareTo(root.getMyWord()) > 0) {
return mySearch(root.getLeftChild(), word);
} else {
return mySearch(root.getRightChild(), word);
}
}
}
public void print() {
printTree(root);
}
private void printTree(BTNode root) {
if (root == null) {
System.out.print(".");
return;
}
printTree(root.getLeftChild());
System.out.print(root.getMyWord());
printTree(root.getRightChild());
}
public int wordCount() {
return nodeCount;
}
}
And my B-tree node class:
public class BTNode {
private BTNode rightChild;
private BTNode leftChild;
private String myWord;
private int numWords;
private int numInstance;
private boolean uniqueWord;
private boolean isRoot;
private boolean isDeepest;
public BTNode(String myWord){
this.numInstance = 1;
this.myWord = myWord;
this.rightChild = null;
this.leftChild = null;
}
public String getMyWord() {
return myWord;
}
public void setMyWord(String myWord) {
this.myWord = myWord;
}
public BTNode getRightChild() {
return rightChild;
}
public void setRightChild(BTNode rightChild) {
this.rightChild = rightChild;
}
public BTNode getLeftChild() {
return leftChild;
}
public void setLeftChild(BTNode leftChild) {
this.leftChild = leftChild;
}
public int getnumWords() {
return numWords;
}
public void setnumWords(int numWords) {
this.numWords = numWords;
}
public boolean isUniqueWord() {
return uniqueWord;
}
public void setUniqueWord(boolean uniqueWord) {
this.uniqueWord = uniqueWord;
}
public boolean isRoot() {
return isRoot;
}
public void setRoot(boolean isRoot) {
this.isRoot = isRoot;
}
public boolean isDeepest() {
return isDeepest;
}
public void setDeepest(boolean isDeepest) {
this.isDeepest = isDeepest;
}
public int getNumInstance() {
return numInstance;
}
public void setNumInstance(int numInstance) {
this.numInstance = numInstance;
}
}

This little file should not be the reason for the OutOfMemory error.
Performance
This is not an error, but if you want to read a whole file into memory,
don't read it line by line and concatenate the strings. That slows down your program.
You can use:
String myFile = new String(Files.readAllBytes(Paths.get("Dracula.txt")));
myFile = myFile.replaceAll("\\R", " "); // \R matches any line break (\r\n, \n or \r), Java 8+
return myFile;
That is not super fast either, but it is faster.
Now the Errors
The words array is too large
public String[] breakIntoWords(String myFile) {
String[] words = new String[myFile.length()];
You define words as an array whose length is the length of the file in characters. That is much too large.
The name suggests an array whose length is the number of words in the file.
nextWord is never reset (cause of the OutOfMemoryError)
// Now pull only letters or numbers until we hit a space
while (!Character.isWhitespace(next)) {
if (Character.isLetterOrDigit(next)) {
nextWord += myFile.charAt(position);
}
position++;
next = myFile.charAt(position);
}
words[i] = nextWord;
i++;
nextWord is never set back to "" after it is assigned to words[i], so it grows
word by word and your array contents look like this:
words[0] = "Word1"
words[1] = "Word1Word2"
words[2] = "Word1Word2Word3"
As you can imagine, that uses a very large amount of memory.
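Both points above (the oversized array and the word buffer that is never cleared) can be avoided together. Here is a minimal sketch, not the original poster's code: the class name WordSplitter is made up for illustration, it uses a List instead of a fixed array, and it lowercases the stored words, which the question's code does not do.

```java
import java.util.ArrayList;
import java.util.List;

class WordSplitter {
    // Splits text into words at whitespace, keeping only letters and digits.
    // Assumption: words are stored in lowercase (the question's code only
    // lowercases the character it inspects, not the one it stores).
    static List<String> breakIntoWords(String text) {
        List<String> words = new ArrayList<>();      // grows as needed, no nulls to strip
        StringBuilder current = new StringBuilder(); // buffer for the word being built
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (Character.isWhitespace(c) && current.length() > 0) {
                words.add(current.toString());
                current.setLength(0);                // reset the buffer for the next word
            }
            // any other character (punctuation) is skipped
        }
        if (current.length() > 0) {
            words.add(current.toString());           // last word if the text has no trailing space
        }
        return words;
    }
}
```

Because the loop condition is the only place the index is advanced, this version also cannot read past the end of the string, and removeNullValues is no longer needed.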

When you are building the tree, you are inserting nodes on the wrong side: in the branch that checks the right child, you should insert the element to the right.
You should replace this code in the BTree class:
while(temp != null) {
if(word.compareTo(temp.getMyWord()) < 0) {
if(temp.getRightChild() == null) {
temp.setRightChild(myNode); // <-- You were using setLeftChild()
nodeCount++;
return true;
} else {
temp = temp.getRightChild();
}
....
}
You are probably creating a huge tree with all the elements on the left side and getting the OutOfMemoryError.
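For comparison, here is a minimal sketch of an insert that always checks and sets the same child and counts repeated words on the node that already holds them. It is not the poster's BTree: the names WordTree, Node and countOf are made up, and it uses the conventional ordering (smaller words to the left), which is the mirror image of the snippet above.

```java
class WordTree {
    // One node per distinct word; count tracks how often the word was added.
    private static final class Node {
        final String word;
        int count = 1;
        Node left, right;
        Node(String word) { this.word = word; }
    }

    private Node root;
    private int nodeCount;

    // Returns true if a new node was created, false if an existing count was incremented.
    boolean add(String word) {
        if (root == null) {
            root = new Node(word);
            nodeCount++;
            return true;
        }
        Node cur = root;
        while (true) {
            int cmp = word.compareTo(cur.word);
            if (cmp == 0) {            // word already present: bump the existing node
                cur.count++;
                return false;
            }
            if (cmp < 0) {             // smaller words go left
                if (cur.left == null) { cur.left = new Node(word); nodeCount++; return true; }
                cur = cur.left;
            } else {                   // larger words go right
                if (cur.right == null) { cur.right = new Node(word); nodeCount++; return true; }
                cur = cur.right;
            }
        }
    }

    // Number of times the word was added, or 0 if it is not in the tree.
    int countOf(String word) {
        Node cur = root;
        while (cur != null) {
            int cmp = word.compareTo(cur.word);
            if (cmp == 0) return cur.count;
            cur = (cmp < 0) ? cur.left : cur.right;
        }
        return 0;
    }

    int size() { return nodeCount; }
}
```

Doing the equality check inside the insert loop also avoids the separate findNode pass. In the question's code that pass returns true when compareTo is less than 0 rather than equal to 0, and the count is then incremented on the freshly created node instead of the one already in the tree.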

Add VM arguments:
-Xms<size> set initial Java heap size
-Xmx<size> set maximum Java heap size
-Xss<size> set java thread stack size
or run it using: java -Xmx256m YourClass (pass the compiled class name, without the .java extension)

It depends on various factors.
The amount of Java heap you are running with (default values differ between 32-bit and 64-bit JDKs)
The size of the file you feed to the Java program
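To see which heap limit your JVM is actually running with, you can ask the Runtime. A small sketch (the class name HeapInfo is just for illustration):

```java
class HeapInfo {
    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it with and without an -Xmx setting shows whether the option is being picked up.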

You are trying to load the entire contents of the file into Java memory. While the file is small, the code above will work within your limited memory, but once the file grows you will run into this problem.
A better approach is to read the file contents in chunks. Otherwise you will face the same issue.
Increasing the JVM memory arguments will not help for larger files either.
I suspect your professor is also testing how your project is implemented.
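Reading in chunks could look roughly like this: a sketch that processes one line at a time and hands each word to a callback, so the whole file is never held in a single String. The names StreamingWords and forEachWord are made up, and the word cleanup keeps ASCII letters and digits only, which is a simplification.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

class StreamingWords {
    // Reads the source line by line and passes every lowercase word to the sink.
    // Returns the number of words delivered. Only one line is in memory at a time.
    static int forEachWord(Reader source, Consumer<String> sink) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String token : line.split("\\s+")) {
                    // Assumption: strip everything except ASCII letters and digits.
                    String word = token.replaceAll("[^A-Za-z0-9]", "").toLowerCase();
                    if (!word.isEmpty()) {
                        sink.accept(word);
                        count++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

With the file from the question, the call would be something like forEachWord(Files.newBufferedReader(Paths.get("Dracula.txt")), theTree::add), so each word goes straight into the tree without an intermediate array.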
