Efficient algorithm for expression parser (Java)

I am trying to design an algorithm that can parse an expression given as a String. I want to be able to extract the operands and the operation from the expression. I also want the algorithm to check that the brackets are balanced. There is no need to handle operator precedence, as the input will include brackets whenever there is more than one binary operation. For unary operations, if a "-" appears before a bracket, the entire expression inside the respective brackets is the operand. Examples:
-parsing "a+b" gives "a" and "b" as operands and "+" as operation.
-parsing "(a/b) - (c*v)" gives "a/b" and "c*v" as operands and "-" as operation.
-parsing "((a/(b))) - (((c)*v))" gives the same result as above
-parsing "-a" gives operand as "a" and operation as "-"
-parsing "a + (-c/v)" gives "a" and "-c/v" as operands and "+" as operation
-parsing "-(c)" gives "c" as operand and "-" as operation
-parsing "(-(c))" gives same result as above
Thanks

Try this.
record Node(String name, Node left, Node right) {
@Override
public String toString() {
return "Node[" + name
+ (left != null ? ", " + left : "")
+ (right != null ? ", " + right : "") + "]";
}
}
and
static Node parse(String input) {
return new Object() {
int index = 0;
int ch() { return index < input.length() ? input.charAt(index) : -1; }
boolean eat(char expected) {
while (Character.isWhitespace(ch())) ++index;
if (ch() == expected) {
++index;
return true;
}
return false;
}
Node factor() {
Node node;
boolean minus = eat('-');
if (eat('(')) {
node = expression();
if (!eat(')'))
throw new RuntimeException("')' expected");
} else if (Character.isAlphabetic(ch())) {
node = new Node(Character.toString(ch()), null, null);
++index;
} else
throw new RuntimeException("unknown char '" + (char)ch() + "'");
if (minus) node = new Node("-", node, null);
return node;
}
Node expression() {
Node node = factor();
while (true)
if (eat('*')) node = new Node("*", node, factor());
else if (eat('/')) node = new Node("/", node, factor());
else if (eat('+')) node = new Node("+", node, factor());
else if (eat('-')) node = new Node("-", node, factor());
else break;
return node;
}
}.expression();
}
test:
static void testParse(String input) {
System.out.printf("%-22s -> %s%n", input, parse(input));
}
public static void main(String[] args) {
testParse("a+b");
testParse("(a/b) - (c*v)");
testParse("((a/(b))) - (((c)*v))");
testParse("-a");
testParse("-a + (-c/v)");
testParse("-(c)");
testParse("(-(c))");
}
output:
a+b -> Node[+, Node[a], Node[b]]
(a/b) - (c*v) -> Node[-, Node[/, Node[a], Node[b]], Node[*, Node[c], Node[v]]]
((a/(b))) - (((c)*v)) -> Node[-, Node[/, Node[a], Node[b]], Node[*, Node[c], Node[v]]]
-a -> Node[-, Node[a]]
-a + (-c/v) -> Node[+, Node[-, Node[a]], Node[/, Node[-, Node[c]], Node[v]]]
-(c) -> Node[-, Node[c]]
(-(c)) -> Node[-, Node[c]]
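If the operands are wanted as plain strings, as in the question's examples, rather than as a tree, a bracket-depth scan is one possible alternative. The following is only a sketch under assumptions: the class name TopLevelSplit and its methods are made up for illustration, there is at most one binary operator outside all brackets, and a leading "-" is treated as unary. It strips brackets that wrap the whole expression, then splits at the first operator found at depth 0, and throws if the brackets do not balance.

```java
import java.util.Arrays;

class TopLevelSplit {
    // Removes whitespace and any brackets that wrap the whole expression.
    static String strip(String s) {
        s = s.replaceAll("\\s+", "");
        while (s.length() >= 2 && s.charAt(0) == '(' && wrapsWhole(s)) {
            s = s.substring(1, s.length() - 1);
        }
        return s;
    }

    // True if the '(' at index 0 is closed by the final character.
    static boolean wrapsWhole(String s) {
        int depth = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '(') depth++;
            else if (s.charAt(i) == ')' && --depth == 0) return i == s.length() - 1;
        }
        throw new IllegalArgumentException("unbalanced brackets: " + s);
    }

    // Returns {operator, left, right} for a binary operation,
    // {"-", operand} for a unary minus, or {operand} if there is no operator.
    static String[] split(String input) {
        String s = strip(input);
        int depth = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                if (--depth < 0) throw new IllegalArgumentException("unbalanced brackets: " + s);
            } else if (depth == 0 && i > 0 && "+-*/".indexOf(c) >= 0) {
                // i > 0 skips a leading '-', which is treated as unary
                return new String[] {
                    String.valueOf(c), strip(s.substring(0, i)), strip(s.substring(i + 1)) };
            }
        }
        if (depth != 0) throw new IllegalArgumentException("unbalanced brackets: " + s);
        if (s.startsWith("-")) return new String[] { "-", strip(s.substring(1)) };
        return new String[] { s };
    }

    public static void main(String[] args) {
        for (String e : new String[] { "a+b", "(a/b) - (c*v)", "-a", "a + (-c/v)", "-(c)", "(-(c))" }) {
            System.out.println(e + " -> " + Arrays.toString(split(e)));
        }
    }
}
```

Note that this sketch only removes brackets around a whole operand, so "((a/(b)))" comes back as "a/(b)" rather than "a/b"; calling split again on each operand, or building a tree as in the answer above, would be needed to normalise nested brackets.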

Related

Writing a recursive method that creates a random expression

I am supposed to write a recursive method for generating an expression in an expression tree. The method should have a parameter that limits the "height" of the expression tree. The maxHeight gives the maximum allowable height, not the actual height. The nodes for constants, variables, and binary operators have already been created but I am having issues getting the method to work. Below is the method that I have created and one of the tree nodes. I am still learning Java so please be kind. Any help will be appreciated.
static ExpNode randomExpression(int maxHeight) {
ExpNode e1 = new BinOpNode('+', new VariableNode(), new ConstNode(maxHeight));
ExpNode e2 = new BinOpNode('*', new ConstNode(maxHeight), new VariableNode());
ExpNode e3 = new BinOpNode('*', e1, e2);
ExpNode e4 = new BinOpNode('-', e1, new ConstNode(-3));
if (maxHeight < 0) {
return null;
}
if (maxHeight == 0) {
return new BinOpNode('-', e1, e2);
}
if (maxHeight > 0) {
maxHeight++;
return new BinOpNode('/', e3, e4);
}
return randomExpression(maxHeight);
}
static class BinOpNode extends ExpNode {
char op; // the operator, which must be '+', '-', '*', or '/'
ExpNode left, right; // the expression trees for the left and right operands.
BinOpNode(char op, ExpNode left, ExpNode right) {
if (op != '+' && op != '-' && op != '*' && op != '/')
throw new IllegalArgumentException("'" + op + "' is not a legal operator.");
this.op = op;
this.left = left;
this.right = right;
}
double value(double x) {
double a = left.value(x); // value of the left operand expression tree
double b = right.value(x); // value of the right operand expression tree
switch (op) {
case '+': return a + b;
case '-': return a - b;
case '*': return a * b;
default: return a / b;
}
}
public String toString() {
return "(" + left.toString() + op + right.toString() + ")";
}
}
}
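One way the recursion could be structured is sketched below. This is not the course's reference solution: the node types are simplified stand-ins declared here (records with a hypothetical height() method, with a leaf counted as height 0), and the Random parameter is an addition so that results can be reproduced. The key idea is that each recursive call passes maxHeight - 1, so the limit shrinks until a leaf is forced at 0; a random early leaf lets the tree be shorter than the limit.

```java
import java.util.Random;

class RandomExpr {
    // Simplified stand-ins for the question's node classes (assumed shapes).
    interface ExpNode { int height(); }

    record ConstNode(int value) implements ExpNode {
        public int height() { return 0; }
        public String toString() { return String.valueOf(value); }
    }

    record VariableNode() implements ExpNode {
        public int height() { return 0; }
        public String toString() { return "x"; }
    }

    record BinOpNode(char op, ExpNode left, ExpNode right) implements ExpNode {
        public int height() { return 1 + Math.max(left.height(), right.height()); }
        public String toString() { return "(" + left + op + right + ")"; }
    }

    static ExpNode randomExpression(int maxHeight, Random rnd) {
        if (maxHeight < 0) throw new IllegalArgumentException("maxHeight must be >= 0");
        // Base case: a leaf is forced at height 0, and chosen at random otherwise
        // so that the result may be shorter than the allowed maximum.
        if (maxHeight == 0 || rnd.nextInt(4) == 0) {
            return rnd.nextBoolean() ? new VariableNode() : new ConstNode(rnd.nextInt(10));
        }
        // Recursive case: both subtrees get a strictly smaller height budget.
        char op = "+-*/".charAt(rnd.nextInt(4));
        return new BinOpNode(op,
                randomExpression(maxHeight - 1, rnd),
                randomExpression(maxHeight - 1, rnd));
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(randomExpression(3, rnd));
        }
    }
}
```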

Return value changing between functions

I am making a recursive descent parser with this grammar.
Expr -> Term ( '+' | '-' ) Expr | Term
Term -> Number ( '*' | '/' ) Term | Number
Number -> any valid Java double
My getTerm method looks like this.
private static BTree getTerm(Tokenizer tokens)
{
String tokenHold = "";
BTree result = new BTree(getNumber(tokens).getElement());
System.out.println("VALUE of result : " + result.toString());
while(tokens.hasToken() && ("*/".indexOf(tokens.peekToken()) != -1)){
BTree newTree = null;
boolean isMulti = false;
boolean isDiv = false;
if(tokens.peekToken().equals("*")){
isMulti = true;
}
if(tokens.peekToken().equals("/")){
isDiv = true;
}
if(isMulti) {
newTree = new BTree( "*" );
}
else if(isDiv){
newTree = new BTree( "/" );
}
tokenHold = tokens.nextToken();
newTree.addLeftTree(result);
newTree.addRightTree(getTerm(tokens));
result = newTree;
}
System.out.println("Expression of result : " + result.toString());
return result;
}
It returns to the getExpr method which looks like
private static BTree getExpr(Tokenizer tokens)
{
String tokenHold = "";
BTree result = new BTree(getTerm(tokens).getElement());//consumes term
System.out.println("Expression of result in getExpr: " + result.toString());
while(tokens.hasToken() && ("+-".indexOf(tokens.peekToken()) != -1)){
BTree newTree = null;
boolean isAdd = false;
boolean isSub = false;
if(tokens.peekToken().equals("+")){isAdd = true;}
if(tokens.peekToken().equals("-")){isSub = true;}
if(isAdd){ newTree = new BTree( "+" );}
else if(isSub){ newTree = new BTree( "-" );}
tokenHold = tokens.nextToken();
newTree.addRightTree(result);
newTree.addLeftTree(getTerm(tokens)); // old tree on the right
result = newTree;
}
return result;
}
Constructors for the BTree
public BTree(String element)
{
this.element = element;
left = null;
right = null;
}
public BTree(String element, BTree left, BTree right)
{
this.element = element;
this.left = left;
this.right = right;
}
When I input 4 / 2 / 2, the getTerm method returns the correct value, "(/ 4 (/ 2 2))", but getExpr only sees "/". I have sat and tried to figure out my issue, but I think I might have a fundamental misunderstanding of how these two methods pass values to each other. I also have a feeling it is because of the recursion. I will answer this question myself if I figure it out. Thanks in advance.
Alright, I finally figured it out.
In my getExpr method I was using a constructor for the binary tree that I didn't include in the original question. It looks like this:
public BTree(String element)
{
this.element = element;
left = null;
right = null;
}
I should have been using the constructor that had both the left and right child of the tree. This constructor looks like this.
public BTree(String element, BTree left, BTree right)
{
this.element = element;
this.left = left;
this.right = right;
}
Because I was not using the correct constructor when passing this value from the getTerm method to the getExpr method, I lost the children and kept only the root. I am new to binary trees, recursion, and ASTs, and sometimes forget the big picture when working with these tools.
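The effect can be reproduced in isolation. The sketch below uses a cut-down BTree declared just for this demonstration (its toString format is an assumption modelled on the "(/ 4 (/ 2 2))" output quoted in the question). Copying only getElement() into a new node drops both subtrees, whereas keeping the returned tree itself preserves them.

```java
class BTreeDemo {
    // Cut-down stand-in for the question's BTree class.
    static class BTree {
        final String element;
        final BTree left, right;

        BTree(String element) { this(element, null, null); }

        BTree(String element, BTree left, BTree right) {
            this.element = element;
            this.left = left;
            this.right = right;
        }

        String getElement() { return element; }

        @Override
        public String toString() {
            if (left == null && right == null) return element;
            return "(" + element + " " + left + " " + right + ")";
        }
    }

    public static void main(String[] args) {
        // What getTerm would return for 4 / 2 / 2
        BTree term = new BTree("/", new BTree("4"),
                new BTree("/", new BTree("2"), new BTree("2")));

        // Copying only the root element loses the children
        BTree copiedRootOnly = new BTree(term.getElement());

        System.out.println(term);           // (/ 4 (/ 2 2))
        System.out.println(copiedRootOnly); // /
    }
}
```

Under these assumptions, assigning the returned tree directly (for example BTree result = getTerm(tokens);) would avoid the copy altogether.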

ExpressionTree: Postfix to Infix

I am having problems getting my toString() method to print parentheses in my infix notation. For example, right now if I enter 12+3* it prints 1 + 2 * 3. I would like it to print ((1+2)*3).
Also, I would like my expression tree to be built when the input contains spaces. For example, right now if I enter 12+ it works, but I want to be able to enter 1 2 + and have it still work. Any thoughts?
P.S. Ignore my evaluate method I haven't implemented it yet!
// Java program to construct an expression tree
import java.util.EmptyStackException;
import java.util.Scanner;
import java.util.Stack;
import javax.swing.tree.TreeNode;
// Java program for expression tree
class Node {
char ch;
Node left, right;
Node(char item) {
ch = item;
left = right = null;
}
public String toString() {
return (right == null && left == null) ? Character.toString(ch) : "(" + left.toString()+ ch + right.toString() + ")";
}
}
class ExpressionTree {
static boolean isOperator(char c) {
if ( c == '+' ||
c == '-' ||
c == '*' ||
c == '/'
) {
return true;
}
return false;
}
// Utility function to do inorder traversal
public void inorder(Node t) {
if (t != null) {
inorder(t.left);
System.out.print(t.ch + " ");
inorder(t.right);
}
}
// Returns root of constructed tree for given
// postfix expression
Node constructTree(char postfix[]) {
Stack<Node> st = new Stack<>();
Node t, t1, t2;
for (int i = 0; i < postfix.length; i++) {
// If operand, simply push into stack
if (!isOperator(postfix[i])) {
t = new Node(postfix[i]);
st.push(t);
} else // operator
{
t = new Node(postfix[i]);
// Pop two top nodes
// Store top
t1 = st.pop(); // Remove top
t2 = st.pop();
// make them children
t.right = t1;
t.left = t2;
// System.out.println(t1 + "" + t2);
// Add this subexpression to stack
st.push(t);
}
}
// only element will be root of expression
// tree
t = st.peek();
st.pop();
return t;
}
public static void main(String args[]) {
Scanner input = new Scanner(System.in);
/*boolean keepgoing = true;
while (keepgoing) {
String line = input.nextLine();
if (line.isEmpty()) {
keepgoing = false;
} else {
Double answer = calculate(line);
System.out.println(answer);
}
}*/
ExpressionTree et = new ExpressionTree();
String postfix = input.nextLine();
char[] charArray = postfix.toCharArray();
Node root = et.constructTree(charArray);
System.out.println("infix expression is");
et.inorder(root);
}
public double evaluate(Node ptr)
{
if (ptr.left == null && ptr.right == null)
return toDigit(ptr.ch);
else
{
double result = 0.0;
double left = evaluate(ptr.left);
double right = evaluate(ptr.right);
char operator = ptr.ch;
switch (operator)
{
case '+' : result = left + right; break;
case '-' : result = left - right; break;
case '*' : result = left * right; break;
case '/' : result = left / right; break;
default : result = left + right; break;
}
return result;
}
}
private boolean isDigit(char ch)
{
return ch >= '0' && ch <= '9';
}
private int toDigit(char ch)
{
return ch - '0';
}
}
Why do you use inorder()? root.toString() returns exactly what you want, "((1+2)*3)".
You can skip spaces at the start of the loop:
for (int i = 0; i < postfix.length; i++) {
if (postfix[i] == ' ')
continue;
...
Change main like this.
Scanner input = new Scanner(System.in);
String postfix = input.nextLine();
char[] charArray = postfix.replace(" ", "").toCharArray();
Node root = new ExpressionTree().constructTree(charArray); // constructTree is an instance method
System.out.println("infix expression is");
System.out.println(root);
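If operands longer than one character are ever needed, splitting the input on whitespace instead of reading it character by character is one option. The sketch below is an assumption-laden variant, not the code above: the class name PostfixToInfix is made up, tokens must be separated by spaces (so "12+3*" would no longer be accepted), and it builds the parenthesised string directly on a stack rather than building Node objects.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixToInfix {
    // Converts a whitespace-separated postfix expression to fully parenthesised infix.
    static String toInfix(String postfix) {
        Deque<String> stack = new ArrayDeque<>();
        for (String token : postfix.trim().split("\\s+")) {
            if (token.isEmpty()) continue; // only happens for blank input
            if (token.length() == 1 && "+-*/".contains(token)) {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("missing operand before '" + token + "'");
                }
                String right = stack.pop(); // top of the stack is the right operand
                String left = stack.pop();
                stack.push("(" + left + token + right + ")");
            } else {
                stack.push(token); // operand, possibly several characters long
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed postfix expression");
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(toInfix("1 2 + 3 *"));  // ((1+2)*3)
        System.out.println(toInfix("12 3 +"));     // (12+3)
    }
}
```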

Reduce number of parentheses for a binary expression tree

I have a function that receives a binary expression tree and returns a String with the expression in-order. The only "problem" is that the resulting expression has too many parentheses,
e.g.: The function returns (a + (b * c)), but it can be reduced to a + b * c.
It is defined with the binary operators +, -, *, /, and the unary operator _ (negative).
What I really want to know is if I can modify the already existing function to reduce the number of parentheses in an efficient way, or create another function that operates with the String of the in-order expression.
The function is as follows:
private static String infijo(ArbolB t){
String s = "";
if (t != null) {
String info = String.valueOf(t.info);
if ("+-*/".contains(info)) s += "(";
if ("_".contains(info)) s += "-(";
s += infijo(t.left) + (info.equals("_") ? "" : info) + infijo(t.right);
if ("+-*/_".contains(String.valueOf(t.info))) s += ")";
}
return s;
}
Where ArbolB is a binary tree defined by:
public class ArbolB {
ArbolB right;
ArbolB left;
Object info;
public ArbolB(Object info, ArbolB right, ArbolB left){
this.info = info;
this.right = right;
this.left = left;
}
}
After writing this whole thing out, I realized that I didn't really answer your question properly (my solution ignores PEMDAS and just matches pairs, d'oh!). So, take from this what you can... I'm not throwing it out :P
I think you COULD solve this either way, but here would be my preferred method, using and trusting what you already have. There's probably a good way to use nodes to do this, but why not use what you have, right?
Starting from the point where you have your expression as a string (e.g. "((2*2) + _(3+3))"), you could try something like this (written in C#):
public string RemoveUnnecessaryParens(string expression)
{
const string openParen = "(";
const string closeParen = ")";
// char array would also work for this
// multiple ways to track "balance" of parens, lets use int
int unclosedParenCount = 0;
string toReturn = "";
// iterate through the expression
for (int i = 0; i < expression.Length; i++)
{
string current = expression.Substring(i,1);
if (openParen == current)
unclosedParenCount++;
else if (closeParen == current)
unclosedParenCount--;
else
toReturn += current;
if (unclosedParenCount < 0) throw new UnbalancedExpressionException(); // One more close than open! Uh oh!
}
if (0 != unclosedParenCount) throw new UnbalancedExpressionException(); // One more open than close at the end! Uh oh!
else return toReturn;
}
Make sense?
Well, after thinking about it for a while, I got to a solution myself by adding a priority function that determines when parentheses are necessary, and a variable that indicates whether the operation is on the left or the right side of the formula. This is because a-b+c doesn't need parentheses, but c+(a-b) does.
private static String infijo(ArbolB t, int priority, boolean right) {
String s = "";
int oP = 0;
if (t != null) {
String info = String.valueOf(t.info);
int pi = priority(info);
if ("+-*/".contains(info)) {
/* this only adds parentheses if the operation has higher priority or if the
operation on the right side should be done before the one on the left side*/
if ("+-*/".contains(info)) {
if (pi/2 < priority/2) s += "(";
else s += pi/2 == priority/2 && pi != priority && right ? "(" : "";
oP = priority; //stores the old priority
priority= pi; //priority of the new operator
}
}
if ("_".contains(info)) {
s += "-";
oP = priority;
priority = pi;
}
s += infijo(t.left, priority, false) + (info.equals("_") ? "" : info)
+ infijo(t.right, priority, true);
if ("+-*/".contains(info)) {
// adds the closing parentheses following the same rules as for the opening ones
if (priority / 2 < oP / 2) s += ")";
else s += priority / 2 == oP / 2 && priority != oP && right ? ")" : "";
}
}
return s;
}
private static int priority(String op) {
if ("_".contains(op)) return 4;
if ("/".contains(op)) return 3;
if ("*".contains(op)) return 2;
if ("-".contains(op)) return 1;
return 0;
}
@Override
public String toString() {
ArbolB f = getFormula(); //this returns the Binary Expression Tree
return infijo(f, Integer.MIN_VALUE, false);
}
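For comparison, here is a compact sketch of the same idea with a simpler rule set. It is not the code above: the Node record is a hypothetical stand-in for ArbolB, a unary "_" node is assumed to keep its operand in left, and "*" and "/" share one precedence level, as do "+" and "-". The rule is conservative: a left operand is bracketed only when it binds less tightly than its parent, while a right operand is also bracketed at equal precedence, so the output reads back as the same tree under left-associative parsing.

```java
class MinimalParens {
    // Hypothetical node type: op is "_" for unary minus (operand in left),
    // a binary operator, or a leaf label when both children are null.
    record Node(String op, Node left, Node right) {
        static Node leaf(String name) { return new Node(name, null, null); }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Higher numbers bind more tightly.
    static int prec(Node n) {
        if (n.isLeaf()) return 4;
        switch (n.op()) {
            case "_": return 3;
            case "*": case "/": return 2;
            default: return 1; // "+" and "-"
        }
    }

    static String wrap(Node child, boolean needed) {
        String s = infix(child);
        return needed ? "(" + s + ")" : s;
    }

    static String infix(Node n) {
        if (n.isLeaf()) return n.op();
        int p = prec(n);
        if (n.op().equals("_")) {
            // Bracket anything that is not a leaf, so "-(a+b)" and "-(-a)" stay unambiguous.
            return "-" + wrap(n.left(), prec(n.left()) <= p);
        }
        String l = wrap(n.left(), prec(n.left()) < p);
        // Right side: bracket equal precedence too, and any unary minus so "a--b" never appears.
        int pr = prec(n.right());
        String r = wrap(n.right(), pr <= p || pr == 3);
        return l + n.op() + r;
    }

    public static void main(String[] args) {
        Node a = Node.leaf("a"), b = Node.leaf("b"), c = Node.leaf("c");
        System.out.println(infix(new Node("+", a, new Node("*", b, c)))); // a+b*c
        System.out.println(infix(new Node("+", new Node("-", a, b), c))); // a-b+c
        System.out.println(infix(new Node("+", c, new Node("-", a, b)))); // c+(a-b)
    }
}
```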

Constructing a Tree from an arithmetic expression

I'm pretty lost at the moment on how to implement this Tree. I'm trying to construct a Tree from the string input "(4 + 6) + (2 + 3)". How would I go about making a Tree from two Stacks?
public class Tree {
private Stack opStk = new Stack();
private Stack valStk = new Stack();
private Tree parent = null;
public Tree(String str){
System.out.println((EvaluateExpression(str)));
}
public void doOperation() {
Object x = valStk.pop();
Object y = valStk.pop();
Object op = opStk.pop();
if ((Integer) x <= 0 || (Integer) y <= 0){
throw new NumberFormatException();
}
if (op.equals("+")) {
int sum = (Integer) x + (Integer) y;
valStk.push(sum);
}
}
public void repeatOps(char refOp) {
while (valStk.size() > 1 &&
prec(refOp) <= prec((char)opStk.pop())) {
doOperation();
}
}
int prec(char op) {
switch (op) {
case '+':
case '-':
return 0;
case '*':
case '/':
return 1;
case '^':
return 2;
default:
throw new IllegalArgumentException("Operator unknown: " + op);
}
}
public Object EvaluateExpression(String str) {
System.out.println("Evaluating " + str);
Scanner s = null;
try {
s = new Scanner(str);
//while there is tokens to be read
while (s.hasNext()) {
//if token is an int
if (s.hasNextInt()) {
//read it
int val = s.nextInt();
if(val <= 0) {
throw new NumberFormatException("Non-positive");
}
System.out.println("Val " + val);
//push it to the stack
valStk.push(val);
} else {
//push operator
String next = s.next();
char chr = next.charAt(0);
System.out.println("Repeat Ops " + chr);
repeatOps(chr);
System.out.println("Op " + next);
opStk.push(chr);
}
repeatOps('+');
}
} finally {
if (s != null) {
s.close();
}
}
System.out.println("Should be the result: " + valStk.pop());
return valStk.pop();
}
I have a few suggestions to make that might set you on the right path (hopefully).
Firstly I suggest your expression tree follow the Composite Design Pattern. It works very well for these types of hierarchies. For your purpose it would look something like:
interface Expression {
int getValue();
}
class Constant implements Expression {
private int value;
public int getValue() {
return value;
}
}
class Operation implements Expression {
private Expression operand1;
private Operator operator;
private Expression operand2;
public int getValue() {
return operator.apply(operand1.getValue(), operand2.getValue());
}
}
Note that you don't need any concept of operator precedence or parentheses: it's entirely implicit in how the tree is constructed. For example "3 + 4 * 2" should result in a tree "(+ 3 (* 4 2))" while "(3 + 4) * 2" should result in a tree "(* (+ 3 4) 2)".
Secondly I suggest you make your operators into an enum rather than relying on the string values:
enum Operator {
TIMES((n1, n2) -> n1 * n2),
DIVIDE((n1, n2) -> n1 / n2),
PLUS((n1, n2) -> n1 + n2),
MINUS((n1, n2) -> n1 - n2);
private final BinaryOperator<Integer> operation;
Operator(BinaryOperator<Integer> operation) {
this.operation = operation;
}
public int apply(int operand1, int operand2) {
return operation.apply(operand1, operand2);
}
}
The advantage of this approach is that it's trivial to add new operators without changing the structure of the tree at all.
Thirdly I suggest you split your conversion from string to expression tree into two steps. The first converts the string to tokens and the second converts the tokens to a tree. In the jargon these are called lexical analysis and syntactic analysis (parsing).
If you are using the shunting yard algorithm for the parsing step then keep in mind that the output stack will hold Expression instances ready to become operands. I can give you more detail on how to shunt operators but it's probably worth you giving the suggestions above a try first.
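To make the three suggestions concrete, here is one possible way they might fit together. Treat it as a sketch rather than the definitive version: the class name ShuntingYardSketch, the symbol and precedence fields on Operator, and the helper methods tokenize, reduce and parse are all assumptions added for illustration. It handles only non-negative integers, the four binary operators and parentheses, and it assumes well-formed input (mismatched parentheses are not reported gracefully).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.IntBinaryOperator;

class ShuntingYardSketch {
    interface Expression { int getValue(); }

    record Constant(int value) implements Expression {
        public int getValue() { return value; }
    }

    enum Operator {
        PLUS('+', 0, (a, b) -> a + b),
        MINUS('-', 0, (a, b) -> a - b),
        TIMES('*', 1, (a, b) -> a * b),
        DIVIDE('/', 1, (a, b) -> a / b);

        final char symbol;
        final int precedence;
        final IntBinaryOperator function;

        Operator(char symbol, int precedence, IntBinaryOperator function) {
            this.symbol = symbol;
            this.precedence = precedence;
            this.function = function;
        }

        // Returns the operator for a symbol, or null if there is none.
        static Operator of(char c) {
            for (Operator op : values()) if (op.symbol == c) return op;
            return null;
        }
    }

    record Operation(Expression left, Operator operator, Expression right) implements Expression {
        public int getValue() {
            return operator.function.applyAsInt(left.getValue(), right.getValue());
        }
    }

    // Lexical analysis: integers, operators and parentheses; whitespace is skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
                tokens.add(input.substring(start, i));
            } else if ("+-*/()".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "'");
            }
        }
        return tokens;
    }

    // Pops one operator and two operands and pushes the combined subtree.
    static void reduce(Deque<Expression> operands, Deque<String> operators) {
        Operator op = Operator.of(operators.pop().charAt(0));
        Expression right = operands.pop();
        Expression left = operands.pop();
        operands.push(new Operation(left, op, right));
    }

    // Parsing: shunting yard with an operand stack that holds Expression subtrees.
    static Expression parse(String input) {
        Deque<Expression> operands = new ArrayDeque<>();
        Deque<String> operators = new ArrayDeque<>();
        for (String token : tokenize(input)) {
            if (Character.isDigit(token.charAt(0))) {
                operands.push(new Constant(Integer.parseInt(token)));
            } else if (token.equals("(")) {
                operators.push(token);
            } else if (token.equals(")")) {
                while (!operators.peek().equals("(")) reduce(operands, operators);
                operators.pop(); // discard the matching "("
            } else {
                Operator current = Operator.of(token.charAt(0));
                // >= makes operators of equal precedence left-associative
                while (!operators.isEmpty() && !operators.peek().equals("(")
                        && Operator.of(operators.peek().charAt(0)).precedence >= current.precedence) {
                    reduce(operands, operators);
                }
                operators.push(token);
            }
        }
        while (!operators.isEmpty()) reduce(operands, operators);
        return operands.pop();
    }

    public static void main(String[] args) {
        System.out.println(parse("3 + 4 * 2").getValue());         // 11
        System.out.println(parse("(3 + 4) * 2").getValue());       // 14
        System.out.println(parse("(4 + 6) + (2 + 3)").getValue()); // 15
    }
}
```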
