Parenthesis Check within a Linked stack for infix to postfix - java

I have four classes.
One contains my linkedstack setup
One is infixtopostfix for prioritization and conversion
Parenthesis for matching
Postfix for evaluation
I have set up almost everything here, but match() still returns false no matter what input I give it.
On another note, my comparison !stackMatch.pop().equals(c) is not working: pop() returns an Object, and the '!' in front of it seems to be a problem.
My programs are simple and straight forward:
LinkedStack.java
public class LinkedStack implements StackInterface {
private Node top;
public LinkedStack() {
top = null;
} // end default constructor
public boolean isEmpty() {
return top == null;
} // end isEmpty
public void push(Object newItem) {
Node n = new Node();
n.setData(newItem);
n.setNext(top);
top = n;
} // end push
public Object pop() throws Exception {
if (!isEmpty()) {
Node temp = top;
top = top.getNext();
return temp.getData();
} else {
throw new Exception("StackException on pop: stack empty");
} // end if
} // end pop
public Object peek() throws Exception {
if (!isEmpty()) {
return top.getData();
} else {
throw new Exception("StackException on peek: stack empty");
} // end if
} // end peek
} // end LinkedStack
InfixToPostfix.java
import java.util.*;
public class InfixToPostfix {
Parenthesis p = new Parenthesis();
LinkedStack stack = new LinkedStack();
String token = ""; // each token of the string
String output = ""; // the string holding the postfix expression
Character topOfStackObject = null; // the top object of the stack, converted to a Character Object
char charValueOfTopOfStack = ' '; // the primitive value of the Character object
/**
* Convert an infix expression to postfix. If the expression is invalid, throws an exception.
* @param s the infix expression
* @return the postfix expression as a string
* hint: StringTokenizer is very useful for doing this iteratively
*/
//public String convertToPostfix(String s) throws Exception {
//}
private boolean isOperand (char c){
return ((c>= '0' && c <= '9') || (c >= 'a' && c<= 'z'));
}
public void precedence(char curOp, int val) throws Exception {
while (!stack.isEmpty()) {
char topOp = (Character) stack.pop();
// charValueOfTopOfStack = topOfStackObject.charValue();
if (topOp == '(') {
stack.push(topOp);
break;
}// it's an operator
else {// precedence of new op
int prec2;
if (topOp == '+' || topOp == '-') {
prec2 = 1;
} else {
prec2 = 2;
}
if (prec2 < val) // if prec of new op less
{ // than prec of old
stack.push(topOp); // save newly-popped op
break;
} else // prec of new not less
{
output = output + topOp; // than prec of old
}
}
}
}
Parenthesis.java
import java.util.*;
public class Parenthesis{
private LinkedStack stack = new LinkedStack();
private Object openBrace;
private String outputString;
/**
* Determine if the expression has matching parenthesis using a stack
*
* @param expr the expression to be evaluated
* @return returns true if the expression has matching parenthesis
*/
public boolean match(String expr) {
LinkedStack stackMatch = new LinkedStack();
for(int i=0; i < expr.length(); i++) {
char c = expr.charAt(i);
if(c == '(')
stackMatch.push(c);
else if(c == ')'){
if (stackMatch.isEmpty() || !stackMatch.pop().equals(c))
return false;
}
}
return stackMatch.isEmpty();
}
}
I wanted to give you all of it so you could help me. I already have tests written; I am just struggling with the parenthesis problem. I can push an opening parenthesis onto the stack, but I am unable to compare it with the closing parenthesis to check that there are enough of them, while also making sure the stack is not empty.

The problem is probably that you are trying to test whether a matching ( is on top of the stack when a ) arrives, but at that point c holds the actual character, ). So you test whether ) is on top of the stack, not ( as you should.
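A minimal sketch of the corrected check follows. It is an illustration, not the asker's class: it assumes java.util.ArrayDeque in place of the custom LinkedStack (whose pop() throws a checked Exception), and the class name ParenthesisSketch is hypothetical. Because only ( is ever pushed, a ) only needs a non-empty stack to be valid.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ParenthesisSketch {
    // Returns true when every ')' closes an earlier, still-open '('.
    public static boolean match(String expr) {
        Deque<Character> open = new ArrayDeque<>();
        for (int i = 0; i < expr.length(); i++) {
            char c = expr.charAt(i);
            if (c == '(') {
                open.push(c);
            } else if (c == ')') {
                // Only '(' is ever pushed, so an empty stack is the only way to fail here.
                if (open.isEmpty()) {
                    return false;
                }
                open.pop();
            }
        }
        // Anything left on the stack is an unclosed '('.
        return open.isEmpty();
    }
}
```

If several bracket types were involved, the popped character would have to be compared with the opening character that corresponds to c, never with c itself.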

Related

Validating expression using 2 Generic Stacks

The task is to implement a generic stack (the Java libraries cannot be used), have the user input an expression using true and false for the booleans b1 and b2 and logical operators (and, or, not, iff, implies), recognize whether each token is a boolean or an operator and send it to one of 2 stacks, then pop the stacks to evaluate whether it is a valid expression. For example, the input (b1 and b2) implies b3 is a valid expression, but B3 and (b2 or) is not. I have issues with the stack part, since peek is not returning any element. Here is my code so far. Note: the charAt is there because I will be checking that the brackets are balanced as well:
public class MyStack<T> {
class StackOverFlowException extends RuntimeException{}
class EmptyStackException extends RuntimeException{}
private T[] stack;
private int top;
public MyStack(int size) {
this.stack = (T[]) new Object[size];
this.top = 0;
}
public boolean isEmpty() {
return this.top == 0;
}
public boolean isFull() {
return this.top == stack.length;
}
public void push(T x) {
if(top == stack.length) {
throw new StackOverFlowException();
}
else {
this.stack[top] = x;
top++;
}
}
public T pop() {
if(isEmpty()) {
throw new EmptyStackException();
}
else {
T value = this.stack[--top];
return value;
}
}
public T peek() {
return this.stack[top];
}
public static void main(String[] args) {
MyStack<String> tf = new MyStack(100);
MyStack<String> operators = new MyStack(100);
System.out.println("Please input the expression to evaluate: ");
Scanner scn = new Scanner(System.in);
String expression = scn.nextLine();
String tokens[] = expression.split(" ");
int n = tokens.length;
boolean P1 = true;
boolean P2 = true;
boolean result = true;
for(int i = 0; i < n; i++ ) {
String separate = tokens[i];
char x = separate.charAt(i);
if(tokens[i].equalsIgnoreCase("true")||tokens[i].equalsIgnoreCase("false")) {
tf.push(separate);
tf.peek();
}
else if(tokens[i].equalsIgnoreCase("and")||tokens[i].equalsIgnoreCase("not")||tokens[i].equalsIgnoreCase("or")||tokens[i].equalsIgnoreCase("implies")||tokens[i].equalsIgnoreCase("iff")) {
operators.push(separate);
}
else {
System.out.println("Expression not Valid!");
}
}
}
The top variable is being misinterpreted in the peek() method (as well as the isEmpty() method).
As implemented, top is a misnomer since it is actually the size of the stack (which may also be considered the index for the next element to be pushed). So your peek() method should be looking at the element before top.
Alternatively, you may define top as the index of the element at the top of the stack, as this is generally how you are using it elsewhere. In this case, you will need to define a flag value to indicate that the stack is empty.
In any case, you need to handle the empty stack case in the peek() method.
public class MyStack<T> {
private static final int EMPTY = -1;
private int top = EMPTY;
... other stuff ...
public boolean isEmpty() {
return EMPTY == top;
}
public T peek() {
if (isEmpty()) {
// The EmptyStackException declared in the question only has a no-argument constructor
throw new EmptyStackException();
}
return stack[top];
}
}
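For comparison, here is a sketch of the first option described above, where the counter is treated as the size of the stack and peek() looks at the element just before it. The class name ArrayStackSketch and the exception types are assumptions chosen to keep the example self-contained; they are not the asker's classes.

```java
import java.util.NoSuchElementException;

class ArrayStackSketch<T> {
    private final Object[] items;
    private int size = 0; // number of stored elements, also the next free index

    public ArrayStackSketch(int capacity) {
        items = new Object[capacity];
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public void push(T x) {
        if (size == items.length) {
            throw new IllegalStateException("stack full");
        }
        items[size++] = x;
    }

    @SuppressWarnings("unchecked")
    public T peek() {
        if (isEmpty()) {
            throw new NoSuchElementException("stack empty");
        }
        // The top element sits one slot below the next free index.
        return (T) items[size - 1];
    }

    public T pop() {
        T value = peek();
        items[--size] = null; // drop the reference so it can be collected
        return value;
    }
}
```

With this convention no sentinel value is needed, since size == 0 already means empty.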

Finding corresponding elements in a string

Please refer to this question from HackerRank:
A bracket is considered to be any one of the following characters: (,
), {, }, [, or ].
Two brackets are considered to be a matched pair if an opening
bracket (i.e., (, [, or {) occurs to the left of a closing bracket
(i.e., ), ], or }) of the exact same type. There are three types of
matched pairs of brackets: [], {}, and ().
A matching pair of brackets is not balanced if the set of brackets it
encloses are not matched. For example, {[(])} is not balanced because
the contents in between { and } are not balanced. The pair of square
brackets encloses a single, unbalanced opening bracket, (, and the
pair of parentheses encloses a single, unbalanced closing square
bracket, ]...
I have done the program as follows:
import java.io.*;
import java.math.*;
import java.security.*;
import java.text.*;
import java.util.*;
import java.util.concurrent.*;
import java.util.regex.*;
public class Solution {
static char findCorrBracket(char b)
{
if(b == '{')
{
return '}';
}
else if(b == '[')
{
return ']';
}
else
{
return ')';
}
}
// Complete the isBalanced function below.
static String isBalanced(String s) {
char a[] = new char[1000];
int top = 0,i=1;
a[0]=s.charAt(0);
char retBrack;
String result;
while(top!=-1 )
{
retBrack=findCorrBracket(s.charAt(top));
if(s.charAt(i)!=retBrack)
{
a[top]=s.charAt(i);
top=i;
}
else
{
top--;
}
i++;
if(i>=s.length()-1)
{
break;
}
}
System.out.println(top);
if(top==0)
{
result = "YES";
}
else
{
result = "NO";
}
return result;
}
private static final Scanner scanner = new Scanner(System.in);
public static void main(String[] args) throws IOException {
BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(System.getenv("OUTPUT_PATH")));
int t = scanner.nextInt();
scanner.skip("(\r\n|[\n\r\u2028\u2029\u0085])?");
for (int tItr = 0; tItr < t; tItr++) {
String s = scanner.nextLine();
String result = isBalanced(s);
bufferedWriter.write(result);
bufferedWriter.newLine();
}
bufferedWriter.close();
scanner.close();
}
}
I have changed the code a little bit. It has made the program more readable. But still, the problem persists.
public class Main
{
static char findCorrBracket(char b)
{
if(b == '{')
{
return '}';
}
else if(b == '[')
{
return ']';
}
else
{
return ')';
}
}
// Complete the isBalanced function below.
static String isBalanced(String s) {
char a[] = new char[1000];
int top = 0,i=1;
a[0]=s.charAt(0);
char retBrack;
String result;
while(i<s.length())
{
retBrack=findCorrBracket(s.charAt(top));
if(s.charAt(i)!=retBrack)
{
top++;
a[top]=s.charAt(i);
}
else
{
top--;
}
i++;
}
System.out.println(top);
if(top==-1)
{
result = "YES";
}
else
{
result = "NO";
}
return result;
}
public static void main(String[] args) {
String s="{[]()}";
String result = isBalanced(s);
System.out.println(result);
}
}
It runs for few of the test cases, while for others it doesn't. How should I change the code?
Update - I've added the corrections I made as comments in the code.
static char findCorrBracket(char b)
{
if(b == '{')
{
return '}';
}
else if(b == '[')
{
return ']';
}
else if(b == '(')
{
//Use else if here instead of else, since otherwise '}',']','(' & ')' will all get the returned character value ')'
return ')';
} else {
return '_';
}
}
// Complete the isBalanced function below.
static String isBalanced(String s) {
char a[] = new char[1000];
int top = 0,i=1;
a[0]=s.charAt(0);
char retBrack;
String result;
while(i<s.length())
{
if(top == -1) {
//If the stack is empty, then we don't need to get the 'correct bracket' and check
//We can directly insert the character into the stack
top++;
a[top] = s.charAt(i);
} else {
//findCorrBracket from `a[top]`, not from `s.charAt(top)`
retBrack = findCorrBracket(a[top]);
if (s.charAt(i) != retBrack) {
top++;
a[top] = s.charAt(i);
} else {
top--;
}
}
i++;
}
System.out.println(top);
if(top==-1)
{
result = "YES";
}
else
{
result = "NO";
}
return result;
}
Your loop should iterate through all the characters of the string. So the check i<s.length() should be moved from the if inside the while block to the loop condition.
The top value needs to be incremented, not set to the value of i.
// Complete the isBalanced function below.
static String isBalanced(String s) {
char a[] = new char[1000];
int top = 0,i=1;
a[0]=s.charAt(0);
char retBrack;
String result;
while(i<s.length() )
{
retBrack=findCorrBracket(s.charAt(top));
if(s.charAt(i)!=retBrack)
{
top++;
a[top]=s.charAt(i);
}
else
{
top--;
}
i++;
}
System.out.println(top);
if(top==-1)
{
result = "YES";
}
else
{
result = "NO";
}
return result;
}
P.S - There are a few improvements I could suggest,
As racraman suggested, use for loops instead of while, unless you need a do while.
Don't create a fixed-size array to use as a stack; use a Java collection (e.g., ArrayList) to create a dynamic stack. This way, even if the string has more than 1000 consecutive ( characters, your code would work.
You've started off on the right path, using a stack and iterating over the string one character at a time. But your logic for each character doesn't seem to make any sense. Apparently it "works" if the input string meets some very specific conditions, but not in general. Something more like this (in pseudo-code) will work on any input:
for each character `c` in string `s`:
if `c` is an opening bracket:
PUSH `c` onto stack
else:
// `c` must be a closing bracket
if stack is EMPTY:
`s` IS UNBALANCED
else:
POP top of stack into `b`
if `b` is not the correct matching opening bracket for `c`:
`s` IS UNBALANCED
end if
end if
end if
end for
if stack is EMPTY:
SUCCESS! (`s` is correctly balanced)
else:
`s` IS UNBALANCED
end if
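The pseudo-code above could be sketched in Java roughly as follows. This is one possible rendering, assuming (as in the HackerRank task) that the input contains only bracket characters; the class name BracketBalanceSketch is hypothetical, and java.util.ArrayDeque is used for the stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BracketBalanceSketch {
    public static String isBalanced(String s) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                stack.push(c);
            } else {
                // c is treated as a closing bracket
                if (stack.isEmpty()) {
                    return "NO";
                }
                char open = stack.pop();
                boolean pair = (open == '(' && c == ')')
                        || (open == '[' && c == ']')
                        || (open == '{' && c == '}');
                if (!pair) {
                    return "NO";
                }
            }
        }
        // Leftover opening brackets mean the string is unbalanced.
        return stack.isEmpty() ? "YES" : "NO";
    }
}
```

Because the stack grows as needed, the 1000-character limit of the array version no longer applies.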

ExpressionTree: Postfix to Infix

I am having problems getting my toString() method to work and print out parentheses within my infix notation. For example, right now if I enter 12+3* it will print out 1 + 2 * 3. I would like it to print out ((1+2)*3).
Also, I would like my expression tree to be built when the input contains spaces. For example, right now if I enter 12+ it works, but I want to be able to enter 1 2 + and have it still work. Any thoughts?
P.S. Ignore my evaluate method I haven't implemented it yet!
// Java program to construct an expression tree
import java.util.EmptyStackException;
import java.util.Scanner;
import java.util.Stack;
import javax.swing.tree.TreeNode;
// Java program for expression tree
class Node {
char ch;
Node left, right;
Node(char item) {
ch = item;
left = right = null;
}
public String toString() {
return (right == null && left == null) ? Character.toString(ch) : "(" + left.toString()+ ch + right.toString() + ")";
}
}
class ExpressionTree {
static boolean isOperator(char c) {
if ( c == '+' ||
c == '-' ||
c == '*' ||
c == '/'
) {
return true;
}
return false;
}
// Utility function to do inorder traversal
public void inorder(Node t) {
if (t != null) {
inorder(t.left);
System.out.print(t.ch + " ");
inorder(t.right);
}
}
// Returns root of constructed tree for given
// postfix expression
Node constructTree(char postfix[]) {
Stack<Node> st = new Stack();
Node t, t1, t2;
for (int i = 0; i < postfix.length; i++) {
// If operand, simply push into stack
if (!isOperator(postfix[i])) {
t = new Node(postfix[i]);
st.push(t);
} else // operator
{
t = new Node(postfix[i]);
// Pop two top nodes
// Store top
t1 = st.pop(); // Remove top
t2 = st.pop();
// make them children
t.right = t1;
t.left = t2;
// System.out.println(t1 + "" + t2);
// Add this subexpression to stack
st.push(t);
}
}
// only element will be root of expression
// tree
t = st.peek();
st.pop();
return t;
}
public static void main(String args[]) {
Scanner input = new Scanner(System.in);
/*boolean keepgoing = true;
while (keepgoing) {
String line = input.nextLine();
if (line.isEmpty()) {
keepgoing = false;
} else {
Double answer = calculate(line);
System.out.println(answer);
}
}*/
ExpressionTree et = new ExpressionTree();
String postfix = input.nextLine();
char[] charArray = postfix.toCharArray();
Node root = et.constructTree(charArray);
System.out.println("infix expression is");
et.inorder(root);
}
public double evaluate(Node ptr)
{
if (ptr.left == null && ptr.right == null)
return toDigit(ptr.ch);
else
{
double result = 0.0;
double left = evaluate(ptr.left);
double right = evaluate(ptr.right);
char operator = ptr.ch;
switch (operator)
{
case '+' : result = left + right; break;
case '-' : result = left - right; break;
case '*' : result = left * right; break;
case '/' : result = left / right; break;
default : result = left + right; break;
}
return result;
}
}
private boolean isDigit(char ch)
{
return ch >= '0' && ch <= '9';
}
private int toDigit(char ch)
{
return ch - '0';
}
}
Why do you use inorder()? root.toString() returns exactly what you want, "((1+2)*3)".
You can skip spaces at the start of the loop:
for (int i = 0; i < postfix.length; i++) {
if (postfix[i] == ' ')
continue;
...
Change main like this.
Scanner input = new Scanner(System.in);
String postfix = input.nextLine();
char[] charArray = postfix.replace(" ", "").toCharArray();
// constructTree is an instance method, so it needs an ExpressionTree object
Node root = new ExpressionTree().constructTree(charArray);
System.out.println("infix expression is");
System.out.println(root);
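The same idea can be shown without building a tree at all: a stack of partial infix strings yields the fully parenthesized form directly. The sketch below is an illustration under the question's assumptions (single-character operands, the four operators + - * /); the class name PostfixToInfixSketch is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixToInfixSketch {
    public static String toInfix(String postfix) {
        Deque<String> stack = new ArrayDeque<>();
        for (char c : postfix.toCharArray()) {
            if (Character.isWhitespace(c)) {
                continue; // "1 2 +" and "12+" are treated the same
            }
            if (c == '+' || c == '-' || c == '*' || c == '/') {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("operator " + c + " is missing an operand");
                }
                String right = stack.pop(); // the top of the stack is the right operand
                String left = stack.pop();
                stack.push("(" + left + c + right + ")");
            } else {
                stack.push(String.valueOf(c)); // single-character operand
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed postfix expression");
        }
        return stack.pop();
    }
}
```

For example, toInfix("1 2 + 3 *") gives ((1+2)*3), the same string the Node.toString() method produces for that tree.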

Having trouble checking to see if a string is balanced or not

package edu.bsu.cs121.mamurphy;
import java.util.Stack;
public class Checker {
char openPara = '(';
char openBracket = '[';
char openCurly = '{';
char openArrow = '<';
char closePara = ')';
char closeBracket = ']';
char closeCurly = '}';
char closeArrow = '>';
public boolean checkString(String stringToCheck) {
Stack<Character> stack = new Stack<Character>();
for (int i = 0; i < stringToCheck.length(); i++) {
char c = stringToCheck.charAt(i);
if (c == openPara || c == openBracket || c == openCurly || c == openArrow) {
stack.push(c);
System.out.println(stack);
;
}
if (c == closePara) {
if (stack.isEmpty()) {
System.out.println("Unbalanced");
break;
} else if (stack.peek() == openPara) {
stack.pop();
} else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
}
if (c == closeBracket) {
if (stack.isEmpty()) {
System.out.println("Unbalanced");
break;
} else if (stack.peek() == openBracket) {
stack.pop();
} else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
}
if (c == closeCurly) {
if (stack.isEmpty()) {
System.out.println("Unbalanced");
break;
} else if (stack.peek() == openCurly) {
stack.pop();
} else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
}
if (c == closeArrow) {
if (stack.isEmpty()) {
System.out.println("Unbalanced");
break;
} else if (stack.peek() == openArrow) {
stack.pop();
} else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
}
}
return false;
}
}
I am currently trying to create a program where I check to see if a string is balanced or not. A string is balanced if and only if each opening character: (, {, [, and < have a matching closing character: ), }, ], and > respectively.
What happens is when checking through the string, if an opening character is found, it is pushed into a stack, and it checks to see if there is the appropriate closing character.
If there is a closing character before the opening character, then that automatically means that the string is unbalanced. Also, the string is automatically unbalanced if after going to the next character there is something still inside of the stack.
I tried to use
else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
as a way of seeing if the stack still had anything in it, but it still isn't working for me. Any advice on what to do?
For example, if the string input were ()<>{() then the program should run through like normal until it gets to the single { and then the code should realize that the string is unbalanced and output Unbalanced.
For whatever reason, my code does not do this.
The following logic is flawed (emphasis mine):
For example, if the string input were ()<>{() then the program should run through like normal until it gets to the single { and then the code should realize that the string is unbalanced and output Unbalanced.
In fact, the code can't conclude that the string is unbalanced until it has scanned the entire string and established that the { has no matching }. For all it knows, the full input could be ()<>{()} and be balanced.
To achieve this, you need to add a check that ensures that the stack is empty after the entire string has been processed. In your example, it would still contain the {, indicating that the input is not balanced.
I took a shot at answering this. My solution returns true if the string is balanced and enforces opening/closing order (i.e., ({)} returns false). I started with your code and tried to slim it down.
import java.util.HashMap;
import java.util.Map;
import java.util.Stack;
public class mamurphy {
private static final char openPara = '(';
private static final char openBracket = '[';
private static final char openCurly = '{';
private static final char openArrow = '<';
private static final char closePara = ')';
private static final char closeBracket = ']';
private static final char closeCurly = '}';
private static final char closeArrow = '>';
public static void main(String... args) {
System.out.println(checkString("{}[]()90<>"));//true
System.out.println(checkString("(((((())))"));//false
System.out.println(checkString("((())))"));//false
System.out.println(checkString(">"));//false
System.out.println(checkString("["));//false
System.out.println(checkString("{[(<>)]}"));//true
System.out.println(checkString("{[(<>)}]"));//false
System.out.println(checkString("( a(b) (c) (d(e(f)g)h) I (j<k>l)m)"));//true
}
public static boolean checkString(String stringToCheck) {
final Map<Character, Character> closeToOpenMap = new HashMap<>();
closeToOpenMap.put(closePara, openPara);
closeToOpenMap.put(closeBracket, openBracket);
closeToOpenMap.put(closeCurly, openCurly);
closeToOpenMap.put(closeArrow, openArrow);
Stack<Character> stack = new Stack<>();
final char[] stringAsChars = stringToCheck.toCharArray();
for (int i = 0; i < stringAsChars.length; i++) {
final char current = stringAsChars[i];
if (closeToOpenMap.values().contains(current)) {
stack.push(current); //found an opening char, push it!
} else if (closeToOpenMap.containsKey(current)) {
// compare the boxed Characters with equals() rather than by reference
if (stack.isEmpty() || !closeToOpenMap.get(current).equals(stack.pop())) {
return false;//found closing char without correct opening char on top of stack
}
}
}
if (!stack.isEmpty()) {
return false;//still have opening chars after consuming whole string
}
return true;
}
}
Here's an alternate approach:
private static final char[] openParens = "[({<".toCharArray();
private static final char[] closeParens = "])}>".toCharArray();
public static boolean isBalanced(String expression){
Deque<Character> stack = new ArrayDeque<>();
for (char c : expression.toCharArray()){
for (int i = 0; i < openParens.length; i++){
if (openParens[i] == c){
// This is an open - put it in the stack
stack.push(c);
break;
}
if (closeParens[i] == c){
// This is a close - check the open is at the top of the stack
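// poll() returns null on an empty stack, so check before unboxing
Character top = stack.poll();
if (top == null || top != openParens[i]){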
return false;
}
break;
}
}
}
return stack.isEmpty();
}
It simplifies the logic to have two corresponding arrays of open and close symbols. You could also do this with even and odd positions in one array - ie. "{}<>", for example:
private static final char[] symbols = "[](){}<>".toCharArray();
public static boolean isBalanced(String expression){
Deque<Character> stack = new ArrayDeque<>();
for (char c : expression.toCharArray()){
for (int i = 0; i < symbols.length; i += 2){
if (symbols[i] == c){
// This is an open - put it in the stack
stack.push(c);
break;
}
if (symbols[i + 1] == c){
// This is a close - check the open is at the top of the stack
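// poll() returns null on an empty stack, so check before unboxing
Character top = stack.poll();
if (top == null || top != symbols[i]){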
return false;
}
break;
}
}
}
return stack.isEmpty();
}
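Note that poll returns null if the stack is empty. Comparing that null directly against a char unboxes it and throws a NullPointerException, so the result needs an explicit null check before the comparison; a null then counts as a mismatch when we run out of stack.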
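One caveat about poll() on an empty Deque is worth illustrating: the returned null cannot be compared with a primitive char, because the comparison unboxes the Character. The helper below is a hypothetical sketch (the name topIs is not from the answer) that performs the null check first.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PollNullSketch {
    // True only if the deque was non-empty and its top element equals expected.
    public static boolean topIs(Deque<Character> stack, char expected) {
        Character top = stack.poll(); // null when the deque is empty
        // The null check must come first: "top == expected" unboxes top.
        return top != null && top == expected;
    }
}
```

Calling topIs on an empty deque returns false, whereas writing stack.poll() != expected inline would throw a NullPointerException in that case.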
For example, if the string input were ()<>{() then the program should run through like normal until it gets to the single { and then the code should realize that the string is unbalanced and output Unbalanced.
It is not clear from your example whether the boundaries can be nested like ([{}]). If they can, that logic will not work, as the whole string has to be consumed to be sure any missing closing chars aren't at the end, and so the string cannot be reliably deemed unbalanced at the point you indicate.
Here is my take on your problem:
BalanceChecker class:
package so_q33378870;
import java.util.Stack;
public class BalanceChecker {
private final char[] opChars = "([{<".toCharArray();
private final char[] edChars = ")]}>".toCharArray();
//<editor-fold defaultstate="collapsed" desc="support functions">
public boolean isOPChar(char c) {
for (char checkChar : opChars) {
if (c == checkChar) {
return true;
}
}
return false;
}
public boolean isEDChar(char c) {
for (char checkChar : edChars) {
if (c == checkChar) {
return true;
}
}
return false;
}
//NOTE: Unused.
// public boolean isBoundaryChar(char c) {
// boolean result;
// if (result = isOPChar(c) == false) {
// return isEDChar(c);
// } else {
// return result;
// }
// }
public char getOpCharFor(char c) {
for (int i = 0; i < edChars.length; i++) {
if (c == edChars[i]) {
return opChars[i];
}
}
throw new IllegalArgumentException("The character (" + c + ") received is not recognized as a closing boundary character.");
}
//</editor-fold>
public boolean isBalanced(char[] charsToCheck) {
Stack<Character> checkStack = new Stack<>();
for (int i = 0; i < charsToCheck.length; i++) {
if (isOPChar(charsToCheck[i])) {
//beginning char found. Add to top of stack.
checkStack.push(charsToCheck[i]);
} else if (isEDChar(charsToCheck[i])) {
if (checkStack.isEmpty()) {
//ending char found without beginning chars on the stack. UNBALANCED.
return false;
} else if (getOpCharFor(charsToCheck[i]) == checkStack.peek()) {
//ending char found matches last beginning char on the stack. Pop and continue.
checkStack.pop();
} else {
//ending char found, but doesn't match last beginning char on the stack. UNBALANCED.
return false;
}
}
}
//the string is balanced if and only if the stack is empty at the end.
return checkStack.empty();
}
public boolean isBalanced(String stringToCheck) {
return isBalanced(stringToCheck.toCharArray());
}
}
Main class (used for testing):
package so_q33378870;
public class main {
private static final String[] tests = {
//Single - Balanced.
"()",
//Single - Unbalanced by missing end.
"(_",
//Multiple - Balanced.
"()[]{}<>",
//Multiple - Unbalanced by missing beginning.
"()[]_}<>",
//Nested - Balanced.
"([{<>}])",
//Nested - Unbalanced by missing end.
"([{<>}_)",
//Endurance test - Balanced.
"the_beginning (abcd) divider (a[bc]d) divider (a[b{c}d]e) divider (a[b{c<d>e}f]g) the_end"
};
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
BalanceChecker checker = new BalanceChecker();
for (String s : tests) {
System.out.println("\"" + s + "\" is " + ((checker.isBalanced(s)) ? "BALANCED!" : "UNBALANCED!"));
}
}
}

Dynamic operator tokens in ANTLR4

I'm trying to make a calculator in ANTLR4 that can use almost every possible symbol as mathematical operator.
Concrete:
- The user defines operations consisting of an operator and a precedence. The operator can be any combination of symbols except for some system symbols (parentheses, commas, ...). Precedence is a positive integer number. Operations are stored in a java HashMap.
- There are three different kinds of operations: left side (unary minus, ...), right side (factorial, ...) and binary (addition, ...)
- The operations should be requested at runtime, so that operations can be (de)activated during the parse. If this is not possible, then the operators should be requested at parser creation.
- For the precedence: full dynamic precedence is preferable(at runtime the precedence of an encountered operation is requested), but if it isn't possible then there should be different precedence presets. (multiplication, addition, ...)
What I've got:
- Working code for operator recognition
- Precedence climbing code which produces a correct parse tree, but gives an error: rule expr failed predicate: (getPrecedence($op) >= $_p)?
UPDATE: fixed operator recognition code, and found code for the precedence climbing mechanism
tokens { PREOP, POSTOP, BINOP, ERROR }
@lexer::members {
private static List<String> binaryOperators;
private static List<String> prefixOperators;
private static List<String> postfixOperators;
{
binaryOperators = new ArrayList<String>();
binaryOperators.add("+");
binaryOperators.add("*");
binaryOperators.add("-");
binaryOperators.add("/");
prefixOperators = new ArrayList<String>();
prefixOperators.add("-");
postfixOperators = new ArrayList<String>();
postfixOperators.add("!");
}
private Deque<Token> deque = new LinkedList<Token>();
private Token previousToken;
private Token nextToken;
@Override
public Token nextToken() {
if (!deque.isEmpty()) {
return previousToken = deque.pollFirst();
}
Token next = super.nextToken();
if (next.getType() != SYMBOL) {
return previousToken = next;
}
StringBuilder builder = new StringBuilder();
while (next.getType() == SYMBOL) {
builder.append(next.getText());
next = super.nextToken();
}
deque.addLast(nextToken = next);
List<Token> tokens = findOperatorCombination(builder.toString(), getOperatorType());
for (int i = tokens.size() - 1; i >= 0; i--) {
deque.addFirst(tokens.get(i));
}
return deque.pollFirst();
}
private static List<Token> findOperatorCombination(String sequence, OperatorType type) {
switch (type) {
case POSTFIX:
return getPostfixCombination(sequence);
case PREFIX:
return getPrefixCombination(sequence);
case BINARY:
return getBinaryCombination(sequence);
default:
break;
}
return null;
}
private static List<Token> getPrefixCombination(String sequence) {
if (isPrefixOperator(sequence)) {
List<Token> seq = new ArrayList<Token>(1);
seq.add(0, new CommonToken(MathParser.PREOP, sequence));
return seq;
}
if (sequence.length() <= 1) {
return null;
}
for (int i = 1; i < sequence.length(); i++) {
List<Token> seq1 = getPrefixCombination(sequence.substring(0, i));
List<Token> seq2 = getPrefixCombination(sequence.substring(i, sequence.length()));
if (seq1 != null && seq2 != null) {
seq1.addAll(seq2);
return seq1;
}
}
return null;
}
private static List<Token> getPostfixCombination(String sequence) {
if (isPostfixOperator(sequence)) {
List<Token> seq = new ArrayList<Token>(1);
seq.add(0, new CommonToken(MathParser.POSTOP, sequence));
return seq;
}
if (sequence.length() <= 1) {
return null;
}
for (int i = 1; i < sequence.length(); i++) {
List<Token> seq1 = getPostfixCombination(sequence.substring(0, i));
List<Token> seq2 = getPostfixCombination(sequence.substring(i, sequence.length()));
if (seq1 != null && seq2 != null) {
seq1.addAll(seq2);
return seq1;
}
}
return null;
}
private static List<Token> getBinaryCombination(String sequence) {
for (int i = 0; i < sequence.length(); i++) { // i is number of postfix spaces
for (int j = 0; j < sequence.length() - i; j++) { // j is number of prefix spaces
String seqPost = sequence.substring(0, i);
List<Token> post = getPostfixCombination(seqPost);
String seqPre = sequence.substring(sequence.length()-j, sequence.length());
List<Token> pre = getPrefixCombination(seqPre);
String seqBin = sequence.substring(i, sequence.length()-j);
if ((post != null || seqPost.isEmpty()) &&
(pre != null || seqPre.isEmpty()) &&
isBinaryOperator(seqBin)) {
List<Token> res = new ArrayList<Token>();
if (post != null)
res.addAll(post);
res.add(new CommonToken(MathParser.BINOP, seqBin));
if (pre != null)
res.addAll(pre);
return res;
}
}
}
return null;
}
/**
* Returns the expected operator type based on the previous and next token
*/
private OperatorType getOperatorType() {
if (isValueEnd(previousToken.getType())) {
if (isValueStart(nextToken.getType())) {
return OperatorType.BINARY;
}
return OperatorType.POSTFIX;
}
return OperatorType.PREFIX;
}
private enum OperatorType { BINARY, PREFIX, POSTFIX };
/**
* Checks whether the given token is a token found at the start of value elements
* @param tokenType
* @return
*/
private static boolean isValueStart(int tokenType) {
return tokenType == MathParser.INT;
}
/**
* Checks whether the given token is a token found at the end of value elements
* @param tokenType
* @return
*/
private static boolean isValueEnd(int tokenType) {
return tokenType == MathParser.INT;
}
private static boolean isBinaryOperator(String operator) {
return binaryOperators.contains(operator);
}
private static boolean isPrefixOperator(String operator) {
return prefixOperators.contains(operator);
}
private static boolean isPostfixOperator(String operator) {
return postfixOperators.contains(operator);
}
}
Precedence climbing code:
@parser::members {
static Map<String, Integer> precedenceMap = new HashMap<String, Integer>();
static {
precedenceMap.put("*", 2);
precedenceMap.put("+", 1);
precedenceMap.put("^", 4);
precedenceMap.put("-", 3);
precedenceMap.put("!", 5);
}
public static Integer getPrecedence(Token op) {
return precedenceMap.get(op.getText());
}
public static Integer getNextPrecedence(Token op) {
Integer p = getPrecedence(op);
if (op.getType() == PREOP) return p;
else if (op.getText().equals("^")) return p;
else if (op.getType() == BINOP) return p+1;
else if (op.getType() == POSTOP) return p+1;
throw new IllegalArgumentException(op.getText());
}
}
prog
: expr[0]
;
expr [int _p]
: aexpr
( {getPrecedence(_input.LT(1)) >= $_p}? op=BINOP expr[getNextPrecedence($op)]
| {getPrecedence(_input.LT(1)) >= $_p}? POSTOP
)*
;
atom
: INT
| '(' expr[0] ')'
| op=PREOP expr[getNextPrecedence($op)]
;
So now the question is: what can I do about this predicate failure error?
Thanks to the other contributors I have found a complete (and actually reasonably clean) solution for my problem.
Operator matching:
By looking at the tokens before and after the encountered series of symbols, it is possible to detect the fixity of the operator. After that, apply an algorithm that finds a sequence of valid operators in the symbol series, then inject those tokens into the token stream (in nextToken()).
Just make sure you define all hardcoded tokens before the SYMBOL definition.
Precedence climbing:
Actually, this wasn't that hard: it is exactly the same as ANTLR4's internal strategy.
grammar Math;
tokens { PREOP, POSTOP, BINOP, ERROR }
@header {
import java.util.*;
}
@lexer::members {
private static List<String> binaryOperators;
private static List<String> prefixOperators;
private static List<String> postfixOperators;
static { // static initializer: the operator lists are static fields
binaryOperators = new ArrayList<String>();
binaryOperators.add("+");
binaryOperators.add("*");
binaryOperators.add("-");
binaryOperators.add("/");
System.out.println(binaryOperators);
prefixOperators = new ArrayList<String>();
prefixOperators.add("-");
System.out.println(prefixOperators);
postfixOperators = new ArrayList<String>();
postfixOperators.add("!");
System.out.println(postfixOperators);
}
private Deque<Token> deque = new LinkedList<Token>();
private Token previousToken;
private Token nextToken;
@Override
public Token nextToken() {
if (!deque.isEmpty()) {
return previousToken = deque.pollFirst();
}
Token next = super.nextToken();
if (next.getType() != SYMBOL) {
return previousToken = next;
}
StringBuilder builder = new StringBuilder();
while (next.getType() == SYMBOL) {
builder.append(next.getText());
next = super.nextToken();
}
deque.addLast(nextToken = next);
List<Token> tokens = findOperatorCombination(builder.toString(), getOperatorType());
for (int i = tokens.size() - 1; i >= 0; i--) {
deque.addFirst(tokens.get(i));
}
return deque.pollFirst();
}
private static List<Token> findOperatorCombination(String sequence, OperatorType type) {
switch (type) {
case POSTFIX:
return getPostfixCombination(sequence);
case PREFIX:
return getPrefixCombination(sequence);
case BINARY:
return getBinaryCombination(sequence);
default:
break;
}
return null;
}
private static List<Token> getPrefixCombination(String sequence) {
if (isPrefixOperator(sequence)) {
List<Token> seq = new ArrayList<Token>(1);
seq.add(0, new CommonToken(MathParser.PREOP, sequence));
return seq;
}
if (sequence.length() <= 1) {
return null;
}
for (int i = 1; i < sequence.length(); i++) {
List<Token> seq1 = getPrefixCombination(sequence.substring(0, i));
List<Token> seq2 = getPrefixCombination(sequence.substring(i, sequence.length()));
if (seq1 != null && seq2 != null) {
seq1.addAll(seq2);
return seq1;
}
}
return null;
}
private static List<Token> getPostfixCombination(String sequence) {
if (isPostfixOperator(sequence)) {
List<Token> seq = new ArrayList<Token>(1);
seq.add(0, new CommonToken(MathParser.POSTOP, sequence));
return seq;
}
if (sequence.length() <= 1) {
return null;
}
for (int i = 1; i < sequence.length(); i++) {
List<Token> seq1 = getPostfixCombination(sequence.substring(0, i));
List<Token> seq2 = getPostfixCombination(sequence.substring(i, sequence.length()));
if (seq1 != null && seq2 != null) {
seq1.addAll(seq2);
return seq1;
}
}
return null;
}
private static List<Token> getBinaryCombination(String sequence) {
for (int i = 0; i < sequence.length(); i++) { // i is number of postfix spaces
for (int j = 0; j < sequence.length() - i; j++) { // j is number of prefix spaces
String seqPost = sequence.substring(0, i);
List<Token> post = getPostfixCombination(seqPost);
String seqPre = sequence.substring(sequence.length()-j, sequence.length());
List<Token> pre = getPrefixCombination(seqPre);
String seqBin = sequence.substring(i, sequence.length()-j);
if ((post != null || seqPost.isEmpty()) &&
(pre != null || seqPre.isEmpty()) &&
isBinaryOperator(seqBin)) {
List<Token> res = new ArrayList<Token>();
if (post != null)
res.addAll(post);
res.add(new CommonToken(MathParser.BINOP, seqBin));
if (pre != null)
res.addAll(pre);
return res;
}
}
}
return null;
}
/**
* Returns the expected operator type based on the previous and next token
*/
private OperatorType getOperatorType() {
if (isAfterAtom()) {
if (isBeforeAtom()) {
return OperatorType.BINARY;
}
return OperatorType.POSTFIX;
}
return OperatorType.PREFIX;
}
private enum OperatorType { BINARY, PREFIX, POSTFIX };
/**
* Checks whether the current token is a token found at the start of atom elements
* @return
*/
private boolean isBeforeAtom() {
int tokenType = nextToken.getType();
return tokenType == MathParser.INT ||
tokenType == MathParser.PLEFT;
}
/**
* Checks whether the current token is a token found at the end of atom elements
* @return
*/
private boolean isAfterAtom() {
int tokenType = previousToken.getType();
return tokenType == MathParser.INT ||
tokenType == MathParser.PRIGHT;
}
private static boolean isBinaryOperator(String operator) {
return binaryOperators.contains(operator);
}
private static boolean isPrefixOperator(String operator) {
return prefixOperators.contains(operator);
}
private static boolean isPostfixOperator(String operator) {
return postfixOperators.contains(operator);
}
}
@parser::members {
static Map<String, Integer> precedenceMap = new HashMap<String, Integer>();
static {
precedenceMap.put("*", 2);
precedenceMap.put("+", 1);
precedenceMap.put("^", 4);
precedenceMap.put("-", 3);
precedenceMap.put("!", 5);
}
public static Integer getPrecedence(Token op) {
return precedenceMap.get(op.getText());
}
public static Integer getNextPrecedence(Token op) {
Integer p = getPrecedence(op);
if (op.getType() == PREOP) return p;
else if (op.getText().equals("^")) return p;
else if (op.getType() == BINOP) return p+1;
throw new IllegalArgumentException(op.getText());
}
}
prog
: expr[0]
;
expr [int _p]
: atom
( {getPrecedence(_input.LT(1)) >= $_p}? op=BINOP expr[getNextPrecedence($op)]
| {getPrecedence(_input.LT(1)) >= $_p}? POSTOP
)*
;
atom
: INT
| PLEFT expr[0] PRIGHT
| op=PREOP expr[getNextPrecedence($op)]
;
INT
: ( '0'..'9' )+
;
PLEFT : '(' ;
PRIGHT : ')' ;
WS
: [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines
SYMBOL
: .
;
Note: code is meant as an example, not as my real code (operators and precedence will be requested externally)
You can't define precedence/associativity rules for ANTLR at runtime. What you can do, however, is parse all of the operators (built into the language or user-defined) as a single chained list (like ArrayList<>) in the parser, then apply your own algorithm for precedence and associativity in a visitor (or in grammar actions, if you really want to).
The algorithm itself isn't that hard if you are willing to iterate over the list several times. For example, you can first fetch the precedence of each operator in the list, then pick the one with the highest precedence, check whether it's right- or left-associative, and from there build your first (bottom-most) tree node. Keep applying this until the list is empty and you've built your own "parse tree", but without the parsing (you're not working with abstract input strings anymore).
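The repeated-pass idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: FlatListResolver, Node and resolve are hypothetical names (not part of ANTLR), the input is assumed to be already split into N+1 operands and N binary operators, a higher number is assumed to bind tighter, and every operator is assumed to be present in the precedence map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of ANTLR): folds a flat operand/operator
// list into a tree by repeatedly picking the tightest-binding operator.
final class FlatListResolver {
    /** Minimal tree node: a leaf (left == null) or a binary operation. */
    static final class Node {
        final String text;
        final Node left;
        final Node right;

        Node(String text, Node left, Node right) {
            this.text = text;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return left == null ? text : "(" + left + " " + text + " " + right + ")";
        }
    }

    /**
     * Assumes operands.size() == operators.size() + 1 and that every
     * operator has an entry in the precedence map (higher binds tighter).
     */
    static Node resolve(List<String> operands, List<String> operators,
                        Map<String, Integer> precedence, Set<String> rightAssoc) {
        List<Node> nodes = new ArrayList<>();
        for (String operand : operands) {
            nodes.add(new Node(operand, null, null));
        }
        List<String> ops = new ArrayList<>(operators);
        while (!ops.isEmpty()) {
            int best = 0;
            for (int i = 1; i < ops.size(); i++) {
                int cmp = Integer.compare(precedence.get(ops.get(i)),
                                          precedence.get(ops.get(best)));
                // Strictly higher precedence wins; on a tie, a right-associative
                // operator prefers the rightmost occurrence.
                if (cmp > 0 || (cmp == 0 && rightAssoc.contains(ops.get(i)))) {
                    best = i;
                }
            }
            // Operator i sits between nodes i and i + 1: merge them into one node.
            Node merged = new Node(ops.remove(best), nodes.get(best), nodes.get(best + 1));
            nodes.set(best, merged);
            nodes.remove(best + 1);
        }
        return nodes.get(0);
    }
}
```

In a visitor, the operand strings would be replaced by whatever sub-results the visitor produces; the scan-and-merge loop stays the same. It is quadratic in the number of operators, which is usually fine for expressions of ordinary length.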
Alternatively, at runtime, make external calls to ANTLR to compile the .g4 and to javac to compile the generated ANTLR code, then use reflection to call it. However, this is probably much slower and arguably harder to pull off.
A parser rule that will work 'correctly' according to some runtime definition of Symbol precedence is possible. While not initially appearing to be an idiomatic choice, the standard alternative of deferring semantic analysis out of the parser would produce a very poorly differentiated parse tree -- making this a reasonable exception to the standard design rule.
In (overly simplified) form, the parser rule would be:
expr : LParen expr RParen # group
| expr Symbol expr # binary
| expr Symbol # postfix
| Symbol expr # prefix
| Int+ # value
;
To cure the ambiguity, add inline predicates:
expr : LParen expr RParen # group
| expr s=Symbol { binary($s) }? expr # binary
| expr s=Symbol { postfix($s) }? # postfix
| s=Symbol { prefix($s) }? expr # prefix
| Int+ # value
;
For any given Symbol, a single predicate method should evaluate as true.
Extending to multiple Symbol strings will add a bit of complexity (e.g., differentiating a binary operator from a postfix operator followed by a prefix operator), but the mechanics remain largely the same.
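The predicate methods themselves are not shown in this answer. One possible shape for them, assuming a runtime table that maps each symbol to exactly one fixity, is sketched below. FixityRegistry, Fixity and register are hypothetical names, and the methods take the symbol text rather than a Token, so a grammar predicate would have to pass the token's text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical runtime fixity table (not part of ANTLR). Storing one Fixity
// per symbol guarantees that at most one predicate is true for any symbol.
final class FixityRegistry {
    enum Fixity { BINARY, PREFIX, POSTFIX }

    private final Map<String, Fixity> fixities = new HashMap<>();

    /** Registers (or overwrites) the fixity of a symbol; returns this for chaining. */
    FixityRegistry register(String symbol, Fixity fixity) {
        fixities.put(symbol, fixity);
        return this;
    }

    // Unknown symbols map to null, so all three predicates return false for them.
    boolean binary(String symbol)  { return fixities.get(symbol) == Fixity.BINARY; }
    boolean prefix(String symbol)  { return fixities.get(symbol) == Fixity.PREFIX; }
    boolean postfix(String symbol) { return fixities.get(symbol) == Fixity.POSTFIX; }
}
```

A symbol such as "-" that is both binary and prefix does not fit this one-fixity-per-symbol table; supporting it needs the extra context-based disambiguation mentioned above.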
I think your approach is the right way. I propose the following grammar:
grammar Op;
options {
superClass=PrecedenceParser;
}
prog : expr[0] ;
expr[int _p] locals[Token op]: INT ({$op = _input.LT(1);} {getPrecedence($op) >= $_p}? OP expr[getPrecedence($op)])*;
INT : ( '0'..'9' )+ ;
OP : '+' | '*'; // all allowed symbols, should be extended
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines
The rule for OP should contain all allowed operator symbols. My restriction to + and * is only for simplicity. The parser superclass would be:
public abstract class PrecedenceParser extends Parser {
private Map<String, Integer> precedences;
public PrecedenceParser(TokenStream input) {
super(input);
this.precedences = new HashMap<>();
}
public PrecedenceParser putOperator(String op, int p) {
precedences.put(op, p);
return this;
}
public int getPrecedence(Token operator) {
Integer p = precedences.get(operator.getText());
if (p == null) {
return Integer.MAX_VALUE;
} else {
return p;
}
}
}
Results
with precedences {+ : 4, * : 3 }
(prog (expr 1 + (expr 2) * (expr 3 + (expr 4))))
with precedences {+ : 3, * : 4 }
(prog (expr 1 + (expr 2 * (expr 3) + (expr 4))))
Evaluating these sequences from left to right is equivalent to evaluating them with precedence.
This approach should work for larger sets of operators. ANTLR4 uses the same approach internally for precedence climbing, but it uses constants instead of a precedence map because it assumes that precedence is fixed when the parser is generated.
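To try the idea without generating a parser, here is a small standalone sketch of precedence climbing driven by a runtime precedence map. It is not the ANTLR-generated code: ClimbingEvaluator is a hypothetical class, the input is assumed to be pre-tokenized into strings, only + and * are implemented, unknown operators throw instead of getting Integer.MAX_VALUE, and the recursive call uses p + 1 (making operators left-associative) rather than the p used in the grammar above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Standalone sketch (no ANTLR): precedence climbing with a runtime table.
final class ClimbingEvaluator {
    private final Map<String, Integer> precedences = new HashMap<>();
    private List<String> tokens;
    private int pos;

    ClimbingEvaluator putOperator(String op, int p) {
        precedences.put(op, p);
        return this;
    }

    /** Evaluates an alternating INT/OP token list such as [1, +, 2, *, 3]. */
    long evaluate(List<String> input) {
        tokens = input;
        pos = 0;
        return expr(0);
    }

    // Mirrors: expr[p] : INT ( {prec(op) >= p}? OP expr[prec(op) + 1] )*
    private long expr(int minPrecedence) {
        long left = Long.parseLong(tokens.get(pos++));
        while (pos < tokens.size()) {
            String op = tokens.get(pos);
            Integer p = precedences.get(op);
            if (p == null) {
                throw new IllegalArgumentException("unknown operator: " + op);
            }
            if (p < minPrecedence) {
                break; // the predicate fails: leave the operator to an outer call
            }
            pos++;
            long right = expr(p + 1); // p + 1 makes the operator left-associative
            left = apply(op, left, right);
        }
        return left;
    }

    private static long apply(String op, long a, long b) {
        switch (op) {
            case "+": return a + b;
            case "*": return a * b;
            default: throw new IllegalArgumentException("unsupported operator: " + op);
        }
    }
}
```

With the tokens 1 + 2 * 3 + 4, the table {+ : 3, * : 4} groups the input as 1 + (2 * 3) + 4, while {+ : 4, * : 3} groups it as (1 + 2) * (3 + 4), so the same input evaluates differently depending only on the map filled in at runtime.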
