I have a recursive descent parser project, and I posted earlier about a stack overflow error I was getting. I have fixed that issue, but the parser now throws a ClassCastException.
From the main Swing GUI form, I take the input string and build it into a linked list of Strings, which I then pass to Parser.java. The error says:
java.lang.ClassCastException: java.lang.String cannot be cast to compilerfinalproject.Token
Here are some example inputs:
1+2+3
(1+2)-(3+4)
(1+2+3)*4
Below is my tokenizer code.
import java.util.LinkedList;
public class Tokenizer {
Tokenizer () {} // EMPTY CONSTRUCTOR FOR INSTANTIATION AT CompilerFinal.java
Tokenizer(String expression) { // CONSTRUCTOR FOR CALLING FROM LOCAL Tokenize METHOD
this.expression = expression.toCharArray(); // SET EXPRESSION TO TOKENIZE
this.pos = 0; // INITIALIZE AT INDEX 0
}
int pos; // STRING INDEX DECLARATION
char[] expression; // EXPRESSION DECLARATION
LinkedList<String> tokens = new LinkedList<>(); // LINKED LIST OF ALL TOKENS
enum tokenClass {PLUS, MINUS, MULTIPLY, DIVIDE, EXPONENT, NUMBER, IDENTIFIER, OPEN, CLOSE, NEGATIVE, DEFAULT} // TOKEN CLASSES (DEFAULT FOR INITIALIZATION PURPOSES ONLY)
class Lexeme { // EACH LEXEME HAS TOKEN CLASS AND TOKEN VALUE
String tokenClass, token;
Lexeme(String tokenClass, String token) {
this.tokenClass = tokenClass;
this.token = token;
}
}
Lexeme getToken() { // METHOD TO GET TOKENS
StringBuilder token = new StringBuilder(); // BUILDS TOKEN PER CHARACTER
boolean endOfToken = false; // FLAG WHETHER TO END TOKEN
tokenClass type = tokenClass.DEFAULT; // DEFAULT VALUE FOR TOKENCLASS
while (!endOfToken && hasMoreTokens()) // LOOP UNTIL A TOKEN IS COMPLETED
{
while(expression[pos] == ' ') // SKIP ALL LEADING SPACES
pos++;
switch (expression[pos])
{
case '+':
if(type != tokenClass.NUMBER && type != tokenClass.IDENTIFIER)
{
type = tokenClass.PLUS; // SET TOKEN CLASS AS OPERATOR
token.append(expression[pos]);
pos++;
}
endOfToken = true; // END TOKEN IMMEDIATELY
break;
case '-':
if(type != tokenClass.NUMBER && type != tokenClass.IDENTIFIER)
{
type = tokenClass.MINUS; // SET TOKEN CLASS AS OPERATOR
token.append(expression[pos]);
pos++;
}
endOfToken = true; // END TOKEN IMMEDIATELY
break;
case '*':
if(type != tokenClass.NUMBER && type != tokenClass.IDENTIFIER)
{
type = tokenClass.MULTIPLY; // SET TOKEN CLASS AS OPERATOR
token.append(expression[pos]);
pos++;
}
endOfToken = true; // END TOKEN IMMEDIATELY
break;
case '/':
if(type != tokenClass.NUMBER && type != tokenClass.IDENTIFIER)
{
type = tokenClass.DIVIDE; // SET TOKEN CLASS AS OPERATOR
token.append(expression[pos]);
pos++;
}
endOfToken = true; // END TOKEN IMMEDIATELY
break;
case '^':
if(type != tokenClass.NUMBER && type != tokenClass.IDENTIFIER)
{
type = tokenClass.EXPONENT; // SET TOKEN CLASS AS OPERATOR
token.append(expression[pos]);
pos++;
}
endOfToken = true; // END TOKEN IMMEDIATELY
break;
case '(': // OPEN PARENTHESES
if(type != tokenClass.NUMBER && type != tokenClass.IDENTIFIER)
{
type = tokenClass.OPEN; // SET TOKEN CLASS AS OPEN
token.append(expression[pos]);
pos++;
}
endOfToken = true; // END TOKEN IMMEDIATELY
break;
case ')': // CLOSE PARENTHESES
if(type != tokenClass.NUMBER && type != tokenClass.IDENTIFIER)
{
type = tokenClass.CLOSE; // SET TOKEN CLASS AS CLOSE
token.append(expression[pos]);
pos++;
}
endOfToken = true; // END TOKEN IMMEDIATELY
break;
case ' ': // SKIP WHITESPACE
endOfToken = true;
pos++;
break;
default:
if(Character.isDigit(expression[pos]) || expression[pos] == '.') // FOR NUMBERS AND DECIMAL POINTS
{
token.append(expression[pos]);
type = tokenClass.NUMBER; // SET TOKENCLASS AS NUMBER
}
else if(Character.isAlphabetic(expression[pos])) // FOR IDENTIFIERS
{
token.append(expression[pos]);
type = tokenClass.IDENTIFIER;
}
pos++; // NO END TOKEN TO TAKE INTO ACCOUNT MULTIPLE DIGIT NUMBERS
break;
}
}
return new Lexeme(type.name().toLowerCase(), token.toString());
}
boolean hasMoreTokens() { // CONDITION CHECKING
return pos < expression.length;
}
public LinkedList tokenize (String expression) { // CALLED FROM CompilerFinal.java TO GET TOKENS IN ARRAYLIST
Tokenizer tokenizer = new Tokenizer(expression); // INSTANTIATE
while (tokenizer.hasMoreTokens()) // GETTING ALL TOKENS
{
Lexeme nextToken = tokenizer.getToken();
tokens.add(nextToken.token);
}
return tokens;
}
public String getLexeme (String expression) // CALLED FROM CompilerFinal.java FOR DISPLAYING IN GUI FORM
{
StringBuilder lexemeList = new StringBuilder();
Tokenizer tokenizer = new Tokenizer(expression); // INSTANTIATE
lexemeList.append("LEXEMES:\n");
while (tokenizer.hasMoreTokens()) // GETTING ALL TOKENS
{
Lexeme nextToken = tokenizer.getToken();
lexemeList.append(nextToken.token).append("\t").append(nextToken.tokenClass).append("\n");
}
return lexemeList.toString();
}
}
Below is my parser code. I have included the grammar I used in the comments.
import java.util.LinkedList;
class Token {
public static final int PLUS = 0;
public static final int MINUS = 1;
public static final int MULTIPLY = 2;
public static final int DIVIDE = 3;
public static final int EXPONENT = 4;
public static final int NUMBER = 5;
public static final int IDENTIFIER = 6;
public static final int OPEN = 7;
public static final int CLOSE = 8;
//public static final int NEGATIVE = 7;
public final int token; // FIELDS TO HOLD DATA PER TOKEN
public final String sequence;
public Token (int token, String sequence) {
super();
this.token = token;
this.sequence = sequence;
}
}
public class Parser {
private Token next; // POINTER FOR NEXT TOKEN
private final LinkedList<Token> tokens; // LIST OF TOKENS PRODUCED BY TOKENIZER
private int counter = 0;
public Parser(LinkedList tokens)
{
this.tokens = (LinkedList<Token>) tokens.clone(); // GET LINKEDLIST
next = this.tokens.getFirst(); // ASSIGNS FIRST ELEMENT OF LINKEDLIST
}
//////// START OF PARSING METHODS ////////
/*
GRAMMAR:
E -> TE' | TE''
E' -> +E | e
E'' -> -E | e
T -> FT' | FT''
T' -> *T | e
T'' -> /T | e
F -> (E) | -F | "NUMBER" | "IDENTIFIER"
*/
public boolean Parse ()
{
return E(); // INVOKE START SYMBOL
}
private boolean term (int token) // GETS NEXT TOKEN
{
boolean flag = false;
if(next.token == token)
flag = true;
counter++; // INCREMENT COUNTER
if(counter < tokens.size()) // POINT TO NEXT TOKEN
next = tokens.get(counter);
return flag;
}
///////// START OF LIST OF PRODUCTIONS /////////
//////// E -> TE' | TE'' ////////
private boolean E()
{
return E1() || E2();
}
private boolean E1 ()
{
// E -> TE'
int flag = counter;
boolean result = true;
if(!( T() && E_P() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean E2 ()
{
// E -> TE''
int flag = counter;
boolean result = true;
if(!( T() && E_PP() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
//////// E' -> +E | e ////////
private boolean E_P()
{
return E_P1() || E_P2();
}
private boolean E_P1()
{
// E' -> +E
int flag = counter;
boolean result = true;
if(!( term(Token.PLUS) && E() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean E_P2()
{
// E' -> e
return true;
}
//////// E'' -> -E | e ////////
private boolean E_PP()
{
return E_PP1() || E_PP2();
}
private boolean E_PP1()
{
// E'' -> -E
int flag = counter;
boolean result = true;
if(!( term(Token.MINUS) && E() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean E_PP2()
{
// E'' -> e
return true;
}
//////// T -> FT' | FT'' ////////
private boolean T()
{
return T1() || T2();
}
private boolean T1()
{
// T -> FT'
int flag = counter;
boolean result = true;
if(!( F() && T_P() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean T2()
{
// T -> FT''
int flag = counter;
boolean result = true;
if(!( F() && T_PP() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
//////// T' -> *T | e ////////
private boolean T_P()
{
return T_P1() || T_P2();
}
private boolean T_P1()
{
// T' -> *T
int flag = counter;
boolean result = true;
if(!( term(Token.MULTIPLY) && T() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean T_P2()
{
// T' -> e
return true;
}
//////// T'' -> /T | e ////////
private boolean T_PP()
{
return T_PP1() || T_PP2();
}
private boolean T_PP1()
{
// T'' -> /T
int flag = counter;
boolean result = true;
if(!( term(Token.DIVIDE) && T() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean T_PP2()
{
// T'' -> e
return true;
}
//////// F -> (E) | -F | "NUMBER" | "IDENTIFIER" ////////
private boolean F()
{
return F1() || F2() || F3() || F4();
}
private boolean F1()
{
// F -> (E)
int flag = counter;
boolean result = true;
if(!( term(Token.OPEN) && E() && term(Token.CLOSE) ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean F2()
{
// F -> -F
int flag = counter;
boolean result = true;
if(!( term(Token.MINUS) && F() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean F3()
{
// F -> NUMBER
int flag = counter;
boolean result = true;
if(!( term(Token.NUMBER) ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean F4()
{
// F -> IDENTIFIER
int flag = counter;
boolean result = true;
if(!( term(Token.IDENTIFIER) ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
}
First thing, change this:
public LinkedList tokenize (String expression) {...}
to this:
public LinkedList<String> tokenize (String expression) {...}
And change this:
public Parser(LinkedList tokens) {...}
to this:
public Parser(LinkedList<Token> tokens) {...}
LinkedList (without a generic part) is called a raw type and facilitates an unchecked conversion. If there are more places where you use raw types, they also need to be changed. Removing all raw types from your code will almost certainly lead you to the error.
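To illustrate why the raw type hides the problem until run time, here is a minimal sketch. The class and method names (RawTypeDemo, rawTokens, failsOnRead) are made up for this example and are not from the question's code. The list really holds Strings, the raw return type lets it be assigned to a LinkedList<Token> with only a compiler warning, and the failure shows up only when an element is read as a Token.

```java
import java.util.LinkedList;

final class RawTypeDemo {
    // Hypothetical stand-in for the question's Token class.
    static final class Token {
        final String sequence;
        Token(String sequence) { this.sequence = sequence; }
    }

    // Raw return type, as in the question's tokenize(): the element type is lost.
    @SuppressWarnings("rawtypes")
    static LinkedList rawTokens() {
        LinkedList<String> tokens = new LinkedList<>();
        tokens.add("1");
        return tokens;
    }

    /** Returns true if reading an element as a Token fails with ClassCastException. */
    @SuppressWarnings("unchecked")
    static boolean failsOnRead() {
        // Unchecked conversion: this compiles with only a warning.
        LinkedList<Token> tokens = rawTokens();
        try {
            Token first = tokens.getFirst(); // the compiler-inserted cast to Token fails here
            return first == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnRead());
    }
}
```

With LinkedList<String> as the declared return type, the assignment to LinkedList<Token> would be rejected by the compiler instead.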
I strongly suspect you have code essentially like this:
Tokenizer t = ...;
// passing LinkedList<String> as a LinkedList<Token>
Parser p = new Parser(t.tokenize(...));
In other words you have skipped a step where you need to convert Strings to Tokens.
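One possible shape for that missing step is sketched below. It is only an assumption about how the conversion could look: the Token class here is a cut-down copy of the one in the question, and classify and toTokens are hypothetical helper names. Each string is classified by its text, so operators and parentheses map to their constants, anything starting with a digit or a decimal point is treated as a NUMBER, and everything else as an IDENTIFIER.

```java
import java.util.LinkedList;
import java.util.List;

final class TokenConversionSketch {
    // Minimal stand-in for the question's Token class (same constants and fields).
    static final class Token {
        static final int PLUS = 0, MINUS = 1, MULTIPLY = 2, DIVIDE = 3, EXPONENT = 4,
                NUMBER = 5, IDENTIFIER = 6, OPEN = 7, CLOSE = 8;
        final int token;
        final String sequence;
        Token(int token, String sequence) {
            this.token = token;
            this.sequence = sequence;
        }
    }

    // Hypothetical helper: picks a token class from the text of a non-empty lexeme.
    static int classify(String lexeme) {
        switch (lexeme) {
            case "+": return Token.PLUS;
            case "-": return Token.MINUS;
            case "*": return Token.MULTIPLY;
            case "/": return Token.DIVIDE;
            case "^": return Token.EXPONENT;
            case "(": return Token.OPEN;
            case ")": return Token.CLOSE;
            default:
                char first = lexeme.charAt(0);
                return (Character.isDigit(first) || first == '.')
                        ? Token.NUMBER
                        : Token.IDENTIFIER;
        }
    }

    // Hypothetical helper: converts the tokenizer's strings into Token objects.
    static LinkedList<Token> toTokens(List<String> lexemes) {
        LinkedList<Token> result = new LinkedList<>();
        for (String lexeme : lexemes) {
            if (lexeme.isEmpty()) {
                continue; // defensive: skip empty strings
            }
            result.add(new Token(classify(lexeme), lexeme));
        }
        return result;
    }
}
```

The parser could then be constructed with something like new Parser(toTokens(t.tokenize(...))), once both signatures use the generic types shown above.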
Given an arbitrary number of TextField inputs (t1, t2, t3, ...) and a custom boolean string input from a JTextArea, I need to check whether lines in a file match the custom boolean expression. It needs to support nested parentheses.
Example:
User enters "str1" into t1, "str2" into t2, "str3" into t3, "str4" into t4, "str5" into t5.
User enters the following into the JTextArea for the custom boolean:
"not ((t1 and not t3) or (t4 and t2)) or t5"
Then based on these inputs, I must filter a file and return lines in the file that match the custom boolean based on a "contains" relationship (e.g. "t1 and not t3" means a line must contain the string t1 and not contain the string t3).
For example a file with the following two lines:
str5
str4 str2
The filter would only return str5 because it is the only line that matches the custom boolean.
I am having trouble even getting started. I have tried to think of a recursive solution but couldn't come up with anything. Also I tried non-recursive solutions but can't come up with anything either.
There is also the problem of the end result boolean needing to take in a parameter (each line in the file). I thought of maybe producing a sequence of operations to perform rather than a boolean that somehow takes in a parameter. But I can't figure out how to get this sequence in the first place.
Here is what I have now. It is very bad and I am thinking of scrapping this approach.
public class CustomInputParser {
private ArrayList<String> pairs;
private String inp;
private HashMap<Integer,String> atomMap;
public CustomInputParser() {
this.pairs = null;
this.inp = "";
this.atomMap = new HashMap<Integer,String>();
}
public void findAtoms() {
int i = 0;
for(String s : this.pairs) {
String[] indices = s.split(",");
int begin = Integer.valueOf(indices[0]);
int end = Integer.valueOf(indices[1]);
if(!inp.substring(begin+1, end).contains("(")) {
this.pairs.set(i, this.pairs.get(i) + ",#");
}
i++;
}
}
public void computeAtoms() {
int i = 0;
for(String s : this.pairs) {
if(s.contains(",#")) {
String[] indices = s.split(",");
int begin = Integer.valueOf(indices[0]);
int end = Integer.valueOf(indices[1]);
//this.pairs.set(i,this.pairs.get(i).replace(",a", ""));
this.pairs.set(i, this.pairs.get(i) + ","+inp.substring(begin+1, end));
this.atomMap.put(begin,this.pairs.get(i).split(",")[3]+"#"+String.valueOf(end));
}
i++;
}
System.out.println(this.pairs.toString());
System.out.println(this.atomMap.toString());
}
public void replaceAtoms() {
int i = 0;
for(String s : this.pairs) {
if(!(s.contains("o") || s.contains("a") || s.contains("n"))) {
String[] indices = s.split(",");
int begin = Integer.valueOf(indices[0])+1;
int end = Integer.valueOf(indices[1]);
for(int j = begin; j < end; j++) {
if(inp.charAt(j) == '(') {
if(atomMap.containsKey(j)) {
this.pairs.set(i, this.pairs.get(i) + ","+j+"#"+atomMap.get(j).split("#")[1]+">"+atomMap.get(j).split("#")[0]);
}
else {
this.pairs.set(i,"!"+ this.pairs.get(i));
}
}
}
}
i++;
}
System.out.println(this.pairs.toString());
}
public ArrayList<String> getPairs(String str){
this.inp = str;
ArrayList<String> res = new ArrayList<String>();
char[] arr = str.toCharArray();
Stack<Integer> s = new Stack<Integer>();
for(int i = 0; i < arr.length; i++) {
if(arr[i] == '(') {
s.push(i);
}
if(arr[i] == ')') {
if(s.empty()) {
return null;
}
else {
Integer start = s.pop();
Integer end = Integer.valueOf(i);
res.add(start.toString() + "," + end.toString());
}
}
}
if(!s.empty()) {
return null;
}
this.pairs = res;
return res;
}
public static void main(String[] args) {
String x = "((not t1 and ((not t2 or t4) or (t3 or t4))) or (t5 and not t6)) and t7";
x = x.replace("not", "n").replace("and","a").replace("or", "o").replace("t", "").replace(" ", "");
System.out.println(x);
CustomInputParser c = new CustomInputParser();
System.out.println(c.getPairs(x).toString());
c.findAtoms();
c.computeAtoms();
c.replaceAtoms();
}
}
The first step is to tokenize the input. Define
enum Token {VAR, LP, RP, NOT, AND, OR, END}
LP and RP are parentheses. Now define a tokenizer class that looks something like this:
class Tokenizer {
Tokenizer(String input) {...}
void reset() {...}
Token getNext() {...}
String getVarName() {...}
}
Calling getNext() on your example in a loop should return
LP LP NOT VAR AND LP LP NOT VAR OR VAR RP OR LP VAR OR VAR RP RP RP OR LP VAR AND NOT VAR RP RP AND VAR END
Calling getVarName() immediately after a VAR has been returned by getNext() gives you the name of the variable (e.g. "t42").
There are many ways to implement little scanners like this. You should do this first and make sure it's bulletproof by testing. Trying to build a parser on top of a flaky scanner is torture.
As I said in comments, I'd consider recursive descent parsing. If you have a suitable grammar, writing an RD parser is a very short step as the Dragon Book (also mentioned above) shows.
A reasonable grammar (using tokens as above) is
Expr -> Term AND Term
| Term OR Term
| Term END
Term -> NOT Term
| Opnd
Opnd -> VAR
| LP Expr RP
For example, here is how you'd get started. It shows the first rule converted to a function:
class Evaluator {
final Tokenizer tokenizer = ...; // Contains the expression text.
final Map<String, Boolean> env = ...; // Environment: variables to values.
Token lookAhead; // Holds the token we're parsing right now.
Evaluator(Tokenizer tokenizer, Map<String, Boolean> env) { ... }
void advance() {
lookAhead = tokenizer.getNext();
}
boolean expr() {
boolean leftHandSide = term(); // Parse the left hand side recursively.
Token op = lookAhead; // Remember the operator.
if (op == Token.END) return leftHandSide; // Oops. That's all.
advance(); // Skip past the operator.
boolean rightHandSide = term(); // Parse the right hand side recursively.
if (op == Token.AND) return leftHandSide && rightHandSide; // Evaluate!
if (op == Token.OR) return leftHandSide || rightHandSide;
return dieWithSyntaxError("Expected op, found " + op); // Assumed to throw; the return only satisfies the compiler.
}
boolean term() {...}
boolean opnd() {...}
}
The environment is used when a VAR is parsed. Its boolean value is env.get(tokenizer.getVarName()).
So to process the file, you'll
For each line
    For each variable tX in the expression
        See if the line contains the string tX is bound to in its text field.
        If so, put the mapping tX -> true in the environment
        else put tX -> false
    Reset the tokenizer
    Call Evaluator.evaluate(tokenizer, environment)
    If it returns true, print the line, else skip it.
This is the simplest approach I can think of. About 150 lines. Many optimizations are possible.
Added
Well since I can no longer take away the thrill of discovery, here is my version:
import static java.lang.Character.isDigit;
import static java.lang.Character.isWhitespace;
import java.util.HashMap;
import java.util.Map;
import static java.util.stream.Collectors.toMap;
public class TextExpressionSearch {
enum Token { VAR, LP, RP, NOT, AND, OR, END }
static class Tokenizer {
final String input;
int pos = 0;
String var;
Tokenizer(String input) {
this.input = input;
}
void reset() {
pos = 0;
var = null;
}
String getRead() {
return input.substring(0, pos);
}
Token getNext() {
var = null;
while (pos < input.length() && isWhitespace(input.charAt(pos))) {
++pos;
}
if (pos >= input.length()) {
return Token.END;
}
int start = pos++;
switch (input.charAt(start)) {
case 't':
while (pos < input.length() && isDigit(input.charAt(pos))) {
++pos;
}
var = input.substring(start, pos);
return Token.VAR;
case '(':
return Token.LP;
case ')':
return Token.RP;
case 'n':
if (input.startsWith("ot", pos)) {
pos += 2;
return Token.NOT;
}
break;
case 'a':
if (input.startsWith("nd", pos)) {
pos += 2;
return Token.AND;
}
break;
case 'o':
if (input.startsWith("r", pos)) {
pos += 1;
return Token.OR;
}
break;
}
throw new AssertionError("Can't tokenize: " + input.substring(start));
}
}
static class Evaluator {
final Tokenizer tokenizer;
final Map<String, Boolean> env;
Token lookAhead;
Evaluator(Tokenizer tokenizer, Map<String, Boolean> env) {
this.tokenizer = tokenizer;
this.env = env;
advance();
}
boolean die(String msg) {
throw new AssertionError(msg + "\nRead: " + tokenizer.getRead());
}
void advance() {
lookAhead = tokenizer.getNext();
}
void match(Token token) {
if (lookAhead != token) {
die("Expected " + token + ", found " + lookAhead);
}
advance();
}
boolean evaluate() {
boolean exprVal = expr();
match(Token.END);
return exprVal;
}
boolean expr() {
boolean lhs = negated();
switch (lookAhead) {
case AND:
advance();
return negated() && lhs;
case OR:
advance();
return negated() || lhs;
case END:
return lhs;
}
return die("Expected expr, found " + lookAhead);
}
boolean negated() {
switch (lookAhead) {
case NOT:
advance();
return !negated();
default:
return operand();
}
}
boolean operand() {
switch (lookAhead) {
case VAR:
if (!env.containsKey(tokenizer.var)) {
die("Undefined variable: " + tokenizer.var);
}
boolean varVal = env.get(tokenizer.var);
advance();
return varVal;
case LP:
advance();
boolean exprVal = expr();
match(Token.RP);
return exprVal;
}
return die("Expected operand, found " + lookAhead);
}
}
public static void main(String [] args) {
String expr = "((not t1 and ((not t2 or t4) or (t3 or t4))) or (t5 and not t6)) and t7";
Map<String, String> bindings = new HashMap<>();
bindings.put("t1", "str1");
bindings.put("t2", "str2");
bindings.put("t3", "str3");
bindings.put("t4", "str4");
bindings.put("t5", "str5");
bindings.put("t6", "str6");
bindings.put("t7", "str7");
String [] lines = {"str5 str7", "str4 str2"};
Tokenizer tokenizer = new Tokenizer(expr);
for (String line : lines) {
Map<String, Boolean> env =
bindings.entrySet().stream()
.collect(toMap(e -> e.getKey(), e -> line.contains(e.getValue())));
tokenizer.reset();
if (new Evaluator(tokenizer, env).evaluate()) {
System.out.println(line);
}
}
}
}
You can define a parser that returns a Predicate<String> that tests if a given string satisfies a conditional expression.
static Predicate<String> parse(String s, Map<String, String> map) {
return new Object() {
String[] tokens = Pattern.compile("[()]|[a-z][a-z0-9]*")
.matcher(s).results()
.map(MatchResult::group)
.toArray(String[]::new);
int length = tokens.length;
int index = 0;
String token = get();
String get() {
return token = index < length ? tokens[index++] : null;
}
boolean eat(String expect) {
if (expect.equals(token)) {
get();
return true;
}
return false;
}
Predicate<String> identifier() {
String id = token;
return s -> {
String value = map.get(id);
if (value == null)
throw new RuntimeException(
"identifier '" + id + "' undefined");
return s.contains(value);
};
}
Predicate<String> factor() {
boolean not = false;
Predicate<String> p;
if (eat("not"))
not = true;
switch (token) {
case "(":
get();
p = expression();
if (!eat(")"))
throw new RuntimeException("')' expected");
break;
case ")": case "not": case "and": case "or":
throw new RuntimeException("syntax error at '" + token + "'");
default:
p = identifier();
get();
break;
}
if (not)
p = p.negate();
return p;
}
Predicate<String> term() {
Predicate<String> p = factor();
while (eat("and"))
p = p.and(factor());
return p;
}
Predicate<String> expression() {
Predicate<String> p = term();
while (eat("or"))
p = p.or(term());
return p;
}
Predicate<String> parse() {
Predicate<String> p = expression();
if (token != null)
throw new RuntimeException("extra tokens string");
return p;
}
}.parse();
}
test case:
@Test
public void testParse() {
String s = "not ((t1 and not t3) or (t4 and t2)) or t5";
Map<String, String> map = new HashMap<>(Map.of(
"t1", "str1",
"t2", "str2",
"t3", "str3",
"t4", "str4",
"t5", "str5"));
Predicate<String> p = parse(s, map);
assertTrue(p.test("str5"));
assertTrue(p.test("str3"));
assertTrue(p.test("str1 str3"));
assertFalse(p.test("str1"));
assertFalse(p.test("str2 str4"));
// you can change value of variables.
assertFalse(p.test("str1 FOO"));
map.put("t5", "FOO");
assertTrue(p.test("str1 FOO"));
}
syntax:
expression = term { "or" term }
term = factor { "and" factor }
factor = [ "not" ] ( "(" expression ")" | identifier )
identifier = letter { letter | digit }
letter = "a" | "b" | ... | "z"
digit = "0" | "1" | ... | "9"
For posterity, here is my shunting yard solution which includes input validation:
public class CustomInputParser {
private Stack<Character> ops;
private LinkedList<Character> postFix;
private HashMap<Character, Integer> precedence;
private Stack<Boolean> eval;
private HashMap<Integer, String> termsMap;
private String customBool;
public CustomInputParser(HashMap<Integer, String> tMap, String custBool) {
this.ops = new Stack<Character>();
this.eval = new Stack<Boolean>();
this.postFix = new LinkedList<Character>();
this.termsMap = tMap;
this.customBool = custBool;
this.precedence = new HashMap<Character, Integer>();
precedence.put('n', 1);
precedence.put('a', 2);
precedence.put('o',3);
precedence.put('(', 4);
}
private int inToPost() {
char[] expr = convertToArr(this.customBool);
char c;
for(int i = 0; i < expr.length; i++) {
c = expr[i];
if(isOp(c)) {
if(processOp(c) != 0) return -1;
}
else {
if(!Character.isDigit(c)) {
return -1;
}
// I initially made the mistake of using a queue of Characters for the postfix form.
// That only worked for up to 9 operands (a multi-digit reference would add multiple
// chars to the postfix queue for a single reference).
// This loop is a lazy workaround:
// 1. get the string of the reference (e.g. "11")
// 2. convert it to int
// 3. store the char value of the int in postfix
// 4. when evaluating operands in postfix eval, convert char back to int to get the termsMap key
String num = "";
while(i < expr.length) {
if(!Character.isDigit(expr[i])) {
i--;
break;
}
c = expr[i];
num += c;
i++;
}
int j = Integer.valueOf(num);
c = (char) j;
postFix.offer(c); //enqueue
}
}
while(!ops.empty()) {
if(ops.peek() == '(')return -1; //no matching close paren for the open paren
postFix.offer(ops.pop()); //pop and enqueue all remaining ops from stack
}
return 0;
}
private boolean isOp(char c) {
if(c == '(' || c == ')' || c =='n' || c=='a' || c=='o') {
return true;
}
return false;
}
private int processOp(char c) {
if (ops.empty() || c == '(') {
ops.push(c);
}
else if(c == ')') {
while(ops.peek() != '(') {
postFix.offer(ops.pop()); //pop and enqueue ops wrapped in parens
if(ops.empty()) return -1; //no matching open paren for the close paren
}
ops.pop(); // don't enqueue open paren, just remove it from stack
}
else if(precedence.get(c) > precedence.get(ops.peek())) {
postFix.offer(ops.pop()); //pop and enqueue the higher precedence op
ops.push(c);
}
else {
ops.push(c);
}
return 0;
}
public boolean evaluate(String s) {
while(!postFix.isEmpty()) {
char c = postFix.poll();
boolean op1, op2;
switch(c) {
case 'n':
op1 = eval.pop();
eval.push(!op1);
break;
case 'a':
op1 = eval.pop();
op2 = eval.pop();
eval.push(op1 && op2);
break;
case 'o':
op1 = eval.pop();
op2 = eval.pop();
eval.push(op1 || op2);
break;
default:
int termKey = (int) c;
String term = this.termsMap.get(termKey);
eval.push(s.contains(String.valueOf(term)));
break;
}
}
return eval.pop();
}
private char[] convertToArr(String x) {
x = x.replace("not", "n").replace("and","a").replace("or", "o").replace("t", "").replace(" ", "");
return x.toCharArray();
}
public static void main(String[] args) {
String customBool = "(t1 and not (t2 and t3)) or (t4 and not t5)";
HashMap<Integer,String> termsMap = new HashMap<Integer, String>();
termsMap.put(1,"str1");
termsMap.put(2,"str2");
termsMap.put(3,"str3");
termsMap.put(4,"str4");
termsMap.put(5,"str5");
CustomInputParser c = new CustomInputParser(termsMap, customBool);
if(c.inToPost() != 0) {
System.out.println("invalid custom boolean");
}
else {
System.out.println(c.evaluate("str1str5"));
}
}
}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Scanner;
import static java.lang.System.in;
import static java.lang.System.out;
/*
 * Use a stack to check parentheses, balanced and nesting
 * The parentheses are: (), [] and {}
 *
 * See:
 * - UseAStack
 */
public class Ex3CheckParen {
public static void main(String[] args) {
new Ex3CheckParen().program();
}
void program() {
// All should be true
out.println(checkParentheses("()"));
out.println(checkParentheses("(()())"));
out.println(!checkParentheses("(()))")); // Unbalanced
out.println(!checkParentheses("((())")); // Unbalanced
out.println(checkParentheses("({})"));
out.println(!checkParentheses("({)}")); // Bad nesting
out.println(checkParentheses("({} [()] ({}))"));
out.println(!checkParentheses("({} [() ({)})")); // Unbalanced and bad nesting
}
// This is interesting because have to return, but what if no match?!?
boolean checkParentheses(String str) {
Deque<Character> stack = new ArrayDeque<>();
String k = "({[";
String s = ")]}";
for (int i = 0; i < str.length(); i++) {
if (k.contains(String.valueOf(str.charAt(i)))) {
stack.push(str.charAt(i));
} else if (s.contains(String.valueOf(str.charAt(i)))) {
if (matching(stack.peek()) == str.charAt(i)) { //ILLEGAL ARGUMENT EXCEPTION HERE
return true;
}
} else {
return false;
}
}
return false;
}
char matching(char ch) {
//char c = must initialize but to what?!
switch (ch) {
case ')':
return '('; // c = '('
case ']':
return '[';
case '}':
return '{';
default:
// return c;
throw new IllegalArgumentException("No match found");
}
}
}
I'm getting an IllegalArgumentException in the if statement that calls matching(), and I can't figure out the cause.
Maybe something like this?
public class Ex3CheckParen {
public static void main(String[] args) {
new Ex3CheckParen().program();
}
void program() {
// All should be true
out.println(checkParentheses("()"));
out.println(checkParentheses("(()())"));
out.println(!checkParentheses("(()))")); // Unbalanced
out.println(!checkParentheses("((())")); // Unbalanced
out.println(checkParentheses("({})"));
out.println(!checkParentheses("({)}")); // Bad nesting
out.println(checkParentheses("({} [()] ({}))"));
out.println(!checkParentheses("({} [() ({)})")); // Unbalanced and bad nesting
}
// This is interesting because have to return, but what if no match?!?
boolean checkParentheses(String str) {
Deque<Character> stack = new ArrayDeque<>();
String k = "({[";
String s = ")]}";
for (int i = 0; i < str.length(); i++) {
if (k.contains(String.valueOf(str.charAt(i)))) {
stack.push(str.charAt(i));
} else if (s.contains(String.valueOf(str.charAt(i)))) {
if (matching(stack.peek(), str.charAt(i))) {
return true;
}
} else {
return false;
}
}
return false;
}
boolean matching(char ch1, char ch2) {
if ('(' == ch1 && ch2 == ')' || '[' == ch1 && ch2 == ']' || '{' == ch1 && ch2 == '}') {
return true;
}
return false;
}
}
By the way, in my opinion the method checkParentheses(String str) should look more like this:
boolean checkParentheses(String str) {
Deque<Character> stack = new ArrayDeque<>();
String open = "({[";
String close = ")]}";
int length = 0;
for (int i = 0; i < str.length(); i++) {
char currentChar = str.charAt(i);
if (open.contains(String.valueOf(currentChar))) {
stack.push(currentChar);
length++;
} else if (close.contains(String.valueOf(currentChar))) {
if (!stack.isEmpty() && matching(stack.peek(), currentChar)) {
stack.pop();
length--;
}
else {
return false;
}
} else {
return false;
}
}
if (length == 0)
return true;
return false;
}
But it is totally up to you...
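For context on the exception in the question: stack.peek() returns the most recently pushed opening bracket, and the original matching(char) only has cases for closing brackets, so the call falls through to the default branch and throws. As a further variation, here is a compact sketch that avoids the lookup method altogether by pushing the expected closing bracket each time an opening one is seen. The class and method names (BracketCheckSketch, isBalanced) are made up for this example. This variant skips characters that are not brackets, which is an assumption made so that the spaces in the question's test strings are tolerated.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class BracketCheckSketch {
    /** Returns true if every bracket in text is closed in the right order. */
    static boolean isBalanced(String text) {
        Deque<Character> expected = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            switch (c) {
                // On an opening bracket, remember which closing bracket must come next.
                case '(': expected.push(')'); break;
                case '[': expected.push(']'); break;
                case '{': expected.push('}'); break;
                case ')':
                case ']':
                case '}':
                    // A closing bracket must match the most recent unclosed opening bracket.
                    if (expected.isEmpty() || expected.pop() != c) {
                        return false;
                    }
                    break;
                default:
                    break; // ignore everything else, e.g. spaces
            }
        }
        // Anything left over is an opening bracket that was never closed.
        return expected.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("({} [()] ({}))"));
        System.out.println(isBalanced("({)}"));
    }
}
```

Checking isEmpty() before pop() also covers the case where a closing bracket arrives while the stack is empty.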
I'm working on this assignment and keep getting:
Exception in thread "main" java.lang.RuntimeException: Stack Underflow
at Stack.pop(Postfix.java:74)
at Postfix.eval(Postfix.java:221)
at Postfix.main(Postfix.java:112)
I don't know why. I have gone over the stack code and believe it is written correctly, and I can't see why it pops an empty stack for the input (3*4)/5.
import java.io.IOException;
class CharStack
{
private final int STACKSIZE= 80;
private int top;
private char[] items;
public CharStack(){
items = new char[STACKSIZE];
top =-1;
}
public boolean empty() {
if(top==-1){
return true;
}
return false;
}
public char pop() {
if(empty()){
throw new RuntimeException("Stack Underflow");
}
return items[top--];
}
public void push(char symb)
{
if(top == STACKSIZE -1) {
throw new RuntimeException("Stack Overflow");
}
items[++top] =symb;
}
public char peek() {
if(empty()){
throw new RuntimeException("Stack Underflow");
}
return items[top];
}
}
class Stack {
private final int STACKSIZE= 80;
private int top;
private double[] items;
public Stack(){
items = new double[STACKSIZE];
top =-1;
}
public void push(double x)
{
if(top == STACKSIZE -1) {
throw new RuntimeException("Stack Overflow");
}
items[++top] =x;
}
public double pop(){
if(empty()){
System.out.print(top);
throw new RuntimeException("Stack Underflow");
}
return items[top--];
}
public double peek() {
if(empty()){
throw new RuntimeException("Stack Underflow");
}
return items[top];
}
boolean empty()
{
if(top==-1){
return true;
}
return false;
}
}
public class Postfix {
public final static int MAXCOLS = 80;
public static void main(String[] args) throws IOException {
String infix, pfix;
System.out.println("Enter a infix String: ");
infix = readString().trim();
System.out.println("The original infix expr is: " + infix);
pfix = postfix(infix);
System.out.println("The Postfix expr is: " + pfix);
System.out.println("The value is : " + eval(pfix));
} // end main
public static boolean isOperand(char x)
{
if(x == '+')
{
return false;
}
else if(x == '-')
{
return false;
}
else if (x == '*')
{
return false;
}
else if (x == '/')
{
return false;
}
else if ( x== '$')
{
return false;
}
return true;
}
public static int operPrecedence(char oper)
{
if(oper == '+'||oper == '-' )
{
return 1;
}
else if (oper == '*' || oper == '/')
{
return 2;
}
else if (oper == '$')
{
return 3;
}
return 0;
}
public static boolean precedence(char top, char symb)
{
if ((top != '('||top != ')')&&symb == '(')
{
return false;
}
if (top == '(' && (symb != '('||symb != ')') )
{
return false;
}
else if((top != '('||top != ')')&&symb ==')' )
{
return true;
}
int opcode1, opcode2;
opcode1 =operPrecedence(top) ;
opcode2 =operPrecedence(symb) ;
if(opcode1>=opcode2){
return true;
}
return false;
}
public static String readString() throws IOException {
char[] charArray = new char[80];
int position = 0;
char c;
while ((c = (char) System.in.read()) != '\n') {
charArray[position++] = c;
}
return String.copyValueOf(charArray, 0, position); // turns a character array into a string, starting between zero and position-1
}// end read string
public static double eval(String infix) {
char c;
int position;
double opnd1, opnd2, value;
Stack opndstk = new Stack();
for (position = 0; position < infix.length(); position++) {
c = infix.charAt(position);
if (Character.isDigit(c)) // operand-convert the character represent of
// the digit into double and push it into the
// stack
{
opndstk.push((double) Character.digit(c, 10));
} else {
// operator
opnd2 = opndstk.pop();
opnd1 = opndstk.pop();
value = oper(c, opnd1, opnd2);
opndstk.push(value);
} // else
} // end for
return opndstk.pop();
}// end eval
public static String postfix(String infix) {
int position, outpos = 0;
char symb;
char[] postr = new char[MAXCOLS];
CharStack opstk = new CharStack();
for (position = 0; position < infix.length(); position++) {
symb = infix.charAt(position);
if (isOperand(symb)) {
postr[outpos++] = symb;
} else {
while (!opstk.empty() && precedence(opstk.peek(), symb)) {
postr[outpos++] = opstk.pop();
} // end while
if (symb != ')') {
opstk.push(symb);
} else {
opstk.pop();
}
} // end else
} // end for
while (!opstk.empty()) {
postr[outpos++] = opstk.pop();
}
return String.copyValueOf(postr, 0, outpos);
}// end pos
public static double oper(char symb, double op1, double op2) {
double value = 0;
switch (symb) {
case '+':
value = op1 + op2;
break;
case '-':
value = op1 - op2;
break;
case '*':
value = op1 * op2;
break;
case '/':
value = op1 / op2;
break;
case '$':
value = Math.pow(op1, op2);
break;
default:
throw new RuntimeException("illegal operator: " + symb);
}// end switch
return value;
}// end oper
}
At least part of the problem you're having is your isOperand method. The characters ( and ) are not operands; however, when they are passed to this method, it returns true. For a quick test, I added the following branches to the method, just before the final return true:
else if (x == '(')
{
return false;
}
else if (x == ')')
{
return false;
}
With that change, your example input, (3*4)/5, runs successfully. Note that the postfix output then leaves the brackets out and prints 34*5/. Postfix notation normally has no brackets, but I am guessing this may not be the output you expected.
Then, I looked at your eval method, which is where the problem is coming from, according to the error message I'm receiving:
Exception in thread "main" java.lang.RuntimeException: Stack Underflow
at Stack.pop(Postfix.java:74)
at Postfix.eval(Postfix.java:221)
at Postfix.main(Postfix.java:112)
Note the line Postfix.java:221, which indicates the line that called the method which created the error. If you output your character c right before that line is called, you'll notice that c is the ( character, which means your eval method is recognizing ( as an operator, and is attempting to pop two operands after it, causing your underflow.
All of this is fairly simple to determine with some System.out.println() calls, and looking at your error. I'll leave the actual fixing to you, but at least you've hopefully got a direction to head in now.
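As a rough illustration of the diagnosis above, here is a minimal sketch of an eval that refuses anything that is neither a digit nor a known operator, instead of silently treating it as an operator. It is not the asker's code: the class name PostfixEvalSketch and the helpers isOperator and apply are made up for this example, and it uses java.util.ArrayDeque rather than the hand-written Stack class.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixEvalSketch {
    // Hypothetical helper: true only for the operators used in the question.
    static boolean isOperator(char c) {
        return c == '+' || c == '-' || c == '*' || c == '/' || c == '$';
    }

    // Hypothetical helper: applies one binary operator ('$' is exponentiation).
    static double apply(char op, double left, double right) {
        switch (op) {
            case '+': return left + right;
            case '-': return left - right;
            case '*': return left * right;
            case '/': return left / right;
            case '$': return Math.pow(left, right);
            default: throw new IllegalArgumentException("illegal operator: " + op);
        }
    }

    // Evaluates a postfix string made of single-digit operands and operators.
    static double eval(String postfix) {
        Deque<Double> operands = new ArrayDeque<>();
        for (char c : postfix.toCharArray()) {
            if (Character.isDigit(c)) {
                operands.push((double) Character.digit(c, 10));
            } else if (isOperator(c)) {
                if (operands.size() < 2) {
                    throw new IllegalArgumentException("operator '" + c + "' needs two operands");
                }
                double right = operands.pop();
                double left = operands.pop();
                operands.push(apply(c, left, right));
            } else {
                // A bracket or any other stray character ends up here.
                throw new IllegalArgumentException("unexpected character in postfix: '" + c + "'");
            }
        }
        if (operands.size() != 1) {
            throw new IllegalArgumentException("malformed postfix expression");
        }
        return operands.pop();
    }

    public static void main(String[] args) {
        System.out.println(eval("34*5/")); // prints 2.4
    }
}
```

With this kind of guard, a postfix string that still contains a bracket fails with a message naming the offending character, rather than with a stack underflow further down.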
package edu.bsu.cs121.mamurphy;
import java.util.Stack;
public class Checker {
char openPara = '(';
char openBracket = '[';
char openCurly = '{';
char openArrow = '<';
char closePara = ')';
char closeBracket = ']';
char closeCurly = '}';
char closeArrow = '>';
public boolean checkString(String stringToCheck) {
Stack<Character> stack = new Stack<Character>();
for (int i = 0; i < stringToCheck.length(); i++) {
char c = stringToCheck.charAt(i);
if (c == openPara || c == openBracket || c == openCurly || c == openArrow) {
stack.push(c);
System.out.println(stack);
;
}
if (c == closePara) {
if (stack.isEmpty()) {
System.out.println("Unbalanced");
break;
} else if (stack.peek() == openPara) {
stack.pop();
} else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
}
if (c == closeBracket) {
if (stack.isEmpty()) {
System.out.println("Unbalanced");
break;
} else if (stack.peek() == openBracket) {
stack.pop();
} else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
}
if (c == closeCurly) {
if (stack.isEmpty()) {
System.out.println("Unbalanced");
break;
} else if (stack.peek() == openCurly) {
stack.pop();
} else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
}
if (c == closeArrow) {
if (stack.isEmpty()) {
System.out.println("Unbalanced");
break;
} else if (stack.peek() == openArrow) {
stack.pop();
} else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
}
}
return false;
}
}
I am currently trying to create a program where I check to see if a string is balanced or not. A string is balanced if and only if each opening character: (, {, [, and < have a matching closing character: ), }, ], and > respectively.
What happens is when checking through the string, if an opening character is found, it is pushed into a stack, and it checks to see if there is the appropriate closing character.
If there is a closing character before the opening character, then that automatically means that the string is unbalanced. Also, the string is automatically unbalanced if after going to the next character there is something still inside of the stack.
I tried to use
else if (stack.size() > 0) {
System.out.println("Unbalanced");
break;
}
as a way of seeing if the stack still had anything in it, but it still isn't working for me. Any advice on what to do?
For example, if the string input were ()<>{() then the program should run through like normal until it gets to the single { and then the code should realize that the string is unbalanced and output Unbalanced.
For whatever reason, my code does not do this.
The following logic is flawed (emphasis mine):
For example, if the string input were ()<>{() then the program should run through like normal until it gets to the single { and then the code should realize that the string is unbalanced and output Unbalanced.
In fact, the code can't conclude that the string is unbalanced until it has scanned the entire string and established that the { has no matching }. For all it knows, the full input could be ()<>{()} and be balanced.
To achieve this, you need to add a check that ensures that the stack is empty after the entire string has been processed. In your example, it would still contain the {, indicating that the input is not balanced.
I took a shot at answering this. My solution returns true if the string is balanced and enforces opening/closing order (i.e. ({)} returns false). I started with your code and tried to slim it down.
import java.util.HashMap;
import java.util.Map;
import java.util.Stack;
public class mamurphy {
private static final char openPara = '(';
private static final char openBracket = '[';
private static final char openCurly = '{';
private static final char openArrow = '<';
private static final char closePara = ')';
private static final char closeBracket = ']';
private static final char closeCurly = '}';
private static final char closeArrow = '>';
public static void main(String... args) {
System.out.println(checkString("{}[]()90<>"));//true
System.out.println(checkString("(((((())))"));//false
System.out.println(checkString("((())))"));//false
System.out.println(checkString(">"));//false
System.out.println(checkString("["));//false
System.out.println(checkString("{[(<>)]}"));//true
System.out.println(checkString("{[(<>)}]"));//false
System.out.println(checkString("( a(b) (c) (d(e(f)g)h) I (j<k>l)m)"));//true
}
public static boolean checkString(String stringToCheck) {
final Map<Character, Character> closeToOpenMap = new HashMap<>();
closeToOpenMap.put(closePara, openPara);
closeToOpenMap.put(closeBracket, openBracket);
closeToOpenMap.put(closeCurly, openCurly);
closeToOpenMap.put(closeArrow, openArrow);
Stack<Character> stack = new Stack<>();
final char[] stringAsChars = stringToCheck.toCharArray();
for (int i = 0; i < stringAsChars.length; i++) {
final char current = stringAsChars[i];
if (closeToOpenMap.values().contains(current)) {
stack.push(current); //found an opening char, push it!
} else if (closeToOpenMap.containsKey(current)) {
// use equals(): != on two Character objects compares references, not values
if (stack.isEmpty() || !closeToOpenMap.get(current).equals(stack.pop())) {
return false;//found closing char without correct opening char on top of stack
}
}
}
if (!stack.isEmpty()) {
return false;//still have opening chars after consuming whole string
}
return true;
}
}
Here's an alternate approach:
private static final char[] openParens = "[({<".toCharArray();
private static final char[] closeParens = "])}>".toCharArray();
public static boolean isBalanced(String expression){
Deque<Character> stack = new ArrayDeque<>();
for (char c : expression.toCharArray()){
for (int i = 0; i < openParens.length; i++){
if (openParens[i] == c){
// This is an open - put it in the stack
stack.push(c);
break;
}
if (closeParens[i] == c){
// This is a close - check the open is at the top of the stack
Character top = stack.poll(); // null if the stack is empty
if (top == null || top != openParens[i]){
return false;
}
break;
}
}
}
return stack.isEmpty();
}
It simplifies the logic to have two corresponding arrays of open and close symbols. You could also do this with even and odd positions in one array - ie. "{}<>", for example:
private static final char[] symbols = "[](){}<>".toCharArray();
public static boolean isBalanced(String expression){
Deque<Character> stack = new ArrayDeque<>();
for (char c : expression.toCharArray()){
for (int i = 0; i < symbols.length; i += 2){
if (symbols[i] == c){
// This is an open - put it in the stack
stack.push(c);
break;
}
if (symbols[i + 1] == c){
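// This is a close - check the open is at the top of the stack
Character top = stack.poll(); // null if the stack is empty
if (top == null || top != symbols[i]){
return false;
}
break;
}
}
}
return stack.isEmpty();
}
Note that poll returns null if the stack is empty. Comparing a null Character directly with a char would throw a NullPointerException when the value is unboxed, so the result is checked for null first; running out of stack then correctly returns false.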
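To make the behaviour of poll on an empty stack concrete, here is a small sketch. The class name PollNullSketch and the helper popMatches are invented for this illustration. It shows a null-safe comparison; comparing the result of poll directly against a char would unbox the value, which throws a NullPointerException when the deque is empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PollNullSketch {
    // Hypothetical helper: removes the top element and reports whether it equals expected.
    // Returns false (rather than throwing) when the stack is empty.
    static boolean popMatches(Deque<Character> stack, char expected) {
        Character top = stack.poll();          // null when the stack is empty
        return top != null && top == expected; // unboxed only after the null check
    }

    public static void main(String[] args) {
        Deque<Character> stack = new ArrayDeque<>();
        System.out.println(popMatches(stack, '(')); // prints false (empty stack)

        stack.push('(');
        System.out.println(popMatches(stack, '(')); // prints true

        try {
            // Unguarded comparison: the null Character is unboxed to char here.
            boolean mismatch = stack.poll() != '(';
            System.out.println(mismatch);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from unboxing null");
        }
    }
}
```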
For example, if the string input were ()<>{() then the program should run through like normal until it gets to the single { and then the code should realize that the string is unbalanced and output Unbalanced.
It is not clear from your example whether the boundaries can be nested like ([{}]). If they can, that logic will not work: the whole string has to be consumed to be sure that a missing closing character isn't at the end, so the string cannot reliably be deemed unbalanced at the point you indicate.
Here is my take on your problem:
BalanceChecker class:
package so_q33378870;
import java.util.Stack;
public class BalanceChecker {
private final char[] opChars = "([{<".toCharArray();
private final char[] edChars = ")]}>".toCharArray();
//<editor-fold defaultstate="collapsed" desc="support functions">
public boolean isOPChar(char c) {
for (char checkChar : opChars) {
if (c == checkChar) {
return true;
}
}
return false;
}
public boolean isEDChar(char c) {
for (char checkChar : edChars) {
if (c == checkChar) {
return true;
}
}
return false;
}
//NOTE: Unused.
// public boolean isBoundaryChar(char c) {
// boolean result;
// if (result = isOPChar(c) == false) {
// return isEDChar(c);
// } else {
// return result;
// }
// }
public char getOpCharFor(char c) {
for (int i = 0; i < edChars.length; i++) {
if (c == edChars[i]) {
return opChars[i];
}
}
throw new IllegalArgumentException("The character (" + c + ") received is not recognized as a closing boundary character.");
}
//</editor-fold>
public boolean isBalanced(char[] charsToCheck) {
Stack<Character> checkStack = new Stack<>();
for (int i = 0; i < charsToCheck.length; i++) {
if (isOPChar(charsToCheck[i])) {
//beginning char found. Add to top of stack.
checkStack.push(charsToCheck[i]);
} else if (isEDChar(charsToCheck[i])) {
if (checkStack.isEmpty()) {
//ending char found without beginning chars on the stack. UNBALANCED.
return false;
} else if (getOpCharFor(charsToCheck[i]) == checkStack.peek()) {
//ending char found matches last beginning char on the stack. Pop and continue.
checkStack.pop();
} else {
//ending char found, but doesn't match last beginning char on the stack. UNBALANCED.
return false;
}
}
}
//the string is balanced if and only if the stack is empty at the end.
return checkStack.empty();
}
public boolean isBalanced(String stringToCheck) {
return isBalanced(stringToCheck.toCharArray());
}
}
Main class (used for testing):
package so_q33378870;
public class main {
private static final String[] tests = {
//Single - Balanced.
"()",
//Single - Unbalanced by missing end.
"(_",
//Multiple - Balanced.
"()[]{}<>",
//Multiple - Unbalanced by missing beginning.
"()[]_}<>",
//Nested - Balanced.
"([{<>}])",
//Nested - Unbalanced by missing end.
"([{<>}_)",
//Endurance test - Balanced.
"the_beginning (abcd) divider (a[bc]d) divider (a[b{c}d]e) divider (a[b{c<d>e}f]g) the_end"
};
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
BalanceChecker checker = new BalanceChecker();
for (String s : tests) {
System.out.println("\"" + s + "\" is " + ((checker.isBalanced(s)) ? "BALANCED!" : "UNBALANCED!"));
}
}
}
I have the following code for a parser that accepts a linked list as a parameter from a parent class. It throws a stack overflow error after I input the expression.
I pass the input expression from a jTextField in a Swing GUI class, then return the boolean result to a jLabel in the same class. What could be causing the stack overflow error? Help please, Thanks!!
Example input:
1+2+3
(1+2)/(3+4)
import java.util.LinkedList;
class Token {
public static final int PLUS_MINUS = 0;
public static final int MULTIPLY_DIVIDE = 1;
public static final int EXPONENT = 2;
public static final int NUMBER = 3;
public static final int IDENTIFIER = 4;
public static final int OPEN = 5;
public static final int CLOSE = 6;
//public static final int NEGATIVE = 7;
public final int token; // FIELDS TO HOLD DATA PER TOKEN
public final String sequence;
public Token (int token, String sequence) {
super();
this.token = token;
this.sequence = sequence;
}
}
public class Parser {
private Token next; // POINTER FOR NEXT TOKEN
private LinkedList<Token> tokens; // LIST OF TOKENS PRODUCED BY TOKENIZER
private int counter = 0;
public Parser(LinkedList tokens)
{
this.tokens = (LinkedList<Token>) tokens.clone(); // GET LINKEDLIST
this.tokens.getFirst(); // ASSIGNS FIRST ELEMENT OF LINKEDLIST
}
//////// START OF PARSING METHODS ////////
/*
GRAMMAR:
E -> E+T | E-T | T | -E
T -> T*X | T/X | X
X -> X^F | F
F -> (E) | NUMBERS | IDENTIFIERS
F -> (E) | N | I
N -> D | ND
I -> IDENTIFIERS
*/
public boolean Parse ()
{
return E(); // INVOKE START SYMBOL
}
private boolean term (int token) // GETS NEXT TOKEN
{
boolean flag = false;
if(next.token == token)
flag = true;
counter++; // INCREMENT COUNTER
if(counter < tokens.size()) // POINT TO NEXT TOKEN
next = tokens.get(counter);
return flag;
}
///////// START OF LIST OF PRODUCTIONS /////////
//////// E -> E+T | E-T | T | -E ////////
private boolean E()
{
return E1() || E2() || E3();
}
private boolean E1 ()
{
// E -> E+T | E-T
int flag = counter;
boolean result = true;
if(!(E() && term(Token.PLUS_MINUS) && T() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean E2 ()
{
// E -> T
int flag = counter;
boolean result = true;
if(!T())
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean E3 ()
{
// E -> -E
int flag = counter;
boolean result = true;
if(!(term(Token.PLUS_MINUS) && E() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
//////// T -> T*X | T/X | X ////////
private boolean T()
{
return T1() || T2();
}
private boolean T1 ()
{
// T -> T*X | T/X
int flag = counter;
boolean result = true;
if(!(T() && term(Token.MULTIPLY_DIVIDE) && X() ))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean T2 ()
{
// T -> X
int flag = counter;
boolean result = true;
if(!X())
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
//////// X -> X^F | F ////////
private boolean X()
{
return X1() || X2();
}
private boolean X1()
{
// X-> X^F
int flag = counter;
boolean result = true;
if(!(X() && term(Token.EXPONENT) && F()))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean X2()
{
// X-> F
int flag = counter;
boolean result = true;
if(!F())
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
//////// F -> (E) | NUMBERS | IDENTIFIERS ////////
private boolean F()
{
return F1() || F2() || F3();
}
private boolean F1()
{
// F -> (E)
int flag = counter;
boolean result = true;
if(!(term(Token.OPEN) && E() && term(Token.CLOSE)))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean F2()
{
// F -> NUMBERS
int flag = counter;
boolean result = true;
if(!term(Token.NUMBER))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
private boolean F3()
{
// F -> IDENTIFIERS
int flag = counter;
boolean result = true;
if(!term(Token.IDENTIFIER))
{
counter = flag; // BACKTRACK
if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
next = tokens.get(counter);
result = false;
}
return result;
}
}
Your problem is that recursive descent parsing cannot handle left-recursive grammars. Your first production says "E -> E + T"; we call this left recursive because the first thing being derived is the thing you are defining. The way recursive descent works is to first match an "E", then a "+", then a "T". The problem is that your "E" method first calls the "E1" method, which immediately calls "E" again, which calls "E1", and so on in an infinite recursive loop. You need to eliminate the left recursion from your grammar if you want to use recursive descent. Here is a link with more information on left recursion and how to remove it: http://en.wikipedia.org/wiki/Left_recursion. So in summary, you are overflowing the stack because you have an infinite recursive loop.
Usually when you get a stack overflow it means that the program recursively calls one or more methods without end, so let's look at how your program would execute.
It appears that your code is executed using the Parse method (by the way, in Java the naming convention is to start method names with a lower-case letter).
public boolean Parse() {
return E(); // INVOKE START SYMBOL
}
Not too bad so far; the grammar specifies that E is first to be parsed, so E() is called. Let's look at the definition of E().
private boolean E() {
return E1() || E2() || E3();
}
Let's see what happens when this executes. Java evaluates this boolean expression from left to right: it calls E1() first, and moves on to E2() and then E3() only if the previous call returned false. So let's look at E1 now.
private boolean E1 () {
// E -> E+T | E-T
int flag = counter;
boolean result = true;
if(!(E() && term(Token.PLUS_MINUS) && T() )) {
counter = flag; // BACKTRACK
...
Here is the problem. Your flag is set, result is set to true, and the if statement immediately executes E(). Remember that E() was what was just being evaluated, and now E1() is calling E() again, which will execute E1(), forever (and if you debugged the program you would see in the application stack alternating calls to E1() and E()).
This is part of the trouble of recursive descent parsing. It is an elegant solution to parsing, but the grammar sometimes requires some rewriting or else this is the exact problem you run into, where you get trapped in a grammar rule. In order for this to work you need to rewrite the grammar (see this document on recursive descent parsing I found with a quick Google search).
There are two requirements for the grammar: it must be deterministic and it can't contain left recursion.
The problem you run into is left recursion, in nearly every rule:
E -> E+T | E-T | T | -E
This says that the nonterminal E can be E + T, and to recognize this you first have to recognize E, which can itself be E + T, ... (forever). This caused the problem in your program.
Rewriting the grammar by eliminating the left recursion will solve this problem, and make sure when you finish you have a deterministic grammar.
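As one possible way to do that rewrite, the sketch below replaces each left-recursive rule with a loop, for example E -> T { (+|-) T } in place of E -> E+T | E-T | T. This is only an illustration under assumptions, not the asker's parser: the class ExprRecognizer, the Kind enum and the tiny tokenize helper are invented here, and the recognizer only answers "valid or not", like the original.

```java
import java.util.ArrayList;
import java.util.List;

class ExprRecognizer {
    enum Kind { PLUS_MINUS, MULTIPLY_DIVIDE, EXPONENT, NUMBER, IDENTIFIER, OPEN, CLOSE }

    private final List<Kind> tokens;
    private int pos = 0;

    ExprRecognizer(List<Kind> tokens) { this.tokens = tokens; }

    // Hypothetical minimal tokenizer, just enough to feed the recognizer.
    static List<Kind> tokenize(String s) {
        List<Kind> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == ' ') {
                i++;
            } else if (Character.isDigit(c)) {
                while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
                out.add(Kind.NUMBER);
            } else if (Character.isLetter(c)) {
                while (i < s.length() && Character.isLetterOrDigit(s.charAt(i))) i++;
                out.add(Kind.IDENTIFIER);
            } else {
                switch (c) {
                    case '+': case '-': out.add(Kind.PLUS_MINUS); break;
                    case '*': case '/': out.add(Kind.MULTIPLY_DIVIDE); break;
                    case '^': out.add(Kind.EXPONENT); break;
                    case '(': out.add(Kind.OPEN); break;
                    case ')': out.add(Kind.CLOSE); break;
                    default: throw new IllegalArgumentException("unexpected character: " + c);
                }
                i++;
            }
        }
        return out;
    }

    // Consumes the next token only if it has the expected kind.
    private boolean accept(Kind kind) {
        if (pos < tokens.size() && tokens.get(pos) == kind) { pos++; return true; }
        return false;
    }

    // E -> PLUS_MINUS E | T { PLUS_MINUS T }
    private boolean e() {
        if (accept(Kind.PLUS_MINUS)) return e();
        if (!t()) return false;
        while (accept(Kind.PLUS_MINUS)) { if (!t()) return false; }
        return true;
    }

    // T -> X { MULTIPLY_DIVIDE X }
    private boolean t() {
        if (!x()) return false;
        while (accept(Kind.MULTIPLY_DIVIDE)) { if (!x()) return false; }
        return true;
    }

    // X -> F { EXPONENT F }
    private boolean x() {
        if (!f()) return false;
        while (accept(Kind.EXPONENT)) { if (!f()) return false; }
        return true;
    }

    // F -> OPEN E CLOSE | NUMBER | IDENTIFIER
    private boolean f() {
        if (accept(Kind.OPEN)) return e() && accept(Kind.CLOSE);
        return accept(Kind.NUMBER) || accept(Kind.IDENTIFIER);
    }

    // The input is valid only if E matches and every token was consumed.
    boolean parse() { pos = 0; return e() && pos == tokens.size(); }

    static boolean isValid(String s) { return new ExprRecognizer(tokenize(s)).parse(); }

    public static void main(String[] args) {
        System.out.println(isValid("1+2+3"));       // prints true
        System.out.println(isValid("(1+2)/(3+4)")); // prints true
        System.out.println(isValid("1+*2"));        // prints false
    }
}
```

Because every method consumes at least one token before it can recurse, the recursion depth is bounded by the input, and no backtracking counter is needed: each rule can decide what to do by looking at the next token only.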