Java precedence on abstract syntax tree?

One of my midterm review questions asks me to write out this tree in different notations (prefix, postfix, etc.). It also asks for these two: "Infix, Java precedence rules" and "Infix, left-to-right precedence".
What is the difference between the Java precedence rules and the plain left-to-right infix rule? I thought that Java precedence might require something like newlines, as in actual Java code, but I really don't see what is being asked here. Thanks in advance for your help.
Another question: how would you treat the d and e nodes?
If it were postfix, would (d e) f h * - be appropriate for that portion of the tree?

I think left-to-right precedence simply means that you just apply all the infix operators from left to right, so that
2 * 3 + 4 * 5
is interpreted as
((2 * 3) + 4) * 5 = 50
In Java and every other programming language I know of except APL, however, * is given higher precedence than + or -, which means the expression is interpreted as
(2 * 3) + (4 * 5) = 26
(Java has a lot more operators, so the order of precedence is pretty complicated. But if you're only going to see +, -, *, and /, all you need to know is that * and / have higher precedence; and that for operators with the same precedence, they're evaluated left to right.)
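To make the two readings concrete, here is a minimal Java sketch. The class name PrecedenceSketch and the helper evalLeftToRight are made up for illustration, and the helper assumes space-separated integer tokens with only +, -, * and /.

```java
class PrecedenceSketch {
    // Hypothetical helper: applies +, -, *, / strictly left to right,
    // ignoring precedence. Expects input such as "2 * 3 + 4 * 5".
    static int evalLeftToRight(String expr) {
        String[] t = expr.trim().split("\\s+");
        int acc = Integer.parseInt(t[0]);
        for (int i = 1; i + 1 < t.length; i += 2) {
            int rhs = Integer.parseInt(t[i + 1]);
            switch (t[i]) {
                case "+": acc += rhs; break;
                case "-": acc -= rhs; break;
                case "*": acc *= rhs; break;
                case "/": acc /= rhs; break;
                default: throw new IllegalArgumentException("unknown operator " + t[i]);
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(2 * 3 + 4 * 5);                    // Java precedence: 26
        System.out.println(evalLeftToRight("2 * 3 + 4 * 5")); // left to right: 50
    }
}
```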
I'm guessing that the assignment is asking you how the tree would be represented using the two different precedence rules. Of course, you could put parentheses around everything, which means the precedence rules wouldn't apply at all:
(foo (a, (b + c) * ((d ? e) - (f * h)), (j * k)) - g
[The ? is there because there seems to be a box missing from the diagram.] So you're probably supposed to write it in a way without unnecessary parentheses, which means you need to know the precedence rules.
To answer your last question about d and e: you should ask your instructor, because I'm guessing it means "misprint". Unless they've come up with some new kinds of syntax tree diagrams since I studied this, it looks like a box is missing.

Related

Why operator precedence being ignored here? [duplicate]

I am reading some Java text and got the following code:
int[] a = {4,4};
int b = 1;
a[b] = b = 0;
In the text, the author did not give a clear explanation and the effect of the last line is: a[1] = 0;
I am not so sure that I understand: how did the evaluation happen?

Let me say this very clearly, because people misunderstand this all the time:
Order of evaluation of subexpressions is independent of both associativity and precedence. Associativity and precedence determine in what order the operators are executed but do not determine in what order the subexpressions are evaluated. Your question is about the order in which subexpressions are evaluated.
Consider A() + B() + C() * D(). Multiplication is higher precedence than addition, and addition is left-associative, so this is equivalent to (A() + B()) + (C() * D()). But knowing that only tells you that the first addition will happen before the second addition, and that the multiplication will happen before the second addition. It does not tell you in what order A(), B(), C() and D() will be called! (It also does not tell you whether the multiplication happens before or after the first addition.) It would be perfectly possible to obey the rules of precedence and associativity by compiling this as:
d = D() // these four computations can happen in any order
b = B()
c = C()
a = A()
sum = a + b // these two computations can happen in any order
product = c * d
result = sum + product // this has to happen last
All the rules of precedence and associativity are followed there -- the first addition happens before the second addition, and the multiplication happens before the second addition. Clearly we can do the calls to A(), B(), C() and D() in any order and still obey the rules of precedence and associativity!
We need a rule unrelated to the rules of precedence and associativity to explain the order in which the subexpressions are evaluated. The relevant rule in Java (and C#) is "subexpressions are evaluated left to right". Since A() appears to the left of C(), A() is evaluated first, regardless of the fact that C() is involved in a multiplication and A() is involved only in an addition.
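A small Java sketch can make the left-to-right rule visible. The class EvalOrderSketch, the record helper and the return values 1 to 4 are invented for this illustration; only the order in which the calls are logged matters.

```java
import java.util.ArrayList;
import java.util.List;

class EvalOrderSketch {
    // Records the order in which the operand methods are actually called.
    static final List<String> calls = new ArrayList<>();

    static int record(String name, int value) {
        calls.add(name);
        return value;
    }

    static int A() { return record("A", 1); }
    static int B() { return record("B", 2); }
    static int C() { return record("C", 3); }
    static int D() { return record("D", 4); }

    static int run() {
        calls.clear();
        // Grouped as (A() + B()) + (C() * D()), yet the operands run left to right.
        return A() + B() + C() * D();
    }

    public static void main(String[] args) {
        System.out.println(run());  // 15
        System.out.println(calls);  // [A, B, C, D]
    }
}
```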
So now you have enough information to answer your question. In a[b] = b = 0 the rules of associativity say that this is a[b] = (b = 0); but that does not mean that the b=0 runs first! The rules of precedence say that indexing is higher precedence than assignment, but that does not mean that the indexer runs before the rightmost assignment.
(UPDATE: An earlier version of this answer had some small and practically unimportant omissions in the section which follows which I have corrected. I've also written a blog article describing why these rules are sensible in Java and C# here: https://ericlippert.com/2019/01/18/indexer-error-cases/)
Precedence and associativity only tell us that the assignment of zero to b must happen before the assignment to a[b], because the assignment of zero computes the value that is assigned in the indexing operation. Precedence and associativity alone say nothing about whether the a[b] is evaluated before or after the b=0.
Again, this is just the same as: A()[B()] = C() -- All we know is that the indexing has to happen before the assignment. We don't know whether A(), B(), or C() runs first based on precedence and associativity. We need another rule to tell us that.
The rule is, again, "when you have a choice about what to do first, always go left to right". However, there is an interesting wrinkle in this specific scenario. Is the side effect of a thrown exception caused by a null collection or out-of-range index considered part of the computation of the left side of the assignment, or part of the computation of the assignment itself? Java chooses the latter. (Of course, this is a distinction that only matters if the code is already wrong, because correct code does not dereference null or pass a bad index in the first place.)
So what happens?
The a[b] is to the left of the b=0, so the a[b] runs first, resulting in a[1]. However, checking the validity of this indexing operation is delayed.
Then the b=0 happens.
Then the verification that a is valid and a[1] is in range happens.
The assignment of the value to a[1] happens last.
So, though in this specific case there are some subtleties to consider for those rare error cases that should not be occurring in correct code in the first place, in general you can reason: things to the left happen before things to the right. That's the rule you're looking for. Talk of precedence and associativity is both confusing and irrelevant.
People get this stuff wrong all the time, even people who should know better. I have edited far too many programming books that stated the rules incorrectly, so it is no surprise that lots of people have completely incorrect beliefs about the relationship between precedence/associativity, and evaluation order -- namely, that in reality there is no such relationship; they are independent.
If this topic interests you, see my articles on the subject for further reading:
http://blogs.msdn.com/b/ericlippert/archive/tags/precedence/
They are about C#, but most of this stuff applies equally well to Java.

Eric Lippert's masterful answer is nonetheless not properly helpful because it is talking about a different language. This is Java, where the Java Language Specification is the definitive description of the semantics. In particular, §15.26.1 is relevant because that describes the evaluation order for the = operator (we all know that it is right-associative, yes?). Cutting it down a little to the bits that we care about in this question:
If the left-hand operand expression is an array access expression (§15.13), then many steps are required:
First, the array reference subexpression of the left-hand operand array access expression is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason; the index subexpression (of the left-hand operand array access expression) and the right-hand operand are not evaluated and no assignment occurs.
Otherwise, the index subexpression of the left-hand operand array access expression is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason and the right-hand operand is not evaluated and no assignment occurs.
Otherwise, the right-hand operand is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason and no assignment occurs.
[… it then goes on to describe the actual meaning of the assignment itself, which we can ignore here for brevity …]
In short, Java has a very closely defined evaluation order that is pretty much exactly left-to-right within the arguments to any operator or method call. Array assignments are one of the more complex cases, but even there it's still L2R. (The JLS does recommend that you don't write code that needs these sorts of complex semantic constraints, and so do I: you can get into more than enough trouble with just one assignment per statement!)
C and C++ are definitely different to Java in this area: their language definitions leave evaluation order undefined deliberately to enable more optimizations. C# is like Java apparently, but I don't know its literature well enough to be able to point to the formal definition. (This really varies by language though, Ruby is strictly L2R, as is Tcl — though that lacks an assignment operator per se for reasons not relevant here — and Python is L2R but R2L in respect of assignment, which I find odd but there you go.)

a[b] = b = 0;
1) The array indexing operator has higher precedence than the assignment operator (see this answer):
(a[b]) = b = 0;
2) According to 15.26. Assignment Operators of JLS
There are 12 assignment operators; all are syntactically right-associative (they group right-to-left). Thus, a=b=c means a=(b=c), which assigns the value of c to b and then assigns the value of b to a.
(a[b]) = (b=0);
3) According to 15.7. Evaluation Order of JLS
The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.
and
The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated.
So:
a) (a[b]) evaluated first to a[1]
b) then (b=0) evaluated to 0
c) (a[1] = 0) evaluated last
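Steps a) to c) can be checked with a short Java sketch. The class name ArrayAssignSketch is made up; the three statements inside run() are the ones from the question.

```java
import java.util.Arrays;

class ArrayAssignSketch {
    // Returns {a[0], a[1], b} after running the statement from the question.
    static int[] run() {
        int[] a = {4, 4};
        int b = 1;
        a[b] = b = 0;  // a[b] is resolved to a[1] before b is overwritten
        return new int[] {a[0], a[1], b};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));  // [4, 0, 0]
    }
}
```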

Your code is equivalent to:
int[] a = {4,4};
int b = 1;
int c = b;
b = 0;
a[c] = b;
which explains the result.

Consider another more in-depth example below.
As a General Rule of Thumb:
It's best to have a table of the Order of Precedence Rules and Associativity available to read when solving these questions e.g. http://introcs.cs.princeton.edu/java/11precedence/
Here is a good example:
System.out.println(3+100/10*2-13);
Question: What's the Output of the above Line?
Answer: Apply the Rules of Precedence and Associativity
Step 1: According to the rules of precedence, the / and * operators take priority over the + and - operators. Therefore the starting point for evaluating this expression is narrowed to:
100/10*2
Step 2: According to the rules of precedence, / and * are equal in precedence.
As the / and * operators are equal in precedence, we need to look at the associativity of those operators.
According to the ASSOCIATIVITY RULES of these two particular operators,
we evaluate the expression from LEFT TO RIGHT, i.e. 100/10 gets executed first:
100/10*2
=100/10
=10*2
=20
Step 3: The equation is now in the following state of execution:
=3+20-13
According to the rules of precedence, + and - are equal in precedence.
We now need to look at the associativity of the + and - operators. According to the associativity of these two particular operators,
we evaluate the expression from LEFT to RIGHT, i.e. 3+20 gets executed first:
=3+20
=23
=23-13
=10
10 is the output when the program is compiled and run.
Again, it is important to have a table of the Order of Precedence Rules and Associativity with you when solving these questions e.g. http://introcs.cs.princeton.edu/java/11precedence/
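The same steps can be written out in Java as a quick check. This is only a sketch; the class name PrecedenceStepsSketch and the intermediate variable names are invented here.

```java
class PrecedenceStepsSketch {
    // Evaluates the expression in one go, as the compiler groups it.
    static int whole() {
        return 3 + 100 / 10 * 2 - 13;
    }

    // Evaluates the same expression one step at a time, following the walkthrough.
    static int stepByStep() {
        int quotient = 100 / 10;       // / and * share precedence, leftmost first: 10
        int product = quotient * 2;    // 20
        int sum = 3 + product;         // + and - share precedence, leftmost first: 23
        return sum - 13;               // 10
    }

    public static void main(String[] args) {
        System.out.println(whole());       // 10
        System.out.println(stepByStep());  // 10
    }
}
```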

Computing a large mixed equation in java [duplicate]

I needed some help with creating custom trees given an arithmetic expression. Say, for example, you input this arithmetic expression:
(5+2)*7
The result tree should look like:
    *
   / \
  +   7
 / \
5   2
I have some custom classes to represent the different types of nodes, i.e. PlusOp, LeafInt, etc. I don't need to evaluate the expression, just create the tree, so I can perform other functions on it later.
Additionally, the negative operator '-' can only have one child, and to represent '5-2', you must input it as 5 + (-2).
Some validation of the expression would be required to ensure that each type of operator has the correct number of arguments/children and that each opening bracket is accompanied by a closing bracket.
Also, I should probably mention my friend has already written code which converts the input string into a stack of tokens, if that's going to be helpful for this.
I'd appreciate any help at all. Thanks :)
(I read that you can write a grammar and use antlr/JavaCC, etc. to create the parse tree, but I'm not familiar with these tools or with writing grammars, so if that's your solution, I'd be grateful if you could provide some helpful tutorials/links for them.)

Assuming this is some kind of homework and you want to do it yourself..
I did this once, you need a stack
So what you do for the example is:
parse   what to do?               Stack looks like
(       push it onto the stack    (
5       push 5                    (, 5
+       push +                    (, 5, +
2       push 2                    (, 5, +, 2
)       evaluate until (          7
*       push *                    7, *
7       push 7                    7, *, 7
eof     evaluate until top        49
The symbols like "5" or "+" can just be stored as strings or simple objects, or you could store the + as a +() object without setting the values and set them when you are evaluating.
I assume this also requires an order of precedence, so I'll describe how that works.
in the case of: 5 + 2 * 7
you have to push 5 push + push 2 next op is higher precedence so you push it as well, then push 7. When you encounter either a ) or the end of file or an operator with lower or equal precedence you start calculating the stack to the previous ( or the beginning of the file.
Because your stack now contains 5 + 2 * 7, when you evaluate it you pop the 2 * 7 first and push the resulting *(2,7) node onto the stack, then once more you evaluate the top three things on the stack (5 + *node) so the tree comes out correct.
If it was ordered the other way: 5 * 2 + 7, you would push until you got to a stack with "5 * 2", then you would hit the lower precedence +, which means evaluate what you've got now. You'd evaluate the 5 * 2 into a *node and push it, then you'd continue by pushing the + and 7 so you had *node + 7, at which point you'd evaluate that.
This means you have a "highest current precedence" variable that is storing a 1 when you push a +/-, a 2 when you push a * or / and a 3 for "^". This way you can just test the variable to see if your next operator's precedence is <= your current precedence.
If ")" is considered priority 4, you can treat it like the other operators, except that it removes the matching "(", which a lower priority would not.
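As a rough illustration of the idea, here is a Java sketch that builds a tree instead of computing a number. It is a two-stack variant (one stack for operators, one for subtrees) rather than the single stack described above. The names TreeBuilderSketch, Node, reduce and parse are hypothetical. It assumes non-negative integer literals, binary + - * / only, all treated as left-associative, and it does no validation of malformed input.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TreeBuilderSketch {
    // Hypothetical node type: a leaf holds a number, an inner node holds an operator.
    static final class Node {
        final String label;
        final Node left, right;
        Node(String label, Node left, Node right) {
            this.label = label; this.left = left; this.right = right;
        }
        @Override public String toString() {
            return left == null ? label : label + "(" + left + "," + right + ")";
        }
    }

    static int precedence(char op) {
        return (op == '+' || op == '-') ? 1 : 2;
    }

    // Pops one operator and two subtrees, then pushes the combined subtree.
    static void reduce(Deque<Character> ops, Deque<Node> nodes) {
        char op = ops.pop();
        Node right = nodes.pop();
        Node left = nodes.pop();
        nodes.push(new Node(String.valueOf(op), left, right));
    }

    static Node parse(String expr) {
        Deque<Character> ops = new ArrayDeque<>();
        Deque<Node> nodes = new ArrayDeque<>();
        int i = 0;
        while (i < expr.length()) {
            char c = expr.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < expr.length() && Character.isDigit(expr.charAt(i))) i++;
                nodes.push(new Node(expr.substring(start, i), null, null));
            } else if (c == '(') {
                ops.push(c);
                i++;
            } else if (c == ')') {
                // Evaluate back to the matching "(" and discard it.
                while (ops.peek() != '(') reduce(ops, nodes);
                ops.pop();
                i++;
            } else {
                // An operator of lower or equal precedence forces a reduction first.
                while (!ops.isEmpty() && ops.peek() != '('
                        && precedence(ops.peek()) >= precedence(c)) {
                    reduce(ops, nodes);
                }
                ops.push(c);
                i++;
            }
        }
        while (!ops.isEmpty()) reduce(ops, nodes);
        return nodes.pop();
    }

    public static void main(String[] args) {
        System.out.println(parse("(5+2)*7"));    // *(+(5,2),7)
        System.out.println(parse("5 + 2 * 7"));  // +(5,*(2,7))
        System.out.println(parse("5 * 2 + 7"));  // +(*(5,2),7)
    }
}
```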

I wanted to respond to Bill K.'s answer, but I lack the reputation to add a comment there (that's really where this answer belongs). You can think of this as an addendum to Bill K.'s answer, because his was a little incomplete. The missing consideration is operator associativity; namely, how to parse expressions like:
49 / 7 / 7
Depending on whether division is left or right associative, the answer is:
49 / (7 / 7) => 49 / 1 => 49
or
(49 / 7) / 7 => 7 / 7 => 1
Typically, division and subtraction are considered to be left associative (i.e. case two, above), while exponentiation is right associative. Thus, when you run into a series of operators with equal precedence, you want to parse them in order if they are left associative or in reverse order if right associative. This just determines whether you are pushing or popping to the stack, so it doesn't overcomplicate the given algorithm, it just adds cases for when successive operators are of equal precedence (i.e. evaluate stack if left associative, push onto stack if right associative).
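Java's own / operator is left-associative, so the two groupings can be compared directly. A tiny sketch, with the class and method names invented for illustration:

```java
class AssociativitySketch {
    // How Java groups the expression by default: (49 / 7) / 7.
    static int leftGrouping() {
        return 49 / 7 / 7;
    }

    // The right-associative reading, forced with parentheses.
    static int rightGrouping() {
        return 49 / (7 / 7);
    }

    public static void main(String[] args) {
        System.out.println(leftGrouping());   // 1
        System.out.println(rightGrouping());  // 49
    }
}
```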

The "Five minute introduction to ANTLR" includes an arithmetic grammar example. It's worth checking out, especially since antlr is open source (BSD license).

Several options for you:
1) Re-use an existing expression parser. That would work if you are flexible on syntax and semantics. A good one that I recommend is the unified expression language built into Java (initially for use in JSP and JSF files).
2) Write your own parser from scratch. There is a well-defined way to write a parser that takes into account operator precedence, etc. Describing exactly how that's done is outside the scope of this answer. If you go this route, find yourself a good book on compiler design. Language parsing theory is going to be covered in the first few chapters. Typically, expression parsing is one of the examples.
3) Use JavaCC or ANTLR to generate lexer and parser. I prefer JavaCC, but to each their own. Just google "javacc samples" or "antlr samples". You will find plenty.
Between 2 and 3, I highly recommend 3 even if you have to learn new technology. There is a reason that parser generators have been created.
Also note that creating a parser that can handle malformed input (not just fail with a parse exception) is significantly more complicated than writing a parser that only accepts valid input. You basically have to write a grammar that spells out the various common syntax errors.
Update: Here is an example of an expression language parser that I wrote using JavaCC. The syntax is loosely based on the unified expression language. It should give you a pretty good idea of what you are up against.
Contents of org.eclipse.sapphire/plugins/org.eclipse.sapphire.modeling/src/org/eclipse/sapphire/modeling/el/parser/internal/ExpressionLanguageParser.jj

We can take the given expression (5+2)*7 as infix:
Infix : (5+2)*7
Prefix : *+527
From the above we know the preorder and inorder traversals of the tree, and we can easily construct the tree from them.
Thanks,

Math.cos not working

This bit of my program is supposed to calculate bottomAngle using cosine rule.
public double bottomAngle() {
topAngleinRadians = Math.toRadians(topAngle) ;
return (Math.cos(topAngleinRadians)(bottomAngle() = ladderLength^2 + floorLength^2 - verticalHeight^2) / 2 * ladderLength * floorLength) ;
}
Errors produced:
Here is my list of errors, and I can't figure out what's wrong with my formula. All the methods, such as verticalHeight and ladderLength, work perfectly fine in other methods. There is something wrong with the way I wrote this formula. Can you please help me out?

Without seeing your list of errors, you do have syntax errors:
return (Math.cos(topAngleinRadians)(bottomAngle() = ladderLength^2 + floorLength^2 - verticalHeight^2) / 2 * ladderLength * floorLength);
You have no operator between your call to Math.cos() and the next part of your expression.
You also appear to be assigning a value to a function call, which doesn't make sense.
The ^ operator is also not the exponential operator, but a bitwise exclusive OR operator. You're probably looking for Math.pow().
Those are just what I'm seeing right off the top. It might be helpful to read up about the Java operators and how they are evaluated.

There's a few issues here.
Multiplication - unlike in regular algebra, you have to explicitly define that you want multiplication between two expressions Math.cos(topAngleinRadians)*...
Assignment - you appear to be trying to assign something to a method call (bottomAngle() = ...). This is not really something you can do, and I'm not really sure what you are trying to achieve by it.
Squaring - 10^2 does not square 10 into 100 in java, but is rather the XOR (exclusive OR) operator. You probably want to use Math.pow(ladderLength, 2) or simply ladderLength * ladderLength
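The difference is easy to see in a few lines of Java. This is just a sketch with a made-up class name; note also that + binds more tightly than ^, so an expression like a^2 + b^2 is not even grouped the way the algebra suggests.

```java
class XorVersusPowSketch {
    public static void main(String[] args) {
        System.out.println(10 ^ 2);           // bitwise XOR of 1010 and 0010: 8
        System.out.println(Math.pow(10, 2));  // 100.0
        System.out.println(10 * 10);          // 100
        System.out.println(3 ^ 2 + 1);        // parsed as 3 ^ (2 + 1): 0
    }
}
```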

Unlike algebraic notation, Java parentheses do not implicitly multiply.
You need to insert a * between )(.

Can't really understand the purpose of your return statement, but I would rather break the statement into 2-3 lines to make it more readable: -
public double bottomAngle() {
    topAngleinRadians = Math.toRadians(topAngle);
    double bottomAngle = Math.pow(ladderLength, 2) + Math.pow(floorLength, 2) -
            Math.pow(verticalHeight, 2);
    double denom = 2 * ladderLength * floorLength;
    double numerator = bottomAngle * Math.cos(topAngleinRadians);
    return numerator / denom;
}
Note that 3 ^ 2 does not mean 3 squared in Java. You need the Math.pow method for that.
Also, you need to check why you were having bottomAngle() method call on LHS. I have assumed it to be a temp variable here.
As you can see, your code looks much more readable. And it becomes easy to find out compiler errors.

Your syntax is incorrect. Parentheses do not mean multiplication; you need an explicit * (multiplication operator). Also, you have some other mistakes:
Math.cos(topAngleinRadians)(bottomAngle() = ladderLength^2 + floorLength^2...
This looks like a method call, i.e. bottomAngle(), being set equal to some other expression, which is also invalid.

How does expression evaluation order differ between C++ and Java?

I've had my brain wrinkled from trying to understand the examples on this page:
http://answers.yahoo.com/question/index?qid=20091103170907AAxXYG9
More specifically this code:
int j = 4;
cout << j++ << j << ++j << endl;
gives an output: 566
Now this makes sense to me if the expression is evaluated right to left, however in Java a similar expression:
int j = 4;
System.out.print("" + (j++) + (j) + (++j));
gives an output of: 456
Which is more intuitive because this indicates it's been evaluated left to right. Researching this across various sites, it seems that with C++ the behaviour differs between compilers, but I'm still not convinced I understand. What's the explanation for this difference in evaluation between Java and C++? Thanks SO.

When an operation has side effects, C++ relies on sequence points rule to decide when side effects (such as increments, combined assignments, etc.) have to take effect. Logical and-then/or-else (&& and ||) operators, ternary ? question mark operators, and commas create sequence points; +, -, << and so on do not.
In contrast, Java completes side effects before proceeding with further evaluation.
When you use an expression with side effects multiple times in the absence of sequence points, the resulting behavior is undefined in C++. Any result is possible, including one that does not make logical sense.

Java guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right. The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated. Java also guarantees that every operand of an operator (except the conditional operators &&, ||, and ?:) appears to be fully evaluated before any part of the operation itself is performed. See Java language specification §15.7 for more details.
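The Java expression from the question can be traced with this guarantee in mind. A minimal sketch (the class name IncrementOrderSketch is invented):

```java
class IncrementOrderSketch {
    static String run() {
        int j = 4;
        // Left to right: j++ yields 4 (j becomes 5), j yields 5, ++j yields 6.
        return "" + (j++) + (j) + (++j);
    }

    public static void main(String[] args) {
        System.out.println(run());  // 456
    }
}
```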
C++, on the other hand, happily lets you get away with undefined behavior if the expression is ambiguous, because the language itself doesn't guarantee any order of evaluation of sub-expressions. See Sequence Point for more details.

In C++ the order of evaluation of subexpressions is neither left-to-right nor right-to-left. It is undefined.

Problems with a shunting yard algorithm

I have successfully implemented a shunting yard algorithm in Java. The algorithm itself was simple; however, I am having trouble with the tokenizer. Currently the algorithm works with everything I want except one thing: how can I tell the difference between subtraction (-) and negation (-)?
For example, in 4-3 the - is subtraction,
but in -4+3 it is negation.
I now know how to find out when it should be a negation and when it should be a subtraction, but where in the algorithm should that go? If you treat it like a function, it won't always work. For example:
3 + 4 * 2 / -( 1 − 5 ) ^ 2 ^ 3
when 1-5 becomes -4 it will become 4 before it gets squared and cubed
just like
3 + 4 * 2 / cos( 1 − 5 ) ^ 2 ^ 3 , you would take the cosine before squaring and cubing
but in real math you wouldn't with a -, because what you're really saying is
3 + 4 * 2 / -(( 1 − 5 ) ^ 2 ^ 3) in order to have the right value

It sounds like you're doing a lex-then-parse style parser, where you're going to need a simple state machine in the lexer in order to get separate tokens for unary and binary minus. (In a PEG parser, this isn't something you have to worry about.)
In JavaCC, you would have a DEFAULT state, where you would consider the - character to be UNARY_MINUS. When you tokenized the end of a primary expression (either a closing paren, or an integer, based on the examples you gave), then you would switch to the INFIX state where - would be considered to be INFIX_MINUS. Once you encountered any infix operator, you would return to the DEFAULT state.
If you're rolling your own, it might be a bit simpler than that. Look at this Python code for a clever way of doing it. Basically, when you encounter a -, you just check to see if the previous token was an infix operator. That example uses the string "-u" to represent the unary minus token, which is convenient for an informal tokenization. As best I can tell, the Python example fails to handle the case where a - follows an open paren or comes at the beginning of the input. Those should be considered unary as well.
In order for unary minus to be handled correctly in the shunting-yard algorithm itself, it needs to have higher precedence than any of the infix operators, and it needs to be marked as right-associative. (Make sure you handle right-associativity. You may have left it out since the rest of your operators are left-associative.) This is clear enough in the Python code (although I would use some kind of struct rather than two separate maps).
When it comes time to evaluate, you will need to handle unary operators a little differently, since you only need to pop one number off the stack, rather than two. Depending on what your implementation looks like, it may be easier to just go through the list and replace every occurrence of "-u" with [-1, "*"].
If you can follow Python at all, you should be able to see everything I'm talking about in the example I linked to. I find the code to be a bit easier to read than the C version that someone else mentioned. Also, if you're curious, I did a little write-up a while back about using shunting-yard in Ruby, but I handled unary operators as a separate nonterminal, so they are not shown.

The answers to this question might be helpful.
In particular, one of those answers references a solution in C that handles unary minus.
Basically, you have to recognize a unary minus based on the appearance of the minus sign in positions where a binary operator can't be, and make a different token for it, as it has different precedence.
Dijkstra's original paper doesn't explain very clearly how he dealt with this, but the unary minus was listed as a separate operator.

This isn't in Java, but here is a library I wrote to specifically solve this problem after searching and not finding any clear answers.
This does all you want and more:
https://marginalhacks.com/Hacks/libExpr.rb/
It is a ruby library (as well as a testbench to check it) that runs a modified shunting yard algorithm that also supports unary ('-a') and ternary ('a?b:c') ops. It also does RPN, Prefix and AST (abstract syntax trees) - your choice, and can evaluate the expression, including the ability to yield to a block (a lambda of sorts) that can handle any variable evaluation. Only AST does the full set of operations, including the ability to handle short-circuit operations (such as '||' and '?:' and so on), but RPN does support unary. It also has a flexible precedence model that includes presets for precedence as done by C expressions or by Ruby expressions (not the same). The testbench itself is interesting as it can create random expressions which it can then eval() and also run through libExpr to compare results.
It's fairly well documented and commented, so it shouldn't be too hard to convert the ideas to Java or some other language.
The basic idea as far as unary operators is that you can recognize them based on the previous token. If the previous token is either an operator or a left-paren, then the "unary-possible" operators (+ and -) are just unary and can be pushed with only one operand. It's important that your RPN stack distinguishes between the unary operator and the binary operator so it knows what to do on evaluation.

In your lexer, you can implement this pseudo-logic:
if (symbol == '-') {
    if (previousToken is a number
            OR previousToken is an identifier
            OR previousToken is a function
            OR previousToken is a closing parenthesis) {
        currentToken = SUBTRACT;
    } else {
        currentToken = NEGATION;
    }
}
You can set up negation to have a precedence higher than multiply and divide, but lower than exponentiation. You can also set it up to be right associative (just like '^').
Then you just need to integrate the precedence and associativity into the algorithm as described on Wikipedia's page.
If the token is an operator, o1, then: while there is an operator
token, o2, at the top of the stack, and either o1 is left-associative
and its precedence is less than or equal to that of o2, or o1 has
precedence less than that of o2, pop o2 off the stack, onto the output
queue; push o1 onto the stack.
I ended up implementing this corresponding code:
} else if (nextToken instanceof Operator) {
    final Operator o1 = (Operator) nextToken;
    while (!stack.isEmpty() && stack.peek() instanceof Operator) {
        final Operator o2 = (Operator) stack.peek();
        if ((o1.associativity == Associativity.LEFT && o1.precedence <= o2.precedence)
                || (o1.associativity == Associativity.RIGHT && o1.precedence < o2.precedence)) {
            popStackTopToOutput();
        } else {
            break;
        }
    }
    stack.push(nextToken);
}
Austin Taylor is quite right that you only need to pop off one number for a unary operator:
if (token is operator negate) {
operand = pop;
push operand * -1;
}
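Putting the pieces together, here is a compact Java sketch of the whole pipeline: tokenize, convert to RPN, evaluate. It is not the code from the example project. The class UnaryMinusSketch, its method names and the internal token "~" for negation are all invented for illustration. It assumes ASCII input with non-negative numeric literals and no function calls. One deliberate choice that is my own assumption: a prefix operator never pops anything when it is pushed, since it has no left operand.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class UnaryMinusSketch {
    // "~" is a hypothetical internal token standing for unary minus.
    static int prec(String op) {
        switch (op) {
            case "+": case "-": return 1;
            case "*": case "/": return 2;
            case "~": return 3;  // above * and /, below ^
            case "^": return 4;
            default: throw new IllegalArgumentException("unknown operator " + op);
        }
    }

    static boolean rightAssoc(String op) { return op.equals("^") || op.equals("~"); }

    static boolean isOperator(String t) {
        return t.length() == 1 && "+-*/^~".indexOf(t.charAt(0)) >= 0;
    }

    static List<String> tokenize(String s) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (Character.isDigit(c)) {
                int start = i;
                while (i < s.length() && (Character.isDigit(s.charAt(i)) || s.charAt(i) == '.')) i++;
                out.add(s.substring(start, i));
                continue;
            }
            if (c == '-') {
                // Binary only when it follows a number or a closing parenthesis.
                String prev = out.isEmpty() ? null : out.get(out.size() - 1);
                boolean binary = prev != null
                        && (prev.equals(")") || Character.isDigit(prev.charAt(0)));
                out.add(binary ? "-" : "~");
            } else {
                out.add(String.valueOf(c));
            }
            i++;
        }
        return out;
    }

    static List<String> toRpn(List<String> tokens) {
        List<String> out = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String t : tokens) {
            if (t.equals("(")) {
                stack.push(t);
            } else if (t.equals(")")) {
                while (!stack.peek().equals("(")) out.add(stack.pop());
                stack.pop();  // discard the "("
            } else if (isOperator(t)) {
                // A prefix operator has no left operand, so it pops nothing.
                while (!t.equals("~") && !stack.isEmpty() && isOperator(stack.peek())
                        && (rightAssoc(t) ? prec(t) < prec(stack.peek())
                                          : prec(t) <= prec(stack.peek()))) {
                    out.add(stack.pop());
                }
                stack.push(t);
            } else {
                out.add(t);  // number
            }
        }
        while (!stack.isEmpty()) out.add(stack.pop());
        return out;
    }

    static double evaluate(String expr) {
        Deque<Double> values = new ArrayDeque<>();
        for (String t : toRpn(tokenize(expr))) {
            if (t.equals("~")) {
                values.push(-values.pop());  // unary: pop one operand only
            } else if (isOperator(t)) {
                double r = values.pop(), l = values.pop();
                switch (t) {
                    case "+": values.push(l + r); break;
                    case "-": values.push(l - r); break;
                    case "*": values.push(l * r); break;
                    case "/": values.push(l / r); break;
                    default:  values.push(Math.pow(l, r));
                }
            } else {
                values.push(Double.parseDouble(t));
            }
        }
        return values.pop();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("4-3"));   // 1.0
        System.out.println(evaluate("-4+3"));  // -1.0
        System.out.println(evaluate("-2^2"));  // -4.0
        System.out.println(evaluate("3 + 4 * 2 / -(1 - 5) ^ 2 ^ 3"));
    }
}
```

With these precedences the last expression is read as 3 + 4 * 2 / -((1 - 5) ^ 2 ^ 3), which is the grouping the question asks for.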
Example project:
https://github.com/Digipom/Calculator-for-Android/
Further reading:
http://en.wikipedia.org/wiki/Shunting-yard_algorithm
http://sankuru.biz/blog/1-parsing-object-oriented-expressions-with-dijkstras-shunting-yard-algorithm

I know it's an old post, but maybe someone will find this useful.
I implemented this algorithm before, starting with a tokenizer built on the StreamTokenizer class,
and it works fine. In Java's StreamTokenizer, some characters have a specific meaning. For example: ( is an operator, sin is a word, ...
For your question, there is a method called StreamTokenizer.ordinaryChar(..), which specifies that the character argument is "ordinary" in this tokenizer. It removes any special significance the character has as a comment character, word component, string delimiter, white space, or number character. Source here
So you can define - as an ordinary character, which means it won't be treated as the sign of a number. For example, for the expression 2-3 you will get [2, -, 3], but if you don't mark it as ordinary you will get [2, -3].
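A short sketch of that behaviour, using only java.io.StreamTokenizer from the standard library. The class MinusTokenizerSketch and its tokenize helper are invented for illustration; numbers print as doubles because StreamTokenizer stores them in nval.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class MinusTokenizerSketch {
    // Hypothetical helper: returns the tokens of the input as strings.
    static List<String> tokenize(String input, boolean minusIsOrdinary) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        if (minusIsOrdinary) {
            st.ordinaryChar('-');  // '-' no longer starts a negative number
        }
        List<String> tokens = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    tokens.add(String.valueOf(st.nval));
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    tokens.add(st.sval);
                } else {
                    tokens.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("2-3", false));  // [2.0, -3.0]
        System.out.println(tokenize("2-3", true));   // [2.0, -, 3.0]
    }
}
```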
