Firstly I learnt that &, |, ^ are the bitwise operators, and now somebody mentioned them as logical operators with &&, ||, I am completely confused - the same operator has two names? There are already logical operators &&, ||, then why use &, |, ^?
The Java operators &, | and ^ are EITHER bitwise operators OR logical operators ... depending on the types of the operands. If the operands are integers, the operators are bitwise. If they are booleans, then the operators are logical.
And this is not just me saying this. The JLS describes these operators this way too; see JLS 15.22.
(This is just like + meaning EITHER addition OR string concatenation ... depending on the types of the operands. Or just like a "rose" meaning either a flower or a shower attachment. Or "cat" meaning either a furry animal or a UNIX command. Words mean different things in different contexts. And this is true for the symbols used in programming languages too.)
There are already logical operators &&, ||, why use &, |, ^?
In the case of the first two, it is because the operators have different semantics with regards to when / whether the operands get evaluated. The two different semantics are needed in different situations; e.g.
boolean res = str != null && str.isEmpty();
versus
boolean res = foo() & bar(); // ... if I >>need<< to call both methods.
The ^ operator has no short-circuit equivalent because it simply doesn't make sense to have one.
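To illustrate the evaluation difference between & and && described above, here is a minimal sketch. The class name EvalDemo and the helper touch() are made up for this illustration; the helper only counts how many operands actually get evaluated.

```java
public class EvalDemo {
    // Counts how many operands were evaluated (hypothetical helper state).
    static int calls = 0;

    // Records that it was called, then returns the given value.
    static boolean touch(boolean result) {
        calls++;
        return result;
    }

    // Non-short-circuit AND: both operands are evaluated.
    static int countWithSingleAmp() {
        calls = 0;
        boolean ignored = touch(false) & touch(true);
        return calls;
    }

    // Short-circuit AND: the right operand is skipped when the left is false.
    static int countWithDoubleAmp() {
        calls = 0;
        boolean ignored = touch(false) && touch(true);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(countWithSingleAmp()); // 2
        System.out.println(countWithDoubleAmp()); // 1
    }
}
```

If the right-hand operand has a side effect you depend on, only the & form guarantees that it runs.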
Having a language reference is one thing, interpreting it correctly is another.
We need to interpret things correctly.
Even if Java documents & as both bitwise and logical, we could argue that & never lost its logical-operator mojo, going all the way back to C. That is, & is first and foremost an inherently logical operator (albeit a non-short-circuited one).
& parses lexically and logically as a logical operation.
To prove the point, both of these lines have been parsed the same way ever since C and up to now (Java, C#, PHP, etc.):
if (a == 1 && b)
if (a == 1 & b)
That is, the compiler will interpret those as these:
if ( (a == 1) && (b) )
if ( (a == 1) & (b) )
And even if the variables a and b are both integers, this...
if (a == 1 & b)
... will still be interpreted as:
if ( (a == 1) & (b) )
Hence, this will yield a compilation error in languages that don't allow integer/boolean duality, e.g. Java and C#:
if (a == 1 & b)
In fact, given the compilation error above, we could even argue that & never lost its logical (non-short-circuit) mojo, and conclude that Java continues the C tradition of treating & as a logical operation. Consequently, we could say it's the other way around, i.e. & can be repurposed as a bitwise operation (by applying parentheses):
if ( a == (1 & b) )
So there we are, in another parallel universe, someone could ask, how to make the & expression become a bitmask operation.
How to make the following compile, I read in JLS that & is a bitwise
operation. Both a and b are integers, but it eludes me why the
following bitwise operation is a compilation error in Java:
if (a == 1 & b)
Or this kind of question:
Why the following didn't compile, I read in JLS that & is a bitwise
operation when both its operands are integers. Both a and b are
integers, but it eludes me why the following bitwise operation is a
compilation error in Java:
if (a == 1 & b)
In fact, I would not be surprised if there are already Stack Overflow questions similar to the ones above, asking how to do that masking idiom in Java.
To turn the language's logical interpretation of that expression into a bitwise one, we have to do this (in all of these languages: C, Java, C#, PHP, etc.):
if ( a == (1 & b) )
So to answer the question: it's not because the JLS defined things that way; it's because the & operator in Java (and in other languages inspired by C) is, for all intents and purposes, still a logical operator that retained C's syntax and semantics. It has been that way since C, since time immemorial, since before I was even born.
Things just don't happen by chance, JLS 15.22 didn't happen by chance, there's a deep history around it.
In another parallel universe, where && was not introduced to the language, we will still be using & for logical operations, one might even ask a question today:
Is it true, we can use the logical operator & for bitwise operation?
& doesn't care if its operands are integers or not, booleans or not. It's still a logical operator, a non-short-circuited one. And in fact, the only way to force it to become a bitwise operator in Java (and even in C) is to put parentheses around it, i.e.
if ( a == (1 & b) )
Think about it: if && had not been introduced to the C language (and to any language that copied its syntax and semantics), anyone could be asking now:
how to use & for bitwise operations?
To sum it up, first and foremost Java's & is inherently a logical operator (a non-short-circuited one). It doesn't care about its operands; it will do its business as usual (applying a logical operation) even if both operands are integers (e.g. the masking idiom). You can only force it to become a bitwise operation by applying parentheses. Java continues the C tradition.
If Java's & really were a bitwise operation whenever its operands are both integers (the integer 1 and the integer variable b in the example code below), this should compile:
int b = 7;
int a = 1;
if (a == 1 & b) ...
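As a hedged aside, the compilation error above follows from operator precedence: == binds more tightly than &, so the unparenthesized expression groups as (a == 1) & b, which mixes a boolean with an int. The sketch below shows the two groupings that do compile; the class and method names are made up for this illustration.

```java
public class PrecedenceDemo {
    // Bitwise grouping: mask b with 1 first, then compare with a.
    static boolean maskedCompare(int a, int b) {
        return a == (1 & b);
    }

    // Logical grouping: both operands of & are boolean.
    // Here "b is set" is spelled out as b != 0, since Java has no int-to-boolean conversion.
    static boolean bothConditions(int a, int b) {
        return (a == 1) & (b != 0);
    }

    public static void main(String[] args) {
        System.out.println(maskedCompare(1, 7));  // true, because 1 & 7 is 1
        System.out.println(maskedCompare(1, 6));  // false, because 1 & 6 is 0
        System.out.println(bothConditions(1, 6)); // true
    }
}
```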
They (& and |) were used for two purposes a long time ago: as logical operators and as bitwise operators. If you check out neonatal C (the language Java was patterned after), & and | were used as logical operators.
But since disambiguating the bitwise operations from the logical operations in the same statement is very confusing, it prompted Dennis Ritchie to create separate operators (&& and ||) for logical operations.
Check the Neonatal C section here: http://cm.bell-labs.com/who/dmr/chist.html
You can still use the bitwise operators as logical operators; their retained operator precedence is the evidence of that. Read up on the bitwise operators' past life as logical operators in neonatal C.
Regarding the evidence, I wrote a blog post comparing the logical operators and the bitwise operators. It becomes self-evident that the so-called bitwise operators are still logical operators if you try contrasting them in an actual program: http://www.anicehumble.com/2012/05/operator-precedence-101.html
I also answered a related question: What is the point of the logical operators in C?
So it's true, bitwise operators are logical operators too, albeit non-short-circuited versions of the short-circuited logical operators.
Regarding
There are already logical operators &&, ||, then why use &, |, ^?
The XOR can be easily answered, it's like a radio button, only one is allowed, code below returns false. Apology for the contrived code example below, the belief that drinking both beer and milk at the same time is bad was debunked already ;-)
String areYouDiabetic = "Yes";
String areYouEatingCarbohydrate = "Yes";
boolean isAllowed = "Yes".equals(areYouDiabetic) ^ "Yes".equals(areYouEatingCarbohydrate); // compare string contents with equals(), not ==
System.out.println("Allowed: " + isAllowed);
There's no short-circuit equivalent of the XOR operator, as both sides of the expression always need to be evaluated.
As for why you would need to use the & and | bitwise operators as logical operators: frankly, you'll be hard-pressed to find a need for it. A logical operation can be made non-short-circuited (by using the bitwise operator, a.k.a. the non-short-circuit logical operator) if you want to achieve some side effect and make your code compact (subjective). Case in point:
while ( !password.isValid() & (attempts++ < MAX_ATTEMPTS) ) {
// re-prompt
}
The above can be rewritten as the following (removing the parentheses) and still has exactly the same interpretation as the preceding code.
while ( !password.isValid() & attempts++ < MAX_ATTEMPTS ) {
// re-prompt
}
That the expression yields the same interpretation with the parentheses removed makes the logical-operator vestige of & more apparent. At the risk of sounding redundant, I have to emphasize that the unparenthesized expression is not interpreted as this:
while ( ( !password.isValid() & attempts++ ) < MAX_ATTEMPTS ) {
// re-prompt
}
To sum it up, using the & operator (more popularly known as a bitwise operator only, but actually both bitwise and non-short-circuit logical) for a non-short-circuit logical operation in order to achieve a side effect is clever (subjective), but is not encouraged; it trades readability for a saving of just one line.
Example sourced here: Reason for the existence of non-short-circuit logical operators
The Java type byte is signed which might be a problem for the bitwise operators. When negative bytes are extended to int or long, the sign bit is copied to all higher bits to keep the interpreted value. For example:
byte b1=(byte)0xFB; // that is -5
byte b2=2;
int i = b1 | b2<<8;
System.out.println((int)b1); // This prints -5
System.out.println(i); // This prints -5
Reason: (int)b1 is internally 0xFFFFFFFB and b2<<8 is 0x00000200, so i will be 0xFFFFFFFB, which is -5.
Solution:
int i = (b1 & 0xFF) | (b2<<8 & 0xFF00);
System.out.println(i); // This prints 763 which is 0x2FB
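As an alternative sketch of the same fix, assuming Java 8 or later is available, Byte.toUnsignedInt performs the & 0xFF masking for you. The class and method names below are made up for this illustration.

```java
public class ByteCombine {
    // Naive version: sign extension of 'low' fills the upper bits with ones.
    static int combineNaive(byte low, byte high) {
        return low | high << 8;
    }

    // Masked version: each byte is widened as an unsigned value (0..255) first.
    static int combineUnsigned(byte low, byte high) {
        return Byte.toUnsignedInt(low) | (Byte.toUnsignedInt(high) << 8);
    }

    public static void main(String[] args) {
        byte b1 = (byte) 0xFB; // -5 as a signed byte
        byte b2 = 2;
        System.out.println(combineNaive(b1, b2));    // -5
        System.out.println(combineUnsigned(b1, b2)); // 763, which is 0x2FB
    }
}
```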
I've had my brain wrinkled from trying to understand the examples on this page:
http://answers.yahoo.com/question/index?qid=20091103170907AAxXYG9
More specifically this code:
int j = 4;
cout << j++ << j << ++j << endl;
gives an output: 566
Now this makes sense to me if the expression is evaluated right to left, however in Java a similar expression:
int j = 4;
System.out.print("" + (j++) + (j) + (++j));
gives an output of: 456
Which is more intuitive because this indicates it's been evaluated left to right. Researching this across various sites, it seems that with C++ the behaviour differs between compilers, but I'm still not convinced I understand. What's the explanation for this difference in evaluation between Java and C++? Thanks SO.
When an operation has side effects, C++ relies on the sequence point rules to decide when side effects (such as increments, compound assignments, etc.) have to take effect. The logical and-then/or-else operators (&& and ||), the ternary ?: operator, and the comma operator create sequence points; +, -, << and so on do not.
In contrast, Java completes side effects before proceeding with further evaluation.
When you use an expression with side effects multiple times in the absence of sequence points, the resulting behavior is undefined in C++. Any result is possible, including one that does not make logical sense.
Java guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right. The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated. Java also guarantees that every operand of an operator (except the conditional operators &&, ||, and ?:) appears to be fully evaluated before any part of the operation itself is performed. See Java language specification §15.7 for more details.
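The left-to-right guarantee can be observed with a small sketch. The class OrderDemo and the logging helper f() are made up for this illustration; f() just records the order in which operands are evaluated.

```java
public class OrderDemo {
    // Records the order in which operands are evaluated (hypothetical helper state).
    static final StringBuilder log = new StringBuilder();

    static int f(String name, int value) {
        log.append(name);
        return value;
    }

    // Operands are evaluated left to right, even though * binds tighter than +.
    static String operandOrder() {
        log.setLength(0);
        int ignored = f("a", 1) + f("b", 2) * f("c", 3);
        return log.toString();
    }

    // The expression from the question: each side effect completes before the next operand.
    static String increments() {
        int j = 4;
        return "" + (j++) + (j) + (++j);
    }

    public static void main(String[] args) {
        System.out.println(operandOrder()); // abc
        System.out.println(increments());   // 456
    }
}
```

Precedence decides how the operands are grouped, not when they are evaluated.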
C++, on the other hand, happily lets you get away with undefined behavior if the expression is ambiguous, because the language itself doesn't guarantee any order of evaluation of sub-expressions. See Sequence Point for more details.
In C++ the order of evaluation of subexpressions is neither left-to-right nor right-to-left. It is unspecified, and modifying the same variable more than once without an intervening sequence point is undefined behavior.
This seems to be a stupid question since Java does short circuit, but I remembered how Android doesn't quite use Java in the same sense as I assume, so say in this bit of code I wrote:
... code omitted ...
else if (mimeType.equals("application/x-tar")
|| mimeType.equals("application/x-rar-compressed")
|| mimeType.equals("application/stuffit")
|| mimeType.equals("application/zip")
|| mimeType.equals("application/x-gzip"))
...would it be better for me to put the more common things (zip/rar) before the less common things (tarballs/gzip)?
The fact that I wasn't able to find a similar question on SO probably gives me the answer to this, but better safe than sorry.
Short circuiting is supported with ||.
If you are trying to optimize this case you should try putting each value in a static Set and then check to see if typeSet.contains(mimeType).
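A minimal sketch of the Set approach suggested above. The class name ArchiveTypes, the field name ARCHIVE_TYPES, and the method isArchive are made up for this illustration; the MIME types are the ones from the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class ArchiveTypes {
    // Built once; lookup cost does not depend on the order of the entries.
    private static final Set<String> ARCHIVE_TYPES =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
                    "application/x-tar",
                    "application/x-rar-compressed",
                    "application/stuffit",
                    "application/zip",
                    "application/x-gzip")));

    static boolean isArchive(String mimeType) {
        return ARCHIVE_TYPES.contains(mimeType);
    }

    public static void main(String[] args) {
        System.out.println(isArchive("application/zip")); // true
        System.out.println(isArchive("text/plain"));      // false
    }
}
```

With a hash-based set, the question of which type is most common no longer matters.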
Yes, the || (conditional-or) operator is a short-circuit operator. To quote the Java Language Specification:
The || operator is like | (§15.22.2), but evaluates its right-hand operand only if the value of its left-hand operand is false. It is syntactically left-associative (it groups left-to-right). It is fully associative with respect to both side effects and result value; that is, for any expressions a, b, and c, evaluation of the expression ((a)||(b))||(c) produces the same result, with the same side effects occurring in the same order, as evaluation of the expression (a)||((b)||(c)).
Can I use single ampersand like & instead of a bitwise operator like &&? What kind of differences may arise and is there a specific example to illustrate this issue clearly?
The single & will always check both conditions. The double && will stop after the first condition if it evaluates to false. Using two is a way of "short-circuiting" the condition check when the first condition alone already determines the result.
An example could be:
if (myString != null && myString.equals("testing"))
If myString is null, the first condition fails and the second condition is not checked at all. This is one way you can avoid null pointer exceptions.
with & and | both operands are always evaluated
with && and || the second operand is only evaluated when it is necessary
Here's a link to a page with a pretty good explanation of how to use these.
Java Quick Reference on Operators
The other answers are correct to an extent, but don't explain the whole thing.
First off, the doubled operators, && and ||, function sort of like nested if statements:
if (expression_a && expression_b) {
do_something;
}
is equivalent to
if (expression_a) {
if (expression_b) {
do_something;
}
}
In both cases, if expression_a evaluates to false then expression_b will not be evaluated -- a feature referred to as "short-circuiting". (The || case is similar but a hair more complicated.)
Additionally, in Java (but not in C/C++) the && and || operators apply only to boolean values -- you cannot use && or || on an int, eg.
The single operators, & and |, on the other hand, are relatively "pure" operators (commutative and associative with respect to themselves), with none of the "short-circuiting" of the double operators. Additionally, they can operate on boolean as well as on any integral type -- char, byte, short, int, long. On integral types they perform a bit-by-bit operation -- bit N of the left operand is ANDed or ORed with bit N of the right operand to produce the Nth bit in a result value that is the same bit width as the two operands (after they are widened as appropriate for binary operators). In this regard, their operation with boolean is just the degenerate case (albeit one that is somewhat special-cased).
Normally, one would use only the doubled operators for combining boolean expressions in an if statement. There is no great harm in using the single operators if the boolean expressions involved are "safe" (cannot result in a null pointer exception, eg), but the doubled operators tend to be slightly more efficient and the short-circuiting is often desired (as in if (a != null && a.b == 5), eg), so it's generally wise to cultivate the habit of using the doubled forms. The only thing to beware of is that if you want the second expression to be evaluated (for its side effects), the doubled operator will not guarantee this happens.
As the title says, I need to know the difference between & operator and && operator.
Can anyone help me in simple words.
How do they differ from each other?
And which one should be used in an if statement?
There are in fact three "and" operators:
a && b, with a and b being boolean: evaluate a. If true, evaluate b. If true, result is true. Otherwise, result is false. (I.e. b is not evaluated if a is not true.)
a & b, with a and b being boolean: evaluate both, do logical and (i.e. true only if both are true).
a & b, where a and b are both integral types (int, long, short, char, byte): evaluate a and b, and do a bitwise AND.
The second one can be viewed as a special type of the third one, if one sees boolean as a one-bit integral type ;-)
As the top-level condition of an if-statement you can use the first two, but the first one is most likely useful. (I.e. there are not many cases where you really need the second and the first would do something wrong, but the other way around is more common. In most cases the first is simply a little bit faster.)
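The three cases listed above can be sketched as follows. The class and method names are made up for this illustration.

```java
public class AndKinds {
    // Case 1: short-circuit AND; s.length() is never called when s is null.
    static boolean nonEmptySafe(String s) {
        return s != null && s.length() > 0;
    }

    // Case 2: non-short-circuit boolean AND; s.length() is always called,
    // so a null argument causes a NullPointerException.
    static boolean nonEmptyUnsafe(String s) {
        return s != null & s.length() > 0;
    }

    // Case 3: bitwise AND on integral operands.
    static int bitwiseAnd(int a, int b) {
        return a & b;
    }

    public static void main(String[] args) {
        System.out.println(nonEmptySafe(null)); // false
        System.out.println(bitwiseAnd(6, 3));   // 2 (binary 110 & 011 = 010)
        try {
            nonEmptyUnsafe(null);
        } catch (NullPointerException e) {
            System.out.println("right operand was evaluated");
        }
    }
}
```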
If the first result is FALSE, && doesn't evaluate the second result. & does.
&& only evaluates the second expression if the first operation is true. & evaluates both expressions (even if the first expression is false and there is no point in evaluating the second expression). Hence && is a tiny bit faster than & in logical operations. Hence && is also known as short-circuit and.
& can also be used as a bitwise operator in addition to a logical operator. So you can do a 'Bit And' of 2 numbers like for example
int result = 1 & 3; // will evaluate to 1
&& cannot be used as a bit-and operator.
For conditional IF operator, use &&. It is a tiny bit faster than just &.
I have successfully implemented a shunting yard algorithm in java. The algorithm itself was simple however I am having trouble with the tokenizer. Currently the algorithm works with everything I want excluding one thing. How can I tell the difference between subtraction(-) and negative (-)
such as 4-3 is subtraction
but -4+3 is negative
I now know how to find out when it should be a negative and when it should be a minus, but where in the algorithm should it be placed? If you use it like a function it won't always work. For example:
3 + 4 * 2 / -( 1 − 5 ) ^ 2 ^ 3
when 1-5 becomes -4 it will become 4 before it gets squared and cubed
just like
3 + 4 * 2 / cos( 1 − 5 ) ^ 2 ^ 3 , you would take the cosine before squaring and cubing
but in real math you wouldn't with a -, because what you're really saying is
3 + 4 * 2 / -(( 1 − 5 ) ^ 2 ^ 3) in order to have the right value
It sounds like you're doing a lex-then-parse style parser, where you're going to need a simple state machine in the lexer in order to get separate tokens for unary and binary minus. (In a PEG parser, this isn't something you have to worry about.)
In JavaCC, you would have a DEFAULT state, where you would consider the - character to be UNARY_MINUS. When you tokenized the end of a primary expression (either a closing paren, or an integer, based on the examples you gave), then you would switch to the INFIX state where - would be considered to be INFIX_MINUS. Once you encountered any infix operator, you would return to the DEFAULT state.
If you're rolling your own, it might be a bit simpler than that. Look at this Python code for a clever way of doing it. Basically, when you encounter a -, you just check to see if the previous token was an infix operator. That example uses the string "-u" to represent the unary minus token, which is convenient for an informal tokenization. Best I can tell, the Python example does fail to handle case where a - follows an open paren, or comes at the beginning of the input. Those should be considered unary as well.
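A rough Java sketch of the previous-token check, extended to cover the two cases mentioned above (a - after an open paren, and a - at the start of the input). This is not the linked Python code; the class MinusTokenizer, its methods, and the restriction to integer literals and single-character operators are assumptions made to keep the illustration short.

```java
import java.util.ArrayList;
import java.util.List;

public class MinusTokenizer {
    // Splits an expression into tokens, emitting "-u" for unary minus.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add(input.substring(start, i));
            } else {
                tokens.add(c == '-' && isUnaryPosition(tokens) ? "-u" : String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    // A minus is binary only when it follows something that ends an operand:
    // a number or a closing paren. Otherwise (start of input, after an
    // operator, after an open paren) it is unary.
    static boolean isUnaryPosition(List<String> tokens) {
        if (tokens.isEmpty()) {
            return true;
        }
        String previous = tokens.get(tokens.size() - 1);
        boolean endsOperand = Character.isDigit(previous.charAt(0)) || previous.equals(")");
        return !endsOperand;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("4-3"));      // [4, -, 3]
        System.out.println(tokenize("-4+3"));     // [-u, 4, +, 3]
        System.out.println(tokenize("2/-(1-5)")); // [2, /, -u, (, 1, -, 5, )]
    }
}
```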
In order for unary minus to be handled correctly in the shunting-yard algorithm itself, it needs to have higher precedence than any of the infix operators, and it needs to be marked as right-associative. (Make sure you handle right-associativity. You may have left it out since the rest of your operators are left-associative.) This is clear enough in the Python code (although I would use some kind of struct rather than two separate maps).
When it comes time to evaluate, you will need to handle unary operators a little differently, since you only need to pop one number off the stack, rather than two. Depending on what your implementation looks like, it may be easier to just go through the list and replace every occurrence of "-u" with [-1, "*"].
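The evaluation step can be sketched in Java as follows, assuming the output queue is a list of string tokens in RPN order and that "-u" marks unary minus as in the Python example. The class name RpnEval and the choice of four binary operators are assumptions for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class RpnEval {
    // Evaluates tokens in reverse Polish order; "-u" pops one operand, binary operators pop two.
    static double evaluate(List<String> rpn) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : rpn) {
            switch (token) {
                case "-u": {
                    double operand = stack.pop();
                    stack.push(-operand);
                    break;
                }
                case "+": case "-": case "*": case "/": {
                    double right = stack.pop(); // right operand is on top
                    double left = stack.pop();
                    stack.push(apply(token, left, right));
                    break;
                }
                default:
                    stack.push(Double.parseDouble(token));
            }
        }
        return stack.pop();
    }

    static double apply(String op, double left, double right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default:  return left / right;
        }
    }

    public static void main(String[] args) {
        // -4 + 3 in RPN: 4 -u 3 +
        System.out.println(evaluate(Arrays.asList("4", "-u", "3", "+"))); // -1.0
        // 4 - 3 in RPN: 4 3 -
        System.out.println(evaluate(Arrays.asList("4", "3", "-")));       // 1.0
    }
}
```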
If you can follow Python at all, you should be able to see everything I'm talking about in the example I linked to. I find the code to be a bit easier to read than the C version that someone else mentioned. Also, if you're curious, I did a little write-up a while back about using shunting-yard in Ruby, but I handled unary operators as a separate nonterminal, so they are not shown.
The answers to this question might be helpful.
In particular, one of those answers references a solution in C that handles unary minus.
Basically, you have to recognize a unary minus based on the appearance of the minus sign in positions where a binary operator can't be, and make a different token for it, as it has different precedence.
Dijkstra's original paper doesn't explain too clearly how he dealt with this, but the unary minus was listed as a separate operator.
This isn't in Java, but here is a library I wrote to specifically solve this problem after searching and not finding any clear answers.
This does all you want and more:
https://marginalhacks.com/Hacks/libExpr.rb/
It is a ruby library (as well as a testbench to check it) that runs a modified shunting yard algorithm that also supports unary ('-a') and ternary ('a?b:c') ops. It also does RPN, Prefix and AST (abstract syntax trees) - your choice, and can evaluate the expression, including the ability to yield to a block (a lambda of sorts) that can handle any variable evaluation. Only AST does the full set of operations, including the ability to handle short-circuit operations (such as '||' and '?:' and so on), but RPN does support unary. It also has a flexible precedence model that includes presets for precedence as done by C expressions or by Ruby expressions (not the same). The testbench itself is interesting as it can create random expressions which it can then eval() and also run through libExpr to compare results.
It's fairly documented/commented, so it shouldn't be too hard to convert the ideas to Java or some other language.
The basic idea as far as unary operators is that you can recognize them based on the previous token. If the previous token is either an operator or a left-paren, then the "unary-possible" operators (+ and -) are just unary and can be pushed with only one operand. It's important that your RPN stack distinguishes between the unary operator and the binary operator so it knows what to do on evaluation.
In your lexer, you can implement this pseudo-logic:
if (symbol == '-') {
if (previousToken is a number
OR previousToken is an identifier
OR previousToken is a function) {
currentToken = SUBTRACT;
} else {
currentToken = NEGATION;
}
}
You can set up negation to have a precedence higher than multiply and divide, but lower than exponentiation. You can also set it up to be right associative (just like '^').
Then you just need to integrate the precedence and associativity into the algorithm as described on Wikipedia's page.
If the token is an operator, o1, then: while there is an operator
token, o2, at the top of the stack, and either o1 is left-associative
and its precedence is less than or equal to that of o2, or o1 has
precedence less than that of o2, pop o2 off the stack, onto the output
queue; push o1 onto the stack.
I ended up implementing this corresponding code:
} else if (nextToken instanceof Operator) {
final Operator o1 = (Operator) nextToken;
while (!stack.isEmpty() && stack.peek() instanceof Operator) {
final Operator o2 = (Operator) stack.peek();
if ((o1.associativity == Associativity.LEFT && o1.precedence <= o2.precedence)
|| (o1.associativity == Associativity.RIGHT && o1.precedence < o2.precedence)) {
popStackTopToOutput();
} else {
break;
}
}
stack.push(nextToken);
}
Austin Taylor is quite right that you only need to pop off one number for a unary operator:
if (token is operator negate) {
operand = pop;
push operand * -1;
}
Example project:
https://github.com/Digipom/Calculator-for-Android/
Further reading:
http://en.wikipedia.org/wiki/Shunting-yard_algorithm
http://sankuru.biz/blog/1-parsing-object-oriented-expressions-with-dijkstras-shunting-yard-algorithm
I know it's an old post, but maybe someone will find it useful.
I implemented this algorithm before, starting with the tokenizer, using the StreamTokenizer class,
and it works fine. In Java's StreamTokenizer, some characters have a specific meaning. For example: ( is an operator, sin is a word, ...
For your question, there is a method called "streamTokenizer.ordinaryChar(..)" which specifies that the character argument is "ordinary" in this tokenizer. It removes any special significance the character has as a comment character, word component, string delimiter, white space, or number character. Source here
So you can define - as an ordinary character, which means it won't be considered as the sign of a number. For example, if you have the expression 2-3, you will get [2,-,3], but if you don't specify it as ordinary, you will get [2,-3].
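A small sketch of that behaviour using java.io.StreamTokenizer. The class MinusAsOrdinary and its tokens() helper are made up for this illustration; note that StreamTokenizer reports numbers as doubles, so they print as 2.0 and 3.0.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class MinusAsOrdinary {
    // Tokenizes expr, optionally treating '-' as an ordinary character.
    static List<String> tokens(String expr, boolean minusIsOrdinary) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(expr));
        if (minusIsOrdinary) {
            tokenizer.ordinaryChar('-');
        }
        List<String> result = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    result.add(String.valueOf(tokenizer.nval));
                } else if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                } else {
                    // Ordinary characters are reported through ttype itself.
                    result.add(String.valueOf((char) tokenizer.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("2-3", false)); // [2.0, -3.0]
        System.out.println(tokens("2-3", true));  // [2.0, -, 3.0]
    }
}
```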