Why operator precedence being ignored here? [duplicate] - java

I am reading some Java text and got the following code:
int[] a = {4,4};
int b = 1;
a[b] = b = 0;
In the text, the author did not give a clear explanation and the effect of the last line is: a[1] = 0;
I am not so sure that I understand: how did the evaluation happen?

Let me say this very clearly, because people misunderstand this all the time:
Order of evaluation of subexpressions is independent of both associativity and precedence. Associativity and precedence determine in what order the operators are executed but do not determine in what order the subexpressions are evaluated. Your question is about the order in which subexpressions are evaluated.
Consider A() + B() + C() * D(). Multiplication is higher precedence than addition, and addition is left-associative, so this is equivalent to (A() + B()) + (C() * D()) But knowing that only tells you that the first addition will happen before the second addition, and that the multiplication will happen before the second addition. It does not tell you in what order A(), B(), C() and D() will be called! (It also does not tell you whether the multiplication happens before or after the first addition.) It would be perfectly possible to obey the rules of precedence and associativity by compiling this as:
d = D() // these four computations can happen in any order
b = B()
c = C()
a = A()
sum = a + b // these two computations can happen in any order
product = c * d
result = sum + product // this has to happen last
All the rules of precedence and associativity are followed there -- the first addition happens before the second addition, and the multiplication happens before the second addition. Clearly we can do the calls to A(), B(), C() and D() in any order and still obey the rules of precedence and associativity!
We need a rule unrelated to the rules of precedence and associativity to explain the order in which the subexpressions are evaluated. The relevant rule in Java (and C#) is "subexpressions are evaluated left to right". Since A() appears to the left of C(), A() is evaluated first, regardless of the fact that C() is involved in a multiplication and A() is involved only in an addition.
So now you have enough information to answer your question. In a[b] = b = 0 the rules of associativity say that this is a[b] = (b = 0); but that does not mean that the b=0 runs first! The rules of precedence say that indexing is higher precedence than assignment, but that does not mean that the indexer runs before the rightmost assignment.
(UPDATE: An earlier version of this answer had some small and practically unimportant omissions in the section which follows which I have corrected. I've also written a blog article describing why these rules are sensible in Java and C# here: https://ericlippert.com/2019/01/18/indexer-error-cases/)
Precedence and associativity only tell us that the assignment of zero to b must happen before the assignment to a[b], because the assignment of zero computes the value that is assigned in the indexing operation. Precedence and associativity alone say nothing about whether the a[b] is evaluated before or after the b=0.
Again, this is just the same as: A()[B()] = C() -- All we know is that the indexing has to happen before the assignment. We don't know whether A(), B(), or C() runs first based on precedence and associativity. We need another rule to tell us that.
The rule is, again, "when you have a choice about what to do first, always go left to right". However, there is an interesting wrinkle in this specific scenario. Is the side effect of a thrown exception caused by a null collection or out-of-range index considered part of the computation of the left side of the assignment, or part of the computation of the assignment itself? Java chooses the latter. (Of course, this is a distinction that only matters if the code is already wrong, because correct code does not dereference null or pass a bad index in the first place.)
So what happens?
The a[b] is to the left of the b=0, so the a[b] runs first, resulting in a[1]. However, checking the validity of this indexing operation is delayed.
Then the b=0 happens.
Then the verification that a is valid and a[1] is in range happens
The assignment of the value to a[1] happens last.
So, though in this specific case there are some subtleties to consider for those rare error cases that should not be occurring in correct code in the first place, in general you can reason: things to the left happen before things to the right. That's the rule you're looking for. Talk of precedence and associativity is both confusing and irrelevant.
People get this stuff wrong all the time, even people who should know better. I have edited far too many programming books that stated the rules incorrectly, so it is no surprise that lots of people have completely incorrect beliefs about the relationship between precedence/associativity, and evaluation order -- namely, that in reality there is no such relationship; they are independent.
If this topic interests you, see my articles on the subject for further reading:
http://blogs.msdn.com/b/ericlippert/archive/tags/precedence/
They are about C#, but most of this stuff applies equally well to Java.

Eric Lippert's masterful answer is nonetheless not properly helpful because it is talking about a different language. This is Java, where the Java Language Specification is the definitive description of the semantics. In particular, §15.26.1 is relevant because that describes the evaluation order for the = operator (we all know that it is right-associative, yes?). Cutting it down a little to the bits that we care about in this question:
If the left-hand operand expression is an array access expression (§15.13), then many steps are required:
First, the array reference subexpression of the left-hand operand array access expression is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason; the index subexpression (of the left-hand operand array access expression) and the right-hand operand are not evaluated and no assignment occurs.
Otherwise, the index subexpression of the left-hand operand array access expression is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason and the right-hand operand is not evaluated and no assignment occurs.
Otherwise, the right-hand operand is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason and no assignment occurs.
[… it then goes on to describe the actual meaning of the assignment itself, which we can ignore here for brevity …]
In short, Java has a very closely defined evaluation order that is pretty much exactly left-to-right within the arguments to any operator or method call. Array assignments are one of the more complex cases, but even there it's still L2R. (The JLS does recommend that you don't write code that needs these sorts of complex semantic constraints, and so do I: you can get into more than enough trouble with just one assignment per statement!)
C and C++ are definitely different to Java in this area: their language definitions leave evaluation order undefined deliberately to enable more optimizations. C# is like Java apparently, but I don't know its literature well enough to be able to point to the formal definition. (This really varies by language though, Ruby is strictly L2R, as is Tcl — though that lacks an assignment operator per se for reasons not relevant here — and Python is L2R but R2L in respect of assignment, which I find odd but there you go.)

a[b] = b = 0;
1) array indexing operator has higher precedence then assignment operator (see this answer):
(a[b]) = b = 0;
2) According to 15.26. Assignment Operators of JLS
There are 12 assignment operators; all are syntactically right-associative (they group right-to-left). Thus, a=b=c means a=(b=c), which assigns the value of c to b and then assigns the value of b to a.
(a[b]) = (b=0);
3) According to 15.7. Evaluation Order of JLS
The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.
and
The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated.
So:
a) (a[b]) evaluated first to a[1]
b) then (b=0) evaluated to 0
c) (a[1] = 0) evaluated last

Your code is equivalent to:
int[] a = {4,4};
int b = 1;
c = b;
b = 0;
a[c] = b;
which explains the result.

Consider another more in-depth example below.
As a General Rule of Thumb:
It's best to have a table of the Order of Precedence Rules and Associativity available to read when solving these questions e.g. http://introcs.cs.princeton.edu/java/11precedence/
Here is a good example:
System.out.println(3+100/10*2-13);
Question: What's the Output of the above Line?
Answer: Apply the Rules of Precedence and Associativity
Step 1: According to rules of precedence: / and * operators take priority over + - operators. Therefore the starting point to execute this equation will the narrowed to:
100/10*2
Step 2: According to the rules and precedence: / and * are equal in precedence.
As / and * operators are equal in precedence, we need to look at the associativity between those operators.
According to the ASSOCIATIVITY RULES of these two particular operators,
we start executing the equation from the LEFT TO RIGHT i.e. 100/10 gets executed first:
100/10*2
=100/10
=10*2
=20
Step 3: The equation is now in the following state of execution:
=3+20-13
According to the rules and precedence: + and - are equal in precedence.
We now need to look at the associativity between the operators + and - operators. According to the associativity of these two particular operators,
we start executing the equation from the LEFT to RIGHT i.e. 3+20 gets executed first:
=3+20
=23
=23-13
=10
10 is the correct output when compiled
Again, it is important to have a table of the Order of Precedence Rules and Associativity with you when solving these questions e.g. http://introcs.cs.princeton.edu/java/11precedence/

Related

Does Java have an operator precedence table? [duplicate]

Misunderstanding Java operator precedence is a source of frequently asked questions and subtle errors. I was intrigued to learn that even the Java Language Specification says, "It is recommended that code not rely crucially on this specification." JLS §15.7 Preferring clear to clever, are there any useful guidelines in this area?
As noted here, this problem should be studied in the context of Evaluation Order, detailed here. Here are a number of resources on the topic:
JLS Operators
JLS Precedence
JLS Evaluation Order
What are the rules for evaluation order in Java?
Java Glossary
Princeton
Oracle Tutorial
Conversions and Promotions
Usenet discussion
Additions or corrections welcome.
As far as the "Real World" is concerned, it's probably fair to say:
enough programmers know that multiplication/division take precedence over addition/subtraction, as is mathematically the convention
hardly any programmers can remember any of the other rules of precedence
So, apart from the specific case of */ vs +-, I'd really just use brackets to explicitly define the precedence intended.
Another related source of bugs is how rounding errors accumulate. Not an operator precedence order issue per se, but a source of surprise when you get a different result after rearranging operands in an arithmetically-equivalent way. Here's a sun.com version of David Goldberg's What Every Computer Scientist Should Know About Floating-Point Arithmetic.
The quote (from the Java Language Specification §15.7) should be read in the context of Evaluation Order. As discussed here, that section concerns evaluation order, which is not related to operator precedence (or associativity).
Precedence and associativity influence the structure of the expression tree (i.e. which operators act on which operands), while "evaluation order" merely influences the order in which the expression tree is traversed when the expression is evaluated. Evaluation order (or "traversal order") does not have any effect unless some sub-expressions have side-effects that affect the result (or side-effects) of other sub-expressions.
For example, if x==1 initially, the expression ++x/++x would evaluate as 2/3 (which evaluates to 0) since Java has left-to-right evaluation order. Had evaluation order in Java been right-to-left, x would have been incremented twice before the numerator is evaluated, and the expression would have evaluated as 3/2 (which evaluates to 1). Had evaluation order been undefined, the expression could have evaluated to either of these results.
The quote in question, together with its context, ...
The Java programming language guarantees that the operands of
operators appear to be evaluated in a specific evaluation order,
namely, from left to right.
It is recommended that code not rely crucially on this specification.
Code is usually clearer when each expression contains at most one side
effect, as its outermost operation
...discourages the reader from depending on the left-to-rightness of Java's evaluation order (as in the example above). It does not encourage unnecessary parentheses.
Edit: Resource: Java operator precedence table that also serves as an index into sections of the JLS containing the syntactic grammar from which each precedence level is inferred.
Also, don't forget the logical && and || are shortcut operators, avoid something like:
sideeffect1() || sideeffect2()
If sideeffect1() is evaluating to true, sideeffect2() isn't going to be executed. The same goes for && and false. This is not entirely related to precendence, but in these extreme cases the assiociativity can also be an important aspect that normally is really irrelevant (at least as far as i am concerned)
The JLS does not give an explicit operator precedence table; it is implied when the JLS describes various operators. For example, the grammar for ShiftExpression is this:
ShiftExpression:
AdditiveExpression
ShiftExpression << AdditiveExpression
ShiftExpression >> AdditiveExpression
ShiftExpression >>> AdditiveExpression
This means that additive operators (+ and -) have a higher precedence than the left-associative shift operators (<<, >> and >>>).
It seems to me that the truth of this is that 'most programmers' think 'most other programmers' don't know or can't remember operator precedence, so they indulge in what is indulgently called 'defensive programming' by 'inserting the missing parentheses', just to 'clarify' that. Whether remembering this third-grade stuff is a real problem is another question. It just as arguable that all this is a complete waste of time and if anything makes things worse. My own view is that redundant syntax is to be avoided wherever possible, and that computer programmers should know the language they are programming in, and also perhaps raise their expectations of their colleagues.

Arrow (->) operator precedence/priority is lowest, or priority of assignment/combined assignment is lowest?

JLS:
The lowest precedence operator is the arrow of a lambda expression
(->), followed by the assignment operators.
Followed in which direction (increasing priority, decreasing priority)? - "followed" means assignment has higher priority or lower priority (with respect to arrow operator)? I guess, in increasing, because "lowest" (for arrow) means absolutely lowest.
As I understand, arrow (->) should be at the very bottom of this Princeton operator precedence table (that is below all assignment operators), thus arrow (->) having 0 (zero) priority Level (as per that table).
Am I correct in my understanding?
ExamTray seems to say that arrow priority is at least same as assignment... Plus clarified that arrow associativity is Left->To->Right (unlike assignment). I didn't find any JLS quote for arrow associativity.
I always used to think that assignment priority is principally lowest for a reason.
Note the sentence preceding the quoted JLS text:
Precedence among operators is managed by a hierarchy of grammar productions.
The grammar of the Java language determines which constructs are possible and implicitly, the operator precedence.
Even the princeton table you’ve linked states:
There is no explicit operator precedence table in the Java Language Specification. Different tables on the web and in textbooks disagree in some minor ways.
So, the grammar of the Java language doesn’t allow lambda expressions to the left of an assignment operator and likewise, doesn’t allow assignments to the left of the ->. So there’s no ambiguity between these operators possible and the precedence rule, though explicitly stated in the JLS, becomes meaningless.
This allows to compile, e.g. such a gem, without ambiguity:
static Consumer<String> C;
static String S;
public static void main(String[] args)
{
Runnable r;
r = () -> C = s -> S = s;
}
First, let's explain the practical issue here.
Assuming you have a definition like
IntUnaryOperator op;
The following is syntactically accepted, and works as expected:
op = x -> x;
That is, we have an identity function on int assigned to the op variable. But if = had a higher priority, we'd expect Java to interpret this as
(op = x) -> x;
Which is not syntactically valid, thus should be a compile error. Hence, assignment does not, in practice, have higher precedence than the arrow.
But the following is also OK (assume t is a class/instance variable of type int):
op = x -> t = x;
This compiles, and the function, if applied, assigns the value of the operand to t and also returns it.
This means that the arrow doesn't have higher precedence than the assignment t = x. Otherwise it would have been interpreted as
op = ( x -> t ) = x
and clearly, this is not what happens.
So it seems that the operations have equal precedence. What's more, that they are right-associative. This is implied from the grammar at JLS chapter 19:
Expression:
LambdaExpression
AssignmentExpression
LambdaExpression:
LambdaParameters -> LambdaBody
...
LambdaBody:
Expression
Block
So the right side of the lambda body gets us back to Expression, which means we can either have a (higher priority) lambda inside it, or a (higher priority) assignment in it. What I mean by "higher priority" is that the deeper you go through the production rules, the earlier the expression will be evaluated.
The same is true for the assignment operator:
AssignmentExpression:
ConditionalExpression
Assignment
Assignment:
LeftHandSide AssignmentOperator Expression
Once again, the right side of the assignment throws us back to Expression, so we can have a lambda expression or an assignment there.
So rather than relying on the JLS text, the grammar gives us a well defined description of the situation.

Execution order of f1() + f2()*f3() expression and operator precedence in JLS

Given an expression f1() + f2()*f3() with 3 method calls, java evaluates (operands of) addition operation first:
int result = f1() + f2()*f3();
f1 working
f2 working
f3 working
I (wrongly) expected f2() to be called first, then f3(), and finally f1(). Because multiplication shall be evaluated before addition.
So, I don't understand JLS here - what am I missing?
15.7.3. Evaluation Respects Parentheses and Precedence
The Java programming language respects the order of evaluation
indicated explicitly by parentheses and implicitly by operator
precedence.
How exactly operator precedence is respected in this example?
JLS mentions some exceptions in 15.7.5. Evaluation Order for Other Expressions (method invocation expressions (§15.12.4),method reference expressions (§15.13.3)), but I cannot apply any of those exceptions to my example.
I understand that JLS guarantees that operands of a binary operation are evaluated left-to-right. But in my example in order to understand method invocation sequence it is necessary to understand which operation (and both its operands respectively!) is considered first. Am I wrong here - why?
More examples:
int result = f1() + f2()*f3() + f4() + f5()*f6() + (f7()+f8());
proudly produces:
f1 working
f2 working
f3 working
f4 working
f5 working
f6 working
f7 working
f8 working
Which is 100% left-to-right, totally regardless of operator precedence.
int result = f1() + f2() + f3()*f4();
produces:
f1 working
f2 working
f3 working
f4 working
Here the compiler had two options:
f1()+f2() - addition first
f3()*f4() - multiplication first
But the "evaluate-left-to-right rule" works here as a charm!
This link implies that in such examples method calls are always evaluated left-to-right regardless of any parenthesis and associativity rules.
You're missing the text immediately under 15.7:
The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.
The values of the operands of an expression are evaluated before the expression itself. In your specific case, The outermost expression is f1() + [f2() * f3()]. We must first evaluate f1(), then we must evaluate f2() * f3(). Within that, f2() is evaluated first, then f3(), then the multiplication (which returns the value for the operand of the +).
JLS §15.17.2. Evaluate Operands before Operation:
The Java programming language guarantees that every operand of an
operator (except the conditional operators &&, ||, and ? :) appears to
be fully evaluated before any part ofthe operation itself is
performed.
Thus, f1(), f2() and f3() are evaluated first (left-to-right). Then, operator precedence is applied.
After all, you could observe the execution order in the bytecode, which in my case is:
[...]
INVOKESTATIC Main.f1 ()I
INVOKESTATIC Main.f2 ()I
INVOKESTATIC Main.f3 ()I
IMUL
IADD
Here is how the Expression Tree would look like:
+
/ \
/ \
f1 *
/ \
/ \
f2 f3
Leaves are evaluated first and the operator precedence is encoded in the tree itself.
Because multiplication shall be evaluated before addition.
As you seem to mean it, that's not a reasonable description of the precedence of multiplication over addition, nor of the effect of section 15.7.3. Precedence says that your expression
f1() + f2()*f3()
is evaluated the same as
f1() + (f2() * f3())
, as opposed to the same as
(f1() + f2()) * f3()
Precedence does not speak to which operand of the + operator is evaluated first, regardless of the nature of the operands. It speaks instead to what each operand is. This does impact order of evaluation, in that it follows from precedence rules that the result of the multiplication in your original expression must be computed before the result of the addition, but that does not speak directly to the question of when the values of two factors are evaluated relative to each other or to other sub-expressions.
Different rules than the one you quoted are responsible for the order in which the two operands of a + or * operator are evaluated -- always the left operand first and the right operand second.
How exactly operator precedence is respected in this example?
The example's output does not convey information about whether precedence -- understood correctly -- is respected. The same output would be observed for both of the two parenthesized alternatives above, and precedence is about which one of those two is equivalent to the original expression.
In an order-of-operations sense, precedence says (only) that the right-hand operand of the addition is f2()*f3(), which means that product must be computed before the overall sum can be. JLS 15.7.3 isn't saying anything more than or different from that. That that sub-expression is a multiplication has no bearing on its order of evaluation relative to the other operand of the addition operation.
I understand that JLS guarantees that operands of a binary operation are evaluated left-to-right. But in my example in order to
understand method invocation sequence it is necessary to understand
which operation (and both its operands respectively!) is considered
first. Am I wrong here - why?
Yes, you are wrong, and JLS notwithstanding, your examples provide very good evidence for that.
The problem seems to be the "and both its operands respectively" bit. Your idea seems to be roughly that Java should look through the expression, find all the multiplication operations and fully evaluate them, including their operands, and only afterward go back and do the same for the addition operations. But that is not required to observe precedence, and it is not consistent with the left-to-right evaluation order of addition operations, and it would make Java code much harder for humans to write or read.
Let's consider your longer example:
int result = f1() + f2()*f3() + f4() + f5()*f6() + (f7()+f8());
Evaluation proceeds as follows:
f1() is evaluated first, because it is the left operand of the first operation, and the the operation's operands must be evaluated left-to-right
f2() is evaluated next, because it is part of the right operand to the addition (whose turn it is to be evaluated), and specifically the left operand of the multiplication
f3() is evaluated next, because the multiplication sub-expression to which it belongs is being evaluated, and it is the right operand of that multiplication. Precedence affects the order of operations here by establishing that in fact it is the product that is being evaluated. If that were not so, then instead, the result of f2() would be added to the result of f1() at this point, before f3() was computed.
the product of f2() and f3() is computed, in accordance with the precedence of multiplication over addition.
The sum of (previously evaluated) f1() and f2()*f3() is computed. This arises from a combination of precedence of * over + already discussed, and the left-to-right associativity of +, which establishes that the sum is the left operand of the next addition operation, and therefore must be evaluated before that operation's right operand.
f4() is evaluated, because it is the right operand of the addition operation being evaluated at that point.
The second sum is computed. This is again because of the left-to-right associativity of +, and the left-to-right evaluation order of the next addition operation.
f5() then f6() then their product is evaluated. This is completely analogous to the case of f2()*f3(), and thus is another exercise of operator precedence.
the result of f5()*f6(), as the right-hand operand of an addition operation, is added to the left-hand operand. This is again analogous to previous steps.
f7() is evaluated.
Because of the parentheses, f8() is evaluated next, and then its sum with the previously-determined result of evaluating f7(), and then that result is added to the preceding result to complete the computation. Without the parentheses, the order would be different: the result of f7() would be added to the preceding result, then f8() computed, then that result added.
The most useful tip in my opinion was found in this discussion
JLS is not so clear on that - section 15.7 about evaluation speaks only about operands of a single operation, leaving unclear real-life complex cases with many operations combined.

Why x=x++ gives different results for C and Java? [duplicate]

What are "sequence points"?
What is the relation between undefined behaviour and sequence points?
I often use funny and convoluted expressions like a[++i] = i;, to make myself feel better. Why should I stop using them?
If you've read this, be sure to visit the follow-up question Undefined behavior and sequence points reloaded.
(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)
C++98 and C++03
This answer is for the older versions of the C++ standard. The C++11 and C++14 versions of the standard do not formally contain 'sequence points'; operations are 'sequenced before' or 'unsequenced' or 'indeterminately sequenced' instead. The net effect is essentially the same, but the terminology is different.
Disclaimer : Okay. This answer is a bit long. So have patience while reading it. If you already know these things, reading them again won't make you crazy.
Pre-requisites : An elementary knowledge of C++ Standard
What are Sequence Points?
The Standard says
At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations
shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)
Side effects? What are side effects?
Evaluation of an expression produces something and if in addition there is a change in the state of the execution environment it is said that the expression (its evaluation) has some side effect(s).
For example:
int x = y++; //where y is also an int
In addition to the initialization operation the value of y gets changed due to the side effect of ++ operator.
So far so good. Moving on to sequence points. An alternation definition of seq-points given by the comp.lang.c author Steve Summit:
Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.
What are the common sequence points listed in the C++ Standard?
Those are:
at the end of the evaluation of full expression (§1.9/16) (A full-expression is an expression that is not a subexpression of another expression.)1
Example :
int a = 5; // ; is a sequence point here
in the evaluation of each of the following expressions after the evaluation of the first expression (§1.9/18) 2
a && b (§5.14)
a || b (§5.15)
a ? b : c (§5.16)
a , b (§5.18) (here a , b is a comma operator; in func(a,a++) , is not a comma operator, it's merely a separator between the arguments a and a++. Thus the behaviour is undefined in that case (if a is considered to be a primitive type))
at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any) which
takes place before execution of any expressions or statements in the function body (§1.9/17).
1 : Note : the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically
part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument
2 : The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.
What is Undefined Behaviour?
The Standard defines Undefined Behaviour in Section §1.3.12 as
behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements 3.
Undefined behavior may also be expected when this
International Standard omits the description of any explicit definition of behavior.
3 : permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or with-
out the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.
What is the relation between Undefined Behaviour and Sequence Points?
Before I get into that you must know the difference(s) between Undefined Behaviour, Unspecified Behaviour and Implementation Defined Behaviour.
You must also know that the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.
For example:
int x = 5, y = 6;
int z = x++ + y++; //it is unspecified whether x++ or y++ will be evaluated first.
Another example here.
Now the Standard in §5/4 says
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
What does it mean?
Informally it means that between two sequence points a variable must not be modified more than once.
In an expression statement, the next sequence point is usually at the terminating semicolon, and the previous sequence point is at the end of the previous statement. An expression may also contain intermediate sequence points.
From the above sentence the following expressions invoke Undefined Behaviour:
i++ * ++i; // UB, i is modified more than once btw two SPs
i = ++i; // UB, same as above
++i = 2; // UB, same as above
i = ++i + 1; // UB, same as above
++++++i; // UB, parsed as (++(++(++i)))
i = (i, ++i, ++i); // UB, there's no SP between `++i` (right most) and assignment to `i` (`i` is modified more than once btw two SPs)
But the following expressions are fine:
i = (i, ++i, 1) + 1; // well defined (AFAIK)
i = (++i, i++, i); // well defined
int j = i;
j = (++i, i++, j*i); // well defined
Furthermore, the prior value shall be accessed only to determine the value to be stored.
What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.
For example in i = i + 1 all the access of i (in L.H.S and in R.H.S) are directly involved in computation of the value to be written. So it is fine.
This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.
Example 1:
std::printf("%d %d", i,++i); // invokes Undefined Behaviour because of Rule no 2
Example 2:
a[i] = i++ // or a[++i] = i or a[i++] = ++i etc
is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.
Example 3 :
int x = i + i++ ;// Similar to above
Follow up answer for C++11 here.
This is a follow up to my previous answer and contains C++11 related material..
Pre-requisites : An elementary knowledge of Relations (Mathematics).
Is it true that there are no Sequence Points in C++11?
Yes! This is very true.
Sequence Points have been replaced by Sequenced Before and Sequenced After (and Unsequenced and Indeterminately Sequenced) relations in C++11.
What exactly is this 'Sequenced before' thing?
Sequenced Before(§1.9/13) is a relation which is:
Asymmetric
Transitive
between evaluations executed by a single thread and induces a strict partial order1
Formally it means given any two evaluations(See below) A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before B and B is not sequenced before A, then A and B are unsequenced 2.
Evaluations A and B are indeterminately sequenced when either A is sequenced before B or B is sequenced before A, but it is unspecified which3.
[NOTES]
1 : A strict partial order is a binary relation "<" over a set P which is asymmetric, and transitive, i.e., for all a, b, and c in P, we have that:
........(i). if a < b then ¬ (b < a) (asymmetry);
........(ii). if a < b and b < c then a < c (transitivity).
2 : The execution of unsequenced evaluations can overlap.
3 : Indeterminately sequenced evaluations cannot overlap, but either could be executed first.
What is the meaning of the word 'evaluation' in context of C++11?
In C++11, evaluation of an expression (or a sub-expression) in general includes:
value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and
initiation of side effects.
Now (§1.9/14) says:
Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.
Trivial example:
int x;
x = 10;
++x;
Value computation and side effect associated with ++x is sequenced after the value computation and side effect of x = 10;
So there must be some relation between Undefined Behaviour and the above-mentioned things, right?
Yes! Right.
In (§1.9/15) it has been mentioned that
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced4.
For example :
int main()
{
int num = 19 ;
num = (num << 3) + (num >> 3);
}
Evaluation of operands of + operator are unsequenced relative to each other.
Evaluation of operands of << and >> operators are unsequenced relative to each other.
4: In an expression that is evaluated more than once during the execution
of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations.
(§1.9/15)
The value computations of the operands of an
operator are sequenced before the value computation of the result of the operator.
That means in x + y the value computation of x and y are sequenced before the value computation of (x + y).
More importantly
(§1.9/15) If a side effect on a scalar object is unsequenced relative to either
(a) another side effect on the same scalar object
or
(b) a value computation using the value of the same scalar object.
the behaviour is undefined.
Examples:
int i = 5, v[10] = { };
void f(int, int);
i = i++ * ++i; // Undefined Behaviour
i = ++i + i++; // Undefined Behaviour
i = ++i + ++i; // Undefined Behaviour
i = v[i++]; // Undefined Behaviour
i = v[++i]: // Well-defined Behavior
i = i++ + 1; // Undefined Behaviour
i = ++i + 1; // Well-defined Behaviour
++++i; // Well-defined Behaviour
f(i = -1, i = -1); // Undefined Behaviour (see below)
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced. — end note]
Expressions (5), (7) and (8) do not invoke undefined behaviour. Check out the following answers for a more detailed explanation.
Multiple preincrement operations on a variable in C++0x
Unsequenced Value Computations
Final Note :
If you find any flaw in the post please leave a comment. Power-users (With rep >20000) please do not hesitate to edit the post for correcting typos and other mistakes.
C++17 (N4659) includes a proposal Refining Expression Evaluation Order for Idiomatic C++
which defines a stricter order of expression evaluation.
In particular, the following sentence
8.18 Assignment and compound assignment operators:....
In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.
The right operand is sequenced before the left operand.
together with the following clarification
An expression X is said to be sequenced before an expression Y if every
value computation and every side effect associated with the expression X is sequenced before every value
computation and every side effect associated with the expression Y.
make several cases of previously undefined behavior valid, including the one in question:
a[++i] = i;
However several other similar cases still lead to undefined behavior.
In N4140:
i = i++ + 1; // the behavior is undefined
But in N4659
i = i++ + 1; // the value of i is incremented
i = i++ + i; // the behavior is undefined
Of course, using a C++17 compliant compiler does not necessarily mean that one should start writing such expressions.
I am guessing there is a fundamental reason for the change, it isn't merely cosmetic to make the old interpretation clearer: that reason is concurrency. Unspecified order of elaboration is merely selection of one of several possible serial orderings, this is quite different to before and after orderings, because if there is no specified ordering, concurrent evaluation is possible: not so with the old rules. For example in:
f (a,b)
previously either a then b, or, b then a. Now, a and b can be evaluated with instructions interleaved or even on different cores.
In C99(ISO/IEC 9899:TC3) which seems absent from this discussion thus far the following steteents are made regarding order of evaluaiton.
[...]the order of evaluation of subexpressions and the order in which
side effects take place are both unspecified. (Section 6.5 pp 67)
The order of evaluation of the operands is unspecified. If an attempt
is made to modify the result of an assignment operator or to access it
after the next sequence point, the behavior[sic] is undefined.(Section
6.5.16 pp 91)

How does expression evaluation order differ between C++ and Java?

I've had my brain wrinkled from trying to understand the examples on this page:
http://answers.yahoo.com/question/index?qid=20091103170907AAxXYG9
More specifically this code:
int j = 4;
cout << j++ << j << ++j << endl;
gives an output: 566
Now this makes sense to me if the expression is evaluated right to left, however in Java a similar expression:
int j = 4;
System.out.print("" + (j++) + (j) + (++j));
gives an output of: 456
Which is more intuitive because this indicates it's been evaluated left to right. Researching this across various sites, it seems that with C++ the behaviour differs between compilers, but I'm still not convinced I understand. What's the explanation for this difference in evaluation between Java and C++? Thanks SO.
When an operation has side effects, C++ relies on sequence points rule to decide when side effects (such as increments, combined assignments, etc.) have to take effect. Logical and-then/or-else (&& and ||) operators, ternary ? question mark operators, and commas create sequence points; +, -, << and so on do not.
In contrast, Java completes side effects before proceeding with further evaluation.
When you use an expression with side effects multiple times in the absence of sequence points, the resulting behavior is undefined in C++. Any result is possible, including one that does not make logical sense.
Java guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right. The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated. Java also guarantees that every operand of an operator (except the conditional operators &&, ||, and ?:) appears to be fully evaluated before any part of the operation itself is performed. See Java language specification §15.7 for more details.
C++, on the other hand, happily let's you get away with undefined behavior if the expression is ambiguous because language itself doesn't guarantee any order of evaluation of sub-expressions. See Sequence Point for more details.
In C++ the order of evalulations of subexpressions isn't left-to-right nor right-to-left. It is undefined.

Categories

Resources