I really find the concept of conversion to be confusing.
short a = 34;
short b = (short)34;
What is the linguistic difference between these two statements?
I think that, in the first statement the int literal 34 is directly stored in the short variable a, causing a conversion of int literal 34 to short literal 34. And in the second one, it seems that the int literal 34 is first converted to the short literal 34 due to the casting instruction and then stored in the short variable b. Am i correct or its one of the two cases?
What is the functional difference between these two statements?
There is no functional difference in this example. In this example, the end result is the same.
From a (Java) linguistic perspective, these are two distinct cases:
short b = (short) 34;
The number 34 is an int literal. The (short) is a type cast that performs an explicit narrowing primitive conversion. This will truncate the value if necessary, though it is not necessary here. You then have an short value that is assigned to the variable with no further conversion.
short b = 34;
The number 34 is an int literal. In this case there is an implicit narrowing primitive conversion. This "special case" happens in an assignment context when:
the expression being evaluated is a compile time constant expression, AND
the type of the variable is byte, char or short, AND
the value of the expression is representable in the type of the variable.
This will NOT truncate the value.
If you change the code to the following, then the difference between the two contexts becomes apparent:
short a = 100000; // Compilation error because 100,000 is too large
short b = (short) 100000; // OK - 100,000 is truncated to give -31,072
Alternatively:
int x = 34;
short y = x; // Compilation error because 'x' is NOT
// a constant expression
The most relevant sections of the JLS are 5.2 Assignment Contexts and 5.4 Casting Contexts.
Related
So I understand what a dataype and literal is but I am confused about one thing.
Why do I have to put an L when the datatype is long? How come I don't have to do that for short or byte?
// Why does a not work but b does? Both are longs?
long a = 9223372036854775807;
long b = 9223372036854775807L;
There are only two integer literals types defined in Java: int and long. The latter is distinguished from the former by a suffix L or l. (A character literal is also of integral type (can be assigned to an int, for example) but the Java Language Specification (JLS) treats it separately from integer literals.)
Remember that (apart from inferring generic type parameters), the type of the left hand side of an assignment has no effect on the evaluation of the right-hand side. Therefore, if you assign something to a variable of type long, the expression will be evaluated first and only then will the result be converted to long (if possible and necessary).
The statement
long a = 10;
is perfectly valid. The right hand side expression consisting only of the integer literal 10 will be evaluated and then promoted to long which is then assigned to the variable a. This works well unless you want to assign a value that is too large to be represented in an int in which case you'll have to make the literal of type long as well.
This “problem” is not limited to literals. Consider the following statement that frequently bites new users:
long a = Integer.MAX_VALUE + 1;
The expression on the right hand side is evaluated using int arithmetic and therefore overflows. Only then the result (−2147483648) is promoted to long and assigned to a but it is too late by now. Had the 1 been written as 1L, the overflow would not have happened.
How come I don't have to do that for short or byte?
First, note that a literal of type int can hold any value you could possibly assign to a short or byte so the problem described above cannot occur. Moreover, it is not as much that you “don't have to” use a short or byte literal but that you cannot because such literals are not defined. This has always bothered me because it means that we often have to cast to byte if we want to call a function with a literal argument.
If you like reading standardese, you can look up the chapter about Integer Literals (§ 3.10.1) in the JLS.
Clearly stated in the documentation:
"An integer literal is of type long if it ends with the letter L or l; otherwise it is of type int. It is recommended that you use the upper case letter L because the lower case letter l is hard to distinguish from the digit 1."
Your 2nd question is answered there as well:
"Values of the integral types byte, short, int, and long can be created from int literals"
If you didn't set L, the compiler will try to parse the literal as an int like as below sample scenario.
long accepts a value between the ranges of: -9,223,372,036,854 to +9,223,372,036,854,775,807. Now I have create a Long variable called testLong, although when I insert 9223372036854775807 as the value, I get an error stating:
"The literal of int 9223372036854775807 is out of range."
Solution is add a capital L to the end:
I understand why the following is wrong:
byte a = 3;
byte b = 8;
byte c = a + b; // compile error
It won't compile. Expressions always result in an int. So we should have done the explicit cast:
byte c = (byte) (a + b); // valid code
But I don't understand why the following is correct:
byte d = 3 + 8; // it's valid! why?
Because literal integer (such as 3 or 8) is always implicitly an int. And int-or-smaller expressions always result in an int too. Can anybody explain what is going on here?
The only thing I can guess is that the compiler equates this expression to the following:
byte d = 11;
and doesn't consider this an expression.
This has less† to do with whether or not 3 + 8 is evaluated to 11 at compile-time, and more to do with the fact the compiler is explicitly permitted to implicitly narrow ints to bytes in certain cases. In particular, the language specification explicitly permits implicit narrowing conversions to byte of constant expressions of type int that can fit in a byte at compile-time.
The relevant section of the JLS here is section §5.2:
In addition, if the expression is a constant expression (§15.28) of type byte, short,
char, or int:
A narrowing primitive conversion may be used if the type of the
variable is byte, short, or char, and the value of the constant
expression is representable in the type of the variable.
The compile-time narrowing of constants means that code such as:
byte theAnswer = 42;
is allowed. Without the narrowing, the fact that the integer literal 42 has type int would
mean that a cast to byte would be required:
†: Obviously, as per the specification, the constant expression needs to be evaluated to see if it fits in the narrower type or not. But the salient point is that without this section of the specification, the compiler would not be permitted to make the implicit narrowing conversion.
Let's be clear here:
byte a = 3;
byte b = 8;
The reason that these are permitted is because of the above section of the specification. That is, the compiler is allowed to make the implicit narrowing conversion of the literal 3 to a byte. It's not because the compiler evaluates the constant expression 3 to its value 3 at compile-time.
The only thing I can guess is that the compiler equates this expression to the following:
Yes it does. As long as the right side expression is made of constants (which fit into the required primitive type -- see #Jason's answer for what the JLS says about this exactly), you can do that. This will not compile because 128 is out of range:
byte a = 128;
Note that if you transform your first code snippet like this:
final byte a = 3;
final byte b = 8;
byte c = a + b;
it compiles! As your two bytes are final and their expressions are constants, this time, the compiler can determine that the result will fit into a byte when it is first initialized.
This, however, will not compile:
final byte a = 127; // Byte.MAX_VALUE
final byte b = 1;
byte c = a + b // Nope...
The compiler will error out with a "possible loss of precision".
It is because 3 and 8 are compile time constants.
Therefore, at the time of compilation happens, compiler can identify that 3 + 8 can fit into a byte variable.
If you make your a and b to final (constant) variable. a + b will become a compile time constant. Therefore, it will compile without any issue.
final byte a = 3;
final byte b = 8;
byte c = a + b;
Let's consider the following expressions in Java.
byte a = 32;
byte b = (byte) 250;
int i = a + b;
This is valid in Java even though the expression byte b = (byte) 250; is forced to assign the value 250 to b which is outside the range of the type byte. Therefore, b is assigned -6 and consequently i is assigned the value 26 through the statement int i = a + b;.
The same thing is possible with short as follows.
short s1=(short) 567889999;
Although the specified value is outside the range of short, this statement is legal.
The same thing is however wrong with higher data types such int, double, folat etc and hence, the following case is invalid and causes a compile-time error.
int z=2147483648;
This is illegal, since the range of int in Java is from -2,147,483,648 to 2147483647 which the above statement exceeds and issues a compile-time error. Why is such not wrong with byte and short data types in Java?
The difference is in the literal itself. You can do:
int z = (int) 2147483648L;
because now you've got a "long" literal. So basically, there are two steps involved:
Parsing the literal to a valid value for that literal type (int in your earlier examples, long for the "big" one)
Converting that value to the target type
The error is because by default all numeric integer literals are int and 2147483648 is out of range.
To code this, you need to make it a long with an explicit downcast to int:
int z = (int)2147483648L; // compiles
Why is int i = 2147483647 + 1; OK, but byte b = 127 + 1; is not compilable?
Constants are evaluated as ints, so 2147483647 + 1 overflows and gives you a new int, which is assignable to int, while 127 + 1 also evaluated as int equals to 128, and it is not assignable to byte.
The literal 127 denotes a value of type int. So does the literal 1. The sum of these two is the integer 128. The problem, in the second case, is that you are assigning this to a variable of type byte. It has nothing to do with the actual value of the expressions. It has to do with Java not supporting coercions (*). You have to add a typecast
byte b = (byte)(127 + 1);
and then it compiles.
(*) at least not of the kind String-to-integer, float-to-Time, ... Java does support coercions if they are, in a sense, non-loss (Java calls this "widening").
And no, the word "coercion" did not need correcting. It was chosen very deliberately and correctly at that. From the closest source to hand (Wikipedia) : "In most languages, the word coercion is used to denote an implicit conversion, either during compilation or during run time." and "In computer science, type conversion, typecasting, and coercion are different ways of, implicitly or explicitly, changing an entity of one data type into another.".
As an evidence to #MByD:
The following code compiles:
byte c = (byte)(127 + 1);
Because although expression (127 + 1) is int and beyond the scope off byte type the result is casted to byte. This expression produces -128.
JLS3 #5.2 Assignment Conversion
( variable = expression )
In addition, if the expression is a constant expression (§15.28) of type byte, short, char or int :
A narrowing primitive conversion may be used if the type of the variable is byte, short, or char, and the value of the constant expression is representable in the type of the variable.
Without this clause, we wouldn't be able to write
byte x = 0;
char c = 0;
But should we be able to do this? I don't think so. There are quite some magic going on in conversion among primitives, one must be very careful. I would go out of my way to write
byte x = (byte)0;
This may sound too trivial for an intermediate Java programmer. But during my process of reviewing Java fundamentals, found a question:
Why is narrowing conversion like:
byte b = 13;
will be allowed while
int i = 13;
byte b = i;
will be complained by the compiler?
Because byte b = 13 ; is assignment of a constant. Its value is known at compile time, so the compiler can/should/will whine if assignment of the constant's value would result in overflow (try byte b = 123456789 ; and see what happens.)
Once you assign it to a variable, you're assigning the value of an expression, which, while it may well be invariant, the compiler doesn't know that. That expression might result in overflow and so the compiler whines.
From here:
Assignment conversion occurs when the
value of an expression is assigned
(§15.26) to a variable: the type of
the expression must be converted to
the type of the variable. Assignment
contexts allow the use of an identity
conversion (§5.1.1), a widening
primitive conversion (§5.1.2), or a
widening reference conversion
(§5.1.4). In addition, a narrowing
primitive conversion may be used if
all of the following conditions are
satisfied:
The expression is a constant expression of type byte, short, char or int.
The type of the variable is byte, short, or char.
The value of the expression (which is known at compile time, because it is a constant expression) is representable in the type of the variable.
In your example all three conditions are satisfied, so the narrowing conversion is allowed.
P.S. I know the source I'm quoting is old, but this aspect of the language hasn't changed since.
Because a literal number has no type.
Once you give it a type it must be casted to the other one:
int i = 13;
byte b = (byte) i;
A byte has 8 bits. An int, 32 bits, and it is a signed number.
Conversion from a number with a smaller range of magnitude ( like int to long or long to float ) is called widening. The goal of widening conversions is to produce no change in the magnitude of the number while preserving as much of the precision as possible. For example, converting the int 2147483647 to float produces 2.14748365e9 or 2,147,483,650. The difference is usually small, but it may be significant.
Conversely, conversion where there is the possibility of losing information about the magnitude of the number ( like long to int or double to long ) is called narrowing. With narrowing conversions, some information may be lost, but the nearest representation is found whenever possible. For example, converting the float 3.0e19 to long yields -9223372036854775807, a very different number.