Consider the following code:
int x1 = 0b1111_1111_1111_1111_1111_1111_1111_111; // binary for 2147483647
System.out.println(x1); // prints 2147483647
int x2 = 2147483648; // The literal 2147483648 of type int is out of range
// x3 is binary representation for 2147483648
int x3 = 0b1000_0000_0000_0000_0000_0000_0000_0000; // accepted without any compile errors
System.out.println(x3); // prints -2147483648
// x4 is binary representation for 4294967295
int x4 = 0b1111_1111_1111_1111_1111_1111_1111_1111; // long value: 4294967295
System.out.println(x4); // prints -1
int x5 = 0b1111_1111_1111_1111_1111_1111_1111_1111_1; // The literal 0b1111_1111_1111_1111_1111_1111_1111_1111_1 of type int is out of range
Integer.MAX_VALUE is 2147483647, and the compiler accepts any int literal in that range, throwing an error when the value goes beyond 2147483647. However, x3 (int: -2147483648, long: 2147483648) and x4 (int: -1, long: 4294967295) in the above snippet are accepted without any errors, while x5 produces an error.
First question: Why did the compiler not complain about the range of x3?
Second question: If the values of x3 and x4 are accepted without any errors, why is an error thrown for x5?
TL;DR
Why did the compiler not complain about the range of x3?
Because it fits in 32 bits, and the Java Language Specification (JLS) says a literal is valid when it does.
If the values of x3 and x4 are accepted without any errors, why is an error thrown for x5?
Because it doesn't fit in 32 bits: it is 33 bits long.
Comment on Code Style
You should insert the _ separators in a binary literal where the nibble boundaries are, so instead of 0b1111_1111_1111_1111_1111_1111_1111_111 it should be 0b111_1111_1111_1111_1111_1111_1111_1111.
That then correctly represents that it is the first nibble that's missing a digit. It also makes it directly comparable to the hex representation, e.g. 0x7F_FF_FF_FF.
Your way of inserting _ is very confusing.
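For comparison, here is a nibble-aligned version next to its hex equivalent (a small illustrative snippet, not part of the original question):

// 31 ones, nibble-aligned: the first nibble is the one missing a digit
int max = 0b111_1111_1111_1111_1111_1111_1111_1111;
System.out.println(max == 0x7F_FF_FF_FF);     // true
System.out.println(max == Integer.MAX_VALUE); // true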
Long answer
In Java, numbers formatted using Integer.toBinaryString(int i), Integer.toHexString(int i), and Integer.toOctalString(int i) are formatted as unsigned numbers.
This fits the Java integer literal syntax as defined by JLS 3.10.1. Integer Literals, which states:
It is a compile-time error if a hexadecimal, octal, or binary int literal does not fit in 32 bits.
Since 0b1000_0000_0000_0000_0000_0000_0000_0000 and 0b1111_1111_1111_1111_1111_1111_1111_1111, as well as their hex counterparts 0x80_00_00_00 and 0xFF_FF_FF_FF, all fit in 32 bits, they are valid int literals.
If you print them using the methods above, they match the literal, even though they would all print -1 if printed as a (signed) decimal:
System.out.println(Integer.toBinaryString(0b1111_1111_1111_1111_1111_1111_1111_1111));
System.out.println(Integer.toOctalString(037_777_777_777));
System.out.println(Integer.toHexString(0xFF_FF_FF_FF));
11111111111111111111111111111111
37777777777
ffffffff
The first bit of both x3 and x4 is 1, hence they are treated as negative numbers. They are both written with 32 bits, so they fit the int data type and the compiler doesn't complain. x5 gives an error because you are attempting to assign 33 bits to a 32-bit data type, which cannot fit.
The primitive int is a 32-bit number, with the leading bit being the sign of the integer, so when you wrote 2147483648 as an int you caused a compile-time error. To fix the problem, use the primitive long (or double) for larger values.
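For example (a minimal sketch), adding the L suffix makes the literal a long, which can hold the value:

long big = 2147483648L; // fine: long literal
System.out.println(big); // prints 2147483648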
The Java Language Specification has the answer.
4.2. Primitive Types and Values
The integral types are byte, short, int, and long, whose values are 8-bit, 16-bit, 32-bit and 64-bit signed two's-complement integers
So int is a 32-bit two's-complement. (Also read the great answer about: What is “2's Complement”?)
And in 3.10.1. Integer Literals it shows:
The largest positive hexadecimal, octal, and binary literals of type int - each of which represents the decimal value 2147483647 (2^31-1) - are respectively:
0x7fff_ffff,
0177_7777_7777, and
0b0111_1111_1111_1111_1111_1111_1111_1111
The most negative hexadecimal, octal, and binary literals of type int - each of which represents the decimal value -2147483648 (-2^31) - are respectively:
0x8000_0000,
0200_0000_0000, and
0b1000_0000_0000_0000_0000_0000_0000_0000
The following hexadecimal, octal, and binary literals represent the decimal value -1:
0xffff_ffff,
0377_7777_7777, and
0b1111_1111_1111_1111_1111_1111_1111_1111
It is a compile-time error if a hexadecimal, octal, or binary int literal does not fit in 32 bits.
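You can verify the quoted boundary literals by printing them as signed decimals (a quick illustrative check):

System.out.println(0b0111_1111_1111_1111_1111_1111_1111_1111); // 2147483647
System.out.println(0b1000_0000_0000_0000_0000_0000_0000_0000); // -2147483648
System.out.println(0b1111_1111_1111_1111_1111_1111_1111_1111); // -1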
Related
Can someone please explain or link any documentation that differentiates between the Possible Loss of Precision error and Lossy Conversion? I cannot understand which error will occur under what circumstances. Any explanation with examples is deeply appreciated.
The difference is which end of the number gets chopped off:
Lossy conversion keeps the least-significant bits. It is described in JLS Sec 5.1.3:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
It is something like converting an int to a byte: you simply get the 8 least-significant bits in this case:
System.out.println((byte) 258); // 2
Loss of precision keeps the most-significant bits. It is described in JLS Sec 5.1.2:
A widening primitive conversion from int to float, or from long to float, or from long to double, may result in loss of precision - that is, the result may lose some of the least significant bits of the value.
It is something like storing an int in a float which is too large to be represented accurately:
int i = (1 << 24) + 1;
float f = i;
System.out.println((int) f == i); // false, because precision is lost.
You get the possible loss of precision error when you try to assign, for example, a double to an int. You are trying to convert one primitive to another primitive that does not have enough space, so some of the bits can be lost.
double x = 10.5; // 8 bytes
int y = x; // 4 bytes; compile-time error: possible lossy conversion from double to int
You should look at the documentation for the primitive types.
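If the narrowing is intentional, an explicit cast makes it compile (a minimal sketch; the fractional part is truncated toward zero):

double x = 10.5;
int y = (int) x;       // compiles: explicit narrowing cast
System.out.println(y); // prints 10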
Here is my program.
public class Foo
{
    public static void main(String[] args)
    {
        System.out.println((int) 2147483648l);
        System.out.println((int) 2147483648f);
    }
}
Here is the output.
-2147483648
2147483647
Why aren't 2147483648l and 2147483648f cast to the same int value? Can you explain what is going on here, or what concept in Java I need to understand to predict the output of casts like these?
These are examples of the Narrowing Primitive Conversion operation.
In your first example, long to int:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
So your (int) 2147483648l is taking the 64 bits of the long:
00000000 00000000 00000000 00000000 10000000 00000000 00000000 00000000
...and dropping the top 32 bits entirely:
10000000 00000000 00000000 00000000
...and taking the remaining 32 bits as an int. Since the leftmost of those is now a sign bit (long and int are stored as two's complement), and since it happens to be set in your 2147483648l value, you end up with a negative number. Since no other bits are set, in two's complement, that means you have the lowest negative number int can represent: -2147483648.
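You can reproduce that bit-dropping yourself by masking the low 32 bits before narrowing (an illustrative sketch):

long l = 2147483648L;
int viaCast = (int) l;                  // narrowing conversion drops the top 32 bits
int viaMask = (int) (l & 0xFFFF_FFFFL); // same result via an explicit mask
System.out.println(viaCast);            // -2147483648
System.out.println(viaCast == viaMask); // true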
The float to int example follows a more complex rule. The relevant parts for your value are:
...if the floating-point number is not an infinity, the floating-point value is rounded to an integer value V, rounding toward zero using IEEE 754 round-toward-zero mode (§4.2.3).
...[if] the value [is] too large (a positive value of large magnitude or positive infinity), [then] the result of the first step is the largest representable value of type int or long.
(But see the part of the spec linked above for the details.)
So since 2147483648f rounds to 2147483648, and 2147483648 is too large to fit in int, the largest value for int (2147483647) is used instead.
So in the long to int, it's bit fiddling; in the float to int, it's more mathematical.
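A short sketch contrasting the two behaviors side by side:

System.out.println((int) 2147483648L);             // -2147483648 (low bits kept)
System.out.println((int) 2147483648f);             // 2147483647 (clamped to Integer.MAX_VALUE)
System.out.println((int) Float.POSITIVE_INFINITY); // 2147483647 (also clamped)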
In a comment you've asked:
Do you know why both (short) 32768 and (short) 32768f evaluate to -32768? I was expecting the latter to evaluate to 32767.
Excellent question, and that's where my "see the part of the spec linked above for the details" above comes in. :-) (short) 32768f does, in effect, (short)(int)32768f:
In the spec section linked above, under "A narrowing conversion of a floating-point number to an integral type T takes two steps:", it says
In the first step, the floating-point number is converted either to a long, if T is long, or to an int, if T is byte, short, char, or int...
and then later in Step 2's second bullet:
* If T is byte, char, or short, the result of the conversion is the result of a narrowing conversion to type T (§5.1.3) of the result of the first step.
So in step one, 32768f becomes 32768 (an int value), and then of course (short)32768 does the bit-chopping we saw in long => int above, giving us a short value of -32768.
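In code, the two steps can be written out explicitly (a tiny sketch):

System.out.println((short) 32768f);       // -32768
System.out.println((short) (int) 32768f); // -32768: same two steps, spelled out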
Nice! It's wonderful to see the effects of design decisions presented in the way that you have.
2147483648l is a long type and the rule for converting a long that's too big for the int is to apply the wrap-around rule into the destination type. (Under the hood, the significant bits from the source type are simply discarded.)
2147483648f is a float type and the rule for converting a float that's too big for the destination type is to take the largest value possible for the destination type. Reference: Are Java integer-type primitive casts "capped" at the MAX_INT of the casting type?
The good thing about standards is that there are so many to choose from.
long x = (long)(2147483649);
Why is this wrong? Why do I have to use L and F for floats and longs but I can use (byte) for example?
Your problem is technically related to the Java specification; see https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.10
The largest decimal literal of type int is 2147483648 (2^31).
All decimal literals from 0 to 2147483647 may appear anywhere an int
literal may appear. The decimal literal 2147483648 may appear only as
the operand of the unary minus operator - (§15.15.4).
It is a compile-time error if the decimal literal 2147483648 appears
anywhere other than as the operand of the unary minus operator; or if
a decimal literal of type int is larger than 2147483648 (2^31).
Why?
Because, as other people said before, in just 32 bits you can only represent the values from -2147483648 to 2147483647 (2^32 different numbers). A decimal literal without a suffix is of type int, so the compiler must be able to represent it as an int before it can be promoted to long, but it cannot. Indeed, the Java specification says this is a compile-time error.
The largest decimal literal of type int is 2147483648 (2^31).
Just adding an L/l at the end of the literal (2147483649L) specifies that it is a literal of type long, which can hold bigger numbers.
An integer literal is of type long if it is suffixed with an ASCII
letter L or l (ell); otherwise it is of type int (§4.2.1). The suffix
L is preferred, because the letter l (ell) is often hard to
distinguish from the digit 1 (one).
So there are only two types of literal numbers in decimal format (for integer numbers): int and long.
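So a decimal integer literal is either an int or a long (a small sketch):

int i = 2147483647;   // largest valid decimal int literal
long l = 2147483649L; // fine: long literal
// long m = 2147483649; // won't compile: the literal itself is an out-of-range int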
You can, but not with this int. The number you are trying to cast is 80000001 in hex, which exceeds Integer.MAX_VALUE, so it is not a valid decimal int literal.
This works because 2147483647 is 7fffffff in hex - a perfectly acceptable int.
long x = (long)(2147483647);
2147483649 = Integer.MAX_VALUE + 2
Can't fit that into an int variable.
Try long x = 2147483649L; instead.
The int range is defined by Integer.MIN_VALUE and Integer.MAX_VALUE (-2147483648 and 2147483647).
Your number 2147483649 is out of that range, and therefore cannot even be a valid int.
Your compiler should even complain with something like:
java: integer number too large: 2147483649
As per JLS §3.10.1
The largest decimal literal of type int is 2147483648.
Can this statement be considered true, given that Integer.MAX_VALUE is 2147483647?
Please note that the emphasis in the above statement is on "int". If it is argued that the statement is about "decimal literals" in general, then even 2147483649 and beyond should also qualify.
So, if something is of type int, then its largest value has to be 2147483647.
Am I getting it wrong, or should that statement be updated?
Note that there are no negative integer literals, and Integer.MIN_VALUE is -2147483648. So -2147483648 is parsed as "apply unary minus to 2147483648". It would be very bad if 2147483648 were not a valid decimal int literal, because then you couldn't write an int literal with the value Integer.MIN_VALUE directly in your program.
Side note: The JLS defines what is correct. So it is correct by definition. It can be bad, though.
From the same JLS section
The decimal literal 2147483648 may appear only as the operand of the unary minus operator
i.e.
int value = -2147483648;
is valid, but
int value = 2147483648;
is a compile-time error.
Every literal is of a specific literal type (boolean literal, integer literal, floating-point literal, etc.), although it may be assigned to a field/variable of a different type. For example, 2147483647 is a valid integer literal, while 2147999999 is not (2147999999L is valid, but it is a long literal). While the wording is unclear, there appears to be no contradiction of any sort.
Note: Reimeus has the right answer above.
Yep, you are right, the JLS says
The largest decimal literal of type int is 2147483648 (2^31)
but if you try to compile
int j = 2147483648;
you get
Error:(20, 17) java: integer number too large: 2147483648
2^31 is equal to 2147483648, which is 0x80000000, but in 32-bit two's-complement notation this bit pattern actually represents -2147483648.
So, 2^31 cannot be represented in an int.
An int can only represent values from Integer.MIN_VALUE, which is -2^31, to Integer.MAX_VALUE, which is (2^31)-1. And luckily the compiler does not accept integer literals outside of that range.
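Note that the same bit pattern is accepted when written as a hexadecimal (or binary) literal, because those literals only need to fit in 32 bits (a small illustrative snippet):

int i = 0x80000000;    // valid: fits in 32 bits
System.out.println(i); // -2147483648
// int j = 2147483648; // won't compile: decimal int literals may not exceed 2147483647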
There is something puzzling me, and I did not find much information in the VM specs. It's a bit obscure, and it would be nice if someone could explain it to me.
These few lines of code.....
double myTest = Double.MAX_VALUE;
System.out.println("1. float: " + (float)myTest);
System.out.println("2. int: " + (int)myTest);
System.out.println("3. short: " + (short)myTest);
System.out.println("4. byte: " + (byte)myTest);
..... produce this output:
1. float: Infinity
2. int: 2147483647
3. short: -1
4. byte: -1
byte, short and int are 8, 16, 32 bit with two's complement. float and double are 32 and 64 bit IEEE 754 (see here).
From my understanding, the max value of a double implies that all the bits of the mantissa (52 bits) are set to 1. Therefore it's not (very) surprising that a cast to short or to byte returns -1, i.e., all bits set to 1. It seems that the cast keeps the 'tail' of the double so that it fits into an 8-bit byte or a 16-bit short.
What surprises me is the cast to int and, to a lesser extent, the cast to float.
How is it possible to get "2. int: 2147483647", which is 0x7FFFFFFF, the maximal value, while short and byte (3. and 4.) are -1?
The cast to float is also weird. If the 32 bits at the 'tail' of myTest were kept, then shouldn't it generate a NaN?
The JLS spells out the rules in section 5.1.3, Narrowing Primitive Conversion. The rules depend on the target type.
float:
A narrowing primitive conversion from double to float is governed by the IEEE 754 rounding rules (§4.2.4). This conversion can lose precision, but also lose range, resulting in a float zero from a nonzero double and a float infinity from a finite double. A double NaN is converted to a float NaN and a double infinity is converted to the same-signed float infinity.
int and long:
one of the following two cases must be true:
...
The value must be too large (a positive value of large magnitude or positive infinity), and the result of the first step is the largest representable value of type int or long.
byte, char and short:
If the target type is byte, char, or short, the conversion is two-step. First, the double is converted to an int as explained above. Then, the int is converted to the final type as follows:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
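Putting the quoted rules together for the original program, each cast effectively does the following (a sketch of the mechanics, not new behavior):

double d = Double.MAX_VALUE;
System.out.println((float) d);       // Infinity: too large for float, per IEEE 754 rounding
System.out.println((int) d);         // 2147483647: clamped to Integer.MAX_VALUE
System.out.println((short) (int) d); // -1: low 16 bits of 0x7FFFFFFF
System.out.println((byte) (int) d);  // -1: low 8 bits of 0x7FFFFFFF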