Java floating-point number representation as hexadecimal numbers - java

Why does 0x1p3 equal 8.0? Why does 0x1e3 equal 483, whereas 0x1e3d equals 7741? It is confusing since 1e3d equals 1000.0.

0x1e3 and 0x1e3d are hexadecimal integer literals. Note that e and d are hexadecimal digits, not the exponent indicator or double type indicator in this case.
1e3d is a decimal floating-point literal. The e is the exponent indicator, the d says that this is a double rather than a float.
The notation 0x1p3 is a way to express a floating-point literal in hexadecimal, as you can read in section 3.10.2 of the Java Language Specification. It means 1 times 2 to the power 3; the exponent is binary (so, it's 2-to-the-power instead of 10-to-the-power).

0x1e3 is hex for 483, as is 0x1e3d hex for 7741. The e is being read as a hex digit with value 14.
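The four literals from the question can be printed side by side to see which rule applies to each. This is only a small sketch; the class name LiteralKinds is made up for illustration.

```java
public class LiteralKinds {
    public static void main(String[] args) {
        // Hexadecimal floating-point literal: 1 * 2^3 (p introduces a binary exponent)
        System.out.println(0x1p3);   // 8.0

        // Hexadecimal integer literals: e and d are ordinary hex digits here
        System.out.println(0x1e3);   // 483
        System.out.println(0x1e3d);  // 7741

        // Decimal floating-point literal: e is the exponent, d marks a double
        System.out.println(1e3d);    // 1000.0
    }
}
```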

Related

Range for int when using binary literal

Consider the following code:
int x1 = 0b1111_1111_1111_1111_1111_1111_1111_111; // binary for 2147483647
System.out.println(x1); // prints 2147483647
int x2 = 2147483648; // The literal 2147483648 of type int is out of range
// x3 is binary representation for 2147483648
int x3 = 0b1000_0000_0000_0000_0000_0000_0000_0000; // accepted without any compile errors
System.out.println(x3); // prints -2147483648
// x4 is binary representation for 4294967295
int x4 = 0b1111_1111_1111_1111_1111_1111_1111_1111; // long value: 4294967295
System.out.println(x4); // prints -1
int x5 = 0b1111_1111_1111_1111_1111_1111_1111_1111_1; // The literal 0b1111_1111_1111_1111_1111_1111_1111_1111_1 of type int is out of range
Integer.MAX_VALUE is 2147483647; the compiler accepts any int literal in that range and reports an error when the value goes beyond 2147483647. However, x3 (int: -2147483648, long: 2147483648) and x4 (int: -1, long: 4294967295) in the above snippet are accepted without any errors, while x5 is rejected.
First question: Why the compiler did not complain about the range of x3?
Second question: If the value of x3 and x4 is accepted without any errors, why does it throw errors in case of x5?
TL;DR
Why the compiler did not complain about the range of x3?
Because it fits in 32 bits, and the Java Language Specification (JLS) says that literal is valid when it does.
If the value of x3 and x4 is accepted without any errors, why does it throw errors in case of x5?
Because it doesn't fit in 32 bits, given that it is 33 bits long.
Comment on Code Style
You should insert the _ separators in a binary literal where the nibble boundaries are, so instead of 0b1111_1111_1111_1111_1111_1111_1111_111 it should be 0b111_1111_1111_1111_1111_1111_1111_1111.
That then correctly represents that it is the first nibble that's missing a digit. It also makes it directly comparable to the hex representation, e.g. 0x7F_FF_FF_FF.
Your way of inserting _ is very confusing.
Long answer
In Java, numbers formatted using Integer.toBinaryString(int i), Integer.toHexString(int i), and Integer.toOctalString(int i) are formatted as unsigned numbers.
This fits the Java integer literal syntax as defined by JLS 3.10.1. Integer Literals, which states:
It is a compile-time error if a hexadecimal, octal, or binary int literal does not fit in 32 bits.
Since 0b1000_0000_0000_0000_0000_0000_0000_0000 and 0b1111_1111_1111_1111_1111_1111_1111_1111, as well as their hex counterparts 0x80_00_00_00 and 0xFF_FF_FF_FF, all fit in 32 bits, they are valid int literals.
If you print them using the methods above, the output matches the literal. For example, the three literals below all have the int value -1 when printed as a (signed) decimal, yet they print as their unsigned bit patterns:
System.out.println(Integer.toBinaryString(0b1111_1111_1111_1111_1111_1111_1111_1111));
System.out.println(Integer.toOctalString(037_777_777_777));
System.out.println(Integer.toHexString(0xFF_FF_FF_FF));
11111111111111111111111111111111
37777777777
ffffffff
The first bit for both x3 and x4 is 1, hence they are treated as negative numbers. They are both declared as 32 bit numbers, so they fit an int data type and the compiler doesn't complain. x5 gives error because you are attempting to assign 33 bits to a 32 bit data type, so it overflows.
The primitive int is a 32-bit number, with the leading bit being the sign of the integer, so the decimal literal 2147483648 is out of range for int and is rejected at compile time. To fix your problem, use long (with an L suffix on the literal, e.g. 2147483648L) or double for higher values.
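A short sketch of the points made in the answers above: the 32-bit patterns are valid int literals with negative values, their unsigned reading can be recovered with the Integer.toUnsigned* helpers, and anything wider needs a long literal. The class name IntLiteralRange is only illustrative.

```java
public class IntLiteralRange {
    public static void main(String[] args) {
        int x3 = 0b1000_0000_0000_0000_0000_0000_0000_0000; // 32 bits, sign bit set
        int x4 = 0b1111_1111_1111_1111_1111_1111_1111_1111; // 32 bits, all ones

        System.out.println(x3);                            // -2147483648
        System.out.println(Integer.toUnsignedLong(x3));    // 2147483648
        System.out.println(x4);                            // -1
        System.out.println(Integer.toUnsignedString(x4));  // 4294967295

        // Values that need more than 32 bits must be long literals (L suffix).
        long big = 2147483648L;
        long x5 = 0b1_1111_1111_1111_1111_1111_1111_1111_1111L; // 33 bits
        System.out.println(big);                           // 2147483648
        System.out.println(x5);                            // 8589934591
    }
}
```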
The Java Language Specification has the answer.
4.2. Primitive Types and Values
The integral types are byte, short, int, and long, whose values are 8-bit, 16-bit, 32-bit and 64-bit signed two's-complement integers
So int is a 32-bit two's-complement. (Also read the great answer about: What is “2's Complement”?)
And in 3.10.1. Integer Literals it shows:
The largest positive hexadecimal, octal, and binary literals of type int - each of which represents the decimal value 2147483647 (2^31-1) - are respectively:
0x7fff_ffff,
0177_7777_7777, and
0b0111_1111_1111_1111_1111_1111_1111_1111
The most negative hexadecimal, octal, and binary literals of type int - each of which represents the decimal value -2147483648 (-2^31) - are respectively:
0x8000_0000,
0200_0000_0000, and
0b1000_0000_0000_0000_0000_0000_0000_0000
The following hexadecimal, octal, and binary literals represent the decimal value -1:
0xffff_ffff,
0377_7777_7777, and
0b1111_1111_1111_1111_1111_1111_1111_1111
It is a compile-time error if a hexadecimal, octal, or binary int literal does not fit in 32 bits.
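The boundary literals quoted from the JLS can be checked directly against the Integer constants. A minimal sketch, with an illustrative class name:

```java
public class JlsBoundaryLiterals {
    public static void main(String[] args) {
        // Largest positive int, written in hex, octal and binary
        System.out.println(0x7fff_ffff == Integer.MAX_VALUE);                              // true
        System.out.println(0177_7777_7777 == Integer.MAX_VALUE);                           // true
        System.out.println(0b0111_1111_1111_1111_1111_1111_1111_1111 == Integer.MAX_VALUE); // true

        // Most negative int
        System.out.println(0x8000_0000 == Integer.MIN_VALUE);                              // true
        System.out.println(0200_0000_0000 == Integer.MIN_VALUE);                           // true

        // All 32 bits set is -1 in two's complement
        System.out.println(0xffff_ffff == -1);                                             // true
        System.out.println(0377_7777_7777 == -1);                                          // true
    }
}
```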

if 0.1 has no binary representation, why I get 0.1

When I run:
System.out.println(1f - 0.9f);
I get:
0.100000024
This is because 0.1 has no representation in binary.
Then why when I print this:
System.out.println(0.1f);
I get this:
0.1
0.1 can be represented better in floating point than 0.9. Loosely speaking that's because 0.1 is smaller and closer to its nearest dyadic rational.
So the error when subtracting from 1.0 is larger.
Hence the two values differ.
The embedded formatting heuristics in println do a better job with the 0.1.
The value of 0.1f
Java’s float uses IEEE-754 basic 32-bit binary floating-point. In binary
floating-point, every representable number is an integer multiple of some
power of two. (This includes non-negative powers 1, 2, 4, 8, 16,…, and it
includes negative powers ½, ¼, ⅛, 1/16, 1/32,…)
For numbers from 1 to 2, the representable numbers are multiples of
2^-23, which is 0.00000011920928955078125:
1.00000000000000000000000
1.00000011920928955078125
1.00000023841857910156250
1.00000035762786865234375
1.00000047683715820312500
…
For numbers near 0.1, the representable numbers are multiples of 2^-27, which is 0.000000007450580596923828125:
…
0.099999979138374328613281250 (a)
0.099999986588954925537109375 (b)
0.099999994039535522460937500 (c)
0.100000001490116119384765625 (d)
0.100000008940696716308593750 (e)
0.100000016391277313232421875 (f)
…
(I labeled the numbers (a) to (f) to refer to them in text below.)
As we can see, the closest of these to 0.1 is (d), 0.100000001490116119384765625.
Thus, when 0.1 appears in source code, it is converted to this value,
0.100000001490116119384765625.
This is a general rule—any numeral in source code is converted to the nearest representable number.
(Note that 0.1 is not “represented by” 0.100000001490116119384765625, and
0.100000001490116119384765625 does not “represent” 0.1. The float
0.100000001490116119384765625 is exactly that. The 0.1f in source text was
converted to 0.100000001490116119384765625 and is now just that value.)
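The exact values in the list can be reproduced with BigDecimal, because widening a float to double is exact and the BigDecimal(double) constructor converts the binary value exactly. This is a sketch; the class name ExactFloatValue is made up.

```java
import java.math.BigDecimal;

public class ExactFloatValue {
    public static void main(String[] args) {
        float d = 0.1f;                 // value (d) in the list
        float c = Math.nextDown(d);     // value (c), the next float below it

        // Exact decimal expansions of the stored binary values
        System.out.println(new BigDecimal(d)); // 0.100000001490116119384765625
        System.out.println(new BigDecimal(c)); // 0.0999999940395355224609375

        // Spacing between neighbouring floats near 0.1 is 2^-27
        System.out.println(Math.ulp(d) == 0x1p-27f); // true
    }
}
```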
If 0.1f is not 0.1, why is “0.1” printed?
Java’s default formatting for floating-point numbers uses the fewest
significant decimal digits needed to distinguish the number from nearby
representable numbers.
The rule for Java SE 10 can be found in the documentation for java.lang.Float, in
the toString(float d) section. I quote the full passage in footnote 1 below. The
critical part says:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type float. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument f. Then f must be the float value nearest to x; or, if two float values are equally close to x, then f must be one of them and the least significant bit of the significand of f must be 0.
Let us see how this applies while formatting 0.100000001490116119384765625 and
0.099999994039535522460937500.
For 0.100000001490116119384765625, which is (d), we will first consider formatting one digit
after the decimal point: “0.1”. This represents the number 0.1, of course.
Now, we ask: Is this good enough? If we take 0.1 and ask which number in the
list of nearby numbers above is closest, what is the answer? The nearest number
in the list is (d), 0.100000001490116119384765625. That is the number we are
formatting, so we are done, and the result is “0.1”. This does not mean the float is 0.1, just that, when it is converted to a string with default options, the result is the string “0.1”.
Now consider 0.099999994039535522460937500, which is (c). Again, if we consider using just
one digit, the number rounds to 0.1. When we ask which number in the list is
closest to that, the answer is (d), 0.100000001490116119384765625. That is not the
number we are formatting, so we need more digits. If we consider two digits,
rounding would give us 0.10, and that clearly is also not enough. Considering
more and more digits gives us 0.100, 0.1000, and so on, until we get to eight
digits. With eight digits, 0.099999994039535522460937500 rounds to
0.09999999. Now, when we check the list, we see the nearest number is
(b), 0.099999986588954925537109375. (Adding about 0.0000000035 to that produces
0.09999999, whereas the number we are formatting is about 0.0000000040 away,
which is farther.) So we try nine digits, which gives us 0.099999994. Finally,
the closest number in the list is (c), 0.099999994039535522460937500, which is the
number we are formatting, so we are done, and the result is “0.099999994”.
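The walk-through above can be checked with Float.toString and Float.parseFloat. A small sketch (class name ShortestFloatString is illustrative): each printed string is short, yet parsing it gives back exactly the float that was formatted.

```java
public class ShortestFloatString {
    public static void main(String[] args) {
        float d = 0.1f;              // value (d)
        float c = Math.nextDown(d);  // value (c)

        System.out.println(Float.toString(d)); // 0.1
        System.out.println(Float.toString(c)); // 0.099999994
        System.out.println(1f - 0.9f);         // 0.100000024

        // The short strings are enough to identify the original floats
        System.out.println(Float.parseFloat("0.1") == d);         // true
        System.out.println(Float.parseFloat("0.099999994") == c); // true
    }
}
```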
Footnote
1 The documentation for toString(float d) says:
Returns a string representation of the float argument. All characters mentioned below are ASCII characters.
If the argument is NaN, the result is the string "NaN".
Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative, the first character of the result is '-' ('\u002D'); if the sign is positive, no sign character appears in the result. As for the magnitude m:
If m is infinity, it is represented by the characters "Infinity"; thus, positive infinity produces the result "Infinity" and negative infinity produces the result "-Infinity".
If m is zero, it is represented by the characters "0.0"; thus, negative zero produces the result "-0.0" and positive zero produces the result "0.0".
If m is greater than or equal to 10^-3 but less than 10^7, then it is represented as the integer part of m, in decimal form with no leading zeroes, followed by '.' ('\u002E'), followed by one or more decimal digits representing the fractional part of m.
If m is less than 10^-3 or greater than or equal to 10^7, then it is represented in so-called "computerized scientific notation." Let n be the unique integer such that 10^n ≤ m < 10^(n+1); then let a be the mathematically exact quotient of m and 10^n so that 1 ≤ a < 10. The magnitude is then represented as the integer part of a, as a single decimal digit, followed by '.' ('\u002E'), followed by decimal digits representing the fractional part of a, followed by the letter 'E' ('\u0045'), followed by a representation of n as a decimal integer, as produced by the method Integer.toString(int).
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type float. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument f. Then f must be the float value nearest to x; or, if two float values are equally close to x, then f must be one of them and the least significant bit of the significand of f must be 0.
System.out.println, and most methods of converting floating-point numbers to strings, operate using the following rule: they use exactly as many digits as are necessary so that the true value of the double is the closest representable number to the printed value.
That is, it only prints out the digits 0.1 because the true value of the double, 0.1000000000000000055511151231257827021181583404541015625, is the closest double to the displayed value.
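The same holds for double. A brief sketch (class name DoubleTenth is illustrative) that prints the short default form next to the exact stored value:

```java
import java.math.BigDecimal;

public class DoubleTenth {
    public static void main(String[] args) {
        double x = 0.1;
        // Default formatting: shortest string that still identifies this double
        System.out.println(x);                 // 0.1
        // Exact value of the stored double
        System.out.println(new BigDecimal(x)); // 0.1000000000000000055511151231257827021181583404541015625
    }
}
```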

Numeric literals

I was reading a book about Java and I found the following points unclear, please help me:
For integer literal expressed in any base other than base 10 (0b, 0, 0x) can we use the L suffix that stands for Long?
For floating point can we use any other base other than decimal? If yes can we specify float or double using F or D for other bases other than 10?
If yes with other bases than 10 could we use scientific notation or only decimal point is allowed?
1) Yes, it's possible for hexadecimal, octal and binary too. See jls-3.10.1
2) Yes, you can use hexadecimal notation, but you are restricted to binary exponents and specifying the exponent is required. See jls-3.10.2
Examples:
0xFF.Ap0d
0xFF.1p0f
0xFF.Ap1d
0xFF.Ap-1f
0xFF.Ap-1
0x.1p16
If you print these literals using System.out.println, you get:
255.625
255.0625
511.25
127.8125
127.8125
4096.0
The meaning of the binary exponent is as follows:
The value in front of the p or P is multiplied by 2^z where z is the integer after the p (or P). The integer is in decimal format. E.g. 0xFF.1Ap0101d stands for 255.1015625 * 2^101.
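Both answers can be tried out in a few lines: the L suffix works on hexadecimal, octal and binary integer literals, and hexadecimal floating-point literals take an optional f or d suffix after the mandatory p exponent. This is a sketch with made-up names, not code from the book in question.

```java
public class OtherBaseLiterals {
    public static void main(String[] args) {
        // 1) The L suffix is allowed in every integer base
        long hex = 0xFFL;
        long oct = 0377L;
        long bin = 0b1111_1111L;
        System.out.println(hex + " " + oct + " " + bin); // 255 255 255

        // 2) Hexadecimal floating-point literals: the p exponent is required,
        //    the f/d suffix is optional (no suffix means double)
        float f = 0x1.8p1f;   // 1.5 * 2^1
        double d = 0x1.8p1;   // same value as a double
        System.out.println(f);       // 3.0
        System.out.println(d);       // 3.0
        System.out.println(0x1p-1);  // 0.5

        // double bad = 0xFF.A;  // does not compile: the binary exponent is missing
    }
}
```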

Unable to Understand o/p of Float.toHexString() method in java

I want to know how to get hexadecimal representation of float number.
I tried following code
System.out.println(Float.toHexString(56));
Got o/p
0x1.cp5
I really do not understand. If I use Integer method the o/p would be 38, which I can understand...but how o/p comes 0x1.cp5. Could any one tell me or point to some good tutorial..thanks in advance.
Just refer to java.lang.Float javadocs:
http://docs.oracle.com/javase/7/docs/api/java/lang/Float.html#toHexString%28float%29
If m is a float value with a normalized representation, substrings are used to represent the significand and exponent fields. The
significand is represented by the characters "0x1." followed by a
lowercase hexadecimal representation of the rest of the significand as
a fraction. Trailing zeros in the hexadecimal representation are
removed unless all the digits are zero, in which case a single zero is
used. Next, the exponent is represented by "p" followed by a decimal
string of the unbiased exponent as if produced by a call to
Integer.toString on the exponent value.
If m is a float value with a subnormal representation, the significand is represented by the characters "0x0." followed by a
hexadecimal representation of the rest of the significand as a
fraction. Trailing zeros in the hexadecimal representation are
removed. Next, the exponent is represented by "p-126". Note that there
must be at least one nonzero digit in a subnormal significand.
Use this tool to see what happens to your float when represented in IEEE 754 format.
56 in binary is 111000, which when normalized becomes 1.11000 * 2^5.
As the javadoc says, 0x1. stands for the leading 1 of the significand; the remaining fraction bits, padded to a whole hex digit, are '1100' in binary, which is 'c' in hex. And the exponent part is 5.
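The pieces of 0x1.cp5 can be pulled out of the raw bits to confirm the explanation. A minimal sketch, assuming a normalized float; the class name HexStringOf56 is illustrative.

```java
public class HexStringOf56 {
    public static void main(String[] args) {
        float value = 56f;
        int bits = Float.floatToIntBits(value);

        int exponent = ((bits >>> 23) & 0xFF) - 127; // unbiased exponent
        int fraction = bits & 0x7FFFFF;              // 23 fraction bits after the implicit 1

        System.out.println(Float.toHexString(value));      // 0x1.cp5
        System.out.println(exponent);                      // 5
        System.out.println(Integer.toHexString(fraction)); // 600000 (top hex digit 6 = 110, i.e. fraction bits 1100...)

        // The string is itself a valid Java literal for the same value
        System.out.println(0x1.cp5f == value);             // true
    }
}
```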

Given IEEE binary representation of a real how to get its true binary representation in Java?

I am just wonder how to use bit operations to achieve the goal: given an IEEE binary representation of a real, for example, 40AC0000 (5.375 in decimal), how to get its true binary representation (expecting 101.011 for the example) in Java?
This is kind of a tough question, especially if you don't already know about IEEE floats.
Since there are 4 bytes in your number, it's single precision. This means it has a structure of 1 sign bit, 8 Exponent bits and 23 Mantissa bits. The sign bit is obvious. The meaning of the exponent bits affects how you interpret the Mantissa bits. First check the 8 exponent bits. If they are all 0, you have a denormalized number; if they are all 1, you have an infinity value or a NaN; otherwise, it is normalized.
In the normalized case, take the exponent bits, interpret them as an 8-bit unsigned number and subtract 127_10 (or 0x7F) from it. This is your exponent. Then take the remaining Mantissa bits and add a leading 1. Your result is then (-1)^[Sign] * 1.[Mantissa] * 2^[Exponent].
If it is a denormalized number, your exponent is -126 (1-127). In this case, interpret as (-1)^[Sign] * 0.[Mantissa] * 2^[Exponent].
In the remaining case (exponent bits all 1), if the Mantissa is all 0s, your number is (-1)^[Sign] * infinity. Otherwise, your float is a NaN.
Hope that helps.
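The steps above can be sketched in Java as follows. The class FloatBinary and the method toBinaryString are hypothetical names chosen for this example, and BigInteger is used only so that very large or very small exponents do not overflow; it is one possible way to do it, not the only one.

```java
import java.math.BigInteger;

public class FloatBinary {
    /** Converts raw IEEE 754 single-precision bits to a plain binary string such as "101.011". */
    static String toBinaryString(int bits) {
        int sign = bits >>> 31;
        int expField = (bits >>> 23) & 0xFF;
        int fraction = bits & 0x7FFFFF;

        if (expField == 0xFF) {                       // all exponent bits set
            if (fraction != 0) return "NaN";
            return sign == 0 ? "Infinity" : "-Infinity";
        }
        // Normalized: implicit leading 1. Denormalized: leading 0 and exponent -126.
        int significand = (expField == 0) ? fraction : (fraction | 0x800000);
        int exponent = (expField == 0) ? -126 : expField - 127;

        // value = significand * 2^(exponent - 23), so `shift` bits lie after the binary point
        int shift = 23 - exponent;
        BigInteger sig = BigInteger.valueOf(significand);
        String intPart;
        String fracPart = "";
        if (shift <= 0) {
            intPart = sig.shiftLeft(-shift).toString(2);
        } else {
            intPart = sig.shiftRight(shift).toString(2);
            BigInteger mask = BigInteger.ONE.shiftLeft(shift).subtract(BigInteger.ONE);
            String digits = sig.and(mask).toString(2);
            StringBuilder sb = new StringBuilder();
            for (int i = digits.length(); i < shift; i++) sb.append('0'); // left-pad to `shift` digits
            sb.append(digits);
            int end = sb.length();
            while (end > 0 && sb.charAt(end - 1) == '0') end--;           // drop trailing zeros
            fracPart = sb.substring(0, end);
        }
        String magnitude = fracPart.isEmpty() ? intPart : intPart + "." + fracPart;
        return sign == 0 ? magnitude : "-" + magnitude;
    }

    public static void main(String[] args) {
        System.out.println(toBinaryString(0x40AC0000));                  // 101.011
        System.out.println(toBinaryString(Float.floatToIntBits(56f)));   // 111000
    }
}
```

For the example in the question, the exponent field is 129 (so the exponent is 2) and the significand is 1.01011, which gives 101.011.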
Do you mean Float.floatToIntBits() and Float.intBitsToFloat() ?
What do you mean by "true binary representation"? There is nothing "untrue" about the hex representation (40AC0000).
You can convert between different radixes (hex, binary, decimal) using the methods on Integer:
Float.floatToIntBits(5.375f);
// = 1085014016
Integer.toString(1085014016, 16);
// = "40ac0000"
Integer.valueOf("40AC0000", 16);
// = 1085014016
Integer.toString(1085014016, 2);
// = "1000000101011000000000000000000" (31 digits; the leading zero sign bit is dropped)
