How do I convert a decimal fraction to binary in Java?

I need to convert 0.5 in base 10 to base 2 (0.1).
I have tried using
Double.doubleToRawLongBits(0.5)
and it returns 4602678819172646912 which I guess is in hex, but it does not make sense to me.

No. 4602678819172646912 is in dec, hex is 0x3fe0000000000000. To dismantle that:
3 | F | E | 0 ...
0 0 1 1 1 1 1 1 1 1 1 0 0 ...
s| exponent | mantissa
s is the sign bit, exponent is the 11-bit exponent field biased by 1023 (the field holds 0x3FE = 1022, so the actual exponent is 1022 - 1023 = -1), and mantissa is the xxx part of the number 1.xxx (the leading 1. is implied). Therefore, this number is 1.000...*2^-1, which is 0.5.
Note that this describes the "normal" numbers only, so no zeros, denormals, NaNs or infinities.
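To make the field layout concrete, here is a minimal sketch that pulls the three fields out of the raw bits. The class and method names (DoubleFields, sign, exponent, fraction) are made up for illustration, and like the explanation above it only gives meaningful exponents for normal numbers.

```java
public class DoubleFields {
    /** Sign bit: 0 for positive, 1 for negative. */
    static int sign(double d) {
        return (int) (Double.doubleToRawLongBits(d) >>> 63);
    }

    /** Unbiased exponent: the 11-bit field minus the bias of 1023 (normal numbers only). */
    static int exponent(double d) {
        long bits = Double.doubleToRawLongBits(d);
        return (int) ((bits >>> 52) & 0x7FF) - 1023;
    }

    /** The 52 stored fraction bits, i.e. the xxx part of 1.xxx. */
    static long fraction(double d) {
        return Double.doubleToRawLongBits(d) & ((1L << 52) - 1);
    }

    public static void main(String[] args) {
        double d = 0.5;
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(d))); // 3fe0000000000000
        System.out.println(sign(d) + " " + exponent(d) + " " + fraction(d)); // 0 -1 0
    }
}
```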

Multiply your number by 2^n, convert it to a BigInteger, convert that to a binary String, and add a binary point at position n (counting from right to left).
Example (quick & dirty):
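import java.math.BigDecimal;
import java.math.BigInteger;

private static String convert(double number) {
    int n = 10; // number of binary digits after the point; constant?
    BigDecimal bd = new BigDecimal(number);
    BigDecimal mult = new BigDecimal(2).pow(n);
    bd = bd.multiply(mult); // shift the binary point n places to the right
    BigInteger bi = bd.toBigInteger(); // drops any bits beyond n places
    StringBuilder str = new StringBuilder(bi.toString(2));
    while (str.length() < n+1) { // +1 for leading zero
        str.insert(0, "0");
    }
    str.insert(str.length()-n, "."); // put the point back
    return str.toString();
}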
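For a fraction in [0, 1), the schoolbook method (double the value, peel off the integer bit, repeat) is another option. The sketch below is a swapped-in alternative, not the method of the answer above; the class name FractionToBinary and the maxBits parameter are made up for illustration. Doubling and subtracting 1 are exact operations on a double, so the digits produced are the exact binary digits of the stored value.

```java
public class FractionToBinary {
    /** Returns up to maxBits binary digits of a fraction in [0, 1). */
    static String toBinary(double fraction, int maxBits) {
        if (!(fraction >= 0 && fraction < 1)) { // also rejects NaN
            throw new IllegalArgumentException("expected a value in [0, 1)");
        }
        if (fraction == 0) {
            return "0.0";
        }
        StringBuilder sb = new StringBuilder("0.");
        for (int i = 0; i < maxBits && fraction > 0; i++) {
            fraction *= 2;           // shift the next binary digit in front of the point
            if (fraction >= 1) {
                sb.append('1');
                fraction -= 1;       // remove the digit just emitted
            } else {
                sb.append('0');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(0.5, 16));   // 0.1
        System.out.println(toBinary(0.625, 16)); // 0.101
        System.out.println(toBinary(0.1, 8));    // 0.00011001 (truncated, 0.1 has no finite binary form)
    }
}
```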

This is decimal for 0x3FE0_0000_0000_0000. The mantissa is the list of zeros after 3FE (which codes sign and exponent). This is what you are looking for, given that the leading 1. before the zeros is implicit.

Do you want to convert the decimal string to floating-point binary or to a binary string? If the former, just use valueOf(); if the latter, use valueOf() followed by toString() or printf().

0.1 in base 2 is the correct positional notation for 0.5, but it is NOT how Java stores 0.5.
Java represents 0.5 using IEEE 754, as specified in the Java Language Specification. BigInteger.valueOf(Double.doubleToRawLongBits(0.5)).toByteArray() will give you a byte-by-byte view of 0.5 as Java represents it internally.

Related

Java Understanding Math.getExponent(Double)

Double dble = new Double("2.2737367544323201e-13");
int exponent = Math.getExponent(dble);
I have the above code and exponent has value of '-43'. I'm not sure how the exponent is '-43', when the passed double value contains '-13'. Could someone shed some light into this API?
Math.getExponent() returns the exponent of the binary representation of the number. In your example -13 is the exponent of the decimal representation, and -43 the exponent of the binary representation.
For example,
System.out.println (Math.getExponent (1024));
prints
10
since
1024 = 2 ^ 10
so the exponent is 10.
System.out.println (Math.getExponent (1.0/8192));
will print
-13
since
1.0/8192 = 2 ^ (-13)
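As a rough illustration of the binary versus decimal exponent, the sketch below splits a double into a significand in [1, 2) and the power of two reported by Math.getExponent. The class name BinaryExponentDemo and the helper significand are made up for this example, and the helper only handles normal numbers.

```java
public class BinaryExponentDemo {
    /** Significand in [1, 2) such that d = significand * 2^getExponent(d). */
    static double significand(double d) {
        return Math.scalb(d, -Math.getExponent(d));
    }

    public static void main(String[] args) {
        double d = 2.2737367544323201e-13;
        int e = Math.getExponent(d);
        double s = significand(d);
        System.out.println(e);                      // -43
        System.out.println(s);                      // a value just below 2.0
        System.out.println(Math.scalb(s, e) == d);  // true
    }
}
```

The value is just under 2^-42, so it is written as slightly less than 2 times 2^-43, which is why the binary exponent is -43 while the decimal exponent is -13.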

What are the Java primitive data type modifiers?

Alright, I've been programming in Java for the better part of three years, now, and consider myself very experienced. However, while looking over the Java SE source code, I ran into something I didn't expect:
in class Double:
public static final double MIN_NORMAL = 0x1.0p-1022; // 2.2250738585072014E-308
public static final double MIN_VALUE = 0x0.0000000000001P-1022; // 4.9e-324
I did not expect this and can't find out what it means. If you don't know, I'm referring to the p and P that are after these numbers, before the subtraction operator. I know you can use suffixes to force a number to be a double, long, float, etc., but I've never encountered a p or P. I checked the Java API, but it doesn't mention it. Is there a complete list of Java primitive number literal modifiers somewhere? Does anyone know them all?
For reference, below are the ones I've used or encountered, with the ones whose purposes elude me in bold with question marks (# represents any arbitrary number within respective limits):
Suffixes:
# = 32-bit integer int
#L = 64-bit integer long
#l = another 64-bit integer l?
#f = 32-bit floating-point float
#F = another 32-bit floating-point float?
#d = 64-bit floating-point double
#D = another 64-bit floating-point double?
#e# = scientific notation
#E# = another scientific notation?
#p = ?
#P = ?
Any more?
Prefixes:
0b# = binary (base 2) literal
0B# = another binary (base 2) literal?
0# = octal (base 8) literal
# = decimal (base 10) literal
0x# = hexadecimal (base 16) literal
0X# = another hexadecimal (base 16) literal?
Any more?
Other (are there suffixes or prefixes for these?):
(byte)# = 8-bit integer byte
(short)# = 16-bit integer short
(char)# = 16-bit character char
P is the binary exponent indicator. It does not matter whether it is capitalized.
According to the Javadoc for toHexString (which we know is the format being used because the literal begins with 0x):
public static String toHexString(double d) Returns a hexadecimal string representation of the double argument. All characters mentioned below are ASCII characters. If the argument is NaN, the result is the string "NaN". Otherwise, the result is a string that represents the sign and magnitude of the argument. If the sign is negative, the first character of the result is '-' ('\u002D'); if the sign is positive, no sign character appears in the result. As for the magnitude m:
If m is infinity, it is represented by the string "Infinity"; thus, positive infinity produces the result "Infinity" and negative
infinity produces the result "-Infinity".
If m is zero, it is represented by the string "0x0.0p0"; thus, negative zero produces the result "-0x0.0p0" and positive zero produces the result "0x0.0p0".
If m is a double value with a normalized representation, substrings are used to represent the significand and exponent fields. The significand is represented by the characters "0x1." followed by a lowercase hexadecimal representation of the rest of the significand as a fraction. Trailing zeros in the hexadecimal representation are removed unless all the digits are zero, in which case a single zero is used. Next, the exponent is represented by "p" followed by a decimal string of the unbiased exponent as if produced by a call to Integer.toString on the exponent value.
If m is a double value with a subnormal representation, the significand is represented by the characters "0x0." followed by a hexadecimal representation of the rest of the significand as a fraction. Trailing zeros in the hexadecimal representation are removed. Next, the exponent is represented by "p-1022". Note that there must be at least one nonzero digit in a subnormal significand.
According to the JLS, the following pieces of grammar are accepted:
3.10.1. Integer Literals
IntegerTypeSuffix:
l
L
OctalNumeral:
0 OctalDigits
0 Underscores OctalDigits
HexNumeral:
0 x HexDigits
0 X HexDigits
BinaryNumeral:
0 b BinaryDigits
0 B BinaryDigits
3.10.2. Floating-Point Literals
ExponentIndicator: one of
e
E
FloatTypeSuffix: one of
f
F
d
D
HexSignificand:
HexNumeral
HexNumeral .
0 x HexDigits_opt . HexDigits
0 X HexDigits_opt . HexDigits
BinaryExponentIndicator: one of
p
P
No other single character literals are specified for those purposes.
All of the legal ways to declare a literal are defined in the JLS.
p or P is the binary exponent of a number.
l or L defines a long.
f or F defines a float.
d or D defines a double.
0B or 0b defines a binary literal.
0x or 0X defines a hexadecimal literal.
e or E is the decimal exponent indicator (a power of 10). Since e is also a valid hexadecimal digit, hexadecimal floating-point literals use p or P instead, and that exponent is a power of 2.
P or p is a BinaryExponentIndicator. See the Java language specification.
See http://docs.oracle.com/javase/specs/jls/se5.0/html/lexical.html#3.10.2
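A small sketch of hexadecimal floating-point literals in use may help. The class name HexFloatDemo and the constants HALF and TEN are made up for illustration; the literals and library calls are standard Java.

```java
public class HexFloatDemo {
    static final double HALF = 0x1.0p-1; // 1.0 * 2^-1 = 0.5
    static final double TEN = 0x1.4p3;   // 1.25 * 2^3 = 10.0 (hex .4 is 4/16)

    public static void main(String[] args) {
        System.out.println(HALF);                              // 0.5
        System.out.println(TEN);                               // 10.0
        System.out.println(Double.toHexString(0.5));           // 0x1.0p-1
        System.out.println(Double.parseDouble("0x1.4p3"));     // 10.0
        System.out.println(Double.MIN_NORMAL == 0x1.0p-1022);  // true
    }
}
```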

Trying to convert a double precision number to decimal

I'm trying to convert a double precision number to decimal.
For example, the number 22 in double precision, as bytes, is:
[0] 0
[1] 0
[2] 0
[3] 0
[4] 0
[5] 0
[6] 54
[7] 64
now I try to convert these values again to 22 :
ByteBuffer buffer = ByteBuffer.wrap(data);
long l = buffer.getLong();
long b = Double.doubleToLongBits(l);
but the result is something totally wrong :4.6688606E18
what's this number ? please help , I'm totally confused!
according to IEEE Standard 754 for double precision:
Any value stored as a double requires 64 bits, formatted as shown in the table below:
63
Sign (0 = positive, 1 = negative)
62 to 52
Exponent, biased by 1023
51 to 0
Fraction f of the number 1.f
now how should I convert double precision to numbers to get 22 again ? all of these answers are wrong
I'm not sure exactly what sort of conversion you're trying to do (when you say "convert to decimal", what is your desired output format/class?)
However, my first thought reading the title was that a BigDecimal would be a valid representation. So the first approach would be to do something like the following:
double d = ...; // your input number
BigDecimal b = new BigDecimal(d);
That said, if you want to convert to decimal then it's presumably because there are floating-point/rounding issues with the value of d, which will still be present in the BigDecimal representation as it's being constructed based on d.
The best approach to get around this is to use BigDecimals from the get-go, using their String constructor - so there won't be any instances of floating-point rounding. If this isn't an option for whatever reason, you can convert the double to a string in such a way that it will account for many floating point rounding issues:
String strRep = Double.toString(d);
BigDecimal b = new BigDecimal(strRep);
You can write a simple test (with JUnit):
double initial = 22.0;
long bits = Double.doubleToLongBits(initial);
double converted = Double.longBitsToDouble(bits);
assertEquals(Double.valueOf(initial), Double.valueOf(converted));
If this works, check that you have the correct byte representation for 22 (the correct representation will be in the bits variable).
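For the byte array in the question, a minimal sketch of the decoding could look like the following. It assumes the bytes are in little-endian order (lowest-order byte at index 0), which matches the listing since 64 = 0x40 sits at index 7; the class name BytesToDouble and the helper decodeLittleEndian are made up for illustration. Note that Double.doubleToLongBits(l) in the question goes the wrong way: it treats the long as a number, converts it to a double, and returns that double's bits.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BytesToDouble {
    /** Interprets 8 little-endian bytes as an IEEE 754 double. */
    static double decodeLittleEndian(byte[] data) {
        return ByteBuffer.wrap(data)
                .order(ByteOrder.LITTLE_ENDIAN) // ByteBuffer defaults to big-endian
                .getDouble();
    }

    public static void main(String[] args) {
        byte[] data = {0, 0, 0, 0, 0, 0, 54, 64};
        System.out.println(decodeLittleEndian(data)); // 22.0

        // Equivalent two-step form: read the raw bits, then reinterpret them.
        long bits = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getLong();
        System.out.println(Double.longBitsToDouble(bits)); // 22.0
    }
}
```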

Why is Double.parseDouble turning 9999999999999999 into 10000000000000000?

why is the Double.parseDouble making 9999999999999999 to 10000000000000000 ?
For Example :
Double d =Double.parseDouble("9999999999999999");
String b= new DecimalFormat("#.##").format(d);
System.out.println(b);
is printing
10000000000000000
when it should show 9999999999999999 or 9999999999999999.00
Any sort of help is greatly appreciated.
The number 9999999999999999 is just above the precision limit of double-precision floating-point. In other words, the 53-bit mantissa is not able to hold 9999999999999999.
So the result is that it is rounded to the nearest double-precision value - which is 10000000000000000.
9999999999999999 = 0x2386f26fc0ffff // 54 significant bits needed
10000000000000000 = 0x2386f26fc10000 // 38 significant bits needed
double only has 15 to 17 significant decimal digits of precision, and when you give it a number it can't represent (which is most of the time; even 0.1 is not exact), it takes the closest representable number.
If you want to represent 9999999999999999 exactly, you need to use BigDecimal.
BigDecimal bd = new BigDecimal("9999999999999999");
System.out.println(new DecimalFormat("#.##").format(bd));
prints
9999999999999999
Very few real-world problems need this accuracy because you can't measure anything this accurately anyway, i.e. to an error of 1 part in 10^16 (ten quadrillion).
You can find the largest representable integer with
// search all the powers of 2 until (x + 1) - x != 1
for (long l = 1; l > 0; l <<= 1) {
    double d0 = l;
    double d1 = l + 1;
    if (d1 - d0 != 1) {
        System.out.println("Cannot represent " + (l + 1) + " was " + d1);
        break;
    }
}
prints
Cannot represent 9007199254740993 was 9.007199254740992E15
9007199254740992 (2^53) is the largest integer up to which every integer is exactly representable; it fits because it is even and therefore needs one bit less than 9007199254740993.
9999999999999999 requires 54 bits of mantissa in order to be represented exactly, and double only has 53 (52 stored bits plus the implicit leading 1). The number is therefore rounded to the nearest number that can be represented using a 53-bit mantissa. This number happens to be 10000000000000000.
The reason 10000000000000000 requires fewer bits is that its binary representation ends in a lot of zeroes, and those zeroes can get represented by increasing the (binary) exponent.
For detailed explanation of a similar problem, see Why is (long)9223372036854665200d giving me 9223372036854665216?
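To see the rounding directly, one option is to print the exact value a double holds via BigDecimal, along with the gap to the next double. This is only an illustrative sketch; the class name PrecisionDemo and the helper exact are made up.

```java
import java.math.BigDecimal;

public class PrecisionDemo {
    /** Parses s as a double and returns the exact decimal value that double holds. */
    static String exact(String s) {
        return new BigDecimal(Double.parseDouble(s)).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(exact("9999999999999999")); // 10000000000000000
        System.out.println(exact("9007199254740993")); // 9007199254740992
        // Gap between adjacent doubles near 1e16: only even integers are representable.
        System.out.println(Math.ulp(1e16));            // 2.0
    }
}
```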

converting floating point to 32-bit fixed point in Java

I have to convert a floating-point number to 32-bit fixed point in Java.
I don't understand what 32-bit fixed point is.
Can anybody help with an algorithm?
A fixed-point number is a representation of a real number using a certain number of bits of a type for the integer part, and the remaining bits of the type for the fractional part. The number of bits representing each part is fixed (hence the name, fixed-point). An integer type is usually used to store fixed-point values.
Fixed-point numbers are usually used in systems which don't have floating point support, or need more speed than floating point can provide. Fixed-point calculations can be performed using the CPU's integer instructions.
A 32-bit fixed-point number would be stored in a 32-bit type such as int.
Normally each bit in an (unsigned in this case) integer type would represent an integer value 2^n as follows:
1 0 1 1 0 0 1 0 = 2^7 + 2^5 + 2^4 + 2^1 = 178
2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
But if the type is used to store a fixed-point value, the bits are interpreted slightly differently:
1 0 1 1 0 0 1 0 = 2^3 + 2^1 + 2^0 + 2^-3 = 11.125
2^3 2^2 2^1 2^0 2^-1 2^-2 2^-3 2^-4
The fixed point number in the example above is called a 4.4 fixed-point number, since there are 4 bits in the integer part and 4 bits in the fractional part of the number. In a 32 bit type the fixed-point value would typically be in 16.16 format, but also could be 24.8, 28.4 or any other combination.
Converting from a floating-point value to a fixed-point value involves the following steps:
Multiply the float by 2^(number of fractional bits for the type), eg. 2^8 for 24.8
Round the result (just add 0.5) if necessary, and floor it (or cast to an integer type) leaving an integer value.
Assign this value into the fixed-point type.
Obviously you can lose some precision in the fractional part of the number. If the precision of the fractional part is important, the choice of fixed-point format can reflect this - eg. use 16.16 or 8.24 instead of 24.8.
Negative values can also be handled in the same way if your fixed-point number needs to be signed.
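The steps above can be sketched in Java roughly as follows, here for a 24.8 format. The class name FixedPoint, the constant FRACTION_BITS and the method names are made up for illustration, and the sketch does not check that the value fits in the 24-bit integer part.

```java
public class FixedPoint {
    static final int FRACTION_BITS = 8; // 24.8 format: 24 integer bits, 8 fractional bits

    /** Scales by 2^FRACTION_BITS and rounds to the nearest integer. */
    static int toFixed(double value) {
        return (int) Math.round(value * (1 << FRACTION_BITS));
    }

    /** Converts back by dividing by the same scale factor. */
    static double toDouble(int fixed) {
        return fixed / (double) (1 << FRACTION_BITS);
    }

    public static void main(String[] args) {
        System.out.println(toFixed(11.125));         // 2848, exactly representable
        System.out.println(toFixed(0.1));            // 26, since 0.1 * 256 = 25.6 rounds up
        System.out.println(toDouble(toFixed(0.1)));  // 0.1015625, precision was lost
        System.out.println(toFixed(-1.5));           // -384, negative values work the same way
    }
}
```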
If my Java were stronger I'd attempt some code, but I usually write such things in C, so I won't attempt a Java version. Besides, stacker's version looks good to me, with the minor exception that it doesn't offer the possibility of rounding. He even shows you how to perform a multiplication (the shift is important!)
Here is a very simple example of converting to fixed point. It shows how to convert and multiplies PI by 2. The result is converted back to double to demonstrate that the fractional part wasn't lost during the calculation with integers.
You could easily expand that with sin() and cos() lookup tables etc.
If you plan to use fixed point, I would recommend looking for a Java fixed-point library.
public class Fix {
    public static final int FIXED_POINT = 16;
    public static final int ONE = 1 << FIXED_POINT;

    public static int mul(int a, int b) {
        return (int) ((long) a * (long) b >> FIXED_POINT);
    }

    public static int toFix( double val ) {
        return (int) (val * ONE);
    }

    public static int intVal( int fix ) {
        return fix >> FIXED_POINT;
    }

    public static double doubleVal( int fix ) {
        return ((double) fix) / ONE;
    }

    public static void main(String[] args) {
        int f1 = toFix( Math.PI );
        int f2 = toFix( 2 );
        int result = mul( f1, f2 );
        System.out.println( "f1:" + f1 + "," + intVal( f1 ) );
        System.out.println( "f2:" + f2 + "," + intVal( f2 ) );
        System.out.println( "r:" + result +"," + intVal( result));
        System.out.println( "double: " + doubleVal( result ));
    }
}
OUTPUT
f1:205887,3
f2:131072,2
r:411774,6
double: 6.283172607421875
A fixed-point type is one that has a fixed number of decimal/binary places after the radix point. Or more generally, a type that can store multiples of 1/N for some positive integer N.
Internally, fixed-point numbers are stored as the value multiplied by the scaling factor. For example, 123.45 with a scaling factor of 100 is stored as if it were the integer 12345.
To convert the internal value of a fixed-point number to floating point, simply divide by the scaling factor. To convert the other way, multiply by the scaling factor and round to the nearest integer.
The definition of 32-bit fixed point could vary. The general idea of fixed point is that you have some fixed number of bits before and another fixed number of bits after the decimal point (or binary point). For a 32-bit one, the most common split is probably even (16 before, 16 after), but depending on the purpose there's no guarantee of that.
As far as the conversion goes, again it's open to some variation -- for example, if the input number is outside the range of the target, you might want to do any number of different things (e.g., in some cases wraparound could make sense, but in others saturation might be preferred).
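As a rough sketch of those two out-of-range policies for a 16.16 format: the class and method names below are made up for illustration. It relies on two documented Java behaviors: Math.round(double) returns a long, and casting a long to int keeps only the low 32 bits.

```java
public class RangePolicy {
    static final int FRACTION_BITS = 16;        // 16.16 format
    static final int ONE = 1 << FRACTION_BITS;

    /** Out-of-range inputs wrap around, because the long-to-int cast drops the high bits. */
    static int toFixWrapping(double value) {
        return (int) Math.round(value * ONE);
    }

    /** Out-of-range inputs are clamped to the largest or smallest int. */
    static int toFixSaturating(double value) {
        long scaled = Math.round(value * ONE);
        return (int) Math.max(Integer.MIN_VALUE, Math.min(Integer.MAX_VALUE, scaled));
    }

    public static void main(String[] args) {
        // 40000 needs more than the 15 integer bits available besides the sign bit.
        System.out.println(toFixWrapping(40000.0));   // -1673527296
        System.out.println(toFixSaturating(40000.0)); // 2147483647
        System.out.println(toFixSaturating(1.5));     // 98304
    }
}
```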
