I have a float-based storage of decimal by their nature numbers. The precision of float is fine for my needs. Now I want is to perform some more precise calculations with these numbers using double.
An example:
float f = 0.1f;
double d = f; //d = 0.10000000149011612d
// but I want some code that will convert 0.1f to 0.1d;
Update 1:
I know very well that 0.1f != 0.1d. This question is not about precise decimal calculations. Sadly, the question was downvoted. I will try to explain it again...
Let's say I work with an API that returns float numbers for decimal MSFT stock prices. Believe or not, this API exists:
interface Stock {
float[] getDayPrices();
int[] getDayVolumesInHundreds();
}
It is known that the price of a MSFT share is a decimal number with no more than 5 digits, e.g. 31.455, 50.12, 45.888. Obviously the API does not work with BigDecimal because it would be a big overhead for the purpose to just pass the price.
Let's also say I want to calculate a weighted average of these prices with double precision:
float[] prices = msft.getDayPrices();
int[] volumes = msft.getDayVolumesInHundreds();
double priceVolumeSum = 0.0;
long volumeSum = 0;
for (int i = 0; i < prices.length; i++) {
double doublePrice = decimalFloatToDouble(prices[i]);
priceVolumeSum += doublePrice * volumes[i];
volumeSum += volumes[i];
}
System.out.println(priceVolumeSum / volumeSum);
I need a performant implemetation of decimalFloatToDouble.
Now I use the following code, but I need a something more clever:
double decimalFloatToDouble(float f) {
return Double.parseDouble(Float.toString(f));
}
EDIT: this answer corresponds to the question as initially phrased.
When you convert 0.1f to double, you obtain the same number, the imprecise representation of the rational 1/10 (which cannot be represented in binary at any precision) in single-precision. The only thing that changes is the behavior of the printing function. The digits that you see, 0.10000000149011612, were already there in the float variable f. They simply were not printed because these digits aren't printed when printing a float.
Ignore these digits and compute with double as you wish. The problem is not in the conversion, it is in the printing function.
As I understand you, you know that the float is within one float-ulp of an integer number of hundredths, and you know that you're well inside the range where no two integer numbers of hundredths map to the same float. So the information isn't gone at all; you just need to figure out which integer you had.
To get two decimal places, you can multiply by 100, rint/Math.round the result, and multiply by 0.01 to get a close-by double as you wanted. (To get the closest, divide by 100.0 instead.) But I suspect you knew this already and are looking for something that goes a little faster. Try ((9007199254740992 + 100.0 * x) - 9007199254740992) * 0.01 and don't mess with the parentheses. Maybe strictfp that hack for good measure.
You said five significant figures, and apparently your question isn't limited to MSFT share prices. Up until doubles can't represent powers of 10 exactly, this isn't too bad. (And maybe this works beyond that threshold too.) The exponent field of a float narrows down the needed power of ten down to two things, and there are 256 possibilities. (Except in the case of subnormals.) Getting the right power of ten just needs a conditional, and the rounding trick is straightforward enough.
All of this is all going to be a mess, and I'd recommend you stick with the toString approach for all the weird cases.
If your goal is to have a double whose canonical representation will match the canonical representation of a float converting the float to string and converting the result back to double would probably be the most accurate way of achieving that result, at least when it's possible (I don't know for certain whether Java's double-to-string logic would guarantee that there won't be a pair of consecutive double values which report themselves as just above and just-below a number with five significant figures).
If your goal is to round to five significant figures a value which is known to have been rounded to five significant figures while in float form, I would suggest that the simplest approach is probably to simply round to five significant figures. If your magnitude of your numbers will be roughly within the range 1E+/-12, start by finding the smallest power of ten which is smaller than your number, multiply that by 100,000, multiply your number by that, round to the nearest unit, and divide by that power of ten. Because division is often much slower than multiplication, if performance is critical, you might keep a table with powers of ten and their reciprocals. To avoid the possibility of rounding errors, your table should store for each power of then the closest power-of-two double to its reciprocal, and then the closest double to the difference between the first double and the actual reciprocal. Thus, the reciprocal of 100 would be stored as 0.0078125 + 0.0021875; the value n/100 would be computed as n*0.0078125 + n*0.0021875. The first term would never have any round-off error (multiplying by a power of two), and the second value would have precision beyond that needed for the final result, so the final result should thus be rounded accurately.
Related
I'm writing a bank program with a variable long balance to store cents in an account. When users inputs an amount I have a method to do the conversion from USD to cents:
public static long convertFromUsd (double amountUsd) {
if(amountUsd <= maxValue || amountUsd >= minValue) {
return (long) (amountUsd * 100.0)
} else {
//no conversion (throws an exception, but I'm not including that part of the code)
}
}
In my actual code I also check that amountUsd does not have more than 2 decimals, to avoid inputs that cannot be accurately be converted (e.g 20.001 dollars is not exactly 2000 cents). For this example code, assume that all inputs has 0, 1 or 2 decimals.
At first I looked at Long.MAX_VALUE (9223372036854775807 cents) and assumed that double maxValue = 92233720368547758.07 would be correct, but it gave me rounding errors for big amounts:
convertFromUsd(92233720368547758.07) gives output 9223372036854775807
convertFromUsd(92233720368547758.00) gives the same output 9223372036854775807
What should I set double maxValue and double minValue to always get accurate return values?
You could use BigDecimal as a temp holder
If you have a very large double (something between Double.MAX_VALUE / 100.0 + 1 and Double.MAX_VALUE) the calculation of usd * 100.0 would result in an overflow of your double.
But since you know that every possible result of <any double> * 100 will fit in a long you could use a BigDecimal as a temporary holder for your calculation.
Also, the BigDecimal class defines two methods which come in handy for this purpose:
BigDecimal#movePointRight
BigDecimal#longValueExact
By using a BigDecimal you don't have to bother about specifying a max-value at all -> any given double representing USD can be converted to a long value representing cents (assuming you don't have to handle cent-fractions).
double usd = 123.45;
long cents = BigDecimal.valueOf(usd).movePointRight(2).setScale(0).longValueExact();
Attention: Keep in mind that a double is not able to store the exact USD information in the first place. It is not possible to restore the information that has been lost by converting the double to a BigDecimal.
The only advantage a temporary BigDecimal gives you is that the calculation of usd * 100 won't overflow.
First of all, using double for monetary amounts is risky.
TL;DR
I'd recommend to stay below $17,592,186,044,416.
The floating-point representation of numbers (double type) doesn't use decimal fractions (1/10, 1/100, 1/1000, ...), but binary ones (e.g. 1/128, 1/256). So, the double number will never exactly hit something like $1.99. It will be off by some fraction most of the time.
Hopefully, the conversion from decimal digit input ("1.99") to a double number will end up with the closest binary approximation, being a tiny fraction higher or lower than the exact decimal value.
To be able to correctly represent the 100 different cent values from $xxx.00 to $xxx.99, you need a binary resolution where you can at least represent 128 different values for the fractional part, meaning that the least significant bit corresponds to 1/128 (or better), meaning that at least 7 trailing bits have to be dedicated to the fractional dollars.
The double format effectively has 53 bits for the mantissa. If you need 7 bits for the fraction, you can devote at most 46 bits to the integral part, meaning that you have to stay below 2^46 dollars ($70,368,744,177,664.00, 70 trillions) as the absolute limit.
As a precaution, I wouldn't trust the best-rounding property of converting from decimal digits to double too much, so I'd spend two more bits for the fractional part, resulting in a limit of 2^44 dollars, $17,592,186,044,416.
Code Warning
There's a flaw in your code:
return (long) (amountUsd * 100.0);
This will truncate down to the next-lower cent if the double value lies between two exact cents, meaning that e.g. "123456789.23" might become 123456789.229... as a double and getting truncated down to 12345678922 cents as a long.
You should better use
return Math.round(amountUsd * 100.0);
This will end up with the nearest cent value, most probably being the "correct" one.
EDIT:
Remarks on "Precision"
You often read statements that floating-point numbers aren't precise, and then in the next sentence the authors advocate BigDecimal or similar representations as being precise.
The validity of such a statement depends on the type of number you want to represent.
All the number representation systems in use in today's computing are precise for some types of numbers and imprecise for others. Let's take a few example numbers from mathematics and see how well they fit into some typical data types:
42: A small integer can be represented exactly in virtually all types.
1/3: All the typical data types (including double and BigDecimal) fail to represent 1/3 exactly. They can only do a (more or less close) approximation. The result is that multiplication with 3 does not exactly give the integer 1. Few languages offer a "ratio" type, capable to represent numbers by numerator and denominator, thus giving exact results.
1/1024: Because of the power-of-two denominator, float and double can easily do an exact representation. BigDecimal can do as well, but needs 10 fractional digits.
14.99: Because of the decimal fraction (can be rewritten as 1499/100), BigDecimal does it easily (that's what it's made for), float and double can only give an approximation.
PI: I don't know of any language with support for irrational numbers - I even have no idea how this could be possible (aside from treating popular irrationals like PI and E symbolically).
123456789123456789123456789: BigInteger and BigDecimal can do it exactly, double can do an approximation (with the last 13 digits or so being garbage), int and long fail completely.
Let's face it: Each data type has a class of numbers that it can represent exactly, where computations deliver precise results, and other classes where it can at best deliver approximations.
So the questions should be:
What's the type and range of numbers to be represented here?
Is an approximation okay, and if yes, how close should it be?
What's the data type that matches my requirements?
Using a double, the biggest, in Java, would be: 70368744177663.99.
What you have in a double is 64 bit (8 byte) to represent:
Decimals and integers
+/-
Problem is to get it to not round of 0.99 so you get 46 bit for the integer part and the rest need to be used for the decimals.
You can test with the following code:
double biggestPossitiveNumberInDouble = 70368744177663.99;
for(int i=0;i<130;i++){
System.out.printf("%.2f\n", biggestPossitiveNumberInDouble);
biggestPossitiveNumberInDouble=biggestPossitiveNumberInDouble-0.01;
}
If you add 1 to biggestPossitiveNumberInDouble you will see it starting to round off and lose precision.
Also note the round off error when subtracting 0.01.
First iterations
70368744177663.99
70368744177663.98
70368744177663.98
70368744177663.97
70368744177663.96
...
The best way in this case would not to parse to double:
System.out.println("Enter amount:");
String input = new Scanner(System.in).nextLine();
int indexOfDot = input.indexOf('.');
if (indexOfDot == -1) indexOfDot = input.length();
int validInputLength = indexOfDot + 3;
if (validInputLength > input.length()) validInputLength = input.length();
String validInput = input.substring(0,validInputLength);
long amout = Integer.parseInt(validInput.replace(".", ""));
System.out.println("Converted: " + amout);
This way you don't run into the limits of double and just have the limits of long.
But ultimately would be to go with a datatype made for currency.
You looked at the largest possible long number, while the largest possible double is smaller. Calculating (amountUsd * 100.0) results in a double (and afterwards gets casted into a long).
You should ensure that (amountUsd * 100.0) can never be bigger than the largest double, which is 9007199254740992.
Floating values (float, double) are stored differently than integer values (int, long) and while double can store very large values, it is not good for storing money amounts as they get less accurate the bigger or more decimal places the number has.
Check out How many significant digits do floats and doubles have in java? for more information about floating point significant digits
A double is 15 significant digits, the significant digit count is the total number of digits from the first non-zero digit. (For a better explanation see https://en.wikipedia.org/wiki/Significant_figures Significant figures rules explained)
Therefor in your equation to include cents and make sure you are accurate you would want the maximum number to have no more than 13 whole number places and 2 decimal places.
As you are dealing with money it would be better not to use floating point values. Check out this article on using BigDecimal for storing currency: https://medium.com/#cancerian0684/which-data-type-would-you-choose-for-storing-currency-values-like-trading-price-dd7489e7a439
As you mentioned users are inputting an amount, you could read it in as a String rather than a floating point value and pass that into a BigDecimal.
Situation
I am in a situation where I will have a lot of numbers around about 0 - 15. The vast majority are whole numbers, but very few will have decimal values. All of the ones with decimal value will be "#.5", so 1.5, 2.5, 3.5, etc. but never 1.1, 3.67, etc.
I'm torn between using float and int (with the value multiplied by 2 so the decimal is gone) to store these numbers.
Question
Because every value will be .5, can I safely use float without worrying about the wierdness that comes along with floating point numbers? Or do I need to use int? If I do use int, can every smallish number be divided by 2 to safely give the absolute correct float?
Is there a better way I am missing?
Other info
I'm not considering double because I don't need that kind of precision or range.
I'm storing these in a wrapper class, if I go with int whenever I need to get the value I am going to be returning the int cast as a float divided by 2.
What I went with in the end
float seems to be the way to go.
This is not a theoretical proof but you can test it empirically:
public static void main(String[] args) {
BigDecimal half = new BigDecimal("0.5");
for (int i = 0; i < Integer.MAX_VALUE; i++) {
float f = i + 0.5f;
if (new BigDecimal(f).compareTo(new BigDecimal(i).add(half)) != 0) {
System.out.println(new BigDecimal(i).add(half) + " => " + new BigDecimal(f));
break;
}
}
}
prints:
8388608.5 => 8388608
Meaning that all xxx.5 can be exactly represented as a float between 0.5 and 8388607.5.
For larger numbers float's precision is not enough to represent the number and it is rounded to something else.
Let's refer to the subset of floating point numbers which have a decimal portion of .0 or .5 as point-five floats, or PFFs.
The following properties are guaranteed:
Any number up to 8 million or so (2^23, to be exact) which ends in .0 or .5 is representable as a PFF.
Adding/subtracting two PFFs results in a PFF, unless there's overflow.
Multiplying a PFF by an integer results in a PFF, unless there's overflow.
These properties are guaranteed by the IEEE-754 rules, which give a 24-bit mantissa and guarantee exact rounding of exact results.
Using ints will give you a somewhat larger range.
There will be no accuracy issues with .5's with float for that range, so both approaches will work.
If these represent actual number values, I would chose the float simply because it consumes the same amount of memory and I don't need to write code to convert between some internal int representation and the exposed float value.
If these numbers represent something other than a value, e.g. a grade from a very limited set, I would consider modelling them as an enum, depending on how these are ultimately used.
Say I have 2 double values. One of them is very large and one of them is very small.
double x = 99....9; // I don't know the possible max and min values,
double y = 0,00..1; // so just assume these values are near max and min.
If I add those values together, do I lose precision?
In other words, does the max possible double value increase if I assign an int value to it? And does the min possible double value decrease if I choose a small integer part?
double z = x + y; // Real result is something like 999999999999999.00000000000001
double values are not evenly distributed over all numbers. double uses the floating point representation of the number which means you have a fixed amount of bits used for the exponent and a fixed amount of bits used to represent the actual "numbers"/mantissa.
So in your example using a large and a small value would result in dropping the smaller value since it can not be expressed using the larger exponent.
The solution to not dropping precision is using a number format that has a potentially growing precision like BigDecimal - which is not limited to a fixed number of bits.
I'm using a decimal floating point arithmetic with a precision of three decimal digits and (roughly) with the same features as the typical binary floating point arithmetic. Say you have 123.0 and 4.56. These numbers are represented by a mantissa (0<=m<1) and an exponent: 0.123*10^3 and 0.456*10^1, which I'll write as <.123e3> and <.456e1>. Adding two such numbers isn't immediately possible unless the exponents are equal, and that's why the addition proceeds according to:
<.123e3> <.123e3>
<.456e1> <.004e3>
--------
<.127e3>
You see that the necessary alignment of the decimal digits according to a common exponent produces a loss of precision. In the extreme case, the entire addend could be shifted into nothingness. (Think of summing an infinite series where the terms get smaller and smaller but would still contribute considerably to the sum being computed.)
Other sources of imprecision result from differences between binary and decimal fractions, where an exact fraction in one base cannot be represented without error using the other one.
So, in short, addition and subtraction between numbers from rather different orders of magnitude are bound to cause a loss of precision.
If you try to assign too big value or too small value a double, compiler will give an error:
try this
double d1 = 1e-1000;
double d2 = 1e+1000;
I want to use double up to just 2 decimal places. i.e. it will be stored upto 2 decimal places, if two double values are compared then the comparison should be based on only the first 2 decimal places. How to achieve such a thing? I mean storing, comparison, everything will be just based on the 1st two decimal places. The remaining places may be different, greater than, less than, doesn't matter.
EDIT
My values arent large. say from 0 to 5000 maximum. But I have to multiply by Cos A, Sin A a lot of times, where the value of A keeps changing during the course of the program.
EDIT
Look in my program a car is moving at a particular speed, say 12 m/s. Now after every few minutes, the car changes direction, as in chooses a new angle and starts moving in a straight line along that direction. Now everytime it moves, I have to find out its x and y position on the map. which will be currentX+velocity*Cos A and currentY+Velocity*Sin A. but since this happens often, there will be a lot of cumulative error over time. How to avoid that?
Comparing floating point values for equality should always use some form of delta/epsilon comparison:
if (Abs(value1 - value2) < 0.01 )
{
// considered equal to 2 decimal places
}
Don't use a float (or double). For one, it can't represent all two-decimal-digit numbers. For another, you can get the same (but accurate) effect with an int or long. Just pretend the tens and ones column is really the tenths and hundredths column. You can always divide by 100.0 if you need to output the result to screen, but for comparisons and behind-the-scenes work, integer storage should be fine. You can even get arbitrary precision with BigInteger.
To retain a value of 2 decimal places, use the BigDecimal class as follows:
private static final int DECIMAL_PLACES = 2;
public static void main(String... args) {
System.out.println(twoDecimalPlaces(12.222222)); // Prints 12.22
System.out.println(twoDecimalPlaces(12.599999)); // Prints 12.60
}
private static java.math.BigDecimal twoDecimalPlaces(final double d) {
return new java.math.BigDecimal(d).setScale(DECIMAL_PLACES,
java.math.RoundingMode.HALF_UP);
}
To round to two decimal places you can use round
public static double round2(double d) {
return Math.round(d * 100) / 100.0;
}
This only does round half up.
Note: decimal values in double may not be an exact representation. When you use Double.toString(double) directly or indirectly, it does a small amount of rounding so the number will appear as intended. However if you take this number and perform an operation you may need to round the number again.
it will be stored up to 2 decimal
places
Impossible. Floating-point numbers don't have decimal places. They have binary places after the dot.
You have two choices:
(a) don't use floating-point, as per dlev's answer, and specifically use BigDecimal;
(b) set the required precision when doing output, e.g. via DecimalFormat, an SQL column with defined decimal precision, etc.
You should also have a good look at What every computer scientist should know about floating-point.
I am trying to measure the accuracy of a computed result (as a double), compared to a BigDecimal with the known correct result to arbitrary precision. I want to make sure it is correct up to x decimal places. I figure this can be done like so:
(order of magnitude of correct result) - (order of magnitude of difference) > x
I am having trouble finding a simple way to compute the order of magnitude of a BigDecimal though. Any ideas?
If this is a bad way to measure accuracy, I would be open to other techniques.
To check the number of correct decimal places you just want the order of magnitude of the difference, not the order of magnitude of the correct result. So I suppose you need to convert your double computed result to a BigDecimal, subtract the precise result, then convert back to a double and take the logarithm in base 10.
Or if you just need to check whether the result is accurate to x decimal places then just check if the difference is greater than 0.5 * 10^(-x), or equivalently:
int x = 3; // number of decimal places required
BigDecimal difference = accurateResult.subtract(new BigDecimal(approxResult));
BigDecimal testStat = difference.movePointRight(x).abs();
boolean ok = testStat.compareTo(new BigDecimal(0.5)) <= 0;
Actually that probably isn't quit right depending on exactly what you mean by "correct to x decimal places" and how rigorous you need to be. You could say that 0.15001 and 0.24999 are equal to 1 decimal place (both round to 0.2) but that 0.19999 and 0.25001 are not even though the difference is smaller. If you go that way I think you just have to explicitly round both numbers to x decimal places and then compare.
Since you seem to be interested only in a rough estimate: you should compare the log_10 of both values, which tells you the order of magnitude of you error... You can get a good approximation of log_10 of a BigInteger by looking at the length of its decimal representation toString(10).length()