double versus long for large numbers in Java - java

I'm just starting to learn to code and this might be a very simple question but I have seen double used for numbers that are larger than int can hold. If I understand correctly, double is less precise than using long might be.
So if I have a number larger than int can hold, would it be best to use double or long? In what cases is one preferred over the other? What is best practice for this?
(Say I would like to store a variable with Earth's population. Is double or long preferred?)

tl;dr
For population of humans, use long primitive or Long class.
Details
The floating-point types trade away accuracy in exchange for speed of execution. These types in Java include float/Float and double/Double.
If working with whole numbers (no fractions):
For smaller numbers ranging from -2^31 to 2^31-1 (roughly plus or minus 2 billion), use int/Integer. Needs 32 bits for content.
For larger numbers, use long/Long. Needs 64-bits for content.
For extremely large numbers, use BigInteger.
If working with fractional numbers (not integers):
For numbers between (2-2^-23) * 2^127 and 2^-149 where you do not care about accuracy, use float/Float. Needs 32-bits for content.
For larger/smaller numbers where you do not care about accuracy, use double/Double. Needs 64-bits for content.
For more extreme numbers, use BigDecimal.
If you care about accuracy (such as money matters), use BigDecimal.
So for the earth’s population of humans, use long/Long.
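As a rough illustration of the ranges listed above, the sketch below stores a population figure in a long and shows where a double stops being exact for whole numbers. The class name and the figure of 8 billion are assumptions made only for this example.

```java
class PopulationDemo {
    // Hypothetical figure used only for illustration: roughly 8 billion people.
    static final long WORLD_POPULATION = 8_000_000_000L;

    /** Converts a long to a double and back again. */
    static long throughDouble(long value) {
        return (long) (double) value;
    }

    public static void main(String[] args) {
        // The figure does not fit in an int, but fits easily in a long.
        System.out.println(WORLD_POPULATION > Integer.MAX_VALUE);

        // A double holds every whole number exactly only up to 2^53.
        long exact = WORLD_POPULATION;
        long tooBig = (1L << 53) + 1;
        System.out.println(throughDouble(exact) == exact);   // still exact
        System.out.println(throughDouble(tooBig) == tooBig); // last digit is lost
    }
}
```

For a head count a long is exact across its whole range, whereas a double silently drops low-order digits once values pass 2^53.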

Related

Big datatype in java

I have been working on an application. It is related to shopping business.
There is a model that has:
private Float count;
private Long price;
I want to know the best java datatype for count * price because of huge amount of price and count.
On the other hand, overflow must not occur in the price * count operation.
JSR 354: Money and Currency API
As a solution you can consider an approach with the JavaMoney.org library on GitHub. This library implements the JSR 354: Money and Currency API specification.
The goals of that API:
To provide an API for handling and calculating monetary amounts
To define classes representing currencies and monetary amounts, as well as monetary rounding
To deal with currency exchange rates
To deal with formatting and parsing of currencies and monetary amounts
If you don't want to use any library, then of course you should use BigDecimal.
The maximum value of:
Float is Float.MAX_VALUE, (2-2^-23)*2^127, something like 3.40282346638528860e+38 or 340,282,346,638,528,860,000,000,000,000,000,000,000.
Long is Long.MAX_VALUE, 2^63-1, or 9,223,372,036,854,775,807.
Just what kind of shopping business are you running that does not fit in those types?
Floating-point
Actually, you do not want Float for the simple reason that it is based on floating-point technology, and you never use floating-point for money. Never use floating-point in any context where accuracy is important. Floating-point trades away accuracy for speed of execution. Usually, your customers will care about money. So you would never use the float, Float, double, or Double types in Java for such purposes.
BigDecimal
Where accuracy matters, such as money, use BigDecimal. Not because it can handle large numbers, but because it is accurate. Slow, but accurate.
The advice to use BigDecimal applies only where you have a fractional amount such as tracking the pennies on a dollar. If you are using only integer numbers, you have no need for BigDecimal.
Integer workaround for fractional amounts
Indeed, a workaround for the floating-point problem in languages lacking an alternative like BigDecimal is to multiply all fractional amounts until they become integers. For example, if doing bookkeeping to the penny on a dollar (United States), then multiply all amounts by 100 to keep a count of whole pennies as an integer rather than a count of fractional dollars. This workaround is obviously prone to confusion and mistakes, so it requires careful documentation and coding.
Integers
As for working with integer numbers in Java, you have multiple simple choices.
For numbers 127 and less, use byte or Byte, using 8 bits.
For numbers 32,767 and less, use short or Short, using 16 bits.
For numbers 2^31-1 and less (about 2 billion), use int or Integer, using 32 bits.
For numbers 2^63-1 and less (about umpteen gazillion), use long or Long, using 64 bits.
For even larger numbers, use BigInteger.
Generally best to use the smallest integer type that can comfortably fit your current values as well as fit foreseeable future values.
The 32-bit or 64-bit types are your main choices for modern hardware. You needn't worry about the smallest types unless working with massive amounts of these values, or are quite constrained on memory. And using BigInteger is overkill for most any business-oriented app. (Science and engineering apps might be a different story.)
See also the Answer by i.merkurev on the JSR 354: Money and Currency API library.
For huge values there are the BigDecimal and BigInteger classes. I would use BigDecimal in your case. You never get overflow with these classes.
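A minimal sketch of the count * price calculation with BigDecimal follows. The class and method names are hypothetical, and it assumes the model's fields have been switched from Float and Long to BigDecimal.

```java
import java.math.BigDecimal;

class OrderTotal {
    /** Multiplies a quantity by a unit price without any floating-point rounding. */
    static BigDecimal total(BigDecimal count, BigDecimal price) {
        return count.multiply(price);
    }

    public static void main(String[] args) {
        // Build the values from strings so no binary floating-point value is involved.
        BigDecimal count = new BigDecimal("2.5");
        BigDecimal price = new BigDecimal("9223372036854775807"); // same digits as Long.MAX_VALUE

        // The product is larger than any long can hold, yet it is computed exactly.
        System.out.println(total(count, price));
    }
}
```

The product here exceeds Long.MAX_VALUE, so the same calculation with long would overflow and with float would lose digits.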

When should I use the float type in Java?

I know the float type is an IEEE floating-point type and that it is not exact in calculations. For example, if I sum the two floats 8.4 and 2.4, I get 10.7999999 rather than 10.8. I also know BigDecimal can solve this problem, but BigDecimal is much slower than float.
In most real production code we'd like an exact value like the 10.8 above, not 10.7999..., so my question is: should I avoid float as much as I can? If not, are there any use cases for it in real production code?
If you're handling monetary amounts, then numbers like 8.4 and 2.4 are exact values, and you'll want to use BigDecimal for those. However, if you're doing a physics calculation where you're dealing with measurements, the values 8.4 and 2.4 aren't going to be exact anyway, since measurements aren't exact. That's a use case where using double is better. Also, a scientific calculation could involve things like square roots, trigonometric functions, logarithms, etc., and those can be done only using IEEE floats. Calculations involving money don't normally involve those kinds of functions.
By the way, there's very little reason to ever use the float type; stick with double.
You use float when the precision is enough. It is generally faster to do calculations with float and requires less memory. Sometimes you just need the performance.
What you describe is caused by the fact that binary floating point numbers cannot exactly represent many numbers that can be exactly represented by decimal floating point numbers, like 8.4 or 2.4.
This affects not only the float type in Java but also double.
In many cases you can do calculations with integers and then rescale to get the decimals correctly. But if you require numbers with equal relative accuracies, no matter how large they are, floating point is far superior.
So yes, if you can, you should prefer integers over floats, but there are many applications where floating point is required. This includes many scientific and mathematical algorithms.
You should also consider that 10.7999999 instead of 10.8 looks weird when displayed, but the difference is actually really small. So it's not so much an accuracy issue as a number formatting issue. In most cases this problem is resolved by rounding the number appropriately when converting it to a string for output, for example:
String price = String.format("%.2f", floatPrice);
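To make the 8.4 + 2.4 case from the question concrete, here is a small sketch comparing the raw float sum, the same sum rounded for display, and the BigDecimal sum. The class and method names are made up for the example, and Locale.ROOT is an added assumption so the decimal separator does not depend on the machine's locale.

```java
import java.math.BigDecimal;
import java.util.Locale;

class FloatSumDemo {
    /** Sums the two values from the question using float arithmetic. */
    static float floatSum() {
        return 8.4f + 2.4f;
    }

    /** Rounds to two decimals only when producing text for display. */
    static String formatted(float value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    /** Sums the same two values as exact decimal numbers. */
    static BigDecimal decimalSum() {
        return new BigDecimal("8.4").add(new BigDecimal("2.4"));
    }

    public static void main(String[] args) {
        System.out.println(floatSum());            // slightly below 10.8
        System.out.println(formatted(floatSum())); // rounded for display
        System.out.println(decimalSum());          // exact decimal result
    }
}
```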
BigDecimals are very precise (you can determine their precision -- it is mainly limited by memory) but pretty slow and memory intensive. You use them when you need exact results, i.e. in financial applications, or when you otherwise need very precise results and when speed is not too critical.
Floating point types (double and float) are not nearly as precise, but much faster and they only take up limited memory. Typically, a float takes up 4 bytes and a double takes up 8 bytes. You use them with measurements that can't be very exact anyway, but also if you need the speed or the memory. I use them for (real time) graphics and real time music. Or when otherwise precision of the result is not so important, e.g. when measuring time or percentages when downloading or some such.

Using Double for financial Software [closed]

I know that this question has already been discussed several times but I am not entirely satisfied with the answer. Please don't respond "Doubles are inaccurate, you can't represent 0.1! You have to use BigDecimal"...
Basically I am doing a financial software and we needed to store a lot of prices in memory. BigDecimal was too big to fit in the cache so we have decided to switch to double.
So far we have not experienced any bugs, for the good reason that we only need an accuracy of 12 digits. The 12-digit estimate is based on the fact that even when we deal in millions, we still need to be able to handle cents.
A double gives about 15 significant decimal digits of precision. If you round your doubles when you have to display or compare them, what can go wrong?
I guess one problem is the accumulation of inaccuracy, but how bad is it? How many operations will it take before it affects the 12th digit?
Do you see any other problems with doubles?
EDIT: About long, that's definitely something that we have thought about. We are doing a lot of division and multiplication, and long won't deal well with that (losing the decimals, and overflow), or at least you have to be very, very careful with what you do. My question is more about the theory of doubles: basically, how bad is it, and is the inaccuracy acceptable?
EDIT2: Don't try to solve my software, I am fine with inaccuracy :). To reword the question: how likely is an inaccuracy if you need only 12 digits and you round doubles when displaying or comparing them?
If you absolutely can't use BigDecimal and would prefer not to use doubles, use longs to do fixed-point arithmetic (so each long value would represent the number of cents, for example). This will let you represent 18 significant digits.
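A minimal sketch of that fixed-point idea is shown below, with each long holding a whole number of cents. The class and method names are hypothetical, and the formatter assumes non-negative amounts to keep the example short.

```java
class FixedPointCents {
    /** Adds two amounts held as whole cents; throws ArithmeticException on overflow. */
    static long add(long aCents, long bCents) {
        return Math.addExact(aCents, bCents);
    }

    /** Renders a non-negative amount of cents as units and cents, e.g. 1080 becomes "10.80". */
    static String format(long cents) {
        long rest = cents % 100;
        return (cents / 100) + "." + (rest < 10 ? "0" : "") + rest;
    }

    public static void main(String[] args) {
        long total = 0;
        // Add ten cents one thousand times; integer addition never drifts.
        for (int i = 0; i < 1000; i++) {
            total = add(total, 10);
        }
        System.out.println(format(total)); // exactly one hundred units
    }
}
```

The decimal point only appears at display time; every intermediate value is an exact integer.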
I'd say use joda-money, but this uses BigDecimal under the covers.
Edit (as the above doesn't really answer the question):
Disclaimer: Please, if accuracy matters to you at all, don't use double to represent money. But it seems the poster doesn't need exact accuracy (this seems to be about a financial pricing model which probably has more than 10**-12 built-in uncertainty), and cares more about performance. Assuming this is the case, using a double is excusable.
In general, a double cannot exactly represent a decimal fraction. So, how inexact is a double? There's no short answer for this.
A double may be able to represent a number well enough that you can read the number into a double, then write it back out again, preserving fifteen decimal digits of precision. But as it's a binary rather than a decimal fraction, it can't be exact - it's the value we wish to represent, plus or minus some error. When many arithmetic operations are performed involving inexact doubles, the amount of this error can build up over time, such that the end product has fewer than fifteen decimal digits of accuracy. How many fewer? That depends.
Consider the following function that takes the nth root of 1000, then multiplies it by itself n times:
private static double errorDemo(int n) {
    double r = Math.pow(1000.0, 1.0/n);
    double result = 1.0;
    for (int i = 0; i < n; i++) {
        result *= r;
    }
    return 1000.0 - result;
}
Results are as follows:
errorDemo( 10) = -7.958078640513122E-13
errorDemo( 31) = 9.094947017729282E-13
errorDemo( 100) = 3.410605131648481E-13
errorDemo( 310) = -1.4210854715202004E-11
errorDemo( 1000) = -1.6370904631912708E-11
errorDemo( 3100) = 1.1107204045401886E-10
errorDemo( 10000) = -1.2255441106390208E-10
errorDemo( 31000) = 1.3799308362649754E-9
errorDemo( 100000) = 4.00075350626139E-9
errorDemo( 310000) = -3.100740286754444E-8
errorDemo(1000000) = -9.706695891509298E-9
Note that the size of the accumulated inaccuracy doesn't increase exactly in proportion to the number of intermediate steps (indeed, it's not monotonically increasing). Given a known series of intermediate operations we can determine the probability distribution of the inaccuracy; while this will have a wider range the more operations there are, the exact amount will depend on the numbers fed into the calculation. The uncertainty is itself uncertain!
Depending on what kind of calculation you're performing, you may be able to control this error by rounding to whole units/whole cents after intermediate steps. (Consider the case of a bank account holding $100 at 6% annual interest compounded monthly, so 0.5% interest per month. After the third month of interest is credited, do you want the balance to be $101.50 or $101.51?) Having your double stand for the number of fractional units (i.e. cents) rather than the number of whole units would make this easier - but if you're doing that, you may as well just use longs as I suggested above.
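The bank account example can be sketched as follows, rounding to whole cents after each month. This sketch uses BigDecimal rather than double so that the rounding rule is the only variable; the class name and the choice of the two rounding modes are assumptions for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class MonthlyInterest {
    /** Credits interest once per month, rounding the balance to cents after each credit. */
    static BigDecimal compound(BigDecimal balance, BigDecimal monthlyRate,
                               int months, RoundingMode mode) {
        BigDecimal factor = BigDecimal.ONE.add(monthlyRate);
        for (int i = 0; i < months; i++) {
            balance = balance.multiply(factor).setScale(2, mode);
        }
        return balance;
    }

    public static void main(String[] args) {
        BigDecimal start = new BigDecimal("100.00");
        BigDecimal rate = new BigDecimal("0.005"); // 0.5% per month

        // The third month lands on 101.505, so the rounding rule decides the last cent.
        System.out.println(compound(start, rate, 3, RoundingMode.DOWN));    // 101.50
        System.out.println(compound(start, rate, 3, RoundingMode.HALF_UP)); // 101.51
    }
}
```

Either answer can be correct; the point is that the rounding rule has to be chosen deliberately rather than left to whatever the arithmetic happens to produce.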
Disclaimer, again: The accumulation of floating-point error makes the use of doubles for amounts of money potentially quite messy. Speaking as a Java dev who's had the evils of using double for a decimal representation of anything drummed into him for years, I'd use decimal rather than floating-point arithmetic for any important calculations involving money.
Martin Fowler wrote something on that topic. He suggests a Money class with internal long representation, and a decimal factor.
http://martinfowler.com/eaaCatalog/money.html
Without using fixed point (integer) arithmetic you can NOT be sure that your calculations are ALWAYS correct. This is because of the way IEEE 754 floating point representation works: some decimal numbers cannot be represented as finite-length binary fractions. However, ALL fixed point numbers can be expressed as a finite length integer; therefore, they can be stored as exact binary values.
Consider the following:
public static void main(String[] args) {
    double d = 0.1;
    for (int i = 0; i < 1000; i++) {
        d += 0.1;
    }
    System.out.println(d);
}
This prints 100.09999999999859. ANY money implementation using doubles WILL fail.
For a more visual explanation, try converting 0.1 to binary with a decimal-to-binary converter. You end up with 0.00011001100110011001100110011001 (0011 repeating); converting those 32 binary digits back to decimal gives 0.0999999998603016138.
Therefore, truncated to that many binary digits, 0.1 == 0.0999999998603016138
As a sidenote, BigDecimal is simply a BigInteger with an int scale (the decimal point location). BigInteger relies on an underlying int[] to hold its digits, therefore offering fixed point precision.
import java.math.BigDecimal;

public static void main(String[] args) {
    double d = 0;
    BigDecimal b = new BigDecimal(0);
    // Add 0.1 one hundred million times in both representations.
    for (long i = 0; i < 100000000; i++) {
        d += 0.1;
        b = b.add(new BigDecimal("0.1"));
    }
    System.out.println(d);
    System.out.println(b);
}
Output:
9999999.98112945 (A whole cent is lost after 10^8 additions)
10000000.0
Historically, it was often reasonable to use floating-point types for precise calculations on whole numbers which could get bigger than 2^32, but not bigger than 2^52 [or, on machines with a proper "long double" type, 2^64]. Dividing a 52-bit number by a 32-bit number to yield a 20-bit quotient would require a rather lengthy drawn-out process on the 8088, but the 8087 processor can do it comparatively quickly and easily. Using floating-point types for financial calculations would have been perfectly reasonable, if all values that needed to be precise were always represented by whole numbers.
Nowadays, computers are much more able to handle larger integer values efficiently, and as a consequence it generally makes more sense to use integers to handle quantities which are going to be represented by whole numbers. Floating-point may seem convenient for things like fractional division, but correct code will have to deal with the effects of rounding things to whole numbers no matter what it does. If three people need to pay for something that costs $100.00, one can't achieve penny-accurate accounting by having everyone pay $33.333333333333; the only way to make things balance will be to have the people pay unequal amounts.
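The three-way split above can be sketched with integer cents. The class and method names are hypothetical, and the sketch assumes a non-negative total and at least one person; which payers receive the extra cent is an arbitrary choice here.

```java
import java.util.Arrays;

class SplitBill {
    /**
     * Splits totalCents into shares that differ by at most one cent
     * and always add up to the original total.
     */
    static long[] split(long totalCents, int people) {
        long base = totalCents / people;
        long remainder = totalCents % people;
        long[] shares = new long[people];
        for (int i = 0; i < people; i++) {
            // The first `remainder` payers each cover one leftover cent.
            shares[i] = base + (i < remainder ? 1 : 0);
        }
        return shares;
    }

    public static void main(String[] args) {
        long[] shares = split(10_000, 3); // $100.00 expressed in cents
        System.out.println(Arrays.toString(shares)); // [3334, 3333, 3333]
    }
}
```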
If the size of BigDecimal is too large for your cache, then you should convert amounts to long values when they are written to the cache and convert them back to BigDecimal when they are read. This will give you a smaller memory footprint for your cache while keeping calculations in your application accurate.
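One possible shape for that conversion is sketched below. The class name and the fixed scale of two decimal places are assumptions; a real cache might need a different scale.

```java
import java.math.BigDecimal;

class CacheCodec {
    /**
     * Converts an amount to whole cents for storage.
     * Throws ArithmeticException if the amount has more than two decimals
     * or does not fit in a long, instead of silently losing information.
     */
    static long toCents(BigDecimal amount) {
        return amount.movePointRight(2).longValueExact();
    }

    /** Rebuilds the BigDecimal (with scale 2) from the cached cents. */
    static BigDecimal fromCents(long cents) {
        return BigDecimal.valueOf(cents, 2);
    }

    public static void main(String[] args) {
        long cached = toCents(new BigDecimal("1234.56"));
        System.out.println(cached);            // 123456
        System.out.println(fromCents(cached)); // 1234.56
    }
}
```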
Even if you are able to represent your inputs to calculations correctly with doubles, that doesn't mean that you will always get accurate results. You can still suffer from cancellation and other things.
If you refuse to use BigDecimal for your application logic, then you will end up rewriting lots of functionality that BigDecimal already provides.
I am going to answer the question by addressing a different part of the problem. Please accept that I am trying to address the root problem, not the stated question to the letter. Have you looked at all of the options for reducing memory?
For example, how are you caching?
Are you using a Flyweight pattern to reduce storage of duplicate numbers?
Have you considered representing common numbers in a certain way?
For example, zero is a constant, ZERO.
How about some sort of digit range compression, or a hierarchy of digits, for example a hash map keyed by major digits? Store a 32-bit value with a flag or multiplier of some kind.
This hints at a cool difference approach: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.2643
Is your run-of-the-mill cache doing something less efficient?
Pointers are not free; have you thought about array groups, depending on your problem?
Are you storing objects in the cache as well? They are not small; you can serialize them to structs, etc., as well.
Look at the storage problem and stop trying to avoid a potential math issue. Typically there is a lot of excess in Java before you have to worry about digits. Even then, you can work around some of it with the ideas above.
You cannot trust doubles in financial software. They may work great in simple cases, but due to rounding, inaccuracy in presenting certain values etc. you will run into problems.
You have no choice but to use BigDecimal. Otherwise you're saying "I'm writing financial code which almost works. You'll barely notice any discrepancies." and that's not something that'll make you look trustworthy.
Fixed point works in certain cases, but can you be sure that 1 cent accuracy is enough now and in the future?
I hope you have read Java Puzzlers: Traps, Pitfalls, and Corner Cases by Joshua Bloch. This is what he says in Puzzle 2: Time for a Change:
"Binary floating-point is particularly ill-suited to monetary calculations, as it is impossible to represent 0.1, or any other negative power of 10, exactly as a finite-length binary fraction [EJ Item 31]."

Is BigDecimal an overkill in some cases?

I'm working with money so I need my results to be accurate but I only need a precision of 2 decimal points (cents). Is BigDecimal needed to guarantee results of multiplication/division are accurate?
BigDecimal is a very appropriate type for decimal fraction arithmetic with a known number of digits after the decimal point. You can use an integer type and keep track of the multiplier yourself, but that involves doing in your code work that could be automated.
As well as managing the digits after the decimal point, BigDecimal will also expand the number of stored digits as needed - many business and government financial calculations involve sums too large to store in cents in an int.
I would consider avoiding it only if you need to store a very large array of amounts of money, and are short of memory.
One common option is to do all your calculations with an int or long (the value in cents) and then simply add two decimal places when you need to display it.
Similarly, there is a JODA Money library that will give you a more full-featured API for money calculations.
It depends on your application. One reason to use that level of accuracy is to prevent errors accumulated over many operations from percolating up and causing loss of valuable information. If you're creating a casual application and/or are only using it for, say, data entry, BigDecimal is very likely overkill.
+1 for Patricia's answer, but I very strongly discourage anyone from implementing their own classes on top of a fixed-bit-length integer datatype unless they really know what they are doing. BigDecimal handles all rounding and precision issues, while a long/int has severe problems:
Unknown number of fraction digits: Trade exchanges, law, and commerce vary in the number of fractional digits they use, so you do not know whether your chosen number of digits will have to be changed in the future. Worse: some things, like stock valuation, need a ridiculous number of fractional digits. A ship with 1000 metric tons of coal incurs, e.g., 4,12 € costs of ice, leading to 0,00412 €/ton.
Unimplemented operations: People are then likely to use floating-point for rounding, division, or other arithmetic operations, hiding the inexactness and leading to all the known problems of floating-point arithmetic.
Overflow/Underflow: After reaching the maximum amount, adding an amount flips the sign: Long.MAX_VALUE wraps around to Long.MIN_VALUE. This can easily happen with fractions like (a*b*c*d)/(e*f), where the final result may be perfectly valid and within the range of a long, but the intermediate numerator or denominator is not.
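The wrap-around described in the last point, and one way to detect it, can be sketched like this. The class and helper names are hypothetical; Math.multiplyExact is the standard library method that throws instead of wrapping.

```java
class OverflowDemo {
    /** Reports whether a * b would overflow a long, using the checked multiply. */
    static boolean overflows(long a, long b) {
        try {
            Math.multiplyExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Plain arithmetic wraps silently from the largest to the smallest value.
        System.out.println(Long.MAX_VALUE + 1 == Long.MIN_VALUE); // true

        // The checked variant turns the same situation into an exception.
        System.out.println(overflows(Long.MAX_VALUE, 2)); // true
        System.out.println(overflows(3, 4));              // false
    }
}
```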
You could write your own Currency class, using a long to hold the amount. The class methods would set and get the amount using a String.
Division will be a concern no matter whether you use a long or a BigDecimal. You have to determine on a case by case basis what you do with fractional cents. Discard them, round them, or save them (somewhere besides your own account).

compress floating point numbers with specified range and precision

In my application I'm going to use floating point values to store geographical coordinates (latitude and longitude).
I know that the integer part of these values will be in range [-90, 90] and [-180, 180] respectively. Also I have requirement to enforce some fixed precision on these values (for now it is 0.00001 but can be changed later).
After studying single precision floating point type (float) I can see that it is just a little bit small to contain my values. That's because 180 * 10^5 is greater than 2^24 (size of the significand of float) but less than 2^25.
So I have to use double. But the problem is that I'm going to store huge amounts of this values, so I don't want to waste bytes, storing unnecessary precision.
So how can I perform some sort of compression when converting my double value (with fixed integer part range and specified precision X) to byte array in java? So for example if I use precision from my example (0.00001) I end up with 5 bytes for each value.
I'm looking for a lightweight algorithm or solution so that it doesn't imply huge calculations.
To store a number x to a fixed precision of (for instance) 0.00001, just store the integer closest to 100000 * x. (By the way, this requires 26 bits, not 25, because you need to store negative numbers too.)
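A minimal sketch of that scaling, assuming the 0.00001 precision from the question; the class, constant, and method names are made up for the example.

```java
class GeoFixedPoint {
    // 1 / 0.00001: the number of stored steps per degree (assumed precision).
    static final double SCALE = 100_000.0;

    /** Stores a coordinate as the nearest whole number of 0.00001-degree steps. */
    static int encode(double degrees) {
        return (int) Math.round(degrees * SCALE);
    }

    /** Recovers the coordinate, accurate to about half a step. */
    static double decode(int stored) {
        return stored / SCALE;
    }

    public static void main(String[] args) {
        int stored = encode(48.858844);
        System.out.println(stored);         // 4885884
        System.out.println(decode(stored)); // about 48.85884

        // The largest magnitude, 180 degrees, stays below 2^25, so 25 bits plus a sign bit suffice.
        System.out.println(encode(180.0) < (1 << 25)); // true
    }
}
```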
As TonyK said in his answer, use an int to store the numbers.
To compress the numbers further, use locality: Geo coordinates are often "clumped" (say the outline of a city block). Use a fixed reference point (full 2x26 bits resolution) and then store offsets to the last coordinate as bytes (gives you +/-0.00127). Alternatively, use short which gives you more than half the value range.
Just be sure to hide the compression/decompression in a class which only offers double as outside API, so you can adjust the precision and the compression algorithm at any time.
Considering your use case, I would nonetheless use double and compress the values directly.
The reason is that strong compressors, such as 7zip, are extremely good at handling "structured" data, which an array of double is (one datum = 8 bytes, which is very regular and predictable).
Any other optimisation you may come up with "by hand" is likely to be inferior or offer negligible advantage, while simultaneously costing you time and adding risk.
Note that you can still apply the "trick" of converting the double into an int before compression, but I'm really unsure whether it would bring you tangible benefit, while on the other hand it would seriously reduce your ability to cope with unforeseen ranges of figures in the future.
[Edit] Depending on the source data, if the "lower than precision level" bits are "noisy", it can help the compression ratio to remove the noisy bits, either by rounding the value or even by directly applying a mask on the lowest bits (I guess this last method will not please purists, but at least you can directly select your precision level this way, while keeping the full range of possible values available).
So, to summarize, I'd suggest direct LZMA compression on your array of double.
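The bit-masking idea mentioned in the edit could look roughly like the sketch below. The class name and the choice of clearing 25 bits are assumptions picked so that the error at 180 degrees stays under the 0.00001 requirement; the compression step itself is not shown.

```java
class MantissaMask {
    /**
     * Clears the given number of low-order bits in the 52-bit fraction field of a double,
     * which truncates the value toward zero and makes its byte pattern more repetitive.
     * Expects bits between 0 and 52.
     */
    static double clearLowBits(double value, int bits) {
        long mask = -1L << bits;
        return Double.longBitsToDouble(Double.doubleToLongBits(value) & mask);
    }

    public static void main(String[] args) {
        double longitude = 179.123456789;
        double masked = clearLowBits(longitude, 25);

        // With 27 fraction bits left, the error near 180 degrees is below 2^-20 (about 0.000001).
        System.out.println(Math.abs(longitude - masked) < 0.00001); // true
    }
}
```

Because the cleared bits become runs of zeros, a general-purpose compressor has less noise to encode while every value remains an ordinary double.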
