Generating random unique double values in Java - java

I need a collection of 64-bit floating point random numbers, and they should be distinct. Is there a library routine for this, or should I manually search for duplicates?
It is actually more important that no two numbers are closer to each other than some very small constant epsilon. Is there a library routine for that as well?

You may use streams for that.
double[] array = new Random().doubles()
.distinct()
.limit(500) // How many you want.
.toArray();

You can use a Set collection. It won't allow insertion of duplicate values. Below is an example:
Set<Double> doubles = new HashSet<Double>();
Random r = new Random();
for(int i=0 ; i<100 ; i++){
doubles.add(r.nextDouble() * 100);
}
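Neither answer covers the second part of the question (values that are at least epsilon apart). As a rough sketch, and assuming rejection sampling is acceptable, a TreeSet can be used to look up the nearest accepted neighbours of each candidate. The class name SpacedDoubles and the method generate are made up for this illustration; they are not library routines.

```java
import java.util.Random;
import java.util.TreeSet;

// Hypothetical helper, not a library class.
class SpacedDoubles {
    // Draws n doubles from [0, 1) such that any two differ by at least epsilon.
    // Candidates that land too close to an already accepted value are discarded.
    static TreeSet<Double> generate(int n, double epsilon, Random random) {
        if (n > 1 && (n - 1) * epsilon >= 1.0) {
            throw new IllegalArgumentException("n values spaced by epsilon cannot fit in [0, 1)");
        }
        TreeSet<Double> accepted = new TreeSet<>();
        while (accepted.size() < n) {
            double candidate = random.nextDouble();
            Double below = accepted.floor(candidate);   // nearest accepted value <= candidate
            Double above = accepted.ceiling(candidate); // nearest accepted value >= candidate
            boolean farFromBelow = below == null || candidate - below >= epsilon;
            boolean farFromAbove = above == null || above - candidate >= epsilon;
            if (farFromBelow && farFromAbove) {
                accepted.add(candidate);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        TreeSet<Double> values = generate(500, 1e-6, new Random());
        System.out.println(values.size() + " values, smallest " + values.first());
    }
}
```

This is only practical when n * epsilon is much smaller than 1. If the requested values nearly fill the interval, the loop may reject candidates for a very long time.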

First you need to understand how a random number generator works. It computes a sequence of positive integers (long integers) that contains no duplicates. This sequence is at least 2^31 elements long. The doubles in the range 0.0 to 1.0 are the result of a floating point division, and floating point division is generally not exact.
If you use these real numbers to generate integers in a smaller interval, it is quicker to use a random number generator that gives you positive integers from that interval directly.
The algorithm for the Lehmer generator is
x1 = (x0 * m) % div
Here x0 is the last random number and x1 is the next one. Div and m are prime numbers with m < div. The first x0 is selected by the user and is called the seed.
It is clear that the x_i are smaller than div. For the other properties of a good random number generator there is no short proof.
My suggestion:
Write a method for a Lehmer generator with m = 279470273 and div = 4294967291. I found these numbers on several web pages. Div = 2^32-5, so you can be sure to get a sequence of nearly 2^32 positive long integers, all different. Convert them to doubles and divide them by div as a double. You get doubles in the open interval (0.0, 1.0), and all these doubles are different.
The random integers are small enough that the quotients are also different. If you use a random generator that produces bigger integers, you cannot be sure that the doubles are also different, because of rounding errors.
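The suggestion above might be sketched roughly as follows. The constants are the ones quoted in the answer. The class name LehmerDemo and its methods are invented for this illustration, and the sketch does not verify the claimed period of the generator.

```java
// Hypothetical demo class; only the two constants come from the answer above.
class LehmerDemo {
    static final long MULTIPLIER = 279470273L;  // m
    static final long MODULUS = 4294967291L;    // div = 2^32 - 5

    // One step: x1 = (x0 * m) % div.
    // x0 < 2^32 and m < 2^29, so the product stays below 2^61 and fits in a long.
    static long next(long x0) {
        return (x0 * MULTIPLIER) % MODULUS;
    }

    // Returns count doubles in the open interval (0, 1), starting from the given seed.
    static double[] doubles(long seed, int count) {
        if (seed <= 0 || seed >= MODULUS) {
            throw new IllegalArgumentException("seed must satisfy 0 < seed < " + MODULUS);
        }
        double[] out = new double[count];
        long x = seed;
        for (int i = 0; i < count; i++) {
            x = next(x);
            out[i] = (double) x / (double) MODULUS;
        }
        return out;
    }

    public static void main(String[] args) {
        for (double d : doubles(12345L, 5)) {
            System.out.println(d);
        }
    }
}
```

Two distinct integers below 2^32 give quotients that differ by at least 1/div, which is roughly 2.3E-10. That is far more than the spacing of doubles near 1.0, so distinct integers map to distinct doubles.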

Related

Dividing 2 double elements in an array keep resulting in unintentional rounding

I have a double array that has the population of some cities in the first array, and the populations of the country those cities are in in the 2nd array.
double[][] population = new double[][]{
{24153000,18590000,18000000,14657000,14543000,13617000,13197596,12877000,12784000,12400000,12038000,
11908000,11548000,11035000,10608000,10355000,10290000,10152000,10125000,9752000},
{1384688986,1384688986,207862518,81257239,162951560,126168156,143964513,105920222,
1384688986,1296834042,207652865,1384688986,1384688986,1296834042,1384688986,207862518,50791919,1384688986,86300000,31773839
}
};
I am trying to find the percentage of a country's population that lives in the city. So I have a for loop that divides array 1 at some index by the equivalent index at array 2. The body of the loop says
double percent= (population[0][i]*100f)/population[1][i];
Yet I keep getting integer division. For example, for dividing the first elements of both arrays, I get 2 instead of 1.744.
Can anyone tell me why this is happening? I am getting whole numbers only even though I am dividing doubles and not ints.
Edit: Here is the rest of the loop. There is some extra stuff: I'm only supposed to print out the cities that have over 10m people, and there is some formatting.
for (int i = 0; i < population[1].length; i++) {
if (population[0][i] > 10000000) {
double percent = (population[0][i]*100f)/population[1][i];
System.out.printf("%10.0f", percent);
System.out.println();
}
}
It's the way you are printing it out. Try
System.out.println(percent);
or check out the documentation for the Formatter
https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html
The problem is the way that you are printing the number:
System.out.printf("%10.0f", percent);
The format specifier "%10.0f" says:
decimal floating point
10 characters width
zero digits after the decimal point.
That results in rounding to the nearest whole number.
Alternatives:
Increase the zero in the format
Remove it, and use the default (up to 6 digits after the decimal point)
Just use println(percent) ... which will display the number to the full significant precision ... and no more.
Note: it is advisable to use 100.0 (type double) rather than 100f (type float) when forcing an expression to use floating point calculation. It is better to do the calculation with double precision, especially if you are going to assign the result to a double.
In this case, you are not forcing to floating point, since population is already floating point. However:
It is stylistically preferable to use 100.0 to flag to the reader that the calculation is being done in double precision.
It is possible that a float literal is less precise than the corresponding double literal ... though not in this case.
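To make the difference between the format specifiers concrete, here is a small sketch using the first city and country values from the question. The class name PercentFormatDemo and the helper percent are invented for this example, and Locale.ROOT is used only so that the decimal separator is always a dot.

```java
import java.util.Locale;

// Hypothetical demo class built around the numbers from the question.
class PercentFormatDemo {
    // Percentage of the country's population living in the city, in double precision.
    static double percent(double cityPopulation, double countryPopulation) {
        return cityPopulation * 100.0 / countryPopulation;
    }

    public static void main(String[] args) {
        double p = percent(24153000, 1384688986);
        // Zero digits after the decimal point: the value is rounded to a whole number.
        System.out.println(String.format(Locale.ROOT, "%10.0f", p));
        // Three digits after the decimal point.
        System.out.println(String.format(Locale.ROOT, "%10.3f", p));
        // Default precision of %f (six digits after the decimal point).
        System.out.println(String.format(Locale.ROOT, "%f", p));
        // Shortest decimal string that uniquely identifies the double.
        System.out.println(p);
    }
}
```

The computed value is the same in all four lines; only the formatting differs.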

Safely generate random numbers between some range in Java

How do I safely generate a random integer value in a specific range?
I know many people have asked this before, this post for example, but the method doesn't seem to be safe. Let me explain:
The 'Math' library has Math.random() which generates a random value in the range [0, 1). Using that, one can construct an algorithm like
int randomInteger = Math.floor(Math.random() * (Integer.MAX_VALUE - Integer.MIN_VALUE + 1) + Integer.MIN_VALUE)
to generate a random number between Integer.MAX_VALUE and Integer.MIN_VALUE. However, Integer.MAX_VALUE - Integer.MIN_VALUE will overflow.
The goal is not to merely generate random numbers but to generate them evenly, meaning 1 has the same probability to appear as Integer.MAX_VALUE. I know there are work-arounds to this, such as casting large values to long but then the problem again is how to generate a long integer value from Long.MIN_VALUE to Long.MAX_VALUE.
I'm also not sure about other pre-written algorithms as they can overflow too and cause the probability distribution to change. So my question is whether there is a mathematical equation that uses only integers (no casting to long anywhere) and Math.random() to generate random numbers from Integer.MIN_VALUE to Integer.MAX_VALUE. Or does anyone know of any random generators that don't overflow internally?
The 'Math' library has Math.random() which generates a random value in [0, 1) range.
So don't use Math.random() - use this:
Random r = new Random();
int i = r.nextInt();
The docs for nextInt say:
nextInt() - Returns the next pseudorandom, uniformly distributed int value from this random number generator's sequence. All 2^32 possible int values are produced with (approximately) equal probability.
It appears I misread the question slightly, and you need long not int - luckily the contract is the same.
long l = r.nextLong()
This will quite literally take two ints and jam them together to make a long.
Still better might be to use
java.security.SecureRandom which is a cryptographically strong random number generator (RNG).
Random x = new Random(1000); // 1000 is the seed, not an upper bound; the same seed always yields the same sequence
int i = x.nextInt();
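For an arbitrary inclusive range rather than the full int range, one possible sketch is shown below. It assumes that widening the bounds to long is acceptable (the question would prefer to avoid it), and it relies on Random.longs(streamSize, origin, bound), which is available since Java 8 and is inherited by SecureRandom. The class name RangeDemo and the method between are invented for this example.

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical helper class.
class RangeDemo {
    // Uniform int in [min, max], both ends inclusive.
    // The exclusive upper bound max + 1 is computed as a long, so it cannot overflow
    // even when max == Integer.MAX_VALUE.
    static int between(Random random, int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        long value = random.longs(1, min, (long) max + 1).findFirst().getAsLong();
        return (int) value; // always fits, because min <= value <= max
    }

    public static void main(String[] args) {
        Random fast = new Random();
        SecureRandom secure = new SecureRandom();
        System.out.println(between(fast, 1, 6));
        System.out.println(between(secure, Integer.MIN_VALUE, Integer.MAX_VALUE));
    }
}
```

Passing a SecureRandom instead of a plain Random changes the source of randomness but not the range logic.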

int or float to represent numbers that can be only integer or "#.5"

Situation
I am in a situation where I will have a lot of numbers around about 0 - 15. The vast majority are whole numbers, but very few will have decimal values. All of the ones with decimal value will be "#.5", so 1.5, 2.5, 3.5, etc. but never 1.1, 3.67, etc.
I'm torn between using float and int (with the value multiplied by 2 so the decimal is gone) to store these numbers.
Question
Because every fractional value will be .5, can I safely use float without worrying about the weirdness that comes along with floating point numbers? Or do I need to use int? If I do use int, can every smallish number be divided by 2 to safely give the exactly correct float?
Is there a better way I am missing?
Other info
I'm not considering double because I don't need that kind of precision or range.
I'm storing these in a wrapper class. If I go with int, then whenever I need to get the value I will return the int cast to a float and divided by 2.
What I went with in the end
float seems to be the way to go.
This is not a theoretical proof but you can test it empirically:
import java.math.BigDecimal;

public class HalfFloatCheck {
public static void main(String[] args) {
BigDecimal half = new BigDecimal("0.5");
// Find the first i for which i + 0.5 is not exactly representable as a float.
for (int i = 0; i < Integer.MAX_VALUE; i++) {
float f = i + 0.5f;
if (new BigDecimal(f).compareTo(new BigDecimal(i).add(half)) != 0) {
System.out.println(new BigDecimal(i).add(half) + " => " + new BigDecimal(f));
break;
}
}
}
}
prints:
8388608.5 => 8388608
Meaning that all xxx.5 can be exactly represented as a float between 0.5 and 8388607.5.
For larger numbers float's precision is not enough to represent the number and it is rounded to something else.
Let's refer to the subset of floating point numbers which have a decimal portion of .0 or .5 as point-five floats, or PFFs.
The following properties are guaranteed:
Any number up to 8 million or so (2^23, to be exact) which ends in .0 or .5 is representable as a PFF.
Adding/subtracting two PFFs results in a PFF, unless there's overflow.
Multiplying a PFF by an integer results in a PFF, unless there's overflow.
These properties are guaranteed by the IEEE-754 rules, which give a 24-bit mantissa and guarantee exact rounding of exact results.
Using ints will give you a somewhat larger range.
There will be no accuracy issues with .5's with float for that range, so both approaches will work.
If these represent actual number values, I would choose the float simply because it consumes the same amount of memory and I don't need to write code to convert between some internal int representation and the exposed float value.
If these numbers represent something other than a value, e.g. a grade from a very limited set, I would consider modelling them as an enum, depending on how these are ultimately used.
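For comparison, the int-based wrapper that the question describes could look roughly like this. The class name HalfStep and its methods are invented for this sketch; the question does not show the real wrapper.

```java
// Hypothetical wrapper that stores twice the value as an int.
final class HalfStep {
    private final int halves; // number of half units, e.g. 3 represents 1.5

    private HalfStep(int halves) {
        this.halves = halves;
    }

    // Accepts only values ending in .0 or .5; anything else is rejected.
    static HalfStep of(float value) {
        float doubled = value * 2f;
        if (doubled != Math.rint(doubled)) {
            throw new IllegalArgumentException(value + " is not a multiple of 0.5");
        }
        return new HalfStep((int) doubled);
    }

    // Integer addition, so no rounding is involved.
    HalfStep plus(HalfStep other) {
        return new HalfStep(halves + other.halves);
    }

    // Dividing a small int by 2 gives an exactly representable float.
    float toFloat() {
        return halves / 2f;
    }

    public static void main(String[] args) {
        System.out.println(HalfStep.of(1.5f).plus(HalfStep.of(2.5f)).toFloat());
    }
}
```

For values in the 0 to 15 range both representations are exact, so the choice is mostly about how much conversion code you want to maintain.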

Java: convert float to double preserving decimal point precision

I have float-based storage of numbers that are decimal by nature. The precision of float is fine for my needs. Now I want to perform some more precise calculations with these numbers using double.
An example:
float f = 0.1f;
double d = f; //d = 0.10000000149011612d
// but I want some code that will convert 0.1f to 0.1d;
Update 1:
I know very well that 0.1f != 0.1d. This question is not about precise decimal calculations. Sadly, the question was downvoted. I will try to explain it again...
Let's say I work with an API that returns float numbers for decimal MSFT stock prices. Believe it or not, this API exists:
interface Stock {
float[] getDayPrices();
int[] getDayVolumesInHundreds();
}
It is known that the price of a MSFT share is a decimal number with no more than 5 digits, e.g. 31.455, 50.12, 45.888. Obviously the API does not work with BigDecimal because it would be a big overhead for the purpose to just pass the price.
Let's also say I want to calculate a weighted average of these prices with double precision:
float[] prices = msft.getDayPrices();
int[] volumes = msft.getDayVolumesInHundreds();
double priceVolumeSum = 0.0;
long volumeSum = 0;
for (int i = 0; i < prices.length; i++) {
double doublePrice = decimalFloatToDouble(prices[i]);
priceVolumeSum += doublePrice * volumes[i];
volumeSum += volumes[i];
}
System.out.println(priceVolumeSum / volumeSum);
I need a performant implementation of decimalFloatToDouble.
Right now I use the following code, but I need something more clever:
double decimalFloatToDouble(float f) {
return Double.parseDouble(Float.toString(f));
}
EDIT: this answer corresponds to the question as initially phrased.
When you convert 0.1f to double, you obtain the same number, the imprecise representation of the rational 1/10 (which cannot be represented in binary at any precision) in single-precision. The only thing that changes is the behavior of the printing function. The digits that you see, 0.10000000149011612, were already there in the float variable f. They simply were not printed because these digits aren't printed when printing a float.
Ignore these digits and compute with double as you wish. The problem is not in the conversion, it is in the printing function.
As I understand you, you know that the float is within one float-ulp of an integer number of hundredths, and you know that you're well inside the range where no two integer numbers of hundredths map to the same float. So the information isn't gone at all; you just need to figure out which integer you had.
To get two decimal places, you can multiply by 100, rint/Math.round the result, and multiply by 0.01 to get a close-by double as you wanted. (To get the closest, divide by 100.0 instead.) But I suspect you knew this already and are looking for something that goes a little faster. Try ((4503599627370496.0 + 100.0 * x) - 4503599627370496.0) * 0.01 and don't mess with the parentheses. The constant is 2^52; doubles between 2^52 and 2^53 are spaced exactly 1 apart, so the addition rounds 100.0 * x to the nearest integer. Maybe strictfp that hack for good measure.
You said five significant figures, and apparently your question isn't limited to MSFT share prices. Up until doubles can't represent powers of 10 exactly, this isn't too bad. (And maybe this works beyond that threshold too.) The exponent field of a float narrows down the needed power of ten down to two things, and there are 256 possibilities. (Except in the case of subnormals.) Getting the right power of ten just needs a conditional, and the rounding trick is straightforward enough.
All of this is going to be a mess, and I'd recommend you stick with the toString approach for all the weird cases.
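The two-decimal-place variants discussed in this answer might be sketched as follows. The class and method names are invented for this example. The add-and-subtract trick here uses 2^52 = 4503599627370496.0 as its constant, because doubles between 2^52 and 2^53 are spaced exactly 1 apart; it assumes a non-negative input far below that magnitude.

```java
// Hypothetical helper class comparing three ways to recover a two-decimal price.
class FloatToDecimalDouble {
    private static final double TWO_POW_52 = 4503599627370496.0;

    // Baseline from the question: go through the shortest decimal string.
    static double viaString(float f) {
        return Double.parseDouble(Float.toString(f));
    }

    // Round to an integer number of hundredths, then divide to get the closest double.
    static double viaRound(float f) {
        return Math.round(f * 100.0) / 100.0;
    }

    // Adding 2^52 forces rounding to an integer; subtracting it again is exact.
    // Multiplying by 0.01 gives a close-by double, not necessarily the closest one.
    static double viaMagic(float f) {
        return ((TWO_POW_52 + 100.0 * f) - TWO_POW_52) * 0.01;
    }

    public static void main(String[] args) {
        float price = 31.45f;
        System.out.println((double) price);   // plain widening keeps the float's error
        System.out.println(viaString(price));
        System.out.println(viaRound(price));
        System.out.println(viaMagic(price));
    }
}
```

Whether either arithmetic variant is actually faster than the string round trip would need to be measured on the target JVM.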
If your goal is to have a double whose canonical representation will match the canonical representation of a float, converting the float to a string and converting the result back to double would probably be the most accurate way of achieving that result, at least when it's possible (I don't know for certain whether Java's double-to-string logic would guarantee that there won't be a pair of consecutive double values which report themselves as just above and just below a number with five significant figures).
If your goal is to round to five significant figures a value which is known to have been rounded to five significant figures while in float form, I would suggest that the simplest approach is probably to simply round to five significant figures. If the magnitude of your numbers will be roughly within the range 1E+/-12, start by finding the smallest power of ten which is smaller than your number, multiply that by 100,000, multiply your number by that, round to the nearest unit, and divide by that power of ten. Because division is often much slower than multiplication, if performance is critical, you might keep a table with powers of ten and their reciprocals. To avoid the possibility of rounding errors, your table should store for each power of ten the closest power-of-two double to its reciprocal, and then the closest double to the difference between the first double and the actual reciprocal. Thus, the reciprocal of 100 would be stored as 0.0078125 + 0.0021875; the value n/100 would be computed as n*0.0078125 + n*0.0021875. The first term would never have any round-off error (multiplying by a power of two), and the second value would have precision beyond that needed for the final result, so the final result should thus be rounded accurately.

Java's '==' operator on doubles

This method returns 'true'. Why ?
public static boolean f() {
double val = Double.MAX_VALUE/10;
double save = val;
for (int i = 1; i < 1000; i++) {
val -= i;
}
return (val == save);
}
You're subtracting quite a small value (less than 1000) from a huge value. The small value is so much smaller than the large value that the closest representable value to the theoretical result is still the original value.
Basically it's a result of the way floating point numbers work.
Imagine we had some decimal floating point type (just for simplicity) which only stored 5 significant digits in the mantissa, and an exponent in the range 0 to 1000.
Your example is like writing 10^999 - 1000... think about what the result of that would be, when rounded to 5 significant digits. Yes, the exact result is 99999.....9000 (with 999 digits) but if you can only represent values with 5 significant digits, the closest result is 10^999 again.
When you set val to Double.MAX_VALUE/10, it is set to a value approximately equal to 1.7976931348623158 * 10^307. Subtracting values like 1000 from that would require a precision that the double representation does not have, so it basically leaves val unchanged.
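The decimal analogy can be tried out with BigDecimal, which lets you limit a result to 5 significant digits through a MathContext. This is only an illustration of the rounding idea, not of how double works internally; the class name FiveDigitDemo and its method are invented.

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Hypothetical demo of a decimal type limited to 5 significant digits.
class FiveDigitDemo {
    // Subtracts b from a and rounds the result to 5 significant digits.
    static BigDecimal subtractWithFiveDigits(BigDecimal a, BigDecimal b) {
        return a.subtract(b, new MathContext(5));
    }

    public static void main(String[] args) {
        BigDecimal big = new BigDecimal("1E999");
        BigDecimal small = new BigDecimal(1000);
        BigDecimal rounded = subtractWithFiveDigits(big, small);
        // compareTo ignores trailing zeros, so 0 means "numerically equal".
        System.out.println(rounded.compareTo(big) == 0);
        // Without a precision limit the subtraction is exact and the values differ.
        System.out.println(big.subtract(small).compareTo(big) == 0);
    }
}
```

With the precision limit, the rounded difference compares equal to the original value, which mirrors what happens to val in the question.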
Depending on your needs, you may use BigDecimal instead of double.
Double.MAX_VALUE is so big that the JVM cannot tell the difference between it and Double.MAX_VALUE-1000.
The gap between Double.MAX_VALUE and the next smaller double is 1.9958403095347198E292. If you subtract a number smaller than about half of that gap from Double.MAX_VALUE, the result is still Double.MAX_VALUE.
System.out.println(
new BigDecimal(Double.MAX_VALUE).equals( new BigDecimal(
Double.MAX_VALUE - 2.E291) )
);
System.out.println(
new BigDecimal(Double.MAX_VALUE).equals( new BigDecimal(
Double.MAX_VALUE - 2.E292) )
);
Output:
true
false
A double does not have enough precision to perform the calculation you are attempting. So the result is the same as the initial value.
It is nothing to do with the == operator.
val is a big number and when subtracting 1 (or even 1000) from it, the result cannot be expressed properly as a double value. The representation of this number x and x-1 is the same, because double only has a limited number of bits to represent an unlimited number of numbers.
Double.MAX_VALUE is a huge number compared to 1 or 1000. Double.MAX_VALUE-1 is equal to Double.MAX_VALUE. So your code effectively does nothing when subtracting 1 or 1000 from Double.MAX_VALUE/10.
Always remember that:
doubles and floats are just approximations of real numbers; they are rationals that are not evenly distributed among the reals
you should be very careful with arithmetic operators between doubles or floats that are not close in magnitude (there are many other rules like this...)
in general, never use double or float if you need arbitrary precision
Because double is a floating point numeric type, which is a way of approximating numeric values. Floating point representations encode numbers so that we can store numbers much larger or smaller than we normally could. However, not all numbers can be represented in the given space, so multiple numbers get rounded to the same floating point value.
As a simplified example, we might want to be able to store values ranging from -1000 to 1000 in some small amount of space where we would normally only be able to store -10 to 10. So we could round all values to the nearest hundred and store them in the small space: -1000 gets encoded as -10, -900 gets encoded as -9, 1000 gets encoded as 10. But what if we want to store -999? The closest value we can encode is -1000, so we have to encode -999 as the same value as -1000: -10.
In reality, floating point schemes are much more complicated than the example above, but the concept is similar. Floating point representations of numbers can only represent some of all the possible numbers, so when we have a number that can't be represented as part of the scheme, we have to round it to the closest representable value.
In your code, all values within 1000 of Double.MAX_VALUE / 10 automatically get rounded to Double.MAX_VALUE / 10, which is why the computer thinks (Double.MAX_VALUE / 10) - 1000 == Double.MAX_VALUE / 10.
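The size of the gap between neighbouring doubles can be queried with Math.ulp, which makes it easy to see when a subtraction is lost. The following is a small sketch; the class name UlpDemo and the helper subtractionIsLost are invented for this example.

```java
// Hypothetical demo class.
class UlpDemo {
    // True when subtracting small from value rounds straight back to value.
    static boolean subtractionIsLost(double value, double small) {
        return value - small == value;
    }

    public static void main(String[] args) {
        double val = Double.MAX_VALUE / 10;
        // Distance from val to the neighbouring double of larger magnitude.
        System.out.println(Math.ulp(val));
        // 1000 is tiny compared to that distance, so the subtraction has no effect.
        System.out.println(subtractionIsLost(val, 1000));
        // Subtracting a full ulp does move val to the next smaller double.
        System.out.println(subtractionIsLost(val, Math.ulp(val)));
    }
}
```

Anything smaller than about half the ulp is absorbed by rounding, which is why the loop in the question leaves val untouched.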
The result of a floating point calculation is the closest representable value to the exact answer. This program:
public class Test {
public static void main(String[] args) throws Exception {
double val = Double.MAX_VALUE/10;
System.out.println(val);
System.out.println(Math.nextAfter(val, 0));
}
}
prints:
1.7976931348623158E307
1.7976931348623155E307
The first of these numbers is your original val. The second is the largest double that is less than it.
When you subtract 1000 from 1.7976931348623158E307, the exact answer is between those two numbers, but very, very much closer to 1.7976931348623158E307 than to 1.7976931348623155E307, so the result will be rounded to 1.7976931348623158E307, leaving val unchanged.
