Generate random numbers correctly


I would like to have 5 random numbers for every object I process. I process many objects (separately) and need to make sure that randomness is achieved across all numbers. If I process 5 objects, I will have 25 random numbers:
          RN1  RN2  RN3  RN4  RN5
Object 1    1    2    3    4    5
Object 2    6    7    8    9   10
Object 3   11   12   13   14   15
Object 4   16   17   18   19   20
Object 5   21   22   23   24   25
My questions are:
For a single object, does it make a difference, in terms of randomness quality, whether I create a new random number generator for every single number (seeded with the current time in milliseconds) or create one generator and get a series of numbers using nextDouble?
Once I process multiple objects and take the first random number of every object (e.g. numbers 1, 6, 11, 16, 21), will these form a uniform random distribution, or will this be somehow broken?
My view is that it would be best to create one random number generator only (shared by all objects) so that whenever new random number is required I can call nextDouble() and get next number in sequence of random numbers.

Have a look at the ThreadLocalRandom class from Java.
It provides uniform distribution and avoids bottleneck as each of your threads will have its own copy.
Regarding them having different sequences, it's all about changing their seed. One common practice in that case is to seed the generator with the thread/task/process identifier.
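A minimal sketch of the ThreadLocalRandom suggestion, assuming each object is processed on whichever thread happens to call the method. The class and method names (PerObjectNumbers, numbersFor) are made up for illustration. Note that ThreadLocalRandom cannot be seeded explicitly (its setSeed throws UnsupportedOperationException), so seeding by task identifier would need a separate java.util.Random per task instead.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

public class PerObjectNumbers {
    // Returns `count` uniform doubles in [0, 1) drawn from the calling thread's generator.
    static double[] numbersFor(int count) {
        ThreadLocalRandom rng = ThreadLocalRandom.current();
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            out[i] = rng.nextDouble();
        }
        return out;
    }

    public static void main(String[] args) {
        // Five numbers for each of five objects, as in the question's table.
        for (int object = 1; object <= 5; object++) {
            System.out.println("Object " + object + ": " + Arrays.toString(numbersFor(5)));
        }
    }
}
```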

• for single object, does it make a difference if I create random number generator for every single number using current time in milliseconds as seed or when I create one random number generator and get series of numbers using nextDouble in terms of randomness quality?
Don't use the current time as the seed for every number. Generating a number takes less time than the millisecond resolution of the clock, so consecutive generators would be seeded with the same value.

The safest way is probably to generate the required number of random numbers in advance, save them in an array, and establish the rules for the order of access. That way you have full control over the process, and there is no "loss of randomness".
Otherwise, if you launch several generators at once, they will most likely be seeded with the same value (system time by default). If you use a single generator accessed simultaneously by different threads, you need to pass around an object of the Random class, which may be fine but can also lead to a loss of reproducibility (I'm not sure whether this is crucial in your case).
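The pre-generation idea could look roughly like the sketch below. The class name, the method name and the fixed seed are assumptions chosen for illustration. One seeded generator fills a table with a row per object, so the same seed always reproduces the same table.

```java
import java.util.Random;

public class PregeneratedTable {
    // Fills a table with one row per object and one column per random number,
    // drawing every value from a single generator in row-major order.
    static double[][] generate(long seed, int objects, int perObject) {
        Random rng = new Random(seed);
        double[][] table = new double[objects][perObject];
        for (int row = 0; row < objects; row++) {
            for (int col = 0; col < perObject; col++) {
                table[row][col] = rng.nextDouble();
            }
        }
        return table;
    }

    public static void main(String[] args) {
        double[][] table = generate(12345L, 5, 5); // seed is arbitrary
        // Access rule: object k reads row k - 1, number j reads column j - 1.
        System.out.println("RN1 of object 3: " + table[2][0]);
    }
}
```

Because every value comes from one sequence, the first column (RN1 of every object) is just every fifth draw of that sequence.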

Related

Java Math.random period

I'm working on a big school project about random numbers, but I can't find the period for Math.random(). I have version 7.0.800.15 installed and I'm working on a Windows 10 computer. I've tried determining the period with a simple program which saves the first values of:
double num = Math.random();
in an array and then loops until it finds the same values in a row again, thus a period would have passed, but with no result, the period is too long.
So my question is: What is the period of Math.random() on my version?
Or: Is there a way to determine the period using a simple program?
Edit: took away a source pointing to a page about JavaScript, it was not relevant

Java's Math.random uses a linear congruential generator with a modulus of 2^48. The period of such a pseudorandom generator with well-chosen parameters is equal to the modulus. Apparently the parameters in Java are sanely chosen, so in practice the period is 2^48.
Sources:
https://en.wikipedia.org/wiki/Linear_congruential_generator
http://www.javamex.com/tutorials/random_numbers/java_util_random_algorithm.shtml#.WKX-gRJ97dQ

The wiki on linear congruential generator cites Java (java.util.Random) as having a modulus of 2^48.
That is likely the period but you may need to read more about these types of random generators.
This question (How good is java.util.Random?) also cites the same period.
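Walking all 2^48 states is impractical, but the full-period claim can be checked on a scaled-down generator. The sketch below is an illustration, not the JDK code: it keeps the multiplier and increment used by java.util.Random but shrinks the modulus to 2^bits, and counts the steps until the state returns to its starting value.

```java
public class LcgPeriod {
    static final long MULTIPLIER = 0x5DEECE66DL; // multiplier used by java.util.Random
    static final long INCREMENT = 0xBL;          // increment used by java.util.Random

    // Counts the steps until an LCG with modulus 2^bits returns to its start state.
    static long period(int bits) {
        long mask = (1L << bits) - 1;
        long start = 0L;
        long state = start;
        long steps = 0;
        do {
            state = (state * MULTIPLIER + INCREMENT) & mask;
            steps++;
        } while (state != start);
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(period(16)); // prints 65536
        System.out.println(period(20)); // prints 1048576
    }
}
```

The period equals the modulus in both cases because the increment is odd and the multiplier minus one is divisible by 4, which is the full-period condition for a power-of-two modulus.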

Just to add to the other answers and to comment a little more generally on random number generators and writing a program to determine what the period is, beware of the Birthday Paradox and the Gambler's Fallacy. If you generate some value x, the next number is still just as likely to be x as any other number, and the number of numbers you need to generate before you're likely to have a duplicate is actually surprisingly small (meaning that you could, in principle, start seeing some duplicates before the end of the period, which complicates writing a program to test this).
The number of values you need to generate before a duplicate becomes likely, for probabilities up to 50% or so, can be approximated by sqrt(2m * p(n)), where p(n) is the probability you're trying to calculate and m is the number of choices. For a 32-bit integer, sqrt(2m * p(n)) = sqrt(2 * 2^32 * 0.5) = sqrt(2^32) = 65,536. There you have it: once you generate 65,536 numbers, this approximation puts the chance of having generated a duplicate at roughly 50-50 (the exact figure is closer to 40%).
Once you've generated 2^32 + 1 values, the Pigeonhole Principle specifies that you must have generated at least one duplicate (assuming, of course, that you're generating a 32-bit number).
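The arithmetic above can be reproduced with a short sketch. The class and method names are made up for illustration. The first method is the rule of thumb from the text, the second computes the duplicate probability directly from the product of "no collision yet" factors, assuming uniform and independent draws.

```java
public class BirthdayBound {
    // Rule of thumb: draws needed to reach duplicate probability p among m possible values.
    static double approxDraws(double m, double p) {
        return Math.sqrt(2.0 * m * p);
    }

    // Probability of at least one duplicate after n uniform draws from m possible values.
    static double duplicateProbability(long n, double m) {
        double logNoDuplicate = 0.0;
        for (long i = 1; i < n; i++) {
            // The (i + 1)-th draw must avoid the i distinct values already seen.
            logNoDuplicate += Math.log1p(-i / m);
        }
        return 1.0 - Math.exp(logNoDuplicate);
    }

    public static void main(String[] args) {
        double m = Math.pow(2, 32);
        System.out.println(approxDraws(m, 0.5));            // prints 65536.0
        System.out.println(duplicateProbability(65536, m)); // a little over 0.39
    }
}
```

The direct calculation comes out somewhat below 0.5, so the rule of thumb is best read as an order-of-magnitude estimate.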
You may also be interested in this question on whether you can count on random numbers to be unique.

Weighted Number Generator

I've been searching for, and found similar topics, but cannot understand them or work out how to apply them.
Simple: all I want is to generate a number between 1 and 100:
Numbers 1 to 30 should have a 60% probability.
Numbers 31 to 60 should have a 35% probability.
Numbers 61 to 100 should have a 5% probability.

Get numbers in your ranges
First generate 3 random numbers in your 3 intervals.
1: 1-30
2: 31-60
3: 61-100
Generate the probability number
Next, generate a number 1-100. If the number is 1-60, choose the first random number from the first step; if it is 61-95, choose the second; and if it is 96-100, choose the third.

This sounds like a homework problem so I will not provide the code, but here is a description of a simple algorithm:
Generate a random number between 1 and 100. Let's call this X. X will be used to determine how to generate your final result:
If X is between 1 and 60, generate a random number between 1 and 30 to be your final result.
If X is between 61 and 95, generate a random number between 31 and 60 to be your final result.
If X is between 96 and 100, generate a random number between 61 and 100 to be your final result.
You can see that this requires two random number generations for every weighted number that you want. It can actually be simplified into a single random number generation, and that is left as an exercise for you.
FYI, how to generate a random number within a range is found here: How do I generate random integers within a specific range in Java?

Simplest way for me would be to generate four randoms. The first would be a number 1-100. The second would be a number 1-30, the third would be a number 31-60, and the fourth would be a number 61-100. Think of the first random as a percent. If it is 1-60 you then move on to run the second random, if it is 61-95 run the third random, and if it is 96-100 run the fourth random. There are other ways to make it shorter, but in my opinion this is easiest to comprehend.
Create random number 1-100 with this: (int)(Math.random()*100)+1
The rest should just be conditionals.
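For reference, the two-step approach described in these answers could be sketched as follows. The class and method names are made up, and Random.nextInt is used in place of the Math.random() expression above. The first draw selects the range, the second draws uniformly inside it.

```java
import java.util.Random;

public class WeightedNumber {
    // Picks 1-30 with 60% probability, 31-60 with 35%, and 61-100 with 5%.
    static int next(Random rng) {
        int x = rng.nextInt(100) + 1;      // 1..100, selects the range
        if (x <= 60) {
            return rng.nextInt(30) + 1;    // 1..30
        } else if (x <= 95) {
            return rng.nextInt(30) + 31;   // 31..60
        } else {
            return rng.nextInt(40) + 61;   // 61..100
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(); // one generator, reused for every draw
        int[] perRange = new int[3];
        for (int i = 0; i < 100_000; i++) {
            int n = next(rng);
            perRange[n <= 30 ? 0 : n <= 60 ? 1 : 2]++;
        }
        System.out.println("1-30: " + perRange[0] + ", 31-60: " + perRange[1]
                + ", 61-100: " + perRange[2]);
    }
}
```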

Why random of 0 to 4 is 1 most of times?

I am using a simple random calculations for a small range with 4 elements.
indexOfNext = new Random().nextInt(4); //randomize 0 to 3
When I attach the debugger I see that for the first 2-3 times every time the result is 1.
Is this the wrong method for a small range or I should implement another logic (detecting previous random result)?
NOTE: If this is just a coincidence, then using this method is the wrong way to go, right? If yes, can you suggest alternative method? Shuffle maybe? Ideally, the alternative method would have a way to check that next result is not the same as the previous result.

Don't create a new Random() each time, create one object and call it many times.
This is because you need one random generator and many numbers from its
random sequence, as opposed to having many random generators and getting
just the 1st numbers of the random sequences they generate.

You're abusing the random number generator by creating a new instance repeatedly. (This is due to the implementation setting a starting random number value that's a very deterministic function of your system clock time). This ruins the statistical properties of the generator.
You should create one instance, then call nextInt as you need numbers.
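A sketch of the "one instance" advice, combined with the asker's wish to never get the same index twice in a row. The class and method names are made up for illustration. Instead of retrying, it draws from the bound - 1 remaining values and skips over the previous one.

```java
import java.util.Random;

public class NoImmediateRepeat {
    private static final Random RNG = new Random(); // created once, reused for every call

    // Returns a value in [0, bound) that differs from `previous`.
    // Pass a negative `previous` for the very first draw. Requires bound >= 2.
    static int nextDifferent(int previous, int bound) {
        if (previous < 0) {
            return RNG.nextInt(bound);
        }
        int candidate = RNG.nextInt(bound - 1); // bound - 1 values remain
        return candidate >= previous ? candidate + 1 : candidate;
    }

    public static void main(String[] args) {
        int index = -1;
        for (int i = 0; i < 10; i++) {
            index = nextDifferent(index, 4); // 0..3, never equal to the last index
            System.out.print(index + " ");
        }
        System.out.println();
    }
}
```

Excluding repeats makes the sequence less random in the strict sense, so this is only appropriate when variety matters more than independence.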

One thing you can do is hold onto the Random instance. It has to seed itself each time you instantiate it and if that seed is the same then it will generate the same random sequence each time.
Another option is switching to SecureRandom, but you definitely want to hold onto that instance to get the best random number performance. You really only need SecureRandom if you are randomly generating things that have security implications, like implementing crypto algorithms or working around such things.

"Random" doesn't mean "non-repeating", and you cannot judge randomness by short sequences. For example, imagine that you have 2 sequences of 1 and 0:
101010101010101010101010101010101
and
110100100001110100100011111010001
Which looks more random? The second one, of course. But when you take any short run of 3 to 5 digits from it, that run will look less random than one taken from the first sequence. This is a well-known and well-researched paradox.

A pseudo-random number generator requires a seed to work. The seed is a bit array whose size depends on the implementation. The initial seed for a Random can either be specified manually or assigned automatically.
Now there are several common ways of assigning a random seed. One of them is ++currentSeed, the other is the current timestamp.
It is possible that Java uses System.currentTimeMillis() to get the timestamp and initialize the seed with it. Since the resolution of the timestamp is at most a millisecond (it differs on some machines, WinXP AFAIK had 3ms), all Random instances instantiated in the same millisecond-resolution window will have the same seed.
Another feature of pseudo-random number generators is that if they have the same seed, they return the same numbers in the same order.
So if you get the first pseudo-random number returned by several Randoms initialized with the same seed you are bound to get the same number for a few times. And I suspect that's what's happening in your case.
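The "same seed, same numbers" property is easy to see with explicit seeds. In this sketch the class name and the seed 42 are arbitrary choices for illustration. Whether two default-constructed Random instances actually end up with equal seeds depends on the JDK version.

```java
import java.util.Arrays;
import java.util.Random;

public class SameSeed {
    // Returns the first `count` values of nextInt(4) for a generator with the given seed.
    static int[] firstValues(long seed, int count) {
        Random rng = new Random(seed);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = rng.nextInt(4);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two generators with the same seed print identical lines.
        System.out.println(Arrays.toString(firstValues(42L, 10)));
        System.out.println(Arrays.toString(firstValues(42L, 10)));
    }
}
```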

It seems that the number 1 has a 30% chance of showing itself, which is more than other numbers. You can read this link for more information on the subject.
You can also read up on Benford's law.

Java Math.random() How random is it?

I am working on a project that needs to generate two random numbers from a given range (both of them at the same time, one after another) and check if they are equal to each other - if they are, proceed executing other code; if they aren't - generate the numbers again. Now my question is, if we have a range [0;10], and the first randomly generated number turned out to be 5, is the probability of the second number also being 5 as good as any other number? Specifically, does Math.random() have any "defense" against generating same number if it is called twice consecutively? or it "tries" to not generate the same number?

Generating the same number in the range [0,10] twice in succession is a perfectly valid occurrence for any random number generator. If it took any steps to prevent that it wouldn't be random.
On any invocation, the chances of any individual number being chosen should be 1:11, and each choice should be independent of previous choices, so the chances that in a pair the second number matches the first is 1 in 11.
As to how random Math.random() is, it's pseudo-random, meaning it uses an algorithm to generate a series of evenly distributed numbers starting with a "seed" value. It's not suitable for cryptography but quite good for simulations and other non-cryptographic uses.
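The 1-in-11 figure can be checked empirically. This is a rough simulation sketch with made-up names. It uses java.util.Random.nextInt(11) for integers in [0, 10] rather than scaling Math.random(), and a fixed seed only so that runs are repeatable.

```java
import java.util.Random;

public class RepeatChance {
    // Draws `pairs` pairs of integers in [0, 10] and returns the fraction of equal pairs.
    static double matchFraction(long seed, int pairs) {
        Random rng = new Random(seed);
        int matches = 0;
        for (int i = 0; i < pairs; i++) {
            int first = rng.nextInt(11);
            int second = rng.nextInt(11);
            if (first == second) {
                matches++;
            }
        }
        return (double) matches / pairs;
    }

    public static void main(String[] args) {
        System.out.println("observed: " + matchFraction(1L, 1_000_000));
        System.out.println("expected: " + 1.0 / 11);
    }
}
```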

Seeding java.util.Random with consecutive numbers

I've simplified a bug I'm experiencing down to the following lines of code:
int[] vals = new int[8];
for (int i = 0; i < 1500; i++)
    vals[new Random(i).nextInt(8)]++;
System.out.println(Arrays.toString(vals));
The output is: [0, 0, 0, 0, 0, 1310, 190, 0]
Is this just an artifact of choosing consecutive numbers to seed Random and then using nextInt with a power of 2? If so, are there other pitfalls like this I should be aware of, and if not, what am I doing wrong? (I'm not looking for a solution to the above problem, just some understanding about what else could go wrong)
Dan, well-written analysis. As the javadoc is pretty explicit about how numbers are calculated, it's not a mystery as to why this happened as much as if there are other anomalies like this to watch out for-- I didn't see any documentation about consecutive seeds, and I'm hoping someone with some experience with java.util.Random can point out other common pitfalls.
As for the code, the need is for several parallel agents to have repeatably random behavior who happen to choose from an enum 8 elements long as their first step. Once I discovered this behavior, the seeds all come from a master Random object created from a known seed. In the former (sequentially-seeded) version of the program, all behavior quickly diverged after that first call to nextInt, so it took quite a while for me to narrow the program's behavior down to the RNG library, and I'd like to avoid that situation in the future.

As much as possible, the seed for an RNG should itself be random. The seeds that you are using are only going to differ in one or two bits.
There's very rarely a good reason to create two separate RNGs in the one program. Your code is not one of those situations where it makes sense.
Just create one RNG and reuse it, then you won't have this problem.
In response to comment from mmyers:
Do you happen to know java.util.Random
well enough to explain why it picks 5
and 6 in this case?
The answer is in the source code for java.util.Random, which is a linear congruential RNG. When you specify a seed in the constructor, it is manipulated as follows.
seed = (seed ^ 0x5DEECE66DL) & mask;
Where the mask simply retains the lower 48 bits and discards the others.
When generating the actual random bits, this seed is manipulated as follows:
randomBits = (seed * 0x5DEECE66DL + 0xBL) & mask;
Now if you consider that the seeds used by Parker were sequential (0-1499), and they were used once then discarded, the first four seeds generated the following four sets of random bits:
101110110010000010110100011000000000101001110100
101110110001101011010101011100110010010000000111
101110110010110001110010001110011101011101001110
101110110010011010010011010011001111000011100001
Note that the top 10 bits are identical in each case. This is a problem because he only wants to generate values in the range 0-7 (which only requires a few bits) and the RNG implementation does this by shifting the higher bits to the right and discarding the low bits. It does this because in the general case the high bits are more random than the low bits. In this case they are not because the seed data was poor.
Finally, to see how these bits convert into the decimal values that we get, you need to know that java.util.Random makes a special case when n is a power of 2. It requests 31 random bits (the top 31 bits from the above 48), multiplies that value by n and then shifts it 31 bits to the right.
Multiplying by 8 (the value of n in this example) is the same as shifting left 3 places. So the net effect of this procedure is to shift the 31 bits 28 places to the right. In each of the 4 examples above, this leaves the bit pattern 101 (or 5 in decimal).
If we didn't discard the RNGs after just one value, we would see the sequences diverge. While the four sequences above all start with 5, the second values of each are 6, 0, 2 and 4 respectively. The small differences in the initial seeds start to have an influence.
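The steps described above can be replayed by hand. The sketch below is a reconstruction from the description in this answer, not a copy of the JDK source, and the class and method names are made up. It scrambles the seed, performs one generator step, keeps the top 31 bits and applies the power-of-two shortcut for n = 8.

```java
import java.util.Random;

public class FirstNextInt {
    static final long MULTIPLIER = 0x5DEECE66DL;
    static final long INCREMENT = 0xBL;
    static final long MASK = (1L << 48) - 1; // keeps the lower 48 bits

    // Recomputes what new Random(seed).nextInt(8) returns on its first call.
    static int firstNextInt8(long seed) {
        long state = (seed ^ MULTIPLIER) & MASK;          // seed scrambling in the constructor
        state = (state * MULTIPLIER + INCREMENT) & MASK;  // one linear congruential step
        int bits31 = (int) (state >>> 17);                // top 31 of the 48 bits
        return (int) ((8L * bits31) >> 31);               // multiply by n, shift right 31
    }

    public static void main(String[] args) {
        for (long seed = 0; seed < 4; seed++) {
            System.out.println("seed " + seed + ": by hand " + firstNextInt8(seed)
                    + ", library " + new Random(seed).nextInt(8));
        }
    }
}
```

Only the top three bits of the state survive the final shift, which is why seeds that share their top bits after one step all yield the same value.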
In response to the updated question: java.util.Random is thread-safe, you can share one instance across multiple threads, so there is still no need to have multiple instances. If you really have to have multiple RNG instances, make sure that they are seeded completely independently of each other, otherwise you can't trust the outputs to be independent.
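If separate, repeatable generators per agent really are needed, the approach the asker ended up with (drawing each agent's seed from one master generator) might look like the sketch below. The names are made up, and this does not by itself guarantee that the agents' streams are statistically independent. It only avoids handing out consecutive seeds.

```java
import java.util.Arrays;
import java.util.Random;

public class MasterSeeding {
    // Gives each agent its own Random seeded from a master generator and
    // tallies every agent's first choice among 8 options.
    static int[] firstChoices(long masterSeed, int agents) {
        Random master = new Random(masterSeed);
        int[] counts = new int[8];
        for (int i = 0; i < agents; i++) {
            Random agent = new Random(master.nextLong()); // seed drawn from the master
            counts[agent.nextInt(8)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // The same master seed reproduces the same tallies on every run.
        System.out.println(Arrays.toString(firstChoices(2009L, 1500)));
    }
}
```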
As to why you get these kind of effects, java.util.Random is not the best RNG. It's simple, pretty fast and, if you don't look too closely, reasonably random. However, if you run some serious tests on its output, you'll see that it's flawed. You can see that visually here.
If you need a more random RNG, you can use java.security.SecureRandom. It's a fair bit slower, but it works properly. One thing that might be a problem for you though is that it is not repeatable. Two SecureRandom instances with the same seed won't give the same output. This is by design.
So what other options are there? This is where I plug my own library. It includes 3 repeatable pseudo-RNGs that are faster than SecureRandom and more random than java.util.Random. I didn't invent them, I just ported them from the original C versions. They are all thread-safe.
I implemented these RNGs because I needed something better for my evolutionary computation code. In line with my original brief answer, this code is multi-threaded but it only uses a single RNG instance, shared between all threads.
