Different random number sequences on different computers - java

If a seed number is defined for the random number generation, is it possible that different random number sequences are achieved on different computers? If so, how to achieve the same sequences?
private static final long seed = 1;
Random generator = new Random(seed);
for (int i = 0; i < nchrom; i++) {
val = (int) Math.round(generator.nextDouble()*(nchrom-1));
//...
}

Yes, with the same seed you should get the same sequence of numbers. The algorithm is specified in the documentation:
An instance of this class is used to generate a stream of pseudorandom numbers. The class uses a 48-bit seed, which is modified using a linear congruential formula. (See Donald Knuth, The Art of Computer Programming, Volume 2, Section 3.2.1.)
If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers. In order to guarantee this property, particular algorithms are specified for the class Random. Java implementations must use all the algorithms shown here for the class Random, for the sake of absolute portability of Java code. However, subclasses of class Random are permitted to use other algorithms, so long as they adhere to the general contracts for all the methods.
My only concern would be that if you're using nextDouble() you could run into some artifacts of floating point unit differences. I suspect you won't, but that would be my concern. I'd recommend that you use nextInt anyway:
val = generator.nextInt(nchrom); // Exclusive upper bound
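As a rough illustration of the documented guarantee, the sketch below builds two generators from the same seed and compares what they return. The class name SeededSequenceDemo and the helper draw are made up for this example; only Random(long) and nextInt(int) come from the JDK.

```java
import java.util.Arrays;
import java.util.Random;

public final class SeededSequenceDemo {

    /** Draws count ints in [0, bound) from a generator created with the given seed. */
    public static int[] draw(long seed, int count, int bound) {
        Random generator = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = generator.nextInt(bound); // exclusive upper bound
        }
        return values;
    }

    public static void main(String[] args) {
        int[] first = draw(1L, 10, 50);
        int[] second = draw(1L, 10, 50);
        System.out.println(Arrays.toString(first));
        System.out.println("identical: " + Arrays.equals(first, second));
    }
}
```

Running this on two different machines should print the same ten numbers, as long as both use the same seed and make the same calls in the same order.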

Related

Random number in java Math class

Simple question
java.lang.Math.random()
How does this work? There is no seed input, so does it generate a random number based on the system time? For example, if two calls were made to this function 0.00001 s apart (basically at the same time), would they produce the same result?
Thanks!
The javadoc explains how it works:
When this method is first called, it creates a single new pseudorandom-number generator, exactly as if by the expression
new java.util.Random()
This new pseudorandom-number generator is used thereafter for all calls to this method and is used nowhere else.
Returned values are chosen pseudorandomly with (approximately) uniform distribution from that range. When this method is first called, it creates a single new pseudorandom-number generator, exactly as if by the expression new java.util.Random
This new pseudorandom-number generator is used thereafter for all calls to this method and is used nowhere else. This method is properly synchronized to allow correct use by more than one thread. However, if many threads need to generate pseudorandom numbers at a great rate, it may reduce contention for each thread to have its own pseudorandom-number generator.
To understand how this code runs, you need to look at how random number generator algorithms work. In practice there is no such thing as a truly random number in software; searching for "pseudorandom number generator algorithm" gives better insight into the concepts.
To answer your question: the two calls will return different values. The generator is typically seeded from the time once, when it is created, and every later call advances its state.
But if you write
Random obj1 = new Random();
int p = obj1.nextInt(10);
double q = obj1.nextGaussian();
there is a chance that the same number appears more than once. The generator's internal state does not repeat from one call to the next, but the output is reduced to a smaller range (here 0 to 9), so different states can map to the same output value.
There are two principal means of generating random (really pseudo-random) numbers:
the Random class generates random integers, doubles, longs and so on, in various ranges.
the static method Math.random generates doubles between 0 (inclusive) and 1 (exclusive).
To generate random integers:
do not use Math.random (it produces doubles, not integers)
use the Random class to generate random integers between 0 and N.
To generate a series of random numbers as a unit, you need to use a single Random object - do not create a new Random object for each new random number.
Other alternatives are:
SecureRandom, a cryptographically strong subclass of Random
ThreadLocalRandom, intended for multi-threaded cases
import java.util.Random;
/** Generate 10 random integers in the range 0..99. */
public final class RandomInteger {
public static final void main(String... aArgs){
log("Generating 10 random integers in range 0..99.");
//note a single Random object is reused here
Random randomGenerator = new Random();
for (int idx = 1; idx <= 10; ++idx){
int randomInt = randomGenerator.nextInt(100);
log("Generated : " + randomInt);
}
log("Done.");
}
private static void log(String aMessage){
System.out.println(aMessage);
}
}
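The two alternatives mentioned above can be used in much the same way as Random. This is only a minimal sketch: the class name RandomAlternatives and its two helper methods are invented for illustration, while ThreadLocalRandom.current(), nextInt(int) and the SecureRandom constructor are standard JDK calls.

```java
import java.security.SecureRandom;
import java.util.concurrent.ThreadLocalRandom;

public final class RandomAlternatives {

    /** Returns an int in [0, bound) from the calling thread's own generator. */
    public static int perThread(int bound) {
        // current() hands back the generator that belongs to this thread
        return ThreadLocalRandom.current().nextInt(bound);
    }

    /** Returns an int in [0, bound) from a cryptographically strong generator. */
    public static int secure(SecureRandom random, int bound) {
        return random.nextInt(bound);
    }

    public static void main(String[] args) {
        SecureRandom secureRandom = new SecureRandom(); // create once, then reuse
        for (int i = 0; i < 5; i++) {
            System.out.println(perThread(100) + " " + secure(secureRandom, 100));
        }
    }
}
```

Neither alternative is meant for reproducible sequences: ThreadLocalRandom does not accept a seed, and SecureRandom is designed to be unpredictable.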

Please explain me the role of seed in class Random in java.util

Whenever we create an object of the Random class in Java, we use either of the constructors
Random()
Random(long seed)
What is the purpose of seed here in the 2nd constructor and how can I use it to my benefit i.e. manipulate its use?
The answer above sums it up clearly. As per java api docs from oracle, the first constructor
Random()
"Creates a new random number generator. This constructor sets the seed of the random number generator to a value very likely to be distinct from any other invocation of this constructor. "
The seed is probably a derivative of the current time, or the current time itself. That should be enough to be "very likely to be distinct from any other invocation". Which, in essence, is most likely what you need, most of the time.
So why have another constructor that takes a seed?
Simply put, if you want to generate the same set of random numbers over and over, you use the same seed on your Random constructor. This is useful when doing experiments on different control sets, and you don't want to bother creating your own table of random inputs, but still want the same set of random input on a different experiment/control set.
There's no such thing as truly random numbers in computing. The available methods for getting a random number across all programming languages are nothing but algorithms that simulate random numbers.
In some languages (C++, I know for sure), an unseeded random number generator will return the same series of numbers on every fresh execution of the program.
What is common is to seed the random number generator with the current time (which will be random enough for most purposes) so that the algorithm starts with a random number each time.
Pseudo-random number generators maintain some set of state information, which is advanced through some recurrence relation to determine the next value of the state. The output of a PRNG is some function of the state. Java's Random class uses a Linear Congruential Generator. LCGs work using the recurrence relation U(i+1) = (A * U(i) + C) mod M for some constant integer values A, C, and M. Java's current implementation uses a 48-bit state but uses 32 bits or fewer of it on each iteration of the recurrence.
Based on this, you can see that if you start with the same state you will get the exact same sequence of values out of your PRNG. This can be useful if you want to be able to reproduce exactly the same sequence of "randomness", for instance for debugging or for comparing two experiments head-to-head.
If you invoke the constructor without an argument, it picks a starting value for the state with a promise that different invocations are very likely to be distinct from each other. If you supply a seed to the constructor, that seed's value is used to set the initial state.
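To make the state/recurrence/output split concrete, here is a small sketch of a 48-bit LCG that is intended to reproduce Random.nextInt(). The class name MiniLcg and the helper matchesJdk are hypothetical; the constants and the seed scrambling step follow the formulas given in the java.util.Random documentation, so treat it as an illustration rather than a replacement for the real class.

```java
import java.util.Random;

public final class MiniLcg {
    private static final long MULTIPLIER = 0x5DEECE66DL; // A
    private static final long ADDEND = 0xBL;              // C
    private static final long MASK = (1L << 48) - 1;      // keeps 48 bits, i.e. mod 2^48

    private long state;

    public MiniLcg(long seed) {
        // The documented setSeed() scrambles the seed before storing it as the state
        this.state = (seed ^ MULTIPLIER) & MASK;
    }

    /** Advances the state by one recurrence step and returns its top 32 bits. */
    public int nextInt() {
        state = (state * MULTIPLIER + ADDEND) & MASK;
        return (int) (state >>> 16);
    }

    /** Compares this sketch against java.util.Random for the first count values. */
    public static boolean matchesJdk(long seed, int count) {
        MiniLcg mine = new MiniLcg(seed);
        Random jdk = new Random(seed);
        for (int i = 0; i < count; i++) {
            if (mine.nextInt() != jdk.nextInt()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("matches java.util.Random: " + matchesJdk(1L, 100));
    }
}
```

Because the output depends only on the state, and the state depends only on the seed and the number of steps taken, starting from the same seed always replays the same values.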

What's with 181783497276652981 and 8682522807148012 in Random (Java 7)?

Why were 181783497276652981 and 8682522807148012 chosen in Random.java?
Here's the relevant source code from Java SE JDK 1.7:
/**
* Creates a new random number generator. This constructor sets
* the seed of the random number generator to a value very likely
* to be distinct from any other invocation of this constructor.
*/
public Random() {
this(seedUniquifier() ^ System.nanoTime());
}
private static long seedUniquifier() {
// L'Ecuyer, "Tables of Linear Congruential Generators of
// Different Sizes and Good Lattice Structure", 1999
for (;;) {
long current = seedUniquifier.get();
long next = current * 181783497276652981L;
if (seedUniquifier.compareAndSet(current, next))
return next;
}
}
private static final AtomicLong seedUniquifier
= new AtomicLong(8682522807148012L);
So, invoking new Random() without any seed parameter takes the current "seed uniquifier" and XORs it with System.nanoTime(). Then it uses 181783497276652981 to create another seed uniquifier to be stored for the next time new Random() is called.
The literals 181783497276652981L and 8682522807148012L are not placed in constants, but they don't appear anywhere else.
At first the comment gives me an easy lead. Searching online for that article yields the actual article. 8682522807148012 doesn't appear in the paper, but 181783497276652981 does appear -- as a substring of another number, 1181783497276652981, which is 181783497276652981 with a 1 prepended.
The paper claims that 1181783497276652981 is a number that yields good "merit" for a linear congruential generator. Was this number simply mis-copied into Java? Does 181783497276652981 have an acceptable merit?
And why was 8682522807148012 chosen?
Searching online for either number yields no explanation, only this page that also notices the dropped 1 in front of 181783497276652981.
Could other numbers have been chosen that would have worked as well as these two numbers? Why or why not?
Was this number simply mis-copied into Java?
Yes, seems to be a typo.
Does 181783497276652981 have an acceptable merit?
This could be determined using the evaluation algorithm presented in the paper. But the merit of the "original" number is probably higher.
And why was 8682522807148012 chosen?
Seems to be random. It could be the result of System.nanoTime() when the code was written.
Could other numbers have been chosen that would have worked as well as these two numbers?
Not every number would be equally "good". So, no.
Seeding Strategies
There are differences in the default-seeding schema between different versions and implementation of the JRE.
public Random() { this(System.currentTimeMillis()); }
public Random() { this(++seedUniquifier + System.nanoTime()); }
public Random() { this(seedUniquifier() ^ System.nanoTime()); }
The first one is not acceptable if you create multiple RNGs in a row. If their creation times fall in the same millisecond range, they will give completely identical sequences. (same seed => same sequence)
The second one is not thread safe. Multiple threads can get identical RNGs when initializing at the same time. Additionally, seeds of subsequent initializations tend to be correlated. Depending on the actual timer resolution of the system, the seed sequence could be linearly increasing (n, n+1, n+2, ...). As stated in How different do random seeds need to be? and the referenced paper Common defects in initialization of pseudorandom number generators, correlated seeds can generate correlation among the actual sequences of multiple RNGs.
The third approach creates randomly distributed and thus uncorrelated seeds, even across threads and subsequent initializations.
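The weakness of the first strategy is easy to reproduce. The sketch below simulates two generators that were created within the same millisecond by passing the same timestamp to both; the class name SeedCollisionDemo and its helpers are made up for this example.

```java
import java.util.Random;

public final class SeedCollisionDemo {

    /** Mimics the first strategy: the seed is just a millisecond timestamp. */
    public static Random fromMillis(long timestampMillis) {
        return new Random(timestampMillis);
    }

    /** Returns true if both generators produce the same first count values. */
    public static boolean sameSequence(Random a, Random b, int count) {
        for (int i = 0; i < count; i++) {
            if (a.nextLong() != b.nextLong()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Two generators "created in the same millisecond" share every value
        System.out.println("same millisecond: "
                + sameSequence(fromMillis(now), fromMillis(now), 5));
        // The default constructor of a current JDK is very likely to differ
        System.out.println("default constructor: "
                + sameSequence(new Random(), new Random(), 5));
    }
}
```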
So the current java docs:
This constructor sets the seed of the random number generator to a
value very likely to be distinct from any other invocation of this
constructor.
could be extended by "across threads" and "uncorrelated"
Seed Sequence Quality
But the randomness of the seeding sequence is only as good as the underlying RNG.
The RNG used for the seed sequence in this java implementation uses a multiplicative linear congruential generator (MLCG) with c=0 and m=2^64. (The modulus 2^64 is implicitly given by the overflow of 64bit long integers)
Because of the zero c and the power-of-2-modulus, the "quality" (cycle length, bit-correlation, ...) is limited. As the paper says, besides the overall cycle length, every single bit has an own cycle length, which decreases exponentially for less significant bits. Thus, lower bits have a smaller repetition pattern. (The result of seedUniquifier() should be bit-reversed, before it is truncated to 48-bits in the actual RNG)
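The weakness of the low bits can be checked directly for the two constants in question. The starting value 8682522807148012 is divisible by 4 but not by 8, and the multiplier is odd, so every product keeps the same three lowest bits. The sketch below (class and method names are invented for illustration) replays the multiplication from seedUniquifier() without the atomic wrapper.

```java
public final class UniquifierLowBits {
    private static final long START = 8682522807148012L;
    private static final long MULTIPLIER = 181783497276652981L;

    /** Returns the first count values of x -> x * MULTIPLIER, reduced mod 2^64 by overflow. */
    public static long[] sequence(int count) {
        long[] values = new long[count];
        long x = START;
        for (int i = 0; i < count; i++) {
            x *= MULTIPLIER; // long overflow performs the mod 2^64
            values[i] = x;
        }
        return values;
    }

    public static void main(String[] args) {
        for (long value : sequence(8)) {
            // Show only the lowest 8 bits; the last three never change
            String bits = Long.toBinaryString((value & 0xFF) | 0x100).substring(1);
            System.out.println(bits);
        }
    }
}
```

In the real constructor these values are XORed with System.nanoTime() before use, so the fixed low bits of the uniquifier do not by themselves fix the low bits of the final seed.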
But it is fast! And to avoid unnecessary compare-and-set-loops, the loop body should be fast. This probably explains the usage of this specific MLCG, without addition, without xoring, just one multiplication.
And the mentioned paper presents a list of good "multipliers" for c=0 and m=2^64, as 1181783497276652981.
All in all: A for effort @ JRE-developers ;) But there is a typo.
(But who knows, unless someone evaluates it, there is the possibility that the missing leading 1 actually improves the seeding RNG.)
But some multipliers are definitely worse:
"1" leads to a constant sequence.
"2" leads to a single-bit-moving sequence (somehow correlated)
...
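The two degenerate multipliers are simple to try out. This is a small sketch with a hypothetical helper mlcg that runs x -> x * a with long overflow as the implicit mod 2^64; the starting value 5 is arbitrary.

```java
import java.util.Arrays;

public final class BadMultipliers {

    /** Returns the first count values of the multiplicative generator x -> x * a (mod 2^64). */
    public static long[] mlcg(long x0, long a, int count) {
        long[] values = new long[count];
        long x = x0;
        for (int i = 0; i < count; i++) {
            x *= a;
            values[i] = x;
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mlcg(5L, 1L, 4))); // [5, 5, 5, 5]
        System.out.println(Arrays.toString(mlcg(5L, 2L, 4))); // [10, 20, 40, 80]
    }
}
```

With a multiplier of 2 the bit pattern is only shifted left by one position per step, and after at most 64 steps the state becomes 0 and stays there.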
The inter-sequence-correlation for RNGs is actually relevant for (Monte Carlo) Simulations, where multiple random sequences are instantiated and even parallelized. Thus a good seeding strategy is necessary to get "independent" simulation runs. Therefore the C++11 standard introduces the concept of a Seed Sequence for generating uncorrelated seeds.
If you consider that the equation used for the random number generator is:
X(n+1) = (a * X(n) + c) mod m
where X(n+1) is the next number, a is the multiplier, X(n) is the current number, c is the increment and m is the modulus.
If you look further into Random, a, c and m are defined in the header of the class
private static final long multiplier = 0x5DEECE66DL; //= 25214903917 -- 'a'
private static final long addend = 0xBL; //= 11 -- 'c'
private static final long mask = (1L << 48) - 1; //= 2 ^ 48 - 1 -- 'm' - 1, so '& mask' is 'mod 2 ^ 48'
and looking at the method protected int next(int bits), this is where the equation is implemented
nextseed = (oldseed * multiplier + addend) & mask;
//X(n+1) = (X(n) * a + c ) mod m
This implies that the method seedUniquifier() is actually computing X(n) or, on the first call, the first value derived from X(0), which is 8682522807148012 * 181783497276652981. This value is then modified further by the value of System.nanoTime(). This algorithm is consistent with the equation above but with X(0) = 8682522807148012, a = 181783497276652981, m = 2 ^ 64 and c = 0. Because the mod m is performed implicitly by the long overflow, the above equation just becomes
X(n+1) = X(n) * a
Looking at the paper, the value a = 1181783497276652981 is listed for m = 2 ^ 64, c = 0. So the multiplier appears to be just a typo, and the value 8682522807148012 for X(0) appears to be a seemingly randomly chosen number carried over from legacy code for Random. As seen here. The merit of the chosen numbers could still be acceptable but, as mentioned by Thomas B., probably not as "good" as that of the one in the paper.
EDIT - Below original thoughts have since been clarified so can be disregarded but leaving it for reference
This leads me to the conclusions:
The reference to the paper is not for the value itself but for the methods used to obtain the values due to the different values of a, c and m
It is mere coincidence that the value is otherwise the same other than the leading 1 and the comment is misplaced (still struggling to believe this though)
OR
There has been a serious misunderstanding of the tables in the paper and the developers have just chosen a value at random as by the time it is multiplied out what was the point in using the table value in the first place especially as you can just provide your own seed value any way in which case these values are not even taken into account
So to answer your question
Could other numbers have been chosen that would have worked as well as these two numbers? Why or why not?
Yes, any number could have been used, in fact if you specify a seed value when you Instantiate Random you are using any other value. This value does not have any effect on the performance of the generator, this is determined by the values of a,c and m which are hard coded within the class.
As per the link you provided, they have chosen (after adding the missing 1 :) ) the best yield from 2^64 because a long can't hold a number from 2^128

C's stdlib rand() function in Java

I've generated a series of random numbers from a known seed in C using srand() and rand() from stdlib. I now need to generate the same series of numbers using the same seed from C in Java.
Java's Random class documentation states it uses a "linear congruential formula". The documentation I've found on rand() says it uses a "linear congruential" generator, although I'm not sure if this is for one specific implementation.
Does anyone know if both generators will produce the same numbers if given the same seed or if a port of srand() and rand() exists for Java?
The C standard does not dictate the implementations of srand() and rand(). As such, different environments (OS, C libraries, architecture, etc.) will more than likely produce sequences of numbers that are different for the same seed value.
Java's Random class, on the other hand, is bound to a particular algorithm: its documentation requires every Java implementation to use the same linear congruential formula, so different JVMs produce the same sequence for the same seed value. That algorithm is not tied to the standard C functions, though, which means the Java-produced sequence will be different from a C sequence using the same seed.
If you truly need to generate random sequence in Java to match exactly that of the standard C functions, the best you could hope to do is replicate the sequence for a particular environment. This would require creating a JNI library to proxy calls to srand() and rand() or creating some other external process that makes the calls and is invoked from Java. Either way, that's a lot of complexity and more program maintenance.
If in fact all you need are random sequences that appear to be uniformly distributed, regardless of the exact values, then use Random as is. It is more than sufficient for most RNG needs.
As said in the other answer, the C standard doesn't even specify that rand() will return the same sequence across different C platforms (libraries), and likewise nothing in Java guarantees that it matches any given C implementation. You could use JNI to call the specific C implementation on that platform, but this would only guarantee the same sequence to be produced when both the C and Java programs are run on the same platform using the same C library.
If you wish to ensure the same sequence in all situations, you need to implement the same random number generator in both languages. A simple example can be found in POSIX.1-2001, and is quoted on many man 3 rand pages:
static unsigned long next = 1;
/* RAND_MAX assumed to be 32767 */
int myrand(void) {
next = next * 1103515245 + 12345;
return((unsigned)(next/65536) % 32768);
}
void mysrand(unsigned seed) {
next = seed;
}
It is trivially portable to Java. The quality of randomness produced by this generator is not very high, but it's not really guaranteed to be any better with rand() either, so you would need to implement something fancier if better randomness is required.
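One possible port is sketched below. The class name PosixRand is made up; the method names mirror the C sample. The sketch assumes that only bits 16 to 30 of the state reach the output, which is what the C expression selects, so it should not matter whether the C side's unsigned long is 32 or 64 bits wide.

```java
public final class PosixRand {
    // Stands in for C's "static unsigned long next"; Java long arithmetic wraps mod 2^64
    private static long next = 1;

    /** Port of myrand(): returns a value in 0..32767 (RAND_MAX assumed to be 32767). */
    public static int myrand() {
        next = next * 1103515245L + 12345L;
        // (unsigned)(next / 65536) % 32768 keeps bits 16..30 of the state
        return (int) ((next >>> 16) & 0x7FFF);
    }

    /** Port of mysrand(): the int argument is interpreted as C's unsigned seed. */
    public static void mysrand(int seed) {
        next = seed & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        mysrand(1);
        System.out.println(myrand()); // 16838
    }
}
```

This only matches a C program that uses the same myrand()/mysrand() code; it is not expected to match whatever rand() a particular C library ships.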
In both C and Java, the same seed will generate the same random values. While the underlying mechanism might be different, this property is maintained in every programming language I know about.
C and Java will probably not generate the same set of random numbers given a seed. The only property that is maintained is that given a seed, a programming language will generate the same random numbers.

Reimplementing mkpasswd

On Linux I am used to using mkpasswd to generate random passwords; on OS X, however, I don't have this command. Instead of sshing in to my VPS every time, I wanted to reimplement it in Java. What I have done is pick at random 4 lower case letters, 2 upper case letters, 2 symbols (/ . , etc.) and 2 numbers. Then I create a vector and shuffle that too.
Do you think this is good enough randomization?
If you use java.security.SecureRandom instead of java.util.Random then it's probably secure. SecureRandom provides a "cryptographically strong pseudo-random number generator (PRNG)". I.e. it ensures that the seed cannot easily be guessed and that the numbers generated have high entropy.
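A minimal sketch of the scheme described in the question, driven by SecureRandom, might look like the following. The class name PasswordSketch, the helper pick and the exact symbol set are assumptions made for this example; the character counts (4 lower case, 2 upper case, 2 symbols, 2 digits) are taken from the question.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public final class PasswordSketch {
    private static final String LOWER = "abcdefghijklmnopqrstuvwxyz";
    private static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String DIGITS = "0123456789";
    private static final String SYMBOLS = "/.,!?-_"; // assumed symbol set

    /** Appends n characters drawn uniformly from the given alphabet. */
    private static void pick(List<Character> out, String alphabet, int n, SecureRandom random) {
        for (int i = 0; i < n; i++) {
            out.add(alphabet.charAt(random.nextInt(alphabet.length())));
        }
    }

    /** Builds a 10-character password and shuffles it with the same secure generator. */
    public static String generate(SecureRandom random) {
        List<Character> chars = new ArrayList<>();
        pick(chars, LOWER, 4, random);
        pick(chars, UPPER, 2, random);
        pick(chars, SYMBOLS, 2, random);
        pick(chars, DIGITS, 2, random);
        Collections.shuffle(chars, random); // the shuffle also uses SecureRandom
        StringBuilder password = new StringBuilder();
        for (char c : chars) {
            password.append(c);
        }
        return password.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate(new SecureRandom()));
    }
}
```

Passing the generator into Collections.shuffle matters here: the one-argument overload falls back to a default source of randomness rather than the SecureRandom instance.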
Yes, it is. If you are using java.util.Random:
An instance of this class is used to generate a stream of pseudorandom numbers. The class uses a 48-bit seed, which is modified using a linear congruential formula. (See Donald Knuth, The Art of Computer Programming, Volume 2, Section 3.2.1.)
The algorithms implemented by class Random use a protected utility method that on each invocation can supply up to 32 pseudorandomly generated bits.
EDIT
in response to a comment:
/**
* Creates a new random number generator. This constructor sets
* the seed of the random number generator to a value very likely
* to be distinct from any other invocation of this constructor.
*/
public Random() {
this(++seedUniquifier + System.nanoTime());
}
private static volatile long seedUniquifier = 8682522807148012L;
There is a similar pwgen command available in the Mac Ports.
Depends on where your entropy comes from. Using rand() or similar functions that your particular language comes with may not be secure.
On OSX you can use /dev/random I think.
It might be OK, but you should allow some randomization in password lengths perhaps.
If your program became popular it would become a weakness that the password length was public knowledge. Also randomize the exact ratio of lowercase:uppercase:symbols:numbers a little.
Why not just compile mkpasswd on your OS X host?
