I have some C code that calculates results based on numbers generated by rand(), seeded with srand(). If I use the same seed, the result is always the same.
Now I have an Android app that loads this C code via JNI. However, the results differ even though the same seed is being used. I have double-checked the seed to make sure it is the same. However, since both the Android program and the native code are pretty complicated, I am having a hard time figuring out what is causing this problem.
What I am sure of is that we do not use any function in the Java program to generate random numbers. So presumably srand() is not being called with a different seed every time. Can other functions in Java or C change the random numbers generated after srand()?
Thanks!
Update:
I guess my question was a little confusing. To clarify, the results I am comparing are from the same platform, but from different runs. The C code uses rand() to get a number and calculates a result based on that. So if the seed passed to srand() is always the same, the numbers returned by rand() should be the same, and hence the results should be the same. But somehow, even when I use the same seed for srand(), rand() gives me different numbers... Any thoughts on that?
There are many different types of random number generators, and they are not all guaranteed to be the same from platform to platform. If having a cross platform 100% predictable solution is necessary for your project, you'll probably have to write your own.
It's really not as bad as it may sound...
I'd recommend looking up random number generation such as the Mersenne Twister algorithm (which is what I use in my projects), and write a small block of code that you can share amongst all your projects. This also gives you the benefit of being able to have multiple generators with varying seeds, which comes in really useful for something like a puzzle game, where you might want a predictably random set based on a specific seed to generate your puzzle, but another clock seeded generator for randomizing special FX or other game elements.
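As a rough illustration of the "small block of code" idea, here is a minimal sketch of a self-contained generator with per-instance state. To keep it short it uses Marsaglia's 64-bit xorshift (shift triple 13, 7, 17) instead of the Mersenne Twister recommended above; the class name PortableRng and the zero-seed replacement constant are made up for this example. Because it relies only on fixed-width integer arithmetic, the same seed should give the same sequence wherever the code runs.

```java
/** Hypothetical example class: a tiny xorshift64 generator with per-instance state. */
final class PortableRng {
    private long state;

    PortableRng(long seed) {
        // xorshift must never hold an all-zero state, so substitute an arbitrary non-zero constant.
        this.state = (seed == 0) ? 0x9E3779B97F4A7C15L : seed;
    }

    /** Advances the state and returns the next 64-bit value. */
    long nextLong() {
        state ^= state << 13;
        state ^= state >>> 7;
        state ^= state << 17;
        return state;
    }

    /** Returns a value in [0, bound); slightly biased for bounds that are not powers of two. */
    int nextInt(int bound) {
        return (int) Long.remainderUnsigned(nextLong(), bound);
    }

    public static void main(String[] args) {
        PortableRng puzzle = new PortableRng(12345L);             // fixed seed: reproducible puzzle layout
        PortableRng effects = new PortableRng(System.nanoTime()); // clock seed: varies from run to run
        System.out.println("puzzle tile: " + puzzle.nextInt(10));
        System.out.println("effect id:   " + effects.nextInt(10));
    }
}
```

Each instance keeps its own state, so drawing values for special effects never disturbs the sequence used to build the puzzle.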
The pseudo-random algorithm implemented by rand() is determined by the C library, and there is no standard algorithm for it. You are absolutely not guaranteed to get the same sequence of numbers from one implementation to the next, and it sounds like the Android implementation differs from your development environment. If you need a predictable sequence across platforms, you should implement your own random number generator.
Related
I am trying to procedurally generate a multiplayer world for a game without having to store the world server-side. So, I need a source of random numbers that I can be sure is identical across different platforms when seeded with the same number. I've done some searching, and it seems that Java's built-in Random class does not provide this guarantee.
Does MersenneTwister in Commons Math provide this guarantee? The documentation does not specify, but I believe by definition the Mersenne Twister is deterministic, and thus any implementation of it will give the same sequence. I want to be sure my understanding is correct:
Can I rely on it to always give me the same sequence of pseudo-random numbers on different platforms when seeded with the same value?
In what scenarios could the sequence of numbers change (e.g. an update to the library that changes some specific thing)?
I've read a lot of guides on how to write your own random number generator, so I'm interested in the reasons why you would write your own since most languages already provide functions for generating random numbers:
Like C++
srand(time(NULL));
rand();
C#
Random rand = new Random();
rand.Next(100);
and Java
Random rand = new Random();
rand.nextInt(0, 100);
I'm mainly looking for advantages to using your own.
If you've done your research and found out that the default generator is horrible (as is the case in C or Excel, or with IBM's infamous randu), you might be motivated to download or implement a better generator. However, unless you have a very deep understanding of probability, statistics, and numerical methods, you should under no circumstances try to create your own. Even such luminaries as John von Neumann have screwed up royally on this.
Another reason might be to get cross-platform reproducibility of results.
Never, ever, roll-your-own cryptography or random number generation unless you are very comfortable with the higher math involved. Here's a short test: if you understand probability distributions, linear feedback shift registers, the incomplete gamma function, and the Chinese remainder theorem, you might be qualified to roll your own.
Otherwise, use a generator provided by someone who does understand these things. The one built into your language might not be. So look for add-on libraries with good reputations.
Sometimes, even though you want a sequence of random numbers, you might want the same exact sequence of random numbers (for debugging or other purposes).
In a portable program, designed to be run on different systems with different libraries, and possibly different random number generators, accomplishing the goal stated above might not be possible.
If you instead implement your own, you would have control over this, and could make it behave the same on a multitude of systems, rather than relying on the provided implementation.
Also, as mentioned in a comment, a provided implementation may be bugged somehow.
One of the possible reasons is right in your question:
most languages already provide functions
They do but they are more often than not incompatible.
I had to write one once because the (lightweight) encryption I wrote was using a different language (Powerscript) than the decryption (VB) and their Random generators were not compatible.
Stock random number generators are usually pseudo-random number generators in most languages.
A pseudo-random number generator starts with some state, and uses it to produce an unpredictable sequence of seemingly uniform numbers.
There are many different pseudo-random number generators that have been researched. They have different advantages and disadvantages -- some are more random, some have longer periods, some are cryptographically strong and difficult to work out the seed from previous samples, some are fast, etc.
The one picked for a given language is going to be some compromise of the above features. In some cases, the one picked will be known to be a poor one, but for legacy reasons is left alone as the "stock" random number generator (rand() is an example of a poor random number generator). If you need different features from the ones your language's random number generator prioritizes, writing your own (or finding one) is about the only way to get them.
In some languages, the random number generator (or the distribution generator) is underspecified, or subject to change between revisions of the language. If you need stability of your random number generator (say, you are using it to procedurally generate a game universe from a small seed -- see the classic game Star Control 2), writing it yourself may be required, even if it is a clone of the standard one on your system.
If you need your random number generator to be stable from one language to another, each language is going to have made different choices.
In C++11, the old rand() was mostly deprecated, and a new library with 3 engines, 10 predefined generators, 3 engine adapters, 21 distributions, and 1 non-pseudo random number generator (random_device) was added. The distributions are under-specified, while the generators are not: if you need cross-compiler compatibility of results from a given seed state, you would need to write your own distributions.
Even in C++11 with that embarrassment of riches, the exact trade offs you want might not be available. So you'd have to write your own.
Note that C++11's set of generators was mostly written prior to C++11 being in existence. It was written because rand() was considered useless, and people wrote libraries with their own random number generators. Best practices were gathered, and formalized in that version of C++. So another reason to learn how to write them is that your language of choice will need to be improved, and programmers are the ones who need to do it.
For an in-depth discussion of pseudo-random number generator properties, Wikipedia has an acceptable place to start. Here it mentions that Java's LCG is a low quality one.
The generators you list are all PRNGs. These particular PRNGs are not suitable for gaming, scientific, or cryptographic applications.
I'm implementing a genetic algorithm in Java. As you know, there are some random events in the algorithm, such as roulette selection, crossover, mutation, etc. To get a better probability distribution among these events, which approach is better: using a single Random object, or creating a separate Random object for each event?
Use a single object. Random number generators are designed to have long periods -- using the same seeded instance, you get a good sequence of random digits out. If you're constantly creating and destroying them, you're only getting however much randomness there is in the seeding process, which may even be none. Imagine what happens if your RNG is seeded from the system clock, and you're doing it thousands of times per second, for instance.
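The clock-seeding scenario above can be sketched as follows. The seed is passed in explicitly to stand in for a clock reading that has not advanced between calls; the class and method names are invented for this illustration.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

final class ReseedDemo {
    /** Draws n values, creating a fresh generator from the same seed for every draw. */
    static Set<Integer> freshGeneratorEachTime(long seed, int n) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < n; i++) {
            seen.add(new Random(seed).nextInt());
        }
        return seen;
    }

    /** Draws n values from one long-lived generator. */
    static Set<Integer> singleGenerator(long seed, int n) {
        Random rng = new Random(seed);
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < n; i++) {
            seen.add(rng.nextInt());
        }
        return seen;
    }

    public static void main(String[] args) {
        long tick = System.currentTimeMillis(); // simulated clock value that stays the same for every draw
        System.out.println("fresh each time: " + freshGeneratorEachTime(tick, 1000).size() + " distinct value(s)");
        System.out.println("single instance: " + singleGenerator(tick, 1000).size() + " distinct values");
    }
}
```

Recreating the generator yields the same first value over and over, whereas the single instance keeps walking through its sequence.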
Use a single Random stored globally, and refer to it everywhere you need randomness. What's more, initialize it with a known seed, and write this seed to a file along with the results of the genetic algorithm.
In addition to the benefits mentioned by deong, this lets you rerun the whole program exactly if you find some interesting outputs. It can be extremely frustrating with genetic algorithms to see an interesting result and then be unable to reproduce it, because it was a rare outcome. If you have the seed, you can just rerun the program deterministically.
If you want each run to use a new seed you can do it like this:
long seed = new Random().nextLong();
log("Seed for the current run is: " + seed);
Global.setRandom(new Random(seed));
That way you get a new random seed every time, but you can still reconstruct a given run if you need to.
Note that the Random object should not be shared between two different runs. At the start of each run, you should create a new random object and make a note of the seed.
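A self-contained version of that idea might look like the sketch below. The run method is only a stand-in for the real genetic algorithm, and taking the replay seed from the command line is an assumption made for this example.

```java
import java.util.Random;

final class ReplayableRun {
    /** Stand-in for a whole genetic-algorithm run: all randomness comes from one seeded Random. */
    static long run(long seed) {
        Random rng = new Random(seed);
        long fitness = 0;
        for (int generation = 0; generation < 1000; generation++) {
            fitness += rng.nextInt(100);
        }
        return fitness;
    }

    public static void main(String[] args) {
        // Use a seed from the command line to replay a run, otherwise pick a fresh one.
        long seed = (args.length > 0) ? Long.parseLong(args[0]) : new Random().nextLong();
        System.out.println("Seed for the current run is: " + seed);
        System.out.println("Result: " + run(seed));
    }
}
```

Passing the logged seed back in as the first argument should reproduce the earlier result exactly, as long as nothing else in the program draws from a differently seeded source.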
I have a function which is expected to randomly return one value from a collection of values. Is there a good way to test this random behavior with unit-test tools like JUnit etc.?
In situations that called for a lot of unit testing of code that normally behaves randomly, I've sometimes wrapped a stream of results from a java.util.Random in an Iterable<Integer>. The advantage is that, during unit testing, I can call the same method with an ArrayList<Integer> and get completely predictable behavior.
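Here is a minimal sketch of that wrapping approach. It passes an Iterator<Integer> rather than the Iterable itself, and the pickOne helper is hypothetical; the only library calls are Random.ints(int, int) and the standard collection methods.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

final class Picker {
    /** Picks one element, using the next non-negative integer from the source as the index. */
    static <T> T pickOne(List<T> items, Iterator<Integer> source) {
        return items.get(source.next() % items.size());
    }

    public static void main(String[] args) {
        List<String> colors = Arrays.asList("red", "green", "blue");

        // Production: an endless stream of non-negative ints backed by java.util.Random.
        Iterator<Integer> random = new Random().ints(0, Integer.MAX_VALUE).iterator();
        System.out.println("random pick: " + pickOne(colors, random));

        // Test: a scripted sequence makes the outcome fully predictable.
        Iterator<Integer> scripted = Arrays.asList(4, 0).iterator();
        System.out.println(pickOne(colors, scripted)); // index 4 % 3 = 1, prints "green"
        System.out.println(pickOne(colors, scripted)); // index 0 % 3 = 0, prints "red"
    }
}
```

The modulo step carries a tiny bias for list sizes that do not divide the range evenly; it is kept here only to keep the example short.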
No. By definition, the result is (or should be) indeterminate, so the usual concept of "expected result" doesn't apply.
However, you could write a fairly simple test using a statistical approach, perhaps that when called n times, the set (unique) of values returned is at least .75 n, or something similar.
Another approach may be to defer the "randomness" to a trusted and sufficient implementation, such as an algorithm based on Math.random(), meaning your unit test would not have to test the randomness, rather just the functionality.
java.util.Random can probably be relied upon to be 'sufficiently random' for your purposes, so I assume what you're trying to test is that you get a proper random distribution among the items in your collection. It's true that there are a number of ways you could select an item out of a collection based on a random number and still end up biasing the results. For instance, iterating across the collection and using a random check at each stage will bias towards earlier items in the list.
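To make the bias concrete, here is a small sketch comparing a "random check at each stage" selection with a single uniform draw. The 50% stop probability and all names are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.Random;

final class SelectionBias {
    /** Biased: walks the indices and stops at each one with probability 1/2. */
    static int biasedPick(int size, Random rng) {
        for (int i = 0; i < size - 1; i++) {
            if (rng.nextBoolean()) {
                return i;
            }
        }
        return size - 1; // fall through to the last index
    }

    /** Unbiased: a single uniform draw over all indices. */
    static int uniformPick(int size, Random rng) {
        return rng.nextInt(size);
    }

    /** Counts how often each index is chosen over the given number of trials. */
    static int[] histogram(boolean biased, int size, int trials, long seed) {
        Random rng = new Random(seed);
        int[] counts = new int[size];
        for (int t = 0; t < trials; t++) {
            counts[biased ? biasedPick(size, rng) : uniformPick(size, rng)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("biased:  " + Arrays.toString(histogram(true, 4, 100_000, 1L)));
        System.out.println("uniform: " + Arrays.toString(histogram(false, 4, 100_000, 1L)));
    }
}
```

With four items, the biased version should choose index 0 about half the time, while the uniform version should land near a quarter for every index.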
If you want to test that your function actually produces random results you're going to need to find a statistical analysis toolkit which can do that. I'd suggest you fill up collections of various sizes with integer sequences, and then run tests of your random fetching code against those collections. You can feed the fetched values into the statistical analysis to determine if they're random or biased, and since they are linear sequences, the result should imply the same property for your fetching code as a whole.
The standard way of testing random values is to generate a few thousand of them, enumerate how many of each you get, calculate the chi-square statistic of the data set, then the incomplete gamma function will give you the probability of that distribution occurring at random. If that probability is too close to 0, it is likely that your RNG is biased.
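The counting and chi-square steps can be sketched like this. Instead of evaluating the incomplete gamma function, the example compares the statistic against a tabulated critical value (20.515 for 5 degrees of freedom at p = 0.001), which is a simplification; the names are invented for the example.

```java
import java.util.Random;

final class ChiSquareCheck {
    /** Pearson's chi-square statistic for observed counts against a uniform expectation. */
    static double chiSquareUniform(int[] observed) {
        long total = 0;
        for (int count : observed) {
            total += count;
        }
        double expected = (double) total / observed.length;
        double statistic = 0.0;
        for (int count : observed) {
            double diff = count - expected;
            statistic += diff * diff / expected;
        }
        return statistic;
    }

    public static void main(String[] args) {
        Random rng = new Random(2024L);
        int[] counts = new int[6];
        for (int i = 0; i < 60_000; i++) {
            counts[rng.nextInt(6)]++; // simulate rolls of a six-sided die
        }
        double statistic = chiSquareUniform(counts);
        // 20.515 is the tabulated critical value for 5 degrees of freedom at p = 0.001.
        System.out.println("chi-square = " + statistic + (statistic > 20.515 ? " (suspicious)" : " (plausible)"));
    }
}
```

A statistic far above the critical value suggests bias; one suspiciously close to zero on every run can also be a warning sign, since truly random data fluctuates.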
The classic example of this is the "dieharder" test suite. You might also check out the test code in my http://github.com/lcrocker/ojrandlib.
Two Questions:
Will I get different sequences of numbers for every seed I put into it?
Are there some "dead" seeds? (Ones that produce zeros or repeat very quickly.)
By the way, which, if any, other PRNGs should I use?
Solution: Since I'm going to be using the PRNG to make a game, I don't need it to be cryptographically secure. I'm going with the Mersenne Twister, both for its speed and huge period.
To some extent, random number generators are horses for courses. The Random class implements an LCG with reasonably chosen parameters. But it still exhibits the following features:
fairly short period (2^48)
bits are not equally random (see my article on randomness of bit positions)
will only generate a small fraction of combinations of values (the famous problem of "falling in the planes")
If these things don't matter to you, then Random has the redeeming feature of being provided as part of the JDK. It's good enough for things like casual games (but not ones where money is involved). There are no weak seeds as such.
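For reference, the Javadoc for java.util.Random documents the LCG step, so its nextInt() can be reproduced in a few lines. The sketch below follows that documented recurrence; the class name JavaLcg is made up, and only nextInt() is covered.

```java
import java.util.Random;

final class JavaLcg {
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long INCREMENT = 0xBL;
    private static final long MASK = (1L << 48) - 1; // keep only 48 bits of state

    private long state;

    JavaLcg(long seed) {
        // Same initial scrambling step that the java.util.Random documentation describes.
        this.state = (seed ^ MULTIPLIER) & MASK;
    }

    /** Advances the 48-bit state and returns its top 'bits' bits. */
    private int next(int bits) {
        state = (state * MULTIPLIER + INCREMENT) & MASK;
        return (int) (state >>> (48 - bits));
    }

    int nextInt() {
        return next(32);
    }

    public static void main(String[] args) {
        JavaLcg mine = new JavaLcg(42L);
        Random theirs = new Random(42L);
        for (int i = 0; i < 5; i++) {
            System.out.println(mine.nextInt() + " " + theirs.nextInt()); // the two columns should match
        }
    }
}
```

The 48-bit mask is where the 2^48 period comes from, and the right shift in next() discards the low 16 bits of state, which are the least random ones in an LCG of this form.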
Another alternative is the XORShift generator, which can be implemented in Java as follows:
private long x = System.nanoTime() | 1L; // generator state; must never be zero

public long randomLong() {
    x ^= (x << 21);
    x ^= (x >>> 35);
    x ^= (x << 4);
    return x;
}
For some very cheap operations, this has a period of 2^64-1 (zero is not permitted), and is simple enough to be inlined when you're generating values repeatedly. Various shift values are possible: see George Marsaglia's paper on XORShift Generators for more details. You can consider bits in the numbers generated as being equally random. One main weakness is that occasionally it will get into a "rut" where not many bits are set in the number, and then it takes a few generations to get out of this rut.
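Both the "zero is not permitted" rule and the "rut" behaviour can be observed with a short experiment. This sketch rewrites the same three shifts as a pure function so the state can be set by hand; the class name is arbitrary.

```java
final class XorShiftStep {
    /** One step of the shift triple (21, 35, 4) from the snippet above, written as a pure function. */
    static long step(long x) {
        x ^= (x << 21);
        x ^= (x >>> 35);
        x ^= (x << 4);
        return x;
    }

    public static void main(String[] args) {
        System.out.println(step(0L)); // prints 0: a zero state never leaves zero

        long x = 1L; // a state with very few bits set
        for (int i = 0; i < 5; i++) {
            x = step(x);
            System.out.println(Long.toBinaryString(x) + " (" + Long.bitCount(x) + " bits set)");
        }
    }
}
```

Starting from a sparse state such as 1, the first few outputs still have few bits set, which is why it is common to seed with a well-mixed value or to discard some initial outputs.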
Other possibilities are:
combine different generators (e.g. feed the output from an XORShift generator into an LCG, then add the result to the output of an XORShift generator with different parameters): this generally allows the weaknesses of the different methods to be "smoothed out", and can give a longer period if the periods of the combined generators are carefully chosen
add a "lag" (to give a longer period): essentially, where a generator would normally transform the last number generated, store a "history buffer" and transform, say, the (n-1023)th.
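The first possibility (combining generators) could be wired up roughly as below. This is only a sketch of the structure described: the shift triples (21, 35, 4) and (13, 7, 17) come from Marsaglia's list, the LCG constants are the ones Knuth gives for MMIX, and the class name and zero-seed replacements are invented. It makes no claim about the statistical quality or period of the result.

```java
final class CombinedRng {
    private long a; // state of the first xorshift (triple 21, 35, 4)
    private long b; // state of the second xorshift (triple 13, 7, 17)

    CombinedRng(long seedA, long seedB) {
        // Neither xorshift state may be zero; the replacement constants are arbitrary.
        this.a = (seedA == 0) ? 1L : seedA;
        this.b = (seedB == 0) ? 2L : seedB;
    }

    long nextLong() {
        a ^= a << 21;
        a ^= a >>> 35;
        a ^= a << 4;

        b ^= b << 13;
        b ^= b >>> 7;
        b ^= b << 17;

        // Pass the first output through one LCG step, then add the second generator's output.
        long mixed = a * 6364136223846793005L + 1442695040888963407L;
        return mixed + b;
    }

    public static void main(String[] args) {
        CombinedRng rng = new CombinedRng(2024L, 77L);
        for (int i = 0; i < 3; i++) {
            System.out.println(Long.toHexString(rng.nextLong()));
        }
    }
}
```

Any combination like this should be run through a test suite before being trusted, since combining generators carelessly can make things worse rather than better.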
I would say avoid generators that use a stupid amount of memory to give you a period longer than you really need (some have a period greater than the number of atoms in the universe-- you really don't usually need that). And note that "long period" doesn't necessarily mean "high quality generator" (though 2^48 is still a little bit low!).
As zvrba said, that JavaDoc explains the normal implementation. The Wikipedia page on pseudo-random number generators has a fair amount of information and mentions the Mersenne twister, which is not deemed cryptographically secure, but is very fast and has various implementations in Java. (The last link has two implementations - there are others available, I believe.)
If you need cryptographically secure generation, read the Wikipedia page - there are various options available.
As RNGs go, Sun's implementation is definitely not state-of-the-art, but it's good enough for most purposes. If you need random numbers for cryptographic purposes, there's java.security.SecureRandom; if you just want something faster and better than java.util.Random, it's easy to find Java implementations of the Mersenne Twister on the net.
This is described in the documentation. Linear congruential generators are theoretically well understood, and a lot of material on them is available in the literature and on the internet. A linear congruential generator with the same parameters always outputs the same periodic sequence, and the only thing that the seed decides is where the sequence begins. So the answer to your first question is "yes, if you generate enough random numbers."
See the answer in my blog post:
http://code-o-matic.blogspot.com/2010/12/how-long-is-period-of-random-numbers.html
Random has a maximal period for its state (a long, i.e. 2^64 period). This can be directly generalized to 2^k - invest as many state bits as you want, and you get the maximal period. The Mersenne Twister actually has a very short period, comparatively (see the comments in said blog post).
--Oops. Random restricts itself to 48bits, instead of using the full 64 bits of a long, so correspondingly, its period is 2^48 after all, not 2^64.
If RNG quality really matters to you, I'd recommend using your own RNG. Maybe java.util.Random is just great, in this version, on your operating system, etc. It probably is. But that could change. It's happened before that a library writer made things worse in a later version.
It's very simple to write your own, and then you know exactly what's going on. It won't change on upgrade, etc. Here's a generator you could port to Java in 10 minutes. And if you start writing in some new language a week from now, you can port it again.
If you don't implement your own, you can grab code for a well-known RNG from a reputable source and use it in your projects. Then nobody will change your generator out from under you.
(I'm not advocating that people come up with their own algorithms, only their own implementation. Most people, myself included, have no business developing their own algorithm. It's easy to write a bad generator that you think is wonderful. That's why people need to ask questions like this one, wondering how good the library generator is. The algorithm in the generator I referenced has been through the wringer of much peer review.)