While doing a beginner's crypto course I'm trying to get to grips with Java's SecureRandom object. What I think I understand is that:
a) No matter how long a sequence of random numbers you know, there is no way of predicting the next random number in the sequence.
b) No matter how long a sequence of random numbers you know, there is no way of knowing which seed was used to start them off, other than brute force guesswork.
c) You can request secure random numbers of various sizes.
d) You can seed a newly-created SRNG with various different-sized values. Every newly-created SRNG you create and seed with the same value will produce the same sequence of random numbers.
I should add that I'm assuming that this code is used on Windows:
SecureRandom sr = SecureRandom.getInstance("SHA1PRNG", "SUN");
Is my basic understanding correct? Thanks in advance.
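To make point d) concrete, here is a minimal sketch (assuming the Sun provider's "SHA1PRNG", and that the seed is set before any output is requested — see the caveats in the answers below about setSeed on an already-seeded instance):

```java
import java.security.SecureRandom;
import java.util.Arrays;

public class DeterminismDemo {
    public static void main(String[] args) throws Exception {
        // Two SHA1PRNG instances, each seeded identically *before* any
        // output is drawn, produce the same byte stream on the Sun provider.
        SecureRandom a = SecureRandom.getInstance("SHA1PRNG");
        SecureRandom b = SecureRandom.getInstance("SHA1PRNG");
        byte[] seed = "fixed seed".getBytes("UTF-8");
        a.setSeed(seed);
        b.setSeed(seed);

        byte[] out1 = new byte[16];
        byte[] out2 = new byte[16];
        a.nextBytes(out1);
        b.nextBytes(out2);
        System.out.println(Arrays.equals(out1, out2)); // true on the Sun provider
    }
}
```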
I have some further questions for anyone who's reasonably expert in crypto. They relate to seeding a SRNG as opposed to letting it seed itself on first use.
e) What difference, if any, does it make to the random numbers generated, if you seed a SRNG with a long integer as opposed to an array of 8 bytes?
f) If I seed a SRNG with, say, 256 bytes is there any other seed that can produce the same sequence of random numbers?
g) Is there some kind of optimum seed size? It seems to me that this might be a meaningless question.
h) If I encrypt a plaintext by seeding a SRNG with, say, 256 bytes then getting it to generate random bytes to XOR with the bytes in the plaintext, how easy would it be for an eavesdropper to decrypt the resulting ciphertext? How long might it take? Am I right in thinking that the eavesdropper would have to know, guess, or calculate the 256-byte seed?
I've looked at previous questions about SecureRandom and none seem to answer my particular concerns.
If any of these questions seem overly stupid, I'd like to reiterate that I'm very much a beginner in studying this field. I'd be very grateful for any input as I want to understand how Java SecureRandom objects might be used in cryptography.
d) This is true for a PRNG. It is not always true for a CSRNG. Read the Javadoc for SecureRandom.setSeed(): "The given seed supplements, rather than replaces, the existing seed. Thus, repeated calls are guaranteed never to reduce randomness."
Any reasonable CSRNG will have "invisible" sources of entropy that you cannot explicitly control, often various internal parameters taken from the Operating System level. Hence there is more seeding than any number you explicitly pass to the RNG.
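You can observe the "supplements, rather than replaces" behavior directly. In this sketch (assuming the default provider), two default-constructed instances have already self-seeded from the OS before the explicit setSeed call, so identical explicit seeds still yield different streams:

```java
import java.security.SecureRandom;
import java.util.Arrays;

public class SupplementDemo {
    public static void main(String[] args) {
        // Default-constructed instances seed themselves from the OS, so an
        // explicit setSeed only *adds* to that state instead of replacing it.
        SecureRandom a = new SecureRandom();
        SecureRandom b = new SecureRandom();
        a.setSeed(42L);
        b.setSeed(42L);

        byte[] out1 = new byte[16];
        byte[] out2 = new byte[16];
        a.nextBytes(out1);
        b.nextBytes(out2);
        // Despite identical explicit seeds, the streams differ.
        System.out.println(Arrays.equals(out1, out2)); // false
    }
}
```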
OK, in order:
a) correct
b) correct
c) correct, you can even request a number in a range [0, n) using nextInt(n)
d) as good as correct: the implementation of SHA1PRNG is not defined by any public standard, and there are indications that the implementation has changed over time, so this is only true for the Sun provider, and then only for a specific runtime configuration
e) as the API clearly indicates that all the bytes within the long are used ("using the eight bytes contained in the given long seed") there should not be any difference regarding the amount of entropy added to the state
Note that a quick check shows that setSeed(long) behaves entirely differently from setSeed(byte[]), the main difference being that the seed from the long value is always mixed in with randomness retrieved from the system, even if it is the first call after the SecureRandom instance is constructed.
f) yes - an infinite number of seeds generate the same stream; since a hash function is used, it will be impossible to find one though
g) if you mix in additional entropy, then the more entropy the better, but there is no minimum; if you use it as the only seed then you should not start off with less than 20 bytes of seed, that is: if you want to keep the seed to the same security constraints as the inner state of the PRNG
And I would add that if you use fewer than 64 bits of entropy you are certainly in the danger zone. Note that a byte does not always carry 8 bits of entropy: a byte array of size 8 may hold 64 bits of entropy or fewer.
h) that's basically a hash-based stream cipher; it's secure, so an attacker has little to no chance (given that you don't reuse the seed), but it is a horribly unreliable (see answer d) and slow stream cipher, so please never ever do this - use a Cipher with "AES/CTR/NoPadding" or "AES/GCM/NoPadding" instead
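For reference, the recommended alternative looks roughly like this. This is only an illustration of the AES/CTR transformation named above (the key handling and message are placeholders; in CTR mode the IV must never be reused with the same key):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.security.SecureRandom;

public class CtrDemo {
    public static void main(String[] args) throws Exception {
        // Fresh random key and IV; CTR requires a unique IV per message.
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ct = enc.doFinal("attack at dawn".getBytes("UTF-8"));

        Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] pt = dec.doFinal(ct);
        System.out.println(new String(pt, "UTF-8")); // attack at dawn
    }
}
```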
e) I don't think that it makes a difference. Assuming that the long and the 8-byte array contain the same data.
f) In principle, yes. If your seed is larger than the internal state of the RNG, then there may exist some other seed that will result in the same internal state. If the seed is smaller than the state, then there shouldn't be. I don't know what SecureRandom's internal state looks like.
g) It's not the size of the seed that matters; it's the amount of entropy in it. You need there to be at least as much entropy in your seed as the security you expect out of the RNG; I'm not quite sure what best practices are here.
h) I'm not sure how easy it would be to break the RNG-based stream cipher that you propose. But I would recommend against using it in practice, because it's not a standard cryptographic construction that has been reviewed by experts and has reasonable security proofs. Remember the Rules of Crypto:
Never design your own crypto.
Never implement your own crypto.
Anyone can design crypto that they can't break themselves.
Related
I have read that, generally, some implementations of SecureRandom may produce true random numbers.
In particular, the Android docs say
instances of this class will generate an initial seed using an internal entropy source, such as /dev/urandom
but does that mean it will produce true random numbers (i.e., rather than pseudo-random numbers)?
And if I use SecureRandom in Android in this manner...
SecureRandom sr = new SecureRandom();
...will I get a truly random output whenever I call sr.nextBoolean()?
Or is the output likely to be more (or less?) random if I, instead, obtain output by doing this each time:
new SecureRandom().nextBoolean()?
"True" and "pseudorandom" mean a lot of different things to different people when applied to random numbers. It's best to avoid those terms.
/dev/urandom got a bad rep because people do not understand the differences between it and /dev/random (much, much less difference than you would expect).
If you're asking whether seeding by /dev/urandom might compromise the fitness of SecureRandom to use it for cryptographic purposes, the answer is a resounding "no".
If you've got some time you might want to read my essay about the whole issue.
According to the Android Developer Docs:
(SecureRandom) complies with the statistical random number generator tests specified in FIPS 140-2, Security Requirements for Cryptographic Modules, section 4.9.1
However, the same caveats apply to Android as to Java:
Many SecureRandom implementations are in the form of a pseudo-random number generator (PRNG), which means they use a deterministic algorithm to produce a pseudo-random sequence from a true random seed. Other implementations may produce true random numbers, and yet others may use a combination of both techniques.
So, the short answer is: it depends on the implementation, but if you're ok with FIPS 140-2, then SecureRandom is legally sufficient for your purposes.
The key answer is that /dev/urandom, as defined by the Linux kernel, is guaranteed not to block; the emphasis is on not stalling the user while sufficient entropy is gathered. If the Android docs say they are using /dev/urandom to initialize, and there is insufficient entropy in the kernel to supply random numbers, the kernel will fall back to a pseudo-random algorithm.
Per the kernel documentation, /dev/urandom is considered sufficient for almost all purposes except "long lived [encryption] keys". Given the description of your intended use, I suspect android SecureRandom will prove to be random enough for your purposes.
I know that if we declare:
SecureRandom random = new SecureRandom();
It initializes with the default algorithm for generating random numbers, NativePRNG, which reads /dev/random to generate a truly random seed. Now we have a truly random seed of 160 bits, but I am confused about what happens when we call random.nextBytes(bytes);. How does it generate bytes from the seed? Does it read /dev/random again, or something else?
Thanks.
N.B.: I am looking for the default behavior of Java 7 on a Linux/Mac box.
From the Java API docs:
Many SecureRandom implementations are in the form of a pseudo-random number generator (PRNG), which means they use a deterministic algorithm to produce a pseudo-random sequence from a true random seed. Other implementations may produce true random numbers, and yet others may use a combination of both techniques.
So whether nextBytes(bytes) returns true random bytes from /dev/random or whether it returns pseudo-random numbers generated from the true random seed depends. The second case means that using the initially random seed, a deterministic but seemingly random (and hence, pseudo-random) number sequence is generated by any calls to the SecureRandom.
Java 7 allows a PRNG source to be configured; on Linux the default is NativePRNG, while on Windows it is SHA1PRNG. You can also specify SHA1PRNG on Linux, but the default NativePRNG is the better option. SHA1PRNG generates its pseudo-random bits and bytes through the use of SHA-1. On Linux (and possibly other Unixes, where the mechanism is "NativePRNG"), the algorithm reads from /dev/random and /dev/urandom, as long as there is enough entropy available through either of those. For the sake of completeness, from the Linux man page on random:
A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver.
Therefore, on Linux at least, your SecureRandom will have a certain amount of true random output until /dev/random blocks due to a shortage of entropy, however if you request too many random bits, they will eventually start being generated by the underlying /dev/urandom machinery, which may use SHA1 or some other cryptographic hashing algorithm in a PRNG.
It's best to create a SecureRandom without specifying any explicit seed yourself, as it will seed itself (by default via /dev/random and /dev/urandom for the NativePRNG on Linux) with a good seed. Calling nextBytes(bytes) every few minutes, even for a large number of bytes, is unlikely to be an issue in almost any circumstance. Even if you are using the NativePRNG and it resorts to getting pseudo-random bytes from /dev/urandom via something like SHA-1, the output will still be extremely difficult to predict.
If you are asking for gigabytes of randomness, it might be good to re-seed, either using some output from the SecureRandom itself or by providing your own seed. Note that it should be safe providing any kind of seed to setSeed(), as SecureRandom internally augments the current seed by feeding the seed you provide and the previous seed to something like SHA-1 or another cryptographic hashing algorithm. However, it is still best to create the initial SecureRandom without giving your own seed.
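A re-seeding sketch along those lines might look like this (assuming the default provider; note that generateSeed draws from the provider's seed source, which on some Linux configurations can block briefly if entropy is scarce):

```java
import java.security.SecureRandom;

public class ReseedDemo {
    public static void main(String[] args) {
        SecureRandom sr = new SecureRandom();
        byte[] bulk = new byte[1 << 20];
        sr.nextBytes(bulk); // ...a large amount of output...

        // Occasionally fold in fresh entropy. Because setSeed supplements
        // the existing state rather than replacing it, this can never
        // reduce the randomness of the generator.
        sr.setSeed(sr.generateSeed(32));
        sr.nextBytes(bulk);
    }
}
```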
Looking at using a hashing algorithm that accepts a string and returns a 64bit signed integer value.
It doesn't have to be cryptographically sound, just provide a decent collision rate to be used as a key for distributed storage.
I'm looking at murmur hash that seems to fit the bill
Curious how the properties of this compare to taking the first 64 bits of something like an MD5 hash.
Secure hashes - even theoretically 'broken' ones like MD5 - exhibit distribution that's indistinguishable from randomness (or else they wouldn't be secure). Thus, they're as close to perfect as it's possible to be.
Like all general purpose hash functions, murmurhash trades off correctness for speed. While it shows very good distribution characteristics for most inputs, it has its own pathological cases, such as the one documented here, where repeated 4-byte sequences lead to collisions more often than desired.
In short: Using a secure hash function will never be worse, and will sometimes be better than using a general purpose hash. It will also be substantially slower, however.
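If you do go the secure-hash route, taking "the first 64 bits of MD5" as a signed long is straightforward in Java. A minimal sketch (the helper name md5Prefix64 is made up for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Md5Key {
    // Hypothetical helper: first 8 bytes of the MD5 digest as a signed long.
    static long md5Prefix64(String s) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(s.getBytes(StandardCharsets.UTF_8));
        return ByteBuffer.wrap(digest).getLong(); // big-endian first 8 bytes
    }

    public static void main(String[] args) throws Exception {
        // MD5("") = d41d8cd98f00b204e9800998ecf8427e, so the 64-bit prefix is:
        System.out.println(Long.toHexString(md5Prefix64(""))); // d41d8cd98f00b204
    }
}
```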
I'm in the rough stages of creating a Spades game and I'm having a hard time thinking up a cryptographically-secure way to shuffle the cards. So far, I have this:
Grab 32-bit system time before calling Random
Grab 32 bits from Random
Grab 32-bit system time after calling Random
Multiply the system times together bitwise and xor the two halves together
xor the 32 bits from Random with the value from the first xor, and call this the seed
Create a new Random using the seed
And basically from here I both save the 32-bit result from each Random instance and use it to seed the next instance until I get my desired amount of entropy. From here I create an alternating-step generator to produce the final 48-bit seed value I use for my final Random instance to shuffle the cards.
My question pertains to the portion before the alternating-step generator, but if not, since I'll be using a CSPRNG anyway would this algorithm be good enough?
Alternatively, would the final Random instance be absolutely necessary? Could I get away with grabbing six bits at a time off the ASG and taking the value mod 52?
No, it may be secure enough for your purposes, but it is certainly not cryptographically secure. On a fast system you may have two identical system times. On top of that, the multiplication will only remove entropy.
If you want, you can download the FIPS tests for RNGs, feed in a load of data from your RNG, and then test it. Note that even I have trouble actually reading the documentation on the RNG tests, so be prepared to do some math.
All this while the Java platform already contains a secure PRNG (which is based on SHA-1 and uses the operating system's RNG as its seed). The operating system almost certainly uses some time-based information in its seed, so there is no need to input it yourself (of course, you may always mix in the system time if you really want to).
Sometimes the easy answer is the best one:
List<Card> deck; // Get this from whereever.
SecureRandom rnd = new SecureRandom();
java.util.Collections.shuffle(deck, rnd);
// deck is now securely shuffled!
You need a good shuffling algorithm and a way of gathering enough entropy to feed it so that it can theoretically cover all possible permutations for a deck of cards. See this earlier question: Commercial-grade randomization for Poker game
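A quick back-of-envelope calculation shows why the entropy matters: a 52-card deck has 52! orderings, and log2(52!) is about 226 bits, so a 48-bit seed (as in the question's ASG) can only ever reach a vanishingly small fraction of all possible shuffles:

```java
public class ShuffleEntropy {
    public static void main(String[] args) {
        // log2(52!) = log2(2) + log2(3) + ... + log2(52)
        double bits = 0;
        for (int i = 2; i <= 52; i++) {
            bits += Math.log(i) / Math.log(2);
        }
        System.out.printf("log2(52!) = %.1f bits%n", bits); // about 225.6
    }
}
```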
I've simplified a bug I'm experiencing down to the following lines of code:
int[] vals = new int[8];
for (int i = 0; i < 1500; i++)
vals[new Random(i).nextInt(8)]++;
System.out.println(Arrays.toString(vals));
The output is: [0, 0, 0, 0, 0, 1310, 190, 0]
Is this just an artifact of choosing consecutive numbers to seed Random and then using nextInt with a power of 2? If so, are there other pitfalls like this I should be aware of, and if not, what am I doing wrong? (I'm not looking for a solution to the above problem, just some understanding about what else could go wrong)
Dan, well-written analysis. Since the Javadoc is pretty explicit about how the numbers are calculated, the mystery isn't so much why this happened as whether there are other anomalies like this to watch out for. I didn't see any documentation about consecutive seeds, and I'm hoping someone with experience with java.util.Random can point out other common pitfalls.
As for the code, the need is for several parallel agents to have repeatable random behavior; each happens to choose from an enum 8 elements long as its first step. Once I discovered this behavior, I changed the code so that the seeds all come from a master Random object created from a known seed. In the former (sequentially-seeded) version of the program, all behavior quickly diverged after that first call to nextInt, so it took quite a while to narrow the program's behavior down to the RNG library, and I'd like to avoid that situation in the future.
As much as possible, the seed for an RNG should itself be random. The seeds that you are using are only going to differ in one or two bits.
There's very rarely a good reason to create two separate RNGs in the one program. Your code is not one of those situations where it makes sense.
Just create one RNG and reuse it, then you won't have this problem.
In response to comment from mmyers:
Do you happen to know java.util.Random
well enough to explain why it picks 5
and 6 in this case?
The answer is in the source code for java.util.Random, which is a linear congruential RNG. When you specify a seed in the constructor, it is manipulated as follows.
seed = (seed ^ 0x5DEECE66DL) & mask;
Where the mask simply retains the lower 48 bits and discards the others.
When generating the actual random bits, this seed is manipulated as follows:
randomBits = (seed * 0x5DEECE66DL + 0xBL) & mask;
Now if you consider that the seeds used by Parker were sequential (0 to 1499), and that each was used once and then discarded, the first four seeds generated the following four sets of random bits:
101110110010000010110100011000000000101001110100
101110110001101011010101011100110010010000000111
101110110010110001110010001110011101011101001110
101110110010011010010011010011001111000011100001
Note that the top 10 bits are identical in each case. This is a problem because he only wants to generate values in the range 0 - 7 (which requires only a few bits), and the RNG implementation does this by shifting the high bits to the right and discarding the low bits. It does this because, in the general case, the high bits are more random than the low bits. In this case they are not, because the seed data was poor.
Finally, to see how these bits convert into the decimal values that we get, you need to know that java.util.Random makes a special case when n is a power of 2. It requests 31 random bits (the top 31 bits from the above 48), multiplies that value by n and then shifts it 31 bits to the right.
Multiplying by 8 (the value of n in this example) is the same as shifting left 3 places. So the net effect of this procedure is to shift the 31 bits 28 places to the right. In each of the 4 examples above, this leaves the bit pattern 101 (or 5 in decimal).
If we didn't discard the RNGs after just one value, we would see the sequences diverge. While the four sequences above all start with 5, the second values of each are 6, 0, 2 and 4 respectively. The small differences in the initial seeds start to have an influence.
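The divergence described above can be checked directly. This small sketch draws two values from each of the first four sequential seeds; the first draws coincide while the second draws differ (6, 0, 2 and 4, per the analysis above):

```java
import java.util.Random;

public class SequentialSeedDemo {
    public static void main(String[] args) {
        for (int seed = 0; seed < 4; seed++) {
            Random r = new Random(seed);
            // First nextInt(8) is identical across these seeds; the
            // second call already diverges as the seeds mix further.
            System.out.println(r.nextInt(8) + " then " + r.nextInt(8));
        }
    }
}
```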
In response to the updated question: java.util.Random is thread-safe, you can share one instance across multiple threads, so there is still no need to have multiple instances. If you really have to have multiple RNG instances, make sure that they are seeded completely independently of each other, otherwise you can't trust the outputs to be independent.
As to why you get these kind of effects, java.util.Random is not the best RNG. It's simple, pretty fast and, if you don't look too closely, reasonably random. However, if you run some serious tests on its output, you'll see that it's flawed. You can see that visually here.
If you need a more random RNG, you can use java.security.SecureRandom. It's a fair bit slower, but it works properly. One thing that might be a problem for you though is that it is not repeatable. Two SecureRandom instances with the same seed won't give the same output. This is by design.
So what other options are there? This is where I plug my own library. It includes 3 repeatable pseudo-RNGs that are faster than SecureRandom and more random than java.util.Random. I didn't invent them, I just ported them from the original C versions. They are all thread-safe.
I implemented these RNGs because I needed something better for my evolutionary computation code. In line with my original brief answer, this code is multi-threaded but it only uses a single RNG instance, shared between all threads.