java.util.Random digging a bit deep - java

I was reading Data Structures and Algorithms in Java book and I came across the following question that I would like to get help with:
Suppose you are given an array, A, containing 100 integers that were generated using the method r.nextInt(10), where r is an object of type java.util.Random. Let x denote the product of the integers in A. There is a single number that x will equal with probability at least 0.99. What is that number and what is a formula describing the probability that x is equal to that number?
I think x equals zero, since a 0 will almost certainly be generated at some point. However, that's just a guess, and I wasn't able to find the formula. The Java documentation doesn't specify the randomization equation, and I wasn't able to find any related topics, either here or by searching on Google.
I would like to get some help with the probability formula please. Thanks in advance.

The possible values for the array elements are 0 .. 9, each with probability 1/10. If one of the elements is 0, the product will be 0 as well. So we calculate the probability that at least one element is 0.
It turns out, this is the opposite of all elements being greater than zero. The probability for an element to be greater than 0 is 9/10, and the probability that all elements are greater than zero is therefore (9/10)^100.
The probability that at least one element is 0 is therefore 1 - (9/10)^100 which is approximately 0.9999734.
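A quick Monte Carlo sanity check of that formula (the class and variable names here are mine, purely illustrative):

import java.util.Random;

public class ZeroProductCheck {
    public static void main(String[] args) {
        Random r = new Random();
        int trials = 1_000_000;
        int zeroProducts = 0;
        for (int t = 0; t < trials; t++) {
            for (int i = 0; i < 100; i++) {
                if (r.nextInt(10) == 0) {  // one zero factor makes the product 0
                    zeroProducts++;
                    break;
                }
            }
        }
        System.out.println("simulated: " + zeroProducts / (double) trials);
        System.out.println("formula:   " + (1 - Math.pow(0.9, 100)));
    }
}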

Regarding nextInt: The javadoc specifies:
uniformly distributed int value between 0 (inclusive) and the
specified value (exclusive)
a "uniform distribution" is a distribution where each outcome is equally likely.
hence the chances for a particular outcome are "1/[number of possible outcomes]" (so they all add up to 1).
Regarding the array:
Filling the array can be regarded as observing 100 statistically independent events.
You should read up on how the maths works when combining multiple independent events: the probability that several independent events all occur is the product of their individual probabilities.

https://docs.oracle.com/javase/7/docs/api/java/util/Random.html#nextInt(int)
As you already expect, it will almost certainly be 0.
r.nextInt(10)
will return numbers from 0 to 9
Any product that includes a 0 is 0, so for 100 random numbers the chance that no 0 is ever returned by this function is pretty low.


How is it possible to get a random number with a specific probability?

I wanted to make a random number picker in the range 1-50000.
But I want to do it so that the larger the number, the smaller the probability.
Probability like (1/2*number) or something else.
Can anybody help?
You need a mapping function of some sort. What you get from Random is a few 'primitive' constructs that you can trust do exactly what their javadoc spec says they do:
.nextInt(X), which returns a uniformly random number (i.e. the probability chart is an exact horizontal line) between 0 and X-1 inclusive.
.nextBoolean() which gives you 1 bit of randomness.
.nextDouble(), giving you a mostly uniform random number between 0.0 and 1.0
.nextGaussian(), which gives you a random number whose probability chart is the standard normal (bell) curve, with standard deviation = 1.0 and midpoint (average) of 0.0.
For the double-returning methods, you run into some trouble if you want exact precision. Computers aren't magical. As a consequence, if you e.g. write this mapping function to turn nextDouble() into a standard uniformly distributed 6-sided die roll, you'd think: int dieRoll = 1 + (int) (rnd.nextDouble() * 6); would do it. Had double been perfect, you'd be right. But they aren't, so, instead, best case scenario, 4 of 6 die faces are going to come up 750599937895083 times, and the other 2 die faces are going to come up 750599937895082 times. It'll be hard to really notice that, but it is provably imperfect. I assume this kind of tiny deviation doesn't matter to you, but, it's good to be aware that anytime you so much as mention double, inherent tiny errors creep into everything and you can't really stop that from happening.
What you need is some sort of mapping function that takes any amount of such randomly provided data (from those primitives, and really only from nextInt/nextBoolean if you want to avoid the errors that double inherently brings) to produce what you want.
For example, imagine instead the 'primitive' I gave you is a uniform random value between 1 and 6, inclusive, i.e.: A standard 6-sided die roll. And I ask you to come up with a uniform algorithm (as in, each value is equally likely) to produce a number between 2 and 12, inclusive.
Perhaps you might think: Easy, just roll 2 dice and add em up. But that would be incorrect: 7 is far more likely than 12.
Instead, you'd roll one die and just register whether it came up even or odd. Then you roll the second die: that's your result, unless the first die was odd, in which case you add 6 to it. That gives 12 equally likely outcomes (1 through 12), so you only need to discard one of them: if the first die was even and the second die shows 1 (which would map to 1, outside the target range), you start the process over again; eventually you're bound to roll something else.
That'd be uniform random.
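Here's a minimal sketch of that recipe in Java (the method names dieRoll and uniform2to12 are mine, purely illustrative):

import java.util.Random;

public class TwoToTwelve {
    static final Random RND = new Random();

    // Our only 'primitive': a fair six-sided die roll, uniform over 1..6.
    static int dieRoll() { return 1 + RND.nextInt(6); }

    // Two rolls give 12 equally likely outcomes 1..12; rejecting the single
    // outcome 1 leaves 11 equally likely values, i.e. uniform over 2..12.
    static int uniform2to12() {
        while (true) {
            boolean addSix = dieRoll() % 2 == 1;       // first roll: odd -> high half
            int result = dieRoll() + (addSix ? 6 : 0); // uniform over 1..12
            if (result != 1) return result;            // reject 1 (even first roll, second roll 1)
        }
    }

    public static void main(String[] args) {
        System.out.println(uniform2to12());
    }
}
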
You can apply the same principle to your question. You need a mathematical function that maps the 'horizontal line' of .nextInt() to whatever curve you want. For example, it sounds like you want to perhaps generate something and then take the square root and round it down, maybe. You're going to have to draw out or write a formula that precisely describes the probability density.
Here's an example:
while (true) {
    int v = (int) (50000.0 * Math.abs(r.nextGaussian()));
    if (v >= 1 && v <= 50000) return v;
}
That returns you a roughly normally distributed value, 1 being the most likely, 50000 being the least likely.
One simple formula that will give you a very close approximation to what you want is
Random random = new Random();
int result = (int) Math.pow(50001, random.nextDouble());
That will give a result in the range 1 - 50000, where the probability of each result is approximately proportional to 1 / result, which is what you asked for.
The reason why it works is that the probability of result being any value n within the range is P(n <= 50001^x < n+1), where x is uniformly distributed in [0,1). That's the probability that x falls between log(n) and log(n+1), where the logs are base 50001. But that probability is proportional to log(1 + 1/n), which is very close to 1/n.
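A rough empirical check of that claim (the loop bounds and bucket count here are arbitrary choices of mine):

import java.util.Random;

public class InverseWeightedCheck {
    public static void main(String[] args) {
        Random random = new Random();
        int[] counts = new int[11];                 // tally results 1..10 only
        for (int i = 0; i < 10_000_000; i++) {
            int result = (int) Math.pow(50001, random.nextDouble());
            if (result <= 10) counts[result]++;
        }
        for (int n = 1; n <= 10; n++) {
            // counts are proportional to log(1 + 1/n), i.e. close to 1/n
            System.out.println(n + ": " + counts[n]);
        }
    }
}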

Effective Java Item 47: Know and use your libraries - Flawed random integer method example

In the example Josh gives of the flawed random method that generates a positive random number with a given upper bound n, I don't understand two of the flaws he states.
The method from the book is:
private static final Random rnd = new Random();

// Common but deeply flawed
static int random(int n) {
    return Math.abs(rnd.nextInt()) % n;
}
He says that if n is a small power of 2, the sequence of random numbers that are generated will repeat itself after a short period of time. Why is this the case? The documentation for Random.nextInt() says it "Returns the next pseudorandom, uniformly distributed int value from this random number generator's sequence." So shouldn't the sequence repeat itself for any small n? Why does this apply only to powers of 2?
Next he says that if n is not a power of 2, some numbers will be returned on average more frequently than others. Why does this occur, if Random.nextInt() generates random integers that are uniformly distributed? (He provides a code snippet which clearly demonstrates this but I don't understand why this is the case, and how this is related to n being a power of 2).
Question 1: if n is a small power of 2, the sequence of random numbers that are generated will repeat itself after a short period of time.
This is not a corollary of anything Josh is saying; rather, it is simply a known property of linear congruential generators. Wikipedia has the following to say:
A further problem of LCGs is that the lower-order bits of the generated sequence have a far shorter period than the sequence as a whole if m is set to a power of 2. In general, the n-th least significant digit in the base-b representation of the output sequence, where b^k = m for some integer k, repeats with at most period b^n.
This is also noted in the Javadoc:
Linear congruential pseudo-random number generators such as the one implemented by this class are known to have short periods in the sequence of values of their low-order bits.
The other version of the function, Random.nextInt(int), works around this by using different bits in this case (emphasis mine):
The algorithm treats the case where n is a power of two specially: it returns the correct number of high-order bits from the underlying pseudo-random number generator.
This is a good reason to prefer Random.nextInt(int) over using Random.nextInt() and doing your own range transformation.
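In code, the fix is then a one-liner, keeping the book's signature and the same rnd field:

static int random(int n) {
    return rnd.nextInt(n);   // uniform over 0..n-1, no modulo bias
}
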
Question 2: Next he says that if n is not a power of 2, some numbers will be returned on average more frequently than others.
There are 2^32 distinct numbers that can be returned by nextInt(). If you try to put them into n buckets by using % n, and n isn't a power of 2, some buckets will have more numbers than others. This means that some outcomes will occur more frequently than others even though the original distribution was uniform.
Let's look at this using small numbers. Say nextInt() returned four equiprobable outcomes: 0, 1, 2 and 3. Let's see what happens when we apply % 3 to them:
0 maps to 0
1 maps to 1
2 maps to 2
3 maps to 0
As you can see, the algorithm would return 0 twice as frequently as it would return each of 1 and 2.
This does not happen when n is a power of two, since 2^32 is evenly divisible by any smaller power of two. Consider n=2:
0 maps to 0
1 maps to 1
2 maps to 0
3 maps to 1
Here, 0 and 1 occur with the same frequency.
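The same bucket argument in runnable form (a toy model with four generator outputs, not real nextInt() output):

import java.util.Arrays;

public class BucketDemo {
    public static void main(String[] args) {
        int[] mod3 = new int[3];
        int[] mod2 = new int[2];
        for (int v = 0; v <= 3; v++) {   // four equally likely generator outputs
            mod3[v % 3]++;
            mod2[v % 2]++;
        }
        System.out.println(Arrays.toString(mod3)); // [2, 1, 1] -> 0 is twice as likely
        System.out.println(Arrays.toString(mod2)); // [2, 2]    -> uniform
    }
}
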
Additional resources
Here are some additional -- if only tangentially relevant -- resources related to LCGs:
Spectral tests are statistical tests used to assess the quality of LCGs. Read more here and here.
A collection of classical pseudorandom number generators with linear structures has some pretty scatterplots (the generator used in Java is called DRAND48).
There is an interesting discussion on crypto.SE about predicting values from Java's generator.
1) When n is a power of 2, rnd.nextInt() % n is equivalent to selecting a few of the lower bits of the original value. The lower bits of numbers produced by the type of generator used by Java are known to be "less random" than the higher bits. It's just a property of the formula used for generating the numbers.
2) Imagine that the largest possible value returned by random() is 10, and n = 7. Doing value % 7 maps the numbers 7, 8, 9 and 10 to 0, 1, 2 and 3 respectively. Therefore, even if the original numbers are uniformly distributed, the result will be heavily biased towards the lower numbers, because 0 through 3 will appear twice as often as 4, 5 and 6. In this case, this happens regardless of whether n is a power of two. But if instead of 10 the largest value were, say, 15 (which is 2^4 - 1), then any n that is a power of two would give a uniform distribution, because the total number of possible values would be exactly divisible by the number of possible remainders, leaving no "excess" values at the end of the range to cause bias.

Generate N random numbers in given ranges that sum up to a given sum

First time here at Stack Overflow. I hope someone can help me with my search for an algorithm.
I need to generate N random numbers in given ranges that sum up to a given sum!
For example: generate 3 numbers that sum up to 11.
Ranges:
Value between 1 and 3.
Value between 5 and 8.
Value between 3 and 7.
The generated numbers for this example could be: 2, 5, 4.
I already searched a lot and couldn't find the solution I need.
It is possible to generate N numbers with a constant sum using modulo, like this:
generate random numbers of which the sum is constant
But I couldn't get that done with ranges.
Or by generating N random values, summing them up, dividing the constant sum by the random sum, and then multiplying each random number by that quotient, as proposed here.
The main reason I can't adopt those solutions is that each of my random values has a different range, and I need the values to be uniformly distributed within their ranges (no frequency spikes at min/max, for example, which happens if I clamp values that fall below min or above max).
I also thought of a solution: pick a random variable (in that example, value 1, 2 or 3), generate its value within its range (either between min and max, or between min and the rest of the sum, whichever is smaller), subtract that value from my given sum, and keep going until everything is distributed. But that would be horribly inefficient. I could really use an algorithm with a fixed runtime.
I'm trying to get this running in Java, but that info is not that important, unless someone already has a solution ready. All I need is a description or an idea of an algorithm.
First, note that the problem is equivalent to:
Generate k numbers that sum to a number y, such that each of x_1, ..., x_k has an upper limit.
This form can be reached by simply subtracting each lower bound (from its variable and from the target sum) - so in your example, it is equivalent to:
Generate 3 numbers such that x1 <= 2; x2 <= 3; x3 <= 4; x1 + x2 + x3 = 2
Note that the 2nd problem can be solved in various ways; one of them is:
Generate a list with h_i repeats per element - where h_i is the upper limit for element i - shuffle the list, and pick the first elements (as many as the remaining sum; a runnable sketch follows below).
In your example, the list is: [x1,x1,x2,x2,x2,x3,x3,x3,x3] - shuffle it and choose the first two elements.
(*) Note that shuffling the list can be done using the Fisher-Yates algorithm (you can abort the algorithm in the middle, once you have drawn the desired number of elements).
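Here is a minimal sketch of that idea with the example's numbers (variable names are mine). Because index i appears only cap[i] times in the pool, the upper bounds are respected automatically; note, though, that this makes every valid combination reachable without necessarily making them all equally likely:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RangedSum {
    public static void main(String[] args) {
        int[] min = {1, 5, 3};
        int[] cap = {2, 3, 4};             // upper bounds {3, 8, 7} minus the minima
        int excess = 11 - (1 + 5 + 3);     // = 2

        List<Integer> pool = new ArrayList<>();
        for (int i = 0; i < cap.length; i++)
            for (int r = 0; r < cap[i]; r++)
                pool.add(i);               // index i appears cap[i] times

        Collections.shuffle(pool);         // Fisher-Yates under the hood
        int[] result = min.clone();
        for (int k = 0; k < excess; k++)
            result[pool.get(k)]++;         // distribute the excess

        System.out.println(Arrays.toString(result)); // e.g. [2, 5, 4], sums to 11
    }
}
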
Add up the minimum values. In this case 1 + 5 + 3 = 9
11 - 9 = 2, so you have to distribute 2 between the three numbers (eg: +2,+0,+0 or +0,+1,+1).
I leave the rest for you, it's relatively easy to create a uniform distribution after this transformation.
This problem is equivalent to randomly distributing an excess of 2 over the minimum of 9 on 3 positions.
So you start with the minima (1/5/3) and then cycle 2 times, generating a (pseudo-)random index in [0-2] (3 positions) and incrementing the value at that index.
e.g.
Start 1/5/3
1st random=1 ... increment index 1 ... 1/6/3
2nd random=0 ... increment index 0 ... 2/6/3
2+6+3=11
Edit
Reading this a second time, I realize this is exactly what @KarolyHorvath mentioned.

8 puzzle: Solvability and shortest solution

I have built an 8-puzzle solver using Breadth First Search. I would now like to modify the code to use heuristics. I would be grateful if someone could answer the following two questions:
Solvability
How do we decide whether an 8-puzzle is solvable (given a starting state and a goal state)?
This is what Wikipedia says:
The invariant is the parity of the permutation of all 16 squares plus
the parity of the taxicab distance (number of rows plus number of
columns) of the empty square from the lower right corner.
Unfortunately, I couldn't understand what that meant. It was a bit complicated to understand. Can someone explain it in a simpler language?
Shortest Solution
Given a heuristic, is it guaranteed to give the shortest solution using the A* algorithm? To be more specific, will the first node in the open list always have a depth (or number of movements made so far) which is the minimum of the depths of all the nodes present in the open list?
Should the heuristic satisfy some condition for the above statement to be true?
Edit : How is it that an admissible heuristic will always provide the optimal solution? And how do we test whether a heuristic is admissible?
I would be using the heuristics listed here
Manhattan Distance
Linear Conflict
Pattern Database
Misplaced Tiles
Nilsson's Sequence Score
N-MaxSwap X-Y
Tiles out of row and column
For clarification, from Eyal Schneider:
I'll refer only to the solvability issue. Some background in permutations is needed.
A permutation is a reordering of an ordered set. For example, 2134 is a reordering of the list 1234, where 1 and 2 swap places. A permutation has a parity property, which refers to the parity of its number of inversions. For example, in the following permutation you can see that exactly 3 inversions exist (the out-of-order pairs at positions 2-3, 2-4 and 3-4):
1234
1432
That means that the permutation has an odd parity. The following permutation has an even parity (positions 1-2 and 3-4):
1234
2143
Naturally, the identity permutation (which keeps the items order) has an even parity.
Any state in the 15 puzzle (or 8 puzzle) can be regarded as a permutation of the final state, if we look at it as a concatenation of the rows, starting from the first row. Note that every legal move changes the parity of the permutation (because we swap two elements, and the number of inversions involving items in between them must be even). Therefore, if you know that the empty square has to travel an even number of steps to reach its final state, then the permutation must also be even. Otherwise, you'll end with an odd permutation of the final state, which is necessarily different from it. Same with odd number of steps for the empty square.
According to the Wikipedia link you provided, the criterion above is necessary and sufficient for a given puzzle to be solvable.
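If you just want a solvability test in code: for the standard 3x3 goal (tiles 1..8 in order, blank last), a state is solvable iff its inversion count, ignoring the blank, is even. A minimal sketch under that assumption (for a different goal state, compare the inversion parities of start and goal instead):

// tiles is the board read row by row; 0 denotes the blank
static int inversions(int[] tiles) {
    int inv = 0;
    for (int i = 0; i < tiles.length; i++)
        for (int j = i + 1; j < tiles.length; j++)
            if (tiles[i] != 0 && tiles[j] != 0 && tiles[i] > tiles[j])
                inv++;
    return inv;
}

// 3x3 board, standard goal: solvable iff the inversion count is even
static boolean isSolvable(int[] tiles) {
    return inversions(tiles) % 2 == 0;
}
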
The A* algorithm is guaranteed to find the shortest solution (or one of them, if several are equally short), provided your heuristic always underestimates the real cost (in your case, the real number of moves needed to reach the solution). Such a heuristic is called admissible.
But off the top of my head I cannot come up with a good heuristic for your problem; that needs some thinking.
The real art of using A* is to find a heuristic that always underestimates the real cost, but by as little as possible, to speed up the search.
First ideas for such a heuristic:
A quite bad but valid heuristic that popped up in my mind is the Manhattan distance of the empty field to its final destination.
The sum of the Manhattan distances of each field to its final destination, divided by the maximal number of fields that can change position within one move. (I think this is quite a good heuristic; a sketch follows below.)
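Here's a minimal sketch of the summed-Manhattan-distance idea, assuming the standard goal where tile t belongs at index t - 1. Since each move slides exactly one tile one square, the plain sum over the tiles (blank excluded) already never overestimates, so no division is needed for admissibility:

// tiles is the board read row by row; 0 denotes the blank
static int manhattanSum(int[] tiles, int width) {
    int sum = 0;
    for (int i = 0; i < tiles.length; i++) {
        int t = tiles[i];
        if (t == 0) continue;                      // skip the blank to stay admissible
        int goal = t - 1;                          // goal index of tile t
        sum += Math.abs(i / width - goal / width)  // row distance
             + Math.abs(i % width - goal % width); // column distance
    }
    return sum;
}
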
For anyone coming along, I will attempt to explain how the OP got the value pairs, as well as how he determines the highlighted ones, i.e. inversions, as it took me several hours to figure it out. First, the pairs.
First take the goal state and imagine it as a 1D array (state A, for example):
[1,2,3,8,0,4,7,6,5]. Each value in that array gets its own column in the table (going all the way down), which is the first value of the pair.
Then move over one value to the right in the array (i + 1) and go all the way down again for the second pair values. For example (state A): the first column's second values start with [2,3,8,0,4,7,6,5] going down; the second column's start with [3,8,0,4,7,6,5], etc.
Okay, now for the inversions. For each pair of values, find their index locations in the start state. If the left index > the right index, then it's an inversion (highlighted). The first four pairs of state A are: (1,2), (1,3), (1,8), (1,0).
1 is at index 3
2 is at index 0
3 > 0, so inversion.
1 is at index 3
3 is at index 2
3 > 2, so inversion.
1 is at index 3
8 is at index 1
3 > 1, so inversion.
1 is at index 3
0 is at index 7
3 < 7, so no inversion.
Do this for each pair and tally up the total inversions.
If both are even or both are odd (the Manhattan distance of the blank spot and the total number of inversions), then it's solvable. Hope this helps!

FastSineTransformer - pad array with zeros to fit length

I'm trying to implement a Poisson solver for image blending in Java. After discretization with the 5-point star method, the real work begins.
To do that I perform these three steps on the color values:
apply the sine transformation to rows and columns
multiply by the eigenvalues
apply the inverse sine transformation to rows and columns
This works so far.
To do the sine transformation in Java, I'm using the Apache Commons Math package.
But the FastSineTransformer has two limitations:
first value in the array must be zero (well that's ok, number two is the real problem)
the length of the input must be a power of two
So right now my excerpts have lengths 127, 255 and so on to fit in (I'm inserting a zero at the beginning, so that both 1 and 2 are fulfilled). That's pretty stupid, because I want to choose the size of my excerpt freely.
My question is:
Is there a way to extend my array, e.g. of length 100, to fit the limitations of the Apache FastSineTransformer?
In the FastFourierTransformer class it is mentioned that you can pad with zeros to get a power of two. But when I do that, I get wrong results. Perhaps I'm doing it wrong, but I really don't know if there is anything I have to keep in mind when padding with zeros.
As far as I can tell from http://books.google.de/books?id=cOA-vwKIffkC&lpg=PP1&hl=de&pg=PA73#v=onepage&q&f=false and the sources http://grepcode.com/file/repo1.maven.org/maven2/org.apache.commons/commons-math3/3.2/org/apache/commons/math3/transform/FastSineTransformer.java?av=f
the rules are as follows:
According to the implementation, the dataset size should be a power of 2, presumably in order for the algorithm to guarantee O(n*log(n)) execution time.
According to James S. Walker, the function must be odd; that is what the mentioned assumptions encode, and the implementation trusts that they are fulfilled.
According to the implementation, the first element must be 0, and so must the middle element of the implicit odd extension:
x'[0] = x[0] = 0,
x'[k] = x[k] if 1 <= k < N,
x'[N] = 0,
x'[k] = -x[2N-k] if N + 1 <= k < 2N.
As for your case, where the dataset may not have power-of-two length, I suggest you resize it and pad the gaps with zeroes without violating the rules above. But I suggest referring to the book first.
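For the mechanics of the padding itself, here is a minimal sketch using Commons Math 3 (the helper name dstWithPadding is mine; whether zero-padding is mathematically appropriate for your Poisson solver, e.g. for the eigenvalue step, is a separate question best checked against the book):

import org.apache.commons.math3.transform.DstNormalization;
import org.apache.commons.math3.transform.FastSineTransformer;
import org.apache.commons.math3.transform.TransformType;

public class PaddedDst {
    // Copies the samples into a power-of-two array whose first element is 0,
    // then applies the forward DST-I.
    static double[] dstWithPadding(double[] samples) {
        int n = 1;
        while (n < samples.length + 1) n <<= 1;   // room for the leading zero
        double[] padded = new double[n];          // tail is zero-filled
        System.arraycopy(samples, 0, padded, 1, samples.length);
        // padded[0] is already 0.0, as the transformer requires

        FastSineTransformer dst =
                new FastSineTransformer(DstNormalization.STANDARD_DST_I);
        return dst.transform(padded, TransformType.FORWARD);
    }

    public static void main(String[] args) {
        double[] excerpt = new double[100];       // e.g. one row of color values
        excerpt[10] = 1.0;
        System.out.println(dstWithPadding(excerpt).length); // 128
    }
}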
