How does this shuffling with Math rand work? - java

I saw this code to shuffle a list:
public static void shuffle(List<Integer> numbers) {
if(numbers == null || numbers.isEmpty()) return;
for(int i = 0; i < numbers.size(); ++i) {
int index = (int) (i + Math.random()*(numbers.size() - i));
swap(numbers, i, index);
}
}
The code seems to work, but I don't understand this snippet:
int index = (int) (i + Math.random()*(numbers.size() - i));
Basically it is i + R*(n-i), but how does this ensure that: i) we won't get an out-of-bounds index, or ii) I won't end up swapping an element with itself (i.e. index == i), so that the shuffle would not be that random?

Math.random() returns a uniform random number in the interval [0, 1), and multiplying by numbers.size() - i, ideally, scales that number to the interval [0, numbers.size() - i). For example, if i is 2 and the size of the list is 5, a random number in the interval [0, 3) is chosen this way, in the ideal case. Finally, i is added to the number and the (int) cast discards the number's fractional part. Thus, in this example, a random integer in [2, 5) (that is, either 2, 3, or 4) is generated, so that at each iteration, the number at index i swaps with itself or a number that follows it.
However, there is an important subtlety here. Due to the nature of floating-point numbers and rounding error when scaling the number, in extremely rare cases the output of Math.random()*(numbers.size() - i) might be equal to numbers.size() - i, even if Math.random() outputs a number that excludes 1. Also, rounding error can cause the idiom Math.random()*(numbers.size() - i) to bias some results over others. For example, this happens whenever 2^53 is not divisible by numbers.size() - i, since Math.random() uses java.util.Random under the hood, and its algorithm generates numbers with 53 bits of precision. Because of this, Math.random() is not the best way to write this code, and the code could have used a method specially made for generating random integers instead (such as the nextInt method of java.util.Random). See also this question and this question.
EDIT: As it turns out, the Math.random() * integer idiom does not produce the issue that it may return integer, at least when integer is any positive int and the round-to-nearest rounding mode is used as in Java. See this question.
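As a rough sketch of the nextInt alternative mentioned above, the same loop can be written with java.util.Random.nextInt(bound), which returns an int in [0, bound) without going through floating point. The class name NextIntShuffle and the rng parameter are illustrative choices, not part of the original code, and Collections.swap stands in for the asker's own swap helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class NextIntShuffle {
    // Same forward loop as in the question, but the offset is drawn
    // with nextInt(bound), which yields an int in [0, bound).
    static void shuffle(List<Integer> numbers, Random rng) {
        if (numbers == null || numbers.isEmpty()) return;
        for (int i = 0; i < numbers.size(); ++i) {
            // index lies in [i, numbers.size() - 1]; it may equal i.
            int index = i + rng.nextInt(numbers.size() - i);
            Collections.swap(numbers, i, index);
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        shuffle(numbers, new Random());
        System.out.println(numbers); // some permutation of 1..5
    }
}
```

Passing the Random in as a parameter is only a convenience for testing with a fixed seed.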

Math.random() always returns a floating-point number between 0 (inclusive) and 1 (exclusive). So when you do Math.random()*(numbers.size() - i), the result will always be between 0 (inclusive) and n-i (exclusive).
Then you add i to it in i + Math.random()*(numbers.size() - i).
Now the result, as you can see, will be between i (inclusive) and n (exclusive).
After that, you are casting it to an int. When you cast a double to an int, you truncate it, so now the value of index will be somewhere from i to n - 1 (inclusive for both).
Therefore, you will not get an IndexOutOfBoundsException, since index is always at most n - 1, the last valid index of the list.
However, the value of index could be equal to i, so yes, you are right in that a number could be swapped with itself and stay right there. That's perfectly fine.

You have a list of 1 to 50 ints.
So get a random value from 0 to 49 inclusive to index it.
say it is 30.
Get item at index 30.
Now replace item at index 30 with item at index 49.
Next time generate a number between 0 and 48 inclusive. 49 will never be reached and the number that was there occupies the slot of the last number used.
Continue this process until you've exhausted the list.
Note that the expression (int)(Math.random() * n) will generate a random number between 0 and n-1 inclusive, because Math.random() generates a number from 0 (inclusive) up to, but not including, 1.
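The steps above (pick a random live index, take the item, move the last live item into the gap, shrink the range) might look like this in Java. This is only a sketch: the names DrawWithoutReplacement and drawAll are made up for illustration, and it uses Random.nextInt rather than (int)(Math.random() * n).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class DrawWithoutReplacement {
    // Returns every item of pool exactly once, in random order.
    // Note: pool itself is overwritten while drawing.
    static List<Integer> drawAll(List<Integer> pool, Random rng) {
        List<Integer> drawn = new ArrayList<>();
        for (int remaining = pool.size(); remaining > 0; remaining--) {
            int index = rng.nextInt(remaining);        // 0 .. remaining - 1
            drawn.add(pool.get(index));
            // Move the last live item into the slot just used, so the
            // next round can ignore position remaining - 1.
            pool.set(index, pool.get(remaining - 1));
        }
        return drawn;
    }

    public static void main(String[] args) {
        List<Integer> pool = new ArrayList<>();
        for (int v = 1; v <= 50; v++) pool.add(v);
        System.out.println(drawAll(pool, new Random()));
    }
}
```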

Instead of using such a custom method, I recommend you use OOTB Collections.shuffle. Check this to understand the logic implemented for Collections.shuffle.
Analysis of your code:
Math.random() returns a double value with a positive sign, greater than or equal to 0.0 and less than 1.0.
Now, let's assume numbers.size() = 5 and dry run the for loop, taking the largest value Math.random() can return (just under 1.0) in every iteration:
When i = 0, index = (int) (0 + Math.random()*(5 - 0)) = (int) (0 + 4.x) = 4
When i = 1, index = (int) (1 + Math.random()*(5 - 1)) = (int) (1 + 3.x) = 4
When i = 2, index = (int) (2 + Math.random()*(5 - 2)) = (int) (2 + 2.x) = 4
When i = 3, index = (int) (3 + Math.random()*(5 - 3)) = (int) (3 + 1.x) = 4
When i = 4, index = (int) (4 + Math.random()*(5 - 4)) = (int) (4 + 0.x) = 4
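As you can see, even when Math.random() returns a value just under 1.0 in every iteration, index never exceeds 4, the last valid index when numbers.size() = 5. For smaller values of Math.random(), index can be anything from i to 4.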
Your queries:
how does this ensure that: i) we won't get an out of bounds index
As already explained above using the dry run, it will never go out of bounds.
or ii) I won't be changing the same element's i.e. index == i and the
shuffle would not be that random?
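In the dry run above, swap(numbers, i, index); swaps the element at index i with the element at index 4 each time when numbers.size() = 5. In a real run, index is anywhere from i to 4, so it can equal i, in which case the element is swapped with itself; that is expected and does not make the shuffle less random. The dry run is illustrated with the following example: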
Let's say numbers = [1, 2, 3, 4, 5]
When i = 0, numbers will become [5, 2, 3, 4, 1]
When i = 1, numbers will become [5, 1, 3, 4, 2]
When i = 2, numbers will become [5, 1, 2, 4, 3]
When i = 3, numbers will become [5, 1, 2, 3, 4]
When i = 4, numbers will become [5, 1, 2, 3, 4]

int index = (int) (i + Math.random()*(numbers.size() - i)); - it is important to note that Math.random() will generate a number which belongs to [0, 1). So it will never exceed the boundary, as the exclusive max will be: i + 1*(numbers.size() - i) = numbers.size()
Your second point is valid: index == i can happen.


What is happening in this array? [duplicate]

I'm reviewing arrays and for loops among other things in my computer science course. One of the examples used was the code below. After being run, the array displays 2, 3, 4, 2. How is this?
int[] numbers = {1, 2, 3, 4};
for (int i = 0; i < numbers.length; i++) {
numbers[i] = numbers[(i+1) % numbers.length];
}
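System.out.println(java.util.Arrays.toString(numbers)); // println(numbers) alone would print the array reference, not its contents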
An important concept in understanding the code you are looking at, especially the block inside the for loop, is the modulus operator (%), also known as the remainder operator. When applied, the % operator returns the remainder of dividing its left operand by its right operand.
Hence, the computation:
(i+1) % numbers.length
will always return a remainder.
Walking through it with a debugger (or print statements as suggested) while evaluating the values (and operations) at each iteration is a great way of understanding it.
You can also read more about it: https://www.baeldung.com/modulo-java
You are doing the operation on the same array in place:
original array [1, 2, 3, 4]
when i = 0 => numbers [2, 2, 3, 4] // copying the index 1 item to index 0
when i = 1 => numbers [2, 3, 3, 4] // copying the index 2 item to index 1 in the immediately preceding array
when i = 2 => numbers [2, 3, 4, 4] // copying the index 3 item to index 2 in the immediately preceding array
when i = 3 => numbers [2, 3, 4, 2] // copying the index 0 item to index 3 in the preceding array
Keep in mind that % is modular arithmetic - it's the remainder when you divide the left-hand side by the right-hand side. For example, 2 % 4 = 2, 4 % 4 = 0, 6 % 4 = 2, and 8 % 4 = 0.
Here are the steps:
i = 0. (i + 1) % 4 = 1 % 4 = 1. Value at index 1 is 2. Array now contains {2, 2, 3, 4}.
i = 1. (i + 1) % 4 = 2 % 4 = 2. Value at index 2 is 3. Array now contains {2, 3, 3, 4}.
i = 2. (i + 1) % 4 = 3 % 4 = 3. Value at index 3 is 4. Array now contains {2, 3, 4, 4}.
i = 3. (i + 1) % 4 = 4 % 4 = 0. Value at index 0 is 2. Array now contains {2, 3, 4, 2}.
I'd encourage you to step through this with a debugger and/or use print statements to convince yourself that this is true and to understand why this is the case (doing so will help with understanding; if you're not convinced by a statement, either it's not true or there's something you don't understand or know about yet).
Your loop reassigns each index in your array with the value currently stored in the next index by using (i+1).
Therefore, at index [0] you get the value 2, which was at index [1] before. The same goes for the second and third indexes. The last one is a little bit special: since there is no next index, the % numbers.length computation wraps the index around, so that for index [3] you get the value at index [0], since (3+1)%4 = 0.
Note on index [3] you are getting value 2 and not 1 since you already changed the value on index [0] before that.
1st iteration
numbers[i] = numbers[1 % 4]; // answer is 1 so numbers[1] = 2
2nd iteration
numbers[i] = numbers[2 % 4]; // answer is 2 so numbers[2] = 3
It will go on like this until i becomes 3.
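To watch the iterations described in the answers above, a small trace can print the array after every assignment. This is just a sketch; the class and method names (ShiftTrace, shiftInPlace) are invented for the example.

```java
import java.util.Arrays;

public class ShiftTrace {
    // Runs the loop from the question and prints the array after each step.
    static int[] shiftInPlace(int[] numbers) {
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = numbers[(i + 1) % numbers.length];
            System.out.println("i = " + i + " -> " + Arrays.toString(numbers));
        }
        return numbers;
    }

    public static void main(String[] args) {
        shiftInPlace(new int[] {1, 2, 3, 4});
        // i = 0 -> [2, 2, 3, 4]
        // i = 1 -> [2, 3, 3, 4]
        // i = 2 -> [2, 3, 4, 4]
        // i = 3 -> [2, 3, 4, 2]
    }
}
```

The last line shows 2 rather than 1 at index 3 because index 0 had already been overwritten by the time it was read.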

How to fix a method that calculates powers in a binary number, but fails more often than not?

I've been doodling around with this little piece of code that's supposed to calculate and print out which powers of 2 sum to a given number. It works fine with small odd numbers but gets lost when I want it to calculate even numbers or bigger ones.
I don't even know what I could try, the code looks alright, but I probably keep failing to notice.
System.out.println("Give a number");
int gigaInt = si.nextInt();
String gigaBit = Integer.toBinaryString(gigaInt);
String[] gigaBitArray = gigaBit.split("");
System.out.println("Binary: " + gigaBit);
List<Integer> powers = new ArrayList<Integer>();
for(int counter = gigaBitArray.length-1; counter >= 0; counter--){
if (gigaBitArray[counter].equals("1"))
powers.add((int)Math.pow(2,counter));
else if(gigaBitArray[counter].equals("0")){
powers.add(0);
}
}
System.out.println("Powers: " + powers);
So, obviously, the program is supposed to calculate the powers, and it does! in some cases... here, when given 9
Give a number
9
Binary: 1001
Powers: [8, 0, 0, 1]
But when I want it to calculate an even number, it always shows "1" as the only component, like this:
Give a number
8
Binary: 1000
Powers: [0, 0, 0, 1]
And whenever asked to deal with a big number, it just goes completely crazy:
Give a number
542
Binary: 1000011110
Powers: [0, 256, 128, 64, 32, 0, 0, 0, 0, 1]
I would be amazingly grateful for any kind of advice on this. It's probably just an infantile kind of mistake, so please, do point it out.
As per the comment by Dawood ibn Kareem, you are testing the low order bits first. If you want the high order powers listed first you will need an index variable and a power variable. Also, no need to check for "0". If it is not "1" then it must be "0".
int iIndex;
int iLength = gigaBitArray.length;
int iPower = iLength - 1;
for ( iIndex = 0; iIndex < iLength; ++iIndex, --iPower )
{
if ( gigaBitArray[iIndex].equals("1") )
{
powers.add((int)Math.pow(2, iPower));
}
else
{
powers.add(0);
}
}
The problem with your code is the array index you are looking at.
When you input the number 8, its binary representation is 1000. And when you split it into an array you get:
index: 0 1 2 3
value: 1 0 0 0
Because you are starting at the end of the list, index 0 will be processed last (and will be the same as 2^0).
All you need to do to fix this is to reverse the order of the elements you are looking at while keeping the same order of the for loop.
Eg:
Instead of:
gigaBitArray[counter]
It should be:
gigaBitArray[gigaBitArray.length -1 - counter]
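Put together, the index fix described above could look roughly like the following. It is a sketch under a few assumptions: the input is non-negative, the helper name powers is made up, and it uses 1 << exponent in place of (int) Math.pow(2, exponent), which is a substitution rather than what the original code does.

```java
import java.util.ArrayList;
import java.util.List;

public class BinaryPowers {
    // Lists, from the highest bit to the lowest, the power of two each
    // binary digit contributes (0 for a 0 digit). Assumes n >= 0.
    static List<Integer> powers(int n) {
        String bits = Integer.toBinaryString(n);
        List<Integer> powers = new ArrayList<>();
        for (int pos = 0; pos < bits.length(); pos++) {
            // The leftmost character carries the highest exponent.
            int exponent = bits.length() - 1 - pos;
            powers.add(bits.charAt(pos) == '1' ? 1 << exponent : 0);
        }
        return powers;
    }

    public static void main(String[] args) {
        System.out.println(powers(9));   // [8, 0, 0, 1]
        System.out.println(powers(8));   // [8, 0, 0, 0]
        System.out.println(powers(542)); // [512, 0, 0, 0, 0, 16, 8, 4, 2, 0]
    }
}
```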
In addition to both answers above you could also get rid of the if else by multiplying the 0s and 1s:
int len = gigaBitArray.length;
for (int i = 0; i < gigaBitArray.length; i++) {
powers.add((int)Math.pow(2, --len)*Integer.parseInt(gigaBitArray[i]));
}
Here is one way to do it. Comments in code where not obvious. The idea here is that all information inside a computer is binary. Characters and numbers are printed out based on context. Since all information is in binary it can be shifted left or right to move the field of bits the same direction. This permits detecting a 1 or 0 bit without resorting to the overhead of String manipulation.
for (int number : new int[] { 8, 10, 23, 11, 2, 4, 99 }) {
List<Integer> powers = new ArrayList<>();
// starting bits to shift
int shift = 0;
// save number for printout
int save = number;
while (number > 0) {
// ANDing the number with 1 will mask the
// low order bit to a 1 or 0.
// Then shift that bit "shift" number
// of bits (first time thru is 0) and store
// the power in p. Then increment # of bits
// to shift.
int p = (number & 1) << shift++;
//add power to beginning of list.
powers.add(0, p);
// now shift the number right by 1 to position
// for next bit.
number >>= 1;
}
System.out.printf("%3d -> %s%n", save, powers);
}
The above prints the following:
8 -> [8, 0, 0, 0]
10 -> [8, 0, 2, 0]
23 -> [16, 0, 4, 2, 1]
11 -> [8, 0, 2, 1]
2 -> [2, 0]
4 -> [4, 0, 0]
99 -> [64, 32, 0, 0, 0, 2, 1]

Calculate the values of counters after applying all alternating operations

I was trying to understand a given solution to a Codility problem. The problem is provided below:
You are given N counters, initially set to 0, and you have two possible operations on them:
increase(X) − counter X is increased by 1,
max counter − all counters are set to the maximum value of any counter.
A non-empty array A of M integers is given. This array represents consecutive operations:
if A[K] = X, such that 1 ≤ X ≤ N, then operation K is increase(X),
if A[K] = N + 1 then operation K is max counter.
For example, given integer N = 5 and array A such that:
A[0] = 3
A[1] = 4
A[2] = 4
A[3] = 6
A[4] = 1
A[5] = 4
A[6] = 4
the values of the counters after each consecutive operation will be:
(0, 0, 1, 0, 0)
(0, 0, 1, 1, 0)
(0, 0, 1, 2, 0)
(2, 2, 2, 2, 2)
(3, 2, 2, 2, 2)
(3, 2, 2, 3, 2)
(3, 2, 2, 4, 2)
The goal is to calculate the value of every counter after all operations.
Write a function:
class Solution { public int[] solution(int N, int[] A); }
that, given an integer N and a non-empty array A consisting of M integers, returns a sequence of integers representing the values of the counters.
The sequence should be returned as:
a structure Results (in C), or
a vector of integers (in C++), or
a record Results (in Pascal), or
an array of integers (in any other programming language).
For example, given:
A[0] = 3
A[1] = 4
A[2] = 4
A[3] = 6
A[4] = 1
A[5] = 4
A[6] = 4
the function should return [3, 2, 2, 4, 2], as explained above.
Assume that:
N and M are integers within the range [1..100,000];
each element of array A is an integer within the range [1..N + 1].
Complexity:
expected worst-case time complexity is O(N+M);
expected worst-case space complexity is O(N) (not counting the storage required for input arguments).
I have a solution provided,
public static int[] solution(int N, int[] A) {
int[] counters = new int[N];
int currMax = 0;
int currMin = 0;
for (int i = 0; i < A.length; i++) {
if (A[i] <= N) {
counters[A[i] - 1] = Math.max(currMin, counters[A[i] - 1]);
counters[A[i] - 1]++;
currMax = Math.max(currMax, counters[A[i] - 1]);
} else if (A[i] == N + 1) {
currMin = currMax;
}
}
for (int i = 0; i < counters.length; i++) {
counters[i] = Math.max(counters[i], currMin);
}
return counters;
}
It seems they use two variables to hold and update the min/max values and use them inside the algorithm. Obviously, there is a more direct way to solve the problem, i.e. increase the value by 1 or set all the values to the max as described, and I can do that. The drawback would be lower performance and increased time complexity.
However, I would like to understand what is going on here. I spent time debugging with the example array, but the algorithm is still a little confusing.
Does anyone understand it, and can you explain it to me briefly?
It is quite simple: they do a lazy update. You keep track at all times of the value of the counter that has the highest value (currMax). Then, when you get a command to set all counters to that maximum, since doing so directly is too expensive, you just record in currMin the value of currMax at that moment, i.e. the value every counter should be raised to.
So, when do you update a counter to that value? You do it lazily: you only update it when you get a command to increase that counter. So when you need to increase a counter, you first set it to the max of its old value and currMin. If this is the first update on this counter since an N + 1 command, the correct value it should have is actually currMin, and that will be higher than (or equal to) its old value. Once you have updated it, you add 1 to it. If another increase now happens, currMin doesn't actually matter, as the max will keep its old value until another N + 1 command happens.
The second for is to account for counters that did not get an increase command after the last N + 1 command.
Note that there can be any number of N + 1 commands between two increase operations on a counter. It still follows that the value it should have is currMax at the time of the last N + 1 command; it doesn't matter that we didn't update it earlier with the maximum from a previous N + 1 command, since we only care about the latest.
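One way to convince yourself of the explanation above is to run the lazy solution next to the direct O(N*M) version and compare the results. The sketch below copies the lazy method from the question; the class name MaxCountersCheck and the method name naive are invented for illustration.

```java
import java.util.Arrays;

public class MaxCountersCheck {
    // Lazy version, as given in the question.
    static int[] lazy(int N, int[] A) {
        int[] counters = new int[N];
        int currMax = 0;
        int currMin = 0;
        for (int a : A) {
            if (a <= N) {
                // Catch up with the last "max counter" command, then increase.
                counters[a - 1] = Math.max(currMin, counters[a - 1]) + 1;
                currMax = Math.max(currMax, counters[a - 1]);
            } else if (a == N + 1) {
                currMin = currMax; // remember the floor instead of writing N values
            }
        }
        for (int i = 0; i < counters.length; i++) {
            counters[i] = Math.max(counters[i], currMin);
        }
        return counters;
    }

    // Direct version: really overwrite every counter on "max counter".
    static int[] naive(int N, int[] A) {
        int[] counters = new int[N];
        int max = 0;
        for (int a : A) {
            if (a <= N) {
                counters[a - 1]++;
                max = Math.max(max, counters[a - 1]);
            } else {
                Arrays.fill(counters, max);
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        int[] A = {3, 4, 4, 6, 1, 4, 4};
        System.out.println(Arrays.toString(lazy(5, A)));  // [3, 2, 2, 4, 2]
        System.out.println(Arrays.toString(naive(5, A))); // [3, 2, 2, 4, 2]
    }
}
```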

Expected number of maxima

I have this algorithm, which takes an array as an argument, and returns its maximum value.
find_max(as) :=
max = as[0]
for i = 1 ... len(as) {
if max < as[i] then max = as[i]
}
return max
My question is: given that the array is initially in a (uniformly) random permutation and that all its elements are distinct, what's the expected number of times the max variable is updated (ignoring the initial assignment).
For example, if as = [1, 3, 2], then the number of updates to max would be 1 (when reading the value 3).
Assume the original array contains the values 1, 2, ..., N.
Let X_i, i = 1..N be random variables that take the value 1 if i is, at some point during the algorithm, the maximum value.
Then the number of maximums the algorithm takes is the random variable: M = X_1 + X_2 + ... + X_N.
The average is (by definition) E(M) = E(X_1 + X_2 + ... + X_N). Using linearity of expectation, this is E(X_1) + E(X_2) + .. + E(X_N), which is prob(1 appears as a max) + prob(2 appears as a max) + ... + prob(N appears as a max) (since each X_i takes the value 0 or 1).
When does i appear as a maximum? It's when it appears first in the array amongst the values i, i+1, i+2, ..., N. The probability of this is 1/(N-i+1) (since each of those numbers is equally likely to be first).
So... prob(i appears as a max) = 1/(N-i+1), and the overall expectation is 1/N + 1/(N-1) + ..+ 1/3 + 1/2 + 1/1
This is Harmonic(N) which is approximated closely by ln(N) + emc where emc ~= 0.5772156649, the Euler-Mascheroni constant.
Since in the problem you don't count the initial setting of the maximum to the first value as a step, the actual answer is Harmonic(N) - 1, or approximately ln(N) - 0.4227843351.
A quick check for some simple cases:
N=1, only one permutation, and no maximum updates. Harmonic(1) - 1 = 0.
N=2, permutations are [1, 2] and [2, 1]. The first updates the maximum once, the second zero times, so the average is 1/2. Harmonic(2) - 1 = 1/2.
N=3, permutations are [1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]. Maximum updates are 2, 1, 1, 1, 0, 0 respectively. Average is (2+1+1+1)/6 = 5/6. Harmonic(3) - 1 = 1/2 + 1/3 = 5/6.
So the theoretical answer looks good!
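The quick check above can be extended to slightly larger N by brute force. The following sketch enumerates every permutation of 1..n, counts the updates with the same loop as find_max, and compares the average with Harmonic(n) - 1. All names here (ExpectedMaxUpdates, totalUpdates, harmonicMinusOne) are invented for the example, and it assumes n is at least 1 and small enough for n! permutations to be feasible.

```java
public class ExpectedMaxUpdates {
    // Sum of update counts over all permutations of 1..n (n >= 1).
    static long totalUpdates(int n) {
        return walk(new int[n], new boolean[n + 1], 0);
    }

    // Fills perm[pos..] with every arrangement of the unused values.
    private static long walk(int[] perm, boolean[] used, int pos) {
        int n = perm.length;
        if (pos == n) return countUpdates(perm);
        long total = 0;
        for (int v = 1; v <= n; v++) {
            if (used[v]) continue;
            used[v] = true;
            perm[pos] = v;
            total += walk(perm, used, pos + 1);
            used[v] = false;
        }
        return total;
    }

    // Same loop as find_max, ignoring the initial assignment.
    static int countUpdates(int[] as) {
        int max = as[0];
        int updates = 0;
        for (int i = 1; i < as.length; i++) {
            if (max < as[i]) {
                max = as[i];
                updates++;
            }
        }
        return updates;
    }

    static double harmonicMinusOne(int n) {
        double h = 0.0;
        for (int k = 1; k <= n; k++) h += 1.0 / k;
        return h - 1.0;
    }

    public static void main(String[] args) {
        long factorial = 1;
        for (int n = 1; n <= 8; n++) {
            factorial *= n;
            double average = (double) totalUpdates(n) / factorial;
            System.out.println(n + ": enumerated " + average
                    + ", Harmonic(n) - 1 = " + harmonicMinusOne(n));
        }
    }
}
```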
Empirical Solution
A simulation of many different array sizes with multiple trials each can be performed and analyzed:
#include <iostream>
#include <fstream>
#include <cstdlib>
#define UPTO 10000
#define TRIALS 100
using namespace std;
int arr[UPTO];
int main(void){
ofstream outfile ("tabsep.txt");
for(int i = 1; i < UPTO; i++){
int sum = 0;
for(int iter = 0; iter < TRIALS; iter++){
for(int j = 0; j < i; j++){
arr[j] = rand();
}
int max = arr[0];
int times_changed = 0;
for(int j = 0; j < i; j++){
if (arr[j] > max){
max = arr[j];
times_changed++;
}
}
sum += times_changed;
}
double avg = static_cast<double>(sum) / TRIALS; // avoid integer division truncating the average
outfile << i << "\t" << avg << "\n";
cout << "\r" << i;
}
outfile.close();
cout << endl;
return 0;
}
When I graphed these results, the number of updates appeared to grow logarithmically with the array size.
I think it's safe to conclude that the expected number of updates is O(log n).
Theoretical solution:
Assume that the numbers are in the range 0...n
You have a tentative maximum m
The next maximum will be a random number in the range m+1...n, which averages out to be (m+n)/2
This means that each time you find a new maximum, you are dividing the range of possible maximums by 2
Repeated division is equivalent to a logarithm
Therefore the number of times a new maximum is found is O(log n)
Worst case scenario (which is often what is sought) is O(n). If the list is sorted in ascending order, every element after the first will result in an assignment.
HOWEVER, if your assignment is the most expensive operation, why don't you just store its index and only ever copy once, if at all? In that case, you will have exactly 1 assignment and n-1 comparisons.
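A minimal sketch of that index-based variant, assuming a non-empty int array (the names MaxIndex and indexOfMax are hypothetical): inside the loop only an int index is reassigned, and the element itself is read once at the end.

```java
public class MaxIndex {
    // Returns the index of the largest element of a non-empty array.
    static int indexOfMax(int[] as) {
        int best = 0;
        for (int i = 1; i < as.length; i++) {
            if (as[best] < as[i]) {
                best = i; // only the index changes here, not a copy of the element
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] as = {1, 3, 2};
        int max = as[indexOfMax(as)]; // the single copy of the element
        System.out.println(max);      // 3
    }
}
```

For an int array this saves nothing, of course; the idea only pays off when copying an element is costly.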

How to express 2n as sum of n variables (Java implementation?)

I wonder if there is an elegant way to derive all compositions of 2n as the sum of n non-negative integer variables.
For example, for n = 2 variables x and y, there are 5 compositions with two parts :
x = 0 y = 4; x = 1 y = 3; x = 2 y = 2; x = 3 y = 1; x = 4 y = 0
such that x + y = 4 = 2n.
More generally, the problem can be formulated to find all the compositions of s into n non-negative integer variables with their sum equals to s.
Any suggestion on how to compute this problem efficiently would be welcome, and some pseudo-code would be much appreciated. Thanks.
Edit: while solutions are presented below as in Perl and Prolog, a Java implementation may present a new problem as linear data structures such as arrays need to be passed around and manipulated during the recursive calls, and such practice can become quite expensive as n gets larger, I wonder if there is an alternative (and more efficient) Java implementation for this problem.
Here's some python:
def sumperms(n, total=None):
    if total is None:
        # total is the target sum, if not specified, set to 2n
        total = 2 * n
    if n == 1:
        # if n is 1, then there is only a single permutation
        # return as a tuple.
        # python's syntax for single element tuple is (element,)
        yield (total,)
        return
    # iterate i over 0 ... total
    for i in range(total + 1):
        # recursively call self to solve the subproblem
        for perm in sumperms(n - 1, total - i):
            # prepend the single element tuple to the "sub-permutation"
            yield (i,) + perm

# run example for n = 3
for perm in sumperms(3):
    print(perm)
Output:
(0, 0, 6)
(0, 1, 5)
(0, 2, 4)
(0, 3, 3)
(0, 4, 2)
(0, 5, 1)
(0, 6, 0)
(1, 0, 5)
(1, 1, 4)
(1, 2, 3)
(1, 3, 2)
(1, 4, 1)
(1, 5, 0)
(2, 0, 4)
(2, 1, 3)
(2, 2, 2)
(2, 3, 1)
(2, 4, 0)
(3, 0, 3)
(3, 1, 2)
(3, 2, 1)
(3, 3, 0)
(4, 0, 2)
(4, 1, 1)
(4, 2, 0)
(5, 0, 1)
(5, 1, 0)
(6, 0, 0)
The number of compositions (sums where ordering matters) of 2n into exactly n non-negative parts is the binomial coefficient C(3n-1,n-1). For example, with n = 2 as above, C(5,1) = 5.
To see this, consider lining up 3n-1 positions. Choose any subset of n-1 of these, and place "dividers" in those positions. You then have the remaining blank positions grouped into n groups between dividers (some possibly empty groups where dividers are adjacent). Thus you have constructed a correspondence of the required compositions with the arrangements of spaces and dividers, and the latter is manifestly counted as combinations of 3n-1 things taken n-1 at a time.
For the purpose of enumerating all the possible compositions we could write a program that actually selects n-1 strictly increasing items s[1],...,s[n-1] from a list [1,...,3n-1]. In accordance with the above, the "parts" would be x[i] = s[i] - s[i-1] - 1 for i = 1,...,n with the convention that s[0] = 0 and s[n] = 3n.
More elegant for the purpose of listing compositions would be to select n-1 weakly increasing items t[1],...,t[n-1] from a list [0,...,2n] and calculate the parts x[i] = t[i] - t[i-1] for i = 1,...,n with the convention t[0] = 0 and t[n] = 2n.
Here's a brief Prolog program that gives the more general listing of compositions of N using P non-negative parts:
/* generate all possible ordered sums to N with P nonnegative parts */
composition0(N,P,List) :-
length(P,List),
composition0(N,List).
composition0(N,[N]).
composition0(N,[H|T]) :-
for(H,0,N),
M is N - H,
composition0(M,T).
The predicate composition0/3 expresses its first argument as the sum of a list of non-negative integers (third argument) having the second argument as its length.
The definition requires a couple of utility predicates that are often provided by an implementation, perhaps in slightly different form. For completeness a Prolog definition of the counting predicate for/3 and length of list predicate are as follows:
for(H,H,N) :- H =< N.
for(H,I,N) :-
I < N,
J is I+1,
for(H,J,N).
length(P,List) :- length(P,0,List).
length(P,P,[ ]) :- !.
length(P,Q,[_|T]) :-
R is Q+1,
length(P,R,T).
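Regarding the Java concern raised in the question's edit, one possible approach is to run the same recursion over a single int array that is filled in place and handed to a callback, so nothing is copied during the recursive calls unless the caller chooses to keep a result. This is a sketch only: the names Compositions, forEach and all are invented, and it assumes n is at least 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class Compositions {
    // Calls visitor once per composition of total into parts.length
    // non-negative parts. The same array is reused for every call, so the
    // visitor must copy it if it wants to keep the values.
    static void forEach(int total, int[] parts, int pos, Consumer<int[]> visitor) {
        if (pos == parts.length - 1) {
            parts[pos] = total; // the last part takes whatever is left
            visitor.accept(parts);
            return;
        }
        for (int i = 0; i <= total; i++) {
            parts[pos] = i;
            forEach(total - i, parts, pos + 1, visitor);
        }
    }

    // Collects all compositions of 2n into n parts (copies each one).
    static List<int[]> all(int n) {
        List<int[]> out = new ArrayList<>();
        forEach(2 * n, new int[n], 0, p -> out.add(p.clone()));
        return out;
    }

    public static void main(String[] args) {
        for (int[] c : all(2)) {
            System.out.println(Arrays.toString(c));
        }
        System.out.println(all(3).size()); // 28, matching C(8, 2)
    }
}
```

The number of results still grows as C(3n-1, n-1), so reusing the array only removes the per-call copying, not the size of the output.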
