I'm looking at the way the Java Random library generates an integer given an upper bound, but I don't quite understand the algorithm. In the docs it says:
The algorithm is slightly tricky. It rejects values that would result
in an uneven distribution (due to the fact that 2^31 is not divisible
by n). The probability of a value being rejected depends on n. The
worst case is n=2^30+1, for which the probability of a reject is 1/2,
and the expected number of iterations before the loop terminates is 2.
But I really don't see how this implementation takes that into account, specifically the while condition in the code. To me it seems that the loop would (almost) always terminate on the first pass rather than rejecting half the values, especially for very low values of bound (which I think is the common case when imposing a bound). It seems to me like the condition in the while is just checking the sign of bits, so why bother with the expression they use?
public int nextInt(int bound) {
    if (bound <= 0)
        throw new IllegalArgumentException("bound must be positive");

    if ((bound & -bound) == bound)  // i.e., bound is a power of 2
        return (int)((bound * (long)next(31)) >> 31);

    int bits, val;
    do {
        bits = next(31);
        val = bits % bound;
    } while (bits - val + (bound - 1) < 0);
    return val;
}
Note that bits - val + (bound-1) < 0 is actually checking whether bits - val + (bound-1) overflows. bits is always equal to or greater than val, and bound is always positive, so there is no way for the LHS to be negative under normal circumstances.
We can think of the < 0 as > Integer.MAX_VALUE.
Let's plot a graph of bits - val + (bound - 1). I have made one on desmos here. Let's say bound is 100 (small bound):
The x axis is bits and y axis is bits - val + (bound-1), and I have added lines on both the x and y axes to indicate Integer.MAX_VALUE. Note that bits is bounded by Integer.MAX_VALUE.
At this scale, you can see that bits - val + (bound-1) seems never to overflow. If you zoom in a lot, you'll see:
Note that there is a tiny range of values of bits for which bits < Integer.MAX_VALUE, but bits - val + (bound - 1) > Integer.MAX_VALUE.
For bound = (1 << 30) + 1, the graph shows that roughly half of the possible bits values make the expression overflow. In general, any bound greater than 1 << 30 can make it overflow; for the worst case bound = (1 << 30) + 1 this happens for about half of all bits values, hence the 1/2 chance of rejection the documentation describes.
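As a sanity check (not part of the original question), here's a small simulation of the rejection test for the worst-case bound. It uses rng.nextInt() >>> 1 as a stand-in for the package-private next(31), since both produce 31 uniform random bits:

import java.util.Random;

public class RejectionRate {
    public static void main(String[] args) {
        Random rng = new Random();
        int bound = (1 << 30) + 1;   // the documented worst case
        int trials = 1_000_000, rejected = 0;
        for (int t = 0; t < trials; t++) {
            int bits = rng.nextInt() >>> 1;  // 31 random bits, like next(31)
            int val = bits % bound;
            if (bits - val + (bound - 1) < 0) rejected++;  // the overflow test
        }
        System.out.printf("rejection rate: %.4f%n", rejected / (double) trials);
        // prints roughly 0.5000, matching "the probability of a reject is 1/2"
    }
}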
We can easily get random floating point numbers within a desired range [X,Y) (note that X is inclusive and Y is exclusive) with the function listed below since Math.random() (and most pseudorandom number generators, AFAIK) produce numbers in [0,1):
function randomInRange(min, max) {
    return Math.random() * (max - min) + min;
}
// Notice that we can get "min" exactly but never "max".
How can we get a random number in a desired range inclusive to both bounds, i.e. [X,Y]?
I suppose we could "increment" our value from Math.random() (or equivalent) by "rolling" the bits of an IEEE 754 double-precision floating point number to put the maximum possible value at 1.0 exactly, but that seems like a pain to get right, especially in languages poorly suited for bit manipulation. Is there an easier way?
(As an aside, why do random number generators produce numbers in [0,1) instead of [0,1]?)
[Edit] Please note that I have no need for this and I am fully aware that the distinction is pedantic. Just being curious and hoping for some interesting answers. Feel free to vote to close if this question is inappropriate.
I believe there is a much better solution, but this one should work :)
function randomInRange(min, max) {
    return Math.random() < 0.5
        ? (1 - Math.random()) * (max - min) + min
        : Math.random() * (max - min) + min;
}
First off, there's a problem in your code: Try randomInRange(0,5e-324) or just enter Math.random()*5e-324 in your browser's JavaScript console.
Even without overflow/underflow/denorms, it's difficult to reason reliably about floating point ops. After a bit of digging, I can find a counterexample:
>>> a=1.0
>>> b=2**-54
>>> rand=a-2*b
>>> a
1.0
>>> b
5.551115123125783e-17
>>> rand
0.9999999999999999
>>> (a-b)*rand+b
1.0
It's easier to explain why this happens with a = 2^53 and b = 0.5: 2^53 - 1 is the next representable number down. The default rounding mode ("round to nearest even") rounds 2^53 - 0.5 up (because 2^53 is "even" [LSB = 0] and 2^53 - 1 is "odd" [LSB = 1]), so you subtract b and get 2^53, multiply to get 2^53 - 1, and add b to get 2^53 again.
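The same counterexample can be reproduced in Java. 0x1p-54 is a hexadecimal floating-point literal for 2^-54, so the values below are exact:

double a = 1.0;
double b = 0x1p-54;                      // 2^-54, exactly
double rand = a - 2 * b;                 // 0.9999999999999999, the largest double below 1.0
System.out.println(rand < 1.0);          // true
System.out.println((a - b) * rand + b);  // 1.0: the "exclusive" upper bound is reached anyway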
To answer your second question: because the underlying PRNG almost always generates a random number in the interval [0, 2^n - 1], i.e. it generates random bits. It's very easy to pick a suitable n (the bits of precision in your floating point representation) and divide by 2^n to get a predictable distribution. Note that there are some numbers in [0,1) that you will never generate using this method (anything in (0, 2^-53) with IEEE doubles).
It also means that you can do a[Math.floor(Math.random()*a.length)] and not worry about overflow (homework: In IEEE binary floating point, prove that b < 1 implies a*b < a for positive integer a).
The other nice thing is that you can think of each random output x as representing an interval [x, x + 2^-53) (the not-so-nice thing is that the average value returned is slightly less than 0.5). If you return values in [0,1], do you return the endpoints with the same probability as everything else, or should they have only half the probability, because they each represent only half the interval of everything else?
To answer the simpler question of returning a number in [0,1], the method below effectively generates an integer in [0, 2^n] (by generating an integer in [0, 2^(n+1) - 1] and throwing it away if it's too big) and divides by 2^n:
function randominclusive() {
// Generate a random "top bit". Is it set?
while (Math.random() >= 0.5) {
// Generate the rest of the random bits. Are they zero?
// If so, then we've generated 2^n, and dividing by 2^n gives us 1.
if (Math.random() == 0) { return 1.0; }
// If not, generate a new random number.
}
// If the top bits are not set, just divide by 2^n.
return Math.random();
}
The comments imply base 2, but I think the assumptions are thus:
0 and 1 should be returned equiprobably (i.e. the Math.random() doesn't make use of the closer spacing of floating point numbers near 0).
Math.random() >= 0.5 with probability 1/2 (should be true for even bases)
The underlying PRNG is good enough that we can do this.
Note that random numbers are always generated in pairs: the one in the while (a) is always followed by either the one in the if or the one at the end (b). It's fairly easy to verify that it's sensible by considering a PRNG that returns either 0 or 0.5:
a=0 b=0 : return 0
a=0 b=0.5: return 0.5
a=0.5 b=0 : return 1
a=0.5 b=0.5: loop
Problems:
The assumptions might not be true. In particular, a common PRNG design takes the top 32 bits of a 48-bit LCG (Firefox and Java do this). To generate a double, you take 53 bits from two consecutive outputs and divide by 2^53, but some outputs are impossible (you can't generate 2^53 outputs with only 48 bits of state!). I suspect some implementations never return 0 (assuming single-threaded access), but I don't feel like checking Java's implementation right now.
Math.random() is called twice for every potential output, as a consequence of needing the extra bit, but this places more constraints on the PRNG (requiring us to reason about four consecutive outputs of the above LCG).
Math.random() is called on average about four times per output. A bit slow.
It throws away results deterministically (assuming single-threaded access), so is pretty much guaranteed to reduce the output space.
My solution to this problem has always been to use the following in place of your upper bound.
Math.nextAfter(upperBound,upperBound+1)
or
upperBound + Double.MIN_VALUE
So your code would look like this:
double myRandomNum = Math.random() * Math.nextAfter(upperBound,upperBound+1) + lowerBound;
or
double myRandomNum = Math.random() * (upperBound + Double.MIN_VALUE) + lowerBound;
This simply increments your upper bound by the smallest double (Double.MIN_VALUE) so that your upper bound will be included as a possibility in the random calculation.
This is a good way to go about it because it does not skew the probabilities in favor of any one number.
The only case this wouldn't work is where your upper bound is equal to Double.MAX_VALUE.
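A quick check of what these two expressions actually produce (my own verification, not part of the original answer): Math.nextAfter steps to the next representable double, but adding Double.MIN_VALUE is absorbed by rounding at ordinary magnitudes, so only the nextAfter form reliably widens the range:

System.out.println(Math.nextAfter(1.0, 2.0));       // 1.0000000000000002, one ulp above 1.0
System.out.println(1.0 + Double.MIN_VALUE == 1.0);  // true: the denormal is absorbed by rounding,
                                                    // so this variant only helps for tiny bounds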
Just pick your half-open interval slightly bigger, so that your chosen closed interval is a subset. Then, keep generating the random variable until it lands in said closed interval.
Example: If you want something uniform in [3,8], then repeatedly regenerate a uniform random variable in [3,9) until it happens to land in [3,8].
function randomInRangeInclusive(min,max) {
var ret;
for (;;) {
ret = min + ( Math.random() * (max-min) * 1.1 );
if ( ret <= max ) { break; }
}
return ret;
}
Note: the number of times you generate the half-open R.V. is random and potentially unbounded, but you can make the expected number of calls as close to 1 as you like, and I don't think there exists a solution that never risks calling infinitely many times.
Given the "extremely large" number of values between 0 and 1, does it really matter? The chances of actually hitting 1 are tiny, so it's very unlikely to make a significant difference to anything you're doing.
What would be a situation where you would NEED a floating point value to be inclusive of the upper bound? For integers I understand, but for a float, the difference between inclusive and exclusive is what, like 1.0e-32?
Think of it this way. If you imagine that floating-point numbers have arbitrary precision, the chances of getting exactly min are zero. So are the chances of getting max. I'll let you draw your own conclusion on that.
This 'problem' is equivalent to getting a random point on the real line between 0 and 1. There is no 'inclusive' and 'exclusive'.
The question is akin to asking, what is the floating point number right before 1.0? There is such a floating point number, but it is one in 2^24 (for an IEEE float) or one in 2^53 (for a double).
The difference is negligible in practice.
private static double random(double min, double max) {
final double r = Math.random();
return (r >= 0.5d ? 1.5d - r : r) * (max - min) + min;
}
Math.round() will help to include the bound value. If you have 0 <= value < 1 (1 is exclusive), then Math.round(value * 100) / 100 returns 0 <= value <= 1 (1 is inclusive). A caveat is that the value now has only 2 digits after the decimal point. If you want 3 digits, try Math.round(value * 1000) / 1000, and so on. The following function takes one more parameter, the number of digits after the decimal point, which I call precision:
function randomInRange(min, max, precision) {
return Math.round(Math.random() * Math.pow(10, precision)) /
Math.pow(10, precision) * (max - min) + min;
}
How about this?
function randomInRange(min, max){
var n = Math.random() * (max - min + 0.1) + min;
return n > max ? randomInRange(min, max) : n;
}
If you get stack overflow on this I'll buy you a present.
--
EDIT: never mind about the present. I got wild with:
randomInRange(0, 0.0000000000000000001)
and got stack overflow.
I am fairly inexperienced, so I am also looking for solutions myself.
This is my rough thought:
Random number generators produce numbers in [0,1) instead of [0,1] because [0,1) is a unit-length interval that can be followed by [1,2) and so on without overlapping.
For a random number in [x, y], you can do this:
float randomInclusive(x, y){
    float MIN = smallest_value_above_zero;
    float result;
    do {
        result = random(x, (y + MIN));
    } while (result > y);
    return result;
}
Here every value in [x, y] has the same chance of being picked, and you can now reach y.
Generating a "uniform" floating-point number in a range is non-trivial. For example, the common practice of multiplying or dividing a random integer by a constant, or by scaling a "uniform" floating-point number to the desired range, have the disadvantage that not all numbers a floating-point format can represent in the range can be covered this way, and may have subtle bias problems. These problems are discussed in detail in "Generating Random Floating-Point Numbers by Dividing Integers: a Case Study" by F. Goualard.
Just to show how non-trivial the problem is, the following pseudocode generates a random "uniform-behaving" floating-point number in the closed interval [lo, hi], where the number is of the form FPSign * FPSignificand * FPRADIX^FPExponent. The pseudocode below was reproduced from my section on floating-point number generation. Note that it works for any precision and any base (including binary and decimal) of floating-point numbers.
METHOD RNDRANGE(lo, hi)
losgn = FPSign(lo)
hisgn = FPSign(hi)
loexp = FPExponent(lo)
hiexp = FPExponent(hi)
losig = FPSignificand(lo)
hisig = FPSignificand(hi)
if lo > hi: return error
if losgn == 1 and hisgn == -1: return error
if losgn == -1 and hisgn == 1
// Straddles negative and positive ranges
// NOTE: Changes negative zero to positive
mabs = max(abs(lo),abs(hi))
while true
ret=RNDRANGE(0, mabs)
neg=RNDINT(1)
if neg==0: ret=-ret
if ret>=lo and ret<=hi: return ret
end
end
if lo == hi: return lo
if losgn == -1
// Negative range
return -RNDRANGE(abs(lo), abs(hi))
end
// Positive range
expdiff=hiexp-loexp
if loexp==hiexp
// Exponents are the same
// NOTE: Automatically handles
// subnormals
s=RNDINTRANGE(losig, hisig)
return s*1.0*pow(FPRADIX, loexp)
end
while true
ex=hiexp
while ex>MINEXP
v=RNDINTEXC(FPRADIX)
if v==0: ex=ex-1
else: break
end
s=0
if ex==MINEXP
// Has FPPRECISION or fewer digits
// and so can be normal or subnormal
s=RNDINTEXC(pow(FPRADIX,FPPRECISION))
else if FPRADIX != 2
// Has FPPRECISION digits
s=RNDINTEXCRANGE(
pow(FPRADIX,FPPRECISION-1),
pow(FPRADIX,FPPRECISION))
else
// Has FPPRECISION digits (bits), the highest
// of which is always 1 because it's the
// only nonzero bit
sm=pow(FPRADIX,FPPRECISION-1)
s=RNDINTEXC(sm)+sm
end
ret=s*1.0*pow(FPRADIX, ex)
if ret>=lo and ret<=hi: return ret
end
END METHOD
I have a simple program:
public class Mathz {
    static int i = 1;

    public static void main(String[] args) {
        while (true) {
            i = i + i;
            System.out.println(i);
        }
    }
}
When I run this program, all I see is 0 for i in my output. I would have expected the first time round we would have i = 1 + 1, followed by i = 2 + 2, followed by i = 4 + 4 etc.
Is this due to the fact that as soon as we try to re-declare i on the left hand-side, its value gets reset to 0?
If anyone can point me into the finer details of this that would be great.
Change the int to long and it seems to be printing numbers as expected. I'm surprised at how fast it hits the max 32-bit value!
Introduction
The problem is integer overflow. If a value overflows, it goes back to the minimum value and continues from there. If it underflows, it goes back to the maximum value and continues from there. Think of a mechanical odometer: it's a mechanical overflow rather than an arithmetic one, but a good example still.
In an odometer, the maximum digit is 9, so going beyond the maximum means 9 + 1, which carries over and gives a 0; but there is no higher digit to change to a 1, so the counter resets to zero. You get the idea: "integer overflow" comes to mind now.
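A two-line Java version of the odometer, using byte so the wrap is easy to see:

byte b = Byte.MAX_VALUE;   // 127, the odometer reading "all nines"
b++;                       // the carry has nowhere to go...
System.out.println(b);     // -128: wrapped around to the minimum value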
The largest decimal literal of type int is 2147483648 (2^31). All decimal literals from 0 to 2147483647 may appear anywhere an int literal may appear, but the literal 2147483648 may appear only as the operand of the unary negation operator -.
If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format. If overflow occurs, then the sign of the result is not the same as the sign of the mathematical sum of the two operand values.
Thus, 2147483647 + 1 overflows and wraps around to -2147483648. Hence int i = 2147483647 + 1 overflows, and the result isn't equal to 2147483648. Also, you say "it always prints 0". It does not, as this run shows: http://ideone.com/WHrQIW. The 8 numbers below show the point at which it pivots and overflows; it then starts to print 0s. Also, don't be surprised by how fast it calculates; the machines of today are rapid.
268435456
536870912
1073741824
-2147483648
0
0
0
0
Why integer overflow "wraps around"
Original PDF
The issue is due to integer overflow.
In 32-bit two's-complement arithmetic:
i does indeed start out having power-of-two values, but then overflow behaviors start once you get to 2^30:

2^30 + 2^30 = -2^31
-2^31 + -2^31 = 0

...in int arithmetic, since it's essentially arithmetic mod 2^32.
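Both steps are easy to check directly:

int i = 1 << 30;            // 2^30 = 1073741824
System.out.println(i + i);  // -2147483648: 2^30 + 2^30 wraps to -2^31
int j = Integer.MIN_VALUE;  // -2^31
System.out.println(j + j);  // 0: -2^32 is 0 mod 2^32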
No, it does not print only zeros.
Change it to this and you will see what happens.
int k = 50;
while (true) {
    i = i + i;
    System.out.println(i);
    k--;
    if (k < 0) break;
}
What happens is called overflow.
static int i = 1;

public static void main(String[] args) throws InterruptedException {
    while (true) {
        i = i + i;
        System.out.println(i);
        Thread.sleep(100);
    }
}
Output:
2
4
8
16
32
64
...
1073741824
-2147483648
0
0
Once the sum exceeds Integer.MAX_VALUE, it overflows: here 1073741824 + 1073741824 wraps to -2147483648, and -2147483648 + -2147483648 wraps to 0.
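As a side note (not in the original answer): if you'd rather fail loudly than wrap silently, Math.addExact (added in Java 8) throws on overflow:

int a = Integer.MAX_VALUE;
System.out.println(a + 1);                // -2147483648: silent wrap-around
System.out.println(Math.addExact(a, 1));  // throws ArithmeticException: integer overflow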
Since I don't have enough reputation, I cannot post a picture of the output, but you can try the same program in C with controlled output and see for yourself: it actually prints 32 times, and then, as explained, overflow turns 1073741824 + 1073741824 into -2147483648, and one further addition is out of range of int and turns to zero.
#include <stdio.h>
#include <conio.h>

int main()
{
    static int i = 1;
    while (1) {
        i = i + i;
        printf("\n%d", i);
        _getch();
    }
    return 0;
}
The value of i is stored in memory using a fixed quantity of binary digits. When a number needs more digits than are available, only the lowest digits are stored (the highest digits get lost).
Adding i to itself is the same as multiplying i by two. Just like multiplying a number by ten in decimal notation can be performed by sliding each digit to the left and putting a zero on the right, multiplying a number by two in binary notation can be performed the same way. This adds one digit on the right, so a digit gets lost on the left.
Here the starting value is 1, so if we use 8 digits to store i (for example),
after 0 iterations, the value is 00000001
after 1 iteration , the value is 00000010
after 2 iterations, the value is 00000100
and so on, until the final non-zero step
after 7 iterations, the value is 10000000
after 8 iterations, the value is 00000000
No matter how many binary digits are allocated to store the number, and no matter what the starting value is, eventually all of the digits will be lost as they are pushed off to the left. After that point, continuing to double the number will not change the number - it will still be represented by all zeroes.
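You can watch the digit march off the end with Integer.toBinaryString; this is just the answer's 8-digit illustration scaled up to Java's 32 bits:

int i = 1;
for (int step = 0; step <= 32; step++) {
    System.out.println(Integer.toBinaryString(i));  // the single 1 moves left each time
    i = i + i;                                      // after bit 31 it falls off: all zeroes
}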
It is correct up to the 31st iteration; there 1073741824 + 1073741824 doesn't calculate correctly (it overflows), and after that the program prints only 0.
You can refactor to use BigInteger, so your infinite loop will work correctly.
import java.math.BigInteger;

public class Mathz {
    static BigInteger i = new BigInteger("1");

    public static void main(String[] args) {
        while (true) {
            i = i.add(i);
            System.out.println(i);
        }
    }
}
For debugging such cases it is good to reduce the number of iterations in the loop. Use this instead of your while(true):
for(int r = 0; r<100; r++)
You can then see that it starts with 2 and is doubling the value until it is causing an overflow.
I'll use an 8-bit number for illustration because it can be completely detailed in a short space. Hex numbers begin with 0x, while binary numbers begin with 0b.
The max value for an 8-bit unsigned integer is 255 (0xFF or 0b11111111).
If you add 1, you would typically expect to get: 256 (0x100 or 0b100000000).
But since that's too many bits (9), that's over the max, so the first part just gets dropped, leaving you with 0 effectively (0x(1)00 or 0b(1)00000000, but with the 1 dropped).
So when your program runs, you get:
1 = 0x01 = 0b1
2 = 0x02 = 0b10
4 = 0x04 = 0b100
8 = 0x08 = 0b1000
16 = 0x10 = 0b10000
32 = 0x20 = 0b100000
64 = 0x40 = 0b1000000
128 = 0x80 = 0b10000000
256 = 0x00 = 0b00000000 (wraps to 0)
0 + 0 = 0 = 0x00 = 0b00000000
0 + 0 = 0 = 0x00 = 0b00000000
0 + 0 = 0 = 0x00 = 0b00000000
...
The largest decimal literal of type int is 2147483648 (2^31). All decimal literals from 0 to 2147483647 may appear anywhere an int literal may appear, but the literal 2147483648 may appear only as the operand of the unary negation operator -.
If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format. If overflow occurs, then the sign of the result is not the same as the sign of the mathematical sum of the two operand values.
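For the curious, here's a sketch that reproduces the 8-bit table above in Java by masking an int to 8 bits (Java's byte is signed, so the mask keeps the arithmetic unsigned):

int i = 1;
for (int step = 0; step <= 11; step++) {
    System.out.printf("%3d = 0x%02X = 0b%s%n", i, i, Integer.toBinaryString(i));
    i = (i + i) & 0xFF;   // keep only 8 bits, simulating an unsigned byte
}
// ...prints 1, 2, 4, ..., 128, then 0 forever, just like the table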
Here is my FIRST Question
Here is my code:
public class Bits {
    public static void main(String[] args) {
        int i = 2, j = 4;
        int allOnes = ~0;
        int left = allOnes << (j + 1);
        System.out.println("Binary Equivalent at this stage: " + Integer.toBinaryString(left));
    }
}
The following is the output I'm getting:
Binary Equivalent at this stage: 11111111111111111111111111100000
How can I restrict it to only the 8 bits on the right-hand side, i.e. 11100000?
Please explain.
Here is my SECOND Question:
Also, I have one more question, which is totally different from the one above:
public static void main(String[] args) {
    int i = 2, j = 4;
    int allOnes = ~0; // will equal sequence of all 1s
    int left = allOnes << (j + 1);
    System.out.println("Binary Equivalent at this stage: " + Integer.toBinaryString(left));
}
Since I didn't understand the following line:
int allOnes = ~0; // will equal sequence of all 1s
When I tried to output the value of allOnes, I got -1 as my output.
I'm having a hard time understanding the very next line, which is as follows:
int left = allOnes << (j+1);
int allOnes = ~0;
This takes the integer 0 and applies the NOT operation bitwise, so all bits in its binary representation are ones. Integers use the two's complement format, meaning that a word with all bits set to one has the value -1.
If you only care about byte boundaries, then use a ByteBuffer
byte lastByte = ByteBuffer.allocate(4).putInt(i).array()[3];
To restrict this byte to the first four or last four bits, use lastByte & 0b11110000 or lastByte & 0b00001111
The integer representation of -1 is all 1's, i.e. 32 bits all set to 1. You can think of the first bit as -2^31 (note the negative sign), and of each subsequent bit as 2^30, 2^29, etc. Adding 2^0 + 2^1 + 2^2 ... + 2^30 - 2^31 = -1.
I suggest reading this tutorial on bitwise operations.
For #1 Integer.toBinaryString(left) is printing 32 bits (length of Integer), so if you just want the right 8 you can do the following:
Integer.toBinaryString(left).substring(24)
The ~ operator in Java inverts the bit pattern, so 0 turns into 0xffffffff (all 32 bits set).
The << operator shifts the bits to the left by the given count. You are shifting the bits to the left by 5, so you end up with 5 zeros on the right.
Here are all the bitwise operators for Java
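Putting those two operators together for the asker's example:

int allOnes = ~0;                 // 0xFFFFFFFF, i.e. -1
int left = allOnes << 5;          // 11111111111111111111111111100000
int masked = left & 0xFF;         // keep only the low 8 bits
System.out.println(Integer.toBinaryString(masked));  // prints 11100000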
First, a more general solution for the first question than what I've seen so far is
left &= (1 << n) - 1;
where n is the number of binary digits that you want to take from the right. (Note that in Java ^ is bitwise XOR, not exponentiation, so the power of two is computed with a left shift.) This is based around the bitwise AND operator, &, which compares corresponding bits in two numbers and outputs a 1 if they are both 1s and 0 otherwise. For example:
10011001 & 11110000 == 10010000; // true
This is used to create what are known as bitmasks (http://en.wikipedia.org/wiki/Mask_(computing)). Notice in this example how the left 4 bits of the first number are copied over to the result, and how those same 4 bits are all ones in the second number? That's the idea of a bitmask.
So in your case, let's look at n = 8:

left &= (1 << 8) - 1;
left &= 256 - 1;
left &= 255; // Note that &=, like += or *=, just means left = left & 255.
             // Also, 255 is 11111111 in binary, so it can be used as the
             // bitmask for the 8 rightmost bits.

Integer.toBinaryString(left) // now returns "11100000"
Your second question is much more in depth, but you'd probably benefit most from reading the Wikipedia article (http://en.wikipedia.org/wiki/Two's_complement) instead of trying to understand a brief explanation here.
An 8-bit value has a maximum of 255. You can use the modulo (remainder) operator to limit a value to 8 bits. For instance:

int yournum = (int) (35928304284L % 256);

will limit yournum to 8 bits of length. Additionally, as suggested in the comments, you can do this:

int yournum = (int) (3598249230L & 255);

This works as well, and is actually preferred in this case because it is much faster. (The L suffix is needed because these example literals are too large for an int.) The bitwise AND keeps a bit only where both operands have a 1; since 255 has ones only in its low 8 bits, the result is implicitly limited to the range 0 to 255.
To answer your second question: A tilde is the bitwise inversion operator. Thus,
int allOnes = ~0;
creates an integer of all 1 bits. Because of the way two's complement works, that number actually represents -1.
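A quick demonstration:

System.out.println(~0);                          // -1
System.out.println(Integer.toBinaryString(~0));  // 11111111111111111111111111111111
System.out.println(Integer.toBinaryString(-1));  // the same 32 ones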
This question directly follows from reading through "Bits counting algorithm (Brian Kernighan) in an integer time complexity". The Java code in question is:
int count_set_bits(int n) {
    int count = 0;
    while (n != 0) {
        n &= (n - 1);
        count++;
    }
    return count;
}
I want to understand what n &= (n-1) is achieving here. I have seen a similar construct in another nifty algorithm for detecting whether a number is a power of 2, like:
if ((n & (n - 1)) == 0) {
    System.out.println("The number is a power of 2");
}
Stepping through the code in a debugger helped me.
If you start with
n = 1010101 & n-1=1010100 => 1010100
n = 1010100 & n-1=1010011 => 1010000
n = 1010000 & n-1=1001111 => 1000000
n = 1000000 & n-1=0111111 => 0000000
So this iterates 4 times. Each iteration decrements the value in such a way that the least significant bit that is set to 1 disappears.
Decrementing by one flips every bit up to and including the lowest set bit, e.g. 1000....0000 - 1 = 0111....1111, no matter how many bits it has to flip, and it stops there, leaving any higher set bits untouched. When you AND this with n, the lowest set bit, and only that bit, becomes 0.
Subtracting 1 from a number toggles all the bits (from right to left) up to and including the rightmost set bit.
So if we subtract 1 from a number and do a bitwise & with itself (n & (n-1)), we unset the rightmost set bit. In this way we can unset the 1s one by one, from right to left, in a loop.
The number of times the loop iterates is equal to the number of set
bits.
Source: Brian Kernighan's Algorithm
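To convince yourself the loop is right, here's a sketch that cross-checks it against the built-in Integer.bitCount on a million random inputs (my own test harness, not part of the cited source):

import java.util.Random;

public class BitCountCheck {
    static int countSetBits(int n) {
        int count = 0;
        while (n != 0) {
            n &= (n - 1);   // clear the lowest set bit
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int t = 0; t < 1_000_000; t++) {
            int x = rng.nextInt();
            if (countSetBits(x) != Integer.bitCount(x)) {
                throw new AssertionError("mismatch at " + x);
            }
        }
        System.out.println("countSetBits agrees with Integer.bitCount");
    }
}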