I already know how to generate random numbers in a range. I can do this by using
rand.nextInt((max - min) + 1) + min;
The problem is that I would also like to set a standard deviation for these numbers. The numbers also need to be positive, and they are not between 0 and 1.
EDIT: I removed the ThreadLocalRandom class because I cannot set a seed in that class and these random numbers should be reproducible on a different system.
Choosing the standard deviation (or variance) for a bounded distribution can only be done subject to constraints that depend on the selected distribution and the bounds (min, max) of your interval. Some distributions may allow you to make the variance arbitrarily small (e.g. the Beta distribution), other distributions (like the Uniform distribution) don't allow any flexibility once the bounds (min, max) have been set. In any case, you'll never be able to make the variance arbitrarily large -- the bounds do prevent this (they'll always enter the expression for the distribution's variance).
I'll illustrate this for a very simple example that can be implemented without requiring any 3rd party libraries. Let's assume you want a symmetric distribution on the interval (min, max), symmetry implying that the mean E(X) of the distribution is located in the middle of the interval: E(X) = (min + max)/2.
Using Random's nextDouble as in x = a + (b - a) * rnd.nextDouble() will give you a uniformly distributed random variable in the interval a <= x < b that has a fixed variance Var(X) = (b - a)^2 / 12 (not what we want).
OTOH, simulating a symmetric triangular distribution on the same interval (a, b) would give us a random variate with the same mean but with only half the variance: Var(X) = (b - a)^2 / 24 (also fixed, so also not what we want).
A symmetric trapezoidal distribution with parameters (a < b < c < d) lies somewhere in the middle of a Uniform and a triangular distribution on the interval (a, d). The symmetry condition implies d - c = b - a. In the following I'll refer to the distance b - a as x or as "displacement" (I've made up that name, it's not a technical term).
If you let x approach 0.0 from above, the trapezoidal will begin to look very similar to a uniform distribution and its variance will tend to the maximum possible value (d - a)^2 / 12. If you let x approach the maximum possible value (d - a)/2 from below, the trapezoidal will look very similar to a symmetric triangle distribution and its variance will approach the minimum possible value of (d - a)^2 / 24 (but note that we should stay away a little from these extreme values in order not to break the variance formula or our algorithm for the trapezoidal).
So, the idea is to construct a trapezoidal distribution with a value for x that yields the standard deviation you want, given the condition that your targeted standard deviation must lie inside the open range (roughly) given by (0.2041(d - a), 0.2886(d - a)). For convenience let's assume that a = min = 2.0 and d = max = 10.0 which gives us this range of possible stddevs: (1.6328, 2.3088). Let's further assume that we want to construct a distribution with a stddev of 2.0 (which, of course, has to be in the admissible range).
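The admissible range quoted above follows directly from the two limiting variances, (d - a)^2 / 24 and (d - a)^2 / 12. A minimal sketch of that arithmetic (the class and method names are made up for illustration; the results differ slightly from the figures above only because the constants there are rounded):

```java
class StddevRange {
    /** Lower limit: the symmetric triangular case, sqrt((d - a)^2 / 24). */
    static double minStddev(double a, double d) {
        return (d - a) / Math.sqrt(24.0);
    }

    /** Upper limit: the uniform case, sqrt((d - a)^2 / 12). */
    static double maxStddev(double a, double d) {
        return (d - a) / Math.sqrt(12.0);
    }

    public static void main(String[] args) {
        // For min = 2.0 and max = 10.0 the open range is roughly (1.633, 2.309),
        // so a target stddev of 2.0 is admissible.
        System.out.println(minStddev(2.0, 10.0));
        System.out.println(maxStddev(2.0, 10.0));
    }
}
```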
Solving this requires 3 steps:
1) we need to have a formula for the variance given min, max and an admissible value for the displacement x
2) we need to somehow "invert" this expression to give us the value of x for our target variance
3) once we know the value of x we must construct a random variable that has a symmetric trapezoidal distribution with the parameters (min, max, x)
Step 1:
/**
* Variance of a symmetric trapezoidal distribution with parameters
* {@code a < b < c < d} and the length of {@code d - c = b - a}
* (by symmetry) identified by {@code x}.
*
* @param a support lower bound
* @param d support upper bound
* @param x length of {@code d - c = b - a}, constrained to lie in the open
* interval {@code (0, (d-a)/2)}
* @return variance of the symmetric trapezoidal distribution defined by
* the triple {@code (a, d, x)}
*/
static double varSymTrapezoid(double a, double d, double x) {
if (a <= 0.0 || d <= 0.0 || a >= d) {
throw new IllegalArgumentException();
}
if (x <= 0.0 || x >= (d - a) / 2) {
throw new IllegalArgumentException();
}
double b = a + x;
double c = d - x;
double b3 = pow(b, 3);
double c3 = pow(c, 3);
double ex2p1 = pow(b, 4) / 4 - a * b3 / 3 + pow(a, 4) / 12;
double ex2p2 = (c3 / 3 - b3 / 3) * (d - c);
double ex2p3 = pow(c, 4) / 4 - d * c3 / 3 + pow(d, 4) / 12;
double ex2 = (ex2p1 + ex2p2 + ex2p3) / ((d - b) * (d - c));
return ex2 - pow((a + d) / 2, 2);
}
Note that this formula is only valid for symmetric trapezoidal distributions. As an example, if you call this method with a displacement of 2.5 (varSymTrapezoid(2.0, 10.0, 2.5)) it'd give you back a variance of approximately 3.0416 which is too low (we need 4.0), meaning that a displacement of 2.5 is too much (higher displacements give lower variances).
The variance expression, as written above, contains 4th-order terms in x, and I'd rather not solve it analytically. However, for x in the admissible range this expression is monotonically decreasing, so we can construct a function that crosses zero at our target variance and find that root by simple bisection. This is
Step 2:
/**
* Find the displacement {@code x} for the given {@code stddev} by simple
* bisection.
* @param min support lower bound
* @param max support upper bound
* @param stddev the standard deviation we want
* @return the length {@code x} of {@code d - c = b - a} that yields a
* standard deviation roughly equal to {@code stddev}
*/
static double bisect(double min, double max, double stddev) {
final double eps = 1e-4;
final double var = pow(stddev, 2);
int iters = 0;
double a = eps;
double b = (max - min) / 2 - eps;
double x = eps;
double dx = b - a;
while (abs(dx) > eps && iters < 150 && eval(min, max, x, var) != 0.0) {
x = ((a + b) / 2);
if ((eval(min, max, a, var) * eval(min, max, x, var)) < 0.0) {
b = x;
dx = b - a;
} else {
a = x;
dx = b - a;
}
iters++;
}
if (abs(eval(min, max, x, var)) > eps) {
throw new RuntimeException("failed to find solution");
}
return x;
}
/**
* Function whose root we want to find.
*/
static double eval(double min, double max, double x, double var) {
return varSymTrapezoid(min, max, x) - var;
}
Calling the bisect method with the desired value 2.0 for the standard deviation (bisect(2.0, 10.0, 2.0)) gives us the needed displacement: ~ 1.1716. Now that the value for x is known, the final thing we have to do is to construct a suitably distributed random variable which is
Step 3:
It is a well-known fact of probability theory that the sum of two independent uniformly distributed random variables X1 ~ U[a1, b1] and X2 ~ U[a2, b2] is a symmetric trapezoidally distributed random variable on the interval [a1 + a2, b1 + b2] provided that either a1 + b2 < a2 + b1 (case 1) or a2 + b1 < a1 + b2 (case 2). We have to avoid the case a2 + b1 = a1 + b2 (case 3) since then the sum has a symmetric triangular distribution which we don't want.
We'll choose case 1 (a1 + b2 < a2 + b1). In that case the length of b2 - a2 will be equal to the "displacement" x.
So, all we have to do is to choose the interval boundaries a1, a2, b1 and b2 such that a1 + a2 = min, b1 + b2 = max, b2 - a2 = x and the above inequality is fulfilled:
/**
* Return a pseudorandom double for the symmetric trapezoidal distribution
* defined by the triple {@code (min, max, x)}
* @param min support lower bound
* @param max support upper bound
* @param x length of {@code max - c = b - min}, constrained to lie in the
* open interval {@code (0, (max-min)/2)}
*/
public static double symTrapezoidRandom(double min, double max, double x) {
final double a1 = 0.5 * min;
final double a2 = a1;
final double b1 = max - a2 - x;
final double b2 = a2 + x;
if ((a1 + b2) >= (a2 + b1)) {
throw new IllegalArgumentException();
}
double u = a1 + (b1 - a1) * rnd.nextDouble();
double v = a2 + (b2 - a2) * rnd.nextDouble();
return u + v;
}
Calling symTrapezoidRandom(2.0, 10.0, 1.1716) repeatedly gives you random variables that have the desired distribution.
You could do very similar things with other, more sophisticated, distributions like the Beta. This would give you other (usually more flexible) bounds on the admissible variances but you'd need a 3rd party library like commons.math for that.
abs, pow, sqrt in the code refer to the statically imported java.lang.Math methods and rnd is an instance of java.util.Random.
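As a cross-check on the three steps: because the variate is built as the sum of two independent uniforms of widths (max - min - x) and x, its variance can also be written as ((max - min - x)^2 + x^2) / 12, which can be solved for x directly. The sketch below is an alternative to the bisection, not a replacement for it; the class and method names are invented for illustration and the seed is arbitrary.

```java
import java.util.Random;

class TrapezoidCheck {
    /** Solves ((len - x)^2 + x^2) / 12 = stddev^2 for the root with 0 < x < len / 2. */
    static double displacement(double min, double max, double stddev) {
        double len = max - min;
        double var = stddev * stddev;
        double disc = 24.0 * var - len * len;
        if (disc <= 0.0 || var >= len * len / 12.0) {
            throw new IllegalArgumentException("stddev outside the admissible range");
        }
        return (len - Math.sqrt(disc)) / 2.0;
    }

    /** Sum of two independent uniforms, with the same interval choice as in step 3. */
    static double sample(Random rnd, double min, double max, double x) {
        double u = 0.5 * min + (max - min - x) * rnd.nextDouble();
        double v = 0.5 * min + x * rnd.nextDouble();
        return u + v;
    }

    /** Empirical (population) standard deviation of n seeded samples. */
    static double empiricalStddev(long seed, double min, double max, double x, int n) {
        Random rnd = new Random(seed);
        double sum = 0.0, sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double s = sample(rnd, min, max, x);
            sum += s;
            sumSq += s * s;
        }
        double mean = sum / n;
        return Math.sqrt(sumSq / n - mean * mean);
    }

    public static void main(String[] args) {
        double x = displacement(2.0, 10.0, 2.0);
        System.out.println(x);                                            // close to 1.1716
        System.out.println(empiricalStddev(42L, 2.0, 10.0, x, 1_000_000)); // close to 2.0
    }
}
```

The closed form agrees with the displacement found by bisection (about 1.1716), and the seeded sample gives a quick sanity check that the generated values really have the requested spread.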
I want to calculate the increase of percentage of a variable from type int while using another variable from type int for the percentage (50 percent).
thanks in advance for anyone who is willing to help.
`
int a = 3;
int percentage = 3 / 2;
// here I get 3 instead of 4 which is the required answer.
a = a * percentage;
System.out.println(a);
// but here I get the required answer normally.
a = 3;
a = a * 3 / 2;
System.out.println(a);
`
"Percentage" is just a weird way of saying "this value that generally is between 0 and 1 should be rendered by multiplying by 100 and adding a % symbol afterwards". In other words, it's purely a way to display a thing. 50% means 0.5.
int cannot represent 0.5. double sort of can (double and float aren't perfectly accurate). In addition / is integer division if both the left and right side are ints. So, we need to do a few things:
int a = 3;
double b = 1.0 * 3 / 2; // without that 1.0 *, it wouldn't work
System.out.println(b); // prints "1.5"
double c = a * b;
System.out.println(c); // prints 4.5.
int d = (int) (a * b + 0.1); // add the delta before the cast, otherwise this does not compile
System.out.println(d); // prints 4
Because doubles aren't entirely accurate, and (int) rounds down, adding a small delta (here, 0.1) is a good idea. Otherwise various values will surprise you and go wrong (say, your math ends up at 3.99999999, solely because double is not perfectly accurate, then casting that to int gets you a 3).
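If the percentage itself is a whole number, the floating-point detour can be avoided by staying in integer arithmetic and dividing last. A minimal sketch, assuming truncation toward zero is acceptable (the method name increaseByPercent is made up for this example, and the multiplication can overflow for very large inputs):

```java
class PercentDemo {
    /** Increases value by percent (e.g. 50 for +50%), truncating toward zero. */
    static int increaseByPercent(int value, int percent) {
        // Multiply first and divide last, so the fraction is not thrown away early.
        return value * (100 + percent) / 100;
    }

    public static void main(String[] args) {
        System.out.println(increaseByPercent(3, 50)); // prints 4
        System.out.println(3 * (3 / 2));              // prints 3: 3 / 2 is evaluated first and yields 1
    }
}
```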
How do I map numbers, linearly, between a and b to go between c and d.
That is, I want numbers between 2 and 6 to map to numbers between 10 and 20... but I need the generalized case.
My brain is fried.
If your number X falls between A and B, and you would like Y to fall between C and D, you can apply the following linear transform:
Y = (X-A)/(B-A) * (D-C) + C
That should give you what you want, although your question is a little ambiguous, since you could also map the interval in the reverse direction. Just watch out for division by zero and you should be OK.
Divide to get the ratio between the sizes of the two ranges, then subtract the starting value of your initial range, multiply by the ratio and add the starting value of your second range. In other words,
R = (20 - 10) / (6 - 2)
y = (x - 2) * R + 10
This evenly spreads the numbers from the first range in the second range.
It would be nice to have this functionality in the java.lang.Math class, as this is such a widely required function and is available in other languages.
Here is a simple implementation:
final static double EPSILON = 1e-12;
public static double map(double valueCoord1,
double startCoord1, double endCoord1,
double startCoord2, double endCoord2) {
if (Math.abs(endCoord1 - startCoord1) < EPSILON) {
throw new ArithmeticException("/ 0");
}
double offset = startCoord2;
double ratio = (endCoord2 - startCoord2) / (endCoord1 - startCoord1);
return ratio * (valueCoord1 - startCoord1) + offset;
}
I am putting this code here as a reference for my future self, and maybe it will help someone.
As an aside, this is the same problem as the classic Celsius-to-Fahrenheit conversion, where you want to map the range 0 - 100 (C) to 32 - 212 (F).
https://rosettacode.org/wiki/Map_range
[a1, a2] => [b1, b2]
if s in range of [a1, a2]
then t which will be in range of [b1, b2]
t= b1 + ((s- a1) * (b2-b1))/ (a2-a1)
In addition to @PeterAllenWebb's answer, if you would like to map the result back, use the following:
reverseX = (B-A)*(Y-C)/(D-C) + A
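A small sketch of the forward and reverse formulas working as a pair, using the 2..6 to 10..20 example from the question and the Celsius/Fahrenheit ranges. The class and method names are illustrative only, and both methods assume the source range is not empty (B != A, respectively D != C):

```java
class MapRoundTrip {
    /** Y = (X-A)/(B-A) * (D-C) + C */
    static double map(double x, double a, double b, double c, double d) {
        return (x - a) / (b - a) * (d - c) + c;
    }

    /** reverseX = (B-A)*(Y-C)/(D-C) + A */
    static double unmap(double y, double a, double b, double c, double d) {
        return (b - a) * (y - c) / (d - c) + a;
    }

    public static void main(String[] args) {
        double y = map(4.0, 2.0, 6.0, 10.0, 20.0);
        System.out.println(y);                                    // prints 15.0
        System.out.println(unmap(y, 2.0, 6.0, 10.0, 20.0));       // prints 4.0
        System.out.println(map(100.0, 0.0, 100.0, 32.0, 212.0));  // prints 212.0
    }
}
```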
Each unit interval on the first range takes up (d-c)/(b-a) "space" on the second range.
Pseudo:
var interval = (d-c)/(b-a)
for n = 0 to (b - a)
print c + n*interval
How you handle the rounding is up to you.
If your range is [a, b] and you want to map it to [c, d], where x is the value you want to map,
use this formula (linear mapping):
double R = (d - c) / (b - a);
double y = c + (x - a) * R;
return y;
Where X is the number to map from A-B to C-D, and Y is the result:
Take the linear interpolation formula, lerp(a,b,m)=a+(m*(b-a)), and put C and D in place of a and b to get Y=C+(m*(D-C)). Then, in place of m, put (X-A)/(B-A) to get Y=C+(((X-A)/(B-A))*(D-C)). This is an okay map function, but it can be simplified. Take the (D-C) piece, and put it inside the dividend to get Y=C+(((X-A)*(D-C))/(B-A)). This gives us another piece we can simplify, (X-A)*(D-C), which equates to (X*D)-(X*C)-(A*D)+(A*C). Pop that in, and you get Y=C+(((X*D)-(X*C)-(A*D)+(A*C))/(B-A)). The next thing you need to do is add in the +C bit. To do that, you multiply C by (B-A) to get ((B*C)-(A*C)), and move it into the dividend to get Y=(((X*D)-(X*C)-(A*D)+(A*C)+(B*C)-(A*C))/(B-A)). This is redundant, containing both a +(A*C) and a -(A*C), which cancel each other out. Remove them, and you get a final result of: Y=((X*D)-(X*C)-(A*D)+(B*C))/(B-A)
TL;DR: The standard map function, Y=C+(((X-A)/(B-A))*(D-C)), can be simplified down to Y=((X*D)-(X*C)-(A*D)+(B*C))/(B-A)
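A quick numerical check that the two forms agree, as a sketch with made-up class and method names. The single-fraction form is algebraically identical, but in floating point the two can differ in the last few bits, so the comparison uses a tolerance:

```java
class MapFormsDemo {
    /** Y = C + ((X-A)/(B-A)) * (D-C) */
    static double standard(double x, double a, double b, double c, double d) {
        return c + ((x - a) / (b - a)) * (d - c);
    }

    /** Y = ((X*D) - (X*C) - (A*D) + (B*C)) / (B-A) */
    static double simplified(double x, double a, double b, double c, double d) {
        return (x * d - x * c - a * d + b * c) / (b - a);
    }

    public static void main(String[] args) {
        // Map 2..6 onto 10..20 with both forms and print them side by side.
        for (double x = 2.0; x <= 6.0; x += 1.0) {
            System.out.println(x + " -> " + standard(x, 2, 6, 10, 20)
                    + " / " + simplified(x, 2, 6, 10, 20));
        }
    }
}
```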
int srcMin = 2, srcMax = 6;
int tgtMin = 10, tgtMax = 20;
int nb = srcMax - srcMin;
int range = tgtMax - tgtMin;
float rate = (float) range / (float) nb;
println(srcMin + " > " + tgtMin);
float stepF = tgtMin;
for (int i = 1; i < nb; i++)
{
stepF += rate;
println((srcMin + i) + " > " + (int) (stepF + 0.5) + " (" + stepF + ")");
}
println(srcMax + " > " + tgtMax);
With checks on divide by zero, of course.
I've been using java's SplittableRandom ever since I heard about it, due to its speed, and not being in need of multithreading. However, although it has almost every method from Random class, it doesn't come with nextFloat(). Why's that?
Now, the real question is, how would I then go about creating that nextFloat method? Seeing the double is generated as follows: (from JDK 8)
final double internalNextDouble(final double n, final double n2) {
double longBitsToDouble = (this.nextLong() >>> 11) * 1.1102230246251565E-16;
if (n < n2) {
longBitsToDouble = longBitsToDouble * (n2 - n) + n;
if (longBitsToDouble >= n2) {
longBitsToDouble = Double.longBitsToDouble(Double.doubleToLongBits(n2) - 1L);
}
}
return longBitsToDouble;
}
.. I was hoping that I could just turn it to a 32-bit number generation with the following;
final float internalNextFloat(final float min, final float max) {
float intBitsToFloat = (this.nextInt() >>> 11) * 1.1102230246251565E-16f;
if (min < max) {
intBitsToFloat = intBitsToFloat * (max - min) + min;
if (intBitsToFloat >= max) {
intBitsToFloat = Float.intBitsToFloat(Float.floatToIntBits(max) - 1);
}
}
return intBitsToFloat;
}
However, this returns 0.000000. I can only assume it overflows somewhere, in which case I'm pretty sure the problem lies at the following line:
(this.nextInt() >>> 11) * 1.1102230246251565E-16f;
So, not being experienced with shifting (and using epsilon I guess), how could I achieve what I want?
Without having thought about the mathematics of this too deeply, it seems to me that you could just use the nextDouble method to generate a double within the desired range and then cast the result to float.
You need to first understand the meaning behind this line:
double longBitsToDouble = (this.nextLong() >>> 11) * 1.1102230246251565E-16;
this.nextLong() returns a 64-bit long.
>>> 11 is an unsigned right shift that discards the lowest 11 bits, so now we get a 53-bit random value. This is also the precision of double.
* 1.1102230246251565E-16. This is equivalent to 1 / 9007199254740992.0, or 2^-53.
So longBitsToDouble is a randomly uniform double from 0 (inclusive) to 1 (exclusive).
Compared with a float, its precision is 24 bits, while this.nextInt() generates a 32-bit random value, so the corresponding line should be written as
float intBitsToFloat = (this.nextInt() >>> 8) * 5.960464477539063E-8f;
(Instead of the decimal representation 5.960464477539063E-8f you could also use hexadecimal float, which may be clearer to readers:
float intBitsToFloat = (this.nextInt() >>> 8) * 0x1.0p-24f;
)
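Put together as a helper around a SplittableRandom instance, this could look like the sketch below. The class name and the bounded overload are assumptions made for illustration; the bounded version mirrors the structure of internalNextDouble but uses Math.nextDown for the upper-bound correction.

```java
import java.util.SplittableRandom;

class FloatFromSplittable {
    /** Uniform float in [0, 1): the top 24 bits of nextInt(), scaled by 2^-24. */
    static float nextFloat(SplittableRandom rng) {
        return (rng.nextInt() >>> 8) * 0x1.0p-24f;
    }

    /** Uniform float in [min, max), assuming min < max. */
    static float nextFloat(SplittableRandom rng, float min, float max) {
        float r = nextFloat(rng) * (max - min) + min;
        // Rounding can land exactly on max; step down to the largest float below it.
        return r >= max ? Math.nextDown(max) : r;
    }

    public static void main(String[] args) {
        SplittableRandom rng = new SplittableRandom(42L);
        System.out.println(nextFloat(rng));
        System.out.println(nextFloat(rng, 5.0f, 7.0f));
    }
}
```

Shifting by 8 keeps 24 random bits, which a float can represent exactly, so every result is a multiple of 2^-24 strictly below 1.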
I have two equations: x * x - D * y * y = 1 and x = sqrt(1 + D * y * y).
Both are algebraic manipulations of the other.
Given D, I need to solve for the smallest integer value of x so that y is also an integer. I loop through possible y values, plug them into the second equation and test if x is an integer. If it is, I return x.
The problem I have is when x, y, and D are plugged into the 1st equation, it does not equal 1.
These are some problematic values:
1. x=335159612 y=42912791 D=61
2. x=372326272 y=35662389 D=109
My intuition is that java's Math.sqrt method does not calculate such a small decimal, however BigDecimal does not have a square root method.
Is my math just wrong? If not, what can I do to accurately calculate x and y?
Edit: Here is the root of the problem along with the method that tests if a double is a natural number.
public static void main(String[] args){
long x = 335159612, D = 61, y = 42912791;
System.out.println(Math.sqrt(D * y * y + 2)); // 3.35159612E8
System.out.println(x * x - D * y * y); // 3
}
public static boolean isNatural(double d){
return d == (int)d;
}
Be careful with precisions in 'double'.
As per IEEE 754-1985, double precision provides about 16 significant decimal digits (15.95 to be more precise).
E.g.
a) SQRT(112331965515990542) is
335159611.99999999701634694576505237017910
Which, when rounded to the nearest double, gives 3.35159612E8 (doubles of this magnitude are spaced about 6E-8 apart)
b) SQRT(112331965515990543)
335159611.99999999850817347288252618840968
Which, when rounded to the nearest double, also gives 3.35159612E8.
So, as per IEEE 754-1985 definition, those values are equal.
Apparently, any further logical/mathematical checks will be, generally speaking, inaccurate.
To overcome this limitation I recommend BigMath package from www.javasoft.ch
import ch.javasoft.math.BigMath;
import java.math.BigDecimal;
class Tester {
public static void main(String[] args) {
long D = 61L, y = 42912791L;
double a = Math.sqrt(D * y * y + 1);
double b = Math.sqrt(D * y * y + 2);
System.out.println(a);
System.out.println(b);
System.out.println(a == b);
BigDecimal bda = BigMath.sqrt(new BigDecimal(D * y * y + 1), 32);
BigDecimal bdb = BigMath.sqrt(new BigDecimal(D * y * y + 2), 32);
System.out.println(bda.toString());
System.out.println(bdb.toString());
System.out.println(bda.equals(bdb));
}
}
Result:
3.35159612E8
3.35159612E8
true
335159611.99999999701634694576505237017910
335159611.99999999850817347288252618840968
false
P.s. to completely ruin your faith in standard Java maths try this:
System.out.println(0.9200000000000002);
System.out.println(0.9200000000000001);
You will see:
0.9200000000000002
0.9200000000000002
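Returning to the square-root check: if adding a third-party dependency is not an option, the test "is D*y*y + 1 a perfect square?" can also be done with exact integer arithmetic. A minimal sketch, assuming Java 9 or later for BigInteger.sqrt() (the class and method names are made up for this example):

```java
import java.math.BigInteger;

class ExactSquareCheck {
    /** True if n is a perfect square, decided without any floating point. */
    static boolean isPerfectSquare(BigInteger n) {
        if (n.signum() < 0) {
            return false;
        }
        BigInteger r = n.sqrt(); // floor of the exact square root (Java 9+)
        return r.multiply(r).equals(n);
    }

    public static void main(String[] args) {
        BigInteger d = BigInteger.valueOf(61);
        BigInteger y = BigInteger.valueOf(42912791L);
        BigInteger n = d.multiply(y).multiply(y).add(BigInteger.ONE); // D*y^2 + 1
        System.out.println(isPerfectSquare(n));                                      // prints false
        System.out.println(isPerfectSquare(BigInteger.valueOf(335159612L).pow(2)));  // prints true
    }
}
```

With this check, y = 42912791 is correctly rejected for D = 61, where the double-based isNatural test accepted it.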
This kind of Diophantine equation is known as Pell's equation.
Wiki.
Mathworld.
Both links contain clues on how to solve this equation using continued fractions.
I think it would be nice to apply some math instead of brute force.
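For reference, a compact sketch of the continued-fraction approach: expand sqrt(D) as a continued fraction and test each convergent h/k until h^2 - D*k^2 = 1. This is one standard way to do it, not necessarily what either link spells out, and the class and method names are invented for this example. BigInteger is used for the convergents because they outgrow long quickly.

```java
import java.math.BigInteger;

class PellSolver {
    /** Returns {x, y}, the smallest positive solution of x^2 - d*y^2 = 1. */
    static BigInteger[] solve(int d) {
        long a0 = (long) Math.floor(Math.sqrt(d));
        if (d <= 0 || a0 * a0 == d) {
            throw new IllegalArgumentException("d must be a positive non-square");
        }
        BigInteger bigD = BigInteger.valueOf(d);
        long m = 0, q = 1, a = a0;                      // continued-fraction state
        BigInteger hPrev = BigInteger.ONE, h = BigInteger.valueOf(a0);
        BigInteger kPrev = BigInteger.ZERO, k = BigInteger.ONE;
        // Walk the convergents h/k of sqrt(d) until one satisfies the equation.
        while (!h.multiply(h).subtract(bigD.multiply(k).multiply(k)).equals(BigInteger.ONE)) {
            m = q * a - m;
            q = (d - m * m) / q;
            a = (a0 + m) / q;
            BigInteger term = BigInteger.valueOf(a);
            BigInteger hNext = term.multiply(h).add(hPrev);
            BigInteger kNext = term.multiply(k).add(kPrev);
            hPrev = h; h = hNext;
            kPrev = k; k = kNext;
        }
        return new BigInteger[] {h, k};
    }

    public static void main(String[] args) {
        BigInteger[] s = solve(61);
        System.out.println(s[0] + " " + s[1]); // prints 1766319049 226153980
    }
}
```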
If sqrt is the issue, use the first equation instead. If x is an integer, x^2 will also be an integer; if x is not an integer, then x^2 would also not be an integer, as long as you are using BigDecimals with sufficient scale for your math and not doubles.
If you are unsure of what "Poisson Distribution using Normal Approximation" means, follow this link and check the text inside the yellow box.
https://onlinecourses.science.psu.edu/stat414/node/180
Here, is the simple snapshot of the math from the link.
P(Y≥9) = P(Y>8.5) = P(Z>(8.5−6.5)/√6.5) = P(Z>0.78)= 0.218
So to get the value 0.218, we use Simpson's integration rule, which
integrates the function (implemented in the method named "f" in the code below) from "negative
infinity" to the value ((8.5−6.5)/√6.5).
R successfully gives the correct output. But in Java, when I ran the code
below, copied from "http://introcs.cs.princeton.edu/java/93integration/SimpsonsRule.java.html",
I get "0.28360853976343986" where I expected ".218". Is it somehow because of the negative infinity value I am using, which is "Double.MIN_VALUE"?
This is the code in Java. See the very end for my inputs in the main method.
/**********************************************************************
* Standard normal distribution density function.
* Replace with any sufficiently smooth function.
**********************************************************************/
public static double f(double x) {
return Math.exp(- x * x / 2) / Math.sqrt(2 * Math.PI);
}
/**********************************************************************
* Integrate f from a to b using Simpson's rule.
* Increase N for more precision.
**********************************************************************/
public static double integrate(double a, double b) {
int N = 10000; // precision parameter
double h = (b - a) / (N - 1); // step size
// 1/3 terms
double sum = 1.0 / 3.0 * (f(a) + f(b));
// 4/3 terms
for (int i = 1; i < N - 1; i += 2) {
double x = a + h * i;
sum += 4.0 / 3.0 * f(x);
}
// 2/3 terms
for (int i = 2; i < N - 1; i += 2) {
double x = a + h * i;
sum += 2.0 / 3.0 * f(x);
}
return sum * h;
}
// sample client program
public static void main(String[] args) {
double z = (8.5-6.5)/Math.sqrt(6.5);
double a = Double.MIN_VALUE;
double b = z;
System.out.println(integrate(a, b));
}
Does anybody have any ideas? I tried using the "normalApproximateProbability(int x)" method of Apache math's "PoissonDistribution" class, but the problem is that this method takes an "int".
Does anyone have any better ideas on how to get the correct output, or any other code? I have used another library for Simpson's rule too, but I get the same output.
I need this to be done in Java.
I tested the code by writing another method that implements Simpson's 3/8 rule in place of your integrate function. It gave the same result as the one you obtained the first time, so I think the difference most probably arises from rounding errors.
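One more thing worth checking before blaming rounding: in Java, Double.MIN_VALUE is the smallest positive double (about 4.9E-324), not the most negative one. With that lower bound the call in the question effectively integrates the density from 0 to z, which is the area between the mean and z (about 0.2836), while the quantity in the worked example is the upper tail P(Z > z). The sketch below integrates the tail directly; the class name, the explicit subinterval count, and the finite cutoff of 10 standing in for infinity are assumptions made for this example.

```java
class NormalTail {
    /** Standard normal density. */
    static double f(double x) {
        return Math.exp(-x * x / 2) / Math.sqrt(2 * Math.PI);
    }

    /** Composite Simpson's rule on [a, b] with n subintervals (n must be even). */
    static double integrate(double a, double b, int n) {
        if (n <= 0 || n % 2 != 0) {
            throw new IllegalArgumentException("n must be positive and even");
        }
        double h = (b - a) / n;
        double sum = f(a) + f(b);
        for (int i = 1; i < n; i++) {
            sum += (i % 2 == 1 ? 4.0 : 2.0) * f(a + i * h);
        }
        return sum * h / 3.0;
    }

    public static void main(String[] args) {
        double z = (8.5 - 6.5) / Math.sqrt(6.5);
        System.out.println(Double.MIN_VALUE > 0);       // prints true
        System.out.println(integrate(0.0, z, 10000));   // about 0.2836, the area from 0 to z
        System.out.println(integrate(z, 10.0, 10000));  // about 0.2164, the upper tail P(Z > z)
    }
}
```

The tail comes out near 0.216 and not exactly 0.218 because the worked example rounds z to 0.78 before looking up the table value.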