Does the number generated by this function follow a uniform distribution? [closed]


I have implemented a function that generates a number in [0, x] from a given string:
import java.math.BigInteger;
import org.apache.commons.codec.digest.DigestUtils;

public class BaseUtil {
    public static int getIntValue(String stringText, String x) {
        var modNum = new BigInteger(x);
        // Hash the input and parse the hex digest as a non-negative number.
        String sha256hex = DigestUtils.sha256Hex(stringText);
        var result = new BigInteger(sha256hex, 16);
        int intValue = result.mod(modNum).intValue();
        return intValue;
    }
}
Is the returned intValue uniformly distributed in [0, x] over many random strings?

Does a method that generates an int in [0, x] from a given string follow a uniform distribution?
Almost. It is difficult to eliminate the non-uniformity entirely ... unless 2^256 happens to be evenly divisible by x (that is, unless x is a power of two).
Note that you are actually generating a number in [0,x) ... since the result cannot be x.
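To make the "almost" concrete, here is a small sketch of the modulo bias. The class and method names (ModuloBiasDemo, countRemainders, overRepresented) are hypothetical and not part of the question's code. It first uses a toy 8-bit "hash space" of 256 values reduced mod 100, then computes how many remainders get one extra hit in the real 256-bit case.

```java
import java.math.BigInteger;

class ModuloBiasDemo {
    // Counts how often each remainder occurs when every value in
    // [0, space) is reduced mod x. (Toy model of a small hash space.)
    static int[] countRemainders(int space, int x) {
        int[] counts = new int[x];
        for (int v = 0; v < space; v++) {
            counts[v % x]++;
        }
        return counts;
    }

    // For a 256-bit hash, the remainders 0 .. (2^256 mod x) - 1 each
    // receive exactly one more hash value than the other remainders.
    static BigInteger overRepresented(int x) {
        return BigInteger.ONE.shiftLeft(256).mod(BigInteger.valueOf(x));
    }

    public static void main(String[] args) {
        int[] counts = countRemainders(256, 100);
        // Remainders 0..55 occur 3 times, remainders 56..99 only twice.
        System.out.println(counts[55] + " vs " + counts[56]);
        // A power of two divides 2^256 evenly, so there is no bias.
        System.out.println(overRepresented(1024));
    }
}
```

With a 256-bit hash and an int-sized x, the extra hit is one value out of roughly 2^256 / x per remainder, so the bias exists but is far too small to observe in practice.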
You also asked about more efficient implementations than the one in your question.
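import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class BaseUtil {
    public static int getIntValueV1(String stringText, int x)
            throws NoSuchAlgorithmException {
        if (x <= 0) {
            throw new IllegalArgumentException("x must be strictly positive");
        }
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = digest.digest(
                stringText.getBytes(StandardCharsets.UTF_8));
        // Signum 1 treats the 32 hash bytes as a non-negative 256-bit number.
        return new BigInteger(1, hash).mod(BigInteger.valueOf(x)).intValue();
    }

    public static int getIntValueV2(String stringText, int x)
            throws NoSuchAlgorithmException {
        if (x <= 0) {
            throw new IllegalArgumentException("x must be strictly positive");
        }
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = digest.digest(
                stringText.getBytes(StandardCharsets.UTF_8));
        ByteBuffer buff = ByteBuffer.wrap(hash);
        long number = 0;
        // A SHA-256 hash is 32 bytes, i.e. exactly 4 longs.
        for (int i = 0; i < 4; i++) {
            number = number ^ buff.getLong();
        }
        // floorMod keeps the result in [0, x) even when number is negative.
        return (int) Math.floorMod(number, (long) x);
    }
}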
Preconditions:
Since the result of your method is an int, that implies that x must also be an int.
Since you want numbers in the range [0,x), that implies that x must be greater than zero.
Implementations:
I am using the standard MessageDigest class since it has been in Java SE since Java 5 (if not earlier).
The first version uses BigInteger to minimize the non-uniformity when we reduce the hash bytes to a number in the range [0, x).
The second version uses long arithmetic to compute the remainder. I think that means that the distribution might be a bit more non-uniform than the first version. (My math skills are too rusty ...) It would also be possible to eliminate the use of ByteBuffer to convert the bytes to a sequence of longs.
I have not benchmarked these versions to see which is faster. But both should be faster than producing an intermediate hexadecimal string and parsing it.
Note that you could probably get away with using a less expensive hashing algorithm, depending on the actual use-case for this generator.

Related

How computer adds two number internally? [duplicate]

I believe the computer must be doing this with the help of exclusive OR and the bitwise left-shift operator. Is that correct?
Here is an implementation in Java:
public class TestAddWithoutPlus {
    public static void main(String[] args) {
        int result = addNumberWithoutPlus(6, 5);
        System.out.println("result is " + result);
    }

    public static int addNumberWithoutPlus(int a, int b) {
        if (a == 0) {
            return b;
        } else if (b == 0) {
            return a;
        }
        int result = 0;
        int carry = 0;
        while (b != 0) {
            result = a ^ b;     // SUM of two bits is A XOR B
            carry = (a & b);    // CARRY is AND of two bits
            carry = carry << 1; // shifts carry by 1 bit to calculate the sum
            a = result;
            b = carry;
        }
        return result;
    }
}

I'll answer for a typical bit-parallel processor as would be seen in personal computers, microcontrollers, etc. This does not apply to a bit-serial architecture, which is more often seen in specialized situations such as certain types of DSP and certain FPGA designs.
Typically this is not the case. For a narrow width such as 32 or 64 bits, a dedicated adder circuit is more efficient than the iterative addition you show, because it completes the addition in combinational logic rather than over multiple clock cycles.
However, the principle is the same for a basic ripple-carry adder: the adder for the least-significant bit calculates one bit of the result and a carry bit, which is passed into the full adder for the next bit as its carry-in, and so on.
[Figure: ripple-carry adder built from chained full adders. Source: Wikimedia Commons, user cburnett, under Creative Commons 3.0 Share-alike]
In practice, however, the fact that a carry coming from the LSB adder may need to propagate all the way to the MSB adder poses a limitation on performance (due to propagation delays) so various lookahead schemes may be used.
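The ripple-carry principle described above can be simulated in software. The following is only a sketch of the logic (the class name RippleCarryAdder is hypothetical); real hardware evaluates all of these gates in parallel rather than in a loop. Each iteration plays the role of one full adder: it takes two input bits plus the carry-in and produces a sum bit and a carry-out.

```java
class RippleCarryAdder {
    // Adds two 32-bit values one bit position at a time,
    // passing the carry from each full adder to the next.
    static int add(int a, int b) {
        int result = 0;
        int carry = 0; // carry-in of the least-significant full adder is 0
        for (int i = 0; i < 32; i++) {
            int bitA = (a >>> i) & 1;
            int bitB = (b >>> i) & 1;
            // Full adder: sum = A XOR B XOR carry-in
            int sum = bitA ^ bitB ^ carry;
            // carry-out = (A AND B) OR (carry-in AND (A XOR B))
            carry = (bitA & bitB) | (carry & (bitA ^ bitB));
            result |= sum << i;
        }
        // The final carry-out is dropped, matching 32-bit wrap-around.
        return result;
    }

    public static void main(String[] args) {
        System.out.println(add(6, 5));
    }
}
```

Because the carry of bit i is needed before bit i + 1 can settle, the worst-case delay grows with the word width, which is what the lookahead schemes mentioned above are designed to reduce.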

Generating random number without using built-in functions [closed]

I am writing my own versions of built-in functions to understand how they work, which is also part of my research. I have successfully completed some string-handling functions (split, substring, reverse, etc.), but I am stuck on random numbers: how are they generated? How do the RND or Random functions work?
Thanks in advance

The most popular current algorithm is probably the Mersenne twister, pseudo-code in the Wikipedia article (and there are many implementations available from Google).
Another well-known algorithm is Blum Blum Shub, which has a proof that reduces its security to the computational difficulty of computing modular square roots, a problem whose difficulty is assumed to be equivalent to factoring. However, Blum Blum Shub is very slow.
Finally, here is a large list of additional pseudo-random number generators. Which algorithm a particular language uses varies.

Here are two algorithms that I've used most often in some of my projects:
1. Uniformly distributed numbers in the range between 0 and 1
result = getRidOfIntegerPart(A * prevResult)
Where A is a seed.
Possible implementation (C++):
#include <cmath>
#include <iostream>
using namespace std;

int main()
{
    double A = 12.2345; //seed
    double prevResult = 1; //You can assign any value here
    double result;
    double intPart;
    //This will give you 10 uniformly distributed numbers
    for(unsigned i = 0; i < 10; ++i)
    {
        double r = A * prevResult;
        result = modf(r, &intPart); // To get rid of integer part
        prevResult = result;
        cout<<"result "<<i<<" = "<<result<<endl;
    }
}
2. Normal (Gaussian) distribution
Here's the original formula:
http://en.wikipedia.org/wiki/Normal_distribution
But I'm using a slightly different, simplified formula:
It says that a new normally distributed number is obtained from the sum of 12 uniformly distributed numbers (Sj), minus 6.
Possible implementation (C++):
#include <cmath>
#include <iostream>
using namespace std;

class RandomNumberGenerator
{
public:
    RandomNumberGenerator()
    {
        uniformPrevResult = 1;
        uniformResult = 0;
        uniformIntPart = 0;
    }
    double generateUniformNumber(double seed)
    {
        double r = seed * uniformPrevResult;
        uniformResult = modf(r, &uniformIntPart); // To get rid of integer part
        uniformPrevResult = uniformResult;
        return uniformResult;
    }
    double generateNormalNumber(double seed)
    {
        double uniformSum = 0;
        for(unsigned i = 0; i < 12; ++i)
        {
            uniformSum += generateUniformNumber(seed);
        }
        double normalResult = uniformSum - 6; // 6 is a magic number
        return normalResult;
    }
private:
    double uniformPrevResult;
    double uniformResult;
    double uniformIntPart;
};

int main()
{
    const double seed = 12.2345;
    RandomNumberGenerator rndGen;
    for(unsigned i = 0; i < 100; ++i)
    {
        double newNormalNumber = rndGen.generateNormalNumber(seed);
        cout<<"newNormalNumber = "<<newNormalNumber<<endl;
    }
    return 0;
}
I hope it'll help you!
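For comparison, the sum-of-12-uniforms idea can also be sketched in Java. This is only an illustration: the class name TwelveUniformsNormal is hypothetical, and it swaps in java.util.Random as the uniform source instead of the fractional-part generator used in the C++ code. Each uniform value on [0, 1) has mean 1/2 and variance 1/12, so the sum of 12 of them has mean 6 and variance 1; subtracting 6 centres the result on 0.

```java
import java.util.Random;

class TwelveUniformsNormal {
    // Approximates a standard normal sample as (sum of 12 uniforms) - 6.
    // The result always lies in [-6, 6], so the tails are truncated.
    static double nextApproxNormal(Random rng) {
        double sum = 0.0;
        for (int i = 0; i < 12; i++) {
            sum += rng.nextDouble(); // uniform on [0, 1)
        }
        return sum - 6.0;
    }

    public static void main(String[] args) {
        Random rng = new Random(12345L); // arbitrary seed for the demo
        for (int i = 0; i < 5; i++) {
            System.out.println(nextApproxNormal(rng));
        }
    }
}
```

The approximation is cheap but cannot produce values beyond six standard deviations; java.util.Random.nextGaussian() is the built-in alternative when that matters.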

The Java language uses the algorithm documented in java.util.Random. The documentation also states that all implementations must use this algorithm:
seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
return (int)(seed >>> (48 - bits));
Hence it is not true for Java that "which algorithm a particular language uses varies".
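As a sanity check, the two documented lines can be wrapped in a small class and compared against java.util.Random itself. This is a sketch; the class name JavaLcgSketch is hypothetical. It relies on two further facts from the java.util.Random documentation: the constructor scrambles the seed with an XOR against the multiplier, and nextInt() is defined as next(32).

```java
import java.util.Random;

class JavaLcgSketch {
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long ADDEND = 0xBL;
    private static final long MASK = (1L << 48) - 1;

    private long seed;

    JavaLcgSketch(long seed) {
        // Initial scramble, as documented for Random.setSeed.
        this.seed = (seed ^ MULTIPLIER) & MASK;
    }

    // The 48-bit linear congruential step quoted above.
    int next(int bits) {
        seed = (seed * MULTIPLIER + ADDEND) & MASK;
        return (int) (seed >>> (48 - bits));
    }

    int nextInt() {
        return next(32);
    }

    public static void main(String[] args) {
        JavaLcgSketch mine = new JavaLcgSketch(42L);
        Random reference = new Random(42L);
        for (int i = 0; i < 5; i++) {
            System.out.println(mine.nextInt() + " " + reference.nextInt());
        }
    }
}
```

Both columns printed by main should be identical, since the sketch performs the same arithmetic as the documented generator.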

Detecting Common Elements in an Unsorted Array [closed]

I found some code that detects common elements in an unsorted array. The program runs in linear time, but I did not understand its logic. It would be very helpful if someone could explain it.
Here is the code:
public class DeleteUnsortedDataFromArray {
    public static List<Integer> findDuplicatesArray(int[] sequence){
        int bitarray = 0;
        for(int i=0; i< sequence.length; i++){
            int x = 1;
            x = x << sequence[i];
            if((bitarray & x) != 0){
                System.out.println("Duplicate found in given array: " + sequence[i]);
            } else {
                bitarray = bitarray | x;
            }
        }
        return null;
    }
    public static void main(String[] args) {
        int[] input = {1,1,2,3};
        findDuplicatesArray(input);
    }
}

It represents each value found as a 1 at the corresponding bit position of an integer (bitarray).
The lines:
x = 1;
x = x << sequence[i];
Will put a 1 at the position given by the sequence value+1 (<< is a shift operator).
For example, if sequence[i] value is four, x will have the binary value: ...010000
The line:
(bitarray & x) != 0
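Uses the bitwise AND operation to check whether a position is already occupied, and hence whether the value has already been found.
The problem is that this algorithm only works if the values in sequence are constrained to be small: between 0 and 31, as there are 32 bits in a Java int and the value 0 is represented as a 1 at position 0 of bitarray (position 31 is the sign bit, but it can still be set and tested).
You should also consider what happens when the sequence values are negative or larger than 31: Java uses only the low five bits of the shift count for an int, so such values alias onto positions 0 to 31 and produce false duplicates.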

It works only as long as all values in the array belong to [0, number-of-bits-in-int). If of course you can say 'works' about a function that is supposed to return list of duplicates but always returns null.

You can understand the algorithm as using an array of booleans to test whether a value has occurred previously in the array or not. Instead of an actual array of booleans, it uses the bits of an int to record whether a value has occurred previously. The code calls this "bitarray".
To set the Ith bit in an int, you use
x = (x | (1<< i));
Here '|' is the bitwise or operator.
And to test whether the Ith bit has been set you check the condition
if((x & (1<< i)) != 0){
}
here '&' is the bitwise and operator.
Moreover, the algorithm above only works if the values in the array are between 0 and 31 inclusive. That's because Java ints are represented using 32 bits. However, it consumes less space than alternatives such as a HashSet or an explicit array of booleans.
If space is at a premium and you know that the range of data in the array is small, you can consider using the java.util.BitSet class.
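A minimal sketch of the BitSet variant follows. The class and method names (BitSetDuplicates, findDuplicates) are hypothetical. Unlike the code in the question, it returns the duplicates instead of null, and because a BitSet grows as needed it is not limited to values below 32. It does assume all values are non-negative, since a BitSet index cannot be negative.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class BitSetDuplicates {
    // Returns every value that has already been seen earlier in the array,
    // in the order the repeats are encountered.
    static List<Integer> findDuplicates(int[] sequence) {
        BitSet seen = new BitSet();
        List<Integer> duplicates = new ArrayList<>();
        for (int value : sequence) {
            if (value < 0) {
                throw new IllegalArgumentException("negative value: " + value);
            }
            if (seen.get(value)) {
                duplicates.add(value); // bit already set: value occurred before
            } else {
                seen.set(value);       // first occurrence: mark the bit
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(new int[] {1, 1, 2, 3}));
    }
}
```

The memory used is roughly one bit per possible value up to the largest value seen, so this suits small, dense ranges; for sparse or very large values a HashSet is usually the better trade-off.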

Interesting thing happening when I take a number, multiply it by 10, and then add 1 [duplicate]

static int fn = 0;
static int sn = 0;
static boolean running = false;

public static void run()
{
    while (running == true)
    {
        fn = numbers[0];
        sn = numbers[1];
        if (sign == 0)
        {
            input.setText(String.valueOf(fn));
        }
    }
}

static class one implements ActionListener
{
    public void actionPerformed(ActionEvent e)
    {
        if (Display.sign == 0)
        {
            Display.numbers[0] = Display.numbers[0] *10;
            Display.numbers[0] = Display.numbers[0] +1;
        }
    }
}
This is the code for a calculator that I am programming (not all of it, of course). This is the part where I display the number on the screen, which works, but weirdly only up to 10 characters.
So after I get the program to display 1111111111 I want to do it once more and it gives me this weird number -1773790777. I am confused about how the program comes up with this. As you can see, above Display.numbers[] is the array I am storing the two numbers in. So to go over a place I multiply the number in the array by 10 then add 1. So how does this give me a negative number in the first place and what can I do to solve this problem?

Is your number overflowing?
You can check it by looking at Integer.MAX_VALUE (assuming you are using an int). If you go over that, the value wraps around and you get weird results like this. See - http://javapapers.com/core-java/java-overflow-and-underflow/ for more details.

It's overflowing!
1111111111*10 + 1 = 11111111111, which is 0x2964619C7 in hexadecimal. That is a 34-bit value, which can't be stored in a 32-bit int.
In Java arithmetic operations wrap around by default, so if the result overflowed then it'll be wrapped back to the other end of the value range. See How does Java handle integer underflows and overflows and how would you check for it?
Because of two's complement, the stored result is the low 32 bits of the true result: 11111111111 mod 2^32 = 2521176519 = 0x964619C7, which is -1'773'790'777 when interpreted as a 32-bit int. That is why you see that number. It is worth reading more about binary representation, since it is the basis of today's computers.
In Java 8 you'll have an easier way to detect overflow with the new *Exact methods
The platform uses signed two's complement integer arithmetic with int and long primitive types. The developer should choose the primitive type to ensure that arithmetic operations consistently produce correct results, which in some cases means the operations will not overflow the range of values of the computation. The best practice is to choose the primitive type and algorithm to avoid overflow. In cases where the size is int or long and overflow errors need to be detected, the methods addExact, subtractExact, multiplyExact, and toIntExact throw an ArithmeticException when the results overflow. For other arithmetic operations such as divide, absolute value, increment, decrement, and negation overflow occurs only with a specific minimum or maximum value and should be checked against the minimum or maximum as appropriate.
https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html
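A short sketch of the difference, applied to the calculator's "multiply by 10, then add the digit" step. The class and method names (OverflowDemo, appendDigitWrapping, appendDigitExact) are hypothetical; Math.multiplyExact and Math.addExact are the Java 8 methods quoted above.

```java
class OverflowDemo {
    // Plain int arithmetic: silently wraps around on overflow.
    static int appendDigitWrapping(int value, int digit) {
        return value * 10 + digit;
    }

    // Exact arithmetic: throws ArithmeticException instead of wrapping.
    static int appendDigitExact(int value, int digit) {
        return Math.addExact(Math.multiplyExact(value, 10), digit);
    }

    public static void main(String[] args) {
        // Wraps to the negative number from the question.
        System.out.println(appendDigitWrapping(1111111111, 1));
        try {
            appendDigitExact(1111111111, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

In a calculator, catching the exception is one way to refuse the extra digit; switching the stored numbers to long or BigInteger is another, depending on how many digits need to be supported.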

Bit-wise efficient uniform random number generation

I recall reading about a method for efficiently using random bits in an article on a math-oriented website, but I can't seem to get the right keywords in Google to find it anymore, and it's not in my browser history.
The gist of the problem that was being asked was to take a sequence of random numbers in the domain [domainStart, domainEnd) and efficiently use the bits of the random number sequence to project uniformly into the range [rangeStart, rangeEnd). Both the domain and the range are integers (more correctly, longs and not Z). What's an algorithm to do this?
Implementation-wise, I have a function with this signature:
long doRead(InputStream in, long rangeStart, long rangeEnd);
in is based on a CSPRNG (fed by a hardware RNG, conditioned through SecureRandom) that I am required to use; the return value must be between rangeStart and rangeEnd, but the obvious implementation of this is wasteful:
long doRead(InputStream in, long rangeStart, long rangeEnd) throws IOException {
    long retVal = 0;
    long range = rangeEnd - rangeStart;
    // Fill until we get to range
    for (int i = 0; (1L << (8 * i)) < range; i++) {
        long b = 0; // renamed so it no longer shadows the stream parameter
        do {
            b = in.read();
            // but be sure we don't exceed range
        } while (retVal + (b << (8 * i)) >= range);
        retVal += b << (8 * i);
    }
    return retVal + rangeStart;
}
I believe this is effectively the same idea as (rand() * (max - min)) + min, only we're discarding bits that push us over max. Rather than use a modulo operator which may incorrectly bias the results to the lower values, we discard those bits and try again. Since hitting the CSPRNG may trigger re-seeding (which can block the InputStream), I'd like to avoid wasting random bits. Henry points out that this code biases against 0 and 257; Banthar demonstrates it in an example.
First edit: Henry reminded me that summation invokes the Central Limit Theorem. I've fixed the code above to get around that problem.
Second edit: Mechanical snail suggested that I look at the source for Random.nextInt(). After reading it for a while, I realized that this problem is similar to the base conversion problem. See answer below.

Your algorithm produces biased results. Let's assume rangeStart=0 and rangeEnd=257. If the first byte is greater than 0, that will be the result. If it's 0, the result will be either 0 or 256 with 50/50 probability. So 0 and 256 are half as likely to be chosen as any other number.
I did a simple test to confirm this:
p(0)=0.001945
p(1)=0.003827
p(2)=0.003818
...
p(254)=0.003941
p(255)=0.003817
p(256)=0.001955
I think you need to do the same as java.util.Random.nextInt and discard the whole number, instead of just the last byte.
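A minimal sketch of that "discard the whole number" approach is shown below. The class name RejectionSampler is hypothetical, and this is not the actual source of java.util.Random.nextInt; it only applies the same rejection idea to bytes read from an InputStream. It reads just enough bytes to cover the range, masks the candidate down to the smallest power of two that covers the range, and starts over with fresh bytes whenever the candidate falls outside it.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class RejectionSampler {
    // Returns a value in [rangeStart, rangeEnd) without modulo bias.
    static long doRead(InputStream in, long rangeStart, long rangeEnd)
            throws IOException {
        long range = rangeEnd - rangeStart;
        if (range <= 0) {
            throw new IllegalArgumentException("rangeEnd must exceed rangeStart");
        }
        // Number of bits needed to represent range - 1.
        int bits = 64 - Long.numberOfLeadingZeros(range - 1);
        int bytes = (bits + 7) / 8;
        long mask = (bits == 0) ? 0L : (-1L >>> (64 - bits));
        while (true) {
            long candidate = 0;
            for (int i = 0; i < bytes; i++) {
                int b = in.read();
                if (b < 0) {
                    throw new EOFException("random source exhausted");
                }
                candidate = (candidate << 8) | b;
            }
            candidate &= mask; // keep only the bits that can matter
            if (candidate < range) {
                return rangeStart + candidate;
            }
            // Otherwise reject the whole candidate and try again.
        }
    }
}
```

Because of the mask, each attempt succeeds with probability greater than one half, so fewer than two attempts are needed on average. It still throws away the bits of rejected candidates, so it is unbiased but not the most bit-efficient option.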

After reading the source to Random.nextInt(), I realized that this problem is similar to the base conversion problem.
Rather than converting a single symbol at a time, it would be more effective to convert blocks of input symbols at a time through an accumulator "buffer" which is large enough to represent at least one symbol in the domain and in the range. The new code looks like this:
public int[] fromStream(InputStream input, int length, int rangeLow, int rangeHigh) throws IOException {
    int[] outputBuffer = new int[length];
    // buffer is initially 0, so there is only 1 possible state it can be in
    long numStates = 1;
    long buffer = 0;
    // Size of the output alphabet (must be positive)
    int alphaLength = rangeHigh - rangeLow;
    // Fill outputBuffer from 0 to length
    for (int i = 0; i < length; i++) {
        // Until buffer has sufficient data filled in from input to emit one symbol in the output alphabet, fill buffer.
        fill:
        while (numStates < alphaLength) {
            // Shift buffer by 8 (*256) to mix in new data (of 8 bits)
            buffer = buffer << 8 | input.read();
            // Multiply by 256, as that's the number of states that we have possibly introduced
            numStates = numStates << 8;
        }
        // spits out least significant symbol in alphaLength
        outputBuffer[i] = (int) (rangeLow + (buffer % alphaLength));
        // We have consumed the least significant portion of the input.
        buffer = buffer / alphaLength;
        // Track the number of states we've introduced into buffer
        numStates = numStates / alphaLength;
    }
    return outputBuffer;
}
There is a fundamental difference between converting numbers between bases and this problem, however; in order to convert between bases, I think one needs to have enough information about the number to perform the calculation - successive divisions by the target base result in remainders which are used to construct the digits in the target alphabet. In this problem, I don't really need to know all that information, as long as I'm not biasing the data, which means I can do what I did in the loop labeled "fill."
