How does Java's BigInteger nextProbablePrime method work?

I'm working with Java's BigInteger class and am curious about the algorithm behind the nextProbablePrime method. I know about some efficient primality testing algorithms like Miller-Rabin, but I'm not sure which algorithm is implemented here.
I have been running the following code for a long time and still get no response.
BigInteger number = BigInteger.ZERO;
number = number.setBit(82589933);     // number = 2^82589933, an 82-million-bit value
number = number.nextProbablePrime();  // searches upward for the next probable prime

I have gone through the source code of BigInteger. Internally, it uses the Miller-Rabin algorithm for the nextProbablePrime method.
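For a sense of what each primality check costs, here is a minimal sketch of a single Miller-Rabin round (a simplified illustration of the technique, not the JDK's actual code):

import java.math.BigInteger;

public class MillerRabinSketch {
    // One Miller-Rabin round: returns false if n is definitely composite,
    // true if n passes for the witness a ("probably prime").
    // Assumes n is odd and greater than 3.
    static boolean passesRound(BigInteger n, BigInteger a) {
        BigInteger nMinusOne = n.subtract(BigInteger.ONE);
        // Write n - 1 as d * 2^s with d odd.
        int s = nMinusOne.getLowestSetBit();
        BigInteger d = nMinusOne.shiftRight(s);

        // The expensive step: one modular exponentiation on full-size operands.
        BigInteger x = a.modPow(d, n);
        if (x.equals(BigInteger.ONE) || x.equals(nMinusOne)) return true;
        for (int i = 1; i < s; i++) {
            x = x.multiply(x).mod(n);  // repeated squaring mod n
            if (x.equals(nMinusOne)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        BigInteger n = BigInteger.valueOf(1_000_000_007L);
        for (long a : new long[] {2, 3, 5, 7}) {
            System.out.println("witness " + a + ": " + passesRound(n, BigInteger.valueOf(a)));
        }
    }
}

The modPow call is where the time goes: on an 82-million-bit candidate it is a gigantic computation on its own, and nextProbablePrime has to run several such rounds for every candidate it tries.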

Why your example runs and runs without returning:
Your number is about 82 million bits long, and (by the prime number theorem) primes of that size are on average about 82589933 · ln(2), i.e. roughly 57 million, numbers apart. So you're asking Miller-Rabin to test tens of millions of candidates, where each candidate is 82 million bits long and each check is non-trivial. So yes, even efficient algorithms like Miller-Rabin will take a very long time on such beyond-mind-bogglingly-big inputs.
(I remember once raising one number to the power of another, having it take too long, and complaining to the language developer that they should use repeated squaring for faster exponentiation ... before I stepped back and realized that my test number also had millions of digits.)

Related

Why are there no nextDouble(), nextFloat() and nextLong() methods which accept a bound in java.util.Random?

I was reading the java.util.Random class and noticed that there are no nextDouble(), nextFloat() or nextLong() methods which accept a bound.
There are many ways to get this done yourself.
But my question is why Java did not provide us with these methods the way it provides nextInt(int n), which accepts a bound.
Is there any specific reason they did not provide these methods?
A good API tries to provide the essential elements that a user needs to do their job.
Having nextInt(int n) is just one possible convenience. What if you need other distributions?
In other words: the Random API could try to anticipate all potential usage patterns, but that would bloat the whole API. Instead, the designers chose a very small interface - but one that gives you all the elements required to build your own things on top of it.
The thing is: in the end, this was a design-style decision by the people who created the Random class. And, as so often, the problem could have been solved in many different ways, so you shouldn't draw deep conclusions from the particular solution that was picked here.
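To illustrate, here is a minimal sketch of how bounded variants can be built on top of the existing primitives (BoundedRandom is my own name; the rejection loop mirrors the bias-avoidance idea quoted from Random.java further below):

import java.util.Random;

public class BoundedRandom {
    private final Random rnd = new Random();

    // Scale the unit interval: nextDouble() is uniform in [0, 1).
    // (A production version would guard against rounding up to exactly bound.)
    double nextDouble(double bound) {
        return rnd.nextDouble() * bound;
    }

    // Rejection sampling to avoid the modulo bias that arises because
    // 2^63 is not divisible by bound.
    long nextLong(long bound) {
        if (bound <= 0) throw new IllegalArgumentException("bound must be positive");
        long bits, candidate;
        do {
            bits = rnd.nextLong() >>> 1;               // non-negative 63-bit value
            candidate = bits % bound;
        } while (bits - candidate + (bound - 1) < 0);  // overflow => bits fell in the truncated last block
        return candidate;
    }
}

For what it's worth, newer JDKs closed this gap: since Java 17, java.util.Random implements the RandomGenerator interface, which does provide bounded nextLong(long), nextDouble(double) and nextFloat(float) overloads.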
Looking at the code of Random.java (in JDK 8), there are two comments that stand out:
* The algorithm is slightly tricky. It rejects values that would result
* in an uneven distribution (due to the fact that 2^31 is not divisible
* by n). The probability of a value being rejected depends on n.
and
* Linear congruential pseudo-random number generators such as the one
* implemented by this class are known to have short periods in the
* sequence of values of their low-order bits.
Without being an expert in random number generation (the docs refer to it as pseudo-random number generation), it seems evident that the algorithm is trying to do a better job of returning "random" numbers than if you simply did next() % bound, in terms of randomness (and possibly also efficiency).
There is also the convenience factor but that doesn't seem to be the primary reason, given the comments in the code.

What is the significance of modulo 10^9+7 used in CodeChef and SPOJ problems?

I was working on a problem which requires output as "For each line output the answer modulo 10^9+7". Why is modulo 10^9+7 included in the problem? What is its significance?
I'm not looking for a solution to the problem; only the significance of that particular constant.
Problems ask for results modulo primes because the alternatives, namely asking for a floating-point result giving the "high-order bits" and asking for the whole result, aren't always what the problem setter is looking for.
These problems are often "find and implement a recurrence" problems. The low-order bits will often tell you whether the recurrence you found is right.
There may be a special trick for the "high-order bits" problem, perhaps based on a clever analytic approximation.
The reason people don't often ask for the whole result is that this requires the contestant to implement big-number arithmetic.
Problem setters usually don't want unexpected tricks to crack their problems for "the wrong reasons."
10^9+7 winds up being a pretty good choice of prime. It is a "safe prime." What that means:
10^9+7 is a prime number, so the "Chinese remainder trick" doesn't apply. If you're trying to work something out modulo a product of two primes, say pq, then you can work it out modulo p and modulo q separately and use the extended Euclidean algorithm to put the pieces together; a prime modulus offers no such decomposition.
More than that, 10^9+6, which is (10^9+7)-1, is twice a prime. So the multiplicative group modulo 10^9+7 doesn't decompose into small pieces either, and hence no Chinese-remainder-like trick applies there.
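For concreteness, here is a minimal sketch of the "Chinese remainder trick" referred to above, i.e. the shortcut that a composite modulus would allow (the second prime and the test value here are just illustrative):

import java.math.BigInteger;

public class CrtSketch {
    // Given x mod p and x mod q for distinct primes p and q, recover x mod p*q.
    // BigInteger.modInverse internally runs the extended Euclidean algorithm.
    static BigInteger crt(BigInteger rp, BigInteger p, BigInteger rq, BigInteger q) {
        BigInteger k = rq.subtract(rp).multiply(p.modInverse(q)).mod(q);
        return rp.add(p.multiply(k));  // congruent to rp mod p and to rq mod q
    }

    public static void main(String[] args) {
        BigInteger p = BigInteger.valueOf(1_000_000_007L);  // the contest prime
        BigInteger q = BigInteger.valueOf(998_244_353L);    // another popular contest prime
        BigInteger x = BigInteger.valueOf(123_456_789_123_456_789L);
        // The two small residues are enough to rebuild x, since x < p*q.
        System.out.println(crt(x.mod(p), p, x.mod(q), q).equals(x));  // prints true
    }
}

Because 10^9+7 is prime (and (10^9+6)/2 is prime as well), no decomposition of this kind is available to shortcut the intended solution.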
In some problems the answers are very big numbers, but forcing you to implement long arithmetic is not the purpose of the problem authors. Therefore they ask you to calculate the answer modulo some number, like 1000000007, so that you don't have to implement big-number arithmetic, yet the answer is still verifiable.
If you were asked to give the answer modulo 10^9, you could mask the bits easily; to make problems tougher, a number such as 10^9+7 is chosen.
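As a small illustration of the "no long arithmetic needed" point, here is how a huge answer can be reported modulo 10^9+7 using nothing but 64-bit integers (the factorial is just a stand-in for whatever quantity a problem asks for):

public class ModArithmetic {
    static final long MOD = 1_000_000_007L;

    // n! mod p: every residue is below p < 2^30, so the product of two
    // residues fits comfortably in a 64-bit long before the reduction.
    static long factorialMod(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result = result * i % MOD;
        }
        return result;
    }

    public static void main(String[] args) {
        // 100000! has over 450,000 decimal digits, yet the modular answer
        // is computed with plain long arithmetic.
        System.out.println(factorialMod(100_000));
    }
}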

Logarithm Algorithm

I need to evaluate a logarithm in any base (it does not matter which) to some precision. Is there an algorithm for this? I program in Java, so I'm fine with Java code.
The question "How to find a binary logarithm very fast? (O(1) at best)" might answer mine, but I don't understand it. Can it be clarified?
Use this identity:
logb(n) = loge(n) / loge(b)
Where log can be a logarithm function in any base, n is the number and b is the base. For example, in Java this will find the base-2 logarithm of 256:
Math.log(256) / Math.log(2)
=> 8.0
Math.log() uses base e, by the way. And there's also Math.log10(), which uses base 10.
I know this is extremely late, but it may prove useful to some, since the matter here is precision. One way of doing this is essentially to implement a root-finding algorithm that is built, from the ground up, on the high-precision types you want to be using, and that consists only of simple +-x/ operations.
I would recommend implementing Newton's method, since it demands relatively few iterations and has great convergence. For this sort of application specifically, I believe it's fair to say it will always provide the correct result, provided good input validation is implemented.
Consider a simple constant a where

a = log_b(x)

is what we want; that is, a is sought such that it obeys

f(a) = b^a - x = 0.

We can use the Newton method iteratively to find a within any specified tolerance, where each i-th iterate can be computed by

a_{i+1} = a_i - f(a_i) / f'(a_i) = a_i - (b^(a_i) - x) / (b^(a_i) * ln(b)),

and the denominator is

f'(a_i) = b^(a_i) * ln(b),
because that's the first derivative of the function, as necessary for the Newton method. Once this is solved for, a is the direct answer to the a = log_b(x) problem, obtained through simple +-x/ operations, so you're already good to go.
"Wait, but there's a power in there?" Yes. If you can rely on your power function as being accurate enough, then there are no issues with going ahead and using it there. Otherwise, you can break the power operation down further into a series of other +-x/ operations: use these methods to split whatever decimal number is in the exponent into two integer power operations, which can be computed easily with a series of multiplications. This process will eventually leave you with nth roots to solve for, which you can also find with the Newton method. If you do go down that road, you can use the following for the Newton method:

r_{i+1} = r_i - (r_i^b - x) / (b * r_i^(b-1)),

which, as you can see, has to be solved recursively until you reach b = 1.
Phew, but yeah, that's it. This is how you can solve the problem while making sure you use high-precision types the whole way through, with only +-x/ operations. Below is a quick implementation I did in Excel to solve for log_2(3), compared with the solution given by the software's built-in function. As you can see, I can keep refining a until I reach the tolerance I want, by monitoring what the optimization function gives me. Here I used a = 2 as the initial guess, which should be fine for most cases.
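In the same spirit, here is a minimal Java sketch of the iteration above, with plain doubles and Math.pow/Math.log standing in for whatever high-precision power and log primitives you would actually use:

public class NewtonLog {
    // Newton's method for a = log_b(x): find the root of f(a) = b^a - x
    // via a <- a - (b^a - x) / (b^a * ln(b)).
    static double log(double x, double b, double tolerance) {
        double lnB = Math.log(b);  // constant across iterations
        double a = 2.0;            // the initial guess used above; a guess based
                                   // on bit length avoids overshoot for large x
        double fa;
        do {
            double pow = Math.pow(b, a);  // b^a
            fa = pow - x;                 // f(a)
            a -= fa / (pow * lnB);        // Newton step: subtract f(a) / f'(a)
        } while (Math.abs(fa) > tolerance);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(log(3, 2, 1e-12));          // ~1.584962500721156
        System.out.println(Math.log(3) / Math.log(2)); // library value, for comparison
    }
}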

Complexity running times LAB and fibonacci numbers (java)

I have been looking at this page and there are lots of great people helping out here. I have a lab assignment, and I know I have to write a method concerning the Fibonacci numbers that calculates the number at position n, but I'm not quite sure what to put inside the method; I know that's what I have to think about, and I hope you can give me an idea. I'm having trouble. (Not asking you to do the hw for me, ok?) Thank you.
Fibonacci numbers and complexity
Fibonacci numbers are defined recursively as follows:
F(n) = n, for n<=1
F(n) = F(n-1) + F(n-2) for n>1
Write the following methods to compute F(n):
a) A O(2n) method based on the recursive definition
b) A O(n) method that uses a loop
c) A O(1) method that uses the closed-form solution – feel free to look up this formula online.
Test all three methods using n = 10; 20; 50; 100; 1,000; 10,000; 100,000 and 1,000,000. If a particular algorithm and input combination does not return an answer in a reasonable amount of time, note that in your report (that is, don’t wait for hours (or worse) for your program to finish).
Well, to answer part c: there is a constant-time function that will calculate the nth Fibonacci number. You can find the formula for it here: http://en.wikipedia.org/wiki/Fibonacci_number#Closed_form_expression
I assume "Hw" means homework, so no code I'm afraid.
a) O(2n) and O(n) are the same thing. Do you mean O(2^n)? This will happen if you use the recursive method without caching the results.
b) This is the "obvious" way to implement it, using a procedural implementation and remembering the last two numbers and using those to calculate the next one. In pseudo-code it would be something like loop { a, b = b, a+b; }
c) This won't work for all n unless you have infinite precision, and infinite precision isn't O(1). For example, when I use doubles fib(73) works out to be 806515533049395, but actually it is 806515533049393. The difference is due to rounding errors when working with floating point numbers.
And regarding the O(n) solution, if you are going to calculate up to fib(1000000) then a 64-bit integer isn't going to be anywhere near enough to store the result. You'll need to use BigIntegers. Adding two BigIntegers is not an O(1) operation, so the O(n) performance I mentioned before is too optimistic.
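Putting the loop from (b) and the BigInteger remark together, a minimal sketch might look like this:

import java.math.BigInteger;

public class Fib {
    // The O(n) loop, using BigInteger so that results like fib(1000000)
    // (which has over 200,000 decimal digits) don't overflow.
    static BigInteger fib(int n) {
        BigInteger a = BigInteger.ZERO, b = BigInteger.ONE;
        for (int i = 0; i < n; i++) {
            BigInteger next = a.add(b);  // a, b = b, a+b
            a = b;
            b = next;
        }
        return a;
    }

    public static void main(String[] args) {
        // Prints 806515533049393: exact, unlike the double-based closed form above.
        System.out.println(fib(73));
    }
}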

What complexity are operations on BigInteger?

What complexity are the methods multiply, divide and pow in BigInteger currently? There is no mention of the computational complexity in the documentation (nor anywhere else).
If you look at the code for BigInteger (provided with the JDK), it appears to me that multiply(..) is O(n^2) (the actual work is done in the method multiplyToLen(..)). The code for the other methods is a bit more complex, but you can see for yourself.
Note: this is for Java 6. I assume it won't differ in Java 7.
As noted in the comments on @Bozho's answer, Java 8 and onwards use more efficient algorithms to implement multiplication and division than the naive O(N^2) algorithms in Java 7 and earlier.
Java 8 multiplication adaptively uses either the naive O(N^2) long multiplication algorithm, the Karatsuba algorithm, or the 3-way Toom-Cook algorithm, depending on the sizes of the numbers being multiplied. The latter two are (respectively) O(N^1.58) and O(N^1.46).
Java 8 division adaptively uses either Knuth's O(N^2) long division algorithm or the Burnikel-Ziegler algorithm. (According to the research paper, the latter takes 2K(N) + O(N log N) time for the division of a 2N-digit number by an N-digit number, where K(N) is the Karatsuba multiplication time for two N-digit numbers.)
Likewise some other operations have been optimized.
There is no mention of the computational complexity in the documentation (nor anywhere else).
Some details of the complexity are mentioned in the Java 8 source code. The reason that the javadocs do not mention complexity is that it is implementation specific, both in theory and in practice. (As illustrated by the fact that the complexity of some operations is significantly different between Java 7 and 8.)
There is a new "better" BigInteger class that is not being used by the Sun JDK, out of conservatism and for lack of useful regression tests (huge data sets). The guy who did the better algorithms may have discussed the old BigInteger in the comments.
Here you go: http://futureboy.us/temp/BigInteger.java
Measure it. Do operations with linearly increasing operands and plot the times on a diagram.
Don't forget to warm up the JVM (several runs) to get valid benchmark results.
Whether operations are linear O(n), quadratic O(n^2), polynomial or exponential should then be obvious.
EDIT: While you can give algorithms theoretical bounds, they may not be that useful in practice. First of all, the complexity does not give the constant factor. Some linear or subquadratic algorithms are simply not useful because they eat so much time and resources that they are not adequate for the problem at hand (e.g. Coppersmith-Winograd matrix multiplication).
Also, your computation may have quirks you can only detect by experiment. There are preparatory algorithms which do nothing to solve the problem itself but speed up the real solver (matrix conditioning). There are suboptimal implementations. With longer lengths, your speed may drop dramatically (cache misses, memory movement etc.). So for practical purposes, I advise doing experiments.
The best thing is to double the length of the input each time and compare the times.
And yes, you can find out whether an algorithm has n^1.5 or n^1.8 complexity. Simply quadruple the input length: a quadratic algorithm then needs 16 times as long, while an n^1.5 algorithm needs only 8 times as long - half of that. Separating 1.8 from 2 takes bigger inputs: multiply the length by 256, and the n^1.8 algorithm needs only about a third of the time of the quadratic one (256^0.2 ≈ 3).
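Such an experiment for multiply might look like this (a rough sketch; for serious numbers you would average many runs per size or use a harness like JMH):

import java.math.BigInteger;
import java.util.Random;

public class MultiplyBench {
    public static void main(String[] args) {
        Random rnd = new Random(42);
        // Warm up the JIT before timing anything.
        for (int i = 0; i < 1_000; i++) {
            new BigInteger(10_000, rnd).multiply(new BigInteger(10_000, rnd));
        }
        // Double the operand size each round; for an O(n^k) multiply,
        // the ratio between consecutive times should approach 2^k.
        for (int bits = 1 << 16; bits <= 1 << 22; bits <<= 1) {
            BigInteger a = new BigInteger(bits, rnd);
            BigInteger b = new BigInteger(bits, rnd);
            long start = System.nanoTime();
            a.multiply(b);
            System.out.printf("%8d bits: %8.2f ms%n", bits, (System.nanoTime() - start) / 1e6);
        }
    }
}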
