The following code returns all the prime factors of the given number n.
Approach behind the algorithm:
Iterate i over the numbers from 2 while i <= n/i (equivalently, i <= sqrt(n)) to collect the prime factors.
The inner loop shrinks the number by dividing it by the current prime; if the same prime divides n more than once, it keeps dividing, recording each occurrence.
The final if statement adds the last and largest prime factor whenever the remaining n is greater than 1, since by that point n has been reduced to exactly that prime.
static List<Integer> getAllPrimes(int n) {
    List<Integer> factors = new ArrayList<Integer>();
    for (int i = 2; i <= n / i; ++i) {
        while (n % i == 0) {
            factors.add(i);   // record each occurrence of the prime factor i
            n /= i;
        }
    }
    if (n > 1) {              // whatever remains is the largest prime factor
        factors.add(n);
    }
    return factors;
}
How would the running time be determined for this algorithm? Each time the inner loop iterates, it shrinks n by some factor (n/2, n/3, etc., depending on the prime i), so a simple count of the outer loop alone doesn't tell the whole story.
When analyzing an algorithm like this, it's often helpful to clarify whether you're looking for a best-case, average-case, or worst-case analysis, since the answer might differ in each case.
Let's start with a worst-case analysis. What would have to happen to keep this algorithm running as long as possible? Well, if we never divide out any prime factors, then the outer loop will run as many times as possible. Specifically, it'll run Θ(√n) times. This only happens if the number in question is prime, and so we can say that the worst case occurs on prime number inputs, where the runtime is Θ(√n).
What about the best case? The algorithm terminates either when i gets too large for n or n gets too small for i. It's significantly faster to shrink n than to grow i, because n drops geometrically while i increases arithmetically. The ideal input is therefore one whose value drops as fast as possible, which happens when it has only small factors (such numbers are called smooth). The extreme case is a perfect power of two: the algorithm halves n repeatedly until it drops to 1. That's the hallmark of logarithmic behavior, so in the best case the runtime is Θ(log n).
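To make the two extremes concrete, here is a small sketch (my addition, reusing the loop from the code above) that counts how many times the outer loop runs:

static int countOuterIterations(int n) {
    int iterations = 0;
    for (int i = 2; i <= n / i; ++i) {
        iterations++;
        while (n % i == 0) n /= i;   // same reduction as in getAllPrimes
    }
    return iterations;
}

For a prime such as 999983 this returns on the order of sqrt(n) iterations; for a power of two such as 1 << 20, the outer loop stops after its first pass, with the inner while doing the ~log2(n) divisions.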
Related
The Babylonian method, a.k.a. Heron's method, seems to be one of the faster algorithms for finding the square root of a number n. How fast it converges depends on how far off your initial guess is.
Now, as the number n increases, its root x shrinks as a percentage of n:
root(10) ≈ 3.16, about 31% of 10
root(100) = 10, 10% of 100
root(1000) ≈ 31.6, about 3% of 1000
root(10000) = 100, 1% of 10000
So basically, for each digit in the number beyond the first, divide by about 3 (i.e. multiply by 0.3), and use the result as your initial guess. Such as:
public static double root(double n) {
    // If the number is 0 or negative, return it unchanged
    if (n <= 0)
        return n;
    // Helper assumed to return the number of digits of n
    int num = numDigits(n);
    double guess = n;
    // Multiply by 0.3 for every digit from the second digit onwards
    for (int i = 0; i < num - 1; ++i)
        guess = guess * 0.3;
    // Repeat until guess^2 is within the margin of error of n
    while (Math.abs(n - guess * guess) > 0.000001) {
        double divide = n / guess;
        guess = 0.5 * (divide + guess);
    }
    return guess;
}
Does this help optimize the algorithm? And is it O(n)?
Yes. What works even better is to exploit the floating-point representation by dividing the binary exponent approximately by two, because operating on the floating-point bits directly is very fast. See Optimized low-accuracy approximation to `rootn(x, n)`.
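As a rough illustration of that bit-level trick, here is a hedged sketch (the method name and constant layout are mine, and this is only a crude first approximation for positive inputs, not the linked optimized routine):

static double roughSqrt(double x) {
    final long BIAS = 1023L << 52;              // IEEE-754 exponent bias, in place
    long bits = Double.doubleToLongBits(x);     // assumes x > 0
    long halved = ((bits - BIAS) >> 1) + BIAS;  // approximately halves the exponent
    return Double.longBitsToDouble(halved);
}

The result is within a small constant factor of the true square root, which makes it a good starting guess for Heron's method.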
My belief is that the complexity of an algorithm is independent of the input provided: complexity is a general characteristic of an algorithm, so we cannot say that algorithm X has complexity O1 for input I1 and complexity O2 for input I2. Thus, no matter what initial value you provide, it should not improve complexity. It may reduce the number of iterations for that particular case, but that's a different thing: reducing the number of iterations by half still means the same complexity. Keep in mind that n, 2*n, and n/3 all fit into the O(n) class.
Now, with regard to the actual complexity, I read on Wikipedia (https://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method) that
This is a quadratically convergent algorithm, which means that the number of correct digits of the approximation roughly doubles with each iteration.
This means you need as many iterations as the number of decimal places of precision you expect, which is constant: if you need 10 exact decimals, 10 is a constant, totally independent of n.
But in Wikipedia's example, they chose from the very beginning a candidate with the same order of magnitude as the correct answer (600 compared to 354). However, if your initial guess is wrong by orders of magnitude, you will need some extra iterations to reach the right number of digits, which adds complexity. Suppose the correct answer is 10000 while your initial guess is 10: the difference is 4 orders of magnitude, and I think the extra work needed to reach the correct magnitude is proportional to the difference between the number of digits of your guess and of the correct answer. Since the number of digits is approximately log(n), the extra complexity is |log(correct_answer) - log(initial_guess)|.
To avoid this, pick a starting guess that has the right number of digits, which is generally half the number of digits of the input number. My choice would be the first half of the number's digits (from 123456, keep 123; from 1234567, either 123 or 1234). In Java, you could use string or byte operations to keep the first half of whatever representation is in memory, so you need no iteration, just an operation of constant complexity.
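A minimal sketch of that digit-halving guess (the helper name is mine):

static long firstHalfGuess(long n) {
    String s = Long.toString(n);
    return Long.parseLong(s.substring(0, (s.length() + 1) / 2));
}

firstHalfGuess(123456) returns 123 and firstHalfGuess(1234567) returns 1234, matching the examples above.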
For n ≥ 4, sqrt(n) = 2 sqrt(n / 4). For n < 1, sqrt(n) = 1/2 sqrt(n × 4). So you can always multiply or divide by 4 to normalize n on the range [1, 4).
Once you do that, take sqrt(4) = 2 as the starting point for Heron's algorithm, since that is the geometric mean and will yield the greatest possible improvement per iteration, and unroll the loop to perform the needed number of iterations for the desired accuracy.
Finally, multiply or divide by all the factors of 2 that you removed at the beginning. Note that multiplying and dividing by 2 or 4 is easy and fast for binary computers.
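Here is a compact sketch of that scheme, assuming n > 0; the fixed count of five iterations is my choice, since quadratic convergence from a guess within a factor of 2 of the root reaches double precision in about that many steps:

static double sqrtNormalized(double n) {
    double scale = 1.0;
    while (n >= 4.0) { n /= 4.0; scale *= 2.0; }   // sqrt(4k) = 2 * sqrt(k)
    while (n < 1.0)  { n *= 4.0; scale /= 2.0; }   // sqrt(k/4) = sqrt(k) / 2
    double guess = 2.0;                            // sqrt(4), geometric mean of [1, 4)
    for (int i = 0; i < 5; i++)                    // fixed count, could be unrolled
        guess = 0.5 * (guess + n / guess);
    return guess * scale;
}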
I discuss this algorithm at my blog.
I need help. I have a method which determines whether an int is prime:
public boolean isPrime(int n) {
    if (n % 2 == 0) {
        return false;
    }
    for (int i = 3; i < n; i += 2) {
        if (n % i == 0) {
            return false;
        }
    }
    return true;
}
Could anyone tell me how to determine the worst-case running time of this program?
Then, letting B equal the number of bits in the binary representation of N, what would the worst-case running time be in terms of B?
Thanks! :)
The running time of this function is O(n) at worst, and in fact the worst case only comes up if n really is a prime number.
Also, if you want to detect all primes in a range from, for example, 1 to n, your runtime will be O(n^2).
The asymptotic time complexity of your prime number calculator is O(n). There is no reason to incorporate "B" into the calculation of time complexity in this case.
At first glance it seems it will take O(n), but there is a subtlety at the edge of the int range: if n = Integer.MAX_VALUE (2147483647), then at some point i reaches 2147483645; adding 2 makes i equal to 2147483647, and one further addition would overflow to -2147483647, skipping into the negative numbers. (As it happens, the i < n test stops the loop just before that addition matters, but it is the kind of boundary worth checking.)
Your function incorrectly states that 2 is not a prime number. The third line that currently reads return false should read return (n == 2) or whatever the proper syntax for that is in Java.
Running time for the algorithm is O(n), which will occur whenever n is prime. You could improve the run time to O(sqrt n) by changing the test in the for loop to i * i <= n.
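For reference, a corrected variant along those lines, with the bound written as i <= n / i to avoid the int overflow that i * i can hit near Integer.MAX_VALUE:

public boolean isPrime(int n) {
    if (n < 2) return false;          // 0, 1 and negatives are not prime
    if (n == 2) return true;          // the only even prime
    if (n % 2 == 0) return false;
    for (int i = 3; i <= n / i; i += 2) {
        if (n % i == 0) return false;
    }
    return true;
}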
The snarky response is to say that it finishes in O(1), because n is bounded above by 2^31 - 1 in Java. Hence, I can find a single constant K such that the time is guaranteed to be less than K.
However, this response does not address what is intended by your question, which is, how the time differs as a function of n within a "reasonable" range for n. As others have pointed out already, your running time will be O(n).
In terms of B, there's a potential bit of confusion, because Java integers are always represented with 32 bits. However, if we ask how many bits a given value of N actually requires, we can note that 2^(B-1) <= N < 2^B. So if your calculation requires O(N) operations, it necessarily requires O(2^B) operations in terms of the bit length B.
Also, as others have pointed out, there's an error condition for the value 2 - easily fixable. In addition, this calculation could be easily improved using Sieve methods. While you did not really ask about optimization, I provide the service free of charge. :) See: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes for an interesting approach to addressing this problem. And believe it or not, this is really just the beginning of optimizing this. There are many more things you can do from there.
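For the curious, a minimal sieve sketch in the spirit of that link (my own illustration; assumes the limit is well below Integer.MAX_VALUE):

static boolean[] sieve(int limit) {
    boolean[] composite = new boolean[limit + 1];
    for (int p = 2; (long) p * p <= limit; p++) {
        if (!composite[p]) {
            for (int multiple = p * p; multiple <= limit; multiple += p) {
                composite[multiple] = true;   // every multiple of a prime is composite
            }
        }
    }
    return composite;   // for i >= 2, composite[i] == false means i is prime
}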
Trying to brush up on my Big-O understanding for an upcoming test (only a very basic Big-O understanding is required, obviously), I was doing some practice problems in my book.
They gave me the following snippet
public static void swap(int[] a)
{
    int i = 0;
    int j = a.length - 1;
    while (i < j)
    {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
        i++;
        j--;
    }
}
Pretty easy to understand, I think. It has two indices that together cover the whole array, each doing a fixed amount of work per step (which I think clocks them both at O(n/2)).
Therefore O(n/2) + O(n/2) = O(2n/2) = O(n)
Now please forgive me, as this is my current understanding and that was my attempt at the solution. I have found many big-O examples online, but none quite like this, where two indices move toward each other and modify the array at basically the same time.
The fact that it has one loop is making me think it's O(n) anyway.
Would anyone mind clearing this up for me?
Thanks
The fact that it has one loop is making me think it's O(n) anyway.
This is correct: not because it is one loop, but because it is one loop whose iteration count depends on the size of the array by only a constant factor. Big-O notation ignores constant factors: O(n) means the running time is governed by the size of the array, and the fact that the loop actually takes half that many steps does not matter.
In other words: whether your algorithm takes time n + X, X*n, or X*n + Y, it all comes down to O(n).
It would be different if the number of iterations grew as something other than a constant factor of n, for instance logarithmically: if size 100 needs 2 iterations, size 1000 needs 3, and size 10000 needs 4, then the algorithm would be, for instance, O(log(n)).
It would also be different if the loop were independent of the size. If you always loop 100 times, regardless of input size, your algorithm is O(1), i.e. it operates in constant time.
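A tiny illustration of that logarithmic case:

static int countDoublings(int n) {
    int count = 0;
    for (long i = 1; i < n; i *= 2)
        count++;                      // one iteration per doubling
    return count;                     // countDoublings(1024) == 10, i.e. log2(n)
}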
I was also wondering if the equation I came up with to get there was somewhere in the ballpark of being correct.
Yes. In fact, if your equation ends up being some form of n * C + Y, where C is some constant and Y is some other value, the result is O(n), regardless of whether C is greater than 1 or smaller than 1.
You are right about the loop: the loop determines the Big O. But the loop runs only for half the array.
So it's 2 + 6*(n/2).
If we make n very large, the other terms are really small, so they won't matter.
So it's O(n).
Let's say you are running 2 separate loops: 2 + 6*(n/2) + 6*(n/2). In that case it will be O(n) again.
But if we run a nested loop, 2 + 6*(n*n), then it will be O(n^2).
Always remove the constants and do the math. You've got the idea.
As j - i decreases by 2 units on each iteration, N/2 iterations are taken (assuming N = length(a)).
Hence the running time is indeed O(N/2), and O(N/2) is strictly equivalent to O(N).
I am taking an algorithms course on Coursera, and I am stuck on this particular problem: I am supposed to find the time complexity of this code.
int sum = 0;
for (int i = 1; i <= N*N; i = i*2)
{
    for (int j = 0; j < i; j++)
        sum++;
}
I checked it in Eclipse itself; for every value of N I tried, the number of times the sum statement was executed appeared to be less than N.
final value of sum:
for N=8 sum=3
for N=16 sum=7
for N=100000 sum=511
so the time complexity should be less than N,
but the answer that is given is N raised to the power 2. How is that possible?
What I have done so far:
the outer loop will run log(N^2) times, and consequently the inner loop will execute 1, 2, 4, ..., up to 2^(2 log N) times.
The inner loop will run 1 + 2 + 4 + 8 + ... + 2^M times in total, where 2^M <= N * N.
The sum of the powers of 2 up to N * N is approximately 2 * N * N, i.e. O(N^2).
Note: when N = 100000, N*N overflows, so its result is misleading. If you consider overflow to be part of the problem, the sum is fairly random for large numbers, so you could argue it's O(1): if N = 2^15, then N^2 = 2^30 and the sum will be Integer.MAX_VALUE; no higher value of N will give a higher sum.
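One way to verify the geometric-sum claim without tripping over the int overflow just mentioned is to redo the count in long arithmetic (a sketch, not the original code):

static long countSumIncrements(long N) {
    long sum = 0;
    for (long i = 1; i <= N * N; i *= 2)
        sum += i;        // the inner loop body executes i times on this pass
    return sum;          // between N*N and 4*N*N, i.e. Theta(N^2)
}

For example, countSumIncrements(8) == 127, which is about 2 * 64.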
There is a lot of confusion here, but the important thing is that Big-O notation is all about growth rate, or limiting behavior as mathematicians say. That a function executes in O(n*n) means the execution time increases faster than, for example, n, but slower than, for example, 2^n.
When reasoning with big-O notation, remember that constants "do not count". There are a few quirks in this particular question.
The N*N bound by itself would suggest an O(n*n) outer loop if the loop used a regular i++ increment...
...however, the increment is i = i*2, so the outer loop executes only about log(n*n) = 2 log n times, and the function would be in O(log n) if the inner loop's body ran in time independent of n.
But the inner loop's run time does depend on n: on the pass where i has the value 2^m it performs 2^m iterations, and these double each time, summing to 1 + 2 + 4 + ... + N*N, which is about 2*N*N. Remembering that "constants don't count", we end up in O(n*n).
Hope this cleared things up.
So sum++ will be executed 1 + 2 + 4 + 8 + ... + N*N times, with the outer loop running log2(N*N) times. The sum of the geometric progression is 1 * (1 - 2^(log2(N*N)+1)) / (1 - 2) = 2*N*N - 1 = O(N*N).
Your outer loop runs log(N^2) = 2*log(N), i.e. O(log N), times, and the inner loop runs up to N^2 times on its last pass, so multiplying the two gives the quick upper bound N^2*log(N). But because the inner counts double each pass, their sum is a geometric series of about 2*N^2, so the tight complexity is N^2.
About the benchmark: values like N=8 or N=16 are too small to tell you anything; the time spent in the loop is marginal compared with JVM startup, cache misses, and so on. You must:
Begin with the biggest N, and check how the time evolves.
Make multiple runs with each value of N.
Remember that time complexity is a measure of how the algorithm behaves when N becomes really big.
You are given a list of n numbers L = <a_1, a_2, ..., a_n>. Each of them is
either 0 or of the form +/-2^k, 0 <= k <= 30. Describe and implement an
algorithm that returns the largest product of a CONTINUOUS SUBLIST
p = a_i * a_{i+1} * ... * a_j, 1 <= i <= j <= n.
For example, for the input <8 0 -4 -2 0 1> it should return 8 (either 8
alone or (-4)*(-2)).
You can use any standard programming language and can assume that
the list is given in any standard data structure, e.g. int[],
vector<int>, List<Integer>, etc.
What is the computational complexity of your algorithm?
In my first answer I addressed the OP's problem in "multiplying two big big numbers". As it turns out, this wish is only a small part of a much bigger problem which I'm going to address now:
"I still haven't arrived at the final skeleton of my algorithm I wonder if you could help me with this."
(See the question for the problem description)
All I'm going to do is explain the approach Amnon proposed in little more detail, so all the credit should go to him.
You have to find the largest product of a continuous sublist from a list of integers which are powers of 2. The idea is to:
Compute the product of every continuous sublist.
Return the biggest of all these products.
You can represent a sublist by its start and end index. For start = 0 there are n possible values for end, namely 0..n-1; this generates all sublists that start at index 0. In the next iteration, you increment start by 1 and repeat the process (this time there are n-1 possible values for end). This way you generate all possible sublists.
Now, for each of these sublists, you have to compute the product of its elements; that is, come up with a method computeProduct(List wholeList, int startIndex, int endIndex). You can either use the built-in BigInteger class (which should be able to handle the input provided by your assignment) to save yourself further trouble, or try to implement a more efficient way of multiplication as described by others. (I would start with the simpler approach, since it's easier to check that your algorithm works correctly, and only then try to optimize it.)
Now that you're able to iterate over all sublists and compute the product of their elements, determining the sublist with the maximum product should be the easiest part.
If it's still too hard for you to make the connection between the two steps, let us know, but please also provide us with a draft of your code as you work on the problem, so that we don't end up incrementally constructing the solution while you copy&paste it.
edit: Algorithm skeleton
public BigInteger listingSublist(BigInteger[] biArray)
{
    int start = 0;
    int end = biArray.length - 1;
    BigInteger maximum = BigInteger.ZERO;
    for (int i = start; i <= end; i++)
    {
        for (int j = i; j <= end; j++)
        {
            // insert logic to determine the maximum product, e.g.
            // maximum = maximum.max(computeProduct(biArray, i, j));
            computeProduct(biArray, i, j);
        }
    }
    return maximum;
}

public BigInteger computeProduct(BigInteger[] wholeList, int startIndex,
                                 int endIndex)
{
    // insert logic here to return
    // wholeList[startIndex].multiply(wholeList[startIndex+1]).mul...(
    //     wholeList[endIndex]);
    return null; // placeholder so the skeleton compiles
}
Since k <= 30, any integer i = 2^k will fit into a Java int. However, the product of two such integers might not fit into a Java int, since 2^k * 2^k = 2^(2k) <= 2^60, which does fit into a Java long. This should answer your question regarding the "(multiplication of) two numbers...".
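In code, the ranges look like this:

int largestFactor = 1 << 30;                          // 2^30 fits in an int
long product = (long) largestFactor * largestFactor;  // 2^60 fits in a long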
In case you want to multiply more than two numbers, which is implied by your assignment saying "...largest product of a CONTINUOUS SUBLIST..." (a sublist's length can be > 2), have a look at Java's BigInteger class.
Actually, the most efficient way to multiply here is to do addition instead. In this special case all you have are numbers that are powers of two, so you can get the product of a sublist by simply adding the exponents together (and counting the negative numbers in the product: an odd count makes the result negative).
Of course, to store the result you may need BigInteger if you run out of bits. Or, depending on how the output should look, just print (+/-)2^N, where N is the sum of the exponents.
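A sketch of that exponent bookkeeping, assuming the list has already been split into zero-free runs (the helper names are mine):

static long exponentSum(int[] a, int from, int to) {
    long sum = 0;
    for (int k = from; k <= to; k++)
        sum += Integer.numberOfTrailingZeros(Math.abs(a[k])); // log2 of a power of two
    return sum;
}

static int productSign(int[] a, int from, int to) {
    int negatives = 0;
    for (int k = from; k <= to; k++)
        if (a[k] < 0) negatives++;
    return negatives % 2 == 0 ? 1 : -1;   // an odd count of negatives flips the sign
}

The product of a[from..to] is then productSign(...) * 2^exponentSum(...).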
Parsing the input could be a matter of switch-case, since there are only 31 possible exponents (k = 0..30) to take care of, plus the negatives.
That's the boring part. The interesting part is how you find the sublist that produces the largest number. You could take the brute-force approach and check every single variation, but that would be an O(N^2) algorithm in the worst case (IIRC), which is really not very good for longer inputs.
What can you do? I'd probably start from the largest non-negative number in the list as a sublist, and grow the sublist to take in as many non-negative numbers as possible in each direction. Then, having all the positives within reach, proceed with pairs of negatives on both sides, i.e. only grow if you can grow on both sides of the list. If you cannot grow in both directions, try one direction with an even number (two, four, six, etc.) of consecutive negative numbers. If you cannot grow even that way, stop.
Well, I don't know if this algorithm even works, but if it (or something similar) does, it's an O(N) algorithm, which means great performance. Let's try it out! :-)
Hmmm... since they're all powers of 2, you can just add the exponents instead of multiplying the numbers (equivalent to taking the logarithm of the product). For example, 2^3 * 2^7 is 2^(3+7) = 2^10.
I'll leave handling the sign as an exercise to the reader.
Regarding the sublist problem, there are fewer than n^2 pairs of (begin, end) indices. You can check them all, or try a dynamic programming solution.
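One common dynamic-programming formulation (my own illustration, not spelled out in the answer above) tracks both the largest and the smallest product of a sublist ending at each index, since multiplying by a negative number can turn the smallest into the largest. A sketch with doubles for brevity; the actual assignment would use exponent sums or BigInteger:

static double maxProduct(double[] a) {
    double best = a[0], maxEnding = a[0], minEnding = a[0];
    for (int i = 1; i < a.length; i++) {
        double hi = Math.max(a[i], Math.max(maxEnding * a[i], minEnding * a[i]));
        double lo = Math.min(a[i], Math.min(maxEnding * a[i], minEnding * a[i]));
        maxEnding = hi;                   // best product of a sublist ending at i
        minEnding = lo;                   // worst (most negative) product ending at i
        best = Math.max(best, maxEnding);
    }
    return best;
}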
EDIT: I adjusted the algorithm outline to match the actual pseudo code and put the complexity analysis directly into the answer:
Outline of algorithm
Go sequentially over the sequence and keep a running product (the positive one) together with the first/last index since the last 0. Maintain a second running product (the negative one) consisting only of the numbers since the first sign change of the current run. Whenever you hit a negative sequence element, swap the two products (positive and negative) along with their associated start indices. Whenever the positive product hits a new maximum, store it with the associated start and end indices. After the whole sequence has been processed, the result is in the maximum variables.
To avoid overflow, calculate in binary logarithms and keep an additional sign.
Pseudo code
maxProduct = 0
maxProductStartIndex = -1
maxProductEndIndex = -1

sequence.push_front( 0 )   // reuses the variable initialization of the case n == 0

for every index of sequence
    n = sequence[index]
    if n == 0
        posProduct = 0
        negProduct = 0
        posProductStartIndex = index + 1
        negProductStartIndex = -1
    else
        if n < 0
            swap( posProduct, negProduct )
            swap( posProductStartIndex, negProductStartIndex )
            if -1 == posProductStartIndex   // start second sequence on sign change
                posProductStartIndex = index
            end if
            n = -n
        end if
        logN = log2(n)   // as indicated, all arithmetic is done on the logarithms
        posProduct += logN
        if -1 < negProductStartIndex   // start the second product as soon as the sign changes first
            negProduct += logN
        end if
        if maxProduct < posProduct   // update current best solution
            maxProduct = posProduct
            maxProductStartIndex = posProductStartIndex
            maxProductEndIndex = index
        end if
    end if
end for

// output solution
print "The maximum product is " 2^maxProduct "."
print "It is reached by multiplying the numbers from sequence index "
print maxProductStartIndex " to sequence index " maxProductEndIndex
Complexity
The algorithm uses a single loop over the sequence, so it is O(n) times the complexity of the loop body. The most complicated operation of the body is log2; ergo it's O(n) times the complexity of log2. The log2 of a number of bounded size is O(1), so the resulting complexity is O(n), i.e. linear.
I'd like to combine Amnon's observation about multiplying powers of 2 with one of mine concerning sublists.
Lists are terminated hard by 0s. We can break the problem down into finding the biggest product in each zero-delimited sub-list, and then take the maximum of those. (Others have mentioned this.)
This is my 3rd revision of this writeup. But 3's the charm...
Approach
Given a list of non-0 numbers (this is what took a lot of thinking), there are two cases:
The list contains an even number of negative numbers (possibly 0). This is the trivial case, the optimum result is the product of all numbers, guaranteed to be positive.
The list contains an odd number of negative numbers, so the product of all numbers would be negative. To change the sign, it becomes necessary to sacrifice a subsequence containing a negative number. Two sub-cases:
a. sacrifice numbers from the left up to and including the leftmost negative; or
b. sacrifice numbers from the right up to and including the rightmost negative.
In either case, return the product of the remaining numbers. Having sacrificed exactly one negative number, the result is certain to be positive. Pick the winner of (a) and (b).
Implementation
The input needs to be split into subsequences delimited by 0. The list can be processed in place if a driver method is built to loop through it and pick out the beginnings and ends of non-0 sequences.
Doing the math in longs would only double the possible range. Converting to log2 makes arithmetic with large products easy: it prevents program failure on long sequences of large numbers. It would alternatively be possible to do all the math in Bignums, but that would probably perform poorly.
Finally, the end result, still a log2 value, needs to be converted into printable form. Bignum comes in handy there: new BigInteger("2").pow(log) will raise 2 to the power of log.
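Putting the case analysis together for one zero-free run, here is a sketch operating on log2 exponents (the exps/neg inputs are my own framing: the exponent of each |a_i| and whether a_i was negative):

static long bestExponentSum(int[] exps, boolean[] neg) {
    long total = 0;
    int negCount = 0, firstNeg = -1, lastNeg = -1;
    for (int i = 0; i < exps.length; i++) {
        total += exps[i];
        if (neg[i]) {
            negCount++;
            if (firstNeg < 0) firstNeg = i;
            lastNeg = i;
        }
    }
    if (negCount % 2 == 0) return total;                // case 1: take everything
    long dropLeft = 0, dropRight = 0;
    for (int i = 0; i <= firstNeg; i++) dropLeft += exps[i];          // case 2a
    for (int i = lastNeg; i < exps.length; i++) dropRight += exps[i]; // case 2b
    return total - Math.min(dropLeft, dropRight);       // keep the larger remainder
}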
Complexity
This algorithm works sequentially through the sub-lists, processing each one only once. Within each sub-list there's the annoying work of converting the input to log2 and the result back, but that effort is linear in the size of the list. In the worst case, the sum over much of the list is computed twice, but that is also linear complexity.
See this code: it implements the exact factorial of a very large number, using just an integer array to represent big numbers. The code can be downloaded from Planet Source Code.