I need help.
I have a method that determines whether an int is prime:
public boolean isPrime(int n) {
    if (n % 2 == 0) {
        return false;
    }
    for (int i = 3; i < n; i += 2) {
        if (n % i == 0) {
            return false;
        }
    }
    return true;
}
Could anyone tell me how to determine the worst-case running time of this program?
Then let B equal the number of bits in the binary representation of N. What would be the worst-case running time in terms of B?
Thanks! :)
The running time of this function is O(n) in every case, though in fact the worst case only comes up if n really is a prime number.
Also, if you want to detect all primes in a range from, for example, 1 to n, your runtime will be O(n^2).
The asymptotic time complexity of your prime number calculator is O(n). There is no reason to incorporate "B" into the calculation of time complexity in this case.
Well, at first glance it seems it will take O(n), but there is an issue here: if n = Integer.MAX_VALUE, which equals 2147483647, at some point i will reach 2147483645; when you add 2, i becomes 2147483647, and on the next addition it would overflow to -2147483647, so you would have skipped into the negative numbers, and that may take a while.
Your function incorrectly reports that 2 is not a prime number. The line that currently reads return false after the even check should read return n == 2; (or whatever the proper syntax for that is in Java).
Running time for the algorithm is O(n), which will occur whenever n is prime. You could improve the run time to O(sqrt n) by changing the test in the for loop to i * i <= n.
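Putting those two fixes together (the special case for 2, and the sqrt bound), a corrected version might look like this sketch (class name is mine, for illustration):

```java
// A sketch of the corrected test: 2 is treated as prime, and trial
// division stops at sqrt(n) via the i * i <= n bound.
public class PrimeCheck {
    public static boolean isPrime(int n) {
        if (n < 2) return false;          // 0, 1, and negatives are not prime
        if (n % 2 == 0) return n == 2;    // 2 is the only even prime
        for (int i = 3; i * i <= n; i += 2) {
            if (n % i == 0) return false; // found an odd divisor
        }
        return true;
    }
}
```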
The snarky response is to say that it finishes in O(1), because n is bounded above by 2^31 - 1 in Java. Hence, I can find a single constant K such that the function is guaranteed to finish in time less than K.
However, this response does not address what is intended by your question, which is, how the time differs as a function of n within a "reasonable" range for n. As others have pointed out already, your running time will be O(n).
In terms of B, there's a potential bit of confusion, because Java integers are always represented with 32 bits. However, if we ask how many bits B are required at minimum for a given value of N, we can note that 2^(B-1) <= N < 2^B, so a calculation that requires O(N) operations necessarily requires O(2^B) operations in terms of B.
Also, as others have pointed out, there's an error condition for the value 2 - easily fixable. In addition, this calculation could be easily improved using Sieve methods. While you did not really ask about optimization, I provide the service free of charge. :) See: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes for an interesting approach to addressing this problem. And believe it or not, this is really just the beginning of optimizing this. There are many more things you can do from there.
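For reference, a minimal sieve along those lines might look like this (the class name and return type are my own choices, not from the linked article):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal Sieve of Eratosthenes: mark the multiples of each prime as
// composite, then collect the unmarked numbers.
public class Sieve {
    public static List<Integer> primesUpTo(int limit) {
        boolean[] composite = new boolean[limit + 1];
        for (int p = 2; (long) p * p <= limit; p++) {
            if (!composite[p]) {
                for (int m = p * p; m <= limit; m += p) {
                    composite[m] = true; // p is prime, so its multiples are not
                }
            }
        }
        List<Integer> primes = new ArrayList<>();
        for (int n = 2; n <= limit; n++) {
            if (!composite[n]) primes.add(n);
        }
        return primes;
    }
}
```

This finds all primes up to the limit in O(n log log n) time, which beats calling the O(√n) test once per number.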
The following code returns all the prime factors of the given number n.
Approach behind the algorithm:
Iterate over the numbers starting at i = 2, while i <= n/i (that is, while i <= √n), to collect the prime factors.
The inner loop reduces the number by dividing it by the current prime factor; if the same prime divides n more than once, it keeps on dividing.
The final if statement adds the last (and largest) remaining prime factor when n > 1, since n will have been reduced to that factor by that time.
static List<Integer> getAllPrimes(int n) {
    List<Integer> factors = new ArrayList<Integer>();
    for (int i = 2; i <= n / i; ++i) {
        while (n % i == 0) {
            factors.add(i);
            n /= i;
        }
    }
    if (n > 1) { factors.add(n); } // n > 1, not n > 2, so a remaining factor of 2 is kept (e.g. getAllPrimes(2))
    return factors;
}
How would the running time be determined for this algorithm? Each time the inner loop runs, it shrinks n by some factor (n/2, n/3, etc.) depending on the current prime divisor i.
When analyzing an algorithm like this, it's often helpful to clarify whether you're looking for a best-case, average-case, or worst-case analysis, since the answer might differ in each case.
Let's start with a worst-case analysis. What would have to happen to keep this algorithm running as long as possible? Well, if we never divide out any prime factors, then the outer loop will run as many times as possible. Specifically, it'll run Θ(√n) times. This only happens if the number in question is prime, and so we can say that the worst case occurs on prime number inputs, where the runtime is Θ(√n).
What about the best case? Well, this algorithm terminates either when i gets too large for n or when n gets too small for i. It's significantly faster to shrink n than to increase i, because n drops geometrically while i only increases arithmetically. An ideal input is therefore one whose value drops as fast as possible, which happens when it has only small factors (such inputs are called smooth numbers). The extreme case is a perfect power of two: the algorithm cuts n in half repeatedly until it drops to 1. That's the hallmark of logarithmic behavior, so in the best case the runtime is Θ(log n).
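To see both extremes concretely, here is an instrumented copy of the factorization loop that just counts its steps (the counter is my addition, purely for illustration):

```java
// Counts loop steps of the trial-division factorization: a prime input
// exercises only the outer loop (about sqrt(n) steps), while a power of
// two is divided down to 1 in about log2(n) steps.
public class FactorSteps {
    public static int countSteps(int n) {
        int steps = 0;
        for (int i = 2; i <= n / i; ++i) {
            steps++;             // one outer-loop iteration
            while (n % i == 0) {
                steps++;         // one division of n
                n /= i;
            }
        }
        return steps;
    }
}
```

For example, countSteps(1024) takes 11 steps (one outer test plus ten halvings), while the prime 997 forces 30 outer iterations with no divisions at all.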
I'm trying to brush up on my Big-O understanding for an upcoming test (only a very basic Big-O understanding is required, obviously) and was doing some practice problems in my book.
They gave me the following snippet
public static void swap(int[] a)
{
    int i = 0;
    int j = a.length - 1;
    while (i < j)
    {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
        i++;
        j--;
    }
}
Pretty easy to understand I think. It has two iterators each covering half the array with a fixed amount of work (which I think clocks them both at O(n/2))
Therefore O(n/2) + O(n/2) = O(2n/2) = O(n)
Now please forgive me, as this is my current understanding and that was my attempt at a solution to the problem. I have found many examples of big-O online, but none quite like this, where the two indices move and modify the array at essentially the same time.
The fact that it has one loop is making me think it's O(n) anyway.
Would anyone mind clearing this up for me?
Thanks
The fact that it has one loop is making me think it's O(n) anyway.
This is correct. Not because it is one loop, but because it is one loop whose iteration count depends on the size of the array by only a constant factor: big-O notation ignores any constant factor. O(n) means that the running time is determined by the size of the array alone. That it actually takes half that many iterations does not matter for big-O.
In other words: if your algorithm takes time n + X, X*n, or X*n + Y (for constants X and Y), it all comes down to O(n).
It gets different if the number of iterations depends on the size other than by a constant factor, for instance logarithmically: if size 100 gives 2 iterations, size 1000 gives 3, and size 10000 gives 4, then the algorithm would be O(log(n)).
It would also be different if the loop is independent of size. I.e., if you would always loop 100 times, regardless of loop size, your algorithm would be O(1) (i.e., operate in some constant time).
I was also wondering if the equation I came up with to get there was somewhere in the ballpark of being correct.
Yes. In fact, if your equation ends up being some form of n*C + Y, where C is some constant and Y is some other value, the result is O(n), regardless of whether C is greater than 1 or smaller than 1.
You are right about the loop: the loop determines the big-O. But the loop runs only for half the array.
So it's 2 + 6*(n/2).
If we make n very large, the other numbers are really small by comparison, so they won't matter.
So it's O(n).
Let's say you are running 2 separate loops: 2 + 6*(n/2) + 6*(n/2). In that case it will be O(n) again.
But if we run a nested loop, 2 + 6*(n*n), then it will be O(n^2).
Always remove the constants and do the math. You've got the idea.
As j - i decreases by 2 units on each iteration, N/2 iterations are taken (assuming N = length(a)).
Hence the running time is indeed O(N/2), and O(N/2) is strictly equivalent to O(N).
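A quick way to convince yourself of that count is to have the loop report its iterations (an illustrative variant of the question's code, not from the book):

```java
// Reverses the array in place and returns the number of loop
// iterations, which is floor(N/2).
public class SwapCount {
    public static int reverseAndCount(int[] a) {
        int i = 0, j = a.length - 1, iterations = 0;
        while (i < j) {
            int temp = a[i];
            a[i] = a[j];
            a[j] = temp;
            i++;
            j--;
            iterations++;
        }
        return iterations;
    }
}
```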
I'm having some trouble finding the big O for the if statement in the code below:
public static boolean areUnique (int[] ar)
{
    for (int i = 0; i < ar.length - 1; i++)     // O(n)
    {
        for (int j = i + 1; j < ar.length; j++) // O(n); note j < ar.length, so the last element is compared too
        {
            if (ar[i] == ar[j])                 // O(???)
                return false;                   // O(1)
        }
    }
    return true;                                // O(1)
}
I'm trying to do a time complexity analysis for the best, worst, and average case
Thank you everyone for answering so quickly! I'm not sure whether my best, worst, and average cases are correct... Shouldn't there be a difference between the cases because of the if statement? But when I do my analysis, they all end up as O(n^2):
Best: O(n) * O(n) * [O(1) + O(1)] = O(n^2)
Worst: O(n) * O(n) * [O(1) + O(1) + O(1)] = O(n^2)
Average: O(n) * O(n) * [O(1) + O(1) + O(1)] = O(n^2)
Am I doing this right? My textbook is not very helpful
For starters, this line
if (ar[i] == ar[j])
always takes time Θ(1) to execute. It does only a constant amount of work (a comparison plus a branch), so the work done here won't asymptotically contribute to the overall runtime.
Given this, we can analyze the worst-case behavior by considering what happens if this statement is always false. That means the loops run as long as possible. As you noticed, since each loop runs O(n) times, the total work done is Θ(n^2) in the worst case.
In the best case, however, the runtime is much lower. Imagine any array where the first two elements are the same. In that case, the function will terminate almost instantly when the conditional is encountered for the first time. In this case, the runtime is Θ(1), because a constant number of statements will be executed.
The average case, however, is not well-defined here. Average case is typically defined relative to some distribution (the average over what?), and it's not clear what that is here. If you assume that the array consists of truly random int values and that ints can take on any integer value (not a reasonable assumption, but it's fine for now), then the probability that a randomly-chosen array has a duplicate is 0 and we're back in the worst case (runtime Θ(n^2)). However, if the values are more constrained, the runtime changes. Let's suppose that there are n numbers in the array and the integers range from 0 to k - 1, inclusive. Given a random array, the runtime depends on
Whether there's any duplicates or not, and
If there is a duplicate, where the first duplicated value appears in the array.
I am fairly confident that this math is going to be very hard to work out and if I have the time later today I'll come back and try to get an exact value (or at least something asymptotically appropriate). I seriously doubt this is what was expected since this seems to be an introductory big-O assignment, but it's an interesting question and I'd like to look into it more.
Hope this helps!
The if itself is O(1).
This is because big-O does not count the individual operations inside the ALU or CPU: even if if (ar[i] == ar[j]) were really, say, O(6), that still translates into O(1).
You can regard it as O(1). No matter what you consider as 'one' step, the number of instructions for carrying out a[i] == a[j] doesn't depend on the value n in this case.
I am working on an assignment and can't work out the answers to some of the questions.
I have been asked:
Input: an array A of length N that can only contain integers from 1 to N
Output: TRUE - A contains duplicate, FALSE - otherwise.
I have created a class which is passing my test cases.
public class ArrayOfIntegers {
    public boolean isArrayContainsDuplicates(int[] intArray) {
        int arraySize = intArray.length;
        long expectedOutPut = (long) arraySize * (arraySize + 1) / 2; // cast so the product can't overflow an int
        long actualOutput = 0;
        for (int i = 0; i < arraySize; i++) {
            actualOutput = actualOutput + intArray[i];
        }
        if (expectedOutPut == actualOutput)
            return false;
        return true;
    }
}
Now further questions on this
Is it possible to provide the answer and NOT to destroy the input array A?
I have not destroyed the array, so is what I have done correct?
Analyze time and space complexity of your algorithm?
Do I need to write something about the for loop, i.e. that as soon as I find a duplicate element I should break out of the loop? Frankly speaking, I am not very clear on the concepts of time and space complexity.
Is O(n) for both time and space possible?
Should the answer be no, since n could be any number? Again, I am not very clear about O(n).
Thanks
Is it possible to provide the answer and NOT to destroy the input array A?
Yes. For example, if you don't care about the time it takes, you can loop over the array once for every possible number and check if you see it exactly once (if not, there must be a duplicate). That would be O(N^2).
Usually, you would use an additional array (or other data structure) as a scratch-list, though (which also does not destroy the input array, see the third question below).
Analyze time and space complexity of your algorithm?
Your algorithm runs in O(n), doing just a single pass over the input array, and requires no additional space. However, it does not work.
Is O(n) for both time and space possible?
Yes.
Have another array of the same size (size = N), count in there how often you see every number (single pass over input), then check the counts (single pass over output, or short-circuit when you have an error).
Do I need to write something about the for loop, i.e. that as soon as I find a duplicate element I should break out of the loop?
No. Complexity considerations are always about the worst case (or sometimes the average case). In the worst case, you cannot break out of the loop. In the average case, you can break out after half the loop. Either way, while being important for someone waiting on a real-life implementation to finish the calculation, this does not make a difference for scalability (complexity as N grows infinite). Constant offsets and multipliers (such as 50% for breaking out early) are not considered.
public boolean hasDuplicates(int[] arr) {
    for (int i = 1; i <= arr.length; i++) {
        boolean found = false;   // reset for each value we search for
        for (int a : arr)
            if (a == i) found = true;
        if (!found) return true; // a value in 1..N is missing, so another must repeat
    }
    return false;
}
I believe this method would work (as yours currently doesn't). It's O(n^2).
I'm quite sure that it is impossible to attain O(n) for both time and space since two nested for-loops would be required, thereby increasing the method's complexity.
Edit
I was wrong (sometimes it's good to admit it), this is O(n):
public boolean hasDuplicates(int[] arr) {
    int[] newarr = new int[arr.length];
    for (int a : arr) newarr[a - 1]++;            // count occurrences of each value 1..N
    for (int a : newarr) if (a != 1) return true; // any count other than 1 implies a duplicate
    return false;
}
Yes, the input array is not destroyed.
The method directly above is O(n) (by that I mean its runtime and space requirements would grow linearly with the argument array length).
Yes, see above.
As hints:
Yes, it is possible to provide an answer and not destroy the array. Your code* provides an example.
Time complexity can be viewed as, "how many meaningful operations does this algorithm do?" Since your loop goes from 0 to N, at minimum, you are doing O(N) work.
Space complexity can be viewed as, "how much space am I using during this algorithm?" You don't make any extra arrays, so your extra space is O(1); counting the input array itself, it is O(N).
You should really revisit how your algorithm is comparing the numbers for duplicates. But I leave that as an exercise to you.
*: Your code also does not find all of the duplicates in an array. You may want to revisit that.
It's possible by adding all of the elements to a HashSet, which is O(n), then comparing the number of values in the HashSet to the size of the array, which is O(1). If they aren't equal, then there are duplicates.
Creating the HashSet will also, on average, take up less space than creating an integer array to count each element. It's also an improvement from 2n to n operations, although that has no effect on big-O.
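A sketch of that idea (with an early exit the moment a repeat is seen, which doesn't change the O(n) bound; the class name is mine):

```java
import java.util.HashSet;
import java.util.Set;

// Duplicate detection via a HashSet: add() returns false when the value
// is already present, so we can stop at the first repeat.
public class HashSetDup {
    public static boolean hasDuplicates(int[] arr) {
        Set<Integer> seen = new HashSet<>();
        for (int a : arr) {
            if (!seen.add(a)) return true; // a was seen before
        }
        return false;
    }
}
```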
1) This will not require much effort, and leaves the array intact:
public boolean isArrayContainsDuplicates(int[] intArray) {
    int expectedOutPut = (intArray.length * (intArray.length + 1)) / 2;
    int actualOutput = 0;
    for (int i = 0; i < intArray.length; i++) {
        if (intArray[i] > intArray.length) return true;
        actualOutput += intArray[i];
    }
    return expectedOutPut != actualOutput;
}
2) This will require touching a varying number of elements in the array. In the best case it returns on the first element, which is O(1); in the worst case it goes all the way through and returns at the end, which is O(n). On average it still inspects a constant fraction of the elements, which is also O(n), since big-O discards constant factors.
O(1) refers to a number of operations that does not depend on the total number of items. Here, returning on the first element is that case.
O(log n) means the number of operations grows with the logarithm of n, as in algorithms that discard half of the remaining input at each step (binary search, for example); that is not what happens here.
O(n) is when the number of operations is proportional to the number of items.
These are all big-O notations for the required time.
This algorithm's extra memory does not grow with n at all (a couple of int variables), so its space complexity is O(1).
3) Yes, this is possible: the method above runs in O(n) time even in the worst case and needs only O(1) extra space.
You are given a list of n numbers L = <a_1, a_2, ..., a_n>. Each of them is
either 0 or of the form +/- 2^k, 0 <= k <= 30. Describe and implement an
algorithm that returns the largest product of a CONTINUOUS SUBLIST
p = a_i * a_(i+1) * ... * a_j, 1 <= i <= j <= n.
For example, for the input <8 0 -4 -2 0 1> it should return 8 (either 8
or (-4)*(-2)).
You can use any standard programming language and can assume that
the list is given in any standard data structure, e.g. int[],
vector<int>, List<Integer>, etc.
What is the computational complexity of your algorithm?
In my first answer I addressed the OP's problem in "multiplying two big big numbers". As it turns out, this wish is only a small part of a much bigger problem which I'm going to address now:
"I still haven't arrived at the final skeleton of my algorithm I wonder if you could help me with this."
(See the question for the problem description)
All I'm going to do is explain the approach Amnon proposed in little more detail, so all the credit should go to him.
You have to find the largest product of a continuous sublist from a list of integers which are powers of 2. The idea is to:
Compute the product of every continuous sublist.
Return the biggest of all these products.
You can represent a sublist by its start and end index. For start = 0 there are n possible values for end, namely 0..n-1. This generates all sublists that start at index 0. In the next iteration, you increment start by 1 and repeat the process (this time, there are n-1 possible values for end). This way you generate all possible sublists.
Now, for each of these sublists, you have to compute the product of its elements, i.e. come up with a method computeProduct(List wholeList, int startIndex, int endIndex). You can either use the built-in BigInteger class (which should be able to handle the input provided by your assignment) to save yourself further trouble, or try to implement a more efficient way of multiplication as described by others. (I would start with the simpler approach, since it's easier to check that your algorithm works correctly, and only then try to optimize it.)
Now that you're able to iterate over all sublists and compute the product of their elements, determining the sublist with the maximum product should be the easiest part.
If it's still too hard for you to make the connection between the two steps, let us know, but please also provide us with a draft of your code as you work on the problem, so that we don't end up incrementally constructing the solution with you copy&pasting it.
edit: Algorithm skeleton
public BigInteger listingSublist(BigInteger[] biArray)
{
    int start = 0;
    int end = biArray.length - 1;
    BigInteger maximum = null; // must be initialized, or the skeleton won't compile
    for (int i = start; i <= end; i++)
    {
        for (int j = i; j <= end; j++)
        {
            // insert logic to determine the maximum product.
            computeProduct(biArray, i, j);
        }
    }
    return maximum;
}

public BigInteger computeProduct(BigInteger[] wholeList, int startIndex,
                                 int endIndex)
{
    // insert logic here to return
    // wholeList[startIndex].multiply(wholeList[startIndex + 1])...multiply(wholeList[endIndex]);
    return null; // placeholder
}
Since k <= 30, any integer i = 2^k will fit into a Java int. However, the product of two such integers might not fit into a Java int, since 2^k * 2^k = 2^(2k) <= 2^60, which does fit into a Java long. This should answer your question regarding the multiplication of two numbers.
In case you might want to multiply more than two numbers, which is implied by your assignment saying "...largest product of a CONTINUOUS SUBLIST..." (a sublist's length could be > 2), have a look at Java's BigInteger class.
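The size limits above can be checked directly (a small demonstration I added; the class name is illustrative):

```java
// 2^30 fits in an int, but the product of two such values needs a long:
// 2^30 * 2^30 = 2^60, which is below the long limit of 2^63 - 1.
public class RangeDemo {
    public static long squareOfMaxPower() {
        int a = 1 << 30;     // 2^30, the largest power allowed (k <= 30)
        return (long) a * a; // widen to long before multiplying to avoid overflow
    }
}
```

Note the cast on the first operand: `a * a` in plain int arithmetic would overflow before the assignment.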
Actually, the most efficient way of multiplying here is doing addition instead. In this special case all you have is numbers that are powers of two, so you can get the product of a sublist by simply adding the exponents together (while counting the negative numbers in your product, and making the result negative in case of an odd count of negatives).
Of course, to store the result you may need the BigInteger, if you run out of bits. Or depending on how the output should look like, just say (+/-)2^N, where N is the sum of the exponents.
Parsing the input could be a matter of switch-case, since you only have 30 numbers to take care of. Plus the negatives.
That's the boring part. The interesting part is how you find the sublist that produces the largest number. You can take the dumb approach, checking every single variation, but that would be an O(N^2) algorithm in the worst case (IIRC), which is really not very good for longer inputs.
What can you do? I'd probably start from the largest non-negative number in the list as a sublist, and grow the sublist to take in as many non-negative numbers in each direction as I can. Then, having all the positives in reach, proceed with pairs of negatives on both sides, e.g. only grow if you can grow on both sides of the list. If you cannot grow in both directions, try one direction with two (four, six, etc., so an even number of) consecutive negative numbers. If you cannot grow even in this way, stop.
Well, I don't know whether this algorithm even works, but if it (or something similar) does, it's an O(N) algorithm, which means great performance. Let's try it out! :-)
Hmmm... since they're all powers of 2, you can just add the exponents instead of multiplying the numbers (equivalent to taking the logarithm of the product). For example, 2^3 * 2^7 = 2^(3+7) = 2^10.
I'll leave handling the sign as an exercise to the reader.
Regarding the sublist problem, there are less than n^2 pairs of (begin,end) indices. You can check them all, or try a dynamic programming solution.
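In code, the exponent trick for two factors could look like this sketch (helper name is mine; it assumes each input is a positive power of two, with signs and zeros handled separately by the caller):

```java
// Multiplying powers of two by adding exponents: 2^a * 2^b = 2^(a+b).
// For a positive power of two, numberOfTrailingZeros gives log2 exactly.
public class ExpAdd {
    public static int productExponent(int x, int y) {
        return Integer.numberOfTrailingZeros(x) + Integer.numberOfTrailingZeros(y);
    }
}
```

For example, productExponent(8, 128) returns 10, matching 2^3 * 2^7 = 2^10.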
EDIT: I adjusted the algorithm outline to match the actual pseudo code and put the complexity analysis directly into the answer:
Outline of algorithm
Go sequentially over the sequence and store the value and first/last index of the (positive) product accumulated since the last 0. Do the same for another (negative) product, which only consists of the numbers since the first sign change in the sequence. If you hit a negative sequence element, swap the two products (positive and negative) along with their associated starting indices. Whenever the positive product hits a new maximum, store it together with the associated start and end indices. After going over the whole sequence, the result is stored in the maximum variables.
To avoid overflow calculate in binary logarithms and an additional sign.
Pseudo code
maxProduct = 0
maxProductStartIndex = -1
maxProductEndIndex = -1
sequence.push_front( 0 )   // reuses variable initialization of the case n == 0

for every index of sequence
    n = sequence[index]
    if n == 0
        posProduct = 0
        negProduct = 0
        posProductStartIndex = index + 1
        negProductStartIndex = -1
    else
        if n < 0
            swap( posProduct, negProduct )
            swap( posProductStartIndex, negProductStartIndex )
            if -1 == posProductStartIndex   // start second sequence on sign change
                posProductStartIndex = index
            end if
            n = -n
        end if
        logN = log2(n)   // as indicated, all arithmetic is done on the logarithms
        posProduct += logN
        if -1 < negProductStartIndex   // start the second product as soon as the sign changes first
            negProduct += logN
        end if
        if maxProduct < posProduct   // update current best solution
            maxProduct = posProduct
            maxProductStartIndex = posProductStartIndex
            maxProductEndIndex = index
        end if
    end if
end for

// output solution
print "The maximum product is " 2^maxProduct "."
print "It is reached by multiplying the numbers from sequence index "
print maxProductStartIndex " to sequence index " maxProductEndIndex
Complexity
The algorithm uses a single loop over the sequence, so it's O(n) times the complexity of the loop body. The most complicated operation in the body is log2, so it's O(n) times the complexity of log2. The log2 of a number of bounded size is O(1), so the resulting complexity is O(n), i.e. linear.
I'd like to combine Amnon's observation about multiplying powers of 2 with one of mine concerning sublists.
Lists are terminated hard by 0's. We can break the problem down into finding the biggest product in each sub-list, and then the maximum of that. (Others have mentioned this).
This is my 3rd revision of this writeup. But 3's the charm...
Approach
Given a list of non-0 numbers (this is what took a lot of thinking), there are two cases:
The list contains an even number of negative numbers (possibly 0). This is the trivial case: the optimal result is the product of all numbers, guaranteed to be positive.
The list contains an odd number of negative numbers, so the product of all numbers would be negative. To change the sign, it becomes necessary to sacrifice a subsequence containing a negative number. Two sub-cases:
a. sacrifice numbers from the left up to and including the leftmost negative; or
b. sacrifice numbers from the right up to and including the rightmost negative.
In either case, return the product of the remaining numbers. Having sacrificed exactly one negative number, the result is certain to be positive. Pick the winner of (a) and (b).
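That case analysis for one zero-free block can be sketched in log2 space as follows (all names are illustrative; it assumes the caller splits the list at zeros and accepts an empty product, i.e. exponent 0, when nothing better remains):

```java
// For a zero-free block of values of the form +/- 2^k, returns the
// largest achievable exponent of a positive product, following the two
// cases above: take everything if the negative count is even, otherwise
// cut through either the leftmost or the rightmost negative.
public class BlockMax {
    public static long bestExponent(int[] exps, boolean[] negative) {
        long total = 0;
        int negCount = 0, firstNeg = -1, lastNeg = -1;
        for (int i = 0; i < exps.length; i++) {
            total += exps[i];
            if (negative[i]) {
                negCount++;
                if (firstNeg < 0) firstNeg = i;
                lastNeg = i;
            }
        }
        if (negCount % 2 == 0) return total; // even negatives: whole block wins
        long dropLeft = 0;                   // case (a): drop prefix through first negative
        for (int i = 0; i <= firstNeg; i++) dropLeft += exps[i];
        long dropRight = 0;                  // case (b): drop suffix from last negative
        for (int i = lastNeg; i < exps.length; i++) dropRight += exps[i];
        return Math.max(total - dropLeft, total - dropRight);
    }
}
```

For the example block <8, -4, -2> (exponents 3, 2, 1, two negatives), the whole block is taken and the result is exponent 6, i.e. 64; for <-4, 2, 8> the leftmost negative is sacrificed, leaving 2 * 8 = 2^4.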
Implementation
The input needs to be split into subsequences delimited by 0. The list can be processed in place if a driver method is built to loop through it and pick out the beginnings and ends of non-0 sequences.
Doing the math in longs would only push the overflow boundary out, not remove it. Converting to log2 makes arithmetic with large products easy and prevents program failure on long sequences of large numbers. It would alternatively be possible to do all the math in Bignums, but that would probably perform poorly.
Finally, the end result, still a log2 number, needs to be converted into printable form. Bignum comes in handy there: new BigInteger("2").pow(log) will raise 2 to the power of log.
Complexity
This algorithm works sequentially through the sub-lists, only processing each one once. Within each sub-list, there's the annoying work of converting the input to log2 and the result back, but the effort is linear in the size of the list. In the worst case, the sum of much of the list is computed twice, but that's also linear complexity.
See this code, where I implement the exact factorial of a very large number, using a plain integer array to represent big numbers. Download the code from Planet Source Code.