I was given a programming problem: given an array, determine whether it is the post-order traversal of a binary search tree. My solution is as follows:
public static boolean isPostOrder(int[] array) {
    int root = array[array.length - 1];
    int i = 0;
    while (array[i] < root) {
        i++;
    }
    while (array[i] > root) {
        i++;
    }
    return i == array.length - 1;
}
I am trying to understand Big O. I have read this tutorial:
What is a plain English explanation of "Big O" notation?
However, I am still confused about the addition and the while loops. I'm assuming my while loops are O(1), since each step just compares a value in an array to an integer, or am I wrong about this?
Now, the addition is also O(1), because we are just adding 1 to an integer, right?
Therefore, is this an O(1) solution or am I missing something?
Your algorithm's runtime has a linear relationship to the input size, because you loop through the elements of the input array. That makes your algorithm O(n). If you had no loops at all, it would be O(1), i.e., constant time.
The basic idea behind Big-O notation is to estimate the number of basic operations your algorithm performs, and how the size of the input affects that number.
Here, your basic operation is incrementing i (i++). You're iterating over an array of size N, and in the worst case you go over all of it. So, this algorithm has O(N) complexity.
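If you want to see that count concretely, here is an instrumented sketch (the countIncrements name and the counter are my own additions for illustration, assuming the same well-formed input as the original method):
public static int countIncrements(int[] array) {
    int root = array[array.length - 1];
    int count = 0; // tallies the basic operation, i++
    int i = 0;
    while (array[i] < root) {
        i++;
        count++;
    }
    while (array[i] > root) {
        i++;
        count++;
    }
    return count; // never more than array.length - 1: linear in N
}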
I have a function that I wrote below. This function is essentially a merge-sort.
public static double[] nlgn(double[] nums) {
    if (nums.length > 1) {
        // Split nums into two halves.
        int elementsInA1 = nums.length / 2;
        int elementsInA2 = nums.length - elementsInA1;
        double[] arr1 = new double[elementsInA1];
        double[] arr2 = new double[elementsInA2];
        for (int i = 0; i < elementsInA1; i++)
            arr1[i] = nums[i];
        for (int i = elementsInA1; i < elementsInA1 + elementsInA2; i++)
            arr2[i - elementsInA1] = nums[i];
        // Sort each half recursively.
        nlgn(arr1);
        nlgn(arr2);
        // Merge the two sorted halves back into nums.
        int i = 0, j = 0, k = 0;
        while (arr1.length != j && arr2.length != k) {
            if (arr1[j] <= arr2[k]) {
                nums[i] = arr1[j];
                i++;
                j++;
            } else {
                nums[i] = arr2[k];
                i++;
                k++;
            }
        }
        while (arr1.length != j) {
            nums[i] = arr1[j];
            i++;
            j++;
        }
        while (arr2.length != k) {
            nums[i] = arr2[k];
            i++;
            k++;
        }
    }
    return nums;
}
Since this is a merge sort, I know from my research that the big-O complexity of this algorithm is O(n lg n). However, when I run my timing tests, the results do not suggest that it is running in O(n lg n) time. It seems like it should be, though: up until the end of the two for loops at the beginning, it runs in O(n) time, and once past that, it should run in O(lg n) time as it sorts each element.
My question is: can somebody confirm that this piece of code is running in O(n lg n) time? If not, I would like to know where I am going wrong in my understanding.
O(n log n) is an asymptotically tight bound. That means the running time only approaches the bound when n is large enough; when n is small, function call overhead and many other factors dominate, and the bound is not a good predictor.
You can make n larger and compare the ratios between runs to see whether the growth is close to O(n log n), although it may take quite a large n before the ratio settles.
Since this is a merge sort, I know from my research that the big-O complexity of this algorithm is O(n lg n). [...] My question is: can somebody confirm that this piece of code is running in O(n lg n) time?
No need to show it: merge sort has already been proven to run in O(n lg n) time. But if you'd like to observe it, you'll need to experiment with increasingly large values for your inputs. You might want to update your post with your input sizes and timing results.
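As a sketch of such an experiment (the class name SortTiming, the random seed, and the trial sizes are arbitrary choices of mine; swap your own nlgn method in for Arrays.sort):
import java.util.Arrays;
import java.util.Random;

public class SortTiming {
    public static void main(String[] args) {
        Random rnd = new Random(42);
        // Double n each round. If the sort is O(n lg n), the printed
        // ratio time / (n lg n) should settle near a constant for large n.
        for (int n = 1 << 14; n <= 1 << 22; n <<= 1) {
            double[] nums = new double[n];
            for (int i = 0; i < n; i++) nums[i] = rnd.nextDouble();
            long start = System.nanoTime();
            Arrays.sort(nums); // substitute your nlgn(nums) here
            long elapsed = System.nanoTime() - start;
            double ratio = elapsed / (n * (Math.log(n) / Math.log(2)));
            System.out.printf("n=%8d  time=%12d ns  time/(n lg n)=%8.2f%n",
                    n, elapsed, ratio);
        }
    }
}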
However, when I run my timing tests, the results do not suggest that it is running in O(n lg n) time. [...] If not, I would like to know where I am going wrong in my understanding.
I think you may be misunderstanding what Big-Oh notation actually tries to tell you. Big-O gives you an approximation of the asymptotic upper bound of the algorithm as the inputs become large enough. (How "large" is "large enough" will vary from algorithm to algorithm and would need to be found by experimentation. The point is that this value does exist and we represent it more abstractly.)
In other words, Big-O tells you what the worst-case performance of the algorithm could be as N becomes very large. Since this is the worst-case scenario, it also means the algorithm could perform better under some circumstances, but we don't generally care about those. (Look into Big-Omega and Big-Theta if you're interested.) For example, on a small-enough list, insertion sort can run faster than merge-sort or quick-sort, and this is often used as an optimization.
It's also an approximation, because constants and lower-order terms are not shown as part of the notation. For example, some hypothetical algorithm with a time complexity of 500n^2 + 15n + 9000 is going to be written as O(n^2).
Some reasons for dropping the lower terms include:
Relative size: as n tends towards positive infinity, the larger n^2 term dominates; the lower terms contribute less and less to the overall cost in comparison to the dominating term, like adding a few drops or buckets of water to a lake;
Convenience: reading and understanding O(n^2) is easier than reading a long and more complicated polynomial, for no real benefit.
I'm trying to brush up on my Big-O understanding for an upcoming test (only a very basic understanding of Big-O is required, obviously) and was doing some practice problems in my book.
It gave me the following snippet:
public static void swap(int[] a)
{
    int i = 0;
    int j = a.length - 1;
    while (i < j)
    {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
        i++;
        j--;
    }
}
Pretty easy to understand, I think. It has two iterators, each covering half the array with a fixed amount of work per step (which I think clocks them both at O(n/2)).
Therefore O(n/2) + O(n/2) = O(2n/2) = O(n)
Now please forgive me, as this is my current understanding and that was my attempt at the solution to the problem. I have found many examples of big-O online, but none quite like this, where the iterators both move and modify the array at basically the same time.
The fact that it has one loop is making me think it's O(n) anyway.
Would anyone mind clearing this up for me?
Thanks
The fact that it has one loop is making me think it's O(n) anyway.
This is correct. Not because it is one loop, but because the number of iterations of that loop depends on the size of the array only up to a constant factor, and big-O notation ignores constant factors. O(n) means that the only influence on the running time is the size of the array. That it actually takes half that time does not matter for big-O.
In other words: if your algorithm takes time n + X, X*n, or X*n + Y (where X and Y are constants), it all comes down to O(n).
It is different if the number of iterations depends on the size by something other than a constant factor, such as a logarithmic or exponential function of n: for instance, if a size of 100 needs 2 iterations, a size of 1,000 needs 3, and a size of 10,000 needs 4, it would be, for instance, O(log(n)).
It would also be different if the loop were independent of the size. I.e., if you always looped 100 times, regardless of the array size, your algorithm would be O(1) (i.e., it would operate in constant time).
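For contrast, here is a small sketch (countDoublings is an illustrative name, not from the question) of a loop whose iteration count is neither constant nor a constant fraction of n:
public static int countDoublings(int n) {
    int steps = 0;
    // i doubles each pass, so the body runs about log2(n) times: O(log n).
    for (int i = 1; i < n; i *= 2) {
        steps++;
    }
    return steps; // e.g. n = 1024 gives 10 steps
}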
I was also wondering if the equation I came up with to get there was somewhere in the ballpark of being correct.
Yes. In fact, if your equation ends up being some form of n * C + Y, where C is some constant and Y is some other value, the result is O(n), regardless of whether C is greater than 1 or smaller than 1.
You are right about the loop: the loop determines the Big O. But the loop runs only for half the array.
So the cost is something like 2 + 6 * (n/2).
If we make n very large, the other numbers are really small, so they won't matter.
So it's O(n).
Let's say you are running 2 separate loops: 2 + 6 * (n/2) + 6 * (n/2). In that case it will be O(n) again.
But if we run a nested loop, 2 + 6 * (n * n), then it will be O(n^2).
Always remove the constants and do the math. You've got the idea.
Since j - i decreases by 2 units on each iteration, N/2 iterations are performed (assuming N = length(a)).
Hence the running time is indeed O(N/2), and O(N/2) is strictly equivalent to O(N).
I'm having some trouble finding the big O for the if statement in the code below:
public static boolean areUnique(int[] ar)
{
    for (int i = 0; i < ar.length - 1; i++)      // O(n)
    {
        for (int j = i + 1; j < ar.length; j++)  // O(n)
        {
            if (ar[i] == ar[j])                  // O(???)
                return false;                    // O(1)
        }
    }
    return true;                                 // O(1)
}
I'm trying to do a time complexity analysis for the best, worst, and average case
Thank you everyone for answering so quickly! I'm not sure whether my best, worst, and average cases are correct... Shouldn't there be a difference between the cases because of the if statement? Yet when I do my analysis, they all end up as O(n^2):
Best: O(n) * O(n) * [O(1) + O(1)] = O(n^2)
Worst: O(n) * O(n) * [O(1) + O(1) + O(1)] = O(n^2)
Average: O(n) * O(n) * [O(1) + O(1) + O(1)] = O(n^2)
Am I doing this right? My textbook is not very helpful.
For starters, this line
if (ar[i] == ar[j])
always takes time Θ(1) to execute. It does only a constant amount of work (a comparison plus a branch), so the work done here won't asymptotically contribute to the overall runtime.
Given this, we can analyze the worst-case behavior by considering what happens if this statement is always false. That means that the loop runs as long as possible. As you noticed, since each loop runs O(n) times, the total work done is Θ(n^2) in the worst case.
In the best case, however, the runtime is much lower. Imagine any array where the first two elements are the same. In that case, the function will terminate almost instantly when the conditional is encountered for the first time. In this case, the runtime is Θ(1), because a constant number of statements will be executed.
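For instance, with the code above, areUnique(new int[] {7, 7, 3, 1}) returns false on the very first comparison, after a constant amount of work.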
The average case, however, is not well-defined here. Average case is typically defined relative to some distribution (the average over what?) and it's not clear what that is here. If you assume that the array consists of truly random int values and that ints can take on any integer value (not a reasonable assumption, but it's fine for now), then the probability that a randomly-chosen array has a duplicate is 0 and we're back in the worst case (runtime Θ(n^2)). However, if the values are more constrained, the runtime changes. Let's suppose that there are n numbers in the array and the integers range from 0 to k - 1, inclusive. Given a random array, the runtime depends on
Whether there are any duplicates or not, and
If there is a duplicate, where the first duplicated value appears in the array.
I am fairly confident that this math is going to be very hard to work out, and if I have time later today I'll come back and try to get an exact value (or at least something asymptotically appropriate). I seriously doubt this is what was expected, since this seems to be an introductory big-O assignment, but it's an interesting question and I'd like to look into it more.
Hope this helps!
The if itself is O(1).
Big-O does not account for the micro-operations inside the ALU or CPU; even if if (ar[i] == ar[j]) were "really" O(6) at the instruction level, a constant like 6 still collapses to O(1).
You can regard it as O(1). No matter what you consider as "one" step, the number of instructions needed to carry out ar[i] == ar[j] doesn't depend on n in this case.
I really can't figure out what "Big-O" is or how to use it in practice, so I hope someone could give me a simple explanation, and maybe a little programming example in Java.
I have the following questions:
What do these terms mean (as simply as possible), and how do they show up in Java:
O(1), O(n), O(n^2) and O(log(n))?
How do you calculate the Big-O of existing Java code?
How do you use Big-O with sorting?
How do you use Big-O with recursion?
Hope someone will be able to help.
Thank you in advance.
Big O is used to give an idea of how fast an algorithm will scale as the input size increases.
O(1) means that as the input size increases, the running time will not change.
O(n) means that as the input size doubles, the running time will double, more or less.
O(n^2) means that as the input size doubles, the running time will quadruple, more or less.
O(f(n)) means, more generally, that the running time grows like f(n): doubling the input size takes the running time to around f(2n).
Regarding Big-O, sorting, and recursion:
Big-O sorting isn't really an algorithm; rather, you can use Big-O to tell how fast your sorting algorithm is.
For computing the Big-O of a recursive function, I would recommend using the Master Theorem.
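For example, merge sort makes two recursive calls on halves of the input plus a linear-time merge, giving the recurrence T(n) = 2T(n/2) + O(n), which the Master Theorem resolves to O(n log n).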
Guidelines for determining Big O:
Usually, you need to determine what your input size is (like the array length, or the number of nodes in your linked list, etc.)
Then ask yourself, what happens if your input size doubles? Or triples?
If you have a for loop that goes through each element:
// say array a has n elements
for (int i = 0; i < n; i++) {
    // do stuff
    a[i] = 3;
}
Then doubling n would make the loop run twice as long, so it would be O(n). Tripling n would triple the time it takes, so the code scales linearly with the input size.
If you have a 2D array and nested for loops:
// we say that each dimension of the array is upper bounded by n
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        // do stuff
        a[i][j] = 7;
    }
}
As n doubles, the code will take 2^2 = 4 times as long. If input size triples, code will take 3^2 = 9 times as long. So Big O is O(n^2).
Big-O notation is a way of describing a computer algorithm's performance when the input is very large.
Three quick programming examples, in Java:
O(1):
for (int i = 0; i < 3; i++) {
    System.out.println(i);
}
O(n):
int n = 1000000; /* Some arbitrary large number */
for (int i = 0; i < n; i++) {
    System.out.println(i);
}
O(n^2):
int n = 1000000; /* Some arbitrary large number */
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        System.out.println(i * j);
    }
}
Read more: http://en.wikipedia.org/wiki/Big_O_notation
Big O (it is the capital letter O, as opposed to the lowercase letter o) gives an idea of how an algorithm scales when the input size (n) changes.
If, for instance, n doubles from say 100 to 200 items:
an O(1) algorithm will take approximately the same time.
an O(n) algorithm will take double the time.
an O(n^2) algorithm will take four times the time (2^2)
an O(n^3) algorithm will take eight times the time (2^3)
And so on.
Note that log(n) can be understood as "proportional to the number of digits in n". This means that if you go from an n with two digits (like 99) to an n with double the number of digits (four digits, like 9999), the running time only doubles. This typically happens when you split the data into two piles, solve each separately, and merge the solutions back together, as in sorting.
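As an illustration (this sketch is mine, not part of the answer above), binary search is the classic case: each comparison discards half of the remaining range, so at most about log2(n) iterations run.
public static int binarySearch(int[] sorted, int target) {
    int lo = 0, hi = sorted.length - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2; // avoids the int overflow of (lo + hi) / 2
        if (sorted[mid] == target) return mid;
        if (sorted[mid] < target) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1; // not present
}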
Typically, each loop over the input data multiplies the cost by n, so a single loop is O(n), but if you need to compare every element to every other element you get O(n^2), and so on. Note that the costs are relative, so a slow O(n) algorithm may be outperformed by a fast O(n^2) algorithm for small values of n.
Also note that O describes the worst case. So quicksort, which generally runs fast, is still O(n^2), because there is pathological input data that causes it to compare every element to every other.
This is interesting because most algorithms are fast for small data sets, but you need to know how they behave on input data perhaps thousands or millions of times larger, where it matters whether you have O(n^2) or O(n^3) or worse. The numbers are relative, so they do not say anything about whether a given algorithm is slow or fast, just how the worst case grows when you double the input size.
I am working on an assignment and can't figure out the answers to some of the questions.
I have been asked:
Input: an array A of length N that can only contain integers from 1 to N
Output: TRUE if A contains a duplicate, FALSE otherwise.
I have created a class which passes my test cases.
public class ArrayOfIntegers {
    public boolean isArrayContainsDuplicates(int[] intArray) {
        int arraySize = intArray.length;
        long expectedOutPut = arraySize * (arraySize + 1) / 2;
        long actualOutput = 0;
        for (int i = 0; i < arraySize; i++) {
            actualOutput = actualOutput + intArray[i];
        }
        if (expectedOutPut == actualOutput)
            return false;
        return true;
    }
}
Now, further questions on this:
Is it possible to provide the answer and NOT to destroy the input array A?
I have not destroyed the array, so is what I have done correct?
Analyze time and space complexity of your algorithm?
Do I need to write something about the for loop, e.g. that as soon as I find a duplicate element I should break out of the loop? Frankly speaking, I am not very clear on the concepts of time and space complexity.
Is O(n) for both time and space possible?
Should the answer be no, since n could be any number? Again, I am not very clear about O(n).
Thanks
Is it possible to provide the answer and NOT to destroy the input array A?
Yes. For example, if you don't care about the time it takes, you can loop over the array once for every possible number and check if you see it exactly once (if not, there must be a duplicate). That would be O(N^2).
Usually, you would use an additional array (or other data structure) as a scratch-list, though (which also does not destroy the input array, see the third question below).
Analyze time and space complexity of your algorithm?
Your algorithm runs in O(n), doing just a single pass over the input array, and requires no additional space. However, it does not work.
Is O(n) for both time and space possible?
Yes.
Have another array of the same size (size = N), count in it how often you see each number (a single pass over the input), then check the counts (a single pass over the count array, or short-circuit as soon as a count exceeds one).
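A minimal sketch of that counting idea (the method name containsDuplicates is mine), assuming the stated precondition that all values lie in 1..N:
public static boolean containsDuplicates(int[] a) {
    int[] counts = new int[a.length + 1]; // index 0 unused; values are 1..N
    for (int value : a) {
        counts[value]++;                    // single pass over the input
        if (counts[value] > 1) return true; // short-circuit on a repeat
    }
    return false; // O(n) time, O(n) extra space, input left untouched
}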
Do I need to write something about the for loop, e.g. that as soon as I find a duplicate element I should break out of the loop?
No. Complexity considerations are always about the worst case (or sometimes the average case). In the worst case, you cannot break out of the loop. In the average case, you can break out after half the loop. Either way, while being important for someone waiting on a real-life implementation to finish the calculation, this does not make a difference for scalability (complexity as N grows infinite). Constant offsets and multipliers (such as 50% for breaking out early) are not considered.
public boolean hasDuplicates(int[] arr) {
    // For each value i in 1..N, scan the array for it; since the array
    // has exactly N slots, a missing value implies a duplicate.
    for (int i = 1; i <= arr.length; i++) {
        boolean found = false;
        for (int a : arr)
            if (a == i) found = true;
        if (!found) return true;
    }
    return false;
}
I believe this method would work (as yours currently doesn't). It's O(n^2).
I'm quite sure that it is impossible to attain O(n) for both time and space since two nested for-loops would be required, thereby increasing the method's complexity.
Edit
I was wrong (sometimes it's good to admit it); this is O(n):
public boolean hasDuplicates(int[] arr) {
    int[] newarr = new int[arr.length];
    // Count how often each value 1..N occurs (value v maps to index v - 1).
    for (int a : arr) newarr[a - 1]++;
    // Any count other than exactly 1 means some value is duplicated.
    for (int a : newarr) if (a != 1) return true;
    return false;
}
Yes, the input array is not destroyed.
The method directly above is O(n) (by that I mean its runtime and space requirements would grow linearly with the argument array length).
Yes, see above.
As hints:
Yes, it is possible to provide an answer and not destroy the array. Your code* provides an example.
Time complexity can be viewed as: "how many meaningful operations does this algorithm do?" Since your loop goes from 0 to N, you are doing at least O(N) work.
Space complexity can be viewed as: "how much extra space am I using during this algorithm?" You don't make any extra arrays, so your space complexity is on the order of O(1).
You should really revisit how your algorithm is comparing the numbers for duplicates. But I leave that as an exercise to you.
*: Your code also does not find all of the duplicates in an array. You may want to revisit that.
It's possible by adding all of the elements to a HashSet, which is O(n), and then comparing the number of values in the HashSet to the size of the array, which is O(1). If they aren't equal, then there are duplicates.
Creating the HashSet will also, on average, take up less space than creating an integer array to count each element. It's also an improvement from 2n to n operations, although that has no effect on big-O.
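A sketch of that approach (the method name hasDuplicatesViaSet is mine; it builds the whole set first, as described above, rather than short-circuiting):
import java.util.HashSet;
import java.util.Set;

public static boolean hasDuplicatesViaSet(int[] a) {
    Set<Integer> seen = new HashSet<>();
    for (int value : a) {
        seen.add(value); // O(1) expected per insertion
    }
    return seen.size() != a.length; // a smaller set means some value repeated
}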
1) This will not require much effort, and leaves the array intact:
public boolean isArrayContainsDuplicates(int[] intArray) {
    int expectedOutPut = (intArray.length * (intArray.length + 1)) / 2;
    int actualOutput = 0;
    for (int i = 0; i < intArray.length; i++) {
        if (intArray[i] > intArray.length) return true;
        actualOutput += intArray[i];
    }
    return expectedOutPut != actualOutput;
}
2) This will touch a varying number of elements in the array. In the best case it returns at the first element, which is O(1); in the average case it stops about halfway through, which is n/2 steps and therefore still O(n); and in the worst case it goes all the way through and returns at the end, which is O(n).
O(1) refers to a number of operations that is unrelated to the total number of items. In this case, returning at the first element is the only way that happens.
O(log n) means the required number of operations grows with the logarithm of the number of items, which is far slower than the number of items itself; it arises when each step discards a constant fraction of the remaining data.
O(n) is when the required number of operations is proportional to the number of items.
These are all Big-O notations for the required time.
As for space: this algorithm uses only a fixed handful of variables no matter how large n is, so its space complexity is O(1).
3) Yes, this is possible in all cases, not just the best case: the counting-array and HashSet approaches above both run in O(n) time using O(n) space.