public String mostFrequent() {
    int a = 0;
    Set<String> s = hm.keySet();
    String r = "";
    Iterator<String> itrs = s.iterator();
    while (itrs.hasNext()) {
        String b = itrs.next();
        if (hm.get(b) > a) {
            a = hm.get(b);
            r = b;
        }
    }
    return r;
}
I know the worst-case running time of get(v) is O(n). Is the worst-case running time of this method then O(n^3), since it calls get(b) twice inside a while loop? I am not sure whether my reasoning is correct.
Thanks for any kind of hints and explanations!
Here is a helpful way to visualize O()
Imagine 1000 things
If you have an outer loop of 1000 times
doing 1000 things, that's 1,000,000 things = O(N^2)
If you do 1000 loops that is O(N)
If you do 10 loops that is O(log(N))
If you don't loop that is O(1)
Multiplying or dividing O() by any constant does nothing
O() of a sequence of things is the O() of the biggest thing in the sequence
Every level of Loops multiplies O() by N
Guesstimate how many things you do when N = 1000 and that will give you the right order of magnitude
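The 1000-things rule of thumb above can be checked directly with counters. A minimal sketch (the class name and variables are just for illustration):

```java
public class OpCount {
    public static void main(String[] args) {
        int n = 1000;
        long single = 0, nested = 0;

        // One loop over n: O(N), about 1000 steps.
        for (int i = 0; i < n; i++) single++;

        // A loop inside a loop: O(N^2), about 1,000,000 steps.
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) nested++;

        System.out.println(single); // 1000
        System.out.println(nested); // 1000000
    }
}
```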
I would suggest reading a few chapters of Cormen or Tamassia. Let us revisit your problem statement.
1) The operations on a HashMap are ideally constant time (although they depend on the load factor); we can safely assume constant time here, i.e. O(1).
2) The keySet method returns a view of the keys in O(1), but iterating over all n keys is an O(n) operation.
3) Inside the while loop, you are just doing a lookup, a comparison and an assignment; all of these are primitive operations with O(1) running time.
4) Moreover, in the while loop you should not call the get method twice; it doesn't make much sense. You could call it once and store the value in another variable.
If all these arguments make sense to you, you can safely say that the running time would be T(n) = O(n) + c1*O(1) + c2*O(1) + c3*O(1).
You can safely drop the lower-order terms and the constants to get T(n) = O(n).
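As a sketch of point 4, one way to avoid both the repeated get calls and the explicit iterator is to loop over entrySet, so each key and value is read once per entry. This assumes hm is the same String-to-count map as in the question; the surrounding class and main are scaffolding of my own:

```java
import java.util.HashMap;
import java.util.Map;

public class MostFrequent {
    // Stand-in for the question's hm field: word -> occurrence count.
    static Map<String, Integer> hm = new HashMap<>();

    // Single pass over the entries: O(n) overall, one lookup per entry
    // instead of two get() calls.
    static String mostFrequent() {
        int best = 0;
        String result = "";
        for (Map.Entry<String, Integer> e : hm.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                result = e.getKey();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        hm.put("a", 2);
        hm.put("b", 5);
        hm.put("c", 1);
        System.out.println(mostFrequent()); // prints b
    }
}
```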
If you really want to understand these things, you can always visit my blog
In a Java app, I have the following algorithm that is used for "Longest Substring with K Distinct Characters" as shown below:
Input: String="araaci", K=2
Output: 4
Explanation: The longest substring with no more than '2' distinct characters is "araa".
Input: String="cbbebi", K=3
Output: 5
Explanation: The longest substrings with no more than '3' distinct characters are "cbbeb" & "bbebi".
Here is the code:
public static int longestSubstring(String str, int k) {
    Map<Character, Integer> map = new HashMap<>();
    int maxLength = 0;
    int l = 0;
    for (int r = 0; r < str.length(); r++) {
        char cRight = str.charAt(r);
        map.put(cRight, map.getOrDefault(cRight, 0) + 1);
        while (map.size() > k) {
            char cLeft = str.charAt(l);
            map.put(cLeft, map.getOrDefault(cLeft, 0) - 1);
            if (map.get(cLeft) == 0) {
                map.remove(cLeft);
            }
            l++;
        }
        maxLength = Math.max(maxLength, r - l + 1);
    }
    return maxLength;
}
I could not understand the time complexity in the following definition:
Time Complexity
The time complexity of the above algorithm will be O(N) where ‘N’ is the number of characters in the input string. The outer for loop runs for all characters and the inner while loop processes each character only once, therefore the time complexity of the algorithm will be O(N+N) which is asymptotically equivalent to O(N).
So, I thought that when there is a while loop inside a for loop, the time complexity is O(n^2). But here I could not understand "the inner while loop processes each character only once". Can you explain this statement, if it is correct?
In order to analyse the complexity of an algorithm, most of the time you'll need to understand what the code does in detail (you don't need to understand whether what it does is correct, though). Using the structure of the code (i.e. whether loops are nested) or just looking at the big picture is usually a bad idea. In other words, computing the complexity of an algorithm takes a lot of (your) time.
You stated that "the inner while loop processes each character only once", which is indeed important to notice, but that alone is not sufficient in my opinion.
Loops do not matter per se, what matters is the total number of instructions your program will run depending on the input size. You can read "instruction" as "a function that runs in constant time" (independently of the input size).
Making sure all function calls are in O(1)
Let's first look at the complexity of all function calls:
We have several map reads, all in O(1) (read this as "constant time reads"):
map.getOrDefault(cRight, 0)
map.getOrDefault(cLeft, 0)
Also several map insertions, also all in O(1):
map.put(cRight, ...)
map.put(cLeft, ...)
And map item deletion map.remove(cLeft), also in O(1)
The Math.max(..., ...) is also in O(1)
str.charAt(..) is also in O(1)
There are also increments/decrements of loop variables, checks of their values, and a few other +1, -1 operations, all in O(1).
Ok, now we can safely say that all external functions are "instructions" (or more accurately, all these functions use a constant number of instructions). Notice that the hashmap complexities are not exactly O(1) in all cases, but this is a detail you can look at separately.
Which means we now only need to count how many of these functions are called.
Analyzing the number of function calls
The argument made in the comments is accurate, but using the fact that char cLeft = str.charAt(l) will crash the program if l ≥ N is not very satisfactory in my opinion. Still, it is a valid point: it is impossible for the inner loop to be executed more than N times in total (which leads directly to the expected O(N) time complexity).
If this was given as an exercise, I doubt that was the expected answer. Let's analyze the program as if it were written using char cLeft = str.charAt(l % str.length()) instead, to make it a little more interesting.
I feel like the main argument should be based on the total character count stored in map (a map of character-to-counter pairs). Here are some facts, mainly:
The outer loop always increases a single counter by exactly one.
The inner loop always decreases a single counter by exactly one.
Also:
The inner loop ensures that all counters are positive (it removes counters when they reach 0).
The inner loop runs as long as the number of (positive) counters is > k.
Lemma For the inner loop to be executed C times in total, the outer loop needs to be executed at least C times in total.
Proof Suppose the outer loop is executed C times and the inner loop at least C + 1 times. Then there exists an iteration of the outer loop during which the total number of inner-loop executions first exceeds the total number of outer-loop executions. At that point (by 1. and 2.) the sum of all character counters in map would equal 0. By fact 3., this means that there are no counters left (map.size() equals 0). Since k > 0, it is then impossible to enter the inner loop one more time, because of 4.. This contradiction proves the Lemma.
Less formally, the inner loop will never execute when the sum of counters is 0, because entering it requires more than k (> 0) positive counters, whose sum is greater than 0. In other words, the consumer (inner loop) can't consume more than what is being produced (by the outer loop).
Because of this Lemma, and because the outer loop executes exactly N times, the inner loop executes at most N times. In total we will execute at most A * N function calls in the outer loop and B * N function calls in the inner loop, where both A and B are constants and all functions are in O(1); therefore (A + B) * N ∈ O(N).
Also note that writing O(N + N) is a pleonasm (or doesn't make sense), because big-O is supposed to ignore all constant factors (both multiplicative and additive). Usually people will not write equations using big-O notation because it is hard to write something correct and formal (apart from obvious set inclusions like O(log N) ⊂ O(N)). Usually you would say something like "all operations are in O(N), therefore the algorithm is in O(N)".
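To see the bound concretely, here is the question's method with an instrumentation counter added for the inner loop (the innerSteps field is my addition, purely for measurement):

```java
import java.util.HashMap;
import java.util.Map;

public class SlidingWindowCount {
    // Total number of inner while-loop iterations across one call.
    static int innerSteps;

    static int longestSubstring(String str, int k) {
        Map<Character, Integer> map = new HashMap<>();
        int maxLength = 0, l = 0;
        innerSteps = 0;
        for (int r = 0; r < str.length(); r++) {
            char cRight = str.charAt(r);
            map.put(cRight, map.getOrDefault(cRight, 0) + 1);
            while (map.size() > k) {
                innerSteps++; // each iteration advances l, which never exceeds r
                char cLeft = str.charAt(l);
                map.put(cLeft, map.get(cLeft) - 1);
                if (map.get(cLeft) == 0) map.remove(cLeft);
                l++;
            }
            maxLength = Math.max(maxLength, r - l + 1);
        }
        return maxLength;
    }

    public static void main(String[] args) {
        int len = longestSubstring("araaci", 2);
        System.out.println(len);                              // 4, as in the example
        System.out.println(innerSteps <= "araaci".length());  // true: inner loop bounded by N
    }
}
```

Running it on the examples from the question shows the inner loop's total iteration count never exceeds N, which is exactly what the Lemma claims.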
Trying to brush up on my Big-O understanding for a test I have coming up (only a very basic Big-O understanding is required, obviously), I was doing some practice problems in my book.
They gave me the following snippet
public static void swap(int[] a)
{
    int i = 0;
    int j = a.length - 1;
    while (i < j)
    {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
        i++;
        j--;
    }
}
Pretty easy to understand, I think. It has two indices, each covering half the array with a fixed amount of work per step (which I think clocks them both in at O(n/2)).
Therefore O(n/2) + O(n/2) = O(2n/2) = O(n)
Now please forgive me, as this is my current understanding and that was my attempt at a solution to the problem. I have found many examples of big-O online, but none quite like this, where the two indices both move and modify the array at basically the same time.
The fact that it has one loop is making me think it's O(n) anyway.
Would anyone mind clearing this up for me?
Thanks
The fact that it has one loop is making me think it's O(n) anyway.
This is correct. Not because it is one loop, but because it is one loop whose number of iterations depends on the size of the array by a constant factor: big-O notation ignores any constant factor. O(n) means that the only influence on the running time is the size of the array. That it actually takes half that time does not matter for big-O.
In other words: if your algorithm takes time n + X, X*n, or X*n + Y (for constants X and Y), it all comes down to O(n).
It is different if the number of loop iterations grows as something other than a constant factor of n, for instance as a logarithmic or exponential function of n. For instance, if a size of 100 needs 2 iterations, a size of 1000 needs 3, and a size of 10000 needs 4, then it would be O(log(n)).
It would also be different if the loop is independent of size. I.e., if you would always loop 100 times, regardless of loop size, your algorithm would be O(1) (i.e., operate in some constant time).
I was also wondering if the equation I came up with to get there was somewhere in the ballpark of being correct.
Yes. In fact, if your equation ends up being some form of n * C + Y, where C is some constant and Y is some other value, the result is O(n), regardless of whether C is greater than 1 or smaller than 1.
You are right about the loop. Loop will determine the Big O. But the loop runs only for half the array.
So it's 2 + 6*(n/2).
If we make n very large, other numbers are really small. So they won't matter.
So its O(n).
Let's say you are running 2 separate loops: 2 + 6*(n/2) + 6*(n/2). In that case it will be O(n) again.
But if we run a nested loop, it is 2 + 6*(n*n). Then it will be O(n^2).
Always remove the constants and do the math. You get the idea.
As j - i decreases by 2 units on each iteration, N/2 iterations are performed (assuming N = length(a)).
Hence the running time is indeed O(N/2). And O(N/2) is strictly equivalent to O(N).
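To make the N/2 count visible, here is the swap loop from the question instrumented to return its iteration count (the counting is my addition; the reversal logic is unchanged):

```java
public class SwapCount {
    // Same reversal loop as in the question, but it returns how many
    // iterations the while loop performed.
    static int swapCount(int[] a) {
        int i = 0, j = a.length - 1, iterations = 0;
        while (i < j) {
            int temp = a[i];
            a[i] = a[j];
            a[j] = temp;
            i++;
            j--;
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(swapCount(new int[10]));   // 5: floor(n/2) iterations
        System.out.println(swapCount(new int[1000])); // 500: linear in n, constant factor 1/2
    }
}
```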
I just read in "Everything about Java 8" that Java 8 adds Arrays.parallelSetAll():
int[] array = new int[8];
AtomicInteger i = new AtomicInteger();
Arrays.parallelSetAll(array, operand -> i.incrementAndGet());
[Edited] Is it O(1), i.e. constant time complexity, on the same machine for the same number of elements in the array? What sort of performance improvement is suggested by the method name?
To start off, it can never be O(1). More clarification follows:
I am using n = array.length, which in your case is 8; however, that does not matter, as it could also be a very big number.
Now observe that normally you would do:
for (int j = 0; j < n; j++) {
    array[j] = i.incrementAndGet();
}
With Java 8 this is much easier:
Arrays.setAll(array, v -> i.incrementAndGet());
Observe that they both take O(n) time.
Now take into account that you execute the code in parallel, but there are no guarantees as to how it is executed; you do not know how much parallelization happens under the hood, if any at all for such a small number of elements.
Therefore it still takes O(n) time, because you cannot prove that it will parallelize over n threads.
Edit, as an extra: I have observed that you seem to think that parallelizing an action means that any O(k) will converge to O(1), where k = n or k = n^2, etc.
This is not the case in practice, as you can never assume that you have k processor cores available.
An intuitive argument is your own computer: if you are lucky, it may have 8 cores, so the best you could get under perfect parallelization conditions is O(n / 8).
I can already hear the people from the future laughing at that we only had 8 CPU cores...
It is O(N). Calling Arrays.parallelSetAll(...) involves assignments to set a total of array.length array elements. Even if those assignments are spread across P processors, the total number of assignments is linearly proportional to the length of the array. Take N as the length of the array, and the math is obvious.
The thing to realize is that P ... the number of available processors ... is going to be a constant for any given execution of a program on a single computer. (Or if it is not a constant, there will be a constant upper bound.) And a computation whose sole purpose is to assign values to an array only makes sense when executed on a single computer.
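As a side note, a sketch of my own (not from the question): deriving each value from the index, rather than from a shared AtomicInteger, keeps the result deterministic no matter how the N assignments are split across cores, and the total work is still N assignments, i.e. O(N):

```java
import java.util.Arrays;

public class ParallelFill {
    public static void main(String[] args) {
        int[] array = new int[8];
        // The generator receives the index, so the filled values do not
        // depend on the order in which the P worker threads run.
        Arrays.parallelSetAll(array, i -> i + 1);
        System.out.println(Arrays.toString(array)); // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```

With the AtomicInteger version, each slot gets *some* unique value, but which slot gets which value depends on scheduling; the index-based generator avoids that entirely.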
I have N numbers in an ArrayList. To find a value with indexOf, the ArrayList will have to iterate at most N times, so the complexity is O(N). Is that correct?
Source Java API
Yes, the complexity is O(N).
The size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking). The constant factor is low compared to that for the LinkedList implementation.
Yes, it's O(n), as it needs to iterate through every item in the list in the worst case.
The only way to do better than this is to have some sort of structure on the list. The most typical example is looking through a sorted list using binary search, in O(log n) time.
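A quick sketch of that contrast, using the JDK's built-in linear and binary searches on a small sorted list (the example data is mine):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SearchCompare {
    public static void main(String[] args) {
        List<Integer> sorted = Arrays.asList(10, 20, 30, 40, 50);

        // Linear scan: up to N comparisons in the worst case, O(N).
        System.out.println(sorted.indexOf(40)); // 3

        // Binary search on a sorted list: about log2(N) comparisons, O(log N).
        System.out.println(Collections.binarySearch(sorted, 40)); // 3
    }
}
```

Both return the same index here, but as the list grows, the number of comparisons for binarySearch grows logarithmically while indexOf grows linearly.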
Yes, that is correct. The order is based on the worst case.
100%, it needs to iterate through the list to find the correct index.
It is true. The best case is 1, so O(1); the average case is N/2, so O(N); and the worst case is N, so O(N).
In the worst case you find the element at the very last position, which takes N steps, that is, O(N). In the best case the item you are searching for is the very first one, so the complexity is O(1). The average case is the average number of steps. If we do not have further context, this is how one can make the calculation:
avg = (1 + 2 + ... n) / n = (n * (n + 1) / 2) / n = (n + 1) / 2
As n -> infinity, adding a positive constant and dividing by a positive constant do not change the growth rate; (n + 1) / 2 still grows linearly, so it is O(n).
However if you have a large finite data to work with, then you might want to calculate the exact average value as above.
Also, you might have a context there which could aid you to get further accuracy in your calculations.
Example:
Let's consider the example where your array is ordered by usage frequency, descending. If each call of indexOf corresponds to a usage, then the most probable item is the first one, then the second, and so on. If you have the exact usage frequency for each item, then you will be able to calculate the expected number of steps.
An ArrayList is backed by an array, with more features, so the order of complexity of operations on an ArrayList is the same as for an array.
I'm having some trouble finding the big O for the if statement in the code below:
public static boolean areUnique (int[] ar)
{
    for (int i = 0; i < ar.length - 1; i++)      // O(n)
    {
        for (int j = i + 1; j < ar.length; j++)  // O(n)
        {
            if (ar[i] == ar[j])                  // O(???)
                return false;                    // O(1)
        }
    }
    return true;                                 // O(1)
}
I'm trying to do a time complexity analysis for the best, worst, and average case
Thank you everyone for answering so quickly! I'm not sure if my best, worst and average cases are correct... There should be a difference between the cases, should there not, because of the if statement? But when I do my analysis they all end up as O(n^2):
Best: O(n) * O(n) * [O(1) + O(1)] = O(n^2)
Worst: O(n) * O(n) * [O(1) + O(1) + O(1)] = O(n^2)
Average: O(n) * O(n) * [O(1) + O(1) + O(1)] = O(n^2)
Am I doing this right? My textbook is not very helpful
For starters, this line
if (ar[i] == ar[j])
always takes time Θ(1) to execute. It does only a constant amount of work (a comparison plus a branch), so the work done here won't asymptotically contribute to the overall runtime.
Given this, we can analyze the worst-case behavior by considering what happens if this statement is always false. That means that the loop runs as long as possible. As you noticed, since each loop runs O(n) times, the total work done is Θ(n^2) in the worst case.
In the best case, however, the runtime is much lower. Imagine any array where the first two elements are the same. In that case, the function will terminate almost instantly when the conditional is encountered for the first time. In this case, the runtime is Θ(1), because a constant number of statements will be executed.
The average-case, however, is not well-defined here. Average-case is typically defined relative to some distribution - the average over what? - and it's not clear what that is here. If you assume that the array consists of truly random int values and that ints can take on any integer value (not a reasonable assumption, but it's fine for now), then the probability that a randomly-chosen array has a duplicate is 0 and we're back in the worst case (runtime Θ(n^2)). However, if the values are more constrained, the runtime changes. Let's suppose that there are n numbers in the array and the integers range from 0 to k - 1, inclusive. Given a random array, the runtime depends on
Whether there's any duplicates or not, and
If there is a duplicate, where the first duplicated value appears in the array.
I am fairly confident that this math is going to be very hard to work out and if I have the time later today I'll come back and try to get an exact value (or at least something asymptotically appropriate). I seriously doubt this is what was expected since this seems to be an introductory big-O assignment, but it's an interesting question and I'd like to look into it more.
Hope this helps!
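To make the best/worst-case gap visible, here is the question's pairwise check with a comparison counter added (the counter, test arrays, and class scaffolding are my additions):

```java
public class UniqueCount {
    // How many times the if was evaluated in the last call.
    static int comparisons;

    static boolean areUnique(int[] ar) {
        comparisons = 0;
        for (int i = 0; i < ar.length - 1; i++) {
            for (int j = i + 1; j < ar.length; j++) {
                comparisons++;
                if (ar[i] == ar[j]) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        areUnique(new int[]{7, 7, 1, 2, 3}); // best case: first pair matches
        System.out.println(comparisons);      // 1 comparison -> Θ(1)

        areUnique(new int[]{1, 2, 3, 4, 5});  // worst case: all elements distinct
        System.out.println(comparisons);      // n(n-1)/2 = 10 comparisons -> Θ(n^2)
    }
}
```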
The if itself is O(1).
This is because big-O does not count the individual micro-operations inside the ALU or CPU; even if if (ar[i] == ar[j]) were really, say, 6 machine operations, O(6) still translates into O(1).
You can regard it as O(1).
No matter what you consider as "one" step, the number of instructions for carrying out a[i] == a[j] doesn't depend on the value of n in this case.