Our teacher didn't teach us how to analyze the running time of an algorithm before she asked us to report on Shell sort.
I just want to know if there is a simple way to find the average/best/worst case performance of an algorithm like shell sort.
//for reference
class ShellSort {
    void shellSort(int[] array, int n) {
        // n = array.length
        // Start with a large gap, then halve it on each pass
        for (int gap = n / 2; gap > 0; gap /= 2) {
            // Do a gapped insertion sort for this gap size
            for (int i = gap; i < n; i++) {
                int temp = array[i];
                int j;
                // Shift earlier gap-sorted elements up until the correct
                // position for array[i] is found
                for (j = i; j >= gap && array[j - gap] > temp; j -= gap) {
                    array[j] = array[j - gap];
                }
                array[j] = temp;
            }
        }
    }
}
Welcome to Stack Overflow! Usually, the time complexity of a sorting algorithm is measured by the number of key comparisons performed. Begin by asking which input requires the fewest key comparisons to sort completely (the best case), then which input requires the most (the worst case). Oftentimes the best case is an already-sorted list and the worst case is a list sorted in reverse order, though this does not hold for some divide-and-conquer algorithms.
As for the average case, once you have derived the best and worst cases, you know the average is bounded between the two. If both fall into the same complexity class (often the 'Big Oh' class), then the average is the same. Otherwise you can derive it mathematically through a probabilistic analysis.
Every loop over the array contributes a factor of n to the time complexity, and nested loops multiply their factors: two nested loops give a complexity of n^2, and three nested loops give n^3.
Repeatedly partitioning an array in half contributes a factor of log(n).
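For example, here is a minimal sketch (my own instrumentation, not part of your post; the class name and counter are invented for illustration) that counts key comparisons in the posted shellSort, so you can compare measured counts for sorted versus reverse-sorted input against whatever formula you derive:

class ShellSortCounter {
    static long comparisons = 0; // hypothetical counter, added for illustration

    static void shellSort(int[] array, int n) {
        for (int gap = n / 2; gap > 0; gap /= 2) {
            for (int i = gap; i < n; i++) {
                int temp = array[i];
                int j;
                for (j = i; j >= gap; j -= gap) {
                    comparisons++;                     // one key comparison
                    if (array[j - gap] <= temp) break; // element already in place
                    array[j] = array[j - gap];
                }
                array[j] = temp;
            }
        }
    }

    public static void main(String[] args) {
        shellSort(new int[]{1, 2, 3, 4, 5, 6, 7, 8}, 8);
        System.out.println("already sorted: " + comparisons + " comparisons");
        comparisons = 0;
        shellSort(new int[]{8, 7, 6, 5, 4, 3, 2, 1}, 8);
        System.out.println("reverse sorted: " + comparisons + " comparisons");
    }
}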
I have a task to compare the execution times of Bubble sort and Insertion sort on a 2D matrix. They should only sort the numbers within each row. I wrote the methods in Java like below:
static void BubbleSort(char tab2D[][], int tabSize) throws InterruptedException {
    for (int i = 0; i < tabSize; i++) {
        for (int j = 1; j < tabSize - i; j++) {
            for (int k = 0; k < tabSize; k++) {
                if (tab2D[k][j - 1] > tab2D[k][j]) {
                    char temp = tab2D[k][j];
                    tab2D[k][j] = tab2D[k][j - 1];
                    tab2D[k][j - 1] = temp;
                }
            }
        }
    }
}
static void InsertionSort(char tab2D[][], int size) throws InterruptedException {
    char temp;
    int i, j;
    for (int ii = 0; ii < size; ii++) {
        for (i = 1; i < size; i++) {
            temp = tab2D[ii][i];
            for (j = i - 1; j >= 0 && tab2D[ii][j] > temp; j--) {
                tab2D[ii][j + 1] = tab2D[ii][j];
            }
            tab2D[ii][j + 1] = temp;
        }
    }
}
As far as I know both algorithms have time complexity O(n^2), yet here are my results for both:
100x100 Matrix: Bubble sort=21 ms, Insertion sort=3 ms
300x300 Matrix: Bubble sort=495 ms, Insertion sort=20 ms
700x700 Matrix: Bubble sort=6877 ms, Insertion sort=249 ms
I am measuring time like this:
start = System.currentTimeMillis();
// sort method
stop = System.currentTimeMillis();
time = stop - start;
I don't really understand where the problem is, as Bubble sort is WAY slower compared to Insertion sort. As I understand it, their times should be similar. I even edited both algorithms to sort plain 1D arrays; Bubble sort is still way slower and I can't tell why.
Next I tried putting Thread.sleep(1) in both algorithms just below the swap. After doing that, the times finally look similar and differ only slightly. Could someone explain why that happens? Are the times I measured before correct?
As mentioned in the comments, one cannot say anything useful about the actual time it takes to run an algorithm based only on its time complexity. Two algorithms with the same time complexity do not necessarily need the same time to complete the same job.
That being said, here are some differences between the two functions that explain why InsertionSort needs less time on average:
If we ignore the tab2D[ii][j] > temp condition in the inner loop of InsertionSort, both functions perform the same total number of loop iterations. However, since InsertionSort has a condition in its inner loop that can make it exit early, it will often make fewer iterations.
Both make a data comparison in their inner loops, but BubbleSort must read two values from the array to do so, while InsertionSort needs only one array access (the other value is already in temp).
The inner loop of InsertionSort focuses on one value that moves into its correct place, so it does not actually have to swap, only copy. BubbleSort has to swap, which takes three assignments instead of one. Granted, BubbleSort does not swap in every iteration of its inner loop, but that is offset by the fact that InsertionSort, thanks to its early exit, does not move every array value in the loop's range either. Roughly speaking, BubbleSort spends 3 assignments per move where InsertionSort needs only 1 (plus an overhead of 2 assignments outside the inner loop). On average, BubbleSort performs more assignments involving the array than InsertionSort.
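To make this concrete, here is a rough sketch (my own instrumentation, operating on single rows rather than the full matrix; the class and method names are invented) that counts the array writes each inner loop performs:

import java.util.Random;

class WriteCounter {
    // Count array writes made by the bubble-sort inner loop on one row.
    static long bubbleWrites(char[] a) {
        long writes = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = 1; j < a.length - i; j++) {
                if (a[j - 1] > a[j]) {
                    char temp = a[j];
                    a[j] = a[j - 1];
                    a[j - 1] = temp;
                    writes += 2; // a swap costs two array writes
                }
            }
        }
        return writes;
    }

    // Count array writes made by the insertion-sort inner loop on one row.
    static long insertionWrites(char[] a) {
        long writes = 0;
        for (int i = 1; i < a.length; i++) {
            char temp = a[i];
            int j;
            for (j = i - 1; j >= 0 && a[j] > temp; j--) {
                a[j + 1] = a[j];
                writes++; // a shift costs one array write
            }
            a[j + 1] = temp;
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        char[] a = new char[700];
        for (int i = 0; i < a.length; i++) a[i] = (char) ('a' + rnd.nextInt(26));
        char[] b = a.clone();
        System.out.println("bubble writes:    " + bubbleWrites(a));
        System.out.println("insertion writes: " + insertionWrites(b));
    }
}

On random data this typically shows bubble sort doing roughly twice as many array writes; together with its extra array read per comparison, that helps explain (though not fully account for) the timing gap you observed.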
Here is the code:
for (int i = 0; i < 60; i++) {
    for (int j = i - 1; j < N; j++) {
        for (int k = 0; k + 2 < N; k++) {
            System.out.println(i * j);
            System.out.println(i);
            i = i + 1;
        }
    }
}
I believe it is O(N^2) since there are two N's in the for loops, but I'm not too sure.
Any help is appreciated, thanks!
Note that the i-loop has a fixed limit of 60, so it does not contribute a factor of N. Saying it's O(N^2) is not wrong IMO, but if we are strict, then the complexity is O(N^2 * log(N)).
We can prove it more formally:
First, let's get rid of the i-loop for the analysis. The value of i is incremented within the k-loop, so when N is large, i exceeds 60 during the very first iteration and the i-loop executes just once. For simplicity, treat i as just a variable initialized to 0. Now the code is:
int i = 0;
for (int j = -1; j < N; j++) {        // j from -1 to N - 1
    for (int k = 0; k + 2 < N; k++) { // k from 0 to N - 3
        System.out.println(i * j);
        System.out.println(i);
        i = i + 1;
    }
}
If we were very strict, we would have to consider the cases where the i-loop executes more than once, but those occur only for small N, which is not what big O is about. So let's ignore the i-loop.
We analyze the loops first, treating the inner statements as constant, and then analyze the statements separately.
The j-loop runs O(N) times and the k-loop runs O(N) times, so the complexity of the loops is O(N^2).
Now the statements. The interesting ones are the printing statements: printing a number is done digit by digit (simply speaking), so it is not constant, and the number of digits grows logarithmically with the number's value. See this for details. The last statement is just constant.
Now we need to look at the numbers (roughly). The value of i grows up to N * N. The value of j grows up to N. So the first print prints a number that grows up to N * N * N. The second print prints a number that grows up to N * N. So the inner body has the complexity O(log(N^3) + log(N^2)), which is just O(3log(N) + 2log(N)), which is just O(5log(N)). Constant factors are dropped in big O so the final complexity is O(log(N)).
Combining the complexity of the loops and the complexity of the executed body yields the overall complexity: O(N^2 * log(N)).
Ask your teacher if the printing statements were supposed to be considered.
The answer is O(N^2 log N).
First of all, the outer loop can be ignored, since it has a constant number of iterations, and hence only contributes by a constant factor. Also, the i = i+1 has no effect on the time complexity since it only manipulates the outer loop.
The println(i * j) statement has a time complexity of O(bitlength(i * j)), which is bounded by O(bitlength(N^3)) = O(log N^3) = O(log N) (and similarly for the other println statement). Now, how often are these println statements executed?
The two inner loops are nested and both run from a constant up to something that is linear in N, so they iterate O(N^2) times. Hence the total time complexity is O(N^2 log N).
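If you want to sanity-check the O(N^2) iteration count empirically, here is a small harness (my own, hypothetical; the class name and the list of sizes are invented) that counts how often the loop body executes, with the prints removed:

class IterationCount {
    public static void main(String[] args) {
        for (int N : new int[]{100, 200, 400}) {
            long count = 0;
            for (int i = 0; i < 60; i++) {
                for (int j = i - 1; j < N; j++) {
                    for (int k = 0; k + 2 < N; k++) {
                        i = i + 1; // same mutation as the original code
                        count++;
                    }
                }
            }
            System.out.println("N=" + N + ": body ran " + count
                    + " times; N^2 = " + ((long) N * N));
        }
    }
}

For N = 100 this prints 9898, very close to N^2 = 10000, confirming that the i-loop contributes only one iteration for large N.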
I'm just going over some basic sorting algorithms. I implemented the insertion sort below.
public static int[] insertionSort(int[] arr) {
    int I = 0; // counts inner-loop iterations
    for (int i = 0; i < arr.length; i++) {
        for (int j = 0; j < i; j++) {
            if (arr[i] < arr[j]) {
                int temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
            }
            I++;
        }
    }
    System.out.println(I);
    return arr;
}
I prints out 4950 for an array of size 100 filled with 100 randomly generated integers.
I know the algorithm is considered O(n^2), but what would be the more arithmetically precise runtime? If it were literally n^2, I'm assuming it would print out 10,000 and not 4950.
Big-Oh notation tells us how much work an algorithm must do as the input size grows. A single test input doesn't give enough information to verify the theoretical Big-Oh. You should run the algorithm on arrays of many different sizes, from 100 up to a million, and graph the output with the size of the array as the x-variable and the number of steps your code reports as the y-variable. When you do this, you will see that the graph is a parabola.
You can use algebra to find a function of the form y = a*x^2 + b*x + c that fits the data as closely as possible. But with Big-Oh notation we don't care about the smaller terms, because they become insignificant compared to the x^2 part. For example, when x = 10^3, x^2 = 10^6, which is much larger than b*x + c; when x = 10^6, x^2 = 10^12, which again is so much larger that we can ignore the smaller terms.
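For instance, a quick harness (my own sketch; since the step count depends only on the loop bounds, the array contents and the swap are omitted) prints the count for several sizes so the quadratic growth is visible:

class GrowthCheck {
    public static void main(String[] args) {
        for (int n : new int[]{100, 1_000, 10_000}) {
            long steps = 0;
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < i; j++) {
                    steps++; // mirrors the I++ in the posted method
                }
            }
            System.out.println("n=" + n + ": steps=" + steps
                    + ", n*(n-1)/2=" + ((long) n * (n - 1) / 2));
        }
    }
}

For n = 100 this prints 4950, matching your observation.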
You can make the following observations: On the ith iteration of the outer loop, the inner loop runs i times, for i from 0 to n-1 where n is the length of the array.
In total over the entire algorithm the inner loop runs T(n) times where
T(n) = 0 + 1 + 2 + ... + (n-1)
This is an arithmetic series, and it is easy to prove the sum equals a second-degree polynomial in n:
T(n) = n*(n-1)/2 = .5*n^2 - .5*n
For n = 100, the formula predicts the inner loop will run T(100) = 100*99/2 = 4950 times which matches what you calculated.
The objective is to create a function that accepts two arguments: an array of integers and an integer that is our target. The function should return the two indexes of the array elements that add up to the target. We cannot sum an element with itself, and we should assume the given array always contains an answer.
I solved this code kata exercise using a for loop and a while loop. The for loop is linear, O(N), where N is the total number of elements in the array, but for each element there is a while loop whose cost also grows linearly.
Does this mean that the total time complexity of this code is O(N²)?
public int[] twoSum(int[] nums, int target) {
    int[] answer = new int[2];
    for (int i = 0; i <= nums.length - 1; i++) {
        int finder = 0;
        int index = 0;
        while (index <= nums.length - 1) {
            if (nums[i] != nums[index]) {
                finder = nums[i] + nums[index];
                if (finder == target) {
                    answer[0] = index;
                    answer[1] = i;
                }
            }
            index++;
        }
    }
    return answer;
}
How would you optimize this for time and space complexity?
Does this mean that the total time complexity of this code is O(N²)?
Yes, your reasoning is correct and your code is indeed O(N²) time complexity.
How would you optimize this for time and space complexity?
You can use an auxiliary data structure, or sort the array and perform lookups on it.
One simple solution, which is O(n) average case, is to use a hash table and insert elements while you traverse the list. For each element x you visit, look up target - x in the hash table.
I am leaving the implementation to you, I am sure you can do it and learn a lot in the process!
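If you get stuck, here is one possible sketch of that hash-table approach (my own implementation, with invented names; it is not the only way to write it):

import java.util.HashMap;
import java.util.Map;

class TwoSumSketch {
    public static int[] twoSum(int[] nums, int target) {
        Map<Integer, Integer> seen = new HashMap<>(); // value -> index
        for (int i = 0; i < nums.length; i++) {
            Integer j = seen.get(target - nums[i]); // look up the complement
            if (j != null) {
                return new int[]{j, i};
            }
            seen.put(nums[i], i);
        }
        return new int[0]; // unreachable if an answer is guaranteed
    }
}

This is a single pass: O(n) average time and O(n) extra space, versus O(N²) time and O(1) extra space for the nested-loop version.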
So I was working on this sorting algorithm in Java and was wondering if anybody has seen anything like it before. Specifically, this algorithm is meant to sort very large lists with a very small range of numbers. If you have seen it before, or have any suggestions to improve the algorithm, could you comment below? I have the code here:
public static int[] sort(int[] nums)
{
    // find the smallest value in the array
    int lowest = Integer.MAX_VALUE;
    for (int n : nums)
    {
        if (n < lowest)
            lowest = n;
    }
    int index = 0; // boundary of the sorted front of the array
    int down = 0;  // how many passes (decrements) have happened so far
    while (index < nums.length)
    {
        for (int i = index; i < nums.length; i++)
        {
            if (nums[i] == lowest)
            {
                // restore the original value and swap it to the front
                int temp = nums[i] + down;
                nums[i] = nums[index];
                nums[index] = temp;
                index++;
            }
            else
                nums[i]--;
        }
        down++;
    }
    return nums;
}
If I'm not mistaken, that is your standard-issue BubbleSort. It's simple to implement but has poor performance: O(n^2). Notice the two nested loops: as the size of the array increases, the runtime of the algorithm grows quadratically.
It's named Bubble Sort because the smallest values will "bubble" to the front of the array, one at a time. You can read more about it on Wikipedia.
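For reference, a textbook bubble sort might look like this (a sketch of my own, not taken from the question, written so that the smallest values bubble to the front as described above):

class BubbleSortDemo {
    static void bubbleSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            // scan right-to-left so the smallest remaining value
            // bubbles into position i
            for (int j = a.length - 1; j > i; j--) {
                if (a[j] < a[j - 1]) {
                    int temp = a[j];
                    a[j] = a[j - 1];
                    a[j - 1] = temp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 2, 3};
        bubbleSort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}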
So the algorithm seems to work but it does a lot of unnecessary work in the process.
Basically, you force yourself to subtract from a number x times before adding x back and swapping it, where x is the difference between that number and the lowest number in the array. Take [99, 1] for example. With your algorithm, the first inner-loop iteration updates the array to [98, 1], and the next iteration makes the swap, giving [1, 98]. Then you have to make 97 more passes to bring your down variable up to 98 and your array to the [1, 1] state, at which point you add 98 back and swap the element with itself. It's an interesting technique for sure, but it's not very efficient.
The best algorithm for any given job really depends on what you know about your data. Look into other sorting algorithms to get a feel for what they do and why they do it. When you design an algorithm of your own, walk through it by hand and try to eliminate unnecessary steps.
To enhance this algorithm, I would first get rid of finding the lowest value in the set and remove the addition and subtraction steps. If you know your numbers will all be integers in a given range, look into bucket sorting (see the sketch below); otherwise, try merge sort or quicksort.
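For instance, here is a minimal counting-sort sketch (my own, assuming the values are ints spanning a small known range, which is what bucket sorting amounts to here). It runs in O(n + k) time for n values spanning a range of size k, which fits "very large lists with a very small range of numbers":

class CountingSortDemo {
    static int[] countingSort(int[] nums) {
        if (nums.length == 0) return nums;
        // find the actual value range
        int lo = nums[0], hi = nums[0];
        for (int n : nums) {
            if (n < lo) lo = n;
            if (n > hi) hi = n;
        }
        int[] counts = new int[hi - lo + 1]; // one bucket per possible value
        for (int n : nums) counts[n - lo]++;
        // write the values back in sorted order
        int idx = 0;
        for (int v = 0; v < counts.length; v++) {
            while (counts[v]-- > 0) nums[idx++] = v + lo;
        }
        return nums;
    }

    public static void main(String[] args) {
        int[] nums = {3, 1, 2, 3, 1, 2, 2};
        System.out.println(java.util.Arrays.toString(countingSort(nums)));
    }
}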