How to implement a k-way merge sort?

I need to implement a function that performs a k-way merge sort on an unsorted array of integers.
The function takes two parameters: an integer K, which is the "way" of the sort and is always a power of 2, and the array of integers to be sorted, whose length is also a power of 2.
The function should return an array containing the sorted elements. So far, I know how to implement a regular merge sort. How would I modify this code so that it implements a K-way merge sort? (Note: this function doesn't return the sorted array; I need help with that as well. It also doesn't take K, since it's a regular merge sort.)
Code below:
public class MergeSort {
public static void main(String[] args) {
}
public static void mergeSort(int[] inputArray) {
int size = inputArray.length;
if (size < 2)
return;
int mid = size / 2;
int leftSize = mid;
int rightSize = size - mid;
int[] left = new int[leftSize];
int[] right = new int[rightSize];
for (int i = 0; i < mid; i++) {
left[i] = inputArray[i];
}
for (int i = mid; i < size; i++) {
right[i - mid] = inputArray[i];
}
mergeSort(left);
mergeSort(right);
merge(left, right, inputArray);
}
public static void merge(int[] left, int[] right, int[] arr) {
int leftSize = left.length;
int rightSize = right.length;
int i = 0, j = 0, k = 0;
while (i < leftSize && j < rightSize) {
if (left[i] <= right[j]) {
arr[k] = left[i];
i++;
k++;
} else {
arr[k] = right[j];
k++;
j++;
}
}
while (i < leftSize) {
arr[k] = left[i];
k++;
i++;
}
while (j < rightSize) {
arr[k] = right[j];
k++;
j++;
}
}
}

A regular merge sort is a two-way sort: you compare elements from the first and second halves of the array and copy the smaller one to the output array.
For a k-way sort, you divide the input array into K parts, with K indexes pointing to the first element of each part. To choose the smallest of them efficiently, use a priority queue (based on a binary heap) and pop the smallest element from the top of the heap at every step. When you pop an element belonging to the m-th part, push the next element from the same part (if one remains).
Suppose you have an array of length 16 and k = 4.
The first recursion level calls 4 merge sorts on arrays copied from indexes 0..3, 4..7, 8..11, 12..15.
The second recursion level gets an array of length 4 and calls 4 merge sorts on 1-element arrays.
The third recursion level gets an array of length 1 and returns immediately (such an array is already sorted).
Back at the second recursion level, you merge 4 one-element arrays into one sorted array.
Back at the first recursion level, you merge 4 four-element arrays into one sorted array of length 16.
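The steps above can be sketched in Java roughly as follows. This is a minimal sketch, not a tuned implementation: the class name KWayMergeSort and the helper merge are made up for illustration, the parameter order (K first, then the array) follows the question, and each heap entry is assumed to be a small int[] of {value, part, index}.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

final class KWayMergeSort {

    // Returns a sorted copy of input, produced by a k-way merge sort.
    static int[] kWayMergeSort(int k, int[] input) {
        if (k < 2) {
            throw new IllegalArgumentException("k must be at least 2");
        }
        int n = input.length;
        if (n < 2) {
            return Arrays.copyOf(input, n); // 0 or 1 element: already sorted
        }
        // Never create more parts than there are elements.
        int parts = Math.min(k, n);
        int[][] sorted = new int[parts][];
        for (int p = 0; p < parts; p++) {
            // Part boundaries; long arithmetic avoids overflow for large arrays.
            int from = (int) ((long) p * n / parts);
            int to = (int) ((long) (p + 1) * n / parts);
            sorted[p] = kWayMergeSort(k, Arrays.copyOfRange(input, from, to));
        }
        return merge(sorted);
    }

    // Merges sorted arrays with a min-heap holding one entry per part.
    private static int[] merge(int[][] parts) {
        int total = 0;
        for (int[] part : parts) {
            total += part.length;
        }
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        for (int p = 0; p < parts.length; p++) {
            if (parts[p].length > 0) {
                heap.add(new int[] {parts[p][0], p, 0});
            }
        }
        int[] out = new int[total];
        int pos = 0;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();          // smallest remaining element
            out[pos++] = top[0];
            int p = top[1];
            int next = top[2] + 1;
            if (next < parts[p].length) {     // push the successor from the same part
                heap.add(new int[] {parts[p][next], p, next});
            }
        }
        return out;
    }
}
```

Calling KWayMergeSort.kWayMergeSort(4, data) on a 16-element array follows the recursion levels described above. The input array is left untouched, which also covers the "return the sorted array" part of the question.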

Related

How to make Quicksort code stable without libraries? (Java)

I am currently trying to write a quicksort in Java. The only problem is that I just can't make it stable (so that the order of recurring elements stays the same). My code so far:
Update: Thank you for all your answers, but sadly I'm not allowed to use any libraries like java.util for LinkedLists etc.
public void quickSortStable(Integer[] data) {
int IndexZero = 0;
int IndexLength = data.length-1;
sortQuicksortArray(data, IndexZero, IndexLength);
}
public int createQuicksortPartition(Integer[] data, int IndexZero, int IndexLength){
int pivot = data[IndexLength];
int i = (IndexZero-1);
for (int j=IndexZero; j<IndexLength; j++)
{
if (data[j] < pivot)
{
i++;
int temp = data[i];
data[i] = data[j];
data[j] = temp;
}
}
int temp = data[i+1];
data[i+1] = data[IndexLength];
data[IndexLength] = temp;
return i+1;
}
public void sortQuicksortArray(Integer[] data, int IndexZero, int IndexLength){
if (IndexZero < IndexLength)
{
int partition = createQuicksortPartition(data, IndexZero, IndexLength);
sortQuicksortArray(data, IndexZero, partition-1);
sortQuicksortArray(data, partition+1, IndexLength);
}
}

The quicksort algorithm is not stable by nature.
There are already some good answers on quora.
In short, a single partition step is not stable, because quicksort may swap the outer elements before the middle ones.
For example,
// original
4(a) 4(b) 3 2(a) 2(b)
^ ^
// after first partition
2(b) 4(b) 3 2(a) 4(a)
^ ^
// after second partition
2(b) 2(a) 3 4(b) 4(a)
Since the partition is not stable, the overall algorithm cannot be stable.

You can make the quicksort sorting stable when you save the elements which are smaller or bigger than the pivot in two temporary lists (or one array) and add them back to the original input array when you have separated the values. The pseudo algorithm looks like this:
import java.util.ArrayList;
import java.util.List;

public int createQuicksortPartition(Integer[] data, int startIndex, int endIndex) {
    List<Integer> lower = new ArrayList<Integer>();
    List<Integer> higher = new ArrayList<Integer>();
    Integer pivot = data[endIndex];
    // split the range into two lists, keeping the original order in each
    for (int i = startIndex; i < endIndex; i++) {
        if (data[i] < pivot) {
            lower.add(data[i]);
        } else {
            // note: elements equal to the pivot also end up here (see below)
            higher.add(data[i]);
        }
    }
    // write them back to the input array
    for (int i = 0; i < lower.size(); i++) {
        data[startIndex + i] = lower.get(i);
    }
    data[startIndex + lower.size()] = pivot;
    // the higher elements start one slot after the pivot
    for (int i = 0; i < higher.size(); i++) {
        data[startIndex + lower.size() + 1 + i] = higher.get(i);
    }
    return startIndex + lower.size();
}
(untested pseudo code)
This obviously needs O(n) additional space to hold a copy of the data to sort. You also have to take extra care with the pivot element and the "higher" elements that are equal to the pivot. These have to be added before the pivot element to ensure that the sort is stable, because the pivot element was the last element of the input range. In this case the ordering should be:
Elements smaller than the pivot
Elements equal the pivot
The pivot itself
Elements greater than the pivot
You can solve this by using three lists for smaller, equal and greater values and add them back to the input array accordingly.
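Since the question rules out java.util, the three-list idea can also be sketched with plain temporary arrays. This is only an illustrative sketch: the class name StableQuicksort and its method names are invented here, and it assumes the array contains no null entries. The pivot is the last element, as in the question, so it ends up last within the "equal" group.

```java
final class StableQuicksort {

    // Convenience entry point: sorts the whole array in place.
    static void sort(Integer[] data) {
        sort(data, 0, data.length - 1);
    }

    // Stable quicksort on data[lo..hi] (both inclusive), using temporary arrays only.
    static void sort(Integer[] data, int lo, int hi) {
        if (lo >= hi) {
            return; // zero or one element
        }
        Integer pivot = data[hi];
        int n = hi - lo + 1;
        Integer[] smaller = new Integer[n];
        Integer[] equal = new Integer[n];
        Integer[] greater = new Integer[n];
        int s = 0, e = 0, g = 0;
        // One left-to-right pass keeps the original order inside each group.
        for (int i = lo; i <= hi; i++) {
            int cmp = data[i].compareTo(pivot);
            if (cmp < 0) {
                smaller[s++] = data[i];
            } else if (cmp == 0) {
                equal[e++] = data[i];   // the pivot itself is added last
            } else {
                greater[g++] = data[i];
            }
        }
        // Copy back: smaller, then equal (pivot last), then greater.
        int pos = lo;
        for (int i = 0; i < s; i++) data[pos++] = smaller[i];
        for (int i = 0; i < e; i++) data[pos++] = equal[i];
        for (int i = 0; i < g; i++) data[pos++] = greater[i];
        // The equal group is already in its final position.
        sort(data, lo, lo + s - 1);
        sort(data, lo + s + e, hi);
    }
}
```

The object references are moved rather than re-boxed, so two equal Integer objects keep their relative order. The price is the O(n) extra space per partition step mentioned above.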

How can I implement the recursion in my Quicksort algorithm?

I'm trying to implement quicksort in Java to learn basic algorithms. I understand how the algorithm works (and can do it on paper) but am finding it hard to write it in code. I've managed to do the step where we put all elements smaller than the pivot to the left and larger ones to the right (see my code below). However, I can't figure out how to implement the recursion part of the algorithm, i.e. sorting the left and right sides recursively. Any help, please?
public void int(A, p, q){
if(A.length == 0){ return; }
int pivot = A[q];
j = 0; k = 0;
for(int i = 0; i < A.length; i++){
if(A[i] <= pivot){
A[j] = A[i]; j++;
}
else{
A[k] = A[i]; k++;
}
}
A[j] = pivot;
}

Big disclaimer: I did not write this piece of code, so upvotes are not needed. But I link to a tutorial that explains quicksort in detail. It gave me a much-needed refresher on the algorithm as well! The example has very good comments that might help you wrap your head around it.
I suggest you adapt it to your code and write some tests to verify that it works.
Quicksort is a fast, recursive, non-stable sort algorithm which works by the divide and conquer principle. In the best case, quicksort divides the array into two almost equal parts. If the array contains n elements, then the first run needs O(n). Sorting the remaining two sub-arrays takes 2 * O(n/2). This ends up in a performance of O(n log n).
In the worst case, quicksort selects only one element in each iteration. So it is O(n) + O(n-1) + O(n-2) + ... + O(1), which is equal to O(n^2).
public class Quicksort {
private int[] numbers;
private int number;
public void sort(int[] values) {
// check for empty or null array
if (values ==null || values.length==0){
return;
}
this.numbers = values;
number = values.length;
quicksort(0, number - 1);
}
private void quicksort(int low, int high) {
int i = low, j = high;
// Get the pivot element from the middle of the list
int pivot = numbers[low + (high-low)/2];
// Divide into two lists
while (i <= j) {
// If the current value from the left list is smaller than the pivot
// element then get the next element from the left list
while (numbers[i] < pivot) {
i++;
}
// If the current value from the right list is larger than the pivot
// element then get the next element from the right list
while (numbers[j] > pivot) {
j--;
}
// If we have found a value in the left list which is larger than
// the pivot element and if we have found a value in the right list
// which is smaller than the pivot element then we exchange the
// values.
// As we are done we can increase i and j
if (i <= j) {
exchange(i, j);
i++;
j--;
}
}
// This is the recursion part you had trouble with i guess?
// Recursion
if (low < j)
quicksort(low, j);
if (i < high)
quicksort(i, high);
}
private void exchange(int i, int j) {
int temp = numbers[i];
numbers[i] = numbers[j];
numbers[j] = temp;
}
}
Link to tutorial

Making 2 Modifications to a Bubblesort Program

I have to make the following 2 modifications to a simple bubblesort program:
After the first pass, the largest number is guaranteed to be in the highest-numbered element of the array; after the second pass, the two highest numbers are “in place”; and so on. Instead of making nine comparisons on every pass, modify the bubble sort to make eight comparisons on the second pass, seven on the third, and so on.
The data in the array may already be in the proper order or near proper order, so why make nine passes if fewer will suffice? Modify the sort to check at the end of each pass if any swaps have been made. If none have been made, the data must already be in the proper order, so the program should terminate. If swaps have been made, at least one more pass is needed.
Any help as to how I should approach these would be greatly appreciated!
//sort elements of array with bubble sort
public static void bubbleSort (int array2[])
{
//loop to control number of passes
for (int pass = 1; pass < array2.length; pass++)
{
//loop to control number of comparisons
for (int element = 0; element < array2.length - 1; element++)
{
//compare side-by-side elements and swap them if
//first element is greater than second element
if (array2[element] > array2[element + 1]){
swap (array2, element, element + 1);
}
}
}
}
//swap two elements of an array
public static void swap (int array3[], int first, int second)
{
//temporary holding area for swap
int hold;
hold = array3[first];
array3[first] = array3[second];
array3[second] = hold;
}

I think this will do. A boolean flag records whether a pass made any swaps, and the pass counter (j) is subtracted from input.length so that each pass makes one comparison fewer.
public static int[] bubbleSort(int input[])
{
    int i, j, tmp;
    boolean changed;
    for (j = 0; j < input.length; j++)
    {
        changed = false;
        // the last j elements are already in place, so skip them
        for (i = 1; i < input.length - j; i++)
        {
            if (input[i-1] > input[i])
            {
                tmp = input[i];
                input[i] = input[i-1];
                input[i-1] = tmp;
                changed = true;
            }
        }
        // no swaps in this pass: the array is sorted
        if (!changed) return input;
    }
    return input;
}

Algorithm to partition the array using the pivot element

I was trying to solve the following programming exercise from a Java programming book:
Write a method that partitions the array using the first element, called a pivot. After the partition, the elements in the list are rearranged so that all the elements before the pivot are less than or equal to the pivot and the elements after the pivot are greater than the pivot. The method returns the index where the pivot is located in the new list. For example, suppose the list is {5, 2, 9, 3, 6, 8}. After the partition, the list becomes {3, 2, 5, 9, 6, 8}. Implement the method in a way that takes at most array.length comparisons.
I've implemented a solution, but it takes many more than array.length comparisons.
The book itself has a solution, but unfortunately it's just plain wrong (it doesn't work with some inputs). I've seen the answer to this similar question and understood the "conquer" part of the quicksort algorithm, but that algorithm partitions the values around the middle value, whereas in my case the first array value must be used as the pivot.

This is the pivot routine from the linked answer (adapted from source here).
int split(int a[], int lo, int hi) {
// pivot element x starting at lo; better strategies exist
int x=a[lo];
// partition
int i=lo, j=hi;
while (i<=j) {
while (a[i]<x) i++;
while (a[j]>x) j--;
if (i<=j) { int t=a[i]; a[i]=a[j]; a[j]=t; i++; j--; } // Java has no swap-by-reference
}
// return the split point: a[lo..i-1] <= x <= a[i..hi]
return i;
}
The number of inter-element comparisons in this algorithm is either n or n+1, because in each main loop iteration, i and j move closer together by exactly c units, where c is the number of comparisons performed in the inner while loops. Look at those inner loops: when their condition is true, i and j move closer by 1 unit. And when it is false, i and j move closer by 2 units at the end of the main loop because of the swap.
This split() is readable and short, but it also has a very bad worst case (namely, the pivot ending up at either end; follow the first link to see it worked out). This happens if the array is already sorted, either forwards or backwards, which is actually very frequent. That is why other pivot positions are better: if you choose x=a[(lo+hi)/2], the worst case will be less common. Even better is to do as Java does, and spend some time looking for a good pivot to steer clear of the worst case. If you follow the Java link, you will see a much more sophisticated pivot routine that avoids doing extra work when there are many duplicate elements.
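For the exercise as stated (first element as pivot, pivot index returned, at most array.length comparisons), a single left-to-right pass is enough. The following is a sketch under that reading of the exercise; the class name FirstElementPartition is made up for illustration, and it is not the book's solution.

```java
final class FirstElementPartition {

    // Partitions list around list[0] and returns the pivot's final index.
    // Performs exactly list.length - 1 element comparisons.
    static int partition(int[] list) {
        if (list.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        int pivot = list[0];
        int boundary = 0; // last index of the "<= pivot" region
        for (int i = 1; i < list.length; i++) {
            if (list[i] <= pivot) {
                // grow the "<= pivot" region by one and move list[i] into it
                boundary++;
                int tmp = list[boundary];
                list[boundary] = list[i];
                list[i] = tmp;
            }
        }
        // put the pivot between the two regions
        list[0] = list[boundary];
        list[boundary] = pivot;
        return boundary;
    }
}
```

For the list {5, 2, 9, 3, 6, 8} from the exercise, this returns 2 and leaves the array as {3, 2, 5, 9, 6, 8}.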

It seems that the algorithm (as taken from "Introduction to Algorithms", 3rd ed.) can be implemented as follows in C++; it should look similar with Java's generics:
template <typename T> void swap_in_place(T* arr, int a, int b)
{
T tmp = arr[a];
arr[a] = arr[b];
arr[b] = tmp;
}
template <typename T> int partition(T* arr, int l, int r)
{
T pivot = arr[r];
int i = l-1;
int j;
for(j=l; j < r; j++) {
if (arr[j] < pivot /* or cmp callback */) {
// preincrement is needed to move the element
swap_in_place<T>(arr, ++i, j);
}
}
// reposition the pivot
swap_in_place(arr, ++i, j);
return i;
}
template <typename T> void qsort(T* arr, int l, int r)
{
if ( l < r ) {
int x = partition<T>(arr, l, r); // partition returns an index, not an element
qsort(arr, l, x-1);
qsort(arr, x+1, r);
}
}
However, it's a simple pseudocode implementation; I don't know if it's the best pivot to pick. Maybe (l+r)/2 would be more appropriate.

Pretty simple solution with deque:
int [] arr = {3, 2, 5, 9, 6, 8};
Deque<Integer> q = new LinkedBlockingDeque<Integer>();
for (int t = 0; t < arr.length; t++) {
if (t == 0) {
q.add(arr[t]);
continue;
}
if (arr[t] <= arr[0])
q.addFirst(arr[t]);
else
q.addLast(arr[t]);
}
for (int t:q) {
System.out.println(t);
}
Output is:
2
3 <-- pivot
5
9
6
8

Here is a video I made on pivot-based partitioning, in which I explain both methods of partitioning.
https://www.youtube.com/watch?v=356Bffvh1dA
And one based on your (the other) approach:
https://www.youtube.com/watch?v=Hs29iYlY6Q4
As for the code, this is what I wrote for the pivot being the first element; it takes O(n) comparisons.
void quicksort(int a[],int l,int n)
{
int j,temp;
if(l+1 < n)
{
int p=l;
j=l+1;
for(int i=l+1;i<n;++i)
{
if(a[i]<a[p])
{
temp=a[i];
a[i]=a[j];
a[j]=temp;
j++;
}
}
temp=a[j-1];
a[j-1]=a[p];
a[p]=temp;
quicksort(a,l,j-1); // the pivot at j-1 is already in place
quicksort(a,j,n);
}
}

The partition function below works as follows:
The last variable points to the last element in the array that has not yet been compared to the pivot element and can be swapped.
If the element directly next to the pivot element is less than or equal to the pivot element, they are swapped.
Otherwise, if the pivot element is less than the next element, the next element is swapped with the element whose index is the last variable.
static int partition(int[] a){
int pivot = a[0];
int temp, index = 0;
int last = a.length -1;
for(int i = 1; i < a.length; i++){
//If pivot > current element, swap elements
if( a[i] <= pivot){
temp = a[i];
a[i] = pivot;
a[i-1] = temp;
index = i;
}
//If pivot < current elmt, swap current elmt and last > index of pivot
else if( a[i] > pivot && last > i){
temp = a[i];
a[i] = a[last];
a[last] = temp;
last -= 1;
i--;
}
else
break;
}
return index;
}

For N equally sized arrays with integers in ascending order, how can I select the numbers common to arrays?

I was asked an algorithmic question today in an interview, and I would love to get SO members' input on it. The question was as follows:
Given N equally sized arrays with integers in ascending order, how would you select the numbers common to all N arrays?
At first my thought was to iterate over the elements starting from the first array, trickling down to the rest of the arrays. But that would result in N power N iterations, if I am right. So then I came up with a solution that adds a count to a map, with the element as the key and the counter as the value. This way I believe the time complexity is just N. Following is the implementation in Java of my approach:
public static void main(String[] args) {
int[] arr1 = { 1, 4, 6, 8,11,15 };
int[] arr2 = { 3, 4, 6, 9, 10,16 };
int[] arr3 = { 1, 4, 6, 13,15,16 };
System.out.println(commonNumbers(arr1, arr2, arr3));
}
public static List<Integer> commonNumbers(int[] arr1, int[] arr2, int[] arr3) {
Map<Integer, Integer>countMap = new HashMap<Integer, Integer>();
for(int element:arr1)
{
countMap.put(element, 1);
}
for(int element:arr2)
{
if(countMap.containsKey(element))
{
countMap.put(element,countMap.get(element)+1);
}
}
for(int element:arr3)
{
if(countMap.containsKey(element))
{
countMap.put(element,countMap.get(element)+1);
}
}
List<Integer>toReturn = new LinkedList<Integer>();
for(int key:countMap.keySet())
{
int count = countMap.get(key);
if(count==3)toReturn.add(key);
}
return toReturn;
}
I just did this for three arrays to see how it will work. Question talks about N Arrays though i think this would still hold.
My question is, is there a better approach to solve this problem with time complexity in mind?

Treat as 3 queues. While values are different, "remove" (by incrementing the array index) the smallest. When they match, "remove" (and record) the matches.
int i1 = 0;
int i2 = 0;
int i3 = 0;
while (i1 < array1.length && i2 < array2.length && i3 < array3.length) {
    int next1 = array1[i1];
    int next2 = array2[i2];
    int next3 = array3[i3];
    if (next1 == next2 && next1 == next3) {
        recordMatch(next1);
        i1++;
        i2++;
        i3++;
    }
    // use <= so that a tie between two arrays still advances a smallest value
    else if (next1 <= next2 && next1 <= next3) {
        i1++;
    }
    else if (next2 <= next1 && next2 <= next3) {
        i2++;
    }
    else {
        i3++;
    }
}
Easily generalized to N arrays, though with N large you'd want to optimize the compares somehow (NPE's "heap").

I think this can be solved with a single parallel iteration over the N arrays, and an N-element min-heap. In the heap you would keep the current element from each of the N input arrays.
The idea is that at each step you'd advance along the array whose element is at the top of the heap (i.e. is the smallest).
You'll need to be able to detect when the heap consists entirely of identical values. This can be done in constant time as long as you keep track of the largest element you've added to the heap.
If each array contains M elements, the worst-case time complexity of the algorithm would be O(M*N*log(N)) and it would require O(N) memory.
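A rough Java sketch of this heap idea follows. The class name HeapIntersection, the int[][] input and the {value, arrayIndex, position} heap entries are assumptions made for illustration. Because every array is ascending, a running maximum of the values pushed so far is also the maximum of the values currently in the heap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class HeapIntersection {

    // Returns the values present in every ascending array, each reported once.
    static List<Integer> common(int[][] arrays) {
        List<Integer> result = new ArrayList<>();
        if (arrays.length == 0) {
            return result;
        }
        // Heap entries are {value, arrayIndex, position}, ordered by value.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        int max = Integer.MIN_VALUE;
        for (int a = 0; a < arrays.length; a++) {
            if (arrays[a].length == 0) {
                return result; // an empty array has nothing in common
            }
            heap.add(new int[] {arrays[a][0], a, 0});
            max = Math.max(max, arrays[a][0]);
        }
        while (true) {
            int[] top = heap.poll();
            // Smallest == largest means all N current elements are identical.
            boolean alreadyRecorded =
                    !result.isEmpty() && result.get(result.size() - 1) == max;
            if (top[0] == max && !alreadyRecorded) {
                result.add(max);
            }
            // Advance along the array that held the smallest element.
            int a = top[1];
            int next = top[2] + 1;
            if (next == arrays[a].length) {
                return result; // one array is exhausted: no more matches possible
            }
            heap.add(new int[] {arrays[a][next], a, next});
            max = Math.max(max, arrays[a][next]);
        }
    }
}
```

With the three arrays from the question, common(new int[][] {arr1, arr2, arr3}) yields [4, 6].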

try
public static Set<Integer> commonNumbers(int[] arr1, int[] arr2, int[] arr3) {
Set<Integer> s1 = createSet(arr1);
Set<Integer> s2 = createSet(arr2);
Set<Integer> s3 = createSet(arr3);
s1.retainAll(s2);
s1.retainAll(s3);
return s1;
}
private static Set<Integer> createSet(int[] arr) {
Set<Integer> s = new HashSet<Integer>();
for (int e : arr) {
s.add(e);
}
return s;
}

This is how I learned to do it in an algorithms class. Not sure if it's "better", but it uses less memory and less overhead because it iterates straight through the arrays instead of building a map first.
public static List<Integer> commonNumbers(int[] arr1, int[] arr2, int[] arr3, ... , int[] arrN) {
List<Integer>toReturn = new LinkedList<Integer>();
int len = arr1.length;
int j = 0, k = 0, ... , counterN = 0;
for (int i = 0; i < len; i++) {
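// check the bounds first so the array access never runs past the end
while (j < len && arr2[j] < arr1[i]) j++;
while (k < len && arr3[k] < arr1[i]) k++;
...
while (counterN < len && arrN[counterN] < arr1[i]) counterN++;
// once any array is exhausted, no further common values are possible
if (j == len || k == len || ... || counterN == len) break;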
if (arr1[i] == arr2[j] && arr2[j] == arr3[k] && ... && arr1[i] == arrN[counterN]) {
toReturn.add(arr1[i]);
}
}
return toReturn;
}

This can be solved in O(M * N), with M being the length of the arrays.
Let's see what happens for N = 2: this is the sorted-list intersection problem, which has a classic merge-like solution running in O(l1 + l2) time (l1 = length of the first array, l2 = length of the second array). (Find out more about Merge Algorithms.)
Now, let's repeat the algorithm N times in an inductive manner (i.e. at the i-th step we have the i-th array and the intersection result of the previous step). This results in an overall O(M * N) algorithm.
You may also observe that this worst-case upper bound is the best achievable, since every number must be examined by any valid algorithm. So no deterministic algorithm with a tighter upper bound can be found.
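The inductive scheme above might look like this in Java. It is a sketch only: PairwiseIntersection, intersect and common are illustrative names, and the arrays are assumed to be passed as an int[][].

```java
import java.util.Arrays;

final class PairwiseIntersection {

    // Merge-style intersection of two ascending arrays in O(l1 + l2) time.
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                // record each common value once, even if the inputs repeat it
                if (n == 0 || out[n - 1] != a[i]) {
                    out[n++] = a[i];
                }
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Intersects the running result with each further array in turn.
    static int[] common(int[][] arrays) {
        if (arrays.length == 0) {
            return new int[0];
        }
        int[] result = arrays[0].clone();
        for (int k = 1; k < arrays.length && result.length > 0; k++) {
            result = intersect(result, arrays[k]);
        }
        return result;
    }
}
```

The running result can only shrink, so each step costs at most O(M), which gives the O(M * N) total. The loop also stops early once the intersection becomes empty.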

Okay, maybe a bit naive here, but I think the clue is that the arrays are in ascending order. My Java is rusty, but here is some pseudocode. I haven't tested it, so it's probably not perfect, but it should be a fast way to do this:
I = 1
J = 1
K = 1
While I <= Array1Count and J <= Array2Count and K <= Array3Count
If Array1(I) = Array2(J)
If Array1(I) = Array3(K)
=== Found Match
I++
J++
K++
else
if Array1(I) < Array3(K)
I++
else
K++
end if
end if
else
If Array1(I) < Array2(J)
I++
else
if Array2(J) < Array3(K)
J++
else
K++
end if
end if
end if
Wend
This is Option Base 1 - you'd have to recode to do option base 0 (like java and other languages have)

I think another approach is to do something similar to what we do in merge sort: walk through all the arrays at the same time, picking out identical numbers. This takes advantage of the fact that the arrays are in sorted order, and uses no additional space other than the output list. If you just need to print the common numbers, no extra space is used.
import java.util.ArrayList;
import java.util.List;

public static List<Integer> commonNumbers(int[] arrA, int[] arrB, int[] arrC) {
    List<Integer> common = new ArrayList<Integer>();
    int[] idx = {0, 0, 0};
    while (idx[0] < arrA.length && idx[1] < arrB.length && idx[2] < arrC.length) {
        if (arrA[idx[0]] == arrB[idx[1]] && arrB[idx[1]] == arrC[idx[2]]) {
            // Same number
            System.out.printf("Common number %d\n", arrA[idx[0]]);
            common.add(arrA[idx[0]]);
            for (int i = 0; i < 3; i++)
                idx[i]++;
        } else {
            // Increase the index of the lowest number
            int idxLowest = 0; int nLowest = arrA[idx[0]];
            if (arrB[idx[1]] < nLowest) {
                idxLowest = 1;
                nLowest = arrB[idx[1]];
            }
            if (arrC[idx[2]] < nLowest) {
                idxLowest = 2;
            }
            idx[idxLowest]++;
        }
    }
    return common;
}
To make this more general, you may want to take an array of arrays of ints, which will make the code prettier. The array indices must be stored in an array; otherwise it is hard to write the "increment the index that points to the lowest number" code.

public static List<Integer> getCommon(List<List<Integer>> list){
Map<Integer, Integer> map = new HashMap<Integer, Integer>();
int c=0;
for (List<Integer> is : list) {
c++;
for (int i : is) {
if(map.containsKey(i)){
map.put(i, map.get(i)+1);
}else{
map.put(i, 1);
}
}
}
List<Integer>toReturn = new LinkedList<Integer>();
for(int key:map.keySet())
{
int count = map.get(key);
if(count==c)toReturn.add(key);
}
return toReturn;
}

Your solution is acceptable, but it uses NxM space. You can do it with O(N) space (where N is the number of arrays), or in O(1) space.
Solution #1 (By Luigi Mendoza)
Assuming there are many small arrays (M << N), this can be useful, resulting in O(M*N*Log M) time, and constant space (excluding the output list).
Solution #2
Scan the arrays in ascending order, maintaining a min-heap of size N, containing the latest visited values (and indices) of the arrays. Whenever the heap contains N copies of the same value, add the value to the output collection. Otherwise, remove the min value and advance with the corresponding list.
The time complexity of this solution is O(M*N*Log N)
