Hoare partitioning algorithm for duplicate pivot value - java

The following is the Hoare partitioning algorithm as given on Wikipedia.
Pseudocode from Wikipedia:
algorithm partition(A, lo, hi) is
// Pivot value
pivot := A[ floor((hi + lo) / 2) ] // The value in the middle of the array
// Left index
i := lo - 1
// Right index
j := hi + 1
loop forever
// Move the left index to the right at least once and while the element at
// the left index is less than the pivot
do i := i + 1 while A[i] < pivot
// Move the right index to the left at least once and while the element at
// the right index is greater than the pivot
do j := j - 1 while A[j] > pivot
// If the indices crossed, return
if i ≥ j then return j
// Swap the elements at the left and right indices
swap A[i] with A[j]
A Java implementation:
public class Main
{
public static void main(String[] args) {
int[] arr = new int[] { 2, 1, 2, 4, 3 };
Hoare.partition(arr, 0, 4);
for (int x : arr) System.out.println(x);
}
}
class Hoare
{
private static void Swap(int[] array, int i, int j)
{
int temp = array[i];
array[i] = array[j];
array[j] = temp;
}
public static int partition(int []arr, int low, int high)
{
int pivot = arr[(low + high) / 2]; // pivot is 2 for this case.
// Expected output is:
// 1 2 2 3 4
// or
// 1 2 2 4 3
//
// Actual output is:
// 2 1 2 4 3
// Since 3 and 4 are greater than 2, then the partitioning isn't working.
int i = low - 1;
int j = high + 1;
while(true)
{
do {
i++;
}
while (arr[i] < pivot);
do {
j--;
}
while (arr[j] > pivot);
if (i >= j)
{
return j;
}
Swap(arr, i, j);
}
}
}
Why is the output wrong (as indicated in the code comments)? Is that a known limitation of the Hoare algorithm? Is my implementation incorrect, or is Wikipedia's pseudocode?

The Hoare algorithm guarantees that all elements at or before the returned index are less than or equal to the pivot value, and all elements after it are greater than or equal to the pivot value.
It does not guarantee that the pivot value ends up at its final sorted index. Quicksort does not need that, as long as it recurses on both parts, including the element at the returned index.
The Hoare partition does not guarantee that all elements equal to the pivot value are consecutive. That is not a requirement of the quicksort algorithm, so there is no point spending any additional computational power guaranteeing it.
In other words, the implementation is correct; the result is expected; and it will work in a quicksort without problems.
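As a minimal sketch of that last point, the partition from the question can be dropped into a quicksort that recurses on [low..p] and [p+1..high]. The class name HoareQuickSortDemo and the quickSort method are illustrative and not part of the question's code.

```java
import java.util.Arrays;

class HoareQuickSortDemo {
    // Hoare partition as in the question: returns an index j such that
    // every element of arr[low..j] is <= pivot and every element of
    // arr[j+1..high] is >= pivot.
    static int partition(int[] arr, int low, int high) {
        int pivot = arr[low + (high - low) / 2];
        int i = low - 1;
        int j = high + 1;
        while (true) {
            do { i++; } while (arr[i] < pivot);
            do { j--; } while (arr[j] > pivot);
            if (i >= j) return j;
            int t = arr[i]; arr[i] = arr[j]; arr[j] = t;
        }
    }

    // Recurse on both parts. The element at index p stays in the left
    // part, because Hoare's scheme does not fix the pivot's position.
    static void quickSort(int[] arr, int low, int high) {
        if (low >= high) return;
        int p = partition(arr, low, high);
        quickSort(arr, low, p);
        quickSort(arr, p + 1, high);
    }

    public static void main(String[] args) {
        int[] arr = { 2, 1, 2, 4, 3 };
        int p = partition(arr, 0, arr.length - 1);
        System.out.println("split index " + p + ": " + Arrays.toString(arr));
        quickSort(arr, 0, arr.length - 1);
        System.out.println("sorted: " + Arrays.toString(arr));
    }
}
```

After the single partition call the array still reads 2 1 2 4 3, which is a valid split into {2, 1} and {2, 4, 3}; the recursive calls then finish the sort.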

Related

quickSort algorithm using median of medians as a pivot for partitioning

There is a lot of information on Stack Overflow, but I couldn't quite work it out the way I need it.
I'm trying to implement a quickselect algorithm with median of medians, but I can't quite make it work.
The algorithm is supposed to find the i-th smallest element in an input array of n > 1 elements using these steps:
1. If n == 1, return the element in the array.
2. Divide the n elements of the array into groups of "groupSize" elements, plus one more group of at most "groupSize" - 1 elements. The original algorithm uses groups of 5, but I want it to be modular.
3. Find the median of each of the ceiling(n / groupSize) groups by using insertion sort and picking the median of the sorted sub-array.
4. Call Select recursively to find the median "x" of the ceiling(n / groupSize) medians found in step 3. If the number of medians is even, "x" is the lower median.
5. This is the part I found trickiest: divide the input array around the median of medians "x" using partition (possibly Hoare partition). Let k be the number of elements in the lower area of the partition, so that "x" is the k-th smallest element and the upper area of the partition holds n - k elements.
6. If i = k, return "x". If i < k, call Select recursively to find the i-th smallest element in the lower sub-array. If i > k, call Select to find the (i - k)-th smallest element in the upper sub-array.
I don't exactly know how to carry out step 5, and I feel this is the part that ties everything together.
This is my Select:
if(right-left+1==1) return array[left];
// full steps are the count of full groups in size of groupSize(left of decimal point)
int fullSteps = array.length/groupSize;
int semiSteps = (int)((double)(array.length)%groupSize);
int i;
//array of medians is the size of number of groups needed
int medianArraySize = (int) Math.ceil (((double)(array.length))/groupSize);
int [] medianArray = new int [medianArraySize];
print(medianArray);
// sort the entire array by cutting it to chunks of defined size
for(i=0;i<medianArraySize;i++){
insertionSort(array, i*groupSize,i*groupSize+groupSize);
//place the median of sorted sub-array into median array in place i
medianArray[i]=findMidean(array, i*groupSize, i*groupSize+groupSize);
//implement insertion sort on full groups to find median - ceilling value
}
int medianOfMedians = select(medianArray,0,medianArraySize-1,groupSize,medianArray[(medianArraySize-1)/2]);
//find median of medians without using recursion
// int medianOfMedians = medianArray[(medianArraySize-1)/2];
// System.out.println("median of medians is :" +medianOfMedians);
int pivot = hoarePartition(array,left,right,medianOfMedians);
int k = left-pivot +1;
if (pivot==k-1) return array[pivot];
else
if (pivot<k-1) return select(array,pivot +1,right,groupSize,ithSmallest);
else return select(array,left,pivot-1,groupSize,ithSmallest-k);
}
helper function - find median
/**
 * Returns the median value for the designated range in the given array.
 */
private static int findMidean(int [] array, double start,int end){
int stopper = Math.min(array.length,end);
int index =(int)(Math.ceil(start+stopper-1)/2.0);
return array[index];
}
partition algorithm
public static int hoarePartition(int[] arr, int low, int high,int pivot)
{
int i = low, j = high;
while (true) {
// Find leftmost element greater
// than or equal to pivot
while (i<j && arr[i] < pivot){
i++;
}
// Find rightmost element smaller
// than or equal to pivot
while (j>i && arr[j] > pivot){
j--;
}
// If two pointers met.
if (i >= j)
return j;
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
// swap(arr[i], arr[j]);
}
}
and insertion sort
public static void insertionSort(int [] arr,int placeHolder,int end)
{
int stopper = Math.min(arr.length,end);
for (int i = placeHolder; i < stopper; ++i) {
int key = arr[i];
int j = i - 1;
/* Move elements of arr[0..i-1], that are
greater than key, to one position ahead
of their current position */
while (j >= 0 && arr[j] > key) {
arr[j + 1] = arr[j];
j = j - 1;
}
arr[j + 1] = key;
}
}
If Hoare partition scheme is used, then the pivot and all elements equal to the pivot can end up anywhere. This means the case of i == k can't be used, and instead quickselect will have to recursively call itself until the base case of a single element is reached.
If Lomuto partition scheme is used, then the pivot is put in place and the index of the pivot is returned, so the i == k case can be used.
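A minimal sketch of the Lomuto variant, to show where the early return comes from. It uses the last element as the pivot instead of the median of medians, and k is a 0-based rank; both choices, and the class name LomutoQuickSelect, are assumptions made for illustration.

```java
import java.util.Arrays;

class LomutoQuickSelect {
    // Lomuto partition: uses arr[high] as the pivot, moves it to its
    // final sorted position, and returns that position.
    static int partition(int[] arr, int low, int high) {
        int pivot = arr[high];
        int i = low;
        for (int j = low; j < high; j++) {
            if (arr[j] < pivot) {
                int t = arr[i]; arr[i] = arr[j]; arr[j] = t;
                i++;
            }
        }
        int t = arr[i]; arr[i] = arr[high]; arr[high] = t;
        return i;
    }

    // Returns the element that would sit at index k (0-based) if
    // arr[low..high] were sorted. Rearranges arr as a side effect.
    static int select(int[] arr, int low, int high, int k) {
        while (low < high) {
            int p = partition(arr, low, high);
            if (k == p) return arr[p];   // pivot is in place, so stop early
            if (k < p) high = p - 1; else low = p + 1;
        }
        return arr[low];
    }

    public static void main(String[] args) {
        int[] arr = { 9, 2, 4, 7, 3, 7, 10 };
        int[] sorted = arr.clone();
        Arrays.sort(sorted);
        int k = 3;
        System.out.println(select(arr, 0, arr.length - 1, k) + " vs " + sorted[k]);
    }
}
```

With a Hoare partition the k == p test would have to go, and the loop would only stop once low == high.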

Can someone explain this quicksort algorithm to me?

I'm a little confused on quicksort.
For example, with this algorithm taken from programcreek.com using the middle element as the pivot point:
public class QuickSort {
public static void main(String[] args) {
int[] x = { 9, 2, 4, 7, 3, 7, 10 };
System.out.println(Arrays.toString(x));
int low = 0;
int high = x.length - 1;
quickSort(x, low, high);
System.out.println(Arrays.toString(x));
}
public static void quickSort(int[] arr, int low, int high) {
if (arr == null || arr.length == 0)
return;
if (low >= high)
return;
// pick the pivot
int middle = low + (high - low) / 2;
int pivot = arr[middle];
// make left < pivot and right > pivot
int i = low, j = high;
while (i <= j) {
while (arr[i] < pivot) {
i++;
}
while (arr[j] > pivot) {
j--;
}
if (i <= j) {
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
i++;
j--;
}
}
// recursively sort two sub parts
if (low < j)
quickSort(arr, low, j);
if (high > i)
quickSort(arr, i, high);
}
}
Can someone explain the 2 recursive calls at the bottom, as well as why there is a need to create an i and j variable to copy the left and right markers.
Also, can someone explain the difference between a quicksort algorithm using the middle element vs using the first or last element as the pivot point? The code looks different in a sense that using the last / first element as the pivot point is usually written with a partition method instead of the code above.
Thanks!
Quicksort is based on the divide-and-conquer method. First we take a pivot element and put all elements that are less than the pivot on its left and all elements that are greater than the pivot on its right. After that, we recursively do the same thing on both sides of the pivot: Quicksort(array, low, pivot-1) for the left side and Quicksort(array, pivot+1, high) for the right side.
That answers your first question.
Now, what is the difference between choosing the middle or the first element as the pivot?
When we choose the first element as the pivot, then once i becomes greater than j we swap the pivot element (the first element) with the element at j, so that the pivot lands at the place where all elements less than it are on its left and all elements greater than it are on its right.
When we choose the middle element as the pivot, it is already in the middle, so there is no need to swap it.
This is a variation of the Hoare partition scheme. The "classic" Hoare partition scheme increments i and decrements j before comparing to the pivot. Both this example and the question's example include the partition logic in the main function.
void quickSort(int a[], size_t lo, size_t hi)
{
int pivot = a[lo+(hi-lo)/2];
int t;
if(lo >= hi)
return;
size_t i = lo-1;
size_t j = hi+1;
while(1)
{
while (a[++i] < pivot);
while (a[--j] > pivot);
if (i >= j)
break;
t = a[i];
a[i] = a[j];
a[j] = t;
}
quickSort(a, lo, j);
quickSort(a, j+1, hi);
}
The question's code increments i and decrements j after comparing to the pivot.
The partition logic splits up a partition so that the left side <= pivot and the right side >= pivot. The pivot and elements equal to the pivot can end up anywhere on either side, and may not end up in their sorted position until a base case of a sub-array of size 1 is reached.
The reason for using the middle element as the pivot is that choosing the first or last element as the pivot results in the worst-case time complexity of O(n^2) if the array is already sorted or reverse sorted. The example in this answer will fail if the last element is used as the pivot (but the question's example will not).
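A rough way to see the effect of the pivot choice is to measure recursion depth on already sorted input. The sketch below reuses the classic Hoare scheme from this answer in Java; the class PivotDepthDemo, the useMiddle flag, and the depth counter are illustrative additions.

```java
class PivotDepthDemo {
    // Classic Hoare partition; the caller decides which index supplies
    // the pivot value.
    static int partition(int[] a, int lo, int hi, int pivotIndex) {
        int pivot = a[pivotIndex];
        int i = lo - 1;
        int j = hi + 1;
        while (true) {
            do { i++; } while (a[i] < pivot);
            do { j--; } while (a[j] > pivot);
            if (i >= j) return j;
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    }

    // Sorts a[lo..hi] and returns the deepest recursion level reached.
    static int sort(int[] a, int lo, int hi, boolean useMiddle, int depth) {
        if (lo >= hi) return depth;
        int pivotIndex = useMiddle ? lo + (hi - lo) / 2 : lo;
        int p = partition(a, lo, hi, pivotIndex);
        int left = sort(a, lo, p, useMiddle, depth + 1);
        int right = sort(a, p + 1, hi, useMiddle, depth + 1);
        return Math.max(left, right);
    }

    // Builds the sorted array 0..n-1 and reports the recursion depth.
    static int maxDepthOnSorted(int n, boolean useMiddle) {
        int[] a = new int[n];
        for (int k = 0; k < n; k++) a[k] = k;
        return sort(a, 0, n - 1, useMiddle, 1);
    }

    public static void main(String[] args) {
        System.out.println("first element as pivot:  depth " + maxDepthOnSorted(1000, false));
        System.out.println("middle element as pivot: depth " + maxDepthOnSorted(1000, true));
    }
}
```

With the first element as the pivot, every partition of sorted input peels off a single element, so the depth grows linearly with n; with the middle element the ranges are roughly halved each time.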

Mergesort doesn't sort my array correctly

I'm trying to sort this array:
[5 1 3 2 2 9 1]
I've run the debugger and have been getting strange results. I put a breakpoint on the line
merge(left, right);
and after the first pass these are the sub-arrays:
left = [1 3 5 2 2]
right = [9]
so I'm pretty sure the 'divide' part is where I'm going wrong. Can somebody have a look at my code and see if they can spot the error?
public int[] arrayToSort = { 5, 1, 3, 2, 2, 9, 1 };
...
private void mergeSort(int[] array, int low, int high) {
// Base case
if (!(high - low <= 1)) {
int mid = low + (high - low) / 2;
// Split array into sub-arrays
int[] left = new int[mid];
int[] right = new int[high - mid];
// Fill sub-arrays
for (int i = low; i < mid; i++) {
left[i - low] = array[i];
}
for (int i = mid; i < high; i++) {
right[i - mid] = array[i];
}
mergeSort(left, low, mid);
mergeSort(right, mid + 1, high);
merge(left, right);
} else {
return;
}
}
private void merge(int[] left, int[] right) {
int lLength = left.length;
int rLength = right.length;
// Pointer for left array
int i = 0;
// Pointer for right array
int j = 0;
// Pointer for merged array
int k = 0;
while (i < lLength && j < rLength) {
if (left[i] < right[j]) {
arrayToSort[k] = left[i];
i++;
} else {
arrayToSort[k] = right[j];
j++;
}
k++;
}
while (i < lLength) {
arrayToSort[k] = left[i];
i++;
k++;
}
while (j < rLength) {
arrayToSort[k] = right[j];
j++;
k++;
}
}
Appreciate any help!
edit: Noticed a mistake in my code (a parameter wasn't getting used) and modified it. Here is the error I get now (from the above code):
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 4
at Launcher.mergeSort(Launcher.java:43)
at Launcher.mergeSort(Launcher.java:49)
at Launcher.main(Launcher.java:21)
So it complains at the line
left[i - low] = array[i];
Just looking at the code, the second recursive call should be:
mergeSort(right, mid, high);
More space than is needed is allocated for left:
int[] left = new int[mid-low];
Merge() function needs a low parameter to know where to start the merge at (k = low, not k = 0).
I wonder why bottom up merge sort is rarely taught these days, as it's simpler to understand.
Can you spot the problem in this snippet? ->
int mid = low + (high - low) / 2;
int[] left = new int[mid];
int[] right = new int[high - mid];
// ...
mergeSort(left, low, mid);
mergeSort(right, mid + 1, high);
What about the size of left and right, and the indexes passed to the recursive call to mergeSort?
The indexes you pass, are indexes in the received array, not in left and right.
The problem is most obvious when you look at right: the index high will inevitably be out of bounds.
The above would make more sense like this:
mergeSort(left, 0, left.length);
mergeSort(right, 0, right.length);
Now the indexes are correct. They are also pointless, because now they are simply the full range of the sub-arrays, and as such not needed at all. This weirdness is just a symptom of a deeper problem.
Keep in mind that merge must eventually update the original array. For that it needs the reference to the original array, and the left and right ranges. You are passing to merge the left and right sub-arrays. The reference to the original array and the indexes of the ranges are lost in your implementation.
Consider this:
int mid = low + (high - low) / 2;
mergeSort(array, low, mid);
mergeSort(array, mid + 1, high);
merge(array, low, mid, high);
Notice that:
the original array reference is preserved, as it's passed down to recursive calls
the indexes are valid indexes in the original array
there are no more sub-arrays here
Change your implementation of merge accordingly, so that it merges the specified ranges in the array and modifies that array. An easy way to do it is to merge the ranges into a work array as large as the range being merged, and then copy the merged work array back into that range of the original.
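One possible shape for this, as a sketch rather than the only way to do it. It assumes half-open ranges (high is exclusive, as in the question's base case), so the second recursive call starts at mid rather than mid + 1. The class name RangeMergeSort is made up for the example.

```java
import java.util.Arrays;

class RangeMergeSort {
    // Sorts array[low..high), where high is exclusive (assumed convention).
    static void mergeSort(int[] array, int low, int high) {
        if (high - low <= 1) return;   // zero or one element: already sorted
        int mid = low + (high - low) / 2;
        mergeSort(array, low, mid);
        mergeSort(array, mid, high);
        merge(array, low, mid, high);
    }

    // Merges the sorted ranges array[low..mid) and array[mid..high)
    // through a work array, then copies the result back in place.
    static void merge(int[] array, int low, int mid, int high) {
        int[] work = new int[high - low];
        int i = low, j = mid, k = 0;
        while (i < mid && j < high) {
            work[k++] = (array[i] <= array[j]) ? array[i++] : array[j++];
        }
        while (i < mid) work[k++] = array[i++];
        while (j < high) work[k++] = array[j++];
        System.arraycopy(work, 0, array, low, work.length);
    }

    public static void main(String[] args) {
        int[] arrayToSort = { 5, 1, 3, 2, 2, 9, 1 };
        mergeSort(arrayToSort, 0, arrayToSort.length);
        System.out.println(Arrays.toString(arrayToSort));
    }
}
```

With inclusive bounds instead, the recursive calls would be (low, mid) and (mid + 1, high), and the work array would need high - low + 1 slots.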

Algorithm to partition the array using the pivot element

I was trying to solve the following programming exercise from a Java programming book:
Write method that partitions the array using the first element, called a pivot. After the partition, the elements in the list are rearranged so that all the elements before the pivot are less than or equal to the pivot and the elements after the pivot are greater than the pivot. The method returns the index where the pivot is located in the new list. For example, suppose the list is {5, 2, 9, 3, 6, 8}. After the partition, the list becomes {3, 2, 5, 9, 6, 8}. Implement the method in a way that takes at most array.length comparisons.
I've implemented a solution, but it takes many more than array.length comparisons.
The book itself has a solution, but unfortunately it's just plain wrong (it doesn't work with some inputs). I've seen the answer to this similar question and understood the "conquer" part of the quicksort algorithm, but in that algorithm the values are partitioned using the middle value, whereas in my case the first array value must be used as the pivot.
This is the pivot routine from the linked answer (adapted from source here).
int split(int a[], int lo, int hi) {
// pivot element x starting at lo; better strategies exist
int x=a[lo];
// partition
int i=lo, j=hi;
while (i<=j) {
while (a[i]<x) i++;
while (a[j]>x) j--;
if (i<=j) swap(a[i++], a[j--]);
}
// return new position of pivot
return i;
}
The number of inter-element comparisons in this algorithm is n plus a small constant, because in each main loop iteration, i and j move closer together by exactly c units, where c is the number of comparisons performed in the inner while loops. Look at those inner loops: each time a condition is true, i and j move closer by 1 unit. The two comparisons that come out false are made up for at the end of the main loop iteration, where i and j move closer by 2 units because of the swap (the final iteration, which may end without a swap, accounts for the constant).
This split() is readable and short, but it also has a very bad worst case (namely, the pivot ending at either end; follow the first link to see it worked out). This will happen if the array is already sorted either forwards or backwards, which is actually very frequent. That is why other pivot positions are better: if you choose x=a[(lo+hi)/2], the worst case will be less common. Even better is to do as Java does, and spend some time looking for a good pivot to steer clear of the worst case. If you follow the Java link, you will see a much more sophisticated pivot routine that avoids doing extra work when there are many duplicate elements.
It seems that the algorithm (as taken from "Introduction to Algorithms", 3rd ed.) can be implemented as follows (in C++; it should be similar with Java's generics):
template <typename T> void swap_in_place(T* arr, int a, int b)
{
T tmp = arr[a];
arr[a] = arr[b];
arr[b] = tmp;
}
template <typename T> int partition(T* arr, int l, int r)
{
T pivot = arr[r];
int i = l-1;
int j;
for(j=l; j < r; j++) {
if (arr[j] < pivot /* or cmp callback */) {
// preincrement is needed to move the element
swap_in_place<T>(arr, ++i, j);
}
}
// reposition the pivot
swap_in_place(arr, ++i, j);
return i;
}
template <typename T> void qsort(T* arr, int l, int r)
{
if ( l < r ) {
int x = partition<T>(arr, l, r); // partition returns an index, not an element
qsort(arr, l, x-1);
qsort(arr, x+1, r);
}
}
However, it's a simple pseudocode implementation; I don't know if it's the best pivot to pick. Maybe (l+r)/2 would be more suitable.
Pretty simple solution with deque:
int [] arr = {3, 2, 5, 9, 6, 8};
Deque<Integer> q = new LinkedBlockingDeque<Integer>();
for (int t = 0; t < arr.length; t++) {
if (t == 0) {
q.add(arr[t]);
continue;
}
if (arr[t] <= arr[0])
q.addFirst(arr[t]);
else
q.addLast(arr[t]);
}
for (int t:q) {
System.out.println(t);
}
Output is:
2
3 <-- pivot
5
9
6
8
There is a video that I made on pivot-based partitioning, in which I explain both methods of partitioning.
https://www.youtube.com/watch?v=356Bffvh1dA
And based on your (the other) approach:
https://www.youtube.com/watch?v=Hs29iYlY6Q4
As for the code, this is what I wrote with the first element as the pivot; it takes O(n) comparisons.
void quicksort(int a[],int l,int n)
{
int j,temp;
if(l+1 < n)
{
int p=l;
j=l+1;
for(int i=l+1;i<n;++i)
{
if(a[i]<a[p])
{
temp=a[i];
a[i]=a[j];
a[j]=temp;
j++;
}
}
temp=a[j-1];
a[j-1]=a[p];
a[p]=temp;
quicksort(a,l,j);
quicksort(a,j,n);
}
}
The partition function below works as follows:
The last variable points to the last element in the array that has not yet been compared to the pivot element and can be swapped.
If the element directly next to the pivot element is less than or equal to the pivot element, they are swapped.
Otherwise, if the pivot element is less than the next element, the next element is swapped with the element whose index is the last variable.
static int partition(int[] a){
int pivot = a[0];
int temp, index = 0;
int last = a.length -1;
for(int i = 1; i < a.length; i++){
//If pivot > current element, swap elements
if( a[i] <= pivot){
temp = a[i];
a[i] = pivot;
a[i-1] = temp;
index = i;
}
//If pivot < current elmt, swap current elmt and last > index of pivot
else if( a[i] > pivot && last > i){
temp = a[i];
a[i] = a[last];
a[last] = temp;
last -= 1;
i--;
}
else
break;
}
return index;
}

Considering the first element as the pivot,why won't index i exceed the bound of array?

Inside the two while loops of the partition method, why does it seem, at first sight, that nothing checks whether index i exceeds the bounds of the array? [This is the correct code from Big Java; I've already tested it. It's just the index handling that confuses me.]
public void sort(int from, int to)
{
if (from >= to) return;
int p = partition(from, to);
sort(from, p);
sort(p + 1, to);
}
private int partition(int from, int to)
{
int pivot = a[from];
int i = from - 1;
int j = to + 1;
while (i < j)
{
i++; while (a[i] < pivot) i++;//here
j--; while (a[j] > pivot) j--;//here
if (i < j) swap(i, j);
}
return j;
}
Since the pivot is chosen from the same array and due to how the logic of the algorithm is implemented you never need to check for the indices to go out of bounds. At some point of the execution the conditions must turn true.
The correctness of the algorithm can be proved using loop invariants.
1. private int partition(int from, int to)
2. {
3. int pivot = a[from];
4. int i = from - 1;
5. int j = to + 1;
6. while (i < j)
7. {
8. i++;
9. // at least one of a[i]...a[to] is greater than or equal to pivot
10. while (a[i] < pivot) i++;
11. j--;
12. // at least one of a[from]...a[j] is less than or equal to pivot
13. while (a[j] > pivot) j--;//here
14. if (i < j) swap(i, j);
15. // if i < j then at least one of a[i + 1]...a[to] is greater than or equal to pivot
16. // if i < j then at least one of a[from]...a[j - 1] is less than or equal to pivot
17. }
18. return j;
19. }
Lines 9 and 12 (and 15, 16) contain the invariants that hold true for every iteration of the loop 6 to 17. From these invariants it is clear that i and j indices can never go out of array bounds.
We prove only the invariant on line 9; the invariant on line 12 can be proved analogously.
For the 1st iteration it is true because the pivot is chosen as a[from] and i = from.
At the end of every iteration (including the 1st iteration) we move the element at position i that is greater than or equal to pivot to position j. Because i < j then the invariant on line 15 holds true. On the next iteration after incrementing i on line 8 the invariant 9 becomes valid which follows directly from the invariant 15. By induction we can conclude that the invariant 9 is valid on every iteration of the loop 6 to 17.
If we chose pivot as last element of array i.e. a[to] the invariants would still hold true. However we would need to change the flow in the sort method.
sort(from, p == to ? p - 1 : p);
sort(p + 1, to);
instead of
sort(from, p);
sort(p + 1, to);
In the first iteration both indices cannot pass the pivot element, since i < pivotIndex < j. Therefore you cannot pass the bounds in the first iteration (provided indices are in valid range and from <= to; also indices are within range after the increment/decrement statements before the loops).
In all iterations after the first the indices cannot become smaller than from or larger than to, since i < j and the swap-call in last loop iteration placed an element that makes the respective loop conditions false at indices i and j respectively: For the element at position j a[j] > pivot was false, but that element was moved to position i < j and for the element at position i a[i] < pivot was false, but that element was moved to position j > i.
In your main partition loop you can see that i and j start at each end of the array and work towards the pivot while the element at that location is < or > the pivot. They both must stop at the pivot chosen so they will never escape the array.
int[] a;
private void sort() {
sort(0, a.length - 1);
}
public void sort(int from, int to) {
if (from >= to) {
return;
}
int p = partition(from, to);
sort(from, p);
sort(p + 1, to);
}
private int partition(int from, int to) {
int pivot = a[from];
int i = from - 1;
int j = to + 1;
while (i < j) {
i++;
while (a[i] < pivot) {
i++;
}
j--;
while (a[j] > pivot) {
j--;
}
if (i < j) {
swap(i, j);
}
}
return j;
}
private void swap(int i, int j) {
int t = a[i];
a[i] = a[j];
a[j] = t;
}
public void test() {
System.out.println("Hello");
a = new int[]{10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0};
sort();
System.out.println(Arrays.toString(a));
}
Note that using the first element (a[from] here) as the pivot merely degrades performance; the algorithm still sorts the array correctly.
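To check the bounds argument empirically, one option is to instrument the same partition so that it fails loudly before any out-of-range read. This is only a sketch: the check helper, the runTrials method, and the class name PartitionBoundsCheck are hypothetical names added for the experiment, and a passing run is evidence rather than a proof.

```java
import java.util.Random;

class PartitionBoundsCheck {
    // Throws before an index outside [from, to] would be used for a read.
    static void check(int index, int from, int to) {
        if (index < from || index > to) {
            throw new IllegalStateException(
                "index " + index + " outside [" + from + ", " + to + "]");
        }
    }

    // Same partition as above, with a check after every index change.
    static int partition(int[] a, int from, int to) {
        int pivot = a[from];
        int i = from - 1;
        int j = to + 1;
        while (i < j) {
            i++; check(i, from, to);
            while (a[i] < pivot) { i++; check(i, from, to); }
            j--; check(j, from, to);
            while (a[j] > pivot) { j--; check(j, from, to); }
            if (i < j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
        }
        return j;
    }

    static void sort(int[] a, int from, int to) {
        if (from >= to) return;
        int p = partition(a, from, to);
        sort(a, from, p);
        sort(a, p + 1, to);
    }

    // Sorts random arrays with many duplicates; returns true if every
    // result is sorted. An index escape would throw instead.
    static boolean runTrials(int trials, long seed) {
        Random random = new Random(seed);
        for (int t = 0; t < trials; t++) {
            int[] a = new int[1 + random.nextInt(50)];
            for (int k = 0; k < a.length; k++) a[k] = random.nextInt(10);
            sort(a, 0, a.length - 1);
            for (int k = 1; k < a.length; k++) {
                if (a[k - 1] > a[k]) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(runTrials(1000, 42L)
            ? "all trials stayed in bounds and sorted correctly"
            : "found an unsorted result");
    }
}
```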
