I am currently working on merge sort. I have come across a task that specifically asks me NOT to use temporary arrays to implement mergeSort, so recursion is the way to go. Here's my code:
UPDATE: Posted the rest of the code, by request.
public class RecursiveMergeSort {
public static void mergeSort(int[] list){
mergeSort(list, 0, list.length - 1);
}
private static void mergeSort(int[] list, int low, int high){
if(low < high){
//recursive call to mergeSort, one for each half
mergeSort(list, low, (high/2));
mergeSort(list, list.length/2, high);
int[] temp = merge(list, low, high);
System.arraycopy(temp, 0, list, low, high - low + 1);
}
}
private static int[] merge(int[] list, int low, int high){
int[] temp = new int[high - low + 1];
int mid = (high/2) + 1;
if(list[low] < list[mid] && mid < list.length){
temp[low] = list[low];
temp[mid] = list[mid];
}
if(list[low] > list[mid] && mid < list.length){
temp[low] = list[mid];
temp[mid] = list[low];
}
low++;
mid++;
return temp;
}
public static void main(String[] args) {
int[] list = {2, 3, 4, 5};
mergeSort(list);
for(int i = 0; i < list.length; i++){
System.out.println(list[i] + " ");
}
}
}
I'm supposed to recursively divide and conquer this. However, I get stuck in an infinite recursion that (naturally) causes a stack overflow for the second half, and I am at a complete loss for a clean way to tell my method to keep splitting. Bear in mind that the if statement in my snippet is supposed to be there, courtesy of our teacher.
The low and high values are the lowest and highest index of the range of the array passed to the method.
Need a pointer please.
I've tried benchmarking both, and for some reason, on an array of 1M elements, Mergesort sorted it in 0.3s while Quicksort took 1.3s.
I've heard that quicksort is generally faster because of its memory management, so how would one explain these results?
I am running on a MacBook Pro, if that makes any difference. The input is a set of randomly generated integers from 0 to 127.
The code is in Java:
MergeSort:
static void mergesort(int arr[]) {
int n = arr.length;
if (n < 2)
return;
int mid = n / 2;
int left[] = new int[mid];
int right[] = new int[n - mid];
for (int i = 0; i < mid; i++)
left[i] = arr[i];
for (int i = mid; i < n; i++)
right[i - mid] = arr[i];
mergesort(left);
mergesort(right);
merge(arr, left, right);
}
public static void merge(int arr[], int left[], int right[]) {
int nL = left.length;
int nR = right.length;
int i = 0, j = 0, k = 0;
while (i < nL && j < nR) {
if (left[i] <= right[j]) {
arr[k] = left[i];
i++;
} else {
arr[k] = right[j];
j++;
}
k++;
}
while (i < nL) {
arr[k] = left[i];
i++;
k++;
}
while (j < nR) {
arr[k] = right[j];
j++;
k++;
}
}
Quicksort:
public static void quickSort(int[] arr, int start, int end) {
int partition = partition(arr, start, end);
if (partition - 1 > start) {
quickSort(arr, start, partition - 1);
}
if (partition + 1 < end) {
quickSort(arr, partition + 1, end);
}
}
public static int partition(int[] arr, int start, int end) {
int pivot = arr[end];
for (int i = start; i < end; i++) {
if (arr[i] < pivot) {
int temp = arr[start];
arr[start] = arr[i];
arr[i] = temp;
start++;
}
}
int temp = arr[start];
arr[start] = pivot;
arr[end] = temp;
return start;
}
Your implementations are a bit simplistic:
mergesort allocates 2 new arrays at each recursive call, which is expensive, yet some JVMs are surprisingly efficient at optimising such coding patterns.
quickSort uses a poor choice of pivot, the last element of the subarray, which gives quadratic time for sorted subarrays, including those with identical elements.
The data set, an array of pseudo-random numbers in the small range 0..127, makes this shortcoming of the quickSort implementation cost much more than the inefficiency of the mergesort version. Increasing the dataset size should make this even more obvious and might even cause a stack overflow because of too many nested recursive calls. Data sets with common patterns such as identical values, increasing or decreasing sequences and combinations of such sequences would cause catastrophic performance of the quickSort implementation.
Here is a slightly modified version with a less pathological choice of pivot (the element at 3/4 of the array) and a loop that skips duplicates of the pivot value, which improves efficiency on datasets with repeated values. It performs much better (100x) on my standard sorting benchmark with arrays of just 40k elements, but is still much slower (8x) than radixsort:
public static void quickSort(int[] arr, int start, int end) {
int p1 = partition(arr, start, end);
int p2 = p1;
/* skip elements identical to the pivot */
while (++p2 <= end && arr[p2] == arr[p1])
continue;
if (p1 - 1 > start) {
quickSort(arr, start, p1 - 1);
}
if (p2 < end) {
quickSort(arr, p2, end);
}
}
public static int partition(int[] arr, int start, int end) {
/* choose pivot at 3/4 of the array */
int i = end - ((end - start + 1) >> 2);
int pivot = arr[i];
arr[i] = arr[end];
arr[end] = pivot;
for (i = start; i < end; i++) {
if (arr[i] < pivot) {
int temp = arr[start];
arr[start] = arr[i];
arr[i] = temp;
start++;
}
}
int temp = arr[start];
arr[start] = pivot;
arr[end] = temp;
return start;
}
For the OP's dataset, assuming decent randomness of the distribution, scanning for duplicates is responsible for the performance improvement. Choosing a different pivot, be it first, last, middle, 3/4, 2/3 or even median of 3, has almost no impact, as expected.
Further testing on other non-random distributions shows catastrophic performance for this quickSort implementation due to the choice of pivot. On my benchmark, much improved performance is obtained by choosing as pivot the element at 3/4 or 2/3 of the array (300x improvement for 50k samples, 40% faster than standard merge sort and comparable in time to radix_sort).
Mergesort has the distinct advantage of being stable and predictable for all distributions, but it requires extra memory amounting to between 50% and 100% of the size of the dataset.
A carefully implemented Quicksort is somewhat faster in many cases and works in place, requiring only log(N) stack space for recursion. Yet it is not stable, and tailor-made distributions will exhibit catastrophic performance, possibly crashing.
Radixsort is only appropriate for specific kinds of data such as integers and fixed length strings. It also requires extra memory.
Countingsort would be the most efficient for the OP's dataset as it only needs an array of 128 integers to count the number of occurrences of the different values, known to be in the range 0..127. It will execute in linear time for any distribution.
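As a rough illustration of the counting sort idea described above, here is a minimal sketch. The class name, the method name and the maxValue parameter are my own placeholders, and the code assumes every value really lies in 0..maxValue (0..127 for the OP's data); it does not validate the input.

```java
import java.util.Arrays;

class CountingSortSketch {
    // Sorts arr in place, assuming every element is in the range 0..maxValue.
    public static void countingSort(int[] arr, int maxValue) {
        int[] counts = new int[maxValue + 1];
        // First pass: count how many times each value occurs.
        for (int v : arr) {
            counts[v]++;
        }
        // Second pass: write each value back as many times as it was seen.
        int k = 0;
        for (int v = 0; v <= maxValue; v++) {
            for (int c = counts[v]; c > 0; c--) {
                arr[k++] = v;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {127, 0, 5, 5, 42, 0, 99};
        countingSort(data, 127);
        System.out.println(Arrays.toString(data)); // prints [0, 0, 5, 5, 42, 99, 127]
    }
}
```

Both passes are linear in the array length plus the size of the value range, which is why the distribution of the input does not matter here.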
I have some questions about my code. I've marked it with ---><--- down below.
public class Main {
public static void main(String[] args) {
int[] arr = {5, 4, 3, 2, 1, 4, 5, 6, 7, 8, 10};
int[] aux = new int[arr.length];
sort(arr, aux, 0, arr.length - 1);
for (int i = 0; i < arr.length; i++) {
System.out.print(arr[i] + " ");
}
}
public static void sort(int[] arr, int[] aux, int low, int high) {
// what do these lines do? --->
if (low >= high) {
return;
}
int mid = low + (high - low) / 2; // why can't it just be high - low / 2
//<--- These lines
sort(arr, aux, low, mid); //sorts left side
sort(arr, aux, mid + 1, high); //sorts right side
merge(arr, aux, low, mid, high); //merges the two sides
}
public static void merge(int[] arr, int[] aux, int low, int mid, int high) {
for (int k = low; k <= high; k++) {
aux[k] = arr[k];
}
//copies the array into an aux array
int i = low; //counter for the left side
int j = mid + 1; //counter for the right side
for (int k = low; k <= high; k++) {
if (i > mid) { //if i > mid meaning that if the left side of the array is empty then use the right side
arr[k] = aux[j++];
}
else if (j > high) { //if j > high then right side of array has been used so use left
arr[k] = aux[i++];
}
else if (aux[i] <= aux[j]) { //if value of left side is <= value of right then bring leftside value up to original array
arr[k] = aux[i++];
}
else { //value of right side is <= value of left so bring rightside value up to original array
arr[k] = aux[j++];
}
}
}
}
This is a part of the merge from GeeksForGeeks
void merge(int arr[], int l, int m, int r) {
/* Create temp arrays */
// These lines --->
int L[] = new int [n1];
int R[] = new int [n2];
//<---
}
With the GeeksForGeeks temp arrays:
Are they created every single time merge is called?
Does the memory just stay there?
Is my current code better practice?
Thanks.
I recently started learning Java, and while implementing binary search I found that if I call Arrays.sort() on the array I am about to search, the loop becomes infinite. Removing or commenting out that line solves the problem, but I cannot see why. My intention is to pass the sorted array to the binarySearch() method. I tried to figure it out with the debugger but could not. I don't want to leave this question unanswered; can anyone please help?
import java.util.Arrays;
public class Main {
static class BinarySearch {
int binarySearch(int[] array, int value) {
int low = 0;
int high = array.length - 1;
while (low <= high) {
int mid = low + high / 2;
int guess = array[mid];
if (guess == value) {
return mid;
} else if (guess > value) {
high = mid - 1;
} else {
low = mid + 1;
}
}
return -1;
}
}
public static void main(String[] args) {
BinarySearch bs = new BinarySearch();
int[] a = {1, 3, 4, 45, 54, 666, 2, 4};
Arrays.sort(a);
int result = bs.binarySearch(a, 45);
if (result == -1) {
System.out.println("value not found");
} else {
System.out.println("value found at position: " + result);
}
}
}
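Arrays.sort doesn't really have anything to do with the infinite loop. The problem is the way you calculate mid.
It should have been (low + high) / 2; you seem to have forgotten the parentheses. Without them, high / 2 is evaluated first. With the sorted array, low becomes 4 after the first step, so mid is 4 + 7 / 2 = 7; the guess 666 is too large, so high becomes 6, mid is again 4 + 6 / 2 = 7, and the loop repeats forever. With the unsorted array, the very first guess at index 3 happens to be 45, which hides the bug.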
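For reference, a minimal sketch of the search with the midpoint computed correctly. The class name is a placeholder of mine, and I wrote the midpoint as low + (high - low) / 2, which gives the same result as (low + high) / 2 but cannot overflow for very large indices.

```java
import java.util.Arrays;

class BinarySearchSketch {
    // Returns the index of value in the sorted array, or -1 if it is absent.
    public static int binarySearch(int[] array, int value) {
        int low = 0;
        int high = array.length - 1;
        while (low <= high) {
            // Midpoint of the current range [low, high].
            int mid = low + (high - low) / 2;
            int guess = array[mid];
            if (guess == value) {
                return mid;
            } else if (guess > value) {
                high = mid - 1; // target can only be in the left half
            } else {
                low = mid + 1;  // target can only be in the right half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 4, 45, 54, 666, 2, 4};
        Arrays.sort(a); // a is now {1, 2, 3, 4, 4, 45, 54, 666}
        System.out.println(binarySearch(a, 45)); // prints 5
    }
}
```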
I'm trying to write a Quicksort, but it always throws an ArrayIndexOutOfBoundsException.
public class Quicksort
{
void sort(int[] arr)
{
_quicksort(arr, 0, arr.length - 1);
}
private void _quicksort(int[] arr, int left, int right)
{
int pivot = (left + right)/2;
int l = left;
int r = right;
while (l <= r)
{
while (arr[l] < pivot)
{
l++;
}
while (arr[r] > pivot)
{
r--;
}
if(l <= r)
{
int temp = l;
l = r;
r = temp;
l++;
r++;
}
}
if(left < r)
{
_quicksort(arr, left, r);
}
if(l < right)
{
_quicksort(arr, l, right);
}
}
}
Does someone know why it doesn't run? It always gives an error.
The error message is:
java.lang.ArrayIndexOutOfBoundsException: -1
at Quicksort._quicksort(Quicksort.java:18)
at Quicksort._quicksort(Quicksort.java:33)
at Quicksort.sort(Quicksort.java:5)
It seems like there are a couple of issues with your code. I've listed them below:
The variable pivot stores the index of the pivot element and not the actual value. So, you can't use pivot for comparison as you have done in the two nested while loops. You need arr[pivot] instead of pivot there.
Imagine arr looks like {1, 1, 1, 3, 2, 2, 2}. Here, pivot will be equal to 3, i.e. arr[pivot] will be equal to 3. Now, after both the nested while loops terminate, l will be equal to 3 and r will remain equal to 6. You then swap arr[l] and arr[r] and increment both l and r. Since l is still less than or equal to r, the outer while loop runs a second time and you'll get an ArrayIndexOutOfBoundsException when control reaches the second nested while loop. This happens because you're trying to access arr[7] (out of bounds).
Here's my code:
class Quicksort
{
void sort(int[] arr)
{
myQuicksort(arr, 0, arr.length - 1);
}
private void myQuicksort(int[] arr, int l, int r) {
if (l >= r) {
return;
}
int pivotIndex = (l + r) / 2;
swap (arr, r, pivotIndex);
int pivotValue = arr[r];
int swapIndex = l; // start at l, not 0, so only the subarray l..r is scanned
int currentIndex = l;
while (currentIndex != r) {
if (arr[currentIndex] < pivotValue) {
swap(arr, currentIndex, swapIndex);
swapIndex++;
}
currentIndex++;
}
swap(arr, r, swapIndex);
myQuicksort(arr, l, swapIndex - 1);
myQuicksort(arr, swapIndex + 1, r);
}
private void swap(int[] arr, int i, int j) {
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
}
public class Main{
public static void main(String[] args) {
Quicksort quicksort = new Quicksort();
int[] arr = {3, 7, 1, 0, 4};
quicksort.sort(arr);
for (int i : arr) {
System.out.println(i);
}
}
}
You should read up on Quicksort. But here are the main points:
Choose a pivot element (the code above takes the middle one; a random one also works) and swap it with the last element. This makes the implementation much simpler.
Loop over the subarray and keep track of a swapIndex such that everything before swapIndex is less than pivotValue and everything from swapIndex up to currentIndex is greater than or equal to pivotValue.
After the loop runs out, swap the element at swapIndex with the pivot. This inserts the pivot in its correct position.
The pivot divides the array into 2 subarrays. Call myQuicksort on these 2 subarrays.
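The first point mentions a random pivot. A minimal sketch of that variant, following the same four steps, could look like this. The class name and the shared RNG field are my own assumptions, not part of the code above.

```java
import java.util.Arrays;
import java.util.Random;

class RandomPivotQuicksort {
    private static final Random RNG = new Random();

    public static void sort(int[] arr) {
        quicksort(arr, 0, arr.length - 1);
    }

    private static void quicksort(int[] arr, int l, int r) {
        if (l >= r) {
            return;
        }
        // Step 1: pick a random index in [l, r] and move that element to the end.
        int pivotIndex = l + RNG.nextInt(r - l + 1);
        swap(arr, r, pivotIndex);
        int pivotValue = arr[r];
        // Step 2: everything before swapIndex is less than pivotValue.
        int swapIndex = l;
        for (int currentIndex = l; currentIndex < r; currentIndex++) {
            if (arr[currentIndex] < pivotValue) {
                swap(arr, currentIndex, swapIndex);
                swapIndex++;
            }
        }
        // Step 3: put the pivot into its final position.
        swap(arr, r, swapIndex);
        // Step 4: sort the two subarrays on either side of the pivot.
        quicksort(arr, l, swapIndex - 1);
        quicksort(arr, swapIndex + 1, r);
    }

    private static void swap(int[] arr, int i, int j) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }

    public static void main(String[] args) {
        int[] arr = {3, 7, 1, 0, 4};
        sort(arr);
        System.out.println(Arrays.toString(arr)); // prints [0, 1, 3, 4, 7]
    }
}
```

A random pivot makes the quadratic case unlikely for already sorted input, although arrays with many equal values can still degrade this simple partition scheme.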
I am working on a program that uses quick sort to sort an array and displays the running time. It selects the median of three random elements as the pivot. However, I think I am doing something wrong, because every time I run the program I get the following error:
java.lang.StackOverflowError
at java.util.Random.nextInt(Unknown Source)
at SortingItems.quickSort(SortingItems.java:31)
at SortingItems.quickSort(SortingItems.java:69)
(I get this specific message multiple times.)
This is the code:
import java.util.Random;
public class SortingItems {
static int x = 1000;
static int array [] = new int[x];
public static void main(String[] args) {
long time;
input(x, array);
time = System.nanoTime();
sort(array);
System.out.println("Running time in ascending order " + (System.nanoTime() - time));
}
public static void sort(int[] array) {
quickSort(array, 0, array.length - 1);
}
public static void quickSort(int[] array, int low, int high) {
if (array == null || array.length == 0)
return;
if (low >= high)
return;
// Generating random numbers from 0 to the last element of the array
Random f = new Random();
int first = f.nextInt(array.length - 1);
Random s = new Random();
int second = s.nextInt(array.length - 1);
Random t = new Random();
int third = t.nextInt(array.length - 1);
// Selecting the pivot
int pivot = Math.max(Math.min(array[first], array[second]),
Math.min(Math.max(array[first],array[second]), array[third]));
int i = low;
int j = high;
while (i <= j) {
while (array[i] < pivot) {
i++;
}
while (array[j] > pivot) {
j--;
}
if (i <= j) {
int temp = array[i];
array[i] = array[j];
array[j] = temp;
i++;
j--;
}
}
if (low < j)
quickSort(array, low, j);
if (high > i)
quickSort(array, i, high);
}
// Input in ascending order
public static int[] input(int x, int[] array) {
for (int k = 0; k < x; k++) {
array[k] = k;
}
return array;
}
}
The problem is in how the median is selected. The three candidates should be chosen from between the current low and high positions, not from the entire array.
Note that your test input is in ascending order. Because you pick three random values from the entire array of 1000 elements, in the smaller recursive cases the pivot value will very likely lie outside the range of values found between low and high. The partition is then completely lopsided: every element of the subarray ends up on one side, so the next level of recursion is called with the same low and high. Repeating this eventually overflows the stack.
The following change should fix your issue:
// Generating random indices from low to high (both inclusive)
Random f = new Random();
int first = f.nextInt(high - low + 1) + low;
Random s = new Random();
int second = s.nextInt(high - low + 1) + low;
Random t = new Random();
int third = t.nextInt(high - low + 1) + low;
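To show the fix in context, here is a minimal self-contained sketch of the whole sort. The class name, the single shared Random instance and the inclusive upper bound are my own choices rather than part of the original program; the partition loop is the one from the question.

```java
import java.util.Random;

class MedianOfThreeQuickSort {
    // One shared generator instead of three new Random objects per call (an assumption).
    private static final Random RNG = new Random();

    public static void quickSort(int[] array, int low, int high) {
        if (array == null || low >= high) {
            return;
        }
        // Sample three values from the current range [low, high] only.
        int span = high - low + 1;
        int a = array[low + RNG.nextInt(span)];
        int b = array[low + RNG.nextInt(span)];
        int c = array[low + RNG.nextInt(span)];
        // Median of the three sampled values.
        int pivot = Math.max(Math.min(a, b), Math.min(Math.max(a, b), c));

        int i = low;
        int j = high;
        while (i <= j) {
            while (array[i] < pivot) {
                i++;
            }
            while (array[j] > pivot) {
                j--;
            }
            if (i <= j) {
                int temp = array[i];
                array[i] = array[j];
                array[j] = temp;
                i++;
                j--;
            }
        }
        if (low < j) {
            quickSort(array, low, j);
        }
        if (i < high) {
            quickSort(array, i, high);
        }
    }

    public static void main(String[] args) {
        int[] array = new int[1000];
        for (int k = 0; k < array.length; k++) {
            array[k] = k; // ascending input, as in the question
        }
        long start = System.nanoTime();
        quickSort(array, 0, array.length - 1);
        System.out.println("Running time in ascending order " + (System.nanoTime() - start));
    }
}
```

Because the pivot is now always a value that occurs inside the current range, each partition step moves at least one index inward, so the recursion can no longer be re-entered with the same low and high.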