What is wrong with this merge sort? - java

int[] input = [6,5,4,3,2,1]
mergeSort(input,0,input.length)
void mergeSort(int[] A,int p,int r){
if(p < r){
int q = (int) Math.floor((p+r)/2)
mergeSort(A,p,q)
mergeSort(A,q+1,r)
merge(A,p,q,r)
}
}
void merge(int[] A,int p,int q,int r){
int i = 0
int j = 0
int n1 = q - p + 1
int n2 = r - q
int[] L = new int[n1]
int[] R = new int[n2]
for(;i<n1;i++){ L[i] = A[p+i-1] }
for(;j<n2;j++){ R[j] = A[q+j] }
i = j = 0
for(int k = p; k < r; k++){
if(L[i] <= R[j]){ A[k] = L[i++] }
else{ A[k] = R[j++] }
}
}
This is a direct implementation of merge sort from the Introduction to Algorithms book. Although it looks correct, it ends with an ArrayIndexOutOfBounds exception.
I have been trying to debug it but couldn't. I'd like to know what's going wrong and how to correct it.
Runnable Example: http://ideone.com/GhuuSd

Firstly, although this is Groovy, and therefore the negative index highlighted by @Useless won't throw an exception, it is indicative of a problem.
I think you have two problems (caveat: I'm not familiar with the exact presentation in "Introduction to Algorithms"):
1. Your indices into A to populate L and R are off by one: they should be [p+i] (thus avoiding the possible negative index) and [q+j+1]. Note that with this amendment you need to pass length-1 as the initial argument for r.
2. In the test at the bottom that decides which of L and R to use when re-populating A, you do not check whether i or j has run beyond the end of its array (i.e. i >= n1 or j >= n2). When one of L or R has "run out" of members, you should take from the other array.
The code below works for your example. I haven't tested it extensively, but I have tried cases with repeated numbers, negative numbers etc. and believe it should hold up:
int[] input = [6,5,4,3,2,1]
mergeSort(input,0,input.length-1)
System.out.println(Arrays.toString(input))
void mergeSort(int[] A,int p,int r){
if(p < r){
int q = (int) Math.floor((p+r)/2)
mergeSort(A,p,q)
mergeSort(A,q+1,r)
merge(A,p,q,r)
}
}
void merge(int[] A,int p,int q,int r) {
int i = 0
int j = 0
int n1 = q - p + 1
int n2 = r - q
int[] L = new int[n1]
int[] R = new int[n2]
for(;i<n1;i++){ L[i] = A[p+i] }
for(;j<n2;j++){ R[j] = A[q+j+1] }
i = j = 0
for(int k = p; k <= r; k++){
if(j >= n2 || (i < n1 && L[i] < R[j] )) { A[k] = L[i++] }
else{ A[k] = R[j++] }
}
}

int i = 0
...
for(;i<n1;i++){ L[i] = A[p+i-1] }
What is the index into A on your first iteration, where p is zero?
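To make that hint concrete, here is a minimal Java sketch (the class and method names are made up for illustration) that evaluates the first index the question's copy loop touches. In plain Java a negative index throws, whereas Groovy, as the other answer notes, does not.

```java
class FirstIndexDemo {
    // Index used by the question's first copy loop: A[p + i - 1].
    static int firstLeftIndex(int p, int i) {
        return p + i - 1;
    }

    // Returns true if reading a[index] throws in plain Java.
    static boolean throwsOnRead(int[] a, int index) {
        try {
            int unused = a[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] a = {6, 5, 4, 3, 2, 1};
        int index = firstLeftIndex(0, 0);                         // p = 0, i = 0
        System.out.println("index = " + index);                   // index = -1
        System.out.println("throws = " + throwsOnRead(a, index)); // throws = true
    }
}
```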

Related

Merge Sort Java Implementation so merge works within one array with second splitted array reversed

Challenge: within a lecture on data structures and algorithms I encountered a version of merge sort which uses the merge routine in a way that the second half is being reversed from the splitting index and from there compares the first and the last element. I tried to implement in java and it always failed somehow.
Problem: The array is being sorted so that the output is [1, 2, 4, 8, 6] so the 6 is not sorted. It seems as if the recursive call is not looking at the element 6 in the last merge call.
What I tried: Shifting different indices and adding different print statements for checking.
I tried setting j = r before the last for loop within merge, which led to a stack overflow every time. I tried changing how the size of the array is calculated, since I was not sure whether the pseudocode expects the array to start from 1 or 0. I tried changing if(p < r-1) to if(p <= r-1) but got a stack overflow.
I looked at different Java implementations of the merge routine, and every one I have found so far seems to work with two arrays. Is there a serious reason why the approach above is not working correctly, or any idea how to fix this issue?
Given the following pseudo code:
void merge_sort(array<T>& A, int p, int r) {
if (p < r - 1) {
int q = Floor((p + r) / 2);
merge_sort(A, p, q);
merge_sort(A, q + 1, r);
merge(A, p, q, r);
}
}
void merge(array<T>& A, int p, int q, int r) {
array<T> B(p, r - 1);
int i, j;
for (i = p; i < q; i++)
B[i] = A[i];
// Now i=q
for (j = r; i < r; i++)
B[--j] = A[i];
i = p;
j = r - 1;
for (int k = p; k < r; k++)
A[k] = (B[i] < B[j]) ? B[i++] : B[j--];
}
I tried to implement in java like so:
import java.util.Arrays;
public class Mergesort {
private static int[] A = new int[]{ 4, 2, 1, 8, 6 };
public static void main(String[] args) {
merge_sort(0, A.length - 1);
System.out.println(Arrays.toString(A));
}
public static void merge_sort(int p, int r) {
if (p < r - 1) {
int q = Math.floor((p + r) / 2);
merge_sort(p, q);
merge_sort(q + 1, r);
merge(p, q, r);
}
}
public static void merge(int p, int q, int r) {
int[] B = new int[r - p];
int i, j;
for (i = p; i < q; i++)
B[i] = A[i]
for (j = r; i < r; i++)
B[--j] = A[i];
i = p;
j = r - 1;
for (int k = p; k < r; k++)
A[k] = (B[i] < B[j])? B[i++] : B[j--];
}
}
There are multiple problems in your code:
the temporary array is too short: since r is the index of the last element, the size should be r - p + 1. It is much simpler to pass r as the index one past the last element of the slice to sort.
the first for loop is incorrect: you should use a different index into B and A.
the second for loop copies to B[r - 1] downwards, but it should use B[r - p] instead.
the merging loop is incorrect: you should test if i and j are still within the boundaries of the respective halves before accessing B[i] and/or B[j].
[minor] there is no need for Math.floor in int q = Math.floor((p + r) / 2); in Java: p and r have type int, so the division already uses integer arithmetic. Math.floor also returns a double, so the assignment would not compile without a cast.
Here is a modified version:
import java.util.Arrays;
public class Mergesort {
private static int[] A = new int[]{ 4, 2, 1, 8, 6 };
public static void main(String[] args) {
merge_sort(0, A.length);
System.out.println(Arrays.toString(A));
}
public static void merge_sort(int p, int r) {
if (r - p >= 2) {
int q = p + (r - p) / 2;
merge_sort(p, q);
merge_sort(q, r);
merge(p, q, r);
}
}
public static void merge(int p, int q, int r) {
int m = q - p; // zero based index of the right half
int n = r - p; // length of the merged slice
int[] B = new int[n];
int i, j, k;
for (i = p, j = 0; j < m; j++)
B[j] = A[i++];
for (i = r, j = m; j < n; j++)
B[j] = A[--i];
for (i = 0, j = n, k = p; k < r; k++) {
// for stable sorting, i and j must be tested against their boundary
// A[k] = (i < m && (j <= m || B[i] <= B[j - 1])) ? B[i++] : B[--j];
// stability is not an issue for an array of int
A[k] = (B[i] <= B[j - 1]) ? B[i++] : B[--j];
}
}
}
Reversing the second half allows for a simpler merge loop without boundary tests. Note however that there is a simpler approach that uses less memory and might be more efficient:
public static void merge(int p, int q, int r) {
int m = q - p; // length of the left half
int[] B = new int[m];
int i, j, k;
// only save the left half
for (i = p, j = 0; j < m; j++)
B[j] = A[i++];
for (i = 0, j = q, k = p; i < m; k++) {
A[k] = (j >= r || B[i] <= A[j]) ? B[i++] : A[j++];
}
}
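As a usage sketch, the left-half-only merge can be wrapped into a self-contained class. Passing the array as a parameter (instead of using the static field A), the class name, and the call to Arrays.copyOfRange are my choices here, not part of the answer above.

```java
import java.util.Arrays;

class HalfCopyMergeSort {
    // Sorts a[p..r), i.e. r is one past the last element.
    static void mergeSort(int[] a, int p, int r) {
        if (r - p >= 2) {
            int q = p + (r - p) / 2;
            mergeSort(a, p, q);
            mergeSort(a, q, r);
            merge(a, p, q, r);
        }
    }

    // Merges the sorted runs a[p..q) and a[q..r), saving only the left run.
    static void merge(int[] a, int p, int q, int r) {
        int m = q - p;                            // length of the left run
        int[] left = Arrays.copyOfRange(a, p, q); // copy of a[p..q)
        int i = 0, j = q, k = p;
        while (i < m) {
            // Take from the left copy when the right run is exhausted or the
            // left element is not larger; once the left copy is used up, the
            // rest of the right run is already in its final place.
            a[k++] = (j >= r || left[i] <= a[j]) ? left[i++] : a[j++];
        }
    }

    public static void main(String[] args) {
        int[] a = {4, 2, 1, 8, 6};
        mergeSort(a, 0, a.length);
        System.out.println(Arrays.toString(a)); // [1, 2, 4, 6, 8]
    }
}
```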

merge sort is duplicating array entries

I was trying to implement in Java the merge sort algorithm according to Cormen's Introduction to Algorithms. The problem with my code (below) is that the main array is duplicating some of its entries during the merge step.
Is someone able to catch what I'm doing wrong?
Thank you!
static void merge(int a[], int p, int q, int r)
{
int n1 = q - p;
int n2 = (r - q);
int [] left = new int[n1 + 1];
int [] right = new int[n2 + 1];
int pp = p;
int qq = q;
for(int i = 0; i < n1; i++)
{
left[i] = a[++pp];
}
for(int i = 0; i < n2; i++)
{
right[i] = a[++qq];
}
left[left.length-1] = Integer.MAX_VALUE;
right[right.length-1] = Integer.MAX_VALUE;
int i = 0;
int j = 0;
for(int k = p; k < r; k++)
{
if(left[i] <= right[j])
{
a[k] = left[i];
i++;
}
else
{
a[k] = right[j];
j++;
}
}
}
static int [] mergeSort(int a[], int p, int r)
{
if(p < r)
{
int q = (p + r)/2;
mergeSort(a, 1, q);
mergeSort(a, q + 1, r);
merge(a, p, q, r);
}
return a;
}
Part of the issue here is the example from the book apparently uses index range from 1 to length. It will be simpler if you change the index range from 0 to length-1, which I assume in the rest of my answer.
Use post increment while copying to left[] and right[] as answered by laune (since index range 0 to length-1).
left[i] = a[pp++];
...
right[i] = a[qq++];
The main issue is the merge function is not checking to see if it reached the end of the left or right run during a merge. This can be fixed by changing the inner if to:
if (i < n1 && (j >= n2 || left[i] <= right[j]))
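The recursive calls to merge sort should be as follows (since r is now exclusive, the guard must also change from p < r to r - p > 1, otherwise a one-element range recurses forever):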
mergeSort(a, p, q);
mergeSort(a, q, r);
Not shown, but the initial call to mergeSort should be:
mergeSort(a, 0, a.length);
There's no need to allocate the extra element in left and right (since index range is 0 to length-1).
int [] left = new int[n1];
int [] right = new int[n2];
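Putting these changes together, a complete version might look like the sketch below. It assumes 0-based indices with r exclusive, as in the rest of this answer; the class name and the main method are only for illustration. Note the guard r - p > 1: with an exclusive r, the test p < r would recurse forever on a one-element run.

```java
import java.util.Arrays;

class ClrsMergeSort {
    // Merges the sorted runs a[p..q) and a[q..r); r is exclusive.
    static void merge(int[] a, int p, int q, int r) {
        int n1 = q - p;
        int n2 = r - q;
        int[] left = new int[n1];   // no extra slot: no sentinel is stored
        int[] right = new int[n2];
        int pp = p;
        int qq = q;
        for (int i = 0; i < n1; i++) {
            left[i] = a[pp++];      // post-increment: copying starts at a[p]
        }
        for (int i = 0; i < n2; i++) {
            right[i] = a[qq++];     // post-increment: copying starts at a[q]
        }
        int i = 0;
        int j = 0;
        for (int k = p; k < r; k++) {
            // Bounds checks replace the Integer.MAX_VALUE sentinels.
            if (i < n1 && (j >= n2 || left[i] <= right[j])) {
                a[k] = left[i++];
            } else {
                a[k] = right[j++];
            }
        }
    }

    static int[] mergeSort(int[] a, int p, int r) {
        if (r - p > 1) {            // 0 or 1 elements are already sorted
            int q = p + (r - p) / 2;
            mergeSort(a, p, q);
            mergeSort(a, q, r);
            merge(a, p, q, r);
        }
        return a;
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 5, 1};
        System.out.println(Arrays.toString(mergeSort(a, 0, a.length))); // [1, 2, 5, 5]
    }
}
```

Because the merge no longer relies on sentinels, arrays that themselves contain Integer.MAX_VALUE are handled like any other input.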
I think that this is in error (as well as its sibling in the next loop):
left[i] = a[++pp];
You want to copy starting with pp = p, so don't increment before you access the array element:
left[i] = a[pp++];

How can I improve this algorithm to optimize the running time (find points in segments)

I'm given 2 integers: the first is the number of segments (Xi,Xj) and the second is the number of points that may or may not be inside those segments.
As an example, the input could be:
2 3
0 5
8 10
1 6 11
Where, in first line, 2 means "2 segments" and 3 means "3 points".
The 2 segments are "0 to 5" and "8 to 10", and the points to look for are 1, 6, 11.
The output is
1 0 0
Here point 1 is in segment "0 to 5", and points 6 and 11 are not in any segment. If a point lies in two segments, the output for it would be 2.
The original code was just a double loop that searches for the points among the segments. I used the Java Arrays quicksort (modified so that when it sorts the end points of the segments it also moves the start points, so start[i] and end[i] still belong to the same segment i) to speed up the double loop, but it isn't enough.
The following code works, but it gets very slow when there are too many segments:
public class PointsAndSegments {
private static int[] fastCountSegments(int[] starts, int[] ends, int[] points) {
sort(starts, ends);
int[] cnt2 = CountSegments(starts,ends,points);
return cnt2;
}
private static void dualPivotQuicksort(int[] a, int[] b, int left,int right, int div) {
int len = right - left;
if (len < 27) { // insertion sort for tiny array
for (int i = left + 1; i <= right; i++) {
for (int j = i; j > left && b[j] < b[j - 1]; j--) {
swap(a, b, j, j - 1);
}
}
return;
}
int third = len / div;
// "medians"
int m1 = left + third;
int m2 = right - third;
if (m1 <= left) {
m1 = left + 1;
}
if (m2 >= right) {
m2 = right - 1;
}
if (a[m1] < a[m2]) {
swap(a, b, m1, left);
swap(a, b, m2, right);
}
else {
swap(a, b, m1, right);
swap(a, b, m2, left);
}
// pivots
int pivot1 = b[left];
int pivot2 = b[right];
// pointers
int less = left + 1;
int great = right - 1;
// sorting
for (int k = less; k <= great; k++) {
if (b[k] < pivot1) {
swap(a, b, k, less++);
}
else if (b[k] > pivot2) {
while (k < great && b[great] > pivot2) {
great--;
}
swap(a, b, k, great--);
if (b[k] < pivot1) {
swap(a, b, k, less++);
}
}
}
// swaps
int dist = great - less;
if (dist < 13) {
div++;
}
swap(a, b, less - 1, left);
swap(a, b, great + 1, right);
// subarrays
dualPivotQuicksort(a, b, left, less - 2, div);
dualPivotQuicksort(a, b, great + 2, right, div);
// equal elements
if (dist > len - 13 && pivot1 != pivot2) {
for (int k = less; k <= great; k++) {
if (b[k] == pivot1) {
swap(a, b, k, less++);
}
else if (b[k] == pivot2) {
swap(a, b, k, great--);
if (b[k] == pivot1) {
swap(a, b, k, less++);
}
}
}
}
// subarray
if (pivot1 < pivot2) {
dualPivotQuicksort(a, b, less, great, div);
}
}
public static void sort(int[] a, int[] b) {
sort(a, b, 0, b.length);
}
public static void sort(int[] a, int[] b, int fromIndex, int toIndex) {
rangeCheck(a.length, fromIndex, toIndex);
dualPivotQuicksort(a, b, fromIndex, toIndex - 1, 3);
}
private static void rangeCheck(int length, int fromIndex, int toIndex) {
if (fromIndex > toIndex) {
throw new IllegalArgumentException("fromIndex > toIndex");
}
if (fromIndex < 0) {
throw new ArrayIndexOutOfBoundsException(fromIndex);
}
if (toIndex > length) {
throw new ArrayIndexOutOfBoundsException(toIndex);
}
}
private static void swap(int[] a, int[] b, int i, int j) {
int swap1 = a[i];
int swap2 = b[i];
a[i] = a[j];
b[i] = b[j];
a[j] = swap1;
b[j] = swap2;
}
private static int[] naiveCountSegments(int[] starts, int[] ends, int[] points) {
int[] cnt = new int[points.length];
for (int i = 0; i < points.length; i++) {
for (int j = 0; j < starts.length; j++) {
if (starts[j] <= points[i] && points[i] <= ends[j]) {
cnt[i]++;
}
}
}
return cnt;
}
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
int n, m;
n = scanner.nextInt();
m = scanner.nextInt();
int[] starts = new int[n];
int[] ends = new int[n];
int[] points = new int[m];
for (int i = 0; i < n; i++) {
starts[i] = scanner.nextInt();
ends[i] = scanner.nextInt();
}
for (int i = 0; i < m; i++) {
points[i] = scanner.nextInt();
}
//use fastCountSegments
int[] cnt = fastCountSegments(starts, ends, points);
for (int x : cnt) {
System.out.print(x + " ");
}
}
I believe the problem is in the CountSegments() method, but I'm not sure of another way to solve it. Supposedly I should use a divide and conquer algorithm, but after 4 days I'm open to any solution.
I found a similar problem on CodeForces, but the output is different and most solutions are in C++. Since I started learning Java only 3 months ago, I think I have reached the limit of my knowledge.
Given the constraints from the OP (let n be the number of segments and m the number of points to query, with n, m <= 5*10^4), I can come up with an O(n lg(n) + m lg(n)) solution, which should be enough to pass most online judges.
Each query is a verification problem: is the point covered by some interval, yes or no? We do not need to find which intervals, or how many, cover the point.
Outline of the algorithm:
1. Sort all intervals by starting point, breaking ties by length (rightmost ending point).
2. Merge overlapping intervals to get a set of disjoint intervals. E.g. from (0,5), (2,9), (3,7), (3,5), (12,15) you get (0,9), (12,15). As the intervals are sorted, this can be done greedily in O(n).
3. The steps above are precomputation. Now, for each point, binary search the disjoint intervals for one that contains it. Each query is O(lg(n)) and there are m points, so the total is O(m lg(n)).
Combining the steps gives an O(n lg(n) + m lg(n)) algorithm.
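The outline above can be sketched in Java roughly as follows. This is a minimal illustration, not a tested judge submission: the class and method names are hypothetical, segments are assumed to be closed (both end points included), and sorting by start alone is used because that is all the greedy merge needs. Like the outline, it only answers whether a point is covered, not by how many segments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CoverageQuery {
    // Steps 1 and 2: sort by start, then greedily merge overlapping segments.
    static int[][] mergeSegments(int[][] segments) {
        int[][] sorted = segments.clone();
        Arrays.sort(sorted, (s, t) -> Integer.compare(s[0], t[0]));
        List<int[]> merged = new ArrayList<>();
        for (int[] s : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && s[0] <= last[1]) {
                last[1] = Math.max(last[1], s[1]); // overlap: extend current block
            } else {
                merged.add(new int[]{s[0], s[1]}); // gap: start a new block
            }
        }
        return merged.toArray(new int[0][]);
    }

    // Step 3: binary search for the last block whose start is <= point.
    static boolean isCovered(int[][] merged, int point) {
        int lo = 0, hi = merged.length - 1, candidate = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (merged[mid][0] <= point) {
                candidate = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return candidate >= 0 && point <= merged[candidate][1];
    }

    public static void main(String[] args) {
        int[][] segments = {{0, 5}, {2, 9}, {3, 7}, {3, 5}, {12, 15}};
        int[][] merged = mergeSegments(segments);
        System.out.println(Arrays.deepToString(merged)); // [[0, 9], [12, 15]]
        System.out.println(isCovered(merged, 10));       // false
        System.out.println(isCovered(merged, 13));       // true
    }
}
```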
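This is an implementation similar to @Shole's idea:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Scanner; // used only by the stdin-based main further down
public class SegmentsAlgorithm {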
private PriorityQueue<int[]> remainSegments = new PriorityQueue<>((o0, o1) -> Integer.compare(o0[0], o1[0]));
private SegmentWeight[] arraySegments;
public void addSegment(int begin, int end) {
remainSegments.add(new int[]{begin, end});
}
public void prepareArrayCache() {
List<SegmentWeight> preCalculate = new ArrayList<>();
PriorityQueue<int[]> currentSegmentsByEnds = new PriorityQueue<>((o0, o1) -> Integer.compare(o0[1], o1[1]));
int begin = remainSegments.peek()[0];
while (!remainSegments.isEmpty() && remainSegments.peek()[0] == begin) {
currentSegmentsByEnds.add(remainSegments.poll());
}
preCalculate.add(new SegmentWeight(begin, currentSegmentsByEnds.size()));
int next;
while (!remainSegments.isEmpty()) {
if (currentSegmentsByEnds.isEmpty()) {
next = remainSegments.peek()[0];
} else {
next = Math.min(currentSegmentsByEnds.peek()[1], remainSegments.peek()[0]);
}
while (!currentSegmentsByEnds.isEmpty() && currentSegmentsByEnds.peek()[1] == next) {
currentSegmentsByEnds.poll();
}
while (!remainSegments.isEmpty() && remainSegments.peek()[0] == next) {
currentSegmentsByEnds.add(remainSegments.poll());
}
preCalculate.add(new SegmentWeight(next, currentSegmentsByEnds.size()));
}
while (!currentSegmentsByEnds.isEmpty()) {
next = currentSegmentsByEnds.peek()[1];
while (!currentSegmentsByEnds.isEmpty() && currentSegmentsByEnds.peek()[1] == next) {
currentSegmentsByEnds.poll();
}
preCalculate.add(new SegmentWeight(next, currentSegmentsByEnds.size()));
}
SegmentWeight[] arraySearch = new SegmentWeight[preCalculate.size()];
int i = 0;
for (SegmentWeight l : preCalculate) {
arraySearch[i++] = l;
}
this.arraySegments = arraySearch;
}
public int searchPoint(int p) {
int result = 0;
if (arraySegments != null && arraySegments.length > 0 && arraySegments[0].begin <= p) {
int index = Arrays.binarySearch(arraySegments, new SegmentWeight(p, 0), (o0, o1) -> Integer.compare(o0.begin, o1.begin));
if (index < 0){ // Bug fixed
index = - 2 - index;
}
if (index >= 0 && index < arraySegments.length) { // Protection added
result = arraySegments[index].weight;
}
}
return result;
}
public static void main(String[] args) {
SegmentsAlgorithm algorithm = new SegmentsAlgorithm();
int[][] segments = {{0, 5},{3, 10},{8, 9},{14, 20},{12, 28}};
for (int[] segment : segments) {
algorithm.addSegment(segment[0], segment[1]);
}
algorithm.prepareArrayCache();
int[] points = {-1, 2, 4, 6, 11, 28};
for (int point: points) {
System.out.println(point + ": " + algorithm.searchPoint(point));
}
}
public static class SegmentWeight {
int begin;
int weight;
public SegmentWeight(int begin, int weight) {
this.begin = begin;
this.weight = weight;
}
}
}
It prints:
-1: 0
2: 1
4: 2
6: 1
11: 0
28: 0
EDITED:
public static void main(String[] args) {
SegmentsAlgorithm algorithm = new SegmentsAlgorithm();
Scanner scanner = new Scanner(System.in);
int n = scanner.nextInt();
int m = scanner.nextInt();
for (int i = 0; i < n; i++) {
algorithm.addSegment(scanner.nextInt(), scanner.nextInt());
}
algorithm.prepareArrayCache();
for (int i = 0; i < m; i++) {
System.out.print(algorithm.searchPoint(scanner.nextInt())+ " ");
}
System.out.println();
}
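For comparison, here is a sketch of a different technique from the one implemented above, in case the per-point count is what is needed. It relies on the observation that, for closed segments, the number of segments containing p equals (number of starts <= p) minus (number of ends < p), so the starts and ends can be sorted independently and each query needs two binary searches. The class name and helper are hypothetical, and a point equal to an end point is counted, matching the starts[j] <= points[i] && points[i] <= ends[j] test in the question.

```java
import java.util.Arrays;

class SegmentCounter {
    private final int[] starts;
    private final int[] ends;

    SegmentCounter(int[] starts, int[] ends) {
        // Sort copies independently; the start/end pairing is not needed.
        this.starts = starts.clone();
        this.ends = ends.clone();
        Arrays.sort(this.starts);
        Arrays.sort(this.ends);
    }

    // Number of closed segments [start, end] that contain the point.
    int count(int point) {
        // starts <= point is the same as starts < point + 1 (long avoids overflow).
        return countLessThan(starts, (long) point + 1) - countLessThan(ends, point);
    }

    // Number of elements of the sorted array strictly less than key.
    private static int countLessThan(int[] sorted, long key) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Sample input from the question: segments (0,5) and (8,10).
        SegmentCounter counter = new SegmentCounter(new int[]{0, 8}, new int[]{5, 10});
        for (int point : new int[]{1, 6, 11}) {
            System.out.print(counter.count(point) + " "); // prints 1 0 0
        }
        System.out.println();
    }
}
```

Sorting costs O(n lg(n)) and each query O(lg(n)), the same bounds as the other approaches in this thread.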

Converting Merge Sort pseudocode to running Java code

I tried to convert this Merge Sort pseudocode into Java but don't get the right output. Here is the pseudocode:
Merge-Sort(A, p, r )
if p < r
then q←(p+r)/2
Merge-Sort(A, p, q)
Merge-Sort(A, q + 1, r )
Merge(A, p, q, r )
Merge(A, p, q, r )
for k←1 to r−p+1 do
if j>r or (i ≤ q and A[i] ≤ A[j])
then B[k]←A[i]; i←i+1 else B[k]←A[j];j←j+1
for k←1 to r−p+1 do A[k+p−1]←B[k]
And this is my Java code for it:
public class MergeSort {
public static void main(String[] args) {
int[] a = {2, 6, 3, 5, 1};
mergeSort(a, 0, a.length - 1);
for (int i = 0; i < a.length; i++) {
System.out.print(" " + a[i]);
}
}
public static void mergeSort(int[] a, int from, int to) {
final int begin = from, end = to;
if (begin < end) {
final int mid = (begin + end) / 2;
MergeSort.mergeSort(a, begin, mid);
MergeSort.mergeSort(a, mid+1, end);
MergeSort.merge(a, begin, mid, end);
}
}
private static void merge(int[] a, int from, int mid, int to) {
final int begin = from, mitte = mid, end = to;
int[] B = new int[a.length];
int i = begin, j = mitte;
for (int k = 0; k <= end-begin; k++) {
if (j > end || (i <= mitte && a[i] <= a[j])) {
B[k] = a[i];
i++;
} else {
B[k] = a[j];
j++;
}
}
for (int k = 0; k < end-begin; k++) {
a[k + begin] = B[k];
}
}
Sadly it is not working like that. I think I am doing something wrong with some of the indexes, but I can't figure out where exactly the error is.
I need to stick as close as possible to this pseudocode.
It would be great if someone could show me what I am doing wrong.
The pseudocode given for Merge is somewhat incomplete: it does not spell out what happens when one pointer reaches the end of its half while the other still has elements left, nor how i and j are initialized.
In that case you have to copy the remaining elements of the other half into the temporary array separately.
Also, the required length of B is to - from + 1, and j must start at mitte + 1 instead of mitte. Your copy-back loop also stops one element short (k < end-begin instead of k <= end-begin). The correct code for the merge is:
private static void merge(int[] a, int from, int mid, int to) {
final int begin = from, mitte = mid, end = to;
int[] B = new int[end-begin+1];
int k=0;
int i = begin, j = mitte+1;
while(i<=mid&&j<=end)
if(a[i]<=a[j]){
B[k++] = a[i];
i++;
} else {
B[k++] = a[j];
j++;
}
//in case i remained stationary
while(i<=mid)
B[k++] = a[i++];
//in case j remained stationary
while(j<=end)
B[k++] = a[j++];
//Now copy the array
i=0;
for(k=begin;k<=end;++k)
a[k]=B[i++];
}
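Since the question asks to stay as close as possible to the pseudocode, here is an alternative sketch that keeps its single merge loop and only shifts the indices to 0-based. The class name is hypothetical; the initial values i = begin and j = mid + 1 are an assumption, because the pseudocode as quoted does not show them.

```java
import java.util.Arrays;

class PseudocodeMergeSort {
    static void mergeSort(int[] a, int begin, int end) {
        if (begin < end) {
            int mid = (begin + end) / 2;
            mergeSort(a, begin, mid);
            mergeSort(a, mid + 1, end);
            merge(a, begin, mid, end);
        }
    }

    // Merges a[begin..mid] and a[mid+1..end]; all bounds are inclusive.
    static void merge(int[] a, int begin, int mid, int end) {
        int[] b = new int[end - begin + 1];
        int i = begin;   // next unread element of the left half
        int j = mid + 1; // next unread element of the right half
        for (int k = 0; k <= end - begin; k++) {
            // Same test as the pseudocode: take from the left half when the
            // right half is exhausted or the left element is not larger.
            if (j > end || (i <= mid && a[i] <= a[j])) {
                b[k] = a[i++];
            } else {
                b[k] = a[j++];
            }
        }
        for (int k = 0; k <= end - begin; k++) {
            a[k + begin] = b[k]; // copy back all end - begin + 1 elements
        }
    }

    public static void main(String[] args) {
        int[] a = {2, 6, 3, 5, 1};
        mergeSort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 5, 6]
    }
}
```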

Merge Sort. Error-- Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2

Good day, everyone! I have a program that sorts 50,000 words from a file using merge sort. I followed Thomas Cormen's pseudocode from Introduction to Algorithms, and it seems right when I debug it by hand. However, when I run the program it says Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2. At first I thought it was due to the large NO_OF_WORDS (i.e. 50,000), but even after I decreased it to 10 it still shows the same error.
import java.io.*;
import java.util.*;
public class SortingAnalysis {
public static void merge(String[] A, int p, int q, int r) {
int n1 = q-p+1;
int n2 = r-q;
String[] L = new String[n1+1];
String[] R = new String[n2+1];
for (int i=1; i<n1; i++) {
L[i] = A[p+i-1];
}
for (int j=1; j<n2; j++) {
R[j] = A[q+j];
}
L[n1+1] = "zzzzz"; //for infinity because if I use Math.floor, it will return a double
R[n2+1] = "zzzzz";
int i=1;
int j=1;
for (int k=p; k<=r; k++) {
int comparison = L[i].compareTo(R[j]);
if (comparison <= 0){
A[k] = L[i];
i++;
}
else {
A[k] = R[j];
j++;
}
}
}
public static void mergeSort (String[] A, int p, int r) {
if (p<r) {
int q = (p+r)/2;
mergeSort(A, p, q);
mergeSort(A, q+1, r);
merge(A, p, q, r);
}
}
public static void main(String[] args) {
final int NO_OF_WORDS = 50000;
try {
Scanner file = new Scanner(new File(args[0]));
String[] words = new String[NO_OF_WORDS];
int i = 0;
while(file.hasNext() && i < NO_OF_WORDS) {
words[i] = file.next();
i++;
}
long start = System.currentTimeMillis();
mergeSort(words, 0, words.length-1);
long end = System.currentTimeMillis();
System.out.println("Sorted Words: ");
for(int j = 0; j < words.length; j++) {
System.out.println(words[j]);
}
System.out.print("Running time: " + (end - start) + "ms");
}
catch(SecurityException securityException) {
System.err.println("Error");
System.exit(1);
}
catch(FileNotFoundException fileNotFoundException) {
System.err.println("Error");
System.exit(1);
}
}
}
I think it's because of the declaration of String[] L and R. Or not. Please help me find the problem. Thank you very much!
EDIT
Cormen's Pseudocode
MERGE(A, p, q, r )
n1 ← q − p + 1
n2 ←r − q
create arrays L[1 . . n1 + 1] and R[1 . . n2 + 1]
for i ← 1 to n1
do L[i ] ← A[p + i − 1]
for j ← 1 to n2
do R[ j ] ← A[q + j ]
L[n1 + 1]←∞
R[n2 + 1]←∞
i ← 1
j ← 1
for k ← p to r
do if L[i ] ≤ R[ j ]
then A[k] ← L[i ]
i ←i + 1
else A[k] ← R[ j ]
j ← j + 1
I don't know which pseudocode you are using, but your implementation seems wrong. I've looked at the Wikipedia merge sort and it's quite different.
So I will not give you the full working algorithm here. I'll just show you how to resolve the ArrayIndexOutOfBoundsException; you will still have to work on the rest of your implementation.
In Java, when you write:
String[] L = new String[5];
you declare an array that can hold 5 strings.
Each string is accessed as L[anIndex].
The first element is at index 0.
So if you have an array of size 5, the last element is at index 4 (because we start from 0).
In your code you do this:
String[] L = new String[n1+1];
String[] R = new String[n2+1];
then:
L[n1+1] = "zzzzz";
R[n2+1] = "zzzzz";
So here you always try to access an index that doesn't exist.
The last valid indices of the two arrays are n1 and n2 respectively (because the array sizes are n1+1 and n2+1).
I hope this explanation helps you understand how arrays work in Java. Your implementation still needs further fixes after this one. Maybe post the pseudocode you are using if you don't understand it well.
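A small sketch of that point, with hypothetical class and method names: for an array created with new String[n1 + 1], the last valid index is n1, and writing the sentinel at n1 + 1 throws. With n1 = 1, as in the first merge of two single elements, the failing index is 2, which matches the number in the exception message from the question.

```java
class SentinelIndexDemo {
    // Largest valid index of an array created with new String[n1 + 1].
    static int lastValidIndex(int n1) {
        String[] left = new String[n1 + 1];
        return left.length - 1;
    }

    // Returns true if storing the sentinel at the given index throws.
    static boolean sentinelWriteFails(int n1, int index) {
        String[] left = new String[n1 + 1];
        try {
            left[index] = "zzzzz";
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int n1 = 1; // left run of a single element
        System.out.println(lastValidIndex(n1));             // 1
        System.out.println(sentinelWriteFails(n1, n1 + 1)); // true
        System.out.println(sentinelWriteFails(n1, n1));     // false
    }
}
```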
EDIT:
OK, I made some corrections.
Here is the working algorithm. I had to change several indices to fit Java's 0-based arrays; take a look:
import java.io.*;
import java.util.*;
public class SortingAnalysis {
public static void merge(String[] A, int p, int q, int r) {
int n1 = q-p+1;
int n2 = r-q;
if(A[p]==null || A[q]==null)return;
String[] L = new String[n1+1];
String[] R = new String[n2+1];
for (int i=0; i<n1; i++) {
L[i] = A[p+i];
}
for (int j=0; j<n2; j++) {
R[j] = A[q+j +1];
}
L[n1] = "zzzzz"; //for infinity because if I use Math.floor, it will return a double
R[n2] = "zzzzz";
int i=0;
int j=0;
for (int k=p; k<=r; k++) {
int comparison = L[i].compareTo(R[j]);
if (comparison <= 0){
A[k] = L[i];
i++;
}
else {
A[k] = R[j];
j++;
}
}
}
public static void mergeSort (String[] A, int p, int r) {
if (p<r) {
int q = (p+r)/2;
mergeSort(A, p, q);
mergeSort(A, q+1, r);
merge(A, p, q, r);
}
}
public static void main(String[] args) {
final int NO_OF_WORDS = 50000;
try {
Scanner file = new Scanner("bla blya blay byla ybla");
ArrayList<String> words = new ArrayList<String>();
while(file.hasNext() && words.size() < NO_OF_WORDS) {
words.add(file.next());
}
String [] wordsArray = new String[words.size()];
words.toArray(wordsArray);
long start = System.currentTimeMillis();
mergeSort(wordsArray, 0, wordsArray.length-1);
long end = System.currentTimeMillis();
System.out.println("Sorted Words: ");
for(int j = 0; j < wordsArray.length; j++) {
System.out.println(wordsArray[j]);
}
System.out.print("Running time: " + (end - start) + "ms");
}
catch(SecurityException securityException) {
System.err.println("Error");
System.exit(1);
}
}
}
Note that I've changed your main: it now uses an ArrayList to avoid null values when the text contains fewer words than the original array size. With your version, if the file has fewer than 50000 words, the array contains nulls and the merge then throws a NullPointerException.
There is a big problem with your merge() method:
String[] L = new String[n1+1];
String[] R = new String[n2+1];
will not play well with
L[n1+1] = "zzzzz"; //for infinity because if I use Math.floor, it will return a double
R[n2+1] = "zzzzz";
You will get an ArrayIndexOutOfBoundsException here regardless of the values of n1 and n2 since arrays are 0-based in Java.
