Binary search with word prefix string out of bounds - java

I have a program that creates a dictionary class, which populates an ArrayList of strings with words given as command-line arguments (in alphabetical order, all of different lengths). I need to implement binary search to look for a prefix in the dictionary as part of a backtracking method. I run into problems when the prefix is longer than a word in the dictionary. I tried to adjust the binary search for this situation, but it produces incorrect results, and I don't understand binary search well enough to fix it. If I don't account for a prefix being longer than a word, substring throws a StringIndexOutOfBoundsException. Any help would be greatly appreciated.
public int searchPrefix(String prefixKey){
int minIndex=0;
int maxIndex= newDictionary.size()-1;
return searchPrefix( prefixKey, minIndex,maxIndex);
}
public int searchPrefix(String prefixKey, int minIndex, int maxIndex){
if(minIndex>maxIndex){
return-1;
}
int midIndex=(maxIndex-minIndex)/2+minIndex;
if (prefixKey.length()>newDictionary.get(midIndex).length()){
return searchPrefix( prefixKey, midIndex+1,maxIndex);
}
else if(newDictionary.get(midIndex).length()<prefixKey.length()&&newDictionary.get(midIndex).compareTo(prefixKey.substring(0,newDictionary.get(midIndex).length()))>0){
return searchPrefix(prefixKey,minIndex,maxIndex);
}
else if(newDictionary.get(midIndex).substring(0,prefixKey.length()).compareTo(prefixKey)>0){
return searchPrefix(prefixKey,minIndex,maxIndex-1);
}
else if(newDictionary.get(midIndex).length()<prefixKey.length()&&newDictionary.get(midIndex).compareTo(prefixKey.substring(0,newDictionary.get(midIndex).length()))<0){
return searchPrefix(prefixKey,minIndex,maxIndex);
}
else if(newDictionary.get(midIndex).substring(0,prefixKey.length()).compareTo(prefixKey)<0){
return searchPrefix( prefixKey, midIndex+1,maxIndex);
}
else
return midIndex;
}

I have taken the binarySearch method from the Collections class and modified it to fit your needs.
private static int binarySearch(List<String> list, String key) {
int low = 0;
int high = list.size() - 1;
while (low <= high) {
int mid = (low + high) >>> 1;
String midVal = list.get(mid);
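int cmp;
if (midVal.length() >= key.length())
cmp = midVal.substring(0, key.length()).compareTo(key);
else {
// midVal is shorter than key: compare it against the matching part of key
cmp = midVal.compareTo(key.substring(0, midVal.length()));
if (cmp == 0)
cmp = -1; // midVal is a proper prefix of key, so it sorts before key
}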
if (cmp < 0)
low = mid + 1;
else if (cmp > 0)
high = mid - 1;
else
return mid; // key found
}
return -1; // key not found
}
Hope this will help you.

The String method compareTo already handles String values of different lengths (e.g., "ABC" precedes "ABCD"), so all of these case distinctions in your method aren't really necessary.
The cascaded if statement begins:
if (prefixKey.length() > newDictionary.get(midIndex).length()){
return searchPrefix( prefixKey, midIndex+1,maxIndex);
}
else if(newDictionary.get(midIndex).length() < prefixKey.length() && ...
But these conditions are identical, which means that the second branch is never reached.
You have two statements of this form:
return searchPrefix(prefixKey,minIndex,maxIndex);
Under no circumstances should a recursive call be made with exactly the same parameters as were passed to the current call: infinite recursion results.
Why can't you use Collections.binarySearch (the List counterpart of Arrays.binarySearch)?
I can't really suggest an improvement because you haven't described the problem. Please provide an example, giving a small dictionary and a set of keys with expected results.
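As a rough illustration of the library-based approach, here is a minimal sketch that looks up a prefix with Collections.binarySearch. The class name PrefixLookup and the sample words are made up for the example, and it assumes the list is sorted in the natural (compareTo) order with no duplicate words. When the prefix itself is not in the list, binarySearch returns -(insertionPoint) - 1, and any word starting with the prefix would have to sit exactly at that insertion point.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, not part of the original question.
class PrefixLookup {
    /**
     * Returns the index of a word that starts with prefix, or -1 if none does.
     * Assumes sortedWords is sorted in natural String order.
     */
    static int searchPrefix(List<String> sortedWords, String prefix) {
        int pos = Collections.binarySearch(sortedWords, prefix);
        if (pos >= 0) {
            return pos; // the prefix itself is a word in the list
        }
        // Not found: binarySearch returned -(insertionPoint) - 1.
        int insertionPoint = -pos - 1;
        // Words that start with the prefix sort directly after the prefix,
        // so only the word at the insertion point needs to be checked.
        if (insertionPoint < sortedWords.size()
                && sortedWords.get(insertionPoint).startsWith(prefix)) {
            return insertionPoint;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ab", "abc", "abcd", "b", "bat");
        System.out.println(searchPrefix(words, "ba"));    // index of "bat"
        System.out.println(searchPrefix(words, "abcde")); // no match
    }
}
```

Because startsWith never indexes past the end of the shorter string, a prefix longer than the dictionary word needs no special case here.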

Related

How can I make this binary search code more efficient?

public class MySearch {
public static int search(MyArray array, int value) {
int index = -1;
int start = 0, end = array.length - 1;
while(start <= end) {
int mid = (start + end) / 2;
if(array.compToValue(mid, value) == 1) end = mid - 1;
else if(array.compToValue(mid, value) == -1) start = mid + 1;
else return mid;
}
return index;
}
}
In some cases the number of comparisons is exceeded, as you can see in the screenshot. I'm not allowed to use read operations (get). The number of comparisons I'm allowed to make is O(log n).
Your code says nothing about whether the array is ordered. If the array is not ordered, you can do only linear search, and a search operation in O(log n) is impossible.
If the array is ordered, as Turing85's comment says:
Extract array.compToValue(mid, value) into a variable, and use this variable in the if-clauses instead of calculating the value twice.
Another piece of useful advice is to always use curly brackets. Java lets you drop them if the block is only a single statement, but doing so is bad practice.
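A minimal sketch of that advice follows. The question does not show MyArray, so MyArrayStub below is a hypothetical stand-in that counts calls to compToValue and is assumed to return 1, -1 or 0 when the element is greater than, less than or equal to the value; the class name MySearchOnce is also made up.

```java
// Hypothetical stand-in for the question's MyArray; the real class is not shown.
class MyArrayStub {
    private final int[] data;
    final int length;
    int comparisons = 0; // number of compToValue calls so far

    MyArrayStub(int... sortedValues) {
        this.data = sortedValues;
        this.length = sortedValues.length;
    }

    /** Assumed contract: 1 if data[index] > value, -1 if smaller, 0 if equal. */
    int compToValue(int index, int value) {
        comparisons++;
        return Integer.signum(Integer.compare(data[index], value));
    }
}

class MySearchOnce {
    static int search(MyArrayStub array, int value) {
        int start = 0;
        int end = array.length - 1;
        while (start <= end) {
            int mid = start + (end - start) / 2;
            // Compare once per iteration and reuse the result.
            int cmp = array.compToValue(mid, value);
            if (cmp > 0) {
                end = mid - 1;
            } else if (cmp < 0) {
                start = mid + 1;
            } else {
                return mid;
            }
        }
        return -1; // not found
    }
}
```

With one call per iteration, a sorted array of n elements costs at most floor(log2(n)) + 1 comparisons, instead of up to twice that.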

How to use binary search of a sorted list and then count the comparisons

I am trying to run a binary search on my sorted list and then count the number of comparisons it takes to find the key. My code is:
public class BinarySearch {
private static int comparisions = 0;
public static void main(String[] args) {
int [] list = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20};
int i = BinarySearch.BinSearch(list, 20);
System.out.println(comparisions);
}
public static int BinSearch(int[] list, int key) {
int low = 0;
int high = list.length - 1;
int mid = (high + low) / 2;
if (key < list[mid]) {
high = mid - 1;
comparisions++;
} else if (key == list[mid]) {
comparisions++;
return mid;
} else {
low = mid + 1;
comparisions++;
}
return -1;
}
}
So far it only gives me 1 for the comparison no matter what number is the key.
Your code is missing the looping part of the search. That looping can be done either with recursion or with a while loop. In both cases you have to ask yourself whether you just want to know the count or actually return the count of comparisons. Since your method currently returns the index, it cannot easily return the count of comparisons at the same time. For that to work you need to return either an array of two ints or a custom class IndexAndComparisonCount { ... }.
If you use a recursive approach, you need to increment the count whenever you do a comparison. When you make a recursive call, take the result it returns and add the comparisons made in the current call to its comparisonCount:
if (... < ...) {
IndexAndComparisonCount ret = BinSearch(...);
ret.comparisonCount += 1;
return ret;
} else if (... > ...) {
IndexAndComparisonCount ret = BinSearch(...);
ret.comparisonCount += 2; // since you did compare twice already
return ret;
} else {
return new IndexAndComparisonCount(mid, 2); // you compared twice as well
}
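Filling in the elided parts, the recursive variant might look like the sketch below. The fields of IndexAndComparisonCount, the class name CountingBinarySearch and the extra low/high parameters are assumptions made for this example; the counting rule is the one from the outline above (one comparison for the "<" branch, two for the others).

```java
// Hypothetical result holder; the field names are assumptions.
class IndexAndComparisonCount {
    final int index;
    int comparisonCount;

    IndexAndComparisonCount(int index, int comparisonCount) {
        this.index = index;
        this.comparisonCount = comparisonCount;
    }
}

class CountingBinarySearch {
    static IndexAndComparisonCount binSearch(int[] list, int key) {
        return binSearch(list, key, 0, list.length - 1);
    }

    static IndexAndComparisonCount binSearch(int[] list, int key, int low, int high) {
        if (low > high) {
            // Empty range: key is absent, no key comparison made in this call.
            return new IndexAndComparisonCount(-1, 0);
        }
        int mid = low + (high - low) / 2;
        if (key < list[mid]) {
            IndexAndComparisonCount ret = binSearch(list, key, low, mid - 1);
            ret.comparisonCount += 1; // only "key < list[mid]" was evaluated
            return ret;
        } else if (key > list[mid]) {
            IndexAndComparisonCount ret = binSearch(list, key, mid + 1, high);
            ret.comparisonCount += 2; // "<" failed, then ">" succeeded
            return ret;
        } else {
            return new IndexAndComparisonCount(mid, 2); // both comparisons failed
        }
    }
}
```

Returning the count in the result object, rather than keeping it in a static field, means repeated searches do not need the counter reset between calls.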

Incrementing statements to track comparisons in recursive method

I am trying to construct a recursive binary search method that searches a sorted array of comparable objects for an object of interest. This is part of a larger algorithm that searches a collection of sorted arrays and looks for elements common to all arrays in the collection. The goal is to make the search/comparison portion of the algorithm as efficient as possible. A linear solution should be possible. This is the method:
private static boolean BinarySearch(Comparable[] ToSearch, Comparable ToFind, int first, int last){
boolean found = false;
int mid = first + (last - first) / 2;
comparisons++;
if(first > last){
found = false;
}
else if(ToFind.compareTo(ToSearch[mid]) == 0){
found = true;
comparisons++;
}
else if(ToFind.compareTo(ToSearch[mid]) < 0) {
found = BinarySearch(ToSearch, ToFind, first, mid - 1);
comparisons++;
}
else{
found = BinarySearch(ToSearch, ToFind,mid + 1, last);
comparisons++;
}
return found;
}
The problem I am having is tracking the number of comparisons through the recursion. Because I have to count the comparisons that evaluate to false as well as true, I tried placing the incrementing statement inside each selection statement, but this does not work because the counter is not incremented if the condition evaluates to false. I also cannot place the increments between the selection statements because that would give me "else without if" errors. I am wondering if it was a bad idea to use recursion at all for the search, but I want to believe it is possible. Any help would be appreciated.
Perhaps you could set a variable in each if block with the number of comparisons it took to get there, and add it at the end?
private static boolean BinarySearch(Comparable[] ToSearch, Comparable ToFind, int first, int last){
boolean found = false;
int newComparisons = 0;
int mid = first + (last - first) / 2;
if(first > last){
found = false;
newComparisons = 1;
}
else if(ToFind.compareTo(ToSearch[mid]) == 0){
found = true;
newComparisons = 2;
}
else if(ToFind.compareTo(ToSearch[mid]) < 0) {
found = BinarySearch(ToSearch, ToFind, first, mid - 1);
newComparisons = 3;
}
else{
found = BinarySearch(ToSearch, ToFind,mid + 1, last);
newComparisons = 3;
}
comparisons += newComparisons;
return found;
}
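A different way to get around the "else without if" problem, sketched here under a changed counting rule, is to call compareTo once per call, store the result, and increment the counter right after it. Note that this counts compareTo calls rather than evaluated conditions, so the totals differ from the version above. The class name RecursiveSearchCounter and the generic signature are assumptions for the example.

```java
class RecursiveSearchCounter {
    // Mirrors the question's static counter; reset it before each search.
    static int comparisons = 0;

    static <T extends Comparable<T>> boolean binarySearch(T[] toSearch, T toFind,
                                                          int first, int last) {
        if (first > last) {
            return false; // empty range: no element comparison is made
        }
        int mid = first + (last - first) / 2;
        int cmp = toFind.compareTo(toSearch[mid]);
        comparisons++; // counted whatever the outcome of the comparison
        if (cmp == 0) {
            return true;
        } else if (cmp < 0) {
            return binarySearch(toSearch, toFind, first, mid - 1);
        } else {
            return binarySearch(toSearch, toFind, mid + 1, last);
        }
    }
}
```

Because the increment sits between the comparison and the if chain, it runs on every path without breaking the if/else structure.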

What counts as a binary search comparison?

I'm writing a program that determines how many comparisons it takes to run a binary search algorithm for a given number and sorted array. What I don't understand is what counts as a comparison.
// returns the number of comparisons it takes to find key in sorted list, array
public static int binarySearch(int key, int[] array) {
int left = 0;
int mid;
int right = array.length - 1;
int i = 0;
while (true) {
if (left > right) {
mid = -1;
break;
}
else {
mid = (left + right)/2;
if (key < array[mid]) {
i++;
right = mid - 1;
}
else if (key > array[mid]) {
i++;
left = mid + 1;
}
else {
break; // success
}
}
}
return i;
}
The function returns i, which is supposed to be the total number of comparisons made in finding the key in array. But what defines a comparison? Is it any time there is a conditional?
Thanks for any help, just trying to understand this concept.
Usually, a comparison occurs each time the key is compared to an array element. The code seems to not be counting that, though. It is counting how many times one of the search boundaries (left or right) is changed. It's not exactly the same thing being counted, but it's pretty close to the same thing, since the number of times a boundary is shifted is directly related to the number of times through the loop and hence to the number of times a comparison is made. At most, the two ways of counting will be off by 1 or 2 (I didn't bother to figure that out exactly).
Note also that, under the usual definition, the code could be rewritten to use Integer.compare(int, int) to do a single comparison of key with array[mid] that determines whether key is less than, equal to, or greater than array[mid]:
public static int binarySearch(int key, int[] array) {
int left = 0;
int mid;
int right = array.length - 1;
int i = 0;
while (left <= right) {
mid = (left + right)/2;
int comp = Integer.compare(key, array[mid]);
i++;
if (comp < 0) {
right = mid - 1;
}
else if (comp > 0) {
left = mid + 1;
}
else {
break; // success
}
}
return i;
}

Finding index of array where min value occurs

This one is making my head spin. Just when I think I got it, I realize something's not right. I have to use recursion for this assignment. Any hints?
/**
* Uses recursion to find index of the shortest string.
* Null strings are treated as infinitely long.
* Implementation notes:
* The base case if lo == hi.
* Use safeStringLength(paths[xxx]) to determine the string length.
* Invoke recursion to test the remaining paths (lo +1)
*/
static int findShortestString(String[] paths, int lo, int hi) {
int min=lo;
if (lo==hi)
return min;
if (safeStringLength(paths[lo]) < safeStringLength(paths[lo+1])){
min=lo;
return Math.min(min, findShortestString(paths, lo+1, hi));
}
else{
min=lo+1;
return Math.min(min, findShortestString(paths, lo+1, hi));
}
}
I think I've got something here:
static int findShortestString(String[] paths, int lo, int hi)
{
if (lo==hi)
return lo;
int ShortestIndexSoFar = findShortestString(paths, lo+1, hi);
if(safeStringLength(paths[ShortestIndexSoFar]) < safeStringLength(paths[lo]))
return ShortestIndexSoFar;
else
return lo;
}
static int safeStringLength(String str)
{
if(str == null)
return Integer.MAX_VALUE;
return str.length();
}
Explaining why this works:
Here's a sample:
[0] ab
[1] abcd
[2] a
[3] abc
[4] ab
Obviously, index 2 is the shortest one.
Think bottom-up. Read the following starting from the bottom and working upwards.
I make it sound like each function call is talking to the call above it in the chain.
Each line is led by the function parameters that were passed.
"[0] ab (paths, 0, 4): return 2, coz he's shorter than me or anyone before us"
"[1] abcd (paths, 1, 4): return 2, coz he's shorter than me or anyone before us"
"[2] a (paths, 2, 4): return 2, I'm shorter than anyone before me"
"[3] abc (paths, 3, 4): return 4, coz he's shorter than me or anyone before us"
"[4] ab (paths, 4, 4): return 4, I'm the shortest; I don't know any better"
That is exactly what happens in the code.
When we define ShortestIndexSoFar, each call learns the index of the shortest of all the paths after it.
Right after that, the call checks whether its own index holds a shorter path than the shortest of all the ones after it.
Keep trickling the shortest one upward, and the final call will return the shortest index.
Does that make sense?
Since this is homework, here's a hint to help you learn:
The signature of the findShortestString method suggests that you should be using a binary search. Why would I say that? Why would it be a good idea to do that? All of the other solutions suffer from a practical problem in Java ... what would that be?
To people other than the OP ... PLEASE DON'T GIVE THE ANSWERS AWAY ... e.g. by correcting your answers!
Why not just get the length of each element and sort the returned length to get the ordering? Like this.
int[] intArray = {10, 17, 8, 99, 1}; // length of each element of string array
Arrays.sort(intArray);
One solution looks like this:
// Start the recursion with index = 0 and minIndex = 0.
public static final int findShortestString(final String[] arr, final int index, final int minIndex) {
if(index >= arr.length - 1 ) {
return minIndex;
}
int currentMinIndex = minIndex;
// null strings count as infinitely long, so they never replace the current minimum
if(safeStringLength(arr[minIndex]) > safeStringLength(arr[index+1])){
currentMinIndex = index + 1;
}
return findShortestString(arr, index + 1, currentMinIndex);
}
public static final int safeStringLength(final String string) {
if( null == string) return Integer.MAX_VALUE;
return string.length();
}
A simple solution:
/**
* Uses recursion to find index of the shortest string.
* Null strings are treated as infinitely long.
* Implementation notes:
* The base case if lo == hi.
* Use safeStringLength(paths[xxx]) to determine the string length.
* Invoke recursion to test the remaining paths (lo +1)
*/
static int findShortestString(String[] paths, int lo, int hi) {
if (lo==hi)
return lo;
if (paths[lo] == null)
return findShortestString(paths, lo+1, hi);
int bestIndex = findShortestString(paths, lo+1, hi);
if (safeStringLength(paths[lo]) < safeStringLength(paths[bestIndex]))
return lo;
return bestIndex;
}
Calling Math.min on the result of findShortestString isn't meaningful: it compares two indices, not the lengths of the strings at those indices. The best way to start this kind of problem is to consider just a single recursive step. You can do this by considering what happens with an array of only two strings to compare.
What you want to do is check the length of the first string against the length of the second. The real trick, though, is that you want to find the second by calling the function recursively. This is straightforward enough, but requires determining the end case of your recursion. You did this successfully: it's when lo == hi. That is, when lo == hi the shortest known string is at lo (it's the only known string!).
Ok, so back to comparing just two strings. Given that you know that you want to compare the length of two strings stored in paths, you might do something like this (without recursion):
if(safeStringLength(paths[0]) < safeStringLength(paths[1])){
return 0; // paths[0] is shorter
}else{
return 1; // paths[1] is shorter
}
But you want to recurse, and in the recursive step you need to somehow generate that 1 of paths[1]. We already figured out how to do that: when lo == hi, we return lo. Thus the recursion step is "compare the length of the current string to the length of the string at the best known index". Wait, we have a function for that! It's findShortestString. Thus we can modify what's written above to be slightly more concise, and add in the base case to get:
static int findShortestString(String[] paths, int lo, int hi) {
// base case, no comparisons, the only known string is the shortest one
if(lo == hi){
return lo;
}
int bestIndex = findShortestString(paths, lo+1, hi);
return safeStringLength(paths[lo]) < safeStringLength(paths[bestIndex]) ?
lo : bestIndex;
}
static int findShortestString(String[] paths, int lo, int hi)
{
if (lo==hi)
return lo;
int ShortestIndexDown = findShortestString(paths, lo, (hi + lo)/2);
int ShortestIndexUp = findShortestString(paths, (lo+hi)/2+1, hi);
return safeStringLength(paths[ShortestIndexDown]) < safeStringLength(paths[ShortestIndexUp])?ShortestIndexDown:ShortestIndexUp;
}
static int safeStringLength(String str)
{
if(str == null)
return Integer.MAX_VALUE;
return str.length();
}
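To see why halving the range can matter in practice, here is a small self-contained sketch of the same idea. The class name ShortestStringDemo and the tie-breaking rule (ties go to the lower index) are choices made for this example, not taken from the answers. Each call splits the range in two, so the recursion depth grows with log2(n) rather than n, which keeps large arrays from exhausting the call stack the way a lo + 1 recursion may.

```java
// Hypothetical demo class; names other than findShortestString and
// safeStringLength are made up for illustration.
class ShortestStringDemo {
    /** Null strings are treated as infinitely long. */
    static int safeStringLength(String str) {
        return str == null ? Integer.MAX_VALUE : str.length();
    }

    /** Returns the index of the shortest string in paths[lo..hi]. */
    static int findShortestString(String[] paths, int lo, int hi) {
        if (lo == hi) {
            return lo; // a single element is trivially the shortest
        }
        int mid = lo + (hi - lo) / 2;
        int left = findShortestString(paths, lo, mid);
        int right = findShortestString(paths, mid + 1, hi);
        // Strict "<" means ties are resolved in favour of the lower index.
        return safeStringLength(paths[right]) < safeStringLength(paths[left])
                ? right : left;
    }

    public static void main(String[] args) {
        String[] paths = {"ab", "abcd", "a", "abc", "ab"};
        System.out.println(findShortestString(paths, 0, paths.length - 1)); // prints 2
    }
}
```

On the five-element sample from the walkthrough above this returns index 2, the same answer as the linear recursion.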
