Create a lazy stream of all anagrams of a given word - java

I'm trying to write code to create a lazy stream of all anagrams of a given word. I was using this code originally:
public static Stream<WordSequence> anagram(Stream<WordSequence> data, Object[] parameters) {
return data.unordered().flatMap(WordSequence.forEachWord(Functions::allAnagrams)).distinct();
}
private static Stream<Word> allAnagrams(Word data) {
if (data.length() <= 1)
return Stream.of(data);
Stream<Word> ret = Stream.empty();
for (int i = 0; i < data.length(); i++) {
char ch = data.charAt(i);
String rest = new StringBuilder(data).deleteCharAt(i).toString();
ret = Stream.concat(ret, allAnagrams(new Word(rest)).map(word -> new Word(ch + word.toString()))).unordered();
}
return ret;
}
(I'm using my own WordSequence and Word classes.)
I realized that this was not very efficient because it's just concatenating a bunch of empty and one-element streams, and it also computes all the anagrams before returning the stream of them. I found this wonderful algorithm in Core Java somewhere:
StringBuilder b = new StringBuilder(word);
for (int i = b.length() - 1; i > 0; i--)
if (b.charAt(i - 1) < b.charAt(i)) {
int j = b.length() - 1;
while (b.charAt(i - 1) > b.charAt(j))
j--;
swap(b, i - 1, j);
reverse(b, i);
return new Word(b.toString());
}
return new Word(b.reverse().toString());
If you call it with a word, it will return the next word in a sequence of all the anagrams of the word.
I implemented it as follows:
public static Stream<WordSequence> anagram(Stream<WordSequence> data, Object[] parameters) {
class AnagramIterator implements Iterator<Word> {
private final Word start;
private Word current;
private boolean done;
AnagramIterator(Word start) {
current = this.start = start;
}
@Override
public boolean hasNext() {
return !done;
}
@Override
public Word next() {
if (done)
throw new NoSuchElementException();
StringBuilder b = new StringBuilder(current);
for (int i = b.length() - 1; i > 0; i--)
if (b.charAt(i - 1) < b.charAt(i)) {
int j = b.length() - 1;
while (b.charAt(i - 1) > b.charAt(j))
j--;
swap(b, i - 1, j);
reverse(b, i);
current = new Word(b.toString());
done = current.equals(start);
return current;
}
current = new Word(b.reverse().toString());
done = current.equals(start);
return current;
}
private void swap(StringBuilder b, int i, int j) {
char tmp = b.charAt(i);
b.setCharAt(i, b.charAt(j));
b.setCharAt(j, tmp);
}
private void reverse(StringBuilder b, int i) {
int j = b.length() - 1;
while (i < j) {
swap(b, i, j);
i++;
j--;
}
}
}
return data.flatMap(WordSequence.forEachWord(w -> StreamSupport.stream(
Spliterators.spliteratorUnknownSize(
new AnagramIterator(w),
Spliterator.DISTINCT + Spliterator.IMMUTABLE + Spliterator.NONNULL),
false)));
}
However, that algorithm has a problem. If you give it a word that ends with a double letter and then another letter, where the double letter value is numerically less than the single letter, such as "ees", you get this sequence of anagrams:
ees
ese
ees
and that repeats infinitely
That sequence doesn't include "see".
How can I do this?
My code is on GitHub.

I thought about what the algorithm was doing and had a flash of insight. Given the string "ese", this is what the algorithm does:
Find i, which in this case points to the s.
Find j, which points to the e.
Swap i - 1 and j, which swaps the two e's.
Reverse the string from i onward, which swaps the s and the e.
What we want it to do is have j point to the s too, which would make it swap the first e and the s. So how can we modify the algorithm to make that happen?
Well, here's what it does to find j:
Start by pointing j at the last e.
The character at i - 1, which is an e, is not greater than the character at j, which is the other e, so the loop stops immediately and j stays on the last e.
Here's my flash of insight: change the comparison from "greater than" to "greater than or equal to". I changed that, and it seems to have worked!
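As a rough sketch of that change in isolation, the version below works on plain String values instead of my Word class and wraps the step function in a stream. The class name NextAnagram and the method names next and anagrams are made up for illustration, and the three-argument Stream.iterate requires Java 9 or later.

```java
import java.util.stream.Stream;

class NextAnagram {
    // Returns the next anagram in lexicographic order, wrapping around to
    // the smallest anagram after the largest one.
    static String next(String word) {
        StringBuilder b = new StringBuilder(word);
        for (int i = b.length() - 1; i > 0; i--) {
            if (b.charAt(i - 1) < b.charAt(i)) {
                int j = b.length() - 1;
                // ">=" rather than ">": also skip characters equal to the pivot
                while (b.charAt(i - 1) >= b.charAt(j)) {
                    j--;
                }
                swap(b, i - 1, j);
                reverse(b, i);
                return b.toString();
            }
        }
        // No ascent found: word was the last anagram, so wrap around.
        return b.reverse().toString();
    }

    // Yields start, then each following anagram, stopping just before the
    // sequence would return to start. Only the first successor is computed
    // up front; the rest are computed as the stream is consumed.
    static Stream<String> anagrams(String start) {
        return Stream.concat(
                Stream.of(start),
                Stream.iterate(next(start), w -> !w.equals(start), NextAnagram::next));
    }

    private static void swap(StringBuilder b, int i, int j) {
        char tmp = b.charAt(i);
        b.setCharAt(i, b.charAt(j));
        b.setCharAt(j, tmp);
    }

    // Reverses the suffix of b that starts at index from.
    private static void reverse(StringBuilder b, int from) {
        for (int i = from, j = b.length() - 1; i < j; i++, j--) {
            swap(b, i, j);
        }
    }
}
```

With this version, NextAnagram.anagrams("ees") yields ees, ese, see and then ends, so "see" is no longer skipped.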

Related

How to generate combination of numbers in more optimized way in Java?

I had a problem statement that requires passing 3 different numbers to a method and checking which triplets of numbers satisfy certain constraints.
Here is my code. Instead of using nested loops, is there a more efficient way to check which triplets satisfy the constraints?
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
public class Solution
{
static List l = new ArrayList();
static int geometricTrick(String s)
{
int count = 0;
for (int i = 0; i < s.length(); i++)
{
for (int j = 0; j < s.length(); j++)
{
for (int k = 0; k < s.length(); k++)
{
if (is1stConstraintTrue(s, i, j, k) && is2ndConstraintTrue(i, j, k))
{
l.add(new Triplet(i, j, k));
}
}
}
}
count = l.size();
return count;
}
static boolean is2ndConstraintTrue(int i, int j, int k)
{
boolean retVal = false;
double LHS = Math.pow((j + 1), 2);
double RHS = (i + 1) * (k + 1);
if (LHS == RHS)
retVal = true;
else
retVal = false;
return retVal;
}
static boolean is1stConstraintTrue(String s, int i, int j, int k)
{
boolean retVal = false;
char[] localChar = s.toCharArray();
if (localChar[i] == 'a' && localChar[j] == 'b' && localChar[k] == 'c')
{
retVal = true;
}
return retVal;
}
static class Triplet
{
public int i, j, k;
public Triplet(int i, int j, int k)
{
this.i= i;
this.j= j;
this.k= k;
}
}
public static void main(String[] args)
{
Scanner in = new Scanner(System.in);
int n = in.nextInt();
String s = in.next();
int result = geometricTrick(s);
System.out.println(result);
}
}
Here are some hints:
Computing the square by multiplication will be faster than using pow.
String::toCharArray() is expensive: it copies all characters into a new array each time you call it. Don't call it multiple times.
If the result is the number of triplets, you don't need to build a list. You don't even need to create Triplet instances. Just count them.
Nested loops are not inefficient, if you need to iterate all combinations.
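A minimal sketch of these hints put together might look like the following. The class name GeometricCount and the method name countTriplets are made up for illustration. The constraints are the ones from the question's code: 0-based indices, with no ordering required between i, j and k.

```java
class GeometricCount {
    // Counts index triples (i, j, k) with s[i] == 'a', s[j] == 'b',
    // s[k] == 'c' and (j + 1)^2 == (i + 1) * (k + 1).
    static long countTriplets(String s) {
        char[] chars = s.toCharArray(); // copy the characters once, outside the loops
        int n = chars.length;
        long count = 0;
        for (int j = 0; j < n; j++) {
            if (chars[j] != 'b') {
                continue;
            }
            // plain multiplication instead of Math.pow
            long square = (long) (j + 1) * (j + 1);
            for (int i = 0; i < n; i++) {
                if (chars[i] != 'a') {
                    continue;
                }
                for (int k = 0; k < n; k++) {
                    if (chars[k] == 'c' && (long) (i + 1) * (k + 1) == square) {
                        count++; // just count, no list and no Triplet objects
                    }
                }
            }
        }
        return count;
    }
}
```

The loops are still nested, so the worst case stays cubic. The early continue statements only skip positions that cannot satisfy the first constraint.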
Here is a simple and "stupid" solution that avoids those inner loops.
Let's use a factory that works like an Iterator. This "hides" the loops behind conditional statements.
public class TripletFactory{
final int maxI, maxJ, maxK;
// k starts at -1 so that the first call to next() yields (0, 0, 0)
int i, j, k = -1;
public TripletFactory(int i, int j, int k){
this.maxI = i;
this.maxJ = j;
this.maxK = k;
}
public Triplet next(){
if(++k > maxK){
k = 0;
if(++j > maxJ){
j = 0;
if(++i > maxI){
return null;
}
}
}
return new Triplet(i,j,k);
}
}
That way, you just keep getting a new Triplet until you receive a null instance, all in ONE loop:
TripletFactory fact = new TripletFactory(2, 3 ,5);
Triplet t = null;
while((t = fact.next()) != null){
System.out.println(t);
}
//no more Triplet
Use each Triplet however you like.
You will then have to update your constraint methods to take a Triplet and check its fields (or getters, if you add them).
The check could also be a method of Triplet, so that it validates itself.
Note:
I used the same naming as Iterator because the factory could implement Iterator and Iterable, which allows a notation like:
for(Triplet t : factory){
...
}
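One way the Iterable variant could look is sketched below. The class name TripletRange is made up, and each triplet is returned as a plain int[] {i, j, k} rather than the question's Triplet class, only to keep the sketch self-contained.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class TripletRange implements Iterable<int[]> {
    private final int maxI, maxJ, maxK;

    TripletRange(int maxI, int maxJ, int maxK) {
        this.maxI = maxI;
        this.maxJ = maxJ;
        this.maxK = maxK;
    }

    @Override
    public Iterator<int[]> iterator() {
        return new Iterator<int[]>() {
            private int i = 0, j = 0, k = 0;
            // nothing to iterate if any bound is negative
            private boolean done = maxI < 0 || maxJ < 0 || maxK < 0;

            @Override
            public boolean hasNext() {
                return !done;
            }

            @Override
            public int[] next() {
                if (done) {
                    throw new NoSuchElementException();
                }
                int[] current = {i, j, k};
                // advance like an odometer: k moves fastest, then j, then i
                if (++k > maxK) {
                    k = 0;
                    if (++j > maxJ) {
                        j = 0;
                        if (++i > maxI) {
                            done = true;
                        }
                    }
                }
                return current;
            }
        };
    }
}
```

A loop such as for (int[] t : new TripletRange(2, 3, 5)) then visits every combination from {0, 0, 0} up to {2, 3, 5}, bounds inclusive.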
The naive approach you presented has cubic time complexity (three nested for-loops). If you have many occurrences of 'a', 'b' and 'c' within your input, it may be worthwhile to first determine the indices of all 'a's, 'b's and 'c's and then check your second condition only over this set.
import java.util.ArrayList;
import java.util.List;
public class Main {
public static List<Integer> getAllOccurrences(String input, char of) {
List<Integer> occurrences = new ArrayList<Integer>();
char[] chars = input.toCharArray();
for (int idx = 0; idx < chars.length; ++idx) {
if (of == chars[idx]) {
occurrences.add(idx);
}
}
return (occurrences);
}
static List<Triplet> geometricTrick(String input){
List<Integer> allAs = getAllOccurrences(input, 'a');
List<Integer> allBs = getAllOccurrences(input, 'b');
List<Integer> allCs = getAllOccurrences(input, 'c');
List<Triplet> solutions = new ArrayList<Triplet>();
// reorder the loops, so that the b-loop is the innermost loop (see
// below why this is useful and how it is exploited).
for (int a : allAs) {
for (int c : allCs) {
// calculate lhs as soon as possible, no need to recalculate the
// same value multiple times
final int lhs = ((a + 1) * (c + 1));
for (int b : allBs) {
final int rhs = ((b + 1) * (b + 1));
if (lhs > rhs) {
continue;
}
/* else */ if (lhs == rhs) {
solutions.add(new Triplet(a, b, c));
}
// by construction, the b-values are in ascending order.
// Thus if the rhs-value is larger than or equal to the
// lhs-value, we can skip all remaining b-values for this
// lhs-value.
// if (lhs <= rhs) {
break;
// }
}
}
}
return (solutions);
}
static class Triplet {
public final int i;
public final int j;
public final int k;
public Triplet(int i, int j, int k) {
this.i = i;
this.j = j;
this.k = k;
}
}
}
Searching all occurrences of one char within a given String takes O(n) (method getAllOccurrences(...)); calling it three times does not change the complexity (3 * n ∈ O(n)). Iterating through all possible combinations of a, b and c takes #a * #b * #c time, where #a, #b and #c stand for the counts of a's, b's and c's in your input. This gives a total time complexity of O(n + #a * #b * #c).
Note that the worst case time complexity, i.e. if 1/3 of your string consists of a's, 1/3 consists of b's and 1/3 consists of c's, is still cubic.

Algo: Find anagram of given string at a given index in lexicographically sorted order

I need to write an algorithm that finds the anagram of a given string at a given index in lexicographically sorted order. For example:
Consider the string ABC. All of its anagrams in sorted order are: ABC ACB BAC BCA CAB CBA. So the value for index 5 is CAB. Duplicates also need to be handled: for AADFS, the anagram at index 32 is DFASA.
I have written an algorithm for this, but I think there should be something less complex.
import java.util.*;
public class Anagram {
static class Word {
Character c;
int count;
Word(Character c, int count) {
this.c = c;
this.count = count;
}
}
public static void main(String[] args) {
System.out.println(findAnagram("aadfs", 32));
}
private static String findAnagram(String word, int index) {
// positions start at 0 internally, so convert the 1-based index
index--;
char[] array = word.toCharArray();
List<Character> chars = new ArrayList<>();
for (int i = 0; i < array.length; i++) {
chars.add(array[i]);
}
// Sort List
Collections.sort(chars);
// To maintain duplicates
List<Word> words = new ArrayList<>();
Character temp = chars.get(0);
int count = 1;
int total = chars.size();
for (int i = 1; i < chars.size(); i++) {
// compare Character values with equals(), not by reference
if (temp.equals(chars.get(i))) {
count++;
} else {
words.add(new Word(temp, count));
count = 1;
temp = chars.get(i);
}
}
words.add(new Word(temp, count));
String anagram = "";
while (index > 0) {
Word selectedWord = null;
// find best index
int value = 0;
for (int i = 0; i < words.size(); i++) {
int com = combination(words, i, total);
if (index < value + com) {
index -= value;
if (words.get(i).count == 1) {
selectedWord = words.remove(i);
} else {
words.get(i).count--;
selectedWord = words.get(i);
}
break;
}
value += com;
}
anagram += selectedWord.c;
total--;
}
// put remaining in series
for (int i = 0; i < words.size(); i++) {
for (int j = 0; j < words.get(i).count; j++) {
anagram += words.get(i).c;
}
}
return anagram;
}
private static int combination(List<Word> words, int index, int total) {
int value = permutation(total - 1);
for (int i = 0; i < words.size(); i++) {
if (i == index) {
int v = words.get(i).count - 1;
if (v > 0) {
value /= permutation(v);
}
} else {
value /= permutation(words.get(i).count);
}
}
return value;
}
private static int permutation(int i) {
if (i == 1) {
return 1;
}
return i * permutation(i - 1);
}
}
Can someone help me with less complex logic?
I wrote the following code to solve your problem.
I assume that the given String is sorted.
The permutations(String prefix, char[] word, ArrayList permutations_list) function generates all possible permutations of the given string without duplicates and stores them in a list named permutations_list. Thus, the word permutations_list.get(index - 1) is the desired output.
For example, assume that someone gives us the word "aab".
We have to solve this problem recursively:
Problem 1: permutations("","aab").
That means that we have to solve the problem:
Problem 2: permutations("a","ab").
String "ab" has only two letters, therefore the possible permutations are "ab" and "ba". Hence, we store in permutations_list the words "aab" and "aba".
Problem 2 has been solved. Now we go back to problem 1.
We swap the first "a" and the second "a" and we realize that these letters are the same. So we skip this case(we avoid duplicates).
Next, we swap the first "a" and "b". Now, the problem 1 has changed and we want to solve the new one:
Problem 3: permutations("","baa").
The next step is to solve the following problem:
Problem 4: permutations("b","aa").
String "aa" has only two same letters, therefore there is one possible permutation "aa". Hence, we store in permutations_list the word "baa"
Problem 4 has been solved. Finally, we go back to problem 3 and problem 3 has been solved. The final permutations_list contains "aab", "aba" and "baa".
Hence, findAnagram("aab", 2) returns the word "aba".
import java.util.ArrayList;
import java.util.Arrays;
public class AnagramProblem {
public static void main(String args[]) {
System.out.println(findAnagram("aadfs",32));
}
public static String findAnagram(String word, int index) {
ArrayList<String> permutations_list = new ArrayList<String>();
permutations("",word.toCharArray(), permutations_list);
return permutations_list.get(index - 1);
}
public static void permutations(String prefix, char[] word, ArrayList<String> permutations_list) {
boolean duplicate = false;
if (word.length==2 && word[0]!=word[1]) {
String permutation1 = prefix + String.valueOf(word[0]) + String.valueOf(word[1]);
permutations_list.add(permutation1);
String permutation2 = prefix + String.valueOf(word[1]) + String.valueOf(word[0]);
permutations_list.add(permutation2);
return;
}
else if (word.length==2 && word[0]==word[1]) {
String permutation = prefix + String.valueOf(word[0]) + String.valueOf(word[1]);
permutations_list.add(permutation);
return;
}
for (int i=0; i < word.length; i++) {
if (!duplicate) {
permutations(prefix + word[0], new String(word).substring(1,word.length).toCharArray(), permutations_list);
}
if (i < word.length - 1) {
char temp = word[0];
word[0] = word[i+1];
word[i+1] = temp;
}
if (i < word.length - 1 && word[0]==word[i+1]) duplicate = true;
else duplicate = false;
}
}
}
I think your problem will become a lot simpler if you consider generating the anagrams in alphabetical order, so you don't have to sort them afterwards.
The following code (from Generating all permutations of a given string) generates all permutations of a String. The order of these permutations is given by the initial order of the input String. If you sort the String beforehand, the anagrams will thus be added in sorted order.
To prevent duplicates, you can simply maintain a Set of the Strings you have already added. If this Set does not contain the anagram you're about to add, then you can safely add it to the list of anagrams.
Here is the code for the solution I described. I hope you find it simpler than your solution.
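import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class Anagrams {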
private List<String> sortedAnagrams;
private Set<String> handledStrings;
public static void main(String args[]) {
Anagrams anagrams = new Anagrams();
List<String> list = anagrams.permutations(sort("AASDF"));
System.out.println(list.get(31));
}
public List<String> permutations(String str) {
handledStrings = new HashSet<String>();
sortedAnagrams = new ArrayList<String>();
permutation("", str);
return sortedAnagrams;
}
private void permutation(String prefix, String str) {
int n = str.length();
if (n == 0){
if(! handledStrings.contains(prefix)){
//System.out.println(prefix);
sortedAnagrams.add(prefix);
handledStrings.add(prefix);
}
}
else {
for (int i = 0; i < n; i++)
permutation(prefix + str.charAt(i), str.substring(0, i) + str.substring(i + 1, n));
}
}
public static String sort(String str) {
char[] arr = str.toCharArray();
Arrays.sort(arr);
return new String(arr);
}
}
If you create a "next permutation" method which alters an array to its next lexicographical permutation, then your base logic could be to just invoke that method n-1 times in a loop.
There's a nice description with code that can be found here. Here's both the basic pseudocode and an example in Java adapted from that page.
/*
1. Find largest index i such that array[i − 1] < array[i].
(If no such i exists, then this is already the last permutation.)
2. Find largest index j such that j ≥ i and array[j] > array[i − 1].
3. Swap array[j] and array[i − 1].
4. Reverse the suffix starting at array[i].
*/
boolean nextPermutation(char[] array) {
int i = array.length - 1;
while (i > 0 && array[i - 1] >= array[i]) i--;
if (i <= 0) return false;
int j = array.length - 1;
while (array[j] <= array[i - 1]) j--;
char temp = array[i - 1];
array[i - 1] = array[j];
array[j] = temp;
j = array.length - 1;
while (i < j) {
temp = array[i];
array[i] = array[j];
array[j] = temp;
i++;
j--;
}
return true;
}
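To connect this back to the question, here is a rough sketch of the "invoke it n-1 times" idea. The class name KthAnagram is made up, findAnagram mirrors the signature used in the question (1-based index), and nextPermutation is the method shown above, made static.

```java
import java.util.Arrays;

class KthAnagram {
    // Rearranges array into its next lexicographic permutation.
    // Returns false if array already holds the last permutation.
    static boolean nextPermutation(char[] array) {
        int i = array.length - 1;
        while (i > 0 && array[i - 1] >= array[i]) i--;
        if (i <= 0) return false;
        int j = array.length - 1;
        while (array[j] <= array[i - 1]) j--;
        char temp = array[i - 1];
        array[i - 1] = array[j];
        array[j] = temp;
        j = array.length - 1;
        while (i < j) {
            temp = array[i];
            array[i] = array[j];
            array[j] = temp;
            i++;
            j--;
        }
        return true;
    }

    // Returns the anagram at the given 1-based index in sorted order.
    static String findAnagram(String word, int index) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars); // index 1 is the sorted word itself
        for (int step = 1; step < index; step++) {
            if (!nextPermutation(chars)) {
                throw new IllegalArgumentException("index out of range: " + index);
            }
        }
        return new String(chars);
    }
}
```

Because nextPermutation never produces the same arrangement twice, duplicates need no special handling: findAnagram("AADFS", 32) gives DFASA. The cost grows linearly with the index, so for very large indices a counting approach like the one in the question is likely the better fit.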

Finding shortest possible substring that contains a String

This was a question asked in a recent programming interview.
Given a random string S and another string T with unique elements, find the minimum consecutive sub-string of S such that it contains all the elements in T.
Say,
S='adobecodebanc'
T='abc'
Answer='banc'
I've come up with a solution,
public static String completeSubstring(String T, String S){
String minSub = T;
StringBuilder sb = new StringBuilder();
for (int i = 0; i <T.length()-1; i++) {
for (int j = i + 1; j <= T.length() ; j++) {
String sub = T.substring(i,j);
if(stringContains(sub, S)){
if(sub.length() < minSub.length()) minSub = sub;
}
}
}
return minSub;
}
private static boolean stringContains(String t, String s){
//if(t.length() <= s.length()) return false;
int[] arr = new int[256];
for (int i = 0; i <t.length() ; i++) {
char c = t.charAt(i);
arr[c -'a'] = 1;
}
boolean found = true;
for (int i = 0; i <s.length() ; i++) {
char c = s.charAt(i);
if(arr[c - 'a'] != 1){
found = false;
break;
}else continue;
}
return found;
}
This algorithm has O(n^3) complexity, which naturally isn't great. Can someone suggest a better algorithm?
Here's the O(N) solution.
The important thing to note re: complexity is that each unit of work involves incrementing either start or end, they don't decrease, and the algorithm stops before they both get to the end.
public static String findSubString(String s, String t)
{
//algorithm moves a sliding "current substring" through s
//in this map, we keep track of the number of occurrences of
//each target character there are in the current substring
Map<Character,int[]> counts = new HashMap<>();
for (char c : t.toCharArray())
{
counts.put(c,new int[1]);
}
//how many target characters are missing from the current substring
//current substring is initially empty, so all of them
int missing = counts.size();
//don't waste my time
if (missing<1)
{
return "";
}
//best substring found
int bestStart = -1, bestEnd = -1;
//current substring
int start=0, end=0;
while (end<s.length())
{
//expand the current substring at the end
int[] cnt = counts.get(s.charAt(end++));
if (cnt!=null)
{
if (cnt[0]==0)
{
--missing;
}
cnt[0]+=1;
}
//while the current substring is valid, remove characters
//at the start to see if a shorter substring that ends at the
//same place is also valid
while(start<end && missing<=0)
{
//current substring is valid
if (end-start < bestEnd-bestStart || bestEnd<0)
{
bestStart = start;
bestEnd = end;
}
cnt = counts.get(s.charAt(start++));
if (cnt != null)
{
cnt[0]-=1;
if (cnt[0]==0)
{
++missing;
}
}
}
//current substring is no longer valid. we'll add characters
//at the end until we get another valid one
//note that we don't need to add back any start character that
//we just removed, since we already tried the shortest valid string
//that starts at start-1
}
// bestEnd stays -1 when no substring of s contains every character of t
return (bestEnd >= 0 ? s.substring(bestStart, bestEnd) : null);
}
I know that there already is an adequate O(N) complexity answer, but I tried to figure it out on my own without looking it up, just because it's a fun problem to solve and thought I would share. Here's the O(N) solution that I came up with:
public static String completeSubstring(String S, String T){
int min = S.length()+1, index1 = -1, index2 = -1;
ArrayList<ArrayList<Integer>> index = new ArrayList<ArrayList<Integer>>();
HashSet<Character> targetChars = new HashSet<Character>();
for(char c : T.toCharArray()) targetChars.add(c);
//reduce initial sequence to only target chars and keep track of index
//Note that the resultant string does not allow the same char to be consecutive
StringBuilder filterS = new StringBuilder();
for(int i = 0, s = 0 ; i < S.length() ; i++) {
char c = S.charAt(i);
if(targetChars.contains(c)) {
if(s > 0 && filterS.charAt(s-1) == c) {
index.get(s-1).add(i);
} else {
filterS.append(c);
index.add(new ArrayList<Integer>());
index.get(s).add(i);
s++;
}
}
}
//Not necessary to use regex, loops are fine, but for readability sake
String regex = "([abc])((?!\\1)[abc])((?!\\1)(?!\\2)[abc])";
Matcher m = Pattern.compile(regex).matcher(filterS.toString());
for(int i = 0, start = -1, p1, p2, tempMin, charSize = targetChars.size() ; m.find(i) ; i = start+1) {
start = m.start();
ArrayList<Integer> first = index.get(start);
p1 = first.get(first.size()-1);
p2 = index.get(start+charSize-1).get(0);
tempMin = p2-p1;
if(tempMin < min) {
min = tempMin;
index1 = p1;
index2 = p2;
}
}
return S.substring(index1, index2+1);
}
I'm pretty sure the complexity is O(N); please correct me if I'm wrong.
Alternative implementation of the O(N) algorithm proposed by @MattTimmermans, which uses a Map<Integer, Integer> to count occurrences and a Set<Integer> to store the chars from T that are present in the current substring:
public static String completeSubstring(String s, String t) {
Map<Integer, Integer> occ
= t.chars().boxed().collect(Collectors.toMap(c -> c, c -> 0));
Set<Integer> found = new HashSet<>(); // characters from T found in current match
int start = 0; // current match
int bestStart = Integer.MIN_VALUE, bestEnd = -1;
for (int i = 0; i < s.length(); i++) {
int ci = s.charAt(i); // current char
if (!occ.containsKey(ci)) // not from T
continue;
occ.put(ci, occ.get(ci) + 1); // add occurrence
found.add(ci);
for (int j = start; j < i; j++) { // try to reduce current match
int cj = s.charAt(j);
Integer c = occ.get(cj);
if (c != null) {
if (c == 1) { // cannot reduce anymore
start = j;
break;
} else
occ.put(cj, c - 1); // remove occurrence
}
}
if (found.size() == occ.size() // all chars found
&& (i - start < bestEnd - bestStart)) {
bestStart = start;
bestEnd = i;
}
}
return bestStart < 0 ? null : s.substring(bestStart, bestEnd + 1);
}

Permutations of a string with no duplicates

I read solutions to the problem of generating all the permutations of a string (solution).
Can anyone explain how perm2 is different from perm1? (I feel the only difference is that perm1 tries to put each element in the first position while perm2 in the last one)
// print N! permutation of the characters of the string s (in order)
public static void perm1(String s) { perm1("", s); }
private static void perm1(String prefix, String s) {
int N = s.length();
if (N == 0) System.out.println(prefix);
else {
for (int i = 0; i < N; i++)
perm1(prefix + s.charAt(i), s.substring(0, i) + s.substring(i+1, N));
}
}
// print N! permutation of the elements of array a (not in order)
public static void perm2(String s) {
int N = s.length();
char[] a = new char[N];
for (int i = 0; i < N; i++)
a[i] = s.charAt(i);
perm2(a, N);
}
private static void perm2(char[] a, int n) {
if (n == 1) {
System.out.println(a);
return;
}
for (int i = 0; i < n; i++) {
swap(a, i, n-1);
perm2(a, n-1);
swap(a, i, n-1);
}
}
Also, if some letters are the same in the string, then some permutations will be the same? The only way I can think of to prevent this is to save the result in a hashset so as to keep only one instance of a permutation. Is there a better solution?
I expect that the justification for the second solution is efficiency. It uses character arrays rather than String objects and swaps characters at each step rather than creating a new String via concatenation.
In terms of functionality the only difference between the two solutions is the order in which the results will be output.
You are correct that this does not guarantee unique solutions if there are some duplicate characters in the input. Storing the results and checking uniqueness (by using a Set or directly via contains) would be the easiest way to avoid this if required.
An alternative, in the second solution, would be to check if a character has already been handled. This would avoid the overhead of storing a result set (which could be significant for long strings).
In the second perm2 function:
if (n == 1) {
System.out.println(a);
return;
}
// For position n-1, try each distinct character of a[0..n-1] only once.
for (int i = 0; i < n; i++) {
boolean duplicate = false;
for (int j = 0; !duplicate && j < i; j++)
duplicate = a[i] == a[j];
if (!duplicate) {
swap(a, i, n-1);
perm2(a, n-1);
swap(a, i, n-1);
}
}
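As a self-contained sketch of this "skip characters already handled" idea, the version below collects the results in a list instead of printing them. The class name UniquePermutations and the helper seenBefore are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class UniquePermutations {
    // Returns every distinct permutation of s exactly once.
    static List<String> of(String s) {
        List<String> out = new ArrayList<>();
        permute(s.toCharArray(), s.length(), out);
        return out;
    }

    // Fills positions n-1 down to 0; a[0..n-1] holds the unplaced characters.
    private static void permute(char[] a, int n, List<String> out) {
        if (n <= 1) {
            out.add(new String(a));
            return;
        }
        for (int i = 0; i < n; i++) {
            if (seenBefore(a, i)) {
                continue; // an equal character was already tried for this position
            }
            swap(a, i, n - 1);
            permute(a, n - 1, out);
            swap(a, i, n - 1); // restore the array before trying the next candidate
        }
    }

    // True if a[i] also occurs somewhere in a[0..i-1].
    private static boolean seenBefore(char[] a, int i) {
        for (int j = 0; j < i; j++) {
            if (a[j] == a[i]) {
                return true;
            }
        }
        return false;
    }

    private static void swap(char[] a, int i, int j) {
        char tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

For "aab" this produces three strings (aab, aba and baa, in some order) without ever storing a set of results.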

Implementing a binary insertion sort using binary search in Java

I'm having trouble combining these two algorithms together. I've been asked to modify Binary Search to return the index that an element should be inserted into an array. I've been then asked to implement a Binary Insertion Sort that uses my Binary Search to sort an array of randomly generated ints.
My Binary Search works the way it's supposed to, returning the correct index whenever I test it alone. I wrote out Binary Insertion Sort to get a feel for how it works, and got that to work as well. As soon as I combine the two together, it breaks. I know I'm implementing them incorrectly together, but I'm not sure where my problem lies.
Here's what I've got:
public class Assignment3
{
public static void main(String[] args)
{
int[] binary = { 1, 7, 4, 9, 10, 2, 6, 12, 3, 8, 5 };
ModifiedBinaryInsertionSort(binary);
}
static int ModifiedBinarySearch(int[] theArray, int theElement)
{
int leftIndex = 0;
int rightIndex = theArray.length - 1;
int middleIndex = 0;
while(leftIndex <= rightIndex)
{
middleIndex = (leftIndex + rightIndex) / 2;
if (theElement == theArray[middleIndex])
return middleIndex;
else if (theElement < theArray[middleIndex])
rightIndex = middleIndex - 1;
else
leftIndex = middleIndex + 1;
}
return middleIndex - 1;
}
static void ModifiedBinaryInsertionSort(int[] theArray)
{
int i = 0;
int[] returnArray = new int[theArray.length + 1];
for(i = 0; i < theArray.length; i++)
{
returnArray[ModifiedBinarySearch(theArray, theArray[i])] = theArray[i];
}
for(i = 0; i < theArray.length; i++)
{
System.out.print(returnArray[i] + " ");
}
}
}
The return value I get for this when I run it is 1 0 0 0 0 2 0 0 3 5 12. Any suggestions?
UPDATE: updated ModifiedBinaryInsertionSort
static void ModifiedBinaryInsertionSort(int[] theArray)
{
int index = 0;
int element = 0;
int[] returnArray = new int[theArray.length];
for (int i = 1; i < theArray.length - 1; i++)
{
element = theArray[i];
index = ModifiedBinarySearch(theArray, 0, i, element);
returnArray[i] = element;
while (index >= 0 && theArray[index] > element)
{
theArray[index + 1] = theArray[index];
index = index - 1;
}
returnArray[index + 1] = element;
}
}
Here is my method to sort an array of integers using binary search.
It modifies the array that is passed as argument.
public static void binaryInsertionSort(int[] a) {
if (a.length < 2)
return;
for (int i = 1; i < a.length; i++) {
int lowIndex = 0;
int highIndex = i;
int b = a[i];
//while loop for binary search
while(lowIndex < highIndex) {
int middle = lowIndex + (highIndex - lowIndex)/2; //avoid int overflow
if (b >= a[middle]) {
lowIndex = middle+1;
}
else {
highIndex = middle;
}
}
//replace elements of array
System.arraycopy(a, lowIndex, a, lowIndex+1, i-lowIndex);
a[lowIndex] = b;
}
}
An insertion sort works like this: it creates a new empty array B and, for each element in the unsorted array A, binary searches the section of B that has been built so far (from left to right), shifts every element to the right of the chosen location one position to the right, and inserts the element there. So you are building up an always-sorted array in B until it is the full size of A and contains everything in A.
Two things:
One, the binary search should be able to take an int startOfArray and an int endOfArray, and it will only binary search between those two points. This allows you to make it consider only the part of array B that is actually the sorted array.
Two, before inserting, you must move all elements one to the right before inserting into the gap you've made.
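Those two points could be sketched roughly as follows. The class name BinaryInsertion and the method names insertionIndex and sort are made up. For brevity this sketch sorts the array in place instead of building a separate array B: the part of the array to the left of the current element plays the role of B.

```java
class BinaryInsertion {
    // Binary search restricted to a[from..to), which must already be sorted.
    // Returns the index at which key should be inserted to keep that range
    // sorted; equal keys are placed after existing ones.
    static int insertionIndex(int[] a, int from, int to, int key) {
        int low = from;
        int high = to;
        while (low < high) {
            int mid = low + (high - low) / 2; // avoids int overflow
            if (key >= a[mid]) {
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        return low;
    }

    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            // only a[0..i) is sorted so far, so search only that part
            int pos = insertionIndex(a, 0, i, key);
            // shift a[pos..i) one slot to the right to open a gap at pos
            System.arraycopy(a, pos, a, pos + 1, i - pos);
            a[pos] = key;
        }
    }
}
```

Applied to the array from the question, { 1, 7, 4, 9, 10, 2, 6, 12, 3, 8, 5 }, sort leaves it as 1 2 3 4 5 6 7 8 9 10 12.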
I realize this is old, but the answer to the question is that, perhaps a little unintuitively, middleIndex - 1 will not be your insertion index in all cases.
If you run through a few cases on paper the problem should become apparent.
I have a C# extension method that solves this problem. To apply it to your situation, you would iterate through the existing list, inserting each item into an initially empty list.
public static void BinaryInsert<TItem, TKey>(this IList<TItem> list, TItem item, Func<TItem, TKey> sortfFunc)
where TKey : IComparable
{
if (list == null)
throw new ArgumentNullException("list");
int min = 0;
int max = list.Count - 1;
int index = 0;
TKey insertKey = sortfFunc(item);
while (min <= max)
{
index = (max + min) >> 1;
TItem value = list[index];
TKey compKey = sortfFunc(value);
int result = compKey.CompareTo(insertKey);
if (result == 0)
break;
if (result > 0)
max = index - 1;
else
min = index + 1;
}
if (index <= 0)
index = 0;
else if (index >= list.Count)
index = list.Count;
else
if (sortfFunc(list[index]).CompareTo(insertKey) < 0)
++index;
list.Insert(index, item);
}
Dude, I think you have some serious problem with your code. Unfortunately, you are missing the fruit (logic) of this algorithm. Your divine goal here is to get the index first, insertion is a cake walk, but index needs some sweat. Please don't see this algorithm unless you gave your best and desperate for it. Never give up, you already know the logic, your goal is to find it in you. Please let me know for any mistakes, discrepancies etc. Happy coding!!
public class Insertion {
private int[] a;
int n;
int c;
public Insertion()
{
a = new int[10];
n=0;
}
int find(int key)
{
int lowerbound = 0;
int upperbound = n-1;
while(true)
{
c = (lowerbound + upperbound)/2;
if(n==0)
return 0;
if(lowerbound>=upperbound)
{
if(a[c]<key)
return c++;
else
return c;
}
if(a[c]>key && a[c-1]<key)
return c;
else if (a[c]<key && a[c+1]>key)
return c++;
else
{
if(a[c]>key)
upperbound = c-1;
else
lowerbound = c+1;
}
}
}
void insert(int key)
{
find(key);
for(int k=n;k>c;k--)
{
a[k]=a[k-1];
}
a[c]=key;
n++;
}
void display()
{
for(int i=0;i<10;i++)
{
System.out.println(a[i]);
}
}
public static void main(String[] args)
{
Insertion i=new Insertion();
i.insert(56);
i.insert(1);
i.insert(78);
i.insert(3);
i.insert(4);
i.insert(200);
i.insert(6);
i.insert(7);
i.insert(1000);
i.insert(9);
i.display();
}
}
