Suffix Array Implementation Error - java

I keep getting compiler errors from a suffix array implementation that sorts with Arrays.sort.
The errors are:
a cannot be resolved to a variable
Syntax error on token ",", . expected
Syntax error on token "-", -- expected
a cannot be resolved to a variable
b cannot be resolved to a variable
In the following code:
import java.util.*;
public class SuffixArray {
// sort suffixes of S in O(n*log(n))
public static int[] suffixArray(CharSequence S) {
int n = S.length();
Integer[] order = new Integer[n];
for (int i = 0; i < n; i++)
order[i] = n - 1 - i;
// stable sort of characters
Arrays.sort(order, (a, b) -> Character.compare(S.charAt(a), S.charAt(b)));
int[] sa = new int[n];
int[] classes = new int[n];
for (int i = 0; i < n; i++) {
sa[i] = order[i];
classes[i] = S.charAt(i);
}
// sa[i] - suffix on i'th position after sorting by first len characters
// classes[i] - equivalence class of the i'th suffix after sorting by first len characters
for (int len = 1; len < n; len *= 2) {
int[] c = classes.clone();
for (int i = 0; i < n; i++) {
// condition sa[i - 1] + len < n simulates 0-symbol at the end of the string
// a separate class is created for each suffix followed by simulated 0-symbol
classes[sa[i]] = i > 0 && c[sa[i - 1]] == c[sa[i]] && sa[i - 1] + len < n && c[sa[i - 1] + len / 2] == c[sa[i] + len / 2] ? classes[sa[i - 1]] : i;
}
// Suffixes are already sorted by first len characters
// Now sort suffixes by first len * 2 characters
int[] cnt = new int[n];
for (int i = 0; i < n; i++)
cnt[i] = i;
int[] s = sa.clone();
for (int i = 0; i < n; i++) {
// s[i] - order of suffixes sorted by first len characters
// (s[i] - len) - order of suffixes sorted only by second len characters
int s1 = s[i] - len;
// sort only suffixes of length > len, others are already sorted
if (s1 >= 0)
sa[cnt[classes[s1]]++] = s1;
}
}
return sa;
}
// sort rotations of S in O(n*log(n))
public static int[] rotationArray(CharSequence S) {
int n = S.length();
Integer[] order = new Integer[n];
for (int i = 0; i < n; i++)
order[i] = i;
Arrays.sort(order, (a, b) -> Character.compare(S.charAt(a), S.charAt(b)));
int[] sa = new int[n];
int[] classes = new int[n];
for (int i = 0; i < n; i++) {
sa[i] = order[i];
classes[i] = S.charAt(i);
}
for (int len = 1; len < n; len *= 2) {
int[] c = classes.clone();
for (int i = 0; i < n; i++)
classes[sa[i]] = i > 0 && c[sa[i - 1]] == c[sa[i]] && c[(sa[i - 1] + len / 2) % n] == c[(sa[i] + len / 2) % n] ? classes[sa[i - 1]] : i;
int[] cnt = new int[n];
for (int i = 0; i < n; i++)
cnt[i] = i;
int[] s = sa.clone();
for (int i = 0; i < n; i++) {
int s1 = (s[i] - len + n) % n;
sa[cnt[classes[s1]]++] = s1;
}
}
return sa;
}
// longest common prefixes array in O(n)
public static int[] lcp(int[] sa, CharSequence s) {
int n = sa.length;
int[] rank = new int[n];
for (int i = 0; i < n; i++)
rank[sa[i]] = i;
int[] lcp = new int[n - 1];
for (int i = 0, h = 0; i < n; i++) {
if (rank[i] < n - 1) {
for (int j = sa[rank[i] + 1]; Math.max(i, j) + h < s.length() && s.charAt(i + h) == s.charAt(j + h); ++h)
;
lcp[rank[i]] = h;
if (h > 0)
--h;
}
}
return lcp;
}
// Usage example
public static void main(String[] args) {
String s1 = "abcab";
int[] sa1 = suffixArray(s1);
// print suffixes in lexicographic order
for (int p : sa1)
System.out.println(s1.substring(p));
System.out.println("lcp = " + Arrays.toString(lcp(sa1, s1)));
// random test
Random rnd = new Random(1);
for (int step = 0; step < 100000; step++) {
int n = rnd.nextInt(100) + 1;
StringBuilder s = new StringBuilder();
for (int i = 0; i < n; i++)
s.append((char) ('\1' + rnd.nextInt(10)));
int[] sa = suffixArray(s);
int[] ra = rotationArray(s.toString() + '\0');
int[] lcp = lcp(sa, s);
for (int i = 0; i + 1 < n; i++) {
String a = s.substring(sa[i]);
String b = s.substring(sa[i + 1]);
if (a.compareTo(b) >= 0
|| !a.substring(0, lcp[i]).equals(b.substring(0, lcp[i]))
|| (a + " ").charAt(lcp[i]) == (b + " ").charAt(lcp[i])
|| sa[i] != ra[i + 1])
throw new RuntimeException();
}
}
System.out.println("Test passed");
}
}

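You get these errors on this line, which appears twice in the code (the carets mark the reported tokens):
Arrays.sort(order, (a, b) -> Character.compare(S.charAt(a), S.charAt(b)));
                    ^^    ^                             ^            ^
The most likely reason is that you are not compiling the code as Java 8 or later. Lambda expressions were introduced in Java 8, so an older compiler (or an IDE project whose source level is set below 1.8) cannot parse (a, b) -> ... and reports these syntax errors.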
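If you cannot switch the compiler to Java 8, one workaround is to replace the lambda with an anonymous Comparator. A minimal sketch, assuming a hypothetical helper class SortWithoutLambda that only reproduces the first sorting step of suffixArray. It also swaps Character.compare, which needs Java 7, for a plain char subtraction:

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortWithoutLambda {
    // Orders suffix start positions by their first character only, like the
    // first Arrays.sort call in suffixArray(). The parameter is final so the
    // anonymous class may capture it on pre-Java 8 compilers.
    public static Integer[] orderByFirstChar(final CharSequence S) {
        int n = S.length();
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++)
            order[i] = n - 1 - i;
        Arrays.sort(order, new Comparator<Integer>() {
            @Override
            public int compare(Integer a, Integer b) {
                // char values are 16-bit, so this subtraction cannot overflow
                return S.charAt(a) - S.charAt(b);
            }
        });
        return order;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(orderByFirstChar("abcab")));
    }
}
```

For "abcab" this prints [3, 0, 4, 1, 2]. The same replacement would apply to the second Arrays.sort call in rotationArray.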

Related

Levenshtein distance in java outputting the wrong number

For my university assignment in Java I have been asked to provide "extra analytics functions". I decided to use Levenshtein distance, but the number printed to the console is one less than the actual answer: the distance between "cat" and "hat" should be 1, but it is displayed as 0.
public class Levenshtein {
public Levenshtein(String first, String second) {
char [] s = first.toCharArray();
char [] t = second .toCharArray();
int Subcost = 0;
int[][] array = new int[first.length()][second.length()];
for (int i = 0; i < array[0].length; i++)
{
array[0][i] = i;
}
for (int j = 0; j < array.length; j++)
{
array [j][0]= j;
}
for (int i = 1; i < second.length(); i++)
{
for (int j = 1; j < first.length(); j++)
{
if (s[j] == t [i])
{
Subcost = 0;
}
else
{
Subcost = 1;
}
array [j][i] = Math.min(array [j-1][i] +1,
Math.min(array [j][i-1] +1,
array [j-1][i-1] + Subcost) );
}
}
UI.output("The Levenshtein distance is -> " + array[first.length()-1][second.length()-1]);
}
}
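Apparently you're using the following algorithm:
https://en.wikipedia.org/wiki/Levenshtein_distance#Iterative_with_full_matrix
The indices are off by one. That algorithm needs a matrix of (m + 1) x (n + 1) entries, where row 0 and column 0 stand for the empty prefix. Your matrix is only m x n and your loops start at 1, so s[0] and t[0] are never compared. For "cat" and "hat" the only difference is at index 0, which is why you get 0. Here is a working version: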
public int calculateLevenshteinDistance(String first, String second) {
    char[] s = first.toCharArray();
    char[] t = second.toCharArray();
    int substitutionCost = 0;
    int m = first.length();
    int n = second.length();
    int[][] array = new int[m + 1][n + 1];
    for (int i = 1; i <= m; i++) {
        array[i][0] = i;
    }
    for (int j = 1; j <= n; j++) {
        array[0][j] = j;
    }
    for (int j = 1; j <= n; j++) {
        for (int i = 1; i <= m; i++) {
            if (s[i - 1] == t[j - 1]) {
                substitutionCost = 0;
            } else {
                substitutionCost = 1;
            }
            int deletion = array[i - 1][j] + 1;
            int insertion = array[i][j - 1] + 1;
            int substitution = array[i - 1][j - 1] + substitutionCost;
            int cost = Math.min(
                    deletion,
                    Math.min(
                            insertion,
                            substitution));
            array[i][j] = cost;
        }
    }
    return array[m][n];
}
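To see why the matrix size matters, here is a small sketch (class and method names are made up for illustration) that puts the question's m x n indexing next to the (m + 1) x (n + 1) version. With the smaller matrix and loops starting at 1, the characters at index 0 are never compared:

```java
public class LevenshteinCheck {
    // Reproduces the question's sizing: an m x n matrix with loops starting at 1.
    // Only meant for non-empty inputs.
    public static int truncated(String a, String b) {
        int m = a.length(), n = b.length();
        int[][] d = new int[m][n];
        for (int j = 0; j < n; j++) d[0][j] = j;
        for (int i = 0; i < m; i++) d[i][0] = i;
        for (int j = 1; j < n; j++) {
            for (int i = 1; i < m; i++) {
                int cost = a.charAt(i) == b.charAt(j) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[m - 1][n - 1];
    }

    // Full-matrix version: d[i][j] is the distance between the first i
    // characters of a and the first j characters of b.
    public static int distance(String a, String b) {
        int m = a.length(), n = b.length();
        int[][] d = new int[m + 1][n + 1];
        for (int i = 0; i <= m; i++) d[i][0] = i; // delete i characters
        for (int j = 0; j <= n; j++) d[0][j] = j; // insert j characters
        for (int j = 1; j <= n; j++) {
            for (int i = 1; i <= m; i++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[m][n];
    }

    public static void main(String[] args) {
        System.out.println(truncated("cat", "hat"));       // 0, the wrong answer from the question
        System.out.println(distance("cat", "hat"));        // 1
        System.out.println(distance("kitten", "sitting")); // 3
    }
}
```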

How to use LSD String Sort without having to enter a fixed length?

Sorry for my bad English. I'm studying the LSD string sort algorithm and I have a question about it. Here is my code. I want it to work when the width w is not fixed, i.e. for strings of different lengths, for example:
String[] a = {"38A", "3TW723", "2IYEA938", "3CI34780720"};
public static void sort(String[] a, int w) { // Sort a[] on leading W characters.
int R = 256;
int N = a.length;
//For each of the character from right to left
for (int d = w - 1; d >= 0; d--) {
//1. count the frequencies
int[] count = new int[R + 1];
for (int i = 0; i < N; i++) {
count[a[i].charAt(d) + 1]++;
}
//2. Transform counts to indices
for (int r = 0; r < R; r++) {
count[r + 1] += count[r];
}
//3. Distribute
String aux[] = new String[N];
for (int i = 0; i < N; i++) {
aux[count[a[i].charAt(d)]] = a[i];
count[a[i].charAt(d)]++;
}
//4. Copyback
System.arraycopy(aux, 0, a, 0, N);
}
}
To make LSD string sort work for variable-length strings, your code needs a few changes. First find the length w of the longest string in a. Then, whenever d >= a[i].length(), treat the missing character as 0, which effectively pads every string with trailing zeros up to the same length. Here is my code:
// Develop an implementation of LSD string sort
// that works for variable-length strings.
public class LSDForVariableLengthStrings {
    // do not instantiate
    private LSDForVariableLengthStrings() { }

    // find longest length string in string[] a.
    public static int findLongestLength(String[] a) {
        int longest = 0;
        for (int i = 0; i < a.length; ++i) {
            if (a[i].length() > longest) {
                longest = a[i].length();
            }
        }
        return longest;
    }

    // if d >= 0 && d < a[i].length(), return a[i].charAt(d);
    // else , return 0, which means least value to sort.
    public static int findCharAtInString(int i, int d, String[] a) {
        if (d < 0 || d >= a[i].length()) {
            return 0;
        }
        return a[i].charAt(d);
    }

    // Rearranges the array of variable-length strings.
    public static void sort(String[] a) {
        int n = a.length;
        int R = 256; // extended ASCII alphabet size.
        String[] aux = new String[n];
        int w = findLongestLength(a); // w is the length of longest string in a.
        for (int d = w - 1; d >= 0; d--) {
            // sort by key-indexed counting on dth character
            // compute frequency counts
            int[] count = new int[R + 1];
            for (int i = 0; i < n; ++i) {
                int c = findCharAtInString(i, d, a);
                count[c + 1]++;
            }
            // compute cumulates
            for (int r = 0; r < R; ++r) {
                count[r + 1] += count[r];
            }
            // move data
            for (int i = 0; i < n; ++i) {
                int c = findCharAtInString(i, d, a);
                aux[count[c]++] = a[i];
            }
            // copy back
            for (int i = 0; i < n; ++i) {
                a[i] = aux[i];
            }
        }
    }

    public static void main(String[] args) {
        String[] a = {"38A", "3TW723", "2IYEA938", "3CI34780720"};
        int n = a.length;
        // sort the strings
        sort(a);
        // prints results
        for (int i = 0; i < n; ++i) {
            System.out.println(a[i]);
        }
    }
}
LSD string sort, in its basic form, handles only fixed-length strings. Use MSD string sort instead.
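For reference, a minimal sketch of what MSD string sort could look like, following the usual textbook formulation without the small-subarray cutoff. The class name MsdSketch is made up, and the code assumes every character is below 256 (extended ASCII); anything larger would index outside the count array:

```java
import java.util.Arrays;

public class MsdSketch {
    private static final int R = 256; // assumed alphabet size (extended ASCII)

    // Character of s at position d, or -1 when s has no such position.
    private static int charAt(String s, int d) {
        return d < s.length() ? s.charAt(d) : -1;
    }

    public static void sort(String[] a) {
        sort(a, new String[a.length], 0, a.length - 1, 0);
    }

    // Sorts a[lo..hi], whose strings all share the same first d characters.
    private static void sort(String[] a, String[] aux, int lo, int hi, int d) {
        if (hi <= lo) return;
        // frequency counts, shifted by 2 so that "end of string" (-1) fits
        int[] count = new int[R + 2];
        for (int i = lo; i <= hi; i++)
            count[charAt(a[i], d) + 2]++;
        // turn counts into bucket start offsets
        for (int r = 0; r < R + 1; r++)
            count[r + 1] += count[r];
        // distribute, keeping equal keys in their current order
        for (int i = lo; i <= hi; i++)
            aux[count[charAt(a[i], d) + 1]++] = a[i];
        for (int i = lo; i <= hi; i++)
            a[i] = aux[i - lo];
        // recurse into each character bucket; the "end of string" bucket is done
        for (int r = 0; r < R; r++)
            sort(a, aux, lo + count[r], lo + count[r + 1] - 1, d + 1);
    }

    public static void main(String[] args) {
        String[] a = {"38A", "3TW723", "2IYEA938", "3CI34780720"};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

With the question's input this prints [2IYEA938, 38A, 3CI34780720, 3TW723]. Shorter strings sort before longer strings that share their prefix, because the end of a string is treated as smaller than any character.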

Why does my BigInteger.multiply() throw a NullPointerException?

I tried to store the factorials from 1 to 1000 in a BigInteger array and to calculate the sum of the digits of each one. Why does this code throw java.lang.NullPointerException? I think everything was initialized correctly.
class Main {
public static void main(String[] args) {
BigInteger[] b = new BigInteger[1010];
int[] ara = new int[1010];
BigInteger c;
b[0] = BigInteger.ONE;
b[1] = BigInteger.ONE;
ara[0] = ara[1] = 1;
String s;
int l, sum;
for (int i = 2; i <= 1001; i++) {
c = b[i - 1];
b[i] = b[i].multiply(c);
s = b[i].toString();
l = s.length();
sum = 0;
for (int j = 0; j < l; j++) {
sum += Character.getNumericValue(s.charAt(j));
}
ara[i] = sum;
}
Problem:
b[i] = b[i].multiply(c);
Look at which elements of b you initialised:
b[0] = BigInteger.ONE;
b[1] = BigInteger.ONE;
And now look at the for loop:
for (int i = 2; i <= 1001; i++) {
    c = b[i - 1];
    b[i] = b[i].multiply(c);
Only indexes 0 and 1 hold a BigInteger. new BigInteger[1010] creates an array of 1010 null references, so b[2] is still null when the loop calls b[2].multiply(c), and that call throws the NullPointerException. Every b[i] has to be assigned before it is used.
Solution:
Compute b[i] from the previous element instead of from b[i] itself. Change the start of your for loop as below and keep everything else the same:
for (int i = 2; i <= 1001; i++) {
    c = b[i - 1];
    b[i] = c.multiply(BigInteger.valueOf(i));
    s = b[i].toString();
The factorial of n is the product n * (n - 1) * ... * 1. Your algorithm doesn't appear to compute that: it never multiplies by i, so at best it generates ones. I think you wanted something like the following, where b[i] holds (i + 1)!:
int len = 1010;
BigInteger[] b = new BigInteger[len];
int[] ara = new int[len];
for (int i = 0; i < len; i++) {
    // calculate factorial.
    b[i] = BigInteger.valueOf(i + 1);
    for (int j = i; j > 1; j--) {
        b[i] = b[i].multiply(BigInteger.valueOf(j));
    }
    // now sum digits.
    for (char ch : b[i].toString().toCharArray()) {
        ara[i] += Character.getNumericValue(ch);
    }
}
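Since i! = (i - 1)! * i, each factorial can also be built from the previous one with a single multiplication instead of an inner loop. A minimal sketch, assuming a hypothetical class FactorialDigitSums in which sums[i] is the digit sum of i!:

```java
import java.math.BigInteger;

public class FactorialDigitSums {
    // Returns an array where sums[i] is the digit sum of i!, for 0 <= i <= max.
    public static int[] digitSums(int max) {
        BigInteger[] fact = new BigInteger[max + 1];
        int[] sums = new int[max + 1];
        fact[0] = BigInteger.ONE; // 0! = 1
        sums[0] = 1;
        for (int i = 1; i <= max; i++) {
            // every element is assigned before it is read, so no null is touched
            fact[i] = fact[i - 1].multiply(BigInteger.valueOf(i));
            for (char ch : fact[i].toString().toCharArray())
                sums[i] += ch - '0';
        }
        return sums;
    }

    public static void main(String[] args) {
        int[] sums = digitSums(1000);
        System.out.println(sums[5]);   // 5! = 120, digit sum 3
        System.out.println(sums[10]);  // 10! = 3628800, digit sum 27
        System.out.println(sums[100]); // 648
    }
}
```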

Suffix array O(NlogN) implementation

I'm looking into the O(NlogN) suffix array implementation found at this link: https://sites.google.com/site/indy256/algo/suffix_array
I understand the core concepts, but I have trouble understanding the implementation in its entirety.
public static int[] suffixArray(CharSequence S) {
int n = S.length();
Integer[] order = new Integer[n];
for (int i = 0; i < n; i++)
order[i] = n - 1 - i;
// stable sort of characters
Arrays.sort(order, (a, b) -> Character.compare(S.charAt(a), S.charAt(b)));
int[] sa = new int[n];
int[] classes = new int[n];
for (int i = 0; i < n; i++) {
sa[i] = order[i];
classes[i] = S.charAt(i);
}
// sa[i] - suffix on i'th position after sorting by first len characters
// classes[i] - equivalence class of the i'th suffix after sorting by first len characters
for (int len = 1; len < n; len *= 2) {
int[] c = classes.clone();
for (int i = 0; i < n; i++) {
// condition sa[i - 1] + len < n simulates 0-symbol at the end of the string
// a separate class is created for each suffix followed by simulated 0-symbol
classes[sa[i]] = i > 0 && c[sa[i - 1]] == c[sa[i]] && sa[i - 1] + len < n && c[sa[i - 1] + len / 2] == c[sa[i] + len / 2] ? classes[sa[i - 1]] : i;
}
// Suffixes are already sorted by first len characters
// Now sort suffixes by first len * 2 characters
int[] cnt = new int[n];
for (int i = 0; i < n; i++)
cnt[i] = i;
int[] s = sa.clone();
for (int i = 0; i < n; i++) {
// s[i] - order of suffixes sorted by first len characters
// (s[i] - len) - order of suffixes sorted only by second len characters
int s1 = s[i] - len;
// sort only suffixes of length > len, others are already sorted
if (s1 >= 0)
sa[cnt[classes[s1]]++] = s1;
}
}
return sa;
}
In particular, I'm wondering what the cnt[] array is used for and where it is useful.
Any pointers would be helpful.
Thanks.
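One possible reading of the code (an interpretation, not something the linked page states): classes[x] is set to the index in sa at which the block of suffixes equal to x on their first len characters begins. With cnt[i] = i, cnt[c] then acts as the "next free slot" pointer of the block starting at c, as in counting sort. A small sketch of that placement step in isolation, with made-up names and hand-picked data:

```java
import java.util.Arrays;

public class BucketPointerDemo {
    // Places items into an output array. bucketStart[x] is the output index at
    // which the bucket of item x begins (the role classes[] seems to play), and
    // itemsInSecondaryOrder lists the items already sorted by their secondary key.
    public static int[] place(int[] itemsInSecondaryOrder, int[] bucketStart) {
        int n = bucketStart.length;
        int[] next = new int[n];
        for (int i = 0; i < n; i++)
            next[i] = i; // same initialisation as cnt[i] = i
        int[] out = new int[n];
        for (int x : itemsInSecondaryOrder)
            out[next[bucketStart[x]]++] = x; // take the bucket's next free slot
        return out;
    }

    public static void main(String[] args) {
        // Items 1 and 3 share the bucket starting at slot 0,
        // items 0, 2 and 4 share the bucket starting at slot 2.
        int[] bucketStart = {2, 0, 2, 0, 2};
        int[] secondaryOrder = {4, 3, 2, 1, 0};
        System.out.println(Arrays.toString(place(secondaryOrder, bucketStart)));
    }
}
```

This prints [3, 1, 4, 2, 0]: items end up grouped by bucket, and inside each bucket they keep the secondary order. In suffixArray the bucket corresponds to the first len characters and the secondary order to the next len characters, which would give an ordering by the first 2 * len characters.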

How to generate combinations obtained by permuting 2 positions in Java

I have this problem: from a given permutation, I need to generate not all permutations but only those obtained by swapping two positions, without repetition. This set is called the region of the given permutation. For example, given 1234 I want to generate:
2134
3214
4231
1324
1432
1243
The size of the region of any given permutation is n(n-1)/2; in this case that is 6 combinations.
Now, I have this program, but it does more than I want: it generates all 24 possible permutations:
public class PossibleCombinations {
public static void main(String[] args) {
Scanner s=new Scanner(System.in);
System.out.println("Enter a number");
int n=s.nextInt();
int[] currentab = new int[n];
// fill in the table 1 TO N
for (int i = 1; i <= n; i++) {
currentab[i - 1] = i;
}
int total = 0;
for (;;) {
total++;
boolean[] used = new boolean[n + 1];
Arrays.fill(used, true);
for (int i = 0; i < n; i++) {
System.out.print(currentab[i] + " ");
}
System.out.println();
used[currentab[n - 1]] = false;
int pos = -1;
for (int i = n - 2; i >= 0; i--) {
used[currentab[i]] = false;
if (currentab[i] < currentab[i + 1]) {
pos = i;
break;
}
}
if (pos == -1) {
break;
}
for (int i = currentab[pos] + 1; i <= n; i++) {
if (!used[i]) {
currentab[pos] = i;
used[i] = true;
break;
}
}
for (int i = 1; i <= n; i++) {
if (!used[i]) {
currentab[++pos] = i;
}
}
}
System.out.println(total);
}
}
The question is how I can change this program so that it generates only the wanted combinations.
How about something simple like:
public static void printSwapTwo(int n) {
    int count = 0;
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < n - 1; i++)
        for (int j = i + 1; j < n; j++) {
            // gives all the pairs of i and j without repeats
            sb.setLength(0);
            for (int k = 1; k <= n; k++) sb.append(k);
            char tmp = sb.charAt(i);
            sb.setCharAt(i, sb.charAt(j));
            sb.setCharAt(j, tmp);
            System.out.println(sb);
            count++;
        }
    System.out.println("total=" + count + " and should be " + n * (n - 1) / 2);
}
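Note that the StringBuilder version relies on every element being a single character: for n >= 10, append(k) writes two characters, so character positions no longer line up with elements. A sketch of the same idea on an int array, which also accepts any starting permutation (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SwapNeighbourhood {
    // Returns every permutation obtained from perm by swapping exactly two
    // positions; there are n * (n - 1) / 2 of them for an array of length n.
    public static List<int[]> swapTwo(int[] perm) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < perm.length - 1; i++) {
            for (int j = i + 1; j < perm.length; j++) {
                int[] copy = perm.clone(); // leave the input untouched
                copy[i] = perm[j];
                copy[j] = perm[i];
                result.add(copy);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] p : swapTwo(new int[] {1, 2, 3, 4}))
            System.out.println(Arrays.toString(p));
    }
}
```

For {1, 2, 3, 4} this prints the six arrays 2134, 3214, 4231, 1324, 1432 and 1243, in that order.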
