Get the list of index in subsequence matching

I have two sequences, for instance s=aaba and ss=aa, and I want every way ss occurs in s as a subsequence, expressed as lists of indices.
In this example:
[0,1], [0,3] and [1,3]
My code is below. It works fine, except for a very long s that contains ss many times. In that case I get
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
(I already run Java with -Xmx set to the maximum I can…)
public static ArrayList<ArrayList<Integer>> getListIndex(String[] s, String[] ss, int is, int iss) {
ArrayList<ArrayList<Integer>> listOfListIndex = new ArrayList<ArrayList<Integer>>();
ArrayList<ArrayList<Integer>> listRec = new ArrayList<ArrayList<Integer>>();
ArrayList<Integer> listI = new ArrayList<Integer>();
if (iss<0||is<iss){
return listOfListIndex;
}
if (ss[iss].compareTo(s[is])==0){
//ss[iss] matches, search ss[0..iss-1] in s[0..is-1]
listRec = getListIndex(s,ss,is-1,iss-1);
//empty lists (iss=0 for instance)
if(listRec.size()==0){
listI = new ArrayList<Integer>();
listI.add(is);
listOfListIndex.add(listI);
}
else{
//adding to what we have already found
for (int i=0; i<listRec.size();i++){
listI = listRec.get(i);
listI.add(is);
listOfListIndex.add(listI);
}
}
}
//In all cases
//searching ss[0..iss] in s[0..is-1]
listRec = getListIndex(s,ss,is-1,iss);
for (int i=0; i<listRec.size();i++){
listI = listRec.get(i);
listOfListIndex.add(listI);
}
return listOfListIndex;
}
Is there any way to do this more efficiently?

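I doubt the recursion is the problem (think of what the maximum recursion depth is). The algorithm can be implemented efficiently by collecting, for each character of the pattern, the indices at which it occurs in the text into a TreeSet, and then simply taking the .tailSet when you need to "advance" in the string. Note that in the code below the names are swapped relative to the question: s is the pattern and ss is the text.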
import java.util.*;
public class Test {
public static Set<List<Integer>> solutions(List<TreeSet<Integer>> is, int n) {
TreeSet<Integer> ts = is.get(0);
Set<List<Integer>> sol = new HashSet<List<Integer>>();
for (int i : ts.tailSet(n+1)) {
if (is.size() == 1) {
List<Integer> l = new ArrayList<Integer>();
l.add(i);
sol.add(l);
} else
for (List<Integer> tail : solutions(is.subList(1, is.size()), i)) {
List<Integer> l = new ArrayList<Integer>();
l.add(i);
l.addAll(tail);
sol.add(l);
}
}
return sol;
}
public static void main(String[] args) {
String ss = "aaba";
String s = "aa";
List<TreeSet<Integer>> is = new ArrayList<TreeSet<Integer>>();
// Compute all indices of each character.
for (int i = 0; i < s.length(); i++) {
TreeSet<Integer> indecies = new TreeSet<Integer>();
char c = s.charAt(i);
for (int j = 0; j < ss.length(); j++) {
if (ss.charAt(j) == c)
indecies.add(j);
}
is.add(indecies);
}
System.out.println(solutions(is, -1));
}
}
Output:
[[0, 1], [1, 3], [0, 3]]

ArrayList<Integer> is quite memory-inefficient due to the overhead of the wrapper class. Using TIntArrayList from GNU Trove will probably cut down your memory usage by a factor of 3 (or even more if you're running on a 64-bit JVM).
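If adding a library is not an option, a similar saving may be possible with plain int[] buffers, handing each match to a callback instead of storing all of them. The following is only a sketch under that assumption: the class SubsequenceMatches and the method forEachMatch are hypothetical names, and it takes String arguments rather than the String[] used in the question.

```java
import java.util.Arrays;
import java.util.function.Consumer;

final class SubsequenceMatches {
    // Calls sink once per match; each int[] holds the indices in s, in increasing order.
    static void forEachMatch(String s, String ss, Consumer<int[]> sink) {
        if (ss.isEmpty()) {
            return;
        }
        search(s, ss, 0, 0, new int[ss.length()], sink);
    }

    private static void search(String s, String ss, int from, int depth,
                               int[] current, Consumer<int[]> sink) {
        if (depth == ss.length()) {
            sink.accept(current.clone()); // hand out a copy, the buffer is reused
            return;
        }
        // Leave enough room in s for the remaining characters of ss.
        int last = s.length() - (ss.length() - depth);
        for (int i = from; i <= last; i++) {
            if (s.charAt(i) == ss.charAt(depth)) {
                current[depth] = i;
                search(s, ss, i + 1, depth + 1, current, sink);
            }
        }
    }

    public static void main(String[] args) {
        forEachMatch("aaba", "aa", m -> System.out.println(Arrays.toString(m)));
    }
}
```

The recursion depth is bounded by the length of ss, and only one int[] of that length is live per match unless the callback decides to keep it.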

Well, the basic problem is that your algorithm is recursive. Java doesn't do tail-call optimization, so every recursive call just adds to the stack until you overflow.
What you want to do is restructure your algorithm to be iterative so you aren't adding to the stack. Think about putting a loop (with a termination test) as the outermost element of your method instead.
Another way to look at this problem is to break it into two steps:
1. Capture the positions of all occurrences of the given character ('a' in your example) into a single set.
2. All you want here is the complete set of combinations of those positions. Remember that the number of combinations of r things chosen from n different things is:
C(n,r) = n!/[r!(n-r)!]
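As a rough illustration of the two steps, here is a sketch that only applies when every character of ss is the same, as in the aaba/aa example. The class and method names are hypothetical. It computes C(n,r) multiplicatively rather than with factorials, since factorials overflow very quickly.

```java
import java.util.ArrayList;
import java.util.List;

final class RepeatedCharCombinations {
    // Step 1: positions of every occurrence of c in s.
    static List<Integer> positions(String s, char c) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                result.add(i);
            }
        }
        return result;
    }

    // Step 2: C(n, r) without factorials; the running value is always an exact
    // binomial coefficient. It can still overflow long for very large inputs.
    static long choose(int n, int r) {
        if (r < 0 || r > n) {
            return 0;
        }
        r = Math.min(r, n - r);
        long result = 1;
        for (int i = 1; i <= r; i++) {
            result = result * (n - r + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> pos = positions("aaba", 'a');
        System.out.println(pos + " -> " + choose(pos.size(), 2) + " matches");
    }
}
```

For s=aaba and ss=aa this gives the positions [0, 1, 3] and C(3,2) = 3, which agrees with the three matches listed in the question. Knowing the count up front also shows how many index lists the enumeration would have to hold in memory.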

Related

Create a list of combinations recursively

I am trying to implement a combination method using recursion. I have already done it using a for loop but want to approach it via recursion.
The method gets two inputs and creates all possible combinations. It is supposed to store the combination in an instance variable that I have called "combination". I tried different codes but they don't work properly. I think recursive back-tracking is the best way to approach this.
For example, object.pe1.combination(4,3) would create something like this:
[image: combination list]
// Instance variable needed for this problem
ArrayList<Integer[]> combination;
private int size;
// To calculate all the possible combinations
private int factorial(int x){
if (x == 0) {
return 1;
}
else {
return x * factorial(x - 1);
}
}
void combination(int n, int r) {
// formula for calculating the combination of r items selected among n: n! / (r! * (n - r)!)
int noc = factorial (n) / (factorial (r) * factorial (n - r)); // number of combinations
this.combination = new ArrayList<Integer[]>(noc); // 2D array. Each slot stores a combination
if (noc == 0) {
}
else {
this.combination = new ArrayList<Integer[]>(noc);
int[] arr = new int[n];
int[] temparr = new int[r];
arr = createCombination(temparr, 0, r);
}
}
private int[] createCombination(int[] temparr, int index, int r) {
// this is where I am stuck
temparr[0] = index;
if (temparr[r] == 0) {
temparr = new int[r - 1];
temparr = createCombination(temparr, index + 1, r - 1);
}
else {
return temparr;
}
}

A recursive implementation of any algorithm is comprised of two parts:
base case - a condition that terminates a branch of recursive calls and represents an edge case for which result is known in advance;
recursive case - a part where the logic resides and the recursive calls are made.
For this task, the base case is the situation where the size of the combination equals the target size (denoted as r in your code; in the code below I named it targetSize).
Explanation of the recursive logic:
Every method call tracks its own combination;
Every combination, unless it has reached targetSize, is used as a blueprint for other combinations;
Each item from the source of data can be used only once, hence when it is added to a combination it must be removed from the source.
The type ArrayList<Integer[]> which you are using to store the combination isn't a good choice. Arrays and generics do not play well together. List<List<Integer>> will be more appropriate for this purpose.
Also in my code List is used as a source of data instead of an array, which isn't a complicated conversion and can be achieved easily.
Pay attention to the comments in the code.
private List<List<Integer>> createCombination(List<Integer> source, List<Integer> comb, int targetSize) {
if (comb.size() == targetSize) { // base condition of the recursion
List<List<Integer>> result = new ArrayList<>();
result.add(comb);
return result;
}
List<List<Integer>> result = new ArrayList<>();
Iterator<Integer> iterator = source.iterator();
while (iterator.hasNext()) {
// taking an element from a source
Integer item = iterator.next();
iterator.remove(); // in order not to get repeated the element has to be removed
// creating a new combination using existing as a base
List<Integer> newComb = new ArrayList<>(comb);
newComb.add(item); // adding the element that was removed from the source
result.addAll(createCombination(new ArrayList<>(source), newComb, targetSize)); // adding all the combinations generated
}
return result;
}
For the input
createCombination(new ArrayList<>(List.of(1, 2, 3)), new ArrayList<>(), 2);
It'll produce the output
[[1, 2], [1, 3], [2, 3]]
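For comparison, a backtracking variant that avoids copying the source list on every call could look like the sketch below. It is not the code above, just an alternative under stated assumptions: the items are taken to be the integers 1..n (the question does not say whether its numbering starts at 0 or 1), and IndexCombinations, combinations and build are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexCombinations {
    // All r-element combinations of the values 1..n, in lexicographic order.
    static List<List<Integer>> combinations(int n, int r) {
        List<List<Integer>> result = new ArrayList<>();
        if (r < 0 || r > n) {
            return result;
        }
        build(1, n, r, new ArrayList<>(), result);
        return result;
    }

    private static void build(int start, int n, int r,
                              List<Integer> current, List<List<Integer>> result) {
        if (current.size() == r) {            // base case: combination is complete
            result.add(new ArrayList<>(current));
            return;
        }
        for (int value = start; value <= n; value++) {
            current.add(value);               // choose
            build(value + 1, n, r, current, result);
            current.remove(current.size() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        System.out.println(combinations(4, 3));
    }
}
```

Passing the next allowed start value plays the role of removing used items from the source, so only one working list is shared by the whole recursion.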

Sort multiple arrays simultaneously "in place"

I have the following 3 arrays:
int[] indexes = new int[]{0,2,8,5};
String[] sources = new String[]{"how", "are", "today", "you"};
String[] targets = new String[]{"I", "am", "thanks", "fine"};
I want to sort the three arrays based on the indexes:
indexes -> {0,2,5,8}
sources -> {"how", "are", "you", "today"}
targets -> {"I", "am", "fine", "thanks"}
I can create a new class myClass with all three elements:
class myClass {
int x;
String source;
String target;
}
Reassign everything to myClass, then sort myClass using x. However, this would require additional space. I am wondering if it is possible to sort in place? Thanks!

Three ways of doing this
1. Using Comparator (requires Java 8+)
import java.io.*;
import java.util.*;
class Test {
public static String[] sortWithIndex (String[] strArr, int[] intIndex )
{
if (! isSorted(intIndex)){
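// Snapshot the original order: the comparator must not look up positions
// in the list that is being rearranged by the sort.
final List<String> original = new ArrayList<>(Arrays.asList(strArr));
final List<String> stringList = Arrays.asList(strArr);
Collections.sort(stringList, Comparator.comparing(s -> intIndex[original.indexOf(s)]));
return stringList.toArray(new String[stringList.size()]);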
}
else
return strArr;
}
public static boolean isSorted(int[] arr) {
for (int i = 0; i < arr.length - 1; i++) {
if (arr[i + 1] < arr[i]) {
return false;
};
}
return true;
}
// Driver program to test function.
public static void main(String args[])
{
int[] indexes = new int[]{0,2,8,5};
String[] sources = new String[]{"how", "are", "today", "you"};
String[] targets = new String[]{"I", "am", "thanks", "fine"};
String[] sortedSources = sortWithIndex(sources,indexes);
String[] sortedTargets = sortWithIndex(targets,indexes);
Arrays.sort(indexes);
System.out.println("Sorted Sources " + Arrays.toString(sortedSources) + " Sorted Targets " + Arrays.toString(sortedTargets) + " Sorted Indexes " + Arrays.toString(indexes));
}
}
Output
Sorted Sources [how, are, you, today] Sorted Targets [I, am, fine, thanks] Sorted Indexes [0, 2, 5, 8]
2. Using Lambda (requires Java 8+)
import java.io.*;
import java.util.*;
public class Test {
public static String[] sortWithIndex (String[] strArr, int[] intIndex )
{
if (! isSorted(intIndex)) {
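// Snapshot the original order: the comparator must not look up positions
// in the list that is being rearranged by the sort.
final List<String> original = new ArrayList<>(Arrays.asList(strArr));
final List<String> stringList = Arrays.asList(strArr);
Collections.sort(stringList, (left, right) -> Integer.compare(intIndex[original.indexOf(left)], intIndex[original.indexOf(right)]));
return stringList.toArray(new String[stringList.size()]);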
}
else
return strArr;
}
public static boolean isSorted(int[] arr) {
for (int i = 0; i < arr.length - 1; i++) {
if (arr[i + 1] < arr[i]) {
return false;
};
}
return true;
}
// Driver program to test function.
public static void main(String args[])
{
int[] indexes = new int[]{0,2,8,5};
String[] sources = new String[]{"how", "are", "today", "you"};
String[] targets = new String[]{"I", "am", "thanks", "fine"};
String[] sortedSources = sortWithIndex(sources,indexes);
String[] sortedTargets = sortWithIndex(targets,indexes);
Arrays.sort(indexes);
System.out.println("Sorted Sources " + Arrays.toString(sortedSources) + " Sorted Targets " + Arrays.toString(sortedTargets) + " Sorted Indexes " + Arrays.toString(indexes));
}
}
3. Using Lists and Maps and avoiding multiple calls (as in second solution above) to the method to sort individual arrays
import java.util.*;
import java.lang.*;
import java.io.*;
public class Test{
public static <T extends Comparable<T>> void sortWithIndex( final List<T> key, List<?>... lists){
// input validation
if(key == null || lists == null)
throw new NullPointerException("Key cannot be null.");
for(List<?> list : lists)
if(list.size() != key.size())
throw new IllegalArgumentException("All lists should be of the same size");
// Lists are size 0 or 1, nothing to sort
if(key.size() < 2)
return;
// Create a List of indices
List<Integer> indices = new ArrayList<Integer>();
for(int i = 0; i < key.size(); i++)
indices.add(i);
// Sort the indices list based on the key
Collections.sort(indices, new Comparator<Integer>(){
@Override public int compare(Integer i, Integer j) {
return key.get(i).compareTo(key.get(j));
}
});
Map<Integer, Integer> swapMap = new HashMap<Integer, Integer>(indices.size());
List<Integer> swapFrom = new ArrayList<Integer>(indices.size()),
swapTo = new ArrayList<Integer>(indices.size());
// create a mapping that allows sorting of the List by N swaps.
for(int i = 0; i < key.size(); i++){
int k = indices.get(i);
while(i != k && swapMap.containsKey(k))
k = swapMap.get(k);
swapFrom.add(i);
swapTo.add(k);
swapMap.put(i, k);
}
// use the swap order to sort each list by swapping elements
for(List<?> list : lists)
for(int i = 0; i < list.size(); i++)
Collections.swap(list, swapFrom.get(i), swapTo.get(i));
}
public static void main (String[] args) throws java.lang.Exception{
List<Integer> index = Arrays.asList(0,2,8,5);
List<String> sources = Arrays.asList("how", "are", "today", "you");
// List Types do not need to be the same
List<String> targets = Arrays.asList("I", "am", "thanks", "fine");
sortWithIndex(index, index, sources, targets);
System.out.println("Sorted Sources " + sources + " Sorted Targets " + targets + " Sorted Indexes " + index);
}
}
Output
Sorted Sources [how, are, you, today] Sorted Targets [I, am, fine, thanks] Sorted Indexes [0, 2, 5, 8]

It is possible, although it is not as easy as it looks. There are two options:
1. Write your own sort algorithm where the swap function for two elements also swaps the elements in the other arrays.
AFAIK there is no way to extend the standard Arrays.sort so that it swaps additional arrays.
2. Use a helper array with the sort order.
First of all you need to initialize the helper array with the range {0, 1 ... indexes.length-1}.
Now you sort the helper array using a Comparator that compares indexes[a] with indexes[b] rather than a with b.
The result is a helper array where each element holds the index of the element of the source array where its content should come from, i.e. the sort sequence.
The last step is the trickiest one. You need to swap the elements in your source arrays according to the sort sequence above.
To operate strictly in place, set your current index cur to 0.
Then take the cur-th element from your helper array. Let's call it from. This is the index of the element that should be placed at index cur after completion.
Now you need to make space at index cur to place the elements from index from there. Copy the elements at index cur to a temporary location tmp.
Now move the elements from index from to index cur. Index from is now free to be overwritten.
Set the element in the helper array at index cur to some invalid value, e.g. -1.
Set your current index cur to from and proceed from above until you reach an element in the helper array which already has an invalid index value, i.e. your starting point. In this case store the content of tmp at the last index. You have now found a closed loop of rotated indices.
Unfortunately there may exist an arbitrary number of such loops, each of arbitrary size. So you need to seek in the helper array for the next non-invalid index value and again continue from above until all elements of the helper array are processed.
Since you will end at the starting point after each loop, it is sufficient to increment cur until you find a non-invalid entry. So the algorithm is still O(n) while processing the helper array.
All entries before cur are necessarily invalid after a loop has completed.
If cur increments beyond the size of the helper array you are done.
There is an easier variation of option 2 when you are allowed to create new target arrays.
In this case you simply allocate the new target arrays and fill their content according to the indices in your helper array.
The drawback is that the allocations might be quite expensive if the arrays are really large. And of course, it is no longer in place.
Some further notes.
Normally the custom sort algorithm performs better as it avoids the allocation of the temporary array. But in some cases the situation changes. The processing of the cyclic element rotation loops uses a minimal number of move operations. This is O(n) rather than the O(n log n) of common sort algorithms.
So when the number of arrays to sort and/or the size of the arrays grows, method #2 has an advantage because it uses fewer swap operations.
A data model requiring a sort algorithm like this is mostly broken by design. Of course, like always there are a few cases where you can't avoid this.
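The easier variation of option 2 (new target arrays filled according to the helper array) might be sketched as follows. This is only an illustration, not in-place, and the names HelperArraySort, sortOrder and reorder are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

final class HelperArraySort {
    // Helper array: positions 0..n-1 sorted by the key found at each position.
    static Integer[] sortOrder(int[] keys) {
        Integer[] order = new Integer[keys.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingInt(i -> keys[i]));
        return order;
    }

    // New array whose i-th element comes from source[order[i]].
    static <T> T[] reorder(T[] source, Integer[] order) {
        T[] result = Arrays.copyOf(source, source.length);
        for (int i = 0; i < order.length; i++) {
            result[i] = source[order[i]];
        }
        return result;
    }

    // Same as above for primitive int arrays.
    static int[] reorder(int[] source, Integer[] order) {
        int[] result = new int[source.length];
        for (int i = 0; i < order.length; i++) {
            result[i] = source[order[i]];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] indexes = {0, 2, 8, 5};
        String[] sources = {"how", "are", "today", "you"};
        String[] targets = {"I", "am", "thanks", "fine"};
        Integer[] order = sortOrder(indexes);
        System.out.println(Arrays.toString(reorder(indexes, order)));
        System.out.println(Arrays.toString(reorder(sources, order)));
        System.out.println(Arrays.toString(reorder(targets, order)));
    }
}
```

With the arrays from the question the helper array comes out as [0, 1, 3, 2], so the three printed arrays are [0, 2, 5, 8], [how, are, you, today] and [I, am, fine, thanks]. The helper array is computed once and can be reused for any number of parallel arrays.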

May I suggest using a TreeMap or something similar, with your integer as the key.
static Map<Integer, myClass> map = new TreeMap<>();
So when you want to retrieve the entries in order you only have to use a for loop or whatever you prefer.
for (int i : map.keySet()){
System.out.println("x: "+map.get(i).x+"\nsource: "+map.get(i).source+"\ntarget: "+map.get(i).target);
}

This example requires creating an Integer array of indexes, but the arrays to be sorted are reordered in place according to array1, and the arrays can be of any type (primitives or objects) that allows indexing.
public static void main(String[] args) {
int array1[]={5,1,9,3,8};
int array2[]={2,0,3,6,1};
int array3[]={3,1,4,5,9};
// generate array of indices
Integer[] I = new Integer [array1.length];
for(int i = 0; i < I.length; i++)
I[i] = i;
// sort array of indices according to array1
Arrays.sort(I, (i, j) -> array1[i]-array1[j]);
// reorder array1 ... array3 in place using sorted indices
// also reorder indices back to 0 to length-1
// time complexity is O(n)
for(int i = 0; i < I.length; i++){
if(i != I[i]){
int t1 = array1[i];
int t2 = array2[i];
int t3 = array3[i];
int j;
int k = i;
while(i != (j = I[k])){
array1[k] = array1[j];
array2[k] = array2[j];
array3[k] = array3[j];
I[k] = k;
k = j;
}
array1[k] = t1;
array2[k] = t2;
array3[k] = t3;
I[k] = k;
}
}
// display result
for (int i = 0; i < array1.length; i++) {
System.out.println("array1 " + array1[i] +
" array2 " + array2[i] +
" array3 " + array3[i]);
}
}

Another solution using a Collection (this increases the memory usage):
Let's create a sorted map that is simply a mapping between the correct index and the original position:
public static TreeMap<Integer, Integer> sortIndex(int[] array){
TreeMap<Integer, Integer> tree = new TreeMap<>();
for(int i=0; i < array.length; ++i) {
tree.put(array[i], i);
}
return tree;
}
Test :
int[] indexes = new int[] { 0, 1, 3, 2, 4, 5 };
TreeMap<Integer, Integer> map = sortIndex(indexes);
map.keySet().stream().forEach(System.out::print); //012345
map.values().stream().forEach(System.out::print); //013245
We have the indexes sorted (on the key) and the original index order as the values.
Now we can simply use this to order the array. I will be drastic and use a Stream to map and collect into a List.
public static List<String> sortInPlace(String[] array, TreeMap<Integer, Integer> map) {
return map.values().stream().map(i -> array[i]).collect(Collectors.toList());
}
Test :
String[] sources = "to be not or to be".split(" ");
int[] indexes = new int[] { 0, 1, 3, 2, 4, 5 };
TreeMap<Integer, Integer> map = sortIndex(indexes);
List<String> result = sortInPlace(sources, map);
System.out.println(result);
[to, be, or, not, to, be]
Why did I use a List? Mostly to simplify the re-ordering. If we try to reorder the original arrays in place, it gets complicated because we need to remove the mirrored key/value pair
2 -> 3
3 -> 2
Without some cleaning, we would just swap the cells twice ... so there would be no change.
If we want to reduce the memory usage a bit, we can create another array instead of using the stream and copy the values one by one while iterating the map. This could be done with multiple arrays in parallel too.

It all depends on the size of your arrays. This solution uses the first array to perform the sorting but applies each permutation to multiple arrays.
So this could have some performance issues if the sorting algorithm used needs a lot of permutations.
Here, I took a basic sorting algorithm to which I added some actions that run during the swap of two cells. This allows us to define lambdas that swap multiple arrays at the same time based on one array.
public static void sortArray( int[] array, BiConsumer<Integer, Integer>... actions ) {
int tmp;
for ( int i = 0, length = array.length; i < length; ++i ) {
tmp = array[i];
for ( int j = i + 1; j < length; ++j ) {
if ( tmp > array[j] ) {
array[i] = array[j];
array[j] = tmp;
tmp = array[i];
// Swap the other arrays
for ( BiConsumer<Integer, Integer> cons : actions ){
cons.accept( i, j);
}
}
}
}
}
Let's create a generic method to swap the cells that we can pass as a BiConsumer lambda (only works for non-primitive arrays):
public static <T> void swapCell( T[] array, int from, int to ) {
T tmp = array[from];
array[from] = array[to];
array[to] = tmp;
}
That allows us to sort the arrays like this:
public static void main( String[] args ) throws ParseException {
int[] indexes = new int[] { 0, 2, 8, 5 };
String[] sources = new String[] { "how", "are", "today", "you" };
String[] targets = new String[] { "I", "am", "thanks", "fine" };
sortArray( indexes,
( i, j ) -> swapCell( sources, i, j ),
( i, j ) -> swapCell( targets, i, j ) );
System.out.println( Arrays.toString( indexes ) );
System.out.println( Arrays.toString( sources ) );
System.out.println( Arrays.toString( targets ) );
}
[0, 2, 5, 8]
[how, are, you, today]
[I, am, fine, thanks]
This solution does not require (much) more memory than what is already used, since no additional array or Collection is required.
The use of BiConsumer<>... provides a generic solution; this could also accept an Object[]... but then it would no longer work for primitive arrays. This has a slight performance cost of course, so depending on the need it can be removed.
To build a complete solution, first let's define an interface that will be used as a factory as well:
interface Sorter {
void sort(int[] array, BiConsumer<Integer, Integer>... actions);
static void sortArrays(int[] array, BiConsumer<Integer, Integer>... actions){
// call the implemented Sorter
}
}
Then, implement a simple selection sorter with the same logic as before: for each permutation in the original array, we execute the BiConsumer:
class SelectionSorter implements Sorter {
public void sort(int[] array, BiConsumer<Integer, Integer>... actions) {
int index;
int value;
int tmp;
for (int i = 0, length = array.length; i < length; ++i) {
index = i;
value = array[i];
for (int j = i + 1; j < length; ++j) {
if (value > array[j]) {
index = j;
value = array[j];
}
}
if (index != i) {
tmp = array[i];
array[i] = array[index];
array[index] = tmp;
// Swap the other arrays
for (BiConsumer<Integer, Integer> cons : actions) {
cons.accept(i, index);
}
}
}
}
}
Let's also create a bubble sorter:
class BubbleSorter implements Sorter {
public void sort(int[] array, BiConsumer<Integer, Integer>... actions) {
int tmp;
boolean swapped;
do {
swapped = false;
for (int i = 1, length = array.length; i < length; ++i) {
if (array[i - 1] > array[i]) {
tmp = array[i];
array[i] = array[i - 1];
array[i - 1] = tmp;
// Swap the other arrays
for (BiConsumer<Integer, Integer> cons : actions) {
cons.accept(i, i - 1);
}
swapped = true;
}
}
} while (swapped);
}
}
Now, we can simply call one or the other based on a simple condition, the length:
static void sortArrays(int[] array, BiConsumer<Integer, Integer>... actions){
if(array.length < 1000){
new BubbleSorter().sort(array, actions);
} else {
new SelectionSorter().sort(array, actions);
}
}
That way, we can call our sorter simply with
Sorter.sortArrays(indexes,
(i, j) -> swapCell(sources, i, j),
(i, j) -> swapCell(targets, i, j)
);
Complete test case on ideone (limit on size because of the time out)

I wonder if my approach is valid.
public class rakesh{
public static void sort_myClass(myClass myClasses[]){
for(int i=0; i<myClasses.length; i++){
for(int j=0; j<myClasses.length-i-1; j++){
if(myClasses[j].x >myClasses[j+1].x){
myClass temp_myClass = new myClass(myClasses[j+1]);
myClasses[j+1] = new myClass(myClasses[j]);
myClasses[j] = new myClass(temp_myClass);
}
}
}
}
public static class myClass{
int x;
String source;
String target;
myClass(int x,String source,String target){
this.x = x;
this.source = source;
this.target = target;
}
myClass(myClass super_myClass){
this.x = super_myClass.x;
this.source = super_myClass.source;
this.target = super_myClass.target;
}
}
public static void main(String args[]) {
myClass myClass1 = new myClass(0,"how","I");
myClass myClass2 = new myClass(2,"are","am");
myClass myClass3 = new myClass(8,"today","thanks");
myClass myClass4 = new myClass(5,"you","fine");
myClass[] myClasses = {myClass1, myClass2, myClass3, myClass4};
sort_myClass(myClasses);
for(myClass myClass_dummy : myClasses){
System.out.print(myClass_dummy.x + " ");
}
System.out.print("\n");
for(myClass myClass_dummy : myClasses){
System.out.print(myClass_dummy.source + " ");
}
System.out.print("\n");
for(myClass myClass_dummy : myClasses){
System.out.print(myClass_dummy.target + " ");
}
}
}
If you find any error or have suggestions then please leave a comment so I could make any necessary edits.
Output
0 2 5 8
how are you today
I am fine thanks

Without assigning the values to a class, you can achieve it with the following code:
Integer[] indexes = new Integer[]{0,2,8,5};
String[] sources = new String[]{"how", "are", "today", "you"};
String[] targets = new String[]{"I", "am", "thanks", "fine"};
Integer[] sortedArray = Arrays.copyOf(indexes, indexes.length);
Arrays.sort(sortedArray);
String[] sortedSources = new String[sources.length];
String[] sortedTargets = new String[targets.length];
for (int i = 0; i < sortedArray.length; i++) {
int intValue = sortedArray[i];
// position of this value in the original, unsorted array
int inx = Arrays.asList(indexes).indexOf(intValue);
sortedSources[i] = sources[inx];
sortedTargets[i] = targets[inx];
}
// Arrays.toString prints the contents rather than the array reference
System.out.println(Arrays.toString(sortedArray));
System.out.println(Arrays.toString(sortedSources));
System.out.println(Arrays.toString(sortedTargets));

I have another solution for your question:
private void reOrder(int[] indexes, String[] sources, String[] targets){
int[] reIndexs = new int[indexes.length]; // contain index of item from MIN to MAX
String[] reSources = new String[indexes.length]; // array sources after re-order follow reIndexs
String[] reTargets = new String[indexes.length]; // array targets after re-order follow reIndexs
for (int i=0; i < (indexes.length - 1); i++){
if (i == (indexes.length - 2)){
if (indexes[i] > indexes[i+1]){
reIndexs[i] = i+1;
reIndexs[i+1] = i;
}else
{
reIndexs[i] = i;
reIndexs[i+1] = i+1;
}
}else
{
for (int j=(i+1); j < indexes.length; j++){
if (indexes[i] > indexes[j]){
reIndexs[i] = j;
}else {
reIndexs[i] = i;
}
}
}
}
// Re-order sources array and targets array
for (int index = 0; index < reIndexs.length; index++){
reSources[index] = sources[reIndexs[index]];
reTargets[index] = targets[reIndexs[index]];
}
// Print to view result
System.out.println( Arrays.toString(reIndexs));
System.out.println( Arrays.toString(reSources));
System.out.println( Arrays.toString(reTargets));
}

You can also achieve this your way.
Here I created an ArrayList myArr, sorted it based on the index value and then converted it back to the arrays. If you are satisfied with the ArrayList you can just remove the conversion; if you want arrays, this will be helpful.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
public class StackOverflow {
public static void main(String[] args) {
int[] indexes = new int[]{0,2,8,5};
String[] sources = new String[]{"how", "are", "today", "you"};
String[] targets = new String[]{"I", "am", "thanks", "fine"};
ArrayList<myClass> myArr=new ArrayList<>();
for(int i=0;i<indexes.length;i++) {
myArr.add(new myClass(indexes[i], sources[i], targets[i]));
}
//Collections.sort(myArr,new compareIndex());
// Just for readability of code
Collections.sort(myArr, (mC1, mC2) -> mC1.getX() - mC2.getX());
//Conversion Part
for (int i=0;i<myArr.size();i++){
indexes[i]=myArr.get(i).getX();
sources[i]=myArr.get(i).getSource();
targets[i]=myArr.get(i).getTarget();
}
System.out.println(Arrays.toString(indexes));
System.out.println(Arrays.toString(sources));
System.out.println(Arrays.toString(targets));
}
}
class myClass {
private Integer x;
private String source;
private String target;
public myClass(Integer x,String source,String target){
this.x=x;
this.source=source;
this.target=target;
}
public Integer getX() {
return x;
}
public String getSource() {
return source;
}
public String getTarget() {
return target;
}
}

Select N random elements from a List efficiently (without toArray and change the list)

As in the title, I want to use the Knuth-Fisher-Yates shuffle algorithm to select N random elements from a List, but without using List.toArray and without changing the list. Here is my current code:
public List<E> getNElements(List<E> list, Integer n) {
List<E> rtn = null;
if (list != null && n != null && n > 0) {
int lSize = list.size();
if (lSize > n) {
rtn = new ArrayList<E>(n);
E[] es = (E[]) list.toArray();
//Knuth-Fisher-Yates shuffle algorithm
for (int i = es.length - 1; i > es.length - n - 1; i--) {
int iRand = rand.nextInt(i + 1);
E eRand = es[iRand];
es[iRand] = es[i];
//This is not necessary here as we do not really need the final shuffle result.
//es[i] = eRand;
rtn.add(eRand);
}
} else if (lSize == n) {
rtn = new ArrayList<E>(n);
rtn.addAll(list);
} else {
log("list.size < nSub! ", lSize, n);
}
}
return rtn;
}
It uses list.toArray() to make a new array to avoid modifying the original list. However, my problem now is that my list could be very big; it can have 1 million elements. Then list.toArray() is too slow. And my n could range from 1 to 1 million. When n is small (say 2), the function is very inefficient as it still needs to do list.toArray() for a list of 1 million elements.
Can someone help improve the above code to make it more efficient when dealing with large lists? Thanks.
Here I assume the Knuth-Fisher-Yates shuffle is the best algorithm for selecting n random elements from a list. Am I right? I would be very glad to hear if there are other algorithms better than the Knuth-Fisher-Yates shuffle for this job in terms of speed and quality of the results (guaranteed real randomness).
Update:
Here are some of my test results:
When selecting n from 1000000 elements:
When n<1000000/4 the fastest way is to use Daniel Lemire's bitmap function to select n random ids first and then get the elements with these ids:
public List<E> getNElementsBitSet(List<E> list, int n) {
List<E> rtn = new ArrayList<E>(n);
int[] ids = genNBitSet(n, 0, list.size());
for (int i = 0; i < ids.length; i++) {
rtn.add(list.get(ids[i]));
}
return rtn;
}
The genNBitSet is using the code generateUniformBitmap from https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2013/08/14/java/UniformDistinct.java
When n>1000000/4 the Reservoir Sampling method is faster.
So I have built a function to combine these two methods.

You are probably looking for something like Reservoir Sampling.
Start with an initial array holding the first k elements, and replace its entries with new elements with decreasing probability:
Java-like pseudocode:
E[] r = new E[k]; //not really, cannot create an array of generic type, but just pseudo code
int i = 0;
for (E e : list) {
//assign first k elements:
if (i < k) { r[i++] = e; continue; }
//add current element with decreasing probability:
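int j = random(i + 1); //a number from 0 to i inclusive
i++; //i now counts the current element as well
if (j < k) r[j] = e; //keeps e with probability k/i; j is always a valid index of r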
}
return r;
This requires a single pass on the data, with very cheap ops every iteration, and the space consumption is linear with the required output size.

If n is very small compared to the length of the list, take an empty set of ints and keep adding a random index until the set has the right size.
If n is comparable to the length of the list, do the same, but then return items in the list that don't have indexes in the set.
In the middle ground, you can iterate through the list, and randomly select items based on how many items you've seen, and how many items you've already returned. In pseudo-code, if you want k items from N:
for i = 0 to N-1
if random(N-i) < k
add item[i] to the result
k -= 1
end
end
Here random(x) returns a random number between 0 (inclusive) and x (exclusive).
This produces a uniformly random sample of k elements. You could also consider making an iterator to avoid building the results list to save memory, assuming the list is unchanged as you're iterating over it.
By profiling, you can determine the transition point where it makes sense to switch from the naive set-building method to the iteration method.
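The pseudo-code for the middle ground could be turned into Java roughly as follows. This is a sketch only: SelectionSampling and sample are hypothetical names, and rnd.nextInt(x) stands in for random(x), since it also returns a value between 0 (inclusive) and x (exclusive).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SelectionSampling {
    // Picks k elements uniformly at random, keeping their original relative order.
    // The input list is only iterated, never modified or copied.
    static <E> List<E> sample(List<E> list, int k, Random rnd) {
        if (k < 0 || k > list.size()) {
            throw new IllegalArgumentException("k must be between 0 and list.size()");
        }
        List<E> result = new ArrayList<>(k);
        int remaining = list.size(); // items not yet visited (N - i in the pseudo-code)
        int needed = k;              // items still to pick
        for (E item : list) {
            if (needed == 0) {
                break;               // nothing left to pick, stop early
            }
            if (rnd.nextInt(remaining) < needed) {
                result.add(item);
                needed--;
            }
            remaining--;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            data.add(i);
        }
        System.out.println(sample(data, 5, new Random()));
    }
}
```

Iterating with the for-each loop rather than list.get(i) keeps the pass linear for a LinkedList as well.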

Let's assume that you can generate n random indices out of m that are pairwise distinct and then look them up efficiently in the collection. If you don't need the order of the elements to be random, then you can use an algorithm due to Robert Floyd.
Random r = new Random();
Set<Integer> s = new HashSet<Integer>();
for (int j = m - n; j < m; j++) {
int t = r.nextInt(j + 1); // uniform in [0, j]; nextInt(j) would exclude j and throw when j == 0
s.add(s.contains(t) ? j : t);
}
If you do need the order to be random, then you can run Fisher--Yates where, instead of using an array, you use a HashMap that stores only those mappings where the key and the value are distinct. Assuming that hashing is constant time, both of these algorithms are asymptotically optimal (though clearly, if you want to randomly sample most of the array, then there are data structures with better constants).
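A rough sketch of the HashMap-backed Fisher-Yates idea, in case it helps. The names SparseFisherYates and sampleIndices are invented here, and for simplicity this version may also store a mapping whose key equals its value, which is harmless but slightly wasteful.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class SparseFisherYates {
    // Returns n distinct indices from [0, m) in uniformly random order.
    // (Hypothetical helper, not part of any library.)
    static List<Integer> sampleIndices(int n, int m, Random random) {
        if (n < 0 || n > m) {
            throw new IllegalArgumentException("need 0 <= n <= m");
        }
        // swapped.get(p) is the value currently sitting at position p of the
        // virtual array 0..m-1; a missing key means position p still holds p.
        Map<Integer, Integer> swapped = new HashMap<>();
        List<Integer> result = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            int j = i + random.nextInt(m - i);          // uniform in [i, m)
            int valueAtJ = swapped.getOrDefault(j, j);
            int valueAtI = swapped.getOrDefault(i, i);
            result.add(valueAtJ);     // position i receives the value from position j
            swapped.put(j, valueAtI); // position j receives the old value of position i
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sampleIndices(5, 1000000, new Random()));
    }
}
```

Position i is never read again after step i, so only the write to position j has to be remembered; the map therefore holds at most n entries regardless of m.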

Just for convenience: an MCVE with an implementation of the Reservoir Sampling proposed by amit (possible upvotes should go to him (I'm just hacking some code))
It seems like this is indeed an algorithm that nicely covers the cases where the number of elements to select is low compared to the list size, and the cases where the number of elements is high compared to the list size (assuming that the properties about the randomness of the result that are stated on the Wikipedia page are correct).
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Random;
import java.util.TreeMap;
public class ReservoirSampling
{
public static void main(String[] args)
{
example();
//test();
}
private static void test()
{
List<String> list = new ArrayList<String>();
list.add("A");
list.add("B");
list.add("C");
list.add("D");
list.add("E");
int size = 2;
int runs = 100000;
Map<String, Integer> counts = new TreeMap<String, Integer>();
for (int i=0; i<runs; i++)
{
List<String> sample = sample(list, size);
String s = createString(sample);
Integer count = counts.get(s);
if (count == null)
{
count = 0;
}
counts.put(s, count+1);
}
for (Entry<String, Integer> entry : counts.entrySet())
{
System.out.println(entry.getKey()+" : "+entry.getValue());
}
}
private static String createString(List<String> list)
{
Collections.sort(list);
StringBuilder sb = new StringBuilder();
for (String s : list)
{
sb.append(s);
}
return sb.toString();
}
private static void example()
{
List<String> list = new ArrayList<String>();
for (int i=0; i<26; i++)
{
list.add(String.valueOf((char)('A'+i)));
}
for (int i=1; i<=26; i++)
{
printExample(list, i);
}
}
private static <T> void printExample(List<T> list, int size)
{
System.out.printf("%3d elements: "+sample(list, size)+"\n", size);
}
private static final Random random = new Random(0);
private static <T> List<T> sample(List<T> list, int size)
{
List<T> result = new ArrayList<T>(Collections.nCopies(size, (T) null));
int i = 0;
for (T element : list)
{
if (i < size)
{
result.set(i, element);
i++;
continue;
}
i++;
int j = random.nextInt(i);
if (j < size)
{
result.set(j, element);
}
}
return result;
}
}

If n is much smaller than size, you could use this algorithm, which is unfortunately quadratic in n but does not depend on the size of the array at all.
Example with size = 100 and n = 4.
Choose a random number from 0 to 99, let's say 42, and add it to the result.
Choose a random number from 0 to 98, let's say 39, and add it to the result.
Choose a random number from 0 to 97, let's say 41. Since 41 is greater than or equal to 39, increment it by 1 to get 42; that is greater than or equal to 42, so increment it again to get 43.
...
In short, you choose from the remaining numbers and then compute which number you have actually chosen. I would use a linked list for this, but maybe there are better data structures.
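Here is how that procedure might look in Java, as a sketch only. The names SmallSample and pick are made up, and an ArrayList kept in ascending order stands in for the linked list; note that the indices come back sorted, not in the order they were drawn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SmallSample {
    // Returns n distinct indices from [0, size) in ascending order.
    // O(n^2) time, independent of size. (Hypothetical helper.)
    static List<Integer> pick(int n, int size, Random random) {
        if (n < 0 || n > size) {
            throw new IllegalArgumentException("need 0 <= n <= size");
        }
        List<Integer> chosen = new ArrayList<>(n); // kept sorted ascending
        for (int i = 0; i < n; i++) {
            // rank of the new pick among the size - i numbers not chosen yet
            int r = random.nextInt(size - i);
            int pos = 0;
            // step over every already-chosen number that is <= the candidate
            while (pos < chosen.size() && chosen.get(pos) <= r) {
                r++;
                pos++;
            }
            chosen.add(pos, r); // insert so the list stays sorted
        }
        return chosen;
    }

    public static void main(String[] args) {
        System.out.println(pick(4, 100, new Random()));
    }
}
```

With chosen = [39, 42] and a draw of 41, the loop bumps the candidate to 42 and then to 43, matching the walk-through above.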

Summarizing Changwang's update. If you want more than 250,000 items, use amit's answer. Otherwise use Knuth-Fisher-Yates Shuffle as shown in entirety here
NOTE: The result is always in the original order as well
public static <T> List<T> getNRandomElements(int n, List<T> list) {
List<T> subList = new ArrayList<>(n);
int[] ids = generateUniformBitmap(n, list.size());
for (int id : ids) {
subList.add(list.get(id));
}
return subList;
}
// https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2013/08/14/java/UniformDistinct.java
private static int[] generateUniformBitmap(int num, int max) {
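if (num > max) {
// without this the while loop below would never terminate
throw new IllegalArgumentException("Can't generate " + num + " distinct ints below " + max);
}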
int[] ans = new int[num];
if (num == max) {
for (int k = 0; k < num; ++k) {
ans[k] = k;
}
return ans;
}
BitSet bs = new BitSet(max);
int cardinality = 0;
Random random = new Random();
while (cardinality < num) {
int v = random.nextInt(max);
if (!bs.get(v)) {
bs.set(v);
cardinality += 1;
}
}
int pos = 0;
for (int i = bs.nextSetBit(0); i >= 0; i = bs.nextSetBit(i + 1)) {
ans[pos] = i;
pos += 1;
}
return ans;
}
If you want them randomized, I use:
public static <T> List<T> getNRandomShuffledElements(int n, List<T> list) {
List<T> randomElements = getNRandomElements(n, list);
Collections.shuffle(randomElements);
return randomElements;
}

I needed something for this in C#, here's my solution which works on a generic List.
It selects N random elements of the list and places them at the front of the list.
So upon returning, the first N elements of the list are randomly selected. It is fast and efficient even when you're dealing with a very large number of elements.
static void SelectRandom<T>(List<T> list, int n)
{
if (n >= list.Count)
{
// n should be less than list.Count
return;
}
int max = list.Count;
var random = new Random();
for (int i = 0; i < n; i++)
{
int r = random.Next(max);
max = max - 1;
int irand = i + r;
if (i != irand)
{
T rand = list[irand];
list[irand] = list[i];
list[i] = rand;
}
}
}

Inserting objects into arraylist

I have an ArrayList of 50 random integers. I ask a user to remove a number and all occurrences of that number are removed from the list. I did that using
while (randInts.contains(removeInt) )
{
if (randInts.get(i) == removeInt)
randInts.remove(randInts.get(i));
i++;
}
System.out.println("\n" + randInts.toString());
System.out.println("\n" + randInts.size());
The other part of the problem is to prompt the user to enter another number. The removed number from above is inserted after each occurrence of the second prompted number. I am having issues with the second part as I keep getting IndexOutOfBoundsException.

Use a LinkedList instead; it's a much better choice when you need in-order traversal but not really random access, and when you need to insert and remove elements in the middle of the list.
You can accomplish what you're wanting (removing all instances of removeInt and inserting removeInt after every instance of insertAfterInt) with a simple traversal of the list's iterator:
ListIterator<Integer> li = randInts.listIterator();
while(li.hasNext()) {
int i = li.next();
if(removeInt == i) // assumes removeInt is an int; use equals() for Integer
li.remove();
if(insertAfterInt == i)
li.add(removeInt); // the iterator will skip this element, so it won't get removed
}

I see two big issues: You're not bounding i to anything, and you wrote an n^2 loop (you can do this in linear time).
You're shrinking the size of the `List` as you go...take this simple example:
Say you want to remove all instances of 5
Given a list that looks like {1,2,3,5,5}
When i = 3 you will remove the first 5, making the list look like: {1,2,3,5}
then you will attempt to remove the element at i = 4, but that element you want to remove is really now at i = 3, and you'll get the IndexOutOfBoundsException
You don't want to use a `contains`, as this expands the worst case performance of your loop to n^2, this would be faster:
int size = randInts.size() - 1;
for (int i = size; i >= 0; i--){
if (randInts.get(i).equals(removeInt))
randInts.remove(i);
}

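int i = 0;
while (randInts.contains(removeInt) )
{
if(i<randInts.size()) // no semicolon here, otherwise the check guards nothing
{
if (randInts.get(i).equals(removeInt))
randInts.remove(i); // remove by index; the next element shifts into position i
else
i++;
}//if
}//while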

I am guessing you are starting with a collection of 50 items (randInts) and removing the items that users enter (i)?
If that is the case, once you remove an item, your collection has only 49 elements left, and get() looks elements up by index. Try something like...
if (randInts.contains(i)){
randInts.remove(randInts.indexOf(i));
}

This is n^2, but it should work
int i = 0;
while(i < loFnumbers.size()){
if(loFnumbers.get(i) == removeInt){
loFnumbers.remove(i);
continue;
}
i++;
}

Here's an approach that avoids any state mutation (i.e. randInts is never modified):
package so;
import java.util.ArrayList;
public class SO_18836900 {
public static void main(String[] args) {
// build a collection of random ints
ArrayList<Integer> randInts = new ArrayList<>();
for (int i = 0; i < 50; i ++) {
randInts.add((int)(Math.random() * 5));
}
// create a collection with all 3s filtered out
ArrayList<Integer> filtered = filterOut(randInts, 3);
System.out.println(filtered);
System.out.println(filtered.size());
// create a collection with a 99 inserted after each 4
ArrayList<Integer> insertedAfter = insertAfter(randInts, 4, 99);
System.out.println(insertedAfter);
System.out.println(insertedAfter.size());
}
static ArrayList<Integer> filterOut(Iterable<Integer> xs, int toRemove) {
ArrayList<Integer> filteredInts = new ArrayList<>();
for (int x : xs) {
if (x != toRemove) filteredInts.add(x);
}
return filteredInts;
}
static ArrayList<Integer> insertAfter(Iterable<Integer> xs, int trigger, int toInsert) {
ArrayList<Integer> insertedAfter = new ArrayList<>();
for (int x : xs) {
insertedAfter.add(x);
if (x == trigger) insertedAfter.add(toInsert);
}
return insertedAfter;
}
}

if (randInts.get(i) == removeInt)
randInts.remove(randInts.get(i));
i++;
You're never checking stopping conditions. Fix is:
while (randInts.contains(removeInt) )
{
i=0;
while(i<randInts.size()){
if (randInts.get(i) == removeInt)
randInts.remove(randInts.get(i));
i++;
}
}

Don't use == on Integer objects: you are comparing references.
Either unbox into int or use equals().
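To make the two safe options concrete, here is a small sketch. The class and method names (IntegerCompare, countEquals, countUnboxed) are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IntegerCompare {
    // Counts occurrences of target using equals(), which compares values.
    static int countEquals(List<Integer> list, Integer target) {
        int count = 0;
        for (Integer x : list) {
            if (x.equals(target)) {
                count++;
            }
        }
        return count;
    }

    // Same count after unboxing to int, so == compares numbers, not references.
    static int countUnboxed(List<Integer> list, Integer target) {
        int t = target; // unbox once
        int count = 0;
        for (int x : list) { // each element is unboxed here
            if (x == t) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1000, 5, 1000, 7));
        System.out.println(countEquals(list, 1000));  // prints 2
        System.out.println(countUnboxed(list, 1000)); // prints 2
    }
}
```

Comparing two Integer references with == can appear to work for small values, because boxed values in a small range are cached, which is why the bug often goes unnoticed in quick tests.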

Find common elements in two unsorted array

I try to find a solution to this problem:
I have two arrays A and B of integers (A and B can have different dimensions). I have to find the common elements in these two arrays. I have another condition: the maximum distance between the common elements is k.
So, this is my solution. I think it is correct:
for (int i = 0; i<A.length; i++){
for (int j=jlimit; (j<B.length) && (j <= ks); j++){
if(A[i]==B[j]){
System.out.println(B[j]);
jlimit = j;
ks = j+k;
}//end if
}
}
Is there a way to make a better solution? Any suggestions? Thanks in advance!

Given your explanation, I think the most direct approach is reading array A, putting all elements in a Set (setA), do the same with B (setB), and use the retainAll method to find the intersection of both sets (items that belong to both of the sets).
You will see that the k distance is not used at all, but I see no way to use that condition that leads to code that is either faster or more maintainable. The solution I advocate works without enforcing that condition, so it also works when the condition is true (that is called "weakening the preconditions").

IMPLEMENT BINARY SEARCH AND QUICK SORT!
this will lead to tons of code.... but the fastest result.
You can sort the elements of the larger array with something like quick sort, which would lead to O(nlogn).
then iterate through the smaller array for each value and do a binary search of that particular element in the other array. Add some logic for the distance in the binary search.
I think you can get the complexity down to O(nlogn). Worst case O(n^2)
pseudo code.
larger array equals a
other array equals b
sort a
iterate through b
binary search a for the value of b at the iterated index
// I would throw (last index - index) logic in binary search
// to exit out of that even faster by returning "NOT FOUND" as soon as that is hit.
if found && (last index - index) is less than or equal to k
store last index
print value
this is the fastest way possible to do your problem i believe.
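A minimal Java sketch of the sort-then-binary-search part, assuming int arrays. It deliberately leaves out the k-distance check, because sorting discards the original positions; the names SortAndSearch and common are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortAndSearch {
    // Returns the values of the smaller array that also occur in the larger one.
    // (Hypothetical helper; the "within k" condition is not handled here.)
    static List<Integer> common(int[] x, int[] y) {
        int[] larger = x.length >= y.length ? x : y;
        int[] smaller = x.length >= y.length ? y : x;
        int[] sorted = larger.clone(); // sort a copy so the input stays untouched
        Arrays.sort(sorted);
        List<Integer> result = new ArrayList<>();
        for (int value : smaller) {
            // binarySearch returns a non-negative index only when value is present
            if (Arrays.binarySearch(sorted, value) >= 0) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = { 1, 4, 5, 2, 7, 3, 9 };
        int[] b = { 5, 2, 4, 9, 5 };
        System.out.println(common(a, b)); // prints [5, 2, 4, 9, 5]
    }
}
```

Duplicates in the smaller array are reported once per occurrence; collect into a Set instead if each common value should appear only once.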

Although this would be a cheat, since it uses HashSets, it is pretty nice for a Java implementation of this algorithm. If you need the pseudocode for the algorithm, don't read any further.
Source and author in the JavaDoc. Cheers.
import java.util.HashSet;
/**
* @author Crunchify.com
*/
public class CrunchifyIntersection {
public static void main(String[] args) {
Integer[ ] arrayOne = { 1, 4, 5, 2, 7, 3, 9 };
Integer[ ] arrayTwo = { 5, 2, 4, 9, 5 };
Integer[ ] common = CrunchifyIntersection.findCommon( arrayOne, arrayTwo );
System.out.print( "Common Elements Between Two Arrays: " );
for( Integer entry : common ) {
System.out.print( entry + " " );
}
}
public static Integer[ ] findCommon( Integer[ ] arrayOne, Integer[ ] arrayTwo ) {
Integer[ ] arrayToHash;
Integer[ ] arrayToSearch;
if( arrayOne.length < arrayTwo.length ) {
arrayToHash = arrayOne;
arrayToSearch = arrayTwo;
} else {
arrayToHash = arrayTwo;
arrayToSearch = arrayOne;
}
HashSet<Integer> intersection = new HashSet<Integer>( );
HashSet<Integer> hashedArray = new HashSet<Integer>( );
for( Integer entry : arrayToHash ) {
hashedArray.add( entry );
}
for( Integer entry : arrayToSearch ) {
if( hashedArray.contains( entry ) ) {
intersection.add( entry );
}
}
return intersection.toArray( new Integer[ 0 ] );
}
}

Your implementation is roughly O(A.length*2k).
That seems to be about the best you're going to do if you want to maintain your "no more than k away" logic, as that rules out sorting and the use of sets. I would alter a little to make your code more understandable.
First, I would ensure that you iterate over the smaller of the two arrays. This would make the complexity O(min(A.length, B.length)*2k).
To understand the purpose of this, consider the case where A has 1 element and B has 100. In this case, we are only going to perform one iteration in the outer loop, and k iterations in the inner loop.
Now consider when A has 100 elements, and B has 1. In this case, we will perform 100 iterations on the outer loop, and 1 iteration each on the inner loop.
If k is less than the length of your long array, iterating over the shorter array in the outer loop will be more efficient.
Then, I would change how you're calculating the k distance stuff just for readability's sake. The code I've written demonstrates this.
Here's what I would do:
//not sure what type of array we're dealing with here, so I'll assume int.
int[] toIterate;
int[] toSearch;
if (A.length > B.length)
{
toIterate = B;
toSearch = A;
}
else
{
toIterate = A;
toSearch = B;
}
for (int i = 0; i < toIterate.length; i++)
{
// set j to k away in the negative direction
int j = i - k;
if (j < 0)
j = 0;
// only iterate until j is k past i
for (; (j < toSearch.length) && (j <= i + k); j++)
{
if(toIterate[i] == toSearch[j])
{
System.out.println(toSearch[j]);
}
}
}
Your use of jlimit and ks may work, but handling your k distance like this is more understandable for your average programmer (and it's marginally more efficient).

O(N) solution (BloomFilters):
Here is a solution using bloom filters (implementation is from the Guava library)
public static <T> T findCommon_BloomFilterImpl(T[] A, T[] B, Funnel<T> funnel) {
BloomFilter<T> filter = BloomFilter.create(funnel, A.length + B.length);
for (T t : A) {
filter.put(t);
}
for (T t : B) {
if (filter.mightContain(t)) {
return t;
}
}
return null;
}
use it like this:
Integer j = Masking.findCommon_BloomFilterImpl(new Integer[]{12, 2, 3, 4, 5222, 622, 71, 81, 91, 10}, new Integer[]{11, 100, 15, 18, 79, 10}, Funnels.integerFunnel());
Assert.assertNotNull(j);
Assert.assertEquals(10, j.intValue());
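Runs in O(N) since calculating the hash for an Integer is straightforward. It is still O(N) if you can reduce the hash calculation for your elements to O(1) or a small O(K), where K is the size of each element. Keep in mind that mightContain can report false positives, so a returned element is only probably common; confirm it with an exact check if that matters.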
O(N.LogN) solution (sorting and iterating):
Sorting and the iterating through the array will lead you to a O(N*log(N)) solution:
public static <T extends Comparable<T>> T findCommon(T[] A, T[] B, Class<T> clazz) {
T[] array = concatArrays(A, B, clazz);
Arrays.sort(array);
for (int i = 1; i < array.length; i++) {
if (array[i - 1].equals(array[i])) { //put your own equality check here
return array[i];
}
}
return null;
}
concatArrays(~) is in O(N) of course. For an object array such as T[], Arrays.sort(~) uses a stable merge sort (TimSort) with complexity in O(N.logN); the dual-pivot QuickSort is only used for arrays of primitives. Iterating through the array again is O(N).
So we have O((N+2).logN) ~> O(N.logN).
As a general case solution (without the "within k" condition of your problem), this is better than yours. It should be considered for k "close to" N in your precise case.

Simple solution if arrays are already sorted
public static void get_common_courses(Integer[] courses1, Integer[] courses2) {
// Sort both arrays if input is not sorted
//Arrays.sort(courses1);
//Arrays.sort(courses2);
int i=0, j=0;
while(i<courses1.length && j<courses2.length) {
if(courses1[i] > courses2[j]) {
j++;
} else if(courses1[i] < courses2[j]){
i++;
} else {
System.out.println(courses1[i]);
i++;j++;
}
}
}
Apache commons collections API has done this in efficient way without sorting
public static Collection intersection(final Collection a, final Collection b) {
ArrayList list = new ArrayList();
Map mapa = getCardinalityMap(a);
Map mapb = getCardinalityMap(b);
Set elts = new HashSet(a);
elts.addAll(b);
Iterator it = elts.iterator();
while(it.hasNext()) {
Object obj = it.next();
for(int i=0,m=Math.min(getFreq(obj,mapa),getFreq(obj,mapb));i<m;i++) {
list.add(obj);
}
}
return list;
}

Solution using Java 8
static <T> Collection<T> intersection(Collection<T> c1, Collection<T> c2) {
if (c1.size() < c2.size())
return intersection(c2, c1);
Set<T> c2set = new HashSet<>(c2);
return c1.stream().filter(c2set::contains).distinct().collect(Collectors.toSet());
}
Use Arrays::asList and boxed values of primitives:
Integer[] a =...
Collection<Integer> res = intersection(Arrays.asList(a),Arrays.asList(b));

Generic solution
public static void main(String[] args) {
String[] a = { "a", "b" };
String[] b = { "c", "b" };
String[] intersection = intersection(a, b, a[0].getClass());
System.out.println(Arrays.toString(intersection));
Integer[] aa = { 1, 3, 4, 2 };
Integer[] bb = { 1, 19, 4, 5 };
Integer[] intersectionaabb = intersection(aa, bb, aa[0].getClass());
System.out.println(Arrays.toString(intersectionaabb));
}
@SuppressWarnings("unchecked")
private static <T> T[] intersection(T[] a, T[] b, Class<? extends T> c) {
HashSet<T> s = new HashSet<>(Arrays.asList(a));
s.retainAll(Arrays.asList(b));
return s.toArray((T[]) Array.newInstance(c, s.size()));
}
Output
[b]
[1, 4]
