Find common elements in two unsorted arrays - Java

I am trying to find a solution to this problem:
I have two integer arrays A and B (they can have different lengths). I have to find the elements common to both arrays. There is one more condition: the maximum distance between the common elements is k.
This is my solution. I think it is correct:
for (int i = 0; i<A.length; i++){
for (int j=jlimit; (j<B.length) && (j <= ks); j++){
if(A[i]==B[j]){
System.out.println(B[j]);
jlimit = j;
ks = j+k;
}//end if
}
}
Is there a way to make a better solution? Any suggestions? Thanks in advance!

Given your explanation, I think the most direct approach is to read array A and put all its elements in a Set (setA), do the same with B (setB), and use the retainAll method to find the intersection of both sets (the items that belong to both sets).
You will see that the k distance is not used at all, but I see no way to use that condition that leads to code that is either faster or more maintainable. The solution I advocate works without enforcing that condition, so it also works when the condition is true (that is called "weakening the preconditions").
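The set-based approach described above could look roughly like the following. This is only a sketch: the class name SetIntersection and the method name common are made up for illustration, and the k distance is deliberately ignored, as the answer explains.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SetIntersection {
    /** Returns the distinct values that occur in both arrays (hypothetical helper). */
    static Set<Integer> common(int[] a, int[] b) {
        Set<Integer> setA = new LinkedHashSet<>();
        for (int v : a) {
            setA.add(v);
        }
        Set<Integer> setB = new LinkedHashSet<>();
        for (int v : b) {
            setB.add(v);
        }
        // retainAll keeps only the elements of setA that are also in setB
        setA.retainAll(setB);
        return setA;
    }
}
```

Calling SetIntersection.common(A, B) returns each common value once, in the order in which it first appears in A.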

Implement quick sort and binary search!
This takes more code than the other approaches, but it is fast.
Sort the elements of the larger array, for example with quick sort, which is O(n log n) on average.
Then iterate through the smaller array and, for each value, do a binary search for that element in the sorted array. Add some logic for the distance in the binary search.
Overall I think you can get the complexity down to O(n log n); the worst case is O(n^2) if quick sort degenerates.
Pseudocode:
larger array equals a
other array equals b
sort a
iterate through b
binary search a for the value of b at the iterated index
// I would put the (last index - index) logic in the binary search
// to exit even faster by returning "NOT FOUND" as soon as that limit is hit.
if found && (last index - index) is less than or equal to k
store last index
print value
I believe this is one of the fastest ways to solve your problem.
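A minimal Java sketch of the sort-then-search idea, using the standard Arrays.sort and Arrays.binarySearch instead of hand-written versions. The class and method names are hypothetical, and the k distance check is left out, because sorting discards the original positions that such a check would need.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortAndSearch {
    /** Returns every element of the smaller array that also occurs in the larger one. */
    static List<Integer> common(int[] x, int[] y) {
        int[] larger = x.length >= y.length ? x : y;
        int[] smaller = x.length >= y.length ? y : x;

        // Sort a copy so the caller's array keeps its original order
        int[] sorted = larger.clone();
        Arrays.sort(sorted);

        List<Integer> result = new ArrayList<>();
        for (int v : smaller) {
            // binarySearch returns a non-negative index only when v is present
            if (Arrays.binarySearch(sorted, v) >= 0) {
                result.add(v);
            }
        }
        return result;
    }
}
```

A value that is repeated in the smaller array is reported once per occurrence; collect the matches in a Set if each value should appear only once.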

Although this would be a cheat, since it uses HashSets, it is pretty nice for a Java implementation of this algorithm. If you need the pseudocode for the algorithm, don't read any further.
Source and author in the JavaDoc. Cheers.
import java.util.HashSet;
/**
* @author Crunchify.com
*/
public class CrunchifyIntersection {
public static void main(String[] args) {
Integer[ ] arrayOne = { 1, 4, 5, 2, 7, 3, 9 };
Integer[ ] arrayTwo = { 5, 2, 4, 9, 5 };
Integer[ ] common = CrunchifyIntersection.findCommon( arrayOne, arrayTwo );
System.out.print( "Common Elements Between Two Arrays: " );
for( Integer entry : common ) {
System.out.print( entry + " " );
}
}
public static Integer[ ] findCommon( Integer[ ] arrayOne, Integer[ ] arrayTwo ) {
Integer[ ] arrayToHash;
Integer[ ] arrayToSearch;
if( arrayOne.length < arrayTwo.length ) {
arrayToHash = arrayOne;
arrayToSearch = arrayTwo;
} else {
arrayToHash = arrayTwo;
arrayToSearch = arrayOne;
}
HashSet<Integer> intersection = new HashSet<Integer>( );
HashSet<Integer> hashedArray = new HashSet<Integer>( );
for( Integer entry : arrayToHash ) {
hashedArray.add( entry );
}
for( Integer entry : arrayToSearch ) {
if( hashedArray.contains( entry ) ) {
intersection.add( entry );
}
}
return intersection.toArray( new Integer[ 0 ] );
}
}

Your implementation is roughly O(A.length*2k).
That seems to be about the best you can do if you want to keep your "no more than k away" logic, since that rules out sorting and the use of sets. I would alter it a little to make your code more understandable.
First, I would ensure that you iterate over the smaller of the two arrays. This would make the complexity O(min(A.length, B.length)*2k).
To understand the purpose of this, consider the case where A has 1 element and B has 100. In this case, we are only going to perform one iteration in the outer loop, and k iterations in the inner loop.
Now consider when A has 100 elements, and B has 1. In this case, we will perform 100 iterations on the outer loop, and 1 iteration each on the inner loop.
If k is less than the length of your long array, iterating over the shorter array in the outer loop will be more efficient.
Then, I would change how you're calculating the k distance stuff just for readability's sake. The code I've written demonstrates this.
Here's what I would do:
//not sure what type of array we're dealing with here, so I'll assume int.
int[] toIterate;
int[] toSearch;
if (A.length > B.length)
{
toIterate = B;
toSearch = A;
}
else
{
toIterate = A;
toSearch = B;
}
for (int i = 0; i < toIterate.length; i++)
{
// set j to k away in the negative direction
int j = i - k;
if (j < 0)
j = 0;
// only iterate until j is k past i
for (; (j < toSearch.length) && (j <= i + k); j++)
{
if(toIterate[i] == toSearch[j])
{
System.out.println(toSearch[j]);
}
}
}
Your use of jlimit and ks may work, but handling your k distance like this is more understandable for your average programmer (and it's marginally more efficient).

O(N) solution (BloomFilters):
Here is a solution using bloom filters (implementation is from the Guava library)
public static <T> T findCommon_BloomFilterImpl(T[] A, T[] B, Funnel<T> funnel) {
BloomFilter<T> filter = BloomFilter.create(funnel, A.length + B.length);
for (T t : A) {
filter.put(t);
}
for (T t : B) {
if (filter.mightContain(t)) {
return t;
}
}
return null;
}
use it like this:
Integer j = Masking.findCommon_BloomFilterImpl(new Integer[]{12, 2, 3, 4, 5222, 622, 71, 81, 91, 10}, new Integer[]{11, 100, 15, 18, 79, 10}, Funnels.integerFunnel());
Assert.assertNotNull(j);
Assert.assertEquals(10, j.intValue());
Runs in O(N), since calculating the hash of an Integer is straightforward. It is still O(N) if you can reduce the hash calculation for your elements to O(1), or to a small O(K) where K is the size of each element. Keep in mind that a Bloom filter can report false positives, so a hit from mightContain should be confirmed with an exact check when correctness matters.
O(N log N) solution (sorting and iterating):
Sorting and then iterating through the array leads to an O(N*log(N)) solution:
public static <T extends Comparable<T>> T findCommon(T[] A, T[] B, Class<T> clazz) {
T[] array = concatArrays(A, B, clazz);
Arrays.sort(array);
for (int i = 1; i < array.length; i++) {
if (array[i - 1].equals(array[i])) { //put your own equality check here
return array[i];
}
}
return null;
}
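concatArrays(~) is O(N), of course. For object arrays, Arrays.sort(~) uses a merge sort variant (TimSort) with O(N log N) complexity (the dual-pivot quicksort is used for primitive arrays), and iterating through the array again is O(N).
So we have O(N + N log N + N), which is O(N log N).
As a general-case solution (without the "within k" condition of your problem) it is better than yours. It should be considered when k is "close to" N in your precise case. Note that it also reports a value that is repeated within a single array, so it assumes that neither array contains duplicates.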

Simple solution if arrays are already sorted
public static void get_common_courses(Integer[] courses1, Integer[] courses2) {
// Sort both arrays if input is not sorted
//Arrays.sort(courses1);
//Arrays.sort(courses2);
int i=0, j=0;
while(i<courses1.length && j<courses2.length) {
if(courses1[i] > courses2[j]) {
j++;
} else if(courses1[i] < courses2[j]){
i++;
} else {
System.out.println(courses1[i]);
i++;j++;
}
}
}
The Apache Commons Collections API does this efficiently without sorting:
public static Collection intersection(final Collection a, final Collection b) {
ArrayList list = new ArrayList();
Map mapa = getCardinalityMap(a);
Map mapb = getCardinalityMap(b);
Set elts = new HashSet(a);
elts.addAll(b);
Iterator it = elts.iterator();
while(it.hasNext()) {
Object obj = it.next();
for(int i=0,m=Math.min(getFreq(obj,mapa),getFreq(obj,mapb));i<m;i++) {
list.add(obj);
}
}
return list;
}

Solution using Java 8
static <T> Collection<T> intersection(Collection<T> c1, Collection<T> c2) {
if (c1.size() < c2.size())
return intersection(c2, c1);
Set<T> c2set = new HashSet<>(c2);
return c1.stream().filter(c2set::contains).distinct().collect(Collectors.toSet());
}
Use Arrays::asList and boxed values of primitives:
Integer[] a =...
Collection<Integer> res = intersection(Arrays.asList(a),Arrays.asList(b));

Generic solution
public static void main(String[] args) {
String[] a = { "a", "b" };
String[] b = { "c", "b" };
String[] intersection = intersection(a, b, a[0].getClass());
System.out.println(Arrays.toString(intersection));
Integer[] aa = { 1, 3, 4, 2 };
Integer[] bb = { 1, 19, 4, 5 };
Integer[] intersectionaabb = intersection(aa, bb, aa[0].getClass());
System.out.println(Arrays.toString(intersectionaabb));
}
@SuppressWarnings("unchecked")
private static <T> T[] intersection(T[] a, T[] b, Class<? extends T> c) {
HashSet<T> s = new HashSet<>(Arrays.asList(a));
s.retainAll(Arrays.asList(b));
return s.toArray((T[]) Array.newInstance(c, s.size()));
}
Output
[b]
[1, 4]

Related

Randomizing set of duplicate arrays in Java without repeating elements

In my problem I have few arrays with numbers 1 - 3,
[1,2,3], [1,2,3]
I combined the arrays into one full array,
[1,2,3, 1,2,3]
I need to randomize the array each run, so that no element repeats.
For example, this would work
[1, 2, 1, 3, 2, 3]
but this would not.
[1,2,2,3,1,3]
I chose 1,2,3 to simplify it, but my arrays would consist of the numbers 1 - 6. The idea remains the same though. Is there an algorithm or easy method to accomplish this?

This is a heuristic solution for random shuffling that does not allow consecutive duplicates. It applies to lists, but it is easy to transfer to arrays, since it only swaps elements and requires no shift operations. It seems to work in the majority of cases for lists consisting of millions of elements and various density factors, but always keep in mind that heuristic algorithms may never find a solution. It uses logic from genetic algorithms, except that this version uses a single individual and selective mutation only (it is easy to convert it to a real genetic algorithm, though). It is simple and works as follows:
If a duplicate is found, try swapping it with a random element after it; if that is not possible, try swapping it with an element before it (or vice versa). The key point here is the random position for exchanging elements, which keeps the random output closer to a uniform distribution.
This question has been asked in alternative forms, but I couldn't find an acceptable solution yet. Unfortunately, like most of the proposed answers (except for the "greedy" approaches of re-shuffling until we get a match or computing every combination), this solution does not provide a perfectly uniform distribution. It seems to minimize some patterns, but it is still not possible to remove every pattern, as you can see below. Try it and post any comments for potential improvements.
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Random;
//Heuristic Non-Consecutive Duplicate (NCD) Shuffler
public class NCDShuffler {
private static Random random = new Random();
//private static int swaps = 0;
public static <T> void shuffle (List<T> list) {
if (list == null || list.size() <= 1) return;
int MAX_RETRIES = 10; //it's heuristic
boolean found;
int retries = 1;
do {
Collections.shuffle(list);
found = true;
for (int i = 0; i < list.size() - 1; i++) {
T cur = list.get(i);
T next = list.get(i + 1);
if (cur.equals(next)) {
//choose between front and back with some probability based on the size of sublists
int r = random.nextInt(list.size());
if ( i < r) {
if (!swapFront(i + 1, next, list, true)) {
found = false;
break;
}
} else {
if (!swapBack(i + 1, next, list, true)) {
found = false;
break;
}
}
}
}
retries++;
} while (retries <= MAX_RETRIES && !found);
}
//try to swap it with an element in a random position after it
private static <T> boolean swapFront(int index, T t, List<T> list, boolean first) {
if (index == list.size() - 1) return first ? swapBack(index, t, list, false) : false;
int n = list.size() - index - 1;
int r = random.nextInt(n) + index + 1;
int counter = 0;
while (counter < n) {
T t2 = list.get(r);
if (!t.equals(t2)) {
Collections.swap(list, index, r);
//swaps++;
return true;
}
r++;
if (r == list.size()) r = index + 1;
counter++;
}
//can't move it front, try back
return first ? swapBack(index, t, list, false) : false;
}
//try to swap it with an element in a random "previous" position
private static <T> boolean swapBack(int index, T t, List<T> list, boolean first) {
if (index <= 1) return first ? swapFront(index, t, list, false) : false;
int n = index - 1;
int r = random.nextInt(n);
int counter = 0;
while (counter < n) {
T t2 = list.get(r);
if (!t.equals(t2) && !hasEqualNeighbours(r, t, list)) {
Collections.swap(list, index, r);
//swaps++;
return true;
}
r++;
if (r == index) r = 0;
counter++;
}
return first ? swapFront(index, t, list, false) : false;
}
//check if an element t can fit in position i
public static <T> boolean hasEqualNeighbours(int i, T t, List<T> list) {
if (list.size() == 1)
return false;
else if (i == 0) {
if (t.equals(list.get(i + 1)))
return true;
return false;
} else {
if (t.equals(list.get(i - 1)) || (t.equals(list.get(i + 1))))
return true;
return false;
}
}
//check if shuffled with no consecutive duplicates
public static <T> boolean isShuffledOK(List<T> list) {
for (int i = 1; i < list.size(); i++) {
if (list.get(i).equals(list.get(i - 1)))
return false;
}
return true;
}
//count consecutive duplicates, the smaller the better; We need ZERO
public static <T> int getFitness(List<T> list) {
int sum = 0;
for (int i = 1; i < list.size(); i++) {
if (list.get(i).equals(list.get(i - 1)))
sum++;
}
return sum;
}
//let's test it
public static void main (String args[]) {
HashMap<Integer, Integer> freq = new HashMap<Integer, Integer>();
//initialise a list
List<Integer> list = new ArrayList<Integer>();
list.add(1);
list.add(1);
list.add(2);
list.add(3);
/*for (int i = 0; i<100000; i++) {
list.add(random.nextInt(10));
}*/
//Try to put each output in the frequency Map
//then check if it's a uniform distribution
Integer hash;
for (int i = 0; i < 10000; i++) {
//shuffle it
shuffle(list);
hash = hash(list);
if (freq.containsKey(hash)) {
freq.put(hash, freq.get(hash) + 1);
} else {
freq.put(hash, 1);
}
}
System.out.println("Unique Outputs: " + freq.size());
System.out.println("EntrySet: " + freq.entrySet());
//System.out.println("Swaps: " + swaps);
//for the last shuffle
System.out.println("Shuffled OK: " + isShuffledOK(list));
System.out.println("Consecutive Duplicates: " + getFitness(list));
}
//test hash
public static int hash (List<Integer> list) {
int h = 0;
for (int i = 0; (i < list.size() && i < 9); i++) {
h += list.get(i) * (int)Math.pow(10, i); //it's reversed, but OK
}
return h;
}
}
This is a sample output; it's easy to understand the issue with the non-uniform distribution.
Unique Outputs: 6
EntrySet: [1312=1867, 3121=1753, 2131=1877, 1321=1365, 1213=1793, 1231=1345]
Shuffled OK: true
Consecutive Duplicates: 0

You could use Collections.shuffle to randomize the list. Do it in a while loop, until the list passes your constraint.
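A minimal sketch of that retry loop. The names RetryShuffle, hasAdjacentDuplicates and shuffleUntilValid are made up for illustration, and the maxAttempts limit is an added assumption so that the loop cannot run forever when no valid ordering exists (for example [1, 1, 1, 2]).

```java
import java.util.Collections;
import java.util.List;

class RetryShuffle {
    /** True if any two neighbouring elements are equal. */
    static <T> boolean hasAdjacentDuplicates(List<T> list) {
        for (int i = 1; i < list.size(); i++) {
            if (list.get(i).equals(list.get(i - 1))) {
                return true;
            }
        }
        return false;
    }

    /** Shuffles in place until the constraint holds; returns false if it gives up. */
    static <T> boolean shuffleUntilValid(List<T> list, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Collections.shuffle(list);
            if (!hasAdjacentDuplicates(list)) {
                return true;
            }
        }
        return false;
    }
}
```

Because every valid ordering is equally likely to come out of Collections.shuffle, the accepted result is uniformly distributed over the valid orderings; the price is an unpredictable number of retries.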

If the arrays are relatively small, it would not be too hard to just combine the two arrays, randomize the result, and then check the numbers; if two identical numbers end up next to each other, just shift one over or randomize again.

There's no pre-written algorithm that I know of (which doesn't mean one doesn't exist), but the problem is easy to understand and the implementation is straightforward.
I will offer two suggestions, depending on whether you want to build a valid array directly or build an array and then check its validity.
1 - Create some collection (Array, ArrayList, etc) that contains all of the possible values that will be included in your final array. Grab one of those values and add it to the array. Store a copy of that value in a variable. Grab another value from the possible values, check that it's not equal to your previous value, and add it to the array if it's valid.
2 - Create an array that contains the number of values you want. Check that item n != item n+1 for all items except the last one. If you fail one of those checks, either generate a new random value for that location or add or subtract some constant from the value at that location. Once you have checked all of the values in this array, you know you have a valid array. Assuming the first and last values can be the same.

The best solution I can think of is to count the number of occurrences of each value, logically creating a "pool" for each distinct value.
You then randomly choose a value from any of the pools except the pool of the previously selected value. The random selection is weighted by pool sizes.
If a pool holds more than half of all remaining values, then you must choose from that pool, in order to prevent repetition at the end.
This way you can produce a result quickly, without any form of retry or backtracking.
Example (using letters as values to clarify difference from counts):
Input: A, B, C, A, B, C
Action                               Selected  Pools (Count)
                                               A(2) B(2) C(2)
Random from all 3 pools              A         A(1) B(2) C(2)
Random from B+C pools                C         A(1) B(2) C(1)
Random from A+B pools (1:2 ratio)    A         A(0) B(2) C(1)
Must choose B (>half)                B         A(0) B(1) C(1)
Random from A+C, so C                C         A(0) B(1) C(0)
Must choose B (>half)                B         A(0) B(0) C(0)
Result: A, C, A, B, C, B
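The pool idea could be sketched in Java as follows. The class name PoolShuffle is hypothetical, the input is assumed to contain no null elements, and throwing an exception for inputs that have no valid ordering is an added assumption, not part of the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class PoolShuffle {
    /** Returns a random ordering of input in which no two neighbours are equal. */
    static <T> List<T> shuffle(List<T> input, Random rnd) {
        // One pool per distinct value, holding the number of copies still unused
        Map<T, Integer> pools = new LinkedHashMap<>();
        for (T value : input) {
            pools.merge(value, 1, Integer::sum);
        }
        List<T> result = new ArrayList<>(input.size());
        T previous = null;
        for (int remaining = input.size(); remaining > 0; remaining--) {
            T pick = null;
            // A pool holding more than half of the remaining values must be used now
            for (Map.Entry<T, Integer> pool : pools.entrySet()) {
                if (2 * pool.getValue() > remaining) {
                    pick = pool.getKey();
                }
            }
            if (pick != null && pick.equals(previous)) {
                throw new IllegalArgumentException("no ordering without adjacent duplicates exists");
            }
            if (pick == null) {
                // Weighted random choice among all pools except the previous value's pool
                int candidates = remaining - pools.getOrDefault(previous, 0);
                int r = rnd.nextInt(candidates);
                for (Map.Entry<T, Integer> pool : pools.entrySet()) {
                    if (pool.getKey().equals(previous)) {
                        continue;
                    }
                    r -= pool.getValue();
                    if (r < 0) {
                        pick = pool.getKey();
                        break;
                    }
                }
            }
            pools.merge(pick, -1, Integer::sum); // take one copy out of the chosen pool
            result.add(pick);
            previous = pick;
        }
        return result;
    }
}
```

For example, PoolShuffle.shuffle(Arrays.asList(1, 2, 3, 1, 2, 3), new Random()) returns the six values in a random order with no equal neighbours, in a single pass.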

Merge Sort Recursion

This is code from Introduction to Java Programming for merge sort. The method is implemented recursively.
public class MergeSort {
2 /** The method for sorting the numbers */
3 public static void mergeSort(int[] list) {
4 if (list.length > 1) {
5 // Merge sort the first half
6 int[] firstHalf = new int[list.length / 2];
7 System.arraycopy(list, 0, firstHalf, 0, list.length / 2);
8 mergeSort(firstHalf);
9
10 // Merge sort the second half
11 int secondHalfLength = list.length - list.length / 2;
12 int[] secondHalf = new int[secondHalfLength];
13 System.arraycopy(list, list.length / 2,
14 secondHalf, 0, secondHalfLength);
15 mergeSort(secondHalf);
16
17 // Merge firstHalf with secondHalf into list
18 merge(firstHalf, secondHalf, list);
19 }
20 }
My question: does Line 8 call the method "mergeSort" again, recursively? If execution starts again from the beginning of the method, the "firstHalf" array will be created again and its length will be halved. I thought "firstHalf" could not be created again, and that its length could not change once the array had been defined.
Here is the whole code link: Merge Sort Java.

This is a beginner's way of thinking, and I thought exactly the same when I first encountered this. I couldn't believe that the size of the same array could change dynamically. Understand this: in the code below, arrays l and r are created with different sizes in every recursive call. Don't get confused by this.
Yes, the size of one and the same array can never change dynamically. What looks like an exception here is not one: every call creates new arrays. We will see this very often as we move forward.
It's recursion: in recursion things change dynamically, and all these
changes are stored on the call stack.
It's confusing, but it's really interesting if you ponder over it. Merge sort can be implemented in quite different ways, but the underlying concept of recursion is the same. Don't get confused here; it may be better to follow another way of learning it, for example a video.
Merge sort first takes a list or an array. Let's imagine that
a.length, the length of the array, is 8.
Now the end goal is to split the array recursively until it reaches a point where only one element is left (or none). A single element is always sorted.
See the base case in the below code:
if(a.length<2) /*Remember this is the base case*/
{
return;
}
Once it reaches single elements, it merges them back together in sorted order. At every level the two halves are already sorted, which makes them easy to merge, and this way you end up with a completely sorted array. The only reason we do all this is to get an algorithm with a better running time, which is O(n log n).
All the other simple sorting algorithms (insertion, bubble, and selection) take O(n^2), which is a lot. So a better solution was needed. I know it's annoying; I went through the same thing.
Please do some research on recursion before you attempt this. Understand recursion clearly first. Put all of this aside, take a simple recursion example, and start working on it. Take the factorial example: it's a bad example of recursion, but it's easy to understand.
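For reference, the factorial example mentioned above can be sketched like this (the class name is made up). It has the same shape as merge sort: a base case that stops the recursion, and a call to itself on a smaller input.

```java
class Factorial {
    /** Computes n! recursively; n is assumed to be small enough to fit in a long. */
    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n <= 1) {
            return 1; // base case: the recursion stops here
        }
        // Each call waits on the call stack until the smaller problem returns
        return n * factorial(n - 1);
    }
}
```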
Top-down MergeSort
See my code; it's nice and easy. Again, neither version is easy to understand on your first attempt. You should get comfortable with recursion before you attempt to understand these things. All the very best.
public class MergeSort
{
private int low;
private int high;
private int mid;
public static int[] a;
public MergeSort(int x)
{
a = new int[x];
a[0]=19;
a[1]=10;
a[2]=0;
a[3]=220;
a[4]=80;
a[5]=2000;
a[6]=56001;
a[7]=2;
}
public void division(int[] a)
{
low=0;
int p;
high = a.length;
mid = (high+low)/2;
if(a.length<2) /*Remember this is the base case*/
{
return;
}
else
{
int[] l = new int[mid];
int[] r = new int[high-mid];
/*copying elements from a into l and r*/
for(p=0;p<mid;p++)
l[p]=a[p];
for(int q=0;q<high-mid;q++, p++)
r[q]=a[p];
/*first recursive call starts from here*/
division(l);
division(r);
sortMerge(a, l, r);
}
}
public void sortMerge(int[] a, int[] l, int[] r)
{
int i=0, j=0, k=0;
/*sorting and then merging recursively*/
while(i<l.length && j<r.length)
{
if(l[i]<r[j])
{
a[k] = l[i]; /*copying sorted elements into a*/
i++;
k++;
}
else
{
a[k] = r[j];
j++;
k++;
}
}
/*copying remaining elements into a*/
while(i<l.length)
{
a[k] = l[i];
i++;
k++;
}
while(j<r.length)
{
a[k] = r[j];
j++;
k++;
}
}
/*method display elements in an array*/
public void display()
{
for(int newIndex=0;newIndex<a.length;newIndex++)
{
System.out.println(a[newIndex]);
}
}
public static void main(String[] args)
{
MergeSort obj = new MergeSort(8);
obj.division(a);
obj.display();
}
}

As Emz pointed out, this is due to scoping: every invocation of the method creates new local variables. The Java Language Specification says:
Local variables are declared by local variable declaration statements
(§14.4).
Whenever the flow of control enters a block (§14.2) or for statement
(§14.14), a new variable is created for each local variable declared
in a local variable declaration statement immediately contained within
that block or for statement.
A local variable declaration statement may contain an expression which
initializes the variable. The local variable with an initializing
expression is not initialized, however, until the local variable
declaration statement that declares it is executed. (The rules of
definite assignment (§16) prevent the value of a local variable from
being used before it has been initialized or otherwise assigned a
value.) The local variable effectively ceases to exist when the
execution of the block or for statement is complete.
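A small sketch that makes this visible. The class LocalsPerCall and its log parameter are hypothetical; the method only mirrors the first half of mergeSort from the question and records what each call sees.

```java
import java.util.List;

class LocalsPerCall {
    /** Splits off the first half recursively and logs the array lengths per call. */
    static void split(int[] list, int depth, List<String> log) {
        log.add("depth " + depth + ": length " + list.length);
        if (list.length > 1) {
            // A brand-new local variable and a brand-new array in every call
            int[] firstHalf = new int[list.length / 2];
            System.arraycopy(list, 0, firstHalf, 0, firstHalf.length);
            split(firstHalf, depth + 1, log);
            // Back in this call: its own firstHalf was not touched by the deeper calls
            log.add("depth " + depth + ": firstHalf still has length " + firstHalf.length);
        }
    }
}
```

Calling split(new int[8], 0, log) records the lengths 8, 4, 2 and 1 on the way down. On the way back up, each call still sees its own firstHalf with its original length (1, then 2, then 4).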

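Here is an alternative implementation of merge sort. This one is a top-down (recursive) merge sort that reuses a single auxiliary array instead of allocating new arrays in every call: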
public class MergeSort {
public static void merge(int[]a,int[] aux, int f, int m, int l) {
for (int k = f; k <= l; k++) {
aux[k] = a[k];
}
int i = f, j = m+1;
for (int k = f; k <= l; k++) {
if(i>m) a[k]=aux[j++];
else if (j>l) a[k]=aux[i++];
else if(aux[j] < aux[i]) a[k]=aux[j++]; // take the smaller element first (ascending order)
else a[k]=aux[i++];
}
}
public static void sort(int[]a,int[] aux, int f, int l) {
if (l<=f) return;
int m = f + (l-f)/2;
sort(a, aux, f, m);
sort(a, aux, m+1, l);
merge(a, aux, f, m, l);
}
public static int[] sort(int[]a) {
int[] aux = new int[a.length];
sort(a, aux, 0, a.length-1);
return a;
}
}
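A bottom-up merge sort is usually written without recursion: it merges runs of width 1, then 2, then 4, and so on, until the whole array is one sorted run. The following is a minimal sketch for comparison with the recursive code above; the class name is made up, and it sorts in ascending order.

```java
class BottomUpMergeSort {
    /** Sorts a in ascending order by merging runs of doubling width. */
    static void sort(int[] a) {
        int n = a.length;
        int[] aux = new int[n];
        for (int width = 1; width < n; width *= 2) {
            // Merge neighbouring runs a[lo..mid] and a[mid+1..hi]
            for (int lo = 0; lo < n - width; lo += 2 * width) {
                int mid = lo + width - 1;
                int hi = Math.min(lo + 2 * width - 1, n - 1);
                merge(a, aux, lo, mid, hi);
            }
        }
    }

    private static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        System.arraycopy(a, lo, aux, lo, hi - lo + 1);
        int i = lo;
        int j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid) a[k] = aux[j++];              // left run exhausted
            else if (j > hi) a[k] = aux[i++];          // right run exhausted
            else if (aux[j] < aux[i]) a[k] = aux[j++]; // right element is smaller
            else a[k] = aux[i++];                      // left element is smaller or equal
        }
    }
}
```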

To understand how merge sort works you must understand two core data structures, arrays and stacks. Stacks are LIFO (last in, first out). Method calls are tracked on a call stack, so the most recently called method is the first to finish and return. These factors give merge sort its characteristic behavior.
For example let's take an array as an input:
int[] input = new int[] {12, 11, 13, 5, 6, 7};
Now let's implement a Merge Sort on this array:
'''
class MergeSort
{
private static void merge_sort(int[] arr)
{
if (arr.length > 1)
{
int midpoint = arr.length / 2;
int[] l_arr = new int[midpoint];
int[] r_arr = new int[arr.length - midpoint];
int L_index = 0;
int R_index = 0;
// SORTING [ BEGIN ]
// [ BEGIN ]
// WHILE LOOP THAT IS FILLING THE LEFT ARRAY
//
while(L_index < l_arr.length )
{
l_arr[L_index] = arr[L_index];
if (L_index + 1 < l_arr.length)
{
l_arr[L_index + 1] = arr[L_index + 1];
L_index++;
}
L_index++;
}
// [ END ]
L_index = midpoint;
// [ BEGIN ]
// A WHILE LOOP THAT IS FILLING THE RIGHT ARRAY
//
while(R_index < r_arr.length)
{
r_arr[R_index] = arr[L_index];
if (R_index + 1 < r_arr.length)
{
r_arr[R_index + 1] = arr[L_index + 1];
L_index++;
R_index++;
}
L_index++;
R_index++;
}
// [ END ]
merge_sort(l_arr);
merge_sort(r_arr);
// SORTING [ END ]
// MERGING [ BEGIN ]
int l_index = 0;
int r_index = 0;
int index = 0;
while (l_index < l_arr.length && r_index < r_arr.length )
{
if (l_arr[l_index] <= r_arr[r_index])
{
arr[index] = l_arr[l_index];
l_index++;
}
else
{
arr[index] = r_arr[r_index];
r_index++;
}
index++;
}
while (l_index < l_arr.length)
{
arr[index] = l_arr[l_index];
l_index++;
index++;
}
while (r_index < r_arr.length)
{
arr[index] = r_arr[r_index];
r_index++;
index++;
}
// MERGING [ END ]
}
}
public static void main(String[] args)
{
int[] arr = new int[] {12, 11, 13, 5, 6, 7};
// BEGIN THE MERGE SORT
merge_sort(arr);
}
}
'''
When merge sort is called, the array is split into two arrays, the left array and the right array. When the split happens, the left and right arrays are filled, and then recursion occurs.
The split always continues on the left half until no further split is possible; then it moves on to the right half.
When an array reaches a size of one, the recursion stops and control returns to the previous method call. When no more recursion can be performed, execution continues below the recursive calls, and the merge section of the algorithm arranges the two halves in increasing order (or decreasing, depending on the comparison) and passes control back to its own caller.
Now the magic happens. When an array is passed as a parameter to a method and sorted there, the modifications affect the array in the caller, because Java passes a copy of the reference to the array, not a copy of the array itself. So each time recursion occurs and the left or right half is passed, the called method receives a reference to that same left or right array, and the changes it makes are visible in the caller.
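The following sketch, with made-up names, shows both sides of this behavior: writing into the array that a parameter refers to is visible to the caller, while assigning a new array to the parameter is not.

```java
class ArrayArgumentDemo {
    /** Writes through the reference: the caller's array is modified. */
    static void setFirst(int[] arr) {
        arr[0] = 99;
    }

    /** Rebinds only the local parameter: the caller's array is untouched. */
    static void reassign(int[] arr) {
        arr = new int[] {7, 7, 7};
        arr[0] = -1; // changes the new array, not the caller's
    }
}
```

After int[] data = {1, 2, 3}; setFirst(data); reassign(data); the array data holds 99, 2, 3. This is why merge sort can sort the halves in the called method and then merge them in the caller.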

Find numbers present in at least two of the three arrays

How might I approach solving the following problem:
Create an array of integers that are contained in at least two of the given arrays.
For example:
int[] a1 = new int[] { 1, 2, 3, 4, 5 };
int[] a2 = new int[] { 5, 10, 11, 8 };
int[] a3 = new int[] { 1, 7, 6, 4, 5, 3, 11 };
must give a result array
int[] result = new int[] {1, 3, 4, 5, 11}
P.S. I'm interested in suggestions on how I might approach this (the "algorithm"), not in which Java utilities might give me the answer.

Put the a1 numbers into a Map<Integer,Integer> count, using the number as the key and setting its count to 1.
Put the a2 numbers into the same map. If an item does not exist, assign it a count of 1; otherwise assign it the existing count + 1.
Put the a3 numbers into the same map in the same way.
Go through the entries in the map, and output all keys whose value is greater than one.
This algorithm runs in amortized linear time in the combined number of elements of the three arrays. It assumes that no number occurs more than once within a single array; otherwise, remove the duplicates from each array first.
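The map-based steps above could be sketched as follows. The class and method names are hypothetical; the per-array seen set is an added safeguard so that a number repeated inside one array is counted only once, and the TreeSet is used only to return the result in ascending order.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class AtLeastTwo {
    /** Returns the numbers that occur in at least two of the given arrays. */
    static Set<Integer> inAtLeastTwo(int[]... arrays) {
        Map<Integer, Integer> count = new HashMap<>();
        for (int[] array : arrays) {
            Set<Integer> seen = new HashSet<>();
            for (int v : array) {
                // seen.add returns false for a repeat within the same array
                if (seen.add(v)) {
                    count.merge(v, 1, Integer::sum);
                }
            }
        }
        Set<Integer> result = new TreeSet<>();
        for (Map.Entry<Integer, Integer> entry : count.entrySet()) {
            if (entry.getValue() > 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```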
If the numbers in the three arrays are limited to, say, 1000 or another relatively small number, you could avoid using collections at all, but use a potentially more expensive algorithm based on the upper limit of your numbers: replace the map with an array counts[MAX_NUM+1], and then run the same algorithm, like this:
int[] counts = new int[MAX_NUM+1];
for (int a : a1) counts[a]++;
for (int a : a2) counts[a]++;
for (int a : a3) counts[a]++;
for (int i = 0 ; i != MAX_NUM+1 ; i++) {
if (counts[i] > 1) {
System.out.println(i);
}
}

You can look at the 3 arrays as sets and find each element that is in the intersection of some pair of sets.
Basically, you are looking for (set1 [intersection] set2) [union] (set2 [intersection] set3) [union] (set1 [intersection] set3)
I agree that it might not be the easiest way to achieve what you are after, but being able to reduce one problem to another is a technique every programmer should master, and this solution should be very instructive.

The only way to do this without collections would be to take an element from an array, iterate over the remaining two arrays to see if a duplicate is found (and then break and move to the next element). You need to do this for two out of the three arrays as by the time you move to the third one, you would already have your answer.

Mathematically this can be solved as follows:
You can construct three sets using each of the three arrays, so duplicated entries in each array will only occur once in each set. And then the entries that appear at least in two of the three sets are solutions. So they are given by
(S_1 intersect S_2) union (S_2 intersect S_3) union (S_3 intersect S_1)
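The formula above translates almost literally into Java's set operations, with retainAll as intersection and addAll as union. This is only a sketch with made-up names; the question asked for an algorithm rather than library calls, so treat it as a way to check the formula.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class PairwiseIntersections {
    private static Set<Integer> toSet(int[] values) {
        Set<Integer> set = new HashSet<>();
        for (int v : values) {
            set.add(v);
        }
        return set;
    }

    /** Returns a new set holding x intersect y, leaving both inputs unchanged. */
    private static Set<Integer> intersect(Set<Integer> x, Set<Integer> y) {
        Set<Integer> result = new HashSet<>(x);
        result.retainAll(y);
        return result;
    }

    /** (S1 intersect S2) union (S2 intersect S3) union (S3 intersect S1). */
    static Set<Integer> inAtLeastTwo(int[] a1, int[] a2, int[] a3) {
        Set<Integer> s1 = toSet(a1);
        Set<Integer> s2 = toSet(a2);
        Set<Integer> s3 = toSet(a3);
        Set<Integer> result = new TreeSet<>(intersect(s1, s2));
        result.addAll(intersect(s2, s3));
        result.addAll(intersect(s3, s1));
        return result;
    }
}
```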

Think about the question and the different strategies you might use:
Go through each entry in each array, if that entry is NOT already in the "duplicates" result, then see if that entry is in each of the remaining arrays. Add to duplicates if it is and return to next integer
Create an array of non-duplicates by adding an entry from each array (and if it is already there, putting it in the duplicates array).
Use another creative strategy of your own

I like drawing Venn diagrams. You know the diagram with three intersecting circles.
You then see that the complement is easier to describe:
Those elements which only exist in one array are not interesting.
So you could build a frequency list (i.e. key = element, value = the number of arrays in which you found it [for the first time]) in a hash map, and then in a final pass pick all elements which occurred more than once.
For simplicity I used sets. If your arrays contain multiple entries of the same value, you have to ignore those extra occurrences when you build the frequency list.

An approach could be like this:
1. Sort all the arrays.
2. For each pair of arrays, do the following.
Consider the first two arrays A and B, and let a be the size of A.
Also take a third array or vector to store the result.
for i = 0 to a-1 {
Search for A[i] in B using binary search.
if A[i] exists in B then insert A[i] into the result vector
}
Repeat the same process for (B,C) and (C,A).
Now sort the result vector and traverse it from the end, removing every element for which
result[i] == result[i-1]
The final vector is the required result.
Time complexity analysis:
Sorting takes O(n log n), where n is the largest array size among the given three.
Searching for each element of one array in another sorted array takes n * O(log n).
Sorting the result takes O(n log n), and traversing it takes O(n).
So the overall time complexity is O(n log n), and the space complexity is O(n).
Please correct me if I am wrong.
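The steps above could be sketched in Java like this. The names are hypothetical, the standard Arrays.sort and Arrays.binarySearch stand in for hand-written sorting and searching, and the duplicates are removed in a forward pass instead of a traversal from the end.

```java
import java.util.Arrays;

class SortedPairs {
    /** Returns, in ascending order, the numbers found in at least two of the arrays. */
    static int[] inAtLeastTwo(int[] a, int[] b, int[] c) {
        // Step 1: sort copies, so the inputs stay untouched
        int[] sa = a.clone();
        int[] sb = b.clone();
        int[] sc = c.clone();
        Arrays.sort(sa);
        Arrays.sort(sb);
        Arrays.sort(sc);

        // Step 2: collect the matches of every pair (A,B), (B,C) and (C,A)
        int[] buffer = new int[sa.length + sb.length + sc.length];
        int n = 0;
        n = collect(sa, sb, buffer, n);
        n = collect(sb, sc, buffer, n);
        n = collect(sc, sa, buffer, n);

        // Step 3: sort the matches and drop neighbours that are equal
        Arrays.sort(buffer, 0, n);
        int unique = 0;
        for (int i = 0; i < n; i++) {
            if (i == 0 || buffer[i] != buffer[i - 1]) {
                buffer[unique++] = buffer[i];
            }
        }
        return Arrays.copyOf(buffer, unique);
    }

    /** Appends every element of from that is found in sortedOther; returns the new size. */
    private static int collect(int[] from, int[] sortedOther, int[] buffer, int n) {
        for (int v : from) {
            if (Arrays.binarySearch(sortedOther, v) >= 0) {
                buffer[n++] = v;
            }
        }
        return n;
    }
}
```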

In Java:
I will write one without java.util shortly.
In the meantime, here is a solution using java.util:
public static void twice(int[] a, int[] b, int[] c) {
//Used Set to remove duplicates
Set<Integer> setA = new HashSet<Integer>();
for (int i = 0; i < a.length; i++) {
setA.add(a[i]);
}
Set<Integer> setB = new HashSet<Integer>();
for (int i = 0; i < b.length; i++) {
setB.add(b[i]);
}
Set<Integer> setC = new HashSet<Integer>();
for (int i = 0; i < c.length; i++) {
setC.add(c[i]);
}
//Logic to fill data into a Map
Map<Integer, Integer> map = new HashMap<Integer, Integer>();
for (Integer val : setA) {
map.put(val, 1);
}
for (Integer val : setB) {
if (map.get(val) != null) {
int count = map.get(val);
count++;
map.put(val, count);
} else {
map.put(val, 1);
}
}
for (Integer val : setC) {
if (map.get(val) != null) {
int count = map.get(val);
count++;
map.put(val, count);
} else {
map.put(val, 1);
}
}
for (Map.Entry<Integer, Integer> entry2 : map.entrySet()) {
//if (entry2.getValue() == 2) { //Return the elements that are present in two out of three arrays.
if(entry2.getValue() >= 2) { //Return elements that are present **at least** twice in the three arrays.
System.out.print(" " + entry2.getKey());
}
}
}
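To return only the elements present in exactly two of the three arrays, switch the condition in the last for loop to the commented-out == 2 check. For example, with:
int[] a = { 2, 3, 8, 4, 1, 9, 8 };
int[] b = { 6, 5, 3, 7, 9, 2, 1 };
int[] c = { 5, 1, 8, 2, 4, 0, 5 };
the == 2 check yields { 3, 8, 4, 5, 9 } (in whatever order the HashMap iterates); the >= 2 check additionally yields 1 and 2, which occur in all three arrays.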

Here goes without any java.util library:
public static void twice(int[] a, int[] b, int[] c) {
int[] a1 = removeDuplicates(a);
int[] b1 = removeDuplicates(b);
int[] c1 = removeDuplicates(c);
int totalLen = a1.length + b1.length +c1.length;
int[][] keyValue = new int[totalLen][2];
int index = 0;
for(int i=0; i<a1.length; i++, index++)
{
keyValue[index][0] = a1[i]; //Key
keyValue[index][1] = 1; //Value
}
for(int i=0; i<b1.length; i++)
{
boolean found = false;
int tempIndex = -1;
for(int j=0; j<index; j++)
{
if (keyValue[j][0] == b1[i]) {
found = true;
tempIndex = j;
break;
}
}
if(found){
keyValue[tempIndex][1]++;
} else {
keyValue[index][0] = b1[i]; //Key
keyValue[index][1] = 1; //Value
index++;
}
}
for(int i=0; i<c1.length; i++)
{
boolean found = false;
int tempIndex = -1;
for(int j=0; j<index; j++)
{
if (keyValue[j][0] == c1[i]) {
found = true;
tempIndex = j;
break;
}
}
if(found){
keyValue[tempIndex][1]++;
} else {
keyValue[index][0] = c1[i]; //Key
keyValue[index][1] = 1; //Value
index++;
}
}
for(int i=0; i<index; i++)
{
//if(keyValue[i][1] == 2)
if(keyValue[i][1] >= 2)
{
System.out.print(keyValue[i][0]+" ");
}
}
}
public static int[] removeDuplicates(int[] input) {
boolean[] dupInfo = new boolean[500]; // Assumes every value is in the range 0..499; anything else throws ArrayIndexOutOfBoundsException.
int totalItems = 0;
for( int i = 0; i < input.length; ++i ) {
if( dupInfo[input[i]] == false ) {
dupInfo[input[i]] = true;
totalItems++;
}
}
int[] output = new int[totalItems];
int j = 0;
for( int i = 0; i < dupInfo.length; ++i ) {
if( dupInfo[i] == true ) {
output[j++] = i;
}
}
return output;
}

It's very simple and could be done for n different arrays the same way:
public static void compute(int[] a1, int[] a2, int[] a3) {
HashMap<Integer, Integer> map = new HashMap<>();
fillMap(map, a1);
fillMap(map, a2);
fillMap(map, a3);
for (Integer key : map.keySet()) {
System.out.print(map.get(key) > 1 ? key + ", " : "");
}
}
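public static void fillMap(HashMap<Integer, Integer> map, int[] a) {
// Count each value at most once per array, so that duplicates inside a
// single array are not mistaken for presence in two arrays.
java.util.HashSet<Integer> seen = new java.util.HashSet<>();
for (int i : a) {
if (seen.add(i)) {
map.merge(i, 1, Integer::sum);
}
}
}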
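To make the "n different arrays" remark concrete, here is a minimal sketch using varargs. The class name AtLeastTwoOfMany, the method inAtLeast and its minArrays parameter are invented for this illustration; the sketch counts each value at most once per array and returns the result instead of printing it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class AtLeastTwoOfMany {

    // Returns, in ascending order, the values that occur in at least 'minArrays' of the given arrays.
    static Set<Integer> inAtLeast(int minArrays, int[]... arrays) {
        // Maps each value to the number of arrays that contain it.
        Map<Integer, Integer> arraysContaining = new HashMap<>();
        for (int[] array : arrays) {
            Set<Integer> seen = new HashSet<>(); // values already counted for this array
            for (int value : array) {
                if (seen.add(value)) {
                    arraysContaining.merge(value, 1, Integer::sum);
                }
            }
        }
        Set<Integer> result = new TreeSet<>();
        for (Map.Entry<Integer, Integer> entry : arraysContaining.entrySet()) {
            if (entry.getValue() >= minArrays) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = { 2, 3, 8, 4, 1, 9, 8 };
        int[] b = { 6, 5, 3, 7, 9, 2, 1 };
        int[] c = { 5, 1, 8, 2, 4, 0, 5 };
        System.out.println(inAtLeast(2, a, b, c)); // [1, 2, 3, 4, 5, 8, 9]
    }
}
```

A TreeSet is used only so that the output order is predictable; a HashSet would do if order does not matter.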

fun atLeastTwo(a: ArrayList<Int>, b: ArrayList<Int>, c: ArrayList<Int>): List<Int>{
val map = a.associateWith { 1 }.toMutableMap()
b.toSet().forEach { map[it] = map.getOrDefault(it, 0) + 1 }
c.toSet().forEach{ map[it] = map.getOrDefault(it, 0) + 1 }
return map.filter { it.value >= 2 }.map { it.key }
}

In JavaScript you can do it like this:
function atLeastTwo(A, B, C) {
// Build one set per array so duplicates inside an array count only once.
const sa = new Set(A),
sb = new Set(B),
sc = new Set(C);
const res = new Set();
sa.forEach((a) => {
if (sb.has(a) || sc.has(a)) res.add(a);
});
sb.forEach((b) => {
if (sa.has(b) || sc.has(b)) res.add(b);
});
sc.forEach((c) => {
if (sa.has(c) || sb.has(c)) res.add(c);
});
// Return the matches in ascending numeric order.
return Array.from(res).sort((i, j) => i - j);
}

Get the list of index in subsequence matching

I have two sequences, for instance s=aaba and ss=aa, and I want all the ways ss occurs in s as a subsequence.
In this example:
[0,1], [0,3] and [1,3]
My code is below. It works fine, except for a very long s with many occurrences of ss. In that case I get
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
(I already run Java with -Xmx set to the maximum I can…)
public static ArrayList<ArrayList<Integer>> getListIndex(String[] s, String[] ss, int is, int iss) {
ArrayList<ArrayList<Integer>> listOfListIndex = new ArrayList<ArrayList<Integer>>();
ArrayList<ArrayList<Integer>> listRec = new ArrayList<ArrayList<Integer>>();
ArrayList<Integer> listI = new ArrayList<Integer>();
if (iss<0||is<iss){
return listOfListIndex;
}
if (ss[iss].compareTo(s[is])==0){
//ss[iss] matches, search ss[0..iss-1] in s[0..is-1]
listRec = getListIndex(s,ss,is-1,iss-1);
//ss[0] matched: start a new list (for iss>0 an empty listRec means the prefix has no match, so nothing is added)
if(iss==0){
listI = new ArrayList<Integer>();
listI.add(is);
listOfListIndex.add(listI);
}
else{
//adding to what we have already found
for (int i=0; i<listRec.size();i++){
listI = listRec.get(i);
listI.add(is);
listOfListIndex.add(listI);
}
}
}
//In all cases
//searching ss[0..iss] in s[0..is-1]
listRec = getListIndex(s,ss,is-1,iss);
for (int i=0; i<listRec.size();i++){
listI = listRec.get(i);
listOfListIndex.add(listI);
}
return listOfListIndex;
}
Is there any way to do this more efficiently?

I doubt the recursion is the problem (think about what the maximum recursion depth is). The algorithm can be implemented efficiently by collecting, for each character of the pattern, the indices at which it occurs in the text into a TreeSet, and then simply taking the .tailSet when you need to "advance" in the string. Note that the code below names the pattern s and the text ss, the reverse of the question.
import java.util.*;
public class Test {
public static Set<List<Integer>> solutions(List<TreeSet<Integer>> is, int n) {
TreeSet<Integer> ts = is.get(0);
Set<List<Integer>> sol = new HashSet<List<Integer>>();
for (int i : ts.tailSet(n+1)) {
if (is.size() == 1) {
List<Integer> l = new ArrayList<Integer>();
l.add(i);
sol.add(l);
} else
for (List<Integer> tail : solutions(is.subList(1, is.size()), i)) {
List<Integer> l = new ArrayList<Integer>();
l.add(i);
l.addAll(tail);
sol.add(l);
}
}
return sol;
}
public static void main(String[] args) {
String ss = "aaba";
String s = "aa";
List<TreeSet<Integer>> is = new ArrayList<TreeSet<Integer>>();
// Compute all indices of each character.
for (int i = 0; i < s.length(); i++) {
TreeSet<Integer> indices = new TreeSet<Integer>();
char c = s.charAt(i);
for (int j = 0; j < ss.length(); j++) {
if (ss.charAt(j) == c)
indices.add(j);
}
is.add(indices);
}
System.out.println(solutions(is, -1));
}
}
Output:
[[0, 1], [1, 3], [0, 3]]

ArrayList<Integer> is quite memory-inefficient due to the overhead of the wrapper class. Using TIntArrayList from GNU Trove will probably cut down your memory usage by a factor of 3 (or even more if you're running on a 64-bit JVM).

Well, the basic problem is that your algorithm is recursive. Java doesn't do tail-call optimization, so every recursive call just adds to the stack until you overflow.
What you want to do is restructure your algorithm to be iterative so you aren't adding to the stack. Think about putting a loop (with a termination test) as the outermost element of your method instead.
Another way to look at this problem is to break it into two steps:
1. Capture the positions of the given character ('a' in your example) in a single set.
2. All you want then is the complete set of combinations of those positions. Remember that the number of combinations of r things chosen from n different things is:
C(n,r) = n!/[r!(n-r)!]
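As a worked example of that formula, here is a small sketch that computes C(n,r) without building the factorials. The class name CombinationCount and the method choose are made up for this illustration. It only covers the special case described above, where the pattern is a single repeated character; it also hints at why materialising every index list gets expensive, since the count grows very quickly with n.

```java
import java.math.BigInteger;

final class CombinationCount {

    // C(n, r) = n! / (r! (n - r)!), computed incrementally so no factorial is materialised.
    static BigInteger choose(int n, int r) {
        if (r < 0 || r > n) {
            return BigInteger.ZERO;
        }
        r = Math.min(r, n - r); // C(n, r) == C(n, n - r); use the smaller loop
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= r; i++) {
            // After this step result == C(n - r + i, i), so the division is always exact.
            result = result.multiply(BigInteger.valueOf(n - r + i)).divide(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // 'a' occurs at positions 0, 1 and 3 of "aaba", so n = 3; the pattern "aa" picks r = 2 of them.
        System.out.println(choose(3, 2)); // 3, matching [0,1], [0,3] and [1,3]
        // With 60 candidate positions and a 30-character pattern the number of index lists is huge.
        System.out.println(choose(60, 30));
    }
}
```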

missing elements from two arrays in java

How can we find the missing elements between two arrays?
Example:
int []array1 ={1,2,3,4,5};
int []array2 ={3,1,2};
Given the two arrays above, I want to find which elements of the first array are missing from the second array.

Convert them to Sets and use removeAll.
The first problem is how to convert a primitive int[] to a collection.
With Guava you can use:
List<Integer> list1 = Ints.asList(array1);
List<Integer> list2 = Ints.asList(array2);
Apache commons (which I'm not familiar with) apparently has something similar.
Now convert to a set:
Set<Integer> set1 = new HashSet<Integer>(list1);
And compute the difference:
set1.removeAll(list2);
And convert the result back to an array:
return Ints.toArray(set1);
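If you would rather not pull in Guava, the same set-difference idea can be sketched with the JDK alone (Java 8 streams). The class name MissingElements and the method missingFrom are invented for this example; a LinkedHashSet is used only so the result keeps the order of array1.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

final class MissingElements {

    // Returns the distinct values of 'array1' that do not occur in 'array2', in first-seen order.
    static int[] missingFrom(int[] array1, int[] array2) {
        // Box the primitives and collect them into sets.
        Set<Integer> remaining = Arrays.stream(array1).boxed()
                .collect(Collectors.toCollection(LinkedHashSet::new));
        Set<Integer> present = Arrays.stream(array2).boxed()
                .collect(Collectors.toSet());
        // Set difference: drop everything that array2 contains.
        remaining.removeAll(present);
        // Unbox back to int[].
        return remaining.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] array1 = { 1, 2, 3, 4, 5 };
        int[] array2 = { 3, 1, 2 };
        System.out.println(Arrays.toString(missingFrom(array1, array2))); // [4, 5]
    }
}
```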

If duplicates are allowed in the arrays, an efficient (O(n)) solution is to create a frequency table (Map) by iterating over the first array, and then use the map to match off any elements in the second array.
Map<Integer, Integer> freqMap = new HashMap<Integer, Integer>();
// Iterate over array1 and populate the frequency map, where
// the key is the integer and the value is the number of
// occurrences.
for (int val1 : array1) {
Integer freq = freqMap.get(val1);
if (freq == null) {
freqMap.put(val1, 1);
} else {
freqMap.put(val1, freq + 1);
}
}
// Now read the second array, reducing the frequency for any value
// encountered that is also in array1.
for (int val2 : array2) {
Integer freq = freqMap.get(val2);
if (freq == null) {
// val2 does not occur in array1 (or is already used up): nothing to match off.
continue;
}
if (freq <= 1) {
// Last remaining occurrence matched: drop the entry.
freqMap.remove(val2);
} else {
freqMap.put(val2, freq - 1);
}
}
// Finally, iterate over map and build results.
List<Integer> result = new LinkedList<Integer>();
for (Map.Entry<Integer, Integer> entry : freqMap.entrySet()) {
int remaining = entry.getValue();
for (int i=0; i<remaining; ++i) {
result.add(entry.getKey());
}
}
// TODO: Convert to int[] using the util. method of your choosing.
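For the conversion left as a TODO, one JDK-only option (Java 8 or later) is a stream that unboxes each element. The class and method names below are made up for the sketch; Guava's Ints.toArray, used in the earlier answer, would do the same job.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

final class ToIntArray {

    // Unboxes a list of Integer into a primitive int[] (the list must not contain null).
    static int[] toIntArray(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        List<Integer> result = new LinkedList<>(Arrays.asList(4, 5, 5));
        System.out.println(Arrays.toString(toIntArray(result))); // [4, 5, 5]
    }
}
```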

Simple logic for getting the unmatched numbers.
public static int getelements(int[] array1, int[] array2)
{
ArrayList<Integer> unMatched = new ArrayList<>(); // values of array1 not found in array2
int flag = 0; // set to 1 when the current value is found in array2
for(int i=0; i<array1.length ; i++)
{ flag=0;
for(int j=0; j<array2.length ; j++)
{
if(array1[i] == array2[j]) {
flag =1;
break;
}
}
if(flag==0)
{
unMatched.add(array1[i]);
}
}
System.out.println(unMatched);
return unMatched.size();
}
public static void main(String[] args) {
int array1[] = {7,3,7,2,8,3,2,5};
int array2[] = {7,4,9,5,5,10,4};
int count;
count = getelements(array1,array2);
System.out.println(count);
}

You can use Set and its methods. This operation would be a set difference.

The naive way would be to simply search one array for each of the elements of the other array (with a for loop). If you first were to SORT both arrays, it becomes much more efficient.
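A minimal sketch of the sorted variant, assuming it is fine to work on sorted copies of both arrays: after sorting, a single forward pass with one index into the second array replaces the inner loop. The class name SortedDifference and the method missingFrom are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SortedDifference {

    // Returns, in ascending order, the values of 'first' that do not occur in 'second'.
    // Duplicates in 'first' are reported once per occurrence.
    static List<Integer> missingFrom(int[] first, int[] second) {
        int[] a = first.clone();
        int[] b = second.clone();
        Arrays.sort(a);
        Arrays.sort(b);

        List<Integer> missing = new ArrayList<>();
        int j = 0; // position in b; it only ever moves forward
        for (int value : a) {
            // Skip everything in b that is smaller than the current value.
            while (j < b.length && b[j] < value) {
                j++;
            }
            if (j == b.length || b[j] != value) {
                missing.add(value);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        int[] array1 = { 1, 2, 3, 4, 5 };
        int[] array2 = { 3, 1, 2 };
        System.out.println(missingFrom(array1, array2)); // [4, 5]
    }
}
```

Sorting costs O(n log n) and the pass itself is linear, compared with the O(n * m) nested loops of the naive search.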

Consider using an intersection method.
A healthy discussion is available at:
http://www.coderanch.com/t/35439/Programming-Diversions/Intersection-two-arrays

You could create two other int arrays to store the multiplicity of each value. Every time a value is found, increment the entry at the index that corresponds to that value, and then compare the two arrays.
It's not the most "efficient" way perhaps, but it's a very simple concept that works.
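A rough sketch of that idea, under the loudly stated assumption that every value is a non-negative int no larger than a known maxValue (otherwise the counting arrays cannot be indexed by value). The names MultiplicityCompare, countOccurrences and surplus are hypothetical.

```java
import java.util.Arrays;

final class MultiplicityCompare {

    // Assumption: every value lies in the range 0..maxValue (inclusive).
    static int[] countOccurrences(int[] values, int maxValue) {
        int[] counts = new int[maxValue + 1];
        for (int v : values) {
            counts[v]++; // the value itself is the index
        }
        return counts;
    }

    // Returns, for each value v, how many more times v occurs in array1 than in array2 (never negative).
    static int[] surplus(int[] array1, int[] array2, int maxValue) {
        int[] c1 = countOccurrences(array1, maxValue);
        int[] c2 = countOccurrences(array2, maxValue);
        int[] extra = new int[maxValue + 1];
        for (int v = 0; v <= maxValue; v++) {
            extra[v] = Math.max(0, c1[v] - c2[v]);
        }
        return extra;
    }

    public static void main(String[] args) {
        int[] array1 = { 1, 2, 3, 4, 5 };
        int[] array2 = { 3, 1, 2 };
        // Index 4 and index 5 hold 1: those values are missing from array2.
        System.out.println(Arrays.toString(surplus(array1, array2, 5))); // [0, 0, 0, 0, 1, 1]
    }
}
```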

The Guava library can help: convert the arrays to Sets, and then use its set API (for example Sets.difference).

@finnw I believe you were thinking of commons-collections.
You need to import org.apache.commons.collections.CollectionUtils to get the disjunction method.
disjunction returns all the elements that are in exactly one of the two collections, i.e. everything outside their intersection.
Integer[] array1 ={1,2,3,4,5};
Integer[] array2 ={3,1,2};
List list1 = Arrays.asList(array1);
List list2 = Arrays.asList(array2);
Collection result = CollectionUtils.disjunction(list1, list2);
System.out.println(result); // displays [4, 5]

This is not the most efficient way, but it's probably the simplest way that works in Java:
public static void main(final String[] args) {
final int[] a = { 1, 2, 3, 4, 5 };
final int[] b = { 3, 1, 2 };
// check both directions, in case each array has values that the other one lacks
// example: a = { 1, 2, 3, 4, 5 }; b = { 2, 3, 1, 0, 5 }; missing values: 4 and 0
findMissingValue(b, a);
findMissingValue(a, b);
}
private static void findMissingValue(final int[] x, final int[] y) {
// loop through every value of x
for (final int n : x) {
// scan y with a second loop to see whether the value is in there
if (!findValueSmallerArray(n, y)) {
System.out.println("missing value: " + n);
// break;
}
}
}
private static boolean findValueSmallerArray(final int n, final int[] y) {
for (final int i : y) {
if (n == i) {
return true;
}
}
return false;
}
