find if there exists a common element in 2 arrays - java

Given two arrays of integers, how can you efficiently find out if the two arrays have an element in common?
Can somebody come up with a better space complexity than this? (I would also appreciate having any errors in the program pointed out, thanks!)
Is it possible to solve this using XOR?
public boolean findCommon(int[] arr1, int[] arr2) {
    // Note: Set<int> is not valid Java; generics require the wrapper type
    // Integer, and the class is HashSet, not Hashset.
    Set<Integer> s = new HashSet<Integer>();
    for (int i = 0; i < arr1.length; i++) {
        // The contains() check is redundant: add() already ignores duplicates.
        s.add(arr1[i]);
    }
    for (int i = 0; i < arr2.length; i++) {
        if (s.contains(arr2[i]))
            return true;
    }
    return false;
}

Since you are asking for a more space efficient solution:
When you accept a runtime of O(n log n) and are allowed to change the arrays, you could sort them and then do a linear pass to find the common elements.
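A minimal sketch of that approach (the method name is mine; sorting is done in place, then a single merge-style scan finds any common element):

import java.util.Arrays;

public boolean findCommonSorted(int[] arr1, int[] arr2) {
    Arrays.sort(arr1);  // O(n log n)
    Arrays.sort(arr2);  // O(m log m)
    int i = 0, j = 0;
    while (i < arr1.length && j < arr2.length) {
        if (arr1[i] == arr2[j]) return true; // common element found
        if (arr1[i] < arr2[j]) i++;          // advance whichever side is smaller
        else j++;
    }
    return false; // the scan itself is O(n + m)
}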

If you only need to do it ONCE, then you can't do better than O(n+m) time, where n and m are the respective lengths of the arrays. You have to look at every element of both arrays (how else would you see all the input?), so just reading the input already has that complexity; there is no point in trying to be more efficient than that. If you need to keep searching as the arrays grow, that's a different discussion.
So the question with your suggested implementation comes down to: how long does contains take? Since you're using a HashSet, contains is constant time, O(1), so you get O(n) to create the HashSet and O(m) to check the elements of the second array.
Put together, O(n+m). Good enough ;)
If you're looking for improved space complexity, you would first of all need to be allowed to modify the original arrays. But I don't think there's any way to use less than O(n) additional space and still run in O(n+m) time.
Note: I'm not sure which XOR you are thinking of. If you mean bitwise or logical XOR, there's no use for it here. If you mean set XOR (symmetric difference), whether it's logically useful or not, it's not part of Java's Set implementations, so you would still have to roll your own.

Given that your solution only attempts to find whether there is an element that exists in both arrays, the following code will do it for you:
public boolean findCommon(int[] arr1, int[] arr2) {
    // Java's class is Hashtable (lower-case t) and it needs type parameters;
    // the value stored is irrelevant here, only the keys are used.
    Hashtable<Integer, String> hash = new Hashtable<>();
    for (int item : arr1) {
        if (!hash.containsKey(item)) {
            hash.put(item, "foo");
        }
    }
    for (int item : arr2) {
        if (hash.containsKey(item)) {
            return true;
        }
    }
    return false;
}
This still has a worst-case complexity of O(n) for two arrays that do not share a single element. If, as your initial question suggests, what you're worried about is space complexity (e.g. you'd be happy to accept a performance hit if you didn't have to store the hash table), you could go for something along these lines:
public boolean findCommon(int[] arr1, int[] arr2) {
    for (int item : arr1) {
        for (int item2 : arr2) {
            if (item == item2) {
                return true;
            }
        }
    }
    return false;
}
That would solve your space complexity issue, but it has the (objectively terrible) time complexity of O(n^2).
This could be improved if you can put more constraints around the problem (say you happen to know that at least one of the arrays is sorted, or better yet, both are).
But in the general case you asked about, I believe it really does come down to O(n) extra space with a hash table, or O(n^2) time with less space.

You can improve the space usage (this is the question, right?) with an algorithm of order O(n*m): just take every pair of elements and compare them. This is awful in time but does not use any additional memory. Otherwise you could sort the two arrays in place (if you are allowed to modify them) and then find the common elements with a single O(max(n, m)) pass.

Related

Is my Java solution O(n) or am I missing something?

My solution for a certain problem is apparently slower than 95% of solutions and I wanted to make sure that I was correct on the time complexity.
It looks to me like this code is O(n). I use a couple of loops that are at most O(n) each, and they aren't nested, so I don't believe the solution is O(n^2).
I use a HashMap as storage, and the HashMap methods I call inside my while loop and for loop, for insertion and lookups respectively, are O(1).
Am I correct in that this solution is O(n) or am I missing something?
public int pairSum(ListNode head) {
    // First pass: copy the list values into a map keyed by position.
    HashMap<Integer, Integer> nodeVals = new HashMap<Integer, Integer>();
    int count = 0;
    ListNode current = head;
    nodeVals.put(count, current.val);
    count++;
    while (current.next != null) {
        current = current.next;
        nodeVals.put(count, current.val);
        count++;
    }
    // Second pass: pair each element i with its "twin" at (size - 1 - i).
    int maxTwinSum = 0;
    for (int i = 0; i < nodeVals.size() / 2; i++) {
        int currTwinSum;
        currTwinSum = nodeVals.get(i) + nodeVals.get(nodeVals.size() - 1 - i);
        if (currTwinSum > maxTwinSum) maxTwinSum = currTwinSum;
    }
    return maxTwinSum;
}
Is my Java solution O(N) or am I missing something?
Yes to both!
Your solution is O(N), AND you are missing something.
The something that you are missing is that complexity and performance are NOT the same thing. Complexity is about how some measure (e.g. time taken, space used, etc) changes depending on certain problem size variables; e.g. the size of the list N.
To put it another way: not all O(N) solutions to a problem have the same performance. Some are faster, some are slower.
In your case, HashMap is a relatively expensive data structure. While it is (amortized) O(1) for operations like get and put, the constants of proportionality are large compared with (say) using an ArrayList or an array to hold the same information.
So ... I expect that the solutions that are faster than yours won't be using HashMap.
The flipside is that an O(N^2) solution can be faster than an O(N) solution if you only consider values of N less than some threshold. This follows from the mathematical definition of Big O.
For instance, if you are sorting arrays of integers and the array size is small enough, a naive bubblesort will be faster than quicksort.
In short: complexity is not performance.
First off: HashMap methods are amortized O(1), which basically means you can treat them as O(1) if you use them often enough, because that's what they'll be on average. But building a HashMap is still a "relatively expensive" operation (a notion that can't be expressed in big-O notation, which only describes asymptotic growth and ignores constant factors).
Second: you construct a complete, inefficient copy of the list in a HashMap, which is probably slower than most other approaches.
The first optimization is to replace your HashMap with a simple ArrayList: you only use numeric and strictly monotonically increasing keys anyway, so a list is a perfect match.
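A sketch of that change, keeping the shape of the original method (ListNode is assumed to be the same class as in the question):

import java.util.ArrayList;
import java.util.List;

public int pairSum(ListNode head) {
    // The keys 0, 1, 2, ... are now simply the list indices.
    List<Integer> nodeVals = new ArrayList<>();
    for (ListNode current = head; current != null; current = current.next) {
        nodeVals.add(current.val);
    }
    int maxTwinSum = 0;
    int n = nodeVals.size();
    for (int i = 0; i < n / 2; i++) {
        int currTwinSum = nodeVals.get(i) + nodeVals.get(n - 1 - i);
        if (currTwinSum > maxTwinSum) maxTwinSum = currTwinSum;
    }
    return maxTwinSum;
}

This keeps the same O(N) complexity but drops the hashing and boxing-key overhead.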

Why is my algorithm O(1) additional space complexity?

I solved this problem from codefights:
Note: Write a solution with O(n) time complexity and O(1) additional space complexity, since this is what you would be asked to do during a real interview.
Given an array a that contains only numbers in the range from 1 to a.length, find the first duplicate number for which the second occurrence has the minimal index. In other words, if there are more than 1 duplicated numbers, return the number for which the second occurrence has a smaller index than the second occurrence of the other number does. If there are no such elements, return -1.
int firstDuplicate(int[] a) {
    // Note: the raw type HashSet should be parameterized as HashSet<Integer>.
    HashSet<Integer> z = new HashSet<>();
    for (int i : a) {
        if (z.contains(i)) {
            return i; // second occurrence found
        }
        z.add(i);
    }
    return -1;
}
My solution passed all of the tests. However I don't understand how my solution met the O(1) additional space complexity requirement. The size of the hashtable is directly proportional to the input so I would think it is O(n) space complexity. Did codefights incorrectly test my algorithm or am I misunderstanding something?
Your code doesn’t have O(1) auxiliary space complexity, since that hash set can grow up to size n if given an array of all different elements.
My guess is that the online testing infrastructure didn’t check memory usage or otherwise checked memory usage incorrectly. If you want to meet the space constraints, you’ll need to go back and try solving the problem a different way.
As a hint, think about reordering the array elements.
In case you are able to modify the incoming array, you can solve the problem in O(n) time without using any extra memory:
public static int getFirstDuplicate(int... arr) {
    for (int i = 0; i < arr.length; i++) {
        int val = Math.abs(arr[i]);
        // A negative value at index val - 1 means val has been seen before.
        if (arr[val - 1] < 0)
            return val;
        // Mark val as seen by flipping the sign at index val - 1.
        arr[val - 1] = -arr[val - 1];
    }
    return -1;
}
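For instance (example values mine), with {2, 1, 3, 5, 3, 2} the second occurrence of 3 comes before the second occurrence of 2, so:

int[] a = {2, 1, 3, 5, 3, 2};
System.out.println(getFirstDuplicate(a)); // prints 3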
This is technically incorrect, for two reasons.
Firstly, depending on the values in the array, there may be overhead when the ints are boxed to Integers and added to the HashSet.
Secondly, while the additional memory is largely the overhead associated with a HashSet, that overhead is linearly proportional to the size of the set. (Note that I am not counting the elements in this, as they are already present in the array.)
Usually, these memory constraints are tested by setting a limit on the amount of memory the solution can use. I would expect a solution like this to fall below that threshold.

Java accessing elements in an arraylist slower over time

I have some code, and I noticed that iterating through an ArrayList became drastically slower over time. The code that seems to be causing the problem is below:
public boolean isWordOfficial(String word) {
    return this.wordList.get(this.stringWordList.indexOf(word)).isWordOfficial();
}
Is there something about this code I don't know in terms of accessing the two arraylists?
I don't know exactly why, or by how much, your ArrayList performance is degrading, but from a quick glance at your use case, you are doing the following operations:
given a String word, look it up in stringWordList and get its numerical index
look up the element of wordList at this index and return it
This usage pattern would be better served by a Map, where the key is the input word (an entry in stringWordList) and the value is the corresponding element of wordList.
A map lookup would be an O(1) operation, as compared to O(N) for the lookups in a list.
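A sketch of that refactoring; the Word type and field names are assumptions, since the question doesn't show how wordList is declared:

private final Map<String, Word> wordsByString = new HashMap<>();

// Build the index once, e.g. in the constructor; assumes wordList and
// stringWordList are parallel lists, as the original code implies.
private void buildIndex() {
    for (int i = 0; i < stringWordList.size(); i++) {
        wordsByString.put(stringWordList.get(i), wordList.get(i));
    }
}

public boolean isWordOfficial(String word) {
    Word w = wordsByString.get(word); // O(1) average lookup
    return w != null && w.isWordOfficial();
}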
this.stringWordList.indexOf is O(N), and that is the cause of your issue. As N increases (you add words to the list), these operations take longer and longer.
To avoid this, keep your list sorted and use binarySearch.
This takes each lookup from O(N) to O(log N).
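A sketch of that alternative with java.util.Collections; note that sorting stringWordList changes its index correspondence with wordList, so the two lists would have to be reordered together:

import java.util.Collections;

// stringWordList must be kept sorted for binarySearch to be valid.
int idx = Collections.binarySearch(this.stringWordList, word);
boolean present = idx >= 0; // O(log N) per lookup instead of O(N)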

How to reduce the complexity of the two-list search algorithm?

I have to find some common items in two lists. I cannot sort them; the order is important. I have to find how many elements from secondList occur in firstList. Right now it looks like this:
int[] firstList;
int[] secondList;
int iterator = 0;
for (int i : firstList) {
    while (i <= secondList[iterator] /* two more conditions */) {
        iterator++;
        // some actions
    }
}
The complexity of this algorithm is n x n. I am trying to reduce it, but I don't know how else to compare the elements. Any advice?
EDIT:
Example: A = 5, 4, 3, 2, 3 and B = 1, 2, 3. We look for pairs B[i], A[j].
Conditions:
when B[i] < A[j], then j++;
when B[i] >= A[j], then return the pair B[i], A[j-1],
and the next iteration goes through the list A only up to element j-1 (that is, for (int z = 0; z < j - 1; z++)).
I'm not sure I've made myself clear. Duplicates are allowed.
My approach would be: put all the elements from the first array in a HashSet, then iterate over the second array. This reduces the complexity to the sum of the lengths of the two arrays. It has the downside of taking additional memory, but unless you use more memory I don't think you can improve on your brute-force solution.
EDIT: to avoid further dispute on the matter: if duplicates are allowed in the first array and you actually care how many times an element of the second array matches elements of the first, use a multiset (e.g. Guava's HashMultiset).
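If you'd rather stay within java.util, a sketch of the same counting idea with a plain HashMap (variable names mine):

import java.util.HashMap;
import java.util.Map;

// Count the occurrences of each value in the first array...
Map<Integer, Integer> counts = new HashMap<>();
for (int v : firstList) {
    counts.merge(v, 1, Integer::sum);
}
// ...then count how many elements of the second array occur in the first.
int matches = 0;
for (int v : secondList) {
    if (counts.containsKey(v)) {
        matches++;
    }
}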
Put all the items of the first list in a set
For each item of the second list, test if it's in the set.
Solved in less than n x n!
Edit to please fge :)
Instead of a set, you can use a map with the item as the key and the number of occurrences as the value.
Then, for each item of the second list, if it exists in the map, execute your action once per occurrence in the first list (the map entry's value).
import java.util.*;

int[] firstList;
int[] secondList;

// Note: new HashSet(Arrays.asList(firstList)) would not work for an int[]:
// Arrays.asList applied to an int[] yields a List<int[]> with one element.
Set<Integer> hs = new HashSet<>();
for (int value : firstList) {
    hs.add(value);
}
Set<Integer> result = new HashSet<>();
for (int value : secondList) {
    if (hs.contains(value)) {
        result.add(value);
    }
}
result will contain the required common elements.
The algorithm runs in O(n) time.
Just because the order is important doesn't mean that you cannot sort either list (or both). It only means you will have to copy first before you can sort anything. Of course, copying requires additional memory and sorting requires additional processing time... yet I guess all solutions that are better than O(n^2) will require additional memory and processing time (also true for the suggested HashSet solutions - adding all values to a HashSet costs additional memory and processing time).
Sorting both lists is possible in O(n * log n) time, finding common elements once the lists are sorted is possible in O(n) time. Whether it will be faster than your native O(n^2) approach depends on the size of the lists. In the end only testing different approaches can tell you which approach is fastest (and those tests should use realistic list sizes as to be expected in your final code).
Big-O notation is not a notation that tells you anything about absolute speed; it only tells you something about relative speed. E.g. if you have two algorithms to calculate a value from an input set of elements, one O(1) and the other O(n), this doesn't mean that the O(1) solution is always faster. This is a big misconception about Big-O notation! It only means that if the number of input elements doubles, the O(1) solution will still take approximately the same amount of time, while the O(n) solution will take approximately twice as much time as before. So there is no doubt that by constantly increasing the number of input elements, there must be a point where the O(1) solution becomes faster than the O(n) solution, yet for a very small set of elements the O(1) solution may in fact be slower than the O(n) solution.
OK, so this solution will work only if there are no duplicates in either the first or the second array. As the question does not say, we cannot be sure.
First, build a LinkedHashSet<Integer> out of the first array, and a HashSet<Integer> out of the second array.
Second, retain in the first set only elements that are in the second set.
Third, iterate over the first set and proceed:
// Assumes Integer[] arrays: Arrays.asList does not box an int[].
// A LinkedHashSet retains insertion order
Set<Integer> first = new LinkedHashSet<Integer>(Arrays.asList(firstArray));
// A HashSet does not, but we don't care
Set<Integer> second = new HashSet<Integer>(Arrays.asList(secondArray));
// Retain in first only what is also in second
first.retainAll(second);
// Iterate
for (int i : first)
    doSomething();

Java Search an array for a matching string

How can I optimize the following?
final String[] longStringArray = {"1", "2", "3", /* ..., */ "9999999"};
String searchingFor = "9999998";
for (String s : longStringArray) {
    if (searchingFor.equals(s)) {
        // After 9999998 iterations, finally found it.
        // Do the rest of the stuff here (not relevant to the string/array).
    }
}
NOTE: longStringArray is only searched once per run, is not sorted, and is different on every run of the program.
I'm sure there is a way to improve the worst-case performance here, but I can't seem to find it...
P.S. I would also appreciate a solution for the case where searchingFor does not exist in longStringArray.
Thank you.
Well, if you have to use an array, and you don't know if it's sorted, and you're only going to do one lookup, it's always going to be an O(N) operation. There's nothing you can do about that, because any optimization step would be at least O(N) to start with - e.g. populating a set or sorting the array.
Other options though:
If the array is sorted, you could perform a binary search. This will turn each lookup into an O(log N) operation.
If you're going to do more than one search, consider using a HashSet<String>. This will turn each lookup into an O(1) operation (assuming few collisions).
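A sketch of the second option (variable names mine):

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// One-time O(N) build; every subsequent lookup is O(1) on average.
Set<String> lookup = new HashSet<>(Arrays.asList(longStringArray));
if (lookup.contains(searchingFor)) {
    // found: do the rest of the work here
}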
import org.apache.commons.lang.ArrayUtils;
ArrayUtils.indexOf(array, string);
ArrayUtils documentation
You can create a second array with the hash codes of the strings and binary search on that.
You will have to sort the hash array and move the elements of the original array accordingly. This way you end up with extremely fast lookups, but the array has to be kept ordered, so inserting new elements costs resources.
The best option would be to use a binary tree or a B-tree; if you really have that much data and need to handle inserts, it's worth it.
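For what it's worth, you don't need to hand-roll a tree in Java; a sketch with java.util.TreeSet, which is backed by a red-black tree:

import java.util.Arrays;
import java.util.TreeSet;

// O(N log N) to build; contains() and inserts are both O(log N) afterwards.
TreeSet<String> tree = new TreeSet<>(Arrays.asList(longStringArray));
boolean found = tree.contains(searchingFor);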
Arrays.asList(longStringArray).contains(searchingFor)
