Related
In a way, it is the reverse of the problem of generating subsets of size k from an array containing k+1 elements.
For example, if somebody gives me the pairs {a,b} , {a,c} , {b,c} , {a,e} , {b,e}, {a,f}, I need an algorithm that will tell me the triplets {a,b,c} and (a,b,e} are completely covered for their pairwise combinations in the pairs given to me. I need to generalize from pair/triplet in my example to the case k/k+1
My hunch was that there would be a well documented and efficient algorithm that solves my problem. Sadly, searching the internet did not help obtaining it. Questions already posted in stackoverflow do not cover this problem. I am thereby compelled to post this question to find my solution.
I'm not familiar with an established algorithm for this and you didn't ask for a specific language so I've written up a C# algorithm that accomplishes what you've asked and matches the test values provided. It doesn't have much real-world error checking. I've got a .Net fiddle you can run to see the results within a web browser. https://dotnetfiddle.net/ErwTeg
It works by converting your array of arrays (or similar container) to a dictionary with every unique value as a key and the value for each key being every value that is found within any list with the key. From your sample, a gets {b,c,e,f} (We'll call them friends, and this is what the GetFriends function does)
The AreFriendsWithEachother function indicates whether or not all passed values are friends with all other values.
The results of the friends list are then fed to the MakeTeam function which makes teams of a given size by enumerating every friend that a key has and trying every size length permutation of these. For instance, in the original example a has friend permutations of {{a,b,c},{a,b,e},{a,b,f},{a,c,b},{a,c,e},{a,c,f},{a,e,b},{a,e,c},{a,e,f},{a,f,b},{a,f,c},{a,f,e}}. Of these we make sure that all three values are friends by checking the friends list we created earlier. If all values within a permutation are friends then we add it to our results cache. The results would then be culled for all duplicate sets. This is handled in C# by using HashSet which only adds items that aren't already on the list.
The MakeTeam function is terrible looking because it contains a runtime variable number of loops (normally visualized by foreach). I am rolling up and down through enumerators and emulating the foreach loops myself.
I've included versions for MakeTeamOf3 and MakeTeamOf4 which show static loop structures, which are very easily adapted when you know your k value ahead of time.
The same code is provided here
using System;
using System.Collections.Generic;
using System.Linq;
namespace kfromkm1 // k elements from k minus 1
{
public class Program
{
static readonly string[][] pairs =
{
new string[] { "a", "b" },
new string[] { "a", "c" },
new string[] { "b", "c" },
new string[] { "a", "e" },
new string[] { "b", "e" },
new string[] { "a", "f" }
};
static readonly string[][] pairsExpectedResult =
{
new string[] { "a", "b", "c" },
new string[] { "a", "b", "e" }
};
static readonly string[][] triplets =
{
new string[] { "a", "b", "c" },
new string[] { "a", "b", "d" },
new string[] { "a", "c", "d" },
new string[] { "b", "c", "d" },
new string[] { "b", "c", "e" }
};
static readonly string[][] tripletsExpectedResults =
{
new string[] { "a", "b", "c", "d" }
};
public static void Main(string[] args)
{
Dictionary<string, HashSet<string>> friendsList = GetFriends(pairs);
Dump(nameof(pairs), pairs);
Console.WriteLine();
Dump(nameof(pairsExpectedResult), pairsExpectedResult);
Console.WriteLine();
HashSet<HashSet<string>> teams = MakeTeams(friendsList, 3);
Dump(nameof(teams), teams);
Console.WriteLine();
friendsList = GetFriends(triplets);
Dump(nameof(triplets), triplets);
Console.WriteLine();
Dump(nameof(tripletsExpectedResults), tripletsExpectedResults);
Console.WriteLine();
teams = MakeTeams(friendsList, 4);
Dump(nameof(teams), teams);
Console.ReadLine();
}
// helper function to display results
static void Dump<T>(string name, IEnumerable<IEnumerable<T>> values)
{
Console.WriteLine($"{name} =");
int line = 0;
bool notfirst;
foreach (IEnumerable<T> layer in values)
{
Console.Write($"{line}: {{");
notfirst = false;
foreach (T value in layer)
{
if (notfirst)
Console.Write($", {value}");
else
{
Console.Write(value);
notfirst = true;
}
}
Console.WriteLine("}");
line++;
}
}
// items are friends if they show up in a set (pair in the example) together
// list can be a list of lists, array of arrays, list of arrays, etc
// {a, b} means a and b are friends
// {a, b, c} means a is friends with b and c, b is friends with a and c, c is friends with a and b
static Dictionary<T, HashSet<T>> GetFriends<T>(IEnumerable<IEnumerable<T>> list) where T : IEquatable<T>
{
Dictionary<T, HashSet<T>> result = new Dictionary<T, HashSet<T>>();
foreach (IEnumerable<T> set in list) // one set at a time
{
foreach (T current in set) // enumerate the set from front to back
{
foreach (T other in set) // enumerate the set with a second pointer to compare every item
{
if (!current.Equals(other)) // ignore self
{
if (!result.ContainsKey(current)) // initialize this item's result hashset
result[current] = new HashSet<T>();
result[current].Add(other); // add friend (hashset will ignore duplicates)
}
}
}
}
return result;
}
// indicates whether or not all items are friends
static bool AreFriendsWithEachother<T>(Dictionary<T, HashSet<T>> friendsList, IEnumerable<T> values)
{
if (friendsList == null) // no list = no results
throw new ArgumentNullException(nameof(friendsList));
foreach (T first in values)
{
if (!friendsList.ContainsKey(first)) // not on list, has no friends
return false;
foreach (T other in values)
{
if (!friendsList[first].Contains(other) && !first.Equals(other)) // false if even one doesn't match, don't count self as non-friend for computational ease
return false;
}
}
return true; // all matched so true
}
// size represents how many items should be in each team
static HashSet<HashSet<T>> MakeTeams<T>(Dictionary<T, HashSet<T>> friendsList, int size) where T : IEquatable<T>
{
if (friendsList == null) // no list = no results
throw new ArgumentNullException(nameof(friendsList));
if (size < 2)
throw new ArgumentOutOfRangeException(nameof(size), size, "Size should be greater than 2");
HashSet<HashSet<T>> result = new HashSet<HashSet<T>>(HashSet<T>.CreateSetComparer());
T[] values = new T[size];
IEnumerator<T>[] enumerators = new IEnumerator<T>[size - 1]; // gotta cache our own enumerators with a variable number of "foreach" layers
int layer;
bool moveNext;
foreach (T key in friendsList.Keys) // this is a mess because it's a runtime variable number of copies of enumerators running over the same list
{
values[0] = key;
for (int index = 0; index < size - 1; index++)
enumerators[index] = friendsList[key].GetEnumerator();
moveNext = true;
layer = 0;
while (moveNext)
{
while (layer < size - 1 && moveNext)
{
if (enumerators[layer].MoveNext())
layer++;
else
{
if (layer == 0)
moveNext = false;
else
{
enumerators[layer].Reset();
layer--;
}
}
}
for (int index = 1; index < size; index++)
values[index] = enumerators[index - 1].Current;
if (values.Distinct().Count() == size && AreFriendsWithEachother(friendsList, values))
result.Add(new HashSet<T>(values));
layer--;
}
}
return result;
}
// provided as an example
static HashSet<HashSet<T>> MakeTeamsOf3<T>(Dictionary<T, HashSet<T>> friendsList) where T : IEquatable<T>
{
if (friendsList == null) // no list = no results
throw new ArgumentNullException(nameof(friendsList));
HashSet<HashSet<T>> result = new HashSet<HashSet<T>>(HashSet<T>.CreateSetComparer());
T[] values;
foreach (T key in friendsList.Keys) // start with every key
{
foreach (T first in friendsList[key])
{
foreach (T second in friendsList[key])
{
values = new T[] { key, first, second };
if (values.Distinct().Count() == 3 && AreFriendsWithEachother(friendsList, values)) // there's no duplicates and they are friends
result.Add(new HashSet<T>(values));
}
}
}
return result;
}
// provided as an example
static HashSet<HashSet<T>> MakeTeamsOf4<T>(Dictionary<T, HashSet<T>> friendsList) where T : IEquatable<T>
{
if (friendsList == null) // no list = no results
throw new ArgumentNullException(nameof(friendsList));
HashSet<HashSet<T>> result = new HashSet<HashSet<T>>(HashSet<T>.CreateSetComparer());
T[] values;
foreach (T key in friendsList.Keys) // start with every key
{
foreach (T first in friendsList[key])
{
foreach (T second in friendsList[key])
{
foreach (T third in friendsList[key])
{
values = new T[] { key, first, second, third };
if (values.Distinct().Count() == 4 && AreFriendsWithEachother(friendsList, values)) // there's no duplicates and they are friends
result.Add(new HashSet<T>(values));
}
}
}
}
return result;
}
}
}
Function to generate SetOfkNbrdElementCombinations
//to generate outputs with k values greater than two (pairwise)
Take SetOfkNbrdElementCombinations as an input
//Example - {{a,b},{b,c},...} : here k is 2 (though variable name will retain the letter k); elements are a,b,c,..; sets {a,b}, {b,c} are 2-numbered combinations of elements
Take nextSize as an input
//nextSize should be bigger than the k in the input SetOfkNbrdElementCombinations by 1.
//For example above where k is 2, nextSize would be 3
//Logic:
Comb(SetOfkNbrdElementCombinations={S1,S2,...Sn},nextSize) = {S1,Comb({SetOfkNbrdElementCombinations-a1},nextSize-l)}
//The recursive algorithm specified in the line above generates sets containing unique nextSize numbered combinations of the combinations in SetOfkNbrdElementCombinations
//Code that implements the algorithm is available at Rosetta code
//In our example it would generate {{{a,b},{b,c},{b,e}},{{a,b},{b,c},{a,c}},...} (triplets of pairs)
//My logic to generate nextSize numbered combinations of elements is below
// Example of my output, based on the example input above, would be {{a,b,c},{a,c,e},...}
Intitialize oputSetOfkNbrdElementCombinations to empty
For each nextSize sized combination of combinations generated above
Join the contained combinations in a union set
If the nbr of elements in the union is nextSize, add the union set to oputSetOfkNbrdElementCombinations
Output oputSetOfkNbrdElementCombinations
Here is the Java implementation of the algorithm. You can copy, paste and run on https://ide.geeksforgeeks.org/
/* This program takes k sized element combinations and generates the k+1 sized element combinations
that are possible.
For example if the program is given {a,b,c}, {a,b,d}, {a,c,d}, {b,c,d}, {b,c,e}
which are 3 sized combinations, it will identify {a,b,c,d} the
4 sized combination that has all the 3 sized combinations of its elements covered
in what were provided to the program
The program can scale to higher values of k.
The program uses only the hashset data structure
*/
//AUTHOR: Suri Chitti
import java.util.*;
public class uppOrdCombsFromCombs {
//sample CSV strings...let us pretend they came from a file
//This is a sample of input to the program
static String[] csvStrings = new String[] {
"a,b,c",
"a,b,d",
"a,c,d",
"b,c,d",
"b,c,e"
};
/* //Another sample CSV strings...let us pretend they came from a file
//This is another sample of input to the program
static String[] csvStrings = new String[] {
"a,b",
"b,c",
"a,c",
"c,e",
"a,e"
};
*/ /////USE ONLY ONE SAMPLE
//Before we can generate a k+1 sized combination of elements from a bunch
//of k sized combinations we need to obtain groups containing k+1 number of
// k sized combinations
//The method below, called SetOfNxtSizeNbrdkElementCombinationsets will do it for us
//It takes a bunch of k sized combinations called the parameter hsSetOfkNbrdCombinationsetsPrm
//which is a hashset.
//It also takes k+1 as input called the parameter nextSize which is an integer
//Outputs k+1 sized groups of k sized element combinations as a variable called hsSetOfNxtSizeNbrdCombinationsets
//which is hashset
static HashSet SetOfNxtSizeNbrdCombinationsets(HashSet hsSetOfkNbrdCombinationsetsPrm, Integer nextSize){
HashSet hsSetOfNxtSizeNbrdCombinationsets = new HashSet<>();//is it better to have nested <HashSet> tokens in this declaration?
HashSet hsRecursor = new HashSet<>(hsSetOfkNbrdCombinationsetsPrm);
Iterator <HashSet> loopIterator1 = hsSetOfkNbrdCombinationsetsPrm.iterator();
while (loopIterator1.hasNext()) {
HashSet hsName = loopIterator1.next();
if(nextSize == 1){
hsSetOfNxtSizeNbrdCombinationsets.add(hsName);
}
else {
HashSet hsConc1 = new HashSet<>();
hsRecursor.remove(hsName);
hsConc1 = SetOfNxtSizeNbrdCombinationsets(hsRecursor,nextSize-1);
Iterator <HashSet> loopIterator2 = hsConc1.iterator();
while (loopIterator2.hasNext()) {
HashSet hsConc2 = new HashSet<>();
HashSet hsConc3 = new HashSet<>();
hsConc2 = loopIterator2.next();
Iterator <HashSet> loopIterator3 = hsConc2.iterator();
Object obj = loopIterator3.next();
if (String.class.isInstance(obj)) {
hsConc3.add(hsName);
hsConc3.add(hsConc2);
}
else {
loopIterator3 = hsConc2.iterator();
hsConc3.add(hsName);
while (loopIterator3.hasNext()) {
hsConc3.add(loopIterator3.next());
}
}
hsSetOfNxtSizeNbrdCombinationsets.add(hsConc3);
}
}
}
return hsSetOfNxtSizeNbrdCombinationsets;
}
//The method below takes the k+1 sized groupings of k sized element combinations
//generated by the method above and generates all possible K+1 sized combinations of
//elements contained in them
//Name of the method is SetOfkNbrdCombinationsets
//It takes the k+1 sized groupings in a parameter called hsSetOfNxtSizeNbrdCombinationsetsPrm which is a HashSet
//It takes the value k+1 as a parameter called nextSize which is an Integer
//It returns k+1 sized combinations as a variable called hsSetOfkNbrdCombinationsets which is a HashSet
//This is the intended output of the whole program
static HashSet SetOfkNbrdCombinationsets(HashSet hsSetOfNxtSizeNbrdCombinationsetsPrm, Integer nextSize){
HashSet hsSetOfkNbrdCombinationsets = new HashSet<>();
HashSet hsMember = new HashSet<>();
Iterator <HashSet> loopIteratorOverParam = hsSetOfNxtSizeNbrdCombinationsetsPrm.iterator();
while (loopIteratorOverParam.hasNext()) {
hsMember = loopIteratorOverParam.next();
HashSet hsInnerUnion = new HashSet<>();
Iterator <HashSet> loopIteratorOverMember = hsMember.iterator();
while (loopIteratorOverMember.hasNext()) {
HashSet hsInnerMemb = new HashSet<>(loopIteratorOverMember.next());
hsInnerUnion.addAll(hsInnerMemb);
}
if (hsInnerUnion.size()==nextSize) {
HashSet hsTemp = new HashSet<>(hsInnerUnion);
hsSetOfkNbrdCombinationsets.add(hsTemp);
}
hsInnerUnion.clear();
}
return hsSetOfkNbrdCombinationsets;
}
public static void main(String args[]) {
HashSet hsSetOfkNbrdCombinationsets = new HashSet<>();//should this have nested <HashSet> tokens?
HashSet hsSetOfNxtSizeNbrdCombinationsets = new HashSet<>();//should this have nested <HashSet> tokens?
Integer innerSize=0,nextSize = 0;
System.out.println("Ahoy");
//pretend we are looping through lines in a file here
for(String line : csvStrings)
{
String[] linePieces = line.split(",");
List<String> csvPieces = new ArrayList<String>(linePieces.length);
for(String piece : linePieces)
{
//System.out.println(piece); will print each piece in separate lines
csvPieces.add(piece);
}
innerSize = csvPieces.size();
Set<String> hsInner = new HashSet<String>(csvPieces);
hsSetOfkNbrdCombinationsets.add(hsInner);
}
nextSize = innerSize+1; //programmatically obtain nextSize
hsSetOfNxtSizeNbrdCombinationsets = SetOfNxtSizeNbrdCombinationsets(hsSetOfkNbrdCombinationsets,nextSize);
hsSetOfkNbrdCombinationsets = SetOfkNbrdCombinationsets(hsSetOfNxtSizeNbrdCombinationsets, nextSize);
System.out.println("The " + nextSize + " sized combinations from elements present in the input combinations are: " + hsSetOfkNbrdCombinationsets);
} //end of main
} //end of class
I'm developing a Java Application that reads a lot of strings data likes this:
1 cat (first read)
2 dog
3 fish
4 dog
5 fish
6 dog
7 dog
8 cat
9 horse
...(last read)
I need a way to keep all couple [string, occurrences] in order from last read to first read.
string occurrences
horse 1 (first print)
cat 2
dog 4
fish 2 (last print)
Actually i use two list:
1) List<string> input; where i add all data
In my example:
input.add("cat");
input.add("dog");
input.add("fish");
...
2)List<string> possibilities; where I insert the strings once in this way:
if(possibilities.contains("cat")){
possibilities.remove("cat");
}
possibilities.add("cat");
In this way I've got a sorted list where all possibilities.
I use it like that:
int occurrence;
for(String possible:possibilities){
occurrence = Collections.frequency(input, possible);
System.out.println(possible + " " + occurrence);
}
That trick works good but it's too slow(i've got millions of input)... any help?
(English isn’t my first language, so please excuse any mistakes.)
Use a Map<String, Integer>, as #radoslaw pointed, to keep the insertion sorting use LinkedHashMap and not a TreeMap as described here:
LinkedHashMap keeps the keys in the order they were inserted, while a TreeMap is kept sorted via a Comparator or the natural Comparable ordering of the elements.
Imagine you have all the strings in some array, call it listOfAllStrings, iterate over this array and use the string as key in your map, if it does not exists, put in the map, if it exists, sum 1 to actual result...
Map<String, Integer> results = new LinkedHashMap<String, Integer>();
for (String s : listOfAllStrings) {
if (results.get(s) != null) {
results.put(s, results.get(s) + 1);
} else {
results.put(s, 1);
}
}
Make use of a TreeMap, which will keep ordering on the keys as specified by the compare of your MyStringComparator class handling MyString class which wraps String adding insertion indexes, like this:
// this better be immutable
class MyString {
private MyString() {}
public static MyString valueOf(String s, Long l) { ... }
private String string;
private Long index;
public hashcode(){ return string.hashcode(); }
public boolean equals() { // return rely on string.equals() }
}
class MyStringComparator implements Comparator<MyString> {
public int compare(MyString s1, MyString s2) {
return -s1.getIndex().compareTo(s2.gtIndex());
}
}
Pass the comparator while constructing the map:
Map<MyString,Integer> map = new TreeMap<>(new MyStringComparator());
Then, while parsing your input, do
Long counter = 0;
while (...) {
MyString item = MyString.valueOf(readString, counter++);
if (map.contains(item)) {
map.put(map.get(item)+1);
} else {
map.put(item,1);
}
}
There will be a lot of instantiation because of the immutable class, and the comparator will not be consistent with equals, but it should work.
Disclaimer: this is untested code just to show what I'd do, I'll come back and recheck it when I get my hands on a compiler.
Here is the complete solution for your problem,
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class DataDto implements Comparable<DataDto>{
public int count = 0;
public String string;
public long lastSeenTime;
public DataDto(String string) {
this.string = string;
this.lastSeenTime = System.currentTimeMillis();
}
public boolean equals(Object object) {
if(object != null && object instanceof DataDto) {
DataDto temp = (DataDto) object;
if(temp.string != null && temp.string.equals(this.string)) {
return true;
}
}
return false;
}
public int hashcode() {
return string.hashCode();
}
public int compareTo(DataDto o) {
if(o != null) {
return o.lastSeenTime < this.lastSeenTime ? -1 : 1;
}
return 0;
}
public String toString() {
return this.string + " : " + this.count;
}
public static final void main(String[] args) {
String[] listOfAllStrings = {"horse", "cat", "dog", "fish", "cat", "fish", "dog", "cat", "horse", "fish"};
Map<String, DataDto> results = new HashMap<String, DataDto>();
for (String s : listOfAllStrings) {
DataDto dataDto = results.get(s);
if(dataDto != null) {
dataDto.count = dataDto.count + 1;
dataDto.lastSeenTime = System.nanoTime();
} else {
dataDto = new DataDto(s);
results.put(s, dataDto);
}
}
List<DataDto> finalResults = new ArrayList<DataDto>(results.values());
System.out.println(finalResults);
Collections.sort(finalResults);
System.out.println(finalResults);
}
}
Ans
[horse : 1, cat : 2, fish : 2, dog : 1]
[fish : 2, horse : 1, cat : 2, dog : 1]
I think this solution will be suitable for your requirement.
If you know that your data is not going to exceed your memory capacity when you read it all into memory, then the solution is simple - using a LinkedList or a and a LinkedHashMap.
For example, if you use a Linked list:
LinkedList<String> input = new LinkedList();
You then proceed to use input.add() as you did originally. But when the input list is full, you basically use Jordi Castilla's solution - but put the entries in the linked list in reverse order. To do that, you do:
Iterator<String> iter = list.descendingIterator();
LinkedHashMap<String,Integer> map = new LinkedHashMap<>();
while (iter.hasNext()) {
String s = iter.next();
if ( map.containsKey(s)) {
map.put( s, map.get(s) + 1);
} else {
map.put(s, 1);
}
}
Now, the only real difference between his solution and mine is that I'm using list.descendingIterator() which is a method in LinkedList that gives you the entries in backwards order, from "horse" to "cat".
The LinkedHashMap will keep the proper order - whatever was entered first will be printed first, and because we entered things in reverse order, then whatever was read last will be printed first. So if you print your map the result will be:
{horse=1, cat=2, dog=4, fish=2}
If you have a very long file, and you can't load the entire list of strings into memory, you had better keep just the map of frequencies. In this case, in order to keep the order of entry, we'll use an object such as this:
private static class Entry implements Comparable<Entry> {
private static long nextOrder = Long.MIN_VALUE;
private String str;
private int frequency = 1;
private long order = nextOrder++;
public Entry(String str) {
this.str = str;
}
public String getString() {
return str;
}
public int getFrequency() {
return frequency;
}
public void updateEntry() {
frequency++;
order = nextOrder++;
}
#Override
public int compareTo(Entry e) {
if ( order > e.order )
return -1;
if ( order < e.order )
return 1;
return 0;
}
#Override
public String toString() {
return String.format( "%s: %d", str, frequency );
}
}
The trick here is that every time you update the entry (add one to the frequency), it also updates the order. But the compareTo() method orders Entry objects from high order (updated/inserted later) to low order (updated/inserted earlier).
Now you can use a simple HashMap<String,Entry> to store the information as you read it (I'm assuming you are reading from some sort of scanner):
Map<String,Entry> m = new HashMap<>();
while ( scanner.hasNextLine() ) {
String str = scanner.nextLine();
Entry entry = m.get(str);
if ( entry == null ) {
entry = new Entry(str);
m.put(str, entry);
} else {
entry.updateEntry();
}
}
Scanner.close();
Now you can sort the values of the entries:
List<Entry> orderedList = new ArrayList<Entry>(m.values());
m = null;
Collections.sort(orderedList);
Running System.out.println(orderedList) will give you:
[horse: 1, cat: 2, dog: 4, fish: 2]
In principle, you could use a TreeMap whose keys contained the "order" stuff, rather than a plain HashMap like this followed by sorting, but I prefer not having either mutable keys in a map, nor changing the keys constantly. Here we are only changing the values as we fill the map, and each key is inserted into the map only once.
What you could do:
Reverse the order of the list using
Collections.reverse(input). This runs in linear time - O(n);
Create a Set from the input list. A Set garantees uniqueness.
To preserve insertion order, you'll need a LinkedHashSet;
Iterate over this set, just as you did above.
Code:
/* I don't know what logic you use to create the input list,
* so I'm using your input example. */
List<String> input = Arrays.asList("cat", "dog", "fish", "dog",
"fish", "dog", "dog", "cat", "horse");
/* by the way, this changes the input list!
* Copy it in case you need to preserve the original input. */
Collections.reverse(input);
Set<String> possibilities = new LinkedHashSet<String>(strings);
for (String s : possibilities) {
System.out.println(s + " " + Collections.frequency(strings, s));
}
Output:
horse 1
cat 2
dog 4
fish 2
I have two ArrayLists of type Answer (self-made class).
I'd like to compare the two lists to see if they contain the same contents, but without order mattering.
Example:
//These should be equal.
ArrayList<String> listA = {"a", "b", "c"}
ArrayList<String> listB = {"b", "c", "a"}
List.equals states that two lists are equal if they contain the same size, contents, and order of elements. I want the same thing, but without order mattering.
Is there a simple way to do this? Or will I need to do a nested for loop, and manually check each index of both lists?
Note: I can't change them from ArrayList to another type of list, they need to remain that.
Probably the easiest way for any list would be:
listA.containsAll(listB) && listB.containsAll(listA)
You could sort both lists using Collections.sort() and then use the equals method. A slighly better solution is to first check if they are the same length before ordering, if they are not, then they are not equal, then sort, then use equals. For example if you had two lists of Strings it would be something like:
public boolean equalLists(List<String> one, List<String> two){
if (one == null && two == null){
return true;
}
if((one == null && two != null)
|| one != null && two == null
|| one.size() != two.size()){
return false;
}
//to avoid messing the order of the lists we will use a copy
//as noted in comments by A. R. S.
one = new ArrayList<String>(one);
two = new ArrayList<String>(two);
Collections.sort(one);
Collections.sort(two);
return one.equals(two);
}
Apache Commons Collections to the rescue once again:
List<String> listA = Arrays.asList("a", "b", "b", "c");
List<String> listB = Arrays.asList("b", "c", "a", "b");
System.out.println(CollectionUtils.isEqualCollection(listA, listB)); // true
List<String> listC = Arrays.asList("a", "b", "c");
List<String> listD = Arrays.asList("a", "b", "c", "c");
System.out.println(CollectionUtils.isEqualCollection(listC, listD)); // false
Docs:
org.apache.commons.collections4.CollectionUtils
public static boolean isEqualCollection(java.util.Collection a,
java.util.Collection b)
Returns `true` iff the given `Collection`s contain exactly the same
elements with exactly the same cardinalities.
That is, iff the
cardinality of e in a is equal to the cardinality of e in b, for each
element e in a or b.
Parameters:
a - the first collection, must not be null
b - the second
collection, must not be null
Returns: true iff the collections contain
the same elements with the same cardinalities.
// helper class, so we don't have to do a whole lot of autoboxing
private static class Count {
public int count = 0;
}
public boolean haveSameElements(final List<String> list1, final List<String> list2) {
// (list1, list1) is always true
if (list1 == list2) return true;
// If either list is null, or the lengths are not equal, they can't possibly match
if (list1 == null || list2 == null || list1.size() != list2.size())
return false;
// (switch the two checks above if (null, null) should return false)
Map<String, Count> counts = new HashMap<>();
// Count the items in list1
for (String item : list1) {
if (!counts.containsKey(item)) counts.put(item, new Count());
counts.get(item).count += 1;
}
// Subtract the count of items in list2
for (String item : list2) {
// If the map doesn't contain the item here, then this item wasn't in list1
if (!counts.containsKey(item)) return false;
counts.get(item).count -= 1;
}
// If any count is nonzero at this point, then the two lists don't match
for (Map.Entry<String, Count> entry : counts.entrySet()) {
if (entry.getValue().count != 0) return false;
}
return true;
}
I'd say these answers miss a trick.
Bloch, in his essential, wonderful, concise Effective Java, says, in item 47, title "Know and use the libraries", "To summarize, don't reinvent the wheel". And he gives several very clear reasons why not.
There are a few answers here which suggest methods from CollectionUtils in the Apache Commons Collections library but none has spotted the most beautiful, elegant way of answering this question:
Collection<Object> culprits = CollectionUtils.disjunction( list1, list2 );
if( ! culprits.isEmpty() ){
// ... do something with the culprits, i.e. elements which are not common
}
Culprits: i.e. the elements which are not common to both Lists. Determining which culprits belong to list1 and which to list2 is relatively straightforward using CollectionUtils.intersection( list1, culprits ) and CollectionUtils.intersection( list2, culprits ).
However it tends to fall apart in cases like { "a", "a", "b" } disjunction with { "a", "b", "b" } ... except this is not a failing of the software, but inherent to the nature of the subtleties/ambiguities of the desired task.
You can always examine the source code (l. 287) for a task like this, as produced by the Apache engineers. One benefit of using their code is that it will have been thoroughly tried and tested, with many edge cases and gotchas anticipated and dealt with. You can copy and tweak this code to your heart's content if need be.
NB I was at first disappointed that none of the CollectionUtils methods provides an overloaded version enabling you to impose your own Comparator (so you can redefine equals to suit your purposes).
But from collections4 4.0 there is a new class, Equator which "determines equality between objects of type T". On examination of the source code of collections4 CollectionUtils.java they seem to be using this with some methods, but as far as I can make out this is not applicable to the methods at the top of the file, using the CardinalityHelper class... which include disjunction and intersection.
I surmise that the Apache people haven't got around to this yet because it is non-trivial: you would have to create something like an "AbstractEquatingCollection" class, which instead of using its elements' inherent equals and hashCode methods would instead have to use those of Equator for all the basic methods, such as add, contains, etc. NB in fact when you look at the source code, AbstractCollection does not implement add, nor do its abstract subclasses such as AbstractSet... you have to wait till the concrete classes such as HashSet and ArrayList before add is implemented. Quite a headache.
In the mean time watch this space, I suppose. The obvious interim solution would be to wrap all your elements in a bespoke wrapper class which uses equals and hashCode to implement the kind of equality you want... then manipulate Collections of these wrapper objects.
If the cardinality of items doesn't matter (meaning: repeated elements are considered as one), then there is a way to do this without having to sort:
boolean result = new HashSet<>(listA).equals(new HashSet<>(listB));
This will create a Set out of each List, and then use HashSet's equals method which (of course) disregards ordering.
If cardinality matters, then you must confine yourself to facilities provided by List; #jschoen's answer would be more fitting in that case.
Converting the lists to Guava's Multiset works very well. They are compared regardless of their order and duplicate elements are taken into account as well.
static <T> boolean equalsIgnoreOrder(List<T> a, List<T> b) {
return ImmutableMultiset.copyOf(a).equals(ImmutableMultiset.copyOf(b));
}
assert equalsIgnoreOrder(ImmutableList.of(3, 1, 2), ImmutableList.of(2, 1, 3));
assert !equalsIgnoreOrder(ImmutableList.of(1), ImmutableList.of(1, 1));
This is based on #cHao solution. I included several fixes and performance improvements. This runs roughly twice as fast the equals-ordered-copy solution. Works for any collection type. Empty collections and null are regarded as equal. Use to your advantage ;)
/**
* Returns if both {#link Collection Collections} contains the same elements, in the same quantities, regardless of order and collection type.
* <p>
* Empty collections and {#code null} are regarded as equal.
*/
public static <T> boolean haveSameElements(Collection<T> col1, Collection<T> col2) {
if (col1 == col2)
return true;
// If either list is null, return whether the other is empty
if (col1 == null)
return col2.isEmpty();
if (col2 == null)
return col1.isEmpty();
// If lengths are not equal, they can't possibly match
if (col1.size() != col2.size())
return false;
// Helper class, so we don't have to do a whole lot of autoboxing
class Count
{
// Initialize as 1, as we would increment it anyway
public int count = 1;
}
final Map<T, Count> counts = new HashMap<>();
// Count the items in col1
for (final T item : col1) {
final Count count = counts.get(item);
if (count != null)
count.count++;
else
// If the map doesn't contain the item, put a new count
counts.put(item, new Count());
}
// Subtract the count of items in col2
for (final T item : col2) {
final Count count = counts.get(item);
// If the map doesn't contain the item, or the count is already reduced to 0, the lists are unequal
if (count == null || count.count == 0)
return false;
count.count--;
}
// At this point, both collections are equal.
// Both have the same length, and for any counter to be unequal to zero, there would have to be an element in col2 which is not in col1, but this is checked in the second loop, as #holger pointed out.
return true;
}
Think how you would do this yourself, absent a computer or programming language. I give you two lists of elements, and you have to tell me if they contain the same elements. How would you do it?
One approach, as mentioned above, is to sort the lists and then go element-by-element to see if they're equal (which is what List.equals does). This means either you're allowed to modify the lists or you're allowed to copy them -- and without knowing the assignment, I can't know if either/both of those are allowed.
Another approach would be to go through each list, counting how many times each element appears. If both lists have the same counts at the end, they have the same elements. The code for that would be to translate each list to a map of elem -> (# of times the elem appears in the list) and then call equals on the two maps. If the maps are HashMap, each of those translations is an O(N) operation, as is the comparison. That's going to give you a pretty efficient algorithm in terms of time, at the cost of some extra memory.
I had this same problem and came up with a different solution. This one also works when duplicates are involved:
public static boolean equalsWithoutOrder(List<?> fst, List<?> snd){
if(fst != null && snd != null){
if(fst.size() == snd.size()){
// create copied lists so the original list is not modified
List<?> cfst = new ArrayList<Object>(fst);
List<?> csnd = new ArrayList<Object>(snd);
Iterator<?> ifst = cfst.iterator();
boolean foundEqualObject;
while( ifst.hasNext() ){
Iterator<?> isnd = csnd.iterator();
foundEqualObject = false;
while( isnd.hasNext() ){
if( ifst.next().equals(isnd.next()) ){
ifst.remove();
isnd.remove();
foundEqualObject = true;
break;
}
}
if( !foundEqualObject ){
// fail early
break;
}
}
if(cfst.isEmpty()){ //both temporary lists have the same size
return true;
}
}
}else if( fst == null && snd == null ){
return true;
}
return false;
}
Advantages compared to some other solutions:
less than O(N²) complexity (although I have not tested it's real performance comparing to solutions in other answers here);
exits early;
checks for null;
works even when duplicates are involved: if you have an array [1,2,3,3] and another array [1,2,2,3] most solutions here tell you they are the same when not considering the order. This solution avoids this by removing equal elements from the temporary lists;
uses semantic equality (equals) and not reference equality (==);
does not sort itens, so they don't need to be sortable (by implement Comparable) for this solution to work.
One-line method :)
Collection's items NOT implement the interface Comparable<? super T>
static boolean isEqualCollection(Collection<?> a, Collection<?> b) {
return a == b || (a != null && b != null && a.size() == b.size()
&& a.stream().collect(Collectors.toMap(Function.identity(), s -> 1L, Long::sum)).equals(b.stream().collect(Collectors.toMap(Function.identity(), s -> 1L, Long::sum))));
}
Collection's items implement the interface Comparable<? super T>
static <T extends Comparable<? super T>> boolean isEqualCollection2(Collection<T> a, Collection<T> b) {
return a == b || (a != null && b != null && a.size() == b.size() && a.stream().sorted().collect(Collectors.toList()).equals(b.stream().sorted().collect(Collectors.toList())));
}
support Android5 & Android6 via https://github.com/retrostreams/android-retrostreams
static boolean isEqualCollection(Collection<?> a, Collection<?> b) {
return a == b || (a != null && b != null && a.size() == b.size()
&& StreamSupport.stream(a).collect(Collectors.toMap(Function.identity(), s->1L, Longs::sum)).equals(StreamSupport.stream(b).collect(Collectors.toMap(Function.identity(), s->1L, Longs::sum))));
}
////Test case
boolean isEquals1 = isEqualCollection(null, null); //true
boolean isEquals2 = isEqualCollection(null, Arrays.asList("1", "2")); //false
boolean isEquals3 = isEqualCollection(Arrays.asList("1", "2"), null); //false
boolean isEquals4 = isEqualCollection(Arrays.asList("1", "2", "2"), Arrays.asList("1", "1", "2")); //false
boolean isEquals5 = isEqualCollection(Arrays.asList("1", "2"), Arrays.asList("2", "1")); //true
boolean isEquals6 = isEqualCollection(Arrays.asList("1", 2.0), Arrays.asList(2.0, "1")); //true
boolean isEquals7 = isEqualCollection(Arrays.asList("1", 2.0, 100L), Arrays.asList(2.0, 100L, "1")); //true
boolean isEquals8 = isEqualCollection(Arrays.asList("1", null, 2.0, 100L), Arrays.asList(2.0, null, 100L, "1")); //true
If you don't hope to sort the collections and you need the result that ["A" "B" "C"] is not equals to ["B" "B" "A" "C"],
l1.containsAll(l2)&&l2.containsAll(l1)
is not enough, you propably need to check the size too :
List<String> l1 =Arrays.asList("A","A","B","C");
List<String> l2 =Arrays.asList("A","B","C");
List<String> l3 =Arrays.asList("A","B","C");
System.out.println(l1.containsAll(l2)&&l2.containsAll(l1));//cautions, this will be true
System.out.println(isListEqualsWithoutOrder(l1,l2));//false as expected
System.out.println(l3.containsAll(l2)&&l2.containsAll(l3));//true as expected
System.out.println(isListEqualsWithoutOrder(l2,l3));//true as expected
public static boolean isListEqualsWithoutOrder(List<String> l1, List<String> l2) {
return l1.size()==l2.size() && l1.containsAll(l2)&&l2.containsAll(l1);
}
Solution which leverages CollectionUtils subtract method:
import static org.apache.commons.collections15.CollectionUtils.subtract;
public class CollectionUtils {
static public <T> boolean equals(Collection<? extends T> a, Collection<? extends T> b) {
if (a == null && b == null)
return true;
if (a == null || b == null || a.size() != b.size())
return false;
return subtract(a, b).size() == 0 && subtract(a, b).size() == 0;
}
}
If you care about order, then just use the equals method:
list1.equals(list2)
If you don't care order then use this
Collections.sort(list1);
Collections.sort(list2);
list1.equals(list2)
I think this is a simple solution:
public boolean equalLists(List<String> one, List<String> two) {
if (one == null && two == null) {
return true;
}
if (one == null || two == null) {
return false;
}
return new HashSet<>(one).equals(new HashSet<>(two));
}
Best of both worlds [#DiddiZ, #Chalkos]: this one mainly builds upon #Chalkos method, but fixes a bug (ifst.next()), and improves initial checks (taken from #DiddiZ) as well as removes the need to copy the first collection (just removes items from a copy of the second collection).
Not requiring a hashing function or sorting, and enabling an early exist on un-equality, this is the most efficient implementation yet. That is unless you have a collection length in the thousands or more, and a very simple hashing function.
public static <T> boolean isCollectionMatch(Collection<T> one, Collection<T> two) {
if (one == two)
return true;
// If either list is null, return whether the other is empty
if (one == null)
return two.isEmpty();
if (two == null)
return one.isEmpty();
// If lengths are not equal, they can't possibly match
if (one.size() != two.size())
return false;
// copy the second list, so it can be modified
final List<T> ctwo = new ArrayList<>(two);
for (T itm : one) {
Iterator<T> it = ctwo.iterator();
boolean gotEq = false;
while (it.hasNext()) {
if (itm.equals(it.next())) {
it.remove();
gotEq = true;
break;
}
}
if (!gotEq) return false;
}
// All elements in one were found in two, and they're the same size.
return true;
}
It is an alternative way to check equality of array lists which can contain null values:
List listA = Arrays.asList(null, "b", "c");
List listB = Arrays.asList("b", "c", null);
System.out.println(checkEquality(listA, listB)); // will return TRUE
private List<String> getSortedArrayList(List<String> arrayList)
{
String[] array = arrayList.toArray(new String[arrayList.size()]);
Arrays.sort(array, new Comparator<String>()
{
#Override
public int compare(String o1, String o2)
{
if (o1 == null && o2 == null)
{
return 0;
}
if (o1 == null)
{
return 1;
}
if (o2 == null)
{
return -1;
}
return o1.compareTo(o2);
}
});
return new ArrayList(Arrays.asList(array));
}
private Boolean checkEquality(List<String> listA, List<String> listB)
{
listA = getSortedArrayList(listA);
listB = getSortedArrayList(listB);
String[] arrayA = listA.toArray(new String[listA.size()]);
String[] arrayB = listB.toArray(new String[listB.size()]);
return Arrays.deepEquals(arrayA, arrayB);
}
My solution for this. It is not so cool, but works well.
public static boolean isEqualCollection(List<?> a, List<?> b) {
if (a == null || b == null) {
throw new NullPointerException("The list a and b must be not null.");
}
if (a.size() != b.size()) {
return false;
}
List<?> bCopy = new ArrayList<Object>(b);
for (int i = 0; i < a.size(); i++) {
for (int j = 0; j < bCopy.size(); j++) {
if (a.get(i).equals(bCopy.get(j))) {
bCopy.remove(j);
break;
}
}
}
return bCopy.isEmpty();
}
Assuming if we can modify one list, with the support of java 8 removeIf
assertTrue(listA.size(), listB.size());
listB.removeIf(listA::contains);
assertTrue(listB.isEmpty());
In that case lists {"a", "b"} and {"b","a"} are equal. And {"a", "b"} and {"b","a","c"} are not equal. If you use list of complex objects, remember to override equals method, as containsAll uses it inside.
if (oneList.size() == secondList.size() && oneList.containsAll(secondList)){
areEqual = true;
}
I have an object as Riziv with three variables as id, cnk and product. Then I search in a databank for this object and add it to a ArrayList as ArrayList<Riziv> list.
Now I should checkout if all object in his array are the same cnk then return true otherwise I should return all objects which are not the same cnk with error message.
public class Riziv{ String id, cnk, product; }
ArrayList<Riziv> list = getArrayListFromDatabank(id);
public void getDuplicatedWhichHasTheSameCnk(){
}
}
Using standard JVM structures (MultiMap is provided by guava), you can do that:
public List<Riviz> getDuplicates(final List<Riviz> l)
{
final HashMap<String, List<Riviz>> m = new HashMap<String, List<Riviz>>();
final List<Riviz> ret = new ArrayList<Riviz>();
String cnk;
for (final Riviz r: l) {
cnk = r.getCnk();
if (!m.contains(cnk))
m.add(cnk, new ArrayList<Riviz>());
m.get(cnk).add(r);
}
List<Riviz> tmp;
for (final Map.Entry<String, List<Riviz>> entry: m.entrySet()) {
tmp = entry.getValue();
if (tmp.size() == 1) // no dups
continue;
ret.addAll(tmp);
}
return ret;
}
ret will contain the duplicates. You can change that function to return a Map<String, Riviz> instead, and filter out entries where the list size is only one. You'll then get a map with the conflicting cnks as keys and a list of dups as values.
I am not clear exactly what you want however I suspect you want something like this.
MultiMap<Key, Riziv> multiMap =
List<Riziv> list =
for(Riziv r: list)
multiMap.put(r.getCnk(), r);
for(Key cnk: multiMap.keySet()) {
Collection<Riziv> sameCnk = multiMap.get(cnk);
// check size and compare entries
}
The multi-map will have the list of Riziv objects for each Cnk.
One way to do it is write a comparator to sort the list by cnk String and then compare each consecutive cnk String to the next, if you find a duplicate, they will be right next to eachother.
1.) Sort the list using a comparator by sorting on the cnk variable.
2.) Compare each element in the list to the next for duplicates.
There's probably many other ways to solve this, this is just the first that came to mind.
I did not test this so you have been forewarned lol:
ArrayList<Riziv> rizArray = new ArrayList<Riziv>();
//Sort the array by the CNK variable.
Collections.sort(rizArray, new Comparator<Riziv>(){
#Override
public int compare(Riziv arg0, Riziv arg1) {
//Return the comparison of the Strings.
//Use .compareToIgnoreCase if you want to ignore upper/lower case.
return arg0.getCnk().compareTo(arg1.getCnk());
}
});
//List should be in alphabetical order at this point.
List<Riziv> duplicates = new ArrayList<Riziv>();
Riziv rizPrevious = null;
for(Riziv riz: rizArray){
if(rizPrevious == null){
rizPrevious = riz;
continue;
}
if(riz.getCnk().compareTo(rizPrevious.getCnk()) == 0){
duplicates.add(riz);
}
rizPrevious = riz;
}
Lest's say I have string:
String test= "AA BB CC BB BB CC BB";
What I would like to do is create String array like this:
String[]{"BB", "CC", "AA"}
Since B occurred 4 times C did 2 times and A only 1 time.
What would solution for this problem look like?
String test = "AA BB CC BB BB CC BB";
System.out.println(Arrays.deepToString(sort(test)));
Output: [BB, CC, AA]
Code:
public static String[] sort(String test) {
String[] strings = test.split(" ");
HashMap<String,Integer> map = new HashMap<String,Integer>();
for (String s : strings) {
Integer i = map.get(s);
if (i != null) {
map.put(s, i+1);
} else {
map.put(s, 1);
}
}
TreeMap<Integer,String> sort = new TreeMap<Integer,String>(Collections.reverseOrder());
for (Entry<String,Integer> e : map.entrySet()) {
sort.put(e.getValue(), e.getKey());
}
return sort.values().toArray(new String[0]);
}
What you could do is something like this (rough code):
String[] myOccurences = test.split(" ");
Then:
HashMap<String,Integer> occurencesMap = new HashMap<String,Integer>()
for( String s : myOccurences ){
if( occurencesMap.get( s ) == null ){
occurencesMap.put(s, 1);
} else {
occurencesMap.put(s, occurencesMap.get(s)++ );
}
}
Edit: The actual sorting (again rough code and unchecked):
List<String> mapKeys = new ArrayList<String>(occurencesMap.keySet()); // Keys
List<Integer> mapValues = new ArrayList<Integer>(occurencesMap.values()); // Values
TreeSet<Integer> sortedSet = new TreeSet( mapValues ); // Sorted according to natural order
Integer[] sortedValuesArray = sortedSet.toArray();
HashMap<String,Integer> lhMap = new LinkedHashMap<String,Integer>(); // LinkedHashMaps conserve order
for (int i=0; i<size; i++){
lhMap.put(mapKeys.get(mapValues.indexOf(sortedArray[i])), sortedValuesArray[i]);
}
mapKeys = new ArrayList<String>(occurencesMap.keySet()); // Keys again, this time sorted
Collections.sort(mapKeys, Collections.reverseOrder()); // Reverse since original ascending
String[] occurencesSortedByDescendingArray = mapKeys.toArray();
Feel free to comment.
If you want to use Guava:
Lists.transform(
Ordering
.natural()
.onResultOf(new Function<Multiset.Entry<String>, Integer>() {
public Integer apply(Multiset.Entry<String> entry) {
return entry.getCount();
}
})
.reverse()
.sortedCopy(
ImmutableMultiset.copyOf( Splitter.onPattern("\\s+").split(test) ).entrySet()
),
new Function<Multiset.Entry<String>, String>() {
public String apply(Multiset.Entry<String> entry) {
return entry.getElement();
}
}
);
I am not sure if a method exists for this exact purpose.
However, you could use the String.split() method to split the single string into an array of strings. From there, you could locate unique strings (either by manually checking or adding them all to a set, which would check for duplicates). Track (and increment a counter unique to each unique String) each time you add an element and it is not part of the collection. Then create an array that is sorted based on this count.
A map would be ideal for holding the String/count, as it would maintain the set of unique Strings as keys, and the count for each String as the value.