I am looking for the best way to find the set that has the most strings in common with a given input set.
Here is an example:
inSet = ["a","b","c","x"]
The other sets are:
set1 = ["a","d","q","s"]
set2 = ["a","m","t","b","z"]
set3 = ["a","x","b","s","r","t"]
In the example above, set3 has the highest match count (3).
What is the best algorithm to find it with minimal execution time?
Any pointer or suggestion is appreciated.
Assuming the candidate sets are collected in a Set<Set<String>>, inSet is itself a Set<String>, and Guava's Sets class is available:
Set<Set<String>> sets = new HashSet<>();
// add the candidate Set<String>s
Set<String> maxMatchSet = sets.stream()
    .max(Comparator.comparingInt(value -> Sets.intersection(value, inSet).size()))
    .get();
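If Guava is not an option, the same idea can be sketched with the standard library only. This is a minimal sketch, not a definitive implementation: the class name MaxMatchSketch and the helper names overlap and bestMatch are made up for illustration, and candidates are taken as lists, as written in the question.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MaxMatchSketch {

    // Counts how many distinct elements of candidate also occur in inSet.
    static int overlap(Set<String> inSet, Collection<String> candidate) {
        int count = 0;
        for (String s : new HashSet<>(candidate)) { // de-duplicate the candidate first
            if (inSet.contains(s)) {
                count++;
            }
        }
        return count;
    }

    // Returns the first candidate with the largest overlap, or null if there are no candidates.
    static List<String> bestMatch(Set<String> inSet, List<List<String>> candidates) {
        List<String> best = null;
        int bestCount = -1;
        for (List<String> candidate : candidates) {
            int count = overlap(inSet, candidate);
            if (count > bestCount) {
                bestCount = count;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> inSet = new HashSet<>(Arrays.asList("a", "b", "c", "x"));
        List<List<String>> candidates = Arrays.asList(
                Arrays.asList("a", "d", "q", "s"),
                Arrays.asList("a", "m", "t", "b", "z"),
                Arrays.asList("a", "x", "b", "s", "r", "t"));
        System.out.println(bestMatch(inSet, candidates)); // prints [a, x, b, s, r, t]
    }
}
```

Each candidate element is looked up once in a HashSet, so the work grows with the total number of elements across all candidates.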
OK, now some theory. ["a", "b"] is not a set but an array (or a list); these are different data structures in Java. Sets are conventionally written with braces, as in {"a", "b"}, although Java itself has no set literal.
Anyway, what matters is the code.
Set<String> set = new HashSet<>();
would initialize the Set and
List<String> list = new ArrayList<>();
would initialize the List. Still there is another option:
String[] array = new String[3];
would initialize a new array of size 3. Arrays have a fixed length.
I am trying to declare an array of Set<String> so that I do not have to manage each set separately. But things go wrong:
ArrayList<Set<String>> categories=new LinkedHashSet<>();
Here, Java says that the type Set<String> is erroneous and then reports an error.
If this is wrong, how can I make an array of sets like this one:
static Set<String> category1 = new LinkedHashSet<>();
You are initialising an ArrayList variable with a LinkedHashSet object, hence the error:
ArrayList<Set<String>> categories=new LinkedHashSet<>();
Change it to
ArrayList<Set<String>> categories=new ArrayList<>();
Use HashSet (or another Set implementation) when you create each Set that is added to the list. Something like this:
Set<String> firstSet = new HashSet<String>();
//build your set
//add set to list
categories.add(firstSet);
Btw, you mentioned an array in your question description, so here is the declaration for a plain array of Sets:
Set<String>[] categories=new HashSet[10];
You can do something like:
List<Set<String>> list = new ArrayList<Set<String>>();
HashSet<String> hs = new HashSet<String>();
hs.add(value1);
hs.add(value2);
list.add(hs);
You can use a for or while loop to add values to the set (hs) and then add the set to the list.
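A loop-based version could look roughly like the sketch below. The class name CategoriesSketch and the helper emptyCategories are hypothetical; LinkedHashSet is used only because the question used it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CategoriesSketch {

    // Creates `count` empty sets and returns them in a list.
    static List<Set<String>> emptyCategories(int count) {
        List<Set<String>> categories = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            categories.add(new LinkedHashSet<String>());
        }
        return categories;
    }

    public static void main(String[] args) {
        List<Set<String>> categories = emptyCategories(3);
        categories.get(0).add("first");
        categories.get(0).add("first"); // duplicate, ignored by the set
        categories.get(2).add("third");
        System.out.println(categories); // prints [[first], [], [third]]
    }
}
```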
Let's say I have an arrayList1 of Points. The data structure is like this :
(1,2)->(2,2)->(3,2)->(4,2)->(5,2)
I have another arrayList2 of Points :
(2,2)->(1,2)->(8,5)->(9,3)
How do I compare the two lists and add non-existing values from arrayList2 to arrayList1?
Current solution
The only method I can think of is a for loop that checks each Point of arrayList2 against arrayList1, such as: if(!arrayList1.contains(arrayList2.get(i))){ arrayList1.add(arrayList2.get(i)); } i++;.
Is there a more efficient way, or a ready-made method in some class? I have arrayList1 through arrayList6 to compare and replace...
For one-liner lovers (running demo):
List<Point> list3 = new ArrayList<Point>(new HashSet<Point>(list1){{ addAll(list2); }});
Safe version * (running demo):
Set<String> tmpSet = new HashSet<String>(arrayList1);
tmpSet.addAll(arrayList2);
List<String> mergedList = new ArrayList<String>(tmpSet);
* As correctly pointed out by Bruce Wayne, Double Brace initialization (the one-liner example, also used in both examples to populate the first two lists) should be used with care, due to the potential drawbacks described in the following article:
Don’t be “Clever”: The Double Curly Braces Anti Pattern
Explanation: Sets can't contain duplicates, so use one as an intermediate container. Note that a HashSet does not guarantee any particular iteration order.
Example 1 code:
List<String> arrayList1 = new ArrayList<String>(){{ add("One"); add("Two"); }};
List<String> arrayList2 = new ArrayList<String>(){{ add("Two"); add("Three"); }};
List<String> mergedList = new ArrayList<String>(new HashSet<String>(arrayList1){{ addAll(arrayList2); }});
System.out.println(mergedList);
Output: [One, Two, Three]
Example 2 code:
List<String> arrayList1 = new ArrayList<String>(){{ add("One"); add("Two"); }};
List<String> arrayList2 = new ArrayList<String>(){{ add("Two"); add("Three"); }};
Set<String> tmpSet = new HashSet<String>(arrayList1);
tmpSet.addAll(arrayList2);
List<String> mergedList = new ArrayList<String>(tmpSet);
System.out.println(mergedList);
Output: [One, Two, Three]
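If the order of the merged list matters, one possible variation is to swap HashSet for LinkedHashSet, which iterates in insertion order. The sketch below is an illustration under that assumption; the class name OrderedMergeSketch and the method mergeDistinct are not from the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrderedMergeSketch {

    // Returns the elements of first followed by the elements of second
    // that were not already present, keeping first-seen order.
    static <T> List<T> mergeDistinct(List<T> first, List<T> second) {
        Set<T> seen = new LinkedHashSet<>(first); // remembers insertion order
        seen.addAll(second);                      // duplicates are silently skipped
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> merged = mergeDistinct(
                Arrays.asList("b", "a"),
                Arrays.asList("a", "c"));
        System.out.println(merged); // prints [b, a, c]
    }
}
```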
If time complexity is your main priority, add all the points in List1 to a HashSet<Point>.
Then, for each list thereafter, loop through it and see if the set contains each point and if not, add it to List1.
Set<Point> pointsInList1 = new HashSet<>(list1);
for (Point p : list2) {
    if (!pointsInList1.contains(p)) {
        list1.add(p);
        pointsInList1.add(p);
    }
}
// Repeat for other lists
This solution is linear in the combined size of the lists, assuming HashSet lookups take constant time on average.
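Since the question mentions six lists, the "repeat for other lists" step could be folded into one helper that reuses the same set. This is only a sketch; the names MergeIntoSketch and addMissing are hypothetical.

```java
import java.awt.Point;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MergeIntoSketch {

    // Appends to target every element of the other lists that target does not contain yet.
    @SafeVarargs
    static <T> void addMissing(List<T> target, List<? extends T>... others) {
        Set<T> seen = new HashSet<>(target);
        for (List<? extends T> other : others) {
            for (T item : other) {
                if (seen.add(item)) { // add returns true only if item was not present before
                    target.add(item);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Point> list1 = new ArrayList<>(Arrays.asList(
                new Point(1, 2), new Point(2, 2), new Point(3, 2), new Point(4, 2), new Point(5, 2)));
        List<Point> list2 = Arrays.asList(
                new Point(2, 2), new Point(1, 2), new Point(8, 5), new Point(9, 3));
        addMissing(list1, list2);
        System.out.println(list1.size()); // prints 7
    }
}
```

Set.add already reports whether the element was new, so the separate contains check is not needed in this variant.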
There are several possible solutions. You are using the java.awt.Point class, which already overrides equals (based on the coordinates),
so you can simply use the contains method of List.
for (Point point : list2) {
    if (!list1.contains(point)) {
        list1.add(point);
    }
}
Make sure to use a for-each loop for better performance; do not use an index-based loop, which makes a difference if you are using a LinkedList.
Another alternative is to use java.util.Set and its addAll(Collection) method. A Set does not allow duplicates and hence will merge the elements efficiently.
You should use a Set. It is a collection with no duplicates, so if you add the same value twice it will be present only once.
This means you can add many Lists to your Set without getting duplicates in it.
Set<Point> setA = new HashSet<Point>();
ArrayList<Point> points1 = new ArrayList<Point>();
ArrayList<Point> points2 = new ArrayList<Point>();
Point element1 = new Point(0,0);
Point element2 = new Point(0,1);
Point element3 = new Point(0,0);
Point element4 = new Point(0,2);
points1.add(element1);
points1.add(element2);
points1.add(element3);
points2.add(element1);
points2.add(element4);
setA.addAll(points1);
setA.addAll(points2);
Iterator<Point> it = setA.iterator();
while(it.hasNext())
System.out.println(it.next());
Output :
java.awt.Point[x=0,y=0]
java.awt.Point[x=0,y=1]
java.awt.Point[x=0,y=2]
You can do something like this
list2.removeAll(list1);
list1.addAll(list2);
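Note that removeAll modifies list2 in place. If list2 must stay intact, a possible variation is to work on a copy, as in the sketch below (the names RemoveAllSketch and addMissing are made up for illustration). Duplicates that occur only inside list2 are still copied over as they are.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveAllSketch {

    // Appends to target the elements of source that target does not contain,
    // without modifying source.
    static <T> void addMissing(List<T> target, List<T> source) {
        List<T> missing = new ArrayList<>(source); // copy, so source is left untouched
        missing.removeAll(target);
        target.addAll(missing);
    }

    public static void main(String[] args) {
        List<Integer> list1 = new ArrayList<>(Arrays.asList(1, 2));
        List<Integer> list2 = new ArrayList<>(Arrays.asList(2, 3));
        addMissing(list1, list2);
        System.out.println(list1); // prints [1, 2, 3]
        System.out.println(list2); // prints [2, 3]
    }
}
```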
You have to override the equals method (and hashCode along with it) in your Point class;
then you can iterate over the two lists and compare their values.
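For a hand-written point class, the override might look like the following sketch. GridPoint is a hypothetical class invented for this example; java.awt.Point already provides an equivalent equals and hashCode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class GridPoint {
    final int x;
    final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Two points are equal when both coordinates match.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof GridPoint)) {
            return false;
        }
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    // Must be consistent with equals so that hash-based collections work.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        List<GridPoint> points = new ArrayList<>();
        points.add(new GridPoint(1, 2));
        System.out.println(points.contains(new GridPoint(1, 2))); // prints true
    }
}
```

Without the equals override, contains would compare object identity and report false for a newly created point with the same coordinates.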
How do I compare the two lists
That one is easy, just use equals.
add non-existing values from arrayList2 to arrayList1
remove all elements of arrayList1 from arrayList2 and add what remains to arrayList1. That way only the new elements will be added to arrayList1
get the difference (arrayList2 - arrayList1) and add these to arrayList1 (for instance with CollectionUtils)
Your current solution is probably wrong (it will either skip one element or run forever, depending on your loop):
if(arrayList1.contains(arrayList2.get(i))) {
i++; // this shouldn't be there if done in the loop
} else {
arrayList1.add(arrayList2.get(i)); // here a ++ is needed if not in the loop
}
Is there a more efficient way
A little advice:
First, make it work (and have a good UnitTest coverage). Then (and only then!) optimize if needed!
I am new to Java and I literally have no idea how to do this.
I have this Java array:
String luni[];
luni = new String[] {"A","B","C"};
and I want each value A,B,C from the array to become a HashSet variable, like this:
Set<String> luni[0] = new HashSet<>(500);
Set<String> luni[1] = new HashSet<>(500);
Set<String> luni[2] = new HashSet<>(500);
In the end, A, B and C should each be a HashSet on which I can later call luni[0].add("string");
I hope you get the idea. How can I do this? It does not seem to work the way I wrote it.
You can use a HashMap; in your case it will have String keys and HashSet values.
HashMap<String, HashSet<Whatever>> map
= new HashMap<String, HashSet<Whatever>>();
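As a rough illustration of how such a map could be filled from the luni array in the question, here is a minimal sketch. The class name NamedSetsSketch and the helper setsFor are hypothetical, and String is assumed as the element type in place of Whatever.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NamedSetsSketch {

    // Creates one empty set per name and indexes it by that name.
    static Map<String, Set<String>> setsFor(String[] names) {
        Map<String, Set<String>> map = new HashMap<>();
        for (String name : names) {
            map.put(name, new HashSet<String>());
        }
        return map;
    }

    public static void main(String[] args) {
        String[] luni = new String[] {"A", "B", "C"};
        Map<String, Set<String>> sets = setsFor(luni);
        sets.get(luni[0]).add("string");   // same as sets.get("A").add("string")
        System.out.println(sets.get("A")); // prints [string]
    }
}
```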
Original answer was:
If you just need to access each HashSet in the array by index,
luni[0].add("string"), then you simply have to define luni as an
array of Sets:
But in fact, you'll need to use an ArrayList of Sets (or use an array of raw Set, but that's not as good), and you'll still be able to use it with an index:
Note that this is only good if you don't have any actual use for the "A", "B", "C" and you just wanted to access the hashsets by index.
List<Set<String>> luni = new ArrayList<Set<String>>();
luni.add( new HashSet<String>(500) );
luni.add( new HashSet<String>(500) );
luni.add( new HashSet<String>(500) );
luni.get(0).add("String");
Either use:
Map<String, Set<String>> luni = new HashMap<>();
luni.put("A", new HashSet<String>(500));
luni.put("B", new HashSet<String>(500));
luni.put("C", new HashSet<String>(500));
// To add a value to B:
luni.get("B").add("some string");
or:
List<Set<String>> luni = new ArrayList<>(3);
luni.add(new HashSet<String>(500));
luni.add(new HashSet<String>(500));
luni.add(new HashSet<String>(500));
// To add a value to 'B' (index 1):
luni.get(1).add("some string");
I suggest using the first one. The second one uses index instead of A, B and C like you want.
What is the best way to create a union of N lists in Java?
For example:
List<Integer> LIST_1 = Lists.newArrayList(1);
List<Integer> LIST_2 = Lists.newArrayList(2);
List<Integer> LIST_3 = Lists.newArrayList(3);
List<Integer> LIST_4 = Lists.newArrayList(4);
List<Integer> LIST_1_2_3_4 = Lists.newArrayList(1,2,3,4);
assert LIST_1_2_3_4.equals(union(LIST_1,LIST_2,LIST_3,LIST_4));
The union method will take a varargs parameter:
<Item> List<Item> union(List<Item> ... itemLists)
Is there a library which provides this method? The simplest way is to loop through the array and accumulate each list into one.
There may be a library, but including it only for these 3 lines of code would probably not be worth another dependency...
private static <Item> List<Item> union(List<Item> ... itemLists)
{
    List<Item> result = new ArrayList<Item>();
    for (List<Item> list : itemLists) result.addAll(list);
    return result;
}
You could use Google Guava:
List<Integer> joined = Lists.newArrayList( Iterables.concat(LIST_1, LIST_2, LIST_3, LIST_4) );
or for comparison only:
Iterables.elementsEqual( LIST_1_2_3_4, Iterables.concat(LIST_1, LIST_2, LIST_3, LIST_4) );
I am not sure what you mean by best solution, but a simple solution would use the addAll method.
For extra performance you can also pass a capacity hint obtained by summing all the sizes:
new ArrayList<...>(totalSizeHere)
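Put together, the addAll approach with a capacity hint might look like this sketch. The class name UnionSketch is made up, and whether the hint gives a measurable gain depends on the list sizes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnionSketch {

    // Concatenates all lists in order into one pre-sized ArrayList.
    @SafeVarargs
    static <T> List<T> union(List<? extends T>... lists) {
        int total = 0;
        for (List<? extends T> list : lists) {
            total += list.size();
        }
        List<T> result = new ArrayList<>(total); // capacity hint avoids repeated resizing
        for (List<? extends T> list : lists) {
            result.addAll(list);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> joined = union(
                Arrays.asList(1), Arrays.asList(2), Arrays.asList(3), Arrays.asList(4));
        System.out.println(joined); // prints [1, 2, 3, 4]
    }
}
```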
Also see this answer: How to do union, intersect, difference and reverse data in java
So, I'm trying to go through a list of documents that contain a term and then enter the corresponding document_id and the term frequency into an array (of size 2). I then add this entry array to a List, so that the final List contains all the entries. However, because the entry is passed by reference into the List, I have no idea how to accomplish this, since it overwrites itself every time. And due to the size of the data, my program runs out of memory if I try to declare a new int[] entry within the while loop. Any ideas on how to get past this? I'm a bit rusty on my Java. Thanks.
List<int[]> occurenceIndex = new ArrayList<>();
int[] entry = new int[2];
while (matchedDocs.next())
{
entry[0] = (matchedDocs.doc()); // Adds document id
entry[1] = (matchedDocs.freq()); // Adds term weight
occurenceIndex.add(entry);
}
Try to create a new object of the int array inside the loop.
List<int[]> occurenceIndex = new ArrayList<>();
while (matchedDocs.next())
{
int[] entry = new int[2];
entry[0] = (matchedDocs.doc()); // Adds document id
entry[1] = (matchedDocs.freq()); // Adds term weight
occurenceIndex.add(entry);
}
You have to put int[] entry = new int[2]; inside the while loop, so that each list element refers to its own array.
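The difference between the two versions can be seen in a small, self-contained sketch. The matchedDocs iterator from the question is replaced here by a plain int[][] of (doc id, frequency) pairs, and the names AliasingSketch, shared and fresh are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class AliasingSketch {

    // Reuses one array: every list element points at the same object.
    static List<int[]> shared(int[][] rows) {
        List<int[]> out = new ArrayList<>();
        int[] entry = new int[2];
        for (int[] row : rows) {
            entry[0] = row[0];
            entry[1] = row[1];
            out.add(entry); // adds another reference to the same array
        }
        return out;
    }

    // Allocates a new array per iteration: every list element is independent.
    static List<int[]> fresh(int[][] rows) {
        List<int[]> out = new ArrayList<>();
        for (int[] row : rows) {
            out.add(new int[] {row[0], row[1]});
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] rows = {{1, 10}, {2, 20}};
        System.out.println(shared(rows).get(0)[0]); // prints 2 (overwritten by the last row)
        System.out.println(fresh(rows).get(0)[0]);  // prints 1
    }
}
```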
Does it need to be an int? What about byte or short? If that isn't possible, the program needs to be refactored, as there is no way to store the entries like this using the same array instance. – Neil Locketz
Consider using HashMap to store records.
Map<Integer, Integer> occurenceIdx = new HashMap<Integer, Integer>();
while(matchedDocs.next())
occurenceIdx.put(matchedDocs.doc(), matchedDocs.freq());
That's all the code you need to create the map. To retrieve a value by doc ID:
docFreq = occurenceIdx.get(docId);
Please note that this will work ONLY if you have unique doc IDs. If not, you will have to adapt this solution. I would probably make the map a HashMap<Integer, List<Integer>> to support multiple entries per doc ID.
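One way the Map<Integer, List<Integer>> variant could be written, assuming Java 8 or later, is sketched below. The class name OccurrenceIndexSketch and the helper record are made up, and hard-coded values stand in for matchedDocs.doc() and matchedDocs.freq().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OccurrenceIndexSketch {

    // Appends freq to the list stored for docId, creating the list on first use.
    static void record(Map<Integer, List<Integer>> index, int docId, int freq) {
        index.computeIfAbsent(docId, k -> new ArrayList<>()).add(freq);
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> index = new HashMap<>();
        record(index, 7, 3);
        record(index, 7, 5); // same doc ID again, kept as a second entry
        record(index, 9, 1);
        System.out.println(index.get(7)); // prints [3, 5]
    }
}
```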