Java: remove duplicates from a List and store them in another list?

I have a list of words which contains multiple duplicate words. I want to extract the words that are duplicated and store them in another list (maintaining the integrity of the original list).
I tried iterating through the list like you see below, but this fails logically because every 'dupe' will at some point be equal to primary. I really want to iterate through the list and for every String in the list check all the OTHER strings in the list for duplicates.
Is there a method in the List interface that allows this type of comparison?
For reference, list1 is a list of Strings.
for(String primary: list1){
for(String dupe: list1){
if(primary.equals(dupe)){
System.out.print(primary + " " + dupe);
ds3.add(primary);
}
}
}
EDIT:
I should note, that I'm aware that a Set doesn't allow for duplicates, but what I'm trying to do is OBTAIN the duplicates. I want to find them, and take them out and use them later. I'm not trying to eradicate them.

The easiest way to remove the duplicates is to add all elements into a Set:
Set<String> nodups = new LinkedHashSet<String>(list1);
List<String> ds3 = new ArrayList<String>(nodups);
In the above code, ds3 will be duplicate-free. Now, if you're interested in finding which elements are duplicate in O(n):
Map<String, Integer> counter = new LinkedHashMap<String, Integer>();
for (String s : list1) {
if (counter.containsKey(s))
counter.put(s, counter.get(s) + 1);
else
counter.put(s, 1);
}
With the above, it's easy to find the duplicated elements:
List<String> ds3 = new ArrayList<String>();
for (Map.Entry<String, Integer> entry : counter.entrySet())
if (entry.getValue() > 1)
ds3.add(entry.getKey());
Yet another way, also O(n): use a Set to keep track of the duplicated elements:
Set<String> seen = new HashSet<String>();
List<String> ds3 = new ArrayList<String>();
for (String s : list1) {
if (seen.contains(s))
ds3.add(s);
else
seen.add(s);
}
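For readers on Java 8 or later, the counting approach above can also be sketched with streams. This is only an illustration: the class name DuplicateCounter, the method findDuplicates, and the sample words are made up. It leaves the input list untouched and returns each duplicated word once, in order of first appearance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicateCounter {
    // Hypothetical helper: returns every word that occurs more than once,
    // each listed once, in order of first appearance. The input is not modified.
    static List<String> findDuplicates(List<String> words) {
        Map<String, Long> counts = words.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,      // keeps first-seen order
                        Collectors.counting()));
        List<String> duplicates = new ArrayList<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(findDuplicates(list1)); // prints [a, b]
    }
}
```

Like the loop-based counter, this makes a single pass over the list, so it should stay O(n) for typical inputs.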

Consider using a Set. "A collection that contains no duplicate elements."

The intent is to extract the duplicates not lose them entirely
List<String> list =
Set<String> set = new LinkedHashSet<>(); // to keep the order
List<String> dups = new ArrayList<String>(); // could be duplicate duplicates
for(String s: list)
if (!set.add(s)) dups.add(s);

To obtain only the duplicates (as opposed to eliminating duplicates from the list), you can use a set as a temporary lookup table of which strings have already been visited:
Set<String> tmp = new HashSet<String>();
for(String primary: list1){
if(tmp.contains(primary)) {
// primary is a duplicate
}
tmp.add(primary);
}

Related

How could i transfer duplicate items in other ArrayList?

I have an array list
ArrayList<String> list=new ArrayList<String>();
list.add("Apple");
list.add("Ball");
list.add("Ball");
list.add("Cat");
list.add("Cat");
list.add("dog");
and I want to transfer duplicate strings to other ArrayList.
I mean 2nd array list should only contain Ball and Cat not Apple and dog.
Any kind of help is appreciated.
You can do this:
List<String> duplicates = new ArrayList<String>();
for(String str: list) {
if(Collections.frequency(list, str) > 1) {
duplicates.add(str);
}
}
duplicates will contain your duplicates
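Keep in mind that the loop above adds a string once per occurrence (Ball, Ball, Cat, Cat for the sample list), and that each Collections.frequency call scans the whole list. If each duplicated string should appear only once, one possible variation is to collect into a LinkedHashSet first. The class and method names below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FrequencyDuplicates {
    // Hypothetical helper: every string that occurs more than once,
    // listed a single time, in order of first appearance.
    static List<String> duplicatesOnce(List<String> list) {
        Set<String> unique = new LinkedHashSet<>();
        for (String str : list) {
            if (Collections.frequency(list, str) > 1) {
                unique.add(str); // a Set silently ignores repeated adds
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Apple", "Ball", "Ball", "Cat", "Cat", "dog");
        System.out.println(duplicatesOnce(list)); // prints [Ball, Cat]
    }
}
```

This is still quadratic in the list size because of the repeated frequency scans, which is usually fine for short lists.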
Try this:
// Custom list to ensure that one duplicate gets added to a list at most as
// opposed to n-1 instances (only two instances of a value in this list would
// be deceiving).
List<String> list = new ArrayList<>();
list.add("Apple");
list.add("Ball");
list.add("Ball");
list.add("Ball");
list.add("Ball");
list.add("Cat");
list.add("Cat");
list.add("Cat");
list.add("dog");
list.add("dog");
Set<String> set = new HashSet<>();
Set<String> setOfDuplicates = new HashSet<>();
for (String s : list) {
if (!set.add(s)) { // Remember that sets do not accept duplicates
setOfDuplicates.add(s);
}
}
List<String> listOfDuplicates = new ArrayList<>(setOfDuplicates);
You can use a Set as a way to help determine the duplicated elements then simply return an ArrayList of those elements.
public static ArrayList<String> retainDuplicates(ArrayList<String> inputList){
Set<String> tempSet = new HashSet<>();
ArrayList<String> duplicateList = new ArrayList<>();
for (String elem : inputList) {
if(!tempSet.add(elem)) duplicateList.add(elem);
}
return duplicateList.stream().distinct().collect(Collectors.toCollection(ArrayList::new));
}
call the method like so:
ArrayList<String> resultList = retainDuplicates(list);
Note that I've used distinct() to remove any elements that occur more than once within duplicateList. However, if you want to keep the duplicates regardless of their occurrences within duplicateList, then just use return duplicateList; rather than return duplicateList.stream().distinct().collect(Collectors.toCollection(ArrayList::new));.
Since you said your duplicates will all be next to each other, you can iterate through the list in adjacent pairs; if the two elements of a pair match, there is a duplicate.
Here is the general idea in Java:
List<String> newList = new ArrayList<String>();
for (int i = 0; i + 1 < list.size(); i++) {
    if (list.get(i).equals(list.get(i + 1))) {
        // there is a match here
        newList.add(list.get(i));
    }
}
The loop condition i + 1 < list.size() keeps the second index within the bounds of the list.
As for the second list not having duplicate items, you can simply store the last transferred item in a variable, and if the newly found duplicate is the same, don't transfer it again.
ArrayList<String> list=new ArrayList<String>();
list.add("Apple");
list.add("Ball");
list.add("Ball");
list.add("Cat");
list.add("Cat");
list.add("dog");
List<String> duplicateList= new ArrayList<String>();
for(String str: list) {
if(Collections.frequency(list, str) > 1) {
duplicateList.add(str);
}
}
System.out.println(duplicateList.toString());
//Here you will get duplicate String from the original list.

How do I store this data in Java?

I want a dictionary of values. The keys are all strings. Each key corresponds with some sort of list of strings. How do I make a list of strings for each key and update that accordingly? I'll explain:
I have a loop that is reading lines of a word list. The words are then converted into a string code and set as keys in the dictionary. Here is an example of the string code/word relationship.
123, [the]
456, [dog]
328, [bug]
...
However, my program keeps looping through the word list and eventually will run into a word with the same code as "the", but maybe a different word, lets say "cat". So I want the list to look like:
123, [the, cat]
456, [dog]
...
How do I get it to make an arraylist for every key that I can then add to on the fly when needed? My end goal is to be able to print out the list of words in that list for a called code (.get())
You can make a HashMap. In your case
HashMap<Integer, ArrayList<String>> works fine.
As has already been said, a MultiMap seems to be what you need. Guava was already suggested and is a good option. There is also an implementation in commons-collections that you can use.
From commons-collections documentation:
MultiValuedMap<K, String> map = new MultiValuedHashMap<K, String>();
map.put(key, "A");
map.put(key, "B");
map.put(key, "C");
Collection<String> coll = map.get(key); // returns ["A", "B", "C"]
You can always implement your own MultiMap if you don't want to use an external library. Use a HashMap<String,List<String>> to store your values and wrap it with your own put, get and whatever other methods you see fit.
It sounds like you want a Multimap from the Guava library.
You can also go the route of using a Map<Integer, List<String>>, but then you will need to manually handle the case where the list is null (probably just allocate a new list in that case).
You can use a HashMap that links each id to a list of strings:
Map<String, List<String>> dictionary = new HashMap<String,List<String>>();
Now let's say you read two Strings: id and word . To add them to your dictionary, you can first verify if your id has already been read (using the containsKey() method)- in which case you just append the word to the list corresponding to that id - or, if this is not the case, you create a new list with this word:
//If the list already exists...
if(dictionary.containsKey(id)) {
List<String> appended = dictionary.get(id);
appended.add(word); //We add a new word to our current list
dictionary.remove(id); //We update the map by first removing the old list
dictionary.put(id, appended); //and then appending the new one
} else {
//Otherwise we create a new list for that id
List<String> newList = new ArrayList<String>();
newList.add(word);
dictionary.put(id, newList);
}
Then whenever you want to retrieve your list of strings for a certain id you can simply use dictionary.get(id);
You can find more information on HashMaps in the Java documentation.
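On Java 8 or later, the containsKey/put branching above can be collapsed with Map.computeIfAbsent. The sketch below is one possible way to wrap it; the class name CodeDictionary and its two methods are hypothetical, and the codes and words are taken from the question's example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CodeDictionary {
    private final Map<String, List<String>> dictionary = new HashMap<>();

    // Creates the list for this id on first use, then appends the word.
    void add(String id, String word) {
        dictionary.computeIfAbsent(id, key -> new ArrayList<>()).add(word);
    }

    // Returns the words stored for this id, or an empty list if there are none.
    List<String> get(String id) {
        return dictionary.getOrDefault(id, Collections.emptyList());
    }

    public static void main(String[] args) {
        CodeDictionary codes = new CodeDictionary();
        codes.add("123", "the");
        codes.add("456", "dog");
        codes.add("123", "cat");
        System.out.println(codes.get("123")); // prints [the, cat]
    }
}
```

Because the list stored in the map is modified in place, there is no need to remove and re-insert the entry after adding a word.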
I assumed you didn't want repeats in your list so I used Set instead.
Map<String,Set<String>> mapToSet = new HashMap<>();
List<String []>keyvals = Arrays.asList(new String[][]{{"123","the"},{"123","cat"}});
for(String kv[] : keyvals) {
Set<String> s = mapToSet.get(kv[0]);
if(null == s) {
s = new HashSet<String>();
}
s.add(kv[1]);
mapToSet.put(kv[0], s);
}

Global variable during recursion

I have a global variable masterList, which is a HashMap.
private static HashMap<ArrayList<String>, Integer> masterList =
new HashMap<ArrayList<String>, Integer>();
I have a recursive function, generateAnagram, that puts ArrayLists of anagrams in this HashMap with the number of words in the list as the value. However, the HashMap starts to misbehave after the first call, and previous ArrayLists are overwritten with the new one I'm trying to add, but the previous value remains. This results in two keys with the same value.
Here's a screenshot of the results - Click [here] http://tinypic.com/r/ka1gli/8
private static void generateAnagram(Set<String> subsets, ArrayList<String> currList, letterMap wordMap) {
if (wordMap.count() == 0) {
System.out.println("Adding: " + currList);
masterList.put(currList, currList.size());
System.out.println("Current Master: " + masterList.toString());
} else {
for (String word : subsets) {
if (word.length() <= wordMap.count() && wordMap.isConstructionPossible(word)) {
//System.out.println("Word: " + word + " " + wordMap.isConstructionPossible(word));
wordMap.remove(word);
currList.add(word);
generateAnagram(subsets, currList, wordMap);
currList.remove(word);
wordMap.addBack(word);
}
}
}
}
It's not a good idea to use an ArrayList as the key of a HashMap. Each time you change the content of the ArrayList (by adding or removing elements), its hashCode would change, so even if it's already in the HashMap, get() and containsKey() won't find it, and put() will add it again.
You only have one instance of the ArrayList, which you keep putting in the masterList map, so you would have only one entry in your map if you didn't change the contents of that list all the time.
You need to look at this from the point of view of the parameters. The ArrayList reference is passed as an argument to your recursion call each time, but it still points to the same ArrayList. When you then put it into the hashmap, you are storing multiple references to the same, single, original ArrayList.
Therefore use ArrayList.clone() before adding it to the master list. Better still, store an immutable collection to ensure your hash doesn't get messed up in the HashMap:
HashMap<List<String>, Integer> masterList =
new HashMap<List<String>, Integer>();
...
ArrayList<String> tmp = (ArrayList<String>)currList.clone();
List<String> imm = Collections.unmodifiableList(tmp);
masterList.put(imm, imm.size());
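To make the effect concrete, here is a small sketch that contrasts storing the same list instance twice with storing a copy each time. It uses the ArrayList copy constructor rather than clone(), which is a substitution on my part; the class and method names are invented for the demonstration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    // Puts the SAME list instance into the map twice, mutating it in between.
    static Map<List<String>, Integer> sharedInstance() {
        Map<List<String>, Integer> master = new HashMap<>();
        List<String> current = new ArrayList<>();
        current.add("a");
        master.put(current, current.size());
        current.add("b");                     // the key's hashCode changes here
        master.put(current, current.size());  // stored again as a second entry
        return master;
    }

    // Puts an independent copy of the list into the map each time.
    static Map<List<String>, Integer> snapshots() {
        Map<List<String>, Integer> master = new HashMap<>();
        List<String> current = new ArrayList<>();
        current.add("a");
        master.put(new ArrayList<>(current), current.size());
        current.add("b");                     // does not touch the stored copy
        master.put(new ArrayList<>(current), current.size());
        return master;
    }

    public static void main(String[] args) {
        // Both keys are the same object, so they print identically.
        System.out.println(sharedInstance());
        // Two distinct keys, each frozen at the moment it was stored.
        System.out.println(snapshots());
    }
}
```

In the recursive function, the equivalent change would be to put a copy of currList into masterList instead of currList itself.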
"previous ArrayLists are overriden with the new one I'm trying to add, but the previous value remains."
If you do not want the previous values, you might need to do something like this
BEFORE SCENARIO:
final ArrayList<Integer> arrayList = new ArrayList<Integer>();
final HashMap<ArrayList<Integer>, Integer> hashmap = new HashMap<ArrayList<Integer>, Integer>();
arrayList.add(1);
hashmap.put(arrayList, 1);
arrayList.add(2);
hashmap.put(arrayList, 1);
System.out.println(hashmap);
OUTPUT : {[1, 2]=1, [1, 2]=1}
AFTER SCENARIO :
ArrayList<Integer> arrayList = new ArrayList<Integer>();
final HashMap<ArrayList<Integer>, Integer> hashmap = new HashMap<ArrayList<Integer>, Integer>();
arrayList.add(1);
hashmap.put(arrayList, 1);
arrayList = new ArrayList<Integer>();
arrayList.add(2);
hashmap.put(arrayList, 1);
System.out.println(hashmap);
OUTPUT : {[1]=1, [2]=1}

Java. How to delete duplicate objects from both Lists

This is my second question, a continuation of the first.
I have two lists of strings. One list (asu) contains M1, M2, M3, ... The other list (rzs) contains M1, M2, M3 and all possible combinations thereof. For each element of asu (for example M1), I need to find an element in rzs (M1, M1M2, ...) that contains it. Example: take M1 from asu and search rzs for an element that contains it. We find M1M2 in rzs, which contains M1. After that, both elements should be deleted from their lists. Many thanks to No Idea For Name, who helped modify this code. But the program always fails with an AbstractList.remove error. Please help me implement the logic and fix the code!
Imports..........
public class work{
List<string> asu = Arrays.asList("M1","M1","M1","M3","M4","M5","M1","M1","M1","M4","M5","M5");
List<string> rzs = Arrays.asList("M1","M2","M3","M4","M5",
"M1M2","M1M3","M1M4","M1M5","M2M3","M2M4","M2M5","M3M4","M3M5","M4M5"
,"M1M2M3","M1M2M4","M1M2M5","M1M3M4","M1M3M4","M1M4M5","M2M4","M2M5");
public static void main(String[] args) {
work bebebe = new work();
bebebe.mywork();
}
List<string> tmp1 = new ArrayList<string>();
List<string> tmp2 = new ArrayList<string>();
System.out.println(Arrays.deepToString(rzs));
System.out.println(Arrays.deepToString(asu));
for (string curr : asu){
for (string currRzs : rzs){
System.out.println("New iteration ");
if (currRzs.contains(curr)) {
System.out.println("Element ("+curr+") in ASU =
element ("+currRzs+") in RZS");
if(tmp1.contains(curr) == false)
tmp1.add(curr);
if(tmp2.contains(currRzs) == false)
tmp2.add(currRzs);
}
}
}
for (string curr : tmp1){
asu.remove(curr);
}
for (string currRzs : tmp2){
rzs.remove(currRzs);
}
You should try to make use of removeAll() or retainAll() methods of Collection.
For example:
List<String> aList = new ArrayList<String>();
aList.add("a");
aList.add("b");
aList.add("c");
aList.add("d");
aList.add("e");
List<String> bList = new ArrayList<String>();
bList.add("b");
bList.add("e");
bList.add("d");
aList.removeAll(bList);
will give you the "a" and "c" elements left in aList
While if you try to make use of retainAll() method:
aList.retainAll(bList);
will give you "b", "d" and "e" elements left in aList;
retainAll() is used to remove all the elements of the invoking collection which are not part of the given collection.
removeAll() is used to remove all the elements of a collection from another collection.
So, it all depends on your use-case.
EDIT
If in any case you want to remove some elements from these collections while iterating conditionally then you should first obtain the Iterator<Type> then call the remove() method over it.
Like:
while(iterator.hasNext()){
String str = iterator.next();
if(str.equals("test")){
iterator.remove();
}
}
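The iterator pattern above can be sketched as a complete method. The class name IteratorRemoval and the method removeMatching are hypothetical; the method works on a copy so that the caller's list is left alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorRemoval {
    // Returns a copy of source with every element equal to target removed.
    static List<String> removeMatching(List<String> source, String target) {
        List<String> copy = new ArrayList<>(source);
        Iterator<String> iterator = copy.iterator();
        while (iterator.hasNext()) {
            String str = iterator.next();
            if (str.equals(target)) {
                iterator.remove(); // safe: removal goes through the iterator
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "test", "b", "test");
        System.out.println(removeMatching(words, "test")); // prints [a, b]
    }
}
```

On Java 8 and later, copy.removeIf(target::equals) should have the same effect in a single call.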
Don't remove items from a list inside a foreach loop. Use a classic indexed for loop to iterate over the elements, and when you remove an item, decrement the index.
To safely remove elements while iterating use Iterator.remove method:
The behavior of an iterator is unspecified if the underlying
collection is modified while the iteration is in progress in any way
other than by calling this method.
Iterator<String> i = tmp1.iterator();
while (i.hasNext()) {
i.next(); // must be called before remove
i.remove();
}
It is also easier to remove all elements of one collection from another by simply calling:
asu.removeAll(tmp1);
Instead of a List you can use a Set, which will automatically drop the duplicate elements.
You can use removeAll() method to remove collection of elements from the list instead of removing one by one.
use
asu.removeAll(tmp1);
instead of
for (string curr : tmp1)
{
asu.remove(curr);
}
and use
rzs.removeAll(tmp2);
instead of
for (string currRzs : tmp2)
{
rzs.remove(currRzs);
}
update
I traced your problem. It lies in the Arrays.asList() method.
According to the Arrays#asList documentation,
asList() returns "a fixed-size list backed by the specified array". If you want to resize the array, you have to create a new one and copy the old data. Then the list won't be backed by the same array instance.
So create a resizable ArrayList copy of each list, like this:
List<String> asuDuplicat = new ArrayList<String>(asu);
List<String> rzsDuplicat = new ArrayList<String>(rzs);
Then use asuDuplicat and rzsDuplicat:
asuDuplicat.removeAll(tmp1);
rzsDuplicat.removeAll(tmp2);
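The difference between the fixed-size list and the ArrayList copy can be checked with a short experiment. The class FixedSizeListDemo and the helper removeThrows are invented for this sketch, and the element values are just samples in the style of the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FixedSizeListDemo {
    // Tries to remove an element and reports whether the list refused.
    // The element must be present, otherwise no removal is attempted at all.
    static boolean removeThrows(List<String> list, String element) {
        try {
            list.remove(element);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fixed = Arrays.asList("M1", "M3", "M4");
        List<String> resizable = new ArrayList<>(fixed);

        System.out.println(removeThrows(fixed, "M1"));     // prints true
        System.out.println(removeThrows(resizable, "M1")); // prints false
        System.out.println(resizable);                     // prints [M3, M4]
    }
}
```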

Comparing Two ArrayLists to Get Unique and Duplicate Values

I have two ArrayLists as shown - pinklist and normallist. I am comparing both of them and finding the unique and duplicate values from both as shown below in code:
List<String> pinklist = t2.getList();
List<String> normallist = t.getList();
ArrayList<String> duplicatevalues = new ArrayList<String>();
ArrayList<String> uniquevalues = new ArrayList<String>();
for (String finalval : pinklist) {
if (pinklist.contains(normallist)) {
duplicatevalues.add(finalval);
} else if (!normallist.contains(pinklist)) {
uniquevalues.add(finalval);
}
}
I am getting the duplicateValues properly, but I am not getting the unique values.
this should do:
List<String> pinklist = t2.getList();
List<String> normallist = t.getList();
ArrayList<String> duplicates = new ArrayList<String>(normallist);
duplicates.retainAll(pinklist);
ArrayList<String> uniques = new ArrayList<String>(normallist);
uniques.removeAll(pinklist);
Explanation:
Every List implementation used here can take another list as a constructor parameter and copy its values.
retainAll(list2) will remove all entries that do not exist in list2.
removeAll(list2) will remove all entries that do exist in list2.
We don't want to call remove/retain on the original lists, because that would modify them, so we copy them in the constructor.
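Here is a self-contained version of the same idea that can be run as-is. The class ListComparison, its two methods, and the sample values are assumptions made for the example; only retainAll and removeAll come from the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListComparison {
    // Elements of a that also occur in b. Neither input list is modified.
    static List<String> common(List<String> a, List<String> b) {
        List<String> result = new ArrayList<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements of a that do not occur in b. Neither input list is modified.
    static List<String> onlyInFirst(List<String> a, List<String> b) {
        List<String> result = new ArrayList<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        List<String> normallist = Arrays.asList("a", "b", "c", "d");
        List<String> pinklist = Arrays.asList("b", "d", "e");

        System.out.println(common(normallist, pinklist));      // prints [b, d]
        System.out.println(onlyInFirst(normallist, pinklist)); // prints [a, c]
        System.out.println(onlyInFirst(pinklist, normallist)); // prints [e]
    }
}
```

Calling onlyInFirst in both directions gives the values that are exclusive to each list.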
You're ignoring finalval in your conditions, instead asking whether one list contains the other list.
You could do it like this:
// Variable names edited for readability
for (String item : pinkList) {
if (normalList.contains(item)) {
duplicateList.add(item);
} else {
uniqueList.add(item);
}
}
I wouldn't really call these "unique" or "duplicate" items though - those are usually about items within one collection. This is just testing whether each item from one list is in another. It's more like "existing" and "new" in this case, I'd say.
Note that as you're treating these in a set-based way, I'd suggest using a set implementation such as HashSet<E> instead of lists. The Sets class in Guava provides useful methods for working with sets.
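As a rough sketch of the set-based suggestion using only the JDK (no Guava), the lookup list can be copied into a HashSet so that each contains check no longer scans a list. The class SetPartition and the method partition are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class SetPartition {
    // Splits pinkList into items that exist in normalList (key true)
    // and items that do not (key false).
    static Map<Boolean, List<String>> partition(List<String> pinkList, List<String> normalList) {
        Set<String> lookup = new HashSet<>(normalList);
        return pinkList.stream()
                .collect(Collectors.partitioningBy(lookup::contains));
    }

    public static void main(String[] args) {
        Map<Boolean, List<String>> result =
                partition(Arrays.asList("b", "e", "d"), Arrays.asList("a", "b", "c", "d"));
        System.out.println(result.get(true));  // prints [b, d]
        System.out.println(result.get(false)); // prints [e]
    }
}
```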
Try ListUtils https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/ListUtils.html
To get duplicate values use ListUtils.intersection(list1, list2)
To get unique values you could use ListUtils.sum(list1, list2), which is documented as the union of the two lists with their intersection subtracted
Do it this way -
for (String finalval : pinklist)
{
if(normallist.contains(finalval))
{
// finalval is both in pinklist and in
// normallist. Add it as a duplicate.
duplicatevalues.add(finalval); // this will get you the duplicate values
}
else {
// finalval is in pinklist but not in
// normallist. Add it as unique.
uniquevalues.add(finalval); // this will get you the values which are in
// pinklist but not in normallist
}
}
// This will give you the values which are
// in normallist but not in pinklist.
for(String value : normallist) {
if(!pinklist.contains(value)) {
uniquevalues.add(value);
}
}
Using Java8 Stream API we can filter lists and get expected results.
List<String> listOne = // Your list1
List<String> listTwo = // Your list2
List<String> uniqueElementsFromBothList = new ArrayList<>();
List<String> commonElementsFromBothList = new ArrayList<>();
// Duplicate/Common elements from both lists
commonElementsFromBothList.addAll(
listOne.stream()
.filter(str -> listTwo.contains(str))
.collect(Collectors.toList()));
// Unique element from listOne
uniqueElementsFromBothList.addAll(
listOne.stream()
.filter(str -> !listTwo.contains(str))
.collect(Collectors.toList()));
// Unique element from listOne and listTwo
// Here adding unique elements of listTwo in existing unique elements list (i.e. unique from listOne)
uniqueElementsFromBothList.addAll(
listTwo.stream()
.filter(str -> !listOne.contains(str))
.collect(Collectors.toList()));
Here's my solution to the problem.
We can create a set containing elements from both the lists.
For the unique elements, using the Stream API, we can filter the elements with a predicate that returns the XOR of the two contains calls. It returns true only for true ^ false or false ^ true, ensuring that exactly one of the lists contains the element.
For the duplicate elements, simply change the XOR to &&, and it will check whether both lists contain the element.
Code:
private static void uniqueAndDuplicateElements(List<String> a, List<String> b) {
Set<String> containsAll = new HashSet<String>();
containsAll.addAll(a);
containsAll.addAll(b);
List<String> uniquevalues = containsAll.stream()
.filter(str -> a.contains(str) ^ b.contains(str))
.collect(Collectors.toList());
List<String> duplicatevalues = containsAll.stream()
.filter(str -> a.contains(str) && b.contains(str))
.collect(Collectors.toList());
System.out.println("Unique elements from both lists: " + uniquevalues);
System.out.println("Elements present in both lists: " + duplicatevalues);
}
Why are you passing entire list to the contains method? You should pass finalval rather.
