Create a list of the duplicate items in an ArrayList - java

I'm using a set to get a list of duplicate items from an ArrayList (which is populated from a database)
void getDuplicateHashTest() {
List<BroadcastItem> allDataStoreItems = itemsDAO.getAllItems();
Set<BroadcastItem> setOfAllData = new HashSet<>(allDataStoreItems);
List<BroadcastItem> diff = new ArrayList<>(setOfAllData);
allDataStoreItems.removeAll(diff);
}
So at the last line, all the items which are not duplicates should be removed from the list of all items.
The problem is when I print allDataStoreItems.size() I get 0
The set and the sublist print the correct number of items.
What am I doing wrong?

List#removeAll removes all occurrences of the given elements, not just one of each (in contrast to List#remove which only removes the first occurrence). So setOfAllData contains one copy of each element in your list, and then you remove all occurrences of each of those elements, meaning you'll always end up with an empty list.
To know how to fix this I'd need to know more about what you want the result to be. Do you want one copy of each element removed? If so, you could do that with:
List<BroadcastItem> allDataStoreItems = itemsDAO.getAllItems();
Set<BroadcastItem> setOfAllData = new HashSet<>(allDataStoreItems);
setOfAllData.forEach(allDataStoreItems::remove);
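The difference between the two calls is easier to see on a small list. The following is only a sketch with plain strings instead of BroadcastItem; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RemoveVersusRemoveAll {
    // removeAll drops every occurrence of every distinct element, so nothing is left.
    static List<String> afterRemoveAll(List<String> input) {
        List<String> items = new ArrayList<>(input);
        Set<String> distinct = new HashSet<>(items);
        items.removeAll(distinct);
        return items;
    }

    // remove(Object) drops only the first occurrence of each distinct element,
    // so the surplus copies stay in the list.
    static List<String> afterRemoveOnce(List<String> input) {
        List<String> items = new ArrayList<>(input);
        Set<String> distinct = new HashSet<>(items);
        distinct.forEach(items::remove);
        return items;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "a", "b", "c", "c", "c");
        System.out.println(afterRemoveAll(input));  // prints []
        System.out.println(afterRemoveOnce(input)); // prints [a, c, c]
    }
}
```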

It's simple: if you want to collect only the duplicates, use the code below.
Set<BroadcastItem> duplicates = new HashSet<>();
Set<BroadcastItem> allItems = new HashSet<>();
for (BroadcastItem b : allDataStoreItems) {
// add returns false when the item is already in the set, i.e. it is a duplicate
if (!allItems.add(b)) {
duplicates.add(b);
}
}

As already pointed out in the answer by jacobm : The Collection#removeAll method will remove all occurrences of a particular element. But the alternative of creating a list and calling remove repeatedly is not really a good solution: On a List, the remove call will usually have O(n) complexity, so figuring out the duplicates like this will have quadratic complexity.
A better solution is the one that was already mentioned by shamsher Khan in his answer (+1!): You can iterate over the list and keep track of the elements that have already been seen, using a Set.
With a HashSet, this solution has an expected complexity of O(n).
It's not clear whether you want the list or the set of all duplicates. For example, when the input is [1, 2,2,2, 3], should the result be [2,2] or just [2]? However, you can simply compute the list of duplicates, and make its elements unique in a second step, if necessary.
Here is an example:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
public class FindDuplicatesInList
{
public static void main(String[] args)
{
List<Integer> list = Arrays.asList(0,1,1,1,2,3,3,4,5,6,7,7,7,8);
List<Integer> duplicates = computeDuplicates(list);
// Prints [1, 1, 3, 7, 7]
System.out.println(duplicates);
// Prints [1, 3, 7]
System.out.println(makeUnique(duplicates));
}
private static <T> List<T> makeUnique(List<? extends T> list)
{
return new ArrayList<T>(new LinkedHashSet<T>(list));
}
private static <T> List<T> computeDuplicates(List<? extends T> list)
{
Set<T> set = new HashSet<T>();
List<T> duplicates = new ArrayList<T>();
for (T element : list)
{
boolean wasNew = set.add(element);
if (!wasNew)
{
duplicates.add(element);
}
}
return duplicates;
}
}
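If it is also useful to know how often each duplicated element occurs, a stream-based variant could count the occurrences instead. This is a sketch, not part of the answer above; the class and method names are hypothetical, and a LinkedHashMap is used only so that the result keeps the order of first appearance.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicateCounts {
    // Maps each element that occurs more than once to its number of occurrences.
    static <T> Map<T, Long> duplicateCounts(List<T> list) {
        Map<T, Long> counts = list.stream().collect(
            Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()));
        // Drop the entries for elements that occur only once.
        counts.values().removeIf(count -> count < 2);
        return counts;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(0, 1, 1, 1, 2, 3, 3, 4, 5, 6, 7, 7, 7, 8);
        System.out.println(duplicateCounts(list)); // prints {1=3, 3=2, 7=3}
    }
}
```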

Related

remove duplicate list from an arrayList using Set

I have a List that contains duplicate ArrayList.
I'm looking for a solution to remove them.
Here is an example:
listOne = [[1, 0], [0, 1], [3, 2], [2, 3]]
This list contains duplicate lists (the same elements in a different order). I would like to get:
theListAfterTransformation = [[1, 0],[3, 2]]
Here is my tiny example. I tried to use a Set, but it didn't work.
public class Example {
public static void main( String[] args ) {
ArrayList<ArrayList<Integer>> lists = new ArrayList<>();
ArrayList<Integer> list1 = new ArrayList<>(); list1.add(1); list1.add(0);
ArrayList<Integer> list2 = new ArrayList<>(); list2.add(0); list2.add(1);
ArrayList<Integer> list3 = new ArrayList<>(); list3.add(3); list3.add(2);
ArrayList<Integer> list4 = new ArrayList<>(); list4.add(2); list4.add(3);
lists.add(list1);lists.add(list2);lists.add(list3);lists.add(list4);
System.out.println(getUnduplicateList(lists));
}
public static ArrayList<ArrayList<Integer>> getUnduplicateList( ArrayList<ArrayList<Integer>> lists) {
Iterator iterator = lists.iterator();
Set<ArrayList<Integer>> set = new HashSet<>();
while (iterator.hasNext()){
ArrayList<Integer> list = (ArrayList<Integer>) iterator.next();
set.add(list);
}
return new ArrayList<>(set);
}
}
Note that this is a tiny example from my project, and it would be very hard to use a solution that changes many things in this implementation.
So please take into account that getUnduplicateList should keep the same signature; ideally only the implementation changes.
This program prints the same list as the input. Any ideas?
A couple notes on terminology—Set is a distinct data structure from List, where the former is unordered and does not allow duplicates, while the latter is a basic, linear collection, that's generally ordered, and allows duplicates. You seem to be using the terms interchangeably, which may be part of the issue you're having: Set is probably the appropriate data structure here.
That said, it seems that your code is relying on the List API, so we can follow that along. Note that you should, in general, code to the interface (List), rather than the specific class (ArrayList).
Additionally, consider using the Arrays.asList shorthand method for initializing a list (note that this returns a fixed-size list: elements can be replaced, but not added or removed).
Finally, note that a HashSet eliminates duplicates by using hashCode to find candidates and equals to compare them. Two lists are equal only if they contain the same elements in the same order, so [1, 0] and [0, 1] are not treated as duplicates. Sets, however, implement equals and hashCode in such a way that two sets containing exactly the same elements are considered equal (order doesn't matter).
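A small sketch of that difference, using the lists from the question. The class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ListVersusSetEquality {
    // Counts distinct inner lists; order matters for List.equals.
    static int distinctAsLists(List<List<Integer>> lists) {
        return new HashSet<>(lists).size();
    }

    // Counts distinct inner collections after converting each to a set; order is ignored.
    static int distinctAsSets(List<List<Integer>> lists) {
        Set<Set<Integer>> sets = new HashSet<>();
        for (List<Integer> list : lists) {
            sets.add(new HashSet<>(list));
        }
        return sets.size();
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = Arrays.asList(
            Arrays.asList(1, 0), Arrays.asList(0, 1),
            Arrays.asList(3, 2), Arrays.asList(2, 3));
        System.out.println(distinctAsLists(lists)); // prints 4
        System.out.println(distinctAsSets(lists));  // prints 2
    }
}
```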
Using your original starting collection, you can convert each inner-list to a set. Then, eliminate duplicates from the outer collection. Finally, convert the inner-collections back to lists, to maintain compatibility with the rest of your code (if needed). This approach will work regardless of the size of the inner-lists.
You can simulate these steps using a Stream, and using method references to convert to and from the Set, as below.
import java.util.List;
import java.util.Arrays;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.stream.Collectors;
public class Example {
public static void main( String[] args ) {
List<Integer> list1 = Arrays.asList(1, 0);
List<Integer> list2 = Arrays.asList(0, 1);
List<Integer> list3 = Arrays.asList(3, 2);
List<Integer> list4 = Arrays.asList(2, 3);
List<List<Integer>> lists = Arrays.asList(list1, list2, list3, list4);
System.out.println(getUnduplicateList(lists));
}
public static List<List<Integer>> getUnduplicateList(List<List<Integer>> lists) {
return lists
.stream()
.map(HashSet::new)
.distinct()
.map(ArrayList::new)
.collect(Collectors.toList());
}
}
You need to convert the inner lists to sets as well.
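If the original signature has to stay and the first occurrence of each list should survive unchanged, a loop with a set of already-seen element sets might be enough. This is a sketch under the assumption that two inner lists count as duplicates when they contain the same distinct elements (so [1, 1, 0] would also match [1, 0]); the class name and the sample helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class UnduplicateSketch {
    public static ArrayList<ArrayList<Integer>> getUnduplicateList(ArrayList<ArrayList<Integer>> lists) {
        Set<Set<Integer>> seen = new HashSet<>();
        ArrayList<ArrayList<Integer>> result = new ArrayList<>();
        for (ArrayList<Integer> list : lists) {
            // add returns true only the first time this combination of elements shows up
            if (seen.add(new HashSet<>(list))) {
                result.add(list);
            }
        }
        return result;
    }

    // Builds the input from the question.
    static ArrayList<ArrayList<Integer>> sample() {
        ArrayList<ArrayList<Integer>> lists = new ArrayList<>();
        lists.add(new ArrayList<>(Arrays.asList(1, 0)));
        lists.add(new ArrayList<>(Arrays.asList(0, 1)));
        lists.add(new ArrayList<>(Arrays.asList(3, 2)));
        lists.add(new ArrayList<>(Arrays.asList(2, 3)));
        return lists;
    }

    public static void main(String[] args) {
        System.out.println(getUnduplicateList(sample())); // prints [[1, 0], [3, 2]]
    }
}
```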
Another solution is to sort your lists and then run them through distinct, although this is not very efficient and you will end up with a set of sorted lists:
Set<List<Integer>> collect = set.stream()
.map(list -> {
list.sort(Comparator.comparingInt(Integer::intValue));
return list;
})
.distinct()
.collect(Collectors.toSet());

Most efficient way to find duplicates in a linkedlist of linkedlist of strings - java

Let us suppose we have a linkedlist of linkedlist of strings.
LinkedList<LinkedList<String>> lls = new LinkedList<LinkedList<String>> ();
LinkedList<String> list1 = new LinkedList<String>(Arrays.asList("dog", "cat", "snake"));
LinkedList<String> list2 = new LinkedList<String>(Arrays.asList("donkey", "fox", "dog"));
LinkedList<String> list3 = new LinkedList<String>(Arrays.asList("horse", "cat", "pig"));
lls.add(list1);
lls.add(list2);
lls.add(list3);
As you can see, these three linked lists of strings are different but also have some elements in common.
My goal is to write a function that compares each list with the others and returns TRUE if there is at least one element in common (dog is in list1 and list2), FALSE otherwise.
I think the first thing I need is to compare every possible pair of lists, with the comparison between two lists done element by element.
I'm not sure this is the most efficient approach.
Could you suggest a more efficient one?
Assuming that the given lists should not be changed by removing elements or sorting them (which has O(nlogn) complexity, by the way), you basically need one function as a "building block" for the actual solution. Namely, a function that checks whether one collection contains any element that is contained in another collection.
Of course, this can be solved by using Collection#contains on the second collection. But for some collections (particularly, for lists), this has O(n), and the overall running time of the check would be O(n*n).
To avoid this, you can create a Set that contains all elements of the second collection. For a hash-based Set such as HashSet or LinkedHashSet, the contains method runs in expected O(1) time.
Then, the actual check can be done conveniently, with Stream#anyMatch:
containing.stream().anyMatch(e -> set.contains(e))
So the complete example could be
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
public class DuplicatesInLinkedLists
{
public static void main(String[] args)
{
LinkedList<LinkedList<String>> lls =
new LinkedList<LinkedList<String>>();
LinkedList<String> list1 =
new LinkedList<String>(Arrays.asList("dog", "cat", "snake"));
LinkedList<String> list2 =
new LinkedList<String>(Arrays.asList("donkey", "fox", "dog"));
LinkedList<String> list3 =
new LinkedList<String>(Arrays.asList("horse", "cat", "pig"));
lls.add(list1);
lls.add(list2);
lls.add(list3);
checkDuplicates(lls);
}
private static void checkDuplicates(
List<? extends Collection<?>> collections)
{
for (int i = 0; i < collections.size(); i++)
{
for (int j = i + 1; j < collections.size(); j++)
{
Collection<?> ci = collections.get(i);
Collection<?> cj = collections.get(j);
boolean b = containsAny(ci, cj);
System.out.println(
"Collection " + ci + " contains any of " + cj + ": " + b);
}
}
}
private static boolean containsAny(Collection<?> containing,
Collection<?> contained)
{
Set<Object> set = new LinkedHashSet<Object>(contained);
return containing.stream().anyMatch(e -> set.contains(e));
}
}
A side note: The code that you posted almost certainly does not make sense in the current form. The declaration and creation of the lists should usually rely on List:
List<List<String>> lists = new ArrayList<List<String>>();
lists.add(Arrays.asList("dog", "cat", "snake"));
...
If the elements of the list have to be modifiable, then you could write
lists.add(new ArrayList<String>(Arrays.asList("dog", "cat", "snake")));
or, analogously, use LinkedList instead of ArrayList, but for the sketched use case, I can't imagine why there should be a strong reason to deliberately use LinkedList at all...
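As a variation on the pairwise check, the standard library's Collections.disjoint could replace the hand-written containsAny. The sketch below assumes a single yes/no answer is wanted for the whole list of lists; the class and method names are made up, and one side is still copied into a HashSet so that lookups stay cheap.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SharedElementCheck {
    // Returns true as soon as any two collections have at least one element in common.
    static boolean anyPairSharesElement(List<? extends Collection<?>> collections) {
        for (int i = 0; i < collections.size(); i++) {
            for (int j = i + 1; j < collections.size(); j++) {
                Set<Object> lookup = new HashSet<Object>(collections.get(j));
                // disjoint returns true when the two collections share no element
                if (!Collections.disjoint(collections.get(i), lookup)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
            Arrays.asList("dog", "cat", "snake"),
            Arrays.asList("donkey", "fox", "dog"),
            Arrays.asList("horse", "cat", "pig"));
        System.out.println(anyPairSharesElement(lists)); // prints true
    }
}
```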
Add all the items in all lists to one single list, then sort it (Collections.sort). Then iterate through it and check whether two neighbouring items are equal.
E.g.
List<String> list = new ArrayList<>();
list.addAll(list1); // add the others as well
list.addAll(list2);
list.addAll(list3);
Collections.sort(list);
String previous = null;
for (String s : list) {
// after sorting, equal items sit next to each other
if (s.equals(previous)) {
return true;
}
previous = s;
}
return false;
Use retainAll()
for (final LinkedList<String> ll : lls)
{
list1.retainAll(ll);
}
System.out.println("list1 = " + list1);
LinkedList is not the best collection for duplicate detection. If you can, use a HashSet; if you cannot, you can still put all elements from the lists into a set. A HashSet contains no duplicates, so if any element occurs in more than one list, the set will contain fewer elements than all the lists combined (provided no single list contains the same element twice).
Assuming you want to use LinkedLists and aren't allowed to convert to another data structure, you could create a method that accepts a variable number of LinkedLists. From there, go through all unique pairs of LinkedLists and compare the elements of each pair; if you find a common element, mark that pair as having something in common. How you keep track of and return the data (a set of LinkedList pairs that share an element, for example) depends on what your output is supposed to look like, but that's the general structure of the code that I would use.
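The set-based idea could be sketched as a single pass over all lists. The names below are hypothetical, and each list is deduplicated first so that a repeat inside one list is not mistaken for an element shared between lists.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SharedAcrossLists {
    // Returns true if some element appears in at least two different lists.
    static boolean anyShared(List<? extends Collection<String>> lists) {
        Set<String> seen = new HashSet<>();
        for (Collection<String> list : lists) {
            // Copy into a set so duplicates within this one list are counted once.
            for (String s : new HashSet<>(list)) {
                if (!seen.add(s)) {
                    return true; // already seen in an earlier list
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
            Arrays.asList("dog", "cat", "snake"),
            Arrays.asList("donkey", "fox", "dog"),
            Arrays.asList("horse", "cat", "pig"));
        System.out.println(anyShared(lists)); // prints true
    }
}
```

Each element is visited once, so the expected running time grows linearly with the total number of elements.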

Infinite Loop Java for an unfathomable reason

The code is supposed to partition the list into sets. If the ArrayList contains the same strings twice in a row, it will add their indexes to one HashSet otherwise the indexes will be in different HashSets. The point is to put the indexes of all the same strings from the ArrayList in the same HashSet and the indexes of the different strings in different HashSets. For example, the program SHOULD print [[0, 1][2, 3]] but it's stuck in an infinite loop. I put a print statement to verify whether the first two indexes are being added to the HashSet, which they are. The program prints [[0, 1]] instead of the expected result. For some reason, list.get(index1).equals(list.get(index2)) always evaluates to true even though I update the indexes in the loop and the result should be false at the second iteration.
package quiz;
import java.util.HashSet;
import java.util.ArrayList;
import java.util.Iterator;
public class Question {
public static void main(String[] args) {
Question q = new Question();
ArrayList<String> list2 = new ArrayList<String>();
list2.add("a");
list2.add("a");
list2.add("c");
list2.add("c");
System.out.println(q.answer(list2));
}
public HashSet<HashSet<Integer>> answer(ArrayList<String> list){
HashSet<HashSet<Integer>> hashSet = new HashSet<HashSet<Integer>>();
HashSet<Integer> set = new HashSet<Integer>();
Iterator<String> it = list.iterator();
int index1 = 0;
int index2 = 1;
while (it.hasNext()){
while (list.get(index1).equals(list.get(index2))){
set.add(index1);
set.add(index2);
if (index1<list.size()-2){
index1=index1+1;
index2=index2+1;
}
}
hashSet.add(set);
System.out.println(hashSet);
}
/*else{
set.add(i);
}*/
return hashSet;
}
}
You get an infinite loop because the outer loop tests it.hasNext() but never calls it.next(), so the iterator never advances and hasNext() keeps returning true.
In addition, you do not really need the iterator because you are not using the values it returns. You could do something like this:
while (!shouldStop)
......
if (index1<list.size()-2){
index1=index1+1;
index2=index2+1;
} else {
shouldStop=true;
}
........
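For comparison, a complete version without the iterator might look like the sketch below. It assumes that "the same strings" means runs of consecutive equal strings, as in the example, and it starts a fresh inner set for every run instead of reusing one set. The class name is hypothetical, and LinkedHashSet is used only to keep the printed order predictable.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ConsecutiveGroups {
    // Groups the indexes of consecutive equal strings into separate sets.
    static Set<Set<Integer>> answer(List<String> list) {
        Set<Set<Integer>> groups = new LinkedHashSet<>();
        Set<Integer> current = new LinkedHashSet<>();
        for (int i = 0; i < list.size(); i++) {
            // A change of value closes the current group and opens a new one.
            if (i > 0 && !list.get(i).equals(list.get(i - 1))) {
                groups.add(current);
                current = new LinkedHashSet<>();
            }
            current.add(i);
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(answer(Arrays.asList("a", "a", "c", "c"))); // prints [[0, 1], [2, 3]]
    }
}
```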

Best way to Iterate collection classes?

Guys i wanna ask about the best way to iterate collection classes ??
private ArrayList<String> no = new ArrayList<String>();
private ArrayList<String> code = new ArrayList<String>();
private ArrayList<String> name = new ArrayList<String>();
private ArrayList<String> colour = new ArrayList<String>();
private ArrayList<String> size = new ArrayList<String>();
// method for finding specific value inside ArrayList, if match then delete that element
void deleteSomeRows(Collection<String> column, String valueToDelete) {
Iterator <String> iterator = column.iterator();
do{
if (iterator.next()==valueToDelete){
iterator.remove();
}
}while(iterator.hasNext());
}
deleteSomeRows(no, "value" );
deleteSomeRows(code, "value" );
deleteSomeRows(name , "value");
deleteSomeRows(colour ,"value" );
deleteSomeRows(size , "value");
The problem with the code above is that it takes a long time just to iterate over each of those lists. Is there any way to make it faster? Please help if you can :D
You could simplify your code:
while (column.contains(valueToDelete))
{
column.remove(valueToDelete);
}
You're not going to be able to speed up your ArrayList iteration, especially if your list is not sorted. You're stuck at O(n) for this problem. If you sorted it and inserted logic to binary search for the item to remove until it is no longer found, you could speed up access.
This next suggestion isn't directly related to the time it takes, but it will cause you problems.
You should never compare String objects for equality using the == operator. This compares their references, not their contents.
Use this instead:
if (iterator.next().equals(valueToDelete))
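A tiny sketch of why == can fail here. The class and method names are invented, and new String is used only to force a second object with the same characters.

```java
class StringEqualityDemo {
    // Compares references: true only if both variables point to the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Compares contents: true if both strings hold the same characters.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "value";
        String built = new String("value"); // a distinct object with equal contents
        System.out.println(sameReference(literal, built)); // prints false
        System.out.println(sameContent(literal, built));   // prints true
    }
}
```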
EDIT: The problem here is not the iteration. The problem is removing the elements from the ArrayList. When you remove the first element from an ArrayList, then all subsequent elements have to be shifted one position to the left. So in the worst case, your current approach will have quadratic complexity.
It's difficult to avoid this in general. But in this case, the best tradeoff between simplicity and performance can probably be achieved like this: Instead of removing the elements from the original list, you create a new list which only contains the elements that are not equal to the "valueToDelete".
This could, for example, look like this:
import java.util.ArrayList;
import java.util.List;
public class QuickListRemove
{
public static void main(String[] args)
{
List<String> size = new ArrayList<String>();
size = deleteAll(size, "value");
}
private static <T> List<T> deleteAll(List<T> list, T valueToDelete)
{
List<T> result = new ArrayList<T>(list.size());
for (T value : list)
{
if (!value.equals(valueToDelete))
{
result.add(value);
}
}
return result;
}
}
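On Java 8 or later, Collection.removeIf may be a shorter way to get the same effect in place. This is a sketch that keeps the method name from the question; the class name is hypothetical. As far as I know, ArrayList implements removeIf as a single compacting pass rather than shifting the tail for every removed element, but that is an implementation detail worth checking for your JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Objects;

class RemoveIfSketch {
    // Removes every element equal to valueToDelete from the given column, in place.
    static void deleteSomeRows(Collection<String> column, String valueToDelete) {
        column.removeIf(value -> Objects.equals(value, valueToDelete));
    }

    public static void main(String[] args) {
        List<String> size = new ArrayList<>(Arrays.asList("x", "value", "y", "value"));
        deleteSomeRows(size, "value");
        System.out.println(size); // prints [x, y]
    }
}
```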
If you want to modify the collection while iterating over it, you should use an Iterator; otherwise you can use the for-each loop.
For-each:
// T is the type of elements stored in myList
for(T val : myList)
{
// do something
}
Try putting a break after you find the element to delete.

How to keep List index fixed in Java

I want to keep the indices of the items in a Java List fixed.
Example code:
import java.util.ArrayList;
public class Test {
public static void main(String[] args) {
ArrayList<Double> a = new ArrayList<Double>();
a.add(12.3);
a.add(15.3);
a.add(17.3);
a.remove(1);
System.out.println(a.get(1));
}
}
This will output 17.3. The problem is that 17.3 was on index 2 and now it's on index 1!
Is there any way to preserve the indices of other elements when removing an element? Or is there another class more suitable for this purpose?
Note: I don't want a fixed size Collection.
You might want to use java.util.SortedMap with Integer keys:
import java.util.*;
public class Test {
public static void main(String[] args)
{
SortedMap<Integer, Double> a = new TreeMap<Integer, Double>();
a.put(0, 12.3);
a.put(1, 15.3);
a.put(2, 17.3);
System.out.println(a.get(1)); // prints 15.3
System.out.println(a.get(2)); // prints 17.3
a.remove(1);
System.out.println(a.get(1)); // prints null
System.out.println(a.get(2)); // prints 17.3
}
}
SortedMap is variable-size (it is a Map, though, not a Collection)
It stores values mapped to an ordered set of keys (similar to List's indices)
No implementation of java.util.List#remove(int) may preserve the indices since the specification reads:
Removes the element at the specified position in this list (optional operation). Shifts any subsequent elements to the left (subtracts one from their indices). Returns the element that was removed from the list.
Instead of calling a.remove(1) you could do a.set(1, null). This will keep all elements in the same place while still "removing" the value at index one.
If the relationship should be always the same between the index and value then use a java.util.Map.
Instead of removing the element with a call to remove, set the element to null:
For example:
import java.util.ArrayList;
public class Test
{
public static void main(String[] args)
{
ArrayList<Double> a = new ArrayList<Double>();
a.add(12.3);
a.add(15.3);
a.add(17.3);
a.set(1, null);
System.out.println(a.get(1));
}
}
You could use a HashMap<Integer, Double>. You could add items using
myMap.put(currentMaximumIndex++, myDoubleValue);
This way, indices would be unique, if you need sparse storage you'd be reasonably okay, and removing a value wouldn't hurt existing ones.
In addition to the above answer, consider using LinkedHashMap<Integer, Double> instead of a regular HashMap.
It preserves the order in which you insert the elements.
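Putting the map-based suggestions together, a small wrapper could hand out indices that never change. This is only a sketch: StableIndexStore and its methods are hypothetical names, not an existing API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StableIndexStore {
    private final Map<Integer, Double> values = new LinkedHashMap<>();
    private int nextIndex = 0;

    // Stores the value under a fresh index and returns that index.
    int add(double value) {
        values.put(nextIndex, value);
        return nextIndex++;
    }

    // Returns the value at the index, or null if it was removed or never assigned.
    Double get(int index) {
        return values.get(index);
    }

    // Removes the value; the indices of all other values stay as they are.
    void remove(int index) {
        values.remove(index);
    }

    public static void main(String[] args) {
        StableIndexStore a = new StableIndexStore();
        a.add(12.3);
        a.add(15.3);
        a.add(17.3);
        a.remove(1);
        System.out.println(a.get(1)); // prints null
        System.out.println(a.get(2)); // prints 17.3
    }
}
```

Indices of removed entries are never reused here, which keeps the index-to-value relationship fixed at the cost of gaps.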
