How to find duplicates in an ArrayList<Object>? - java

This is a pretty common question, but I could not find this part:
Say I have this array list:
List<MyDataClass> arrayList = new ArrayList<MyDataClass>();
class MyDataClass {
String name;
String age;
}
Now I need to find duplicates on the basis of age in MyDataClass and remove them. How is that possible using something like a HashSet, as described here?
I guess we will need to override equals in MyDataClass?
But, what if I do not have the luxury of doing that?
And how does HashSet actually find duplicates internally and avoid adding them? I saw its implementation here in OpenJDK but couldn't understand it.

I'd suggest that you override both equals and hashCode (HashSet relies on both!)
To remove the duplicates you could simply create a new HashSet with the ArrayList as argument, and then clear the ArrayList and put back the elements stored in the HashSet.
class MyDataClass {
String name;
String age;
@Override
public int hashCode() {
return name.hashCode() ^ age.hashCode();
}
@Override
public boolean equals(Object obj) {
if (!(obj instanceof MyDataClass))
return false;
MyDataClass mdc = (MyDataClass) obj;
return mdc.name.equals(name) && mdc.age.equals(age);
}
}
And then do
List<MyDataClass> arrayList = new ArrayList<MyDataClass>();
Set<MyDataClass> uniqueElements = new HashSet<MyDataClass>(arrayList);
arrayList.clear();
arrayList.addAll(uniqueElements);
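One caveat with the round trip above: a plain HashSet makes no promise about iteration order, so the list may come back reordered. A minimal sketch, assuming the original order matters, that swaps in LinkedHashSet instead. The names `Item`, `OrderedDedupSketch` and `dedupe` are made up for illustration and are not from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for MyDataClass with equals and hashCode overridden.
class Item {
    final String name;
    final String age;

    Item(String name, String age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Item)) {
            return false;
        }
        Item other = (Item) obj;
        return name.equals(other.name) && age.equals(other.age);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }
}

class OrderedDedupSketch {
    // Returns a new list without duplicates, keeping first occurrences in order.
    static <T> List<T> dedupe(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("a", "30"), new Item("b", "25"), new Item("a", "30"));
        List<Item> unique = dedupe(items);
        System.out.println(unique.size());      // 2
        System.out.println(unique.get(0).name); // a
    }
}
```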
But, what if I do not have the luxury of doing that?
Then I'd suggest you do some sort of decorator-class that does provide these methods.
class MyDataClassDecorator {
MyDataClass mdc;
public MyDataClassDecorator(MyDataClass mdc) {
this.mdc = mdc;
}
@Override
public int hashCode() {
return mdc.name.hashCode() ^ mdc.age.hashCode();
}
@Override
public boolean equals(Object obj) {
if (!(obj instanceof MyDataClassDecorator))
return false;
MyDataClassDecorator mdcd = (MyDataClassDecorator) obj;
return mdcd.mdc.name.equals(mdc.name) && mdcd.mdc.age.equals(mdc.age);
}
}

And if you are not able to override "MyDataClass"'s hashCode and equals methods you could write a wrapper class that handles this.
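If writing a wrapper feels heavy, another option for the "by age only" case is to track the ages already seen in a separate set. This is only a sketch under assumptions: `PersonRow` is a hypothetical stand-in for a class that cannot be modified, and `dedupeByAge` is an illustrative helper, not an existing API. It also shows the mechanism the question asks about: `Set.add` returns false when an equal element is already present.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class we cannot change: no equals/hashCode overrides.
class PersonRow {
    final String name;
    final String age;

    PersonRow(String name, String age) {
        this.name = name;
        this.age = age;
    }
}

class AgeDedupSketch {
    // Keeps the first element seen for each age and drops later ones.
    static List<PersonRow> dedupeByAge(List<PersonRow> input) {
        Set<String> seenAges = new HashSet<>();
        List<PersonRow> result = new ArrayList<>();
        for (PersonRow row : input) {
            // add() returns false if the age was already in the set
            if (seenAges.add(row.age)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<PersonRow> rows = Arrays.asList(
                new PersonRow("ann", "30"),
                new PersonRow("bob", "30"),
                new PersonRow("cid", "41"));
        for (PersonRow row : dedupeByAge(rows)) {
            System.out.println(row.name + " " + row.age);
        }
    }
}
```

Because String already implements equals and hashCode, the set of ages works without touching the data class.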

Suppose you have a class named Person with two properties: id and firstName.
Add this method to the class:
String uniqueAttributes() {
return id + firstName;
}
The getDuplicates() method should then look like this:
public static List<Person> getDuplicates(final List<Person> personList) {
return getDuplicatesMap(personList).values().stream()
.filter(duplicates -> duplicates.size() > 1)
.flatMap(Collection::stream)
.collect(Collectors.toList());
}
private static Map<String, List<Person>> getDuplicatesMap(List<Person> personList) {
return personList.stream().collect(Collectors.groupingBy(Person::uniqueAttributes));
}

public Set<Object> findDuplicates(List<Object> list) {
Set<Object> items = new HashSet<Object>();
Set<Object> duplicates = new HashSet<Object>();
for (Object item : list) {
if (items.contains(item)) {
duplicates.add(item);
} else {
items.add(item);
}
}
return duplicates;
}

Related

Filter a java stream for distinct objects distinguishable according to an attribute without using collections

I have a stream of objects (Measurements), each with a private sensorID attribute. I want to filter this stream so that I keep only Measurements with different sensorIDs. To my understanding, the "distinct" method would do that for whole objects but not for a specific attribute of the objects. This code does the job by using collections:
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
Map<Object, Boolean> seen = new ConcurrentHashMap<>();
return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
and can be used like this
stream.filter(distinctByKey((Measurement p) -> p.getSensorId()))
for the Measurement class
public class Measurement {
private final int sensorId;
//more stuff
public Measurement(int sensorId) {
this.sensorId = sensorId;
//more stuff
}
public int getSensorId() {
return sensorId;
}
}
What I am looking for is a way to do the same thing without using collections, or at least with immutable data types. Any ideas?
The solution, as suggested by @Deadpool in his comment, was simple.
Just override the equals method of the object.
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
Measurement other = (Measurement) obj;
return sensorId == other.sensorId;
}
But you also have to override the hashCode method, like this:
@Override
public int hashCode() {
return Objects.hash(sensorId);
}
That's why it didn't work in the beginning.
A possible alternative is to collect the stream into a map, as in the following example:
Collection<Measurement> measurementsById = stream
.collect(Collectors.toMap(Measurement::getSensorId, Function.identity(),
(measurement1, measurement2) -> measurement1))
.values();
In doing so, you are collecting the stream elements into a map keyed by sensorId. Since a map can have only one value per key, the merge function simply keeps the first stream object for each key.
Then, by calling values() on the resulting map, you get the collection of distinct measurements, one per sensorId.
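The three-argument `Collectors.toMap` makes no promise about the map type it returns, so the order of `values()` should not be relied on. A minimal sketch, assuming encounter order should be preserved, using the four-argument overload with a `LinkedHashMap` supplier. `Reading` is a hypothetical stand-in for Measurement, and `distinctBySensor` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for the Measurement class.
class Reading {
    private final int sensorId;
    private final double value;

    Reading(int sensorId, double value) {
        this.sensorId = sensorId;
        this.value = value;
    }

    int getSensorId() {
        return sensorId;
    }

    double getValue() {
        return value;
    }
}

class DistinctBySensorSketch {
    // Keeps the first reading per sensorId, in encounter order.
    static List<Reading> distinctBySensor(Stream<Reading> readings) {
        Map<Integer, Reading> firstPerSensor = readings.collect(Collectors.toMap(
                Reading::getSensorId,
                Function.identity(),
                (first, second) -> first, // on a key clash keep the earlier element
                LinkedHashMap::new));     // insertion-ordered map
        return new ArrayList<>(firstPerSensor.values());
    }

    public static void main(String[] args) {
        List<Reading> result = distinctBySensor(Stream.of(
                new Reading(7, 1.0), new Reading(3, 2.0), new Reading(7, 3.0)));
        for (Reading r : result) {
            System.out.println(r.getSensorId() + " " + r.getValue());
        }
    }
}
```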

HashMap not retrieving both values even after overriding hashcode & equals for same object

I know this question will be pretty amateur, but I am having trouble understanding why my HashMap will not store or retrieve both values when I use the same object instance as a key. My code is as follows:
public class Candidate {
private String id;
private String name;
public Candidate (String id, String name){
this.id = id;
this.name = name;
}
public static void main(String args[]){
Candidate cad = new Candidate("101","hari");
HashMap<Candidate,String> mp = new HashMap<Candidate,String>();
mp.put(cad, "sachin");
mp.put(cad, "shewag");
for(Candidate cand : mp.keySet()){
System.out.println(mp.get(cand).toString());
}
}
I am overriding hashcode and equals as follows.
@Override
public boolean equals(Object obj){
Candidate cad =(Candidate)obj;
if(!(obj instanceof Candidate)){
return false;
}
if(cad.id.equals(this.id) && cad.name.equals(this.name)){
return true;
}
return false;
}
@Override
public int hashCode(){
return Objects.hash(id, name);
}
When I try to get the size of the HashMap, it returns only one, meaning the first insertion into the HashMap was overwritten by the second one.
Is it because I am using the same instance of Candidate to insert two values? Is it possible to force hashmap to insert both key,value pairs?
The whole idea behind a Map is that 1) keys are unique -- it holds only one key/value pair for a particular key, and 2) its look up is relatively "cheap".
You've only got one object within your HashMap. Understand that when you add another key, value pair to the map, if the key is the same as a previous item in the map, the previous item is replaced by the new one. If you want to add two or more items, then use different keys, or create a Map that holds a List<...> of objects as its value. e.g.,
HashMap<Candidate, List<String>>
In this situation, you would first check to see if the Map holds a Candidate item, and if so, add a new String to its list. If not, then add the new Candidate with a new ArrayList<String> value. Usually I use a method for just this purpose, something like:
public static void put(Candidate cand, String text) {
if (newMap.containsKey(cand)) {
newMap.get(cand).add(text);
} else {
List<String> list = new ArrayList<>();
list.add(text);
newMap.put(cand, list);
}
}
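As a quick check of the replacement behaviour described above, `Map.put` returns the value it displaced (or null when the key was new). The sketch below uses plain String keys to keep it short; `PutReplacesSketch` and `replacedValue` are illustrative names only.

```java
import java.util.HashMap;
import java.util.Map;

class PutReplacesSketch {
    // Puts two values under one key and returns what the second put replaced.
    static String replacedValue(Map<String, String> map, String key,
                                String first, String second) {
        map.put(key, first);
        return map.put(key, second);
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        String replaced = replacedValue(map, "101", "sachin", "shewag");
        System.out.println(replaced);       // sachin
        System.out.println(map.size());     // 1
        System.out.println(map.get("101")); // shewag
    }
}
```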
And yes, as d.j.brown states in comment, fix your equals method to avoid a class cast exception.
Something like so:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
public class MyCandidateTest {
private static Map<Candidate, List<String>> newMap = new HashMap<>();
public static void main(String args[]) {
Candidate cad = new Candidate("101", "hari");
put(cad, "Foo");
put(cad, "Bar");
for (Candidate cand : newMap.keySet()) {
System.out.println(newMap.get(cand).toString());
}
}
public static void put(Candidate cand, String text) {
if (newMap.containsKey(cand)) {
newMap.get(cand).add(text);
} else {
List<String> list = new ArrayList<>();
list.add(text);
newMap.put(cand, list);
}
}
}
public class Candidate {
private String id;
private String name;
public Candidate(String id, String name) {
this.id = id;
this.name = name;
}
@Override
public boolean equals(Object obj) {
// Candidate cad =(Candidate)obj; // !! no
if (!(obj instanceof Candidate)) {
return false;
}
Candidate cad = (Candidate) obj; // !! yes
if (cad.id.equals(this.id) && cad.name.equals(this.name)) {
return true;
}
return false;
}
@Override
public int hashCode() {
return Objects.hash(id, name);
}
}
There is a simpler way to do what you want with java-8 btw, simplified example:
HashMap<String, List<String>> mp = new HashMap<>();
List<String> list = Arrays.asList("aa", "aa", "bb", "bb");
for (String s : list) {
mp.computeIfAbsent(s, k -> new ArrayList<>()).add("c");
}
System.out.println(mp); // {bb=[c, c], aa=[c, c]}
Either use Map<Candidate, List<String>> or a good third-party alternative such as Google's Multimap: https://google.github.io/guava/releases/19.0/api/docs/com/google/common/collect/Multimap.html

How can I group a list of objects by two attributes in one iteration?

I'm trying to group a large list of objects by two of their attributes. To demonstrate what I mean, consider the following example.
public class Foo {
private String attributeA;
private String attributeB;
private String anotherAttribute;
}
I want to group a large list of Foo objects by the attributes attributeA and attributeB. Currently I do the following.
List<Foo> foos = getFoos();
Map<Set<String>, List<Foo>> groupedFoos = Maps.newHashMap();
Set<String> fooGroup;
for(Foo foo : foos) {
fooGroup = Sets.newHashSet(foo.getAttributeA(), foo.getAttributeB());
if (!groupedFoos.containsKey(fooGroup)) {
groupedFoos.put(fooGroup, Lists.newArrayList(foo));
} else {
groupedFoos.get(fooGroup).add(foo);
}
}
How can I achieve the same result without using a Map like Map<Set<String>, List<Foo>>? It is important to do this in one iteration. The values of the attributes attributeA and attributeB can be swapped, so using Pair as the key of the Map is also not an option.
If you want to get rid of the Set as key, you can always write your own Key class that compares both attributes regardless of their order.
public class Key {
private String a;
private String b;
private String c;
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Key foo = (Key) o;
// equal when both attributes match, in either order
return (a.equals(foo.a) && b.equals(foo.b))
|| (a.equals(foo.b) && b.equals(foo.a));
}
@Override
public int hashCode() {
// addition is commutative, so swapped attributes produce the same hash
return a.hashCode() + b.hashCode();
}
}
Add a key method to your class.
public class Foo {
private String attributeA;
private String attributeB;
private String anotherAttribute;
public final String getKey() {
return this.attributeA + "$" + this.attributeB; //use $ or any other delimiter as suggested in the comment
}
}
Then, if you can use Java 8, use the Collectors.groupingBy() method as below:
final Map<String, List<Foo>> result = getFoos().stream().collect(Collectors.groupingBy(Foo::getKey));
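Note that a concatenated key treats (A, B) and (B, A) as different groups, while the question says the two values can be swapped. One way around that, sketched here under assumptions, is to normalise the key by ordering the two attributes before grouping. `Pairing` is a hypothetical stand-in for Foo, and `unorderedKey` and `group` are illustrative names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for Foo.
class Pairing {
    final String attributeA;
    final String attributeB;

    Pairing(String attributeA, String attributeB) {
        this.attributeA = attributeA;
        this.attributeB = attributeB;
    }

    // Sorted two-element list: ("x", "y") and ("y", "x") give the same key.
    List<String> unorderedKey() {
        return attributeA.compareTo(attributeB) <= 0
                ? Arrays.asList(attributeA, attributeB)
                : Arrays.asList(attributeB, attributeA);
    }
}

class GroupByPairSketch {
    // Groups in a single pass over the list; List keys compare by content.
    static Map<List<String>, List<Pairing>> group(List<Pairing> items) {
        return items.stream().collect(Collectors.groupingBy(Pairing::unorderedKey));
    }

    public static void main(String[] args) {
        Map<List<String>, List<Pairing>> grouped = group(Arrays.asList(
                new Pairing("x", "y"), new Pairing("y", "x"), new Pairing("x", "z")));
        System.out.println(grouped.size());                          // 2
        System.out.println(grouped.get(Arrays.asList("x", "y")).size()); // 2
    }
}
```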

Dynamically filter List with Predicate

I have the following list of Strings:
{"New York","London","Paris","Berlin","New York"}
I am trying to use the Guava library, and I want to filter this list so that I get only the strings equal to a string that I provide. If I had a fixed value, let's say "New York", I would do the following:
Predicate<String> myCity = new Predicate<String>() {
@Override public boolean apply(String city) {
return city == "New York";
}
};
But what if I want the return statement to be something like this :
return city == myStringVariable
Can I give the argument for the city to the Predicate or combine two predicates somehow ?
Guava provides a number of really generic Predicates, via the Predicates helper class.
To filter on equality (be it for a String or any other object), it provides the equalTo() predicate:
Predicate<String> myCity = Predicates.equalTo(myStringVariable);
Update to answer a question in the comments: how to filter when the list is not of Strings but of objects which have a String property.
You have several options, depending on what you already have:
Use imperative code
For a one-time use, unless you use Java 8, it's a bit verbose to use the functional constructs. See the FunctionalExplained page in the wiki.
private List<SomeObject> filterOnMyCity(List<SomeObject> list,
String value) {
List<SomeObject> result = new ArrayList<>();
for (SomeObject o : list) {
if (value.equals(o.getMyCity())) {
result.add(o);
}
}
return result;
}
Use an ad-hoc predicate
class MyCityPredicate implements Predicate<SomeObject> {
private final String value;
public MyCityPredicate(String value) {
this.value = value;
}
@Override
public boolean apply(@Nullable SomeObject input) {
return input != null && value.equals(input.getPropertyX());
}
}
return FluentIterable.from(list)
.filter(new MyCityPredicate(myStringVariable))
.toList();
Use an existing Function
class MyCityFunction implements Function<SomeObject, String> {
@Override
public String apply(@Nullable SomeObject input) {
return input == null ? null : input.getPropertyX();
}
}
return FluentIterable.from(list)
.filter(Predicates.compose(
Predicates.equalTo(myStringVariable),
new MyCityFunction()))
.toList();
However, I wouldn't override equals() as mentioned in the comments: it could be too specific to say that 2 instances of SomeObject are equal just because one of their properties is. It doesn't scale if you need to filter on 2 different properties in different contexts.
If you have Java 8, the code becomes much more compact using lambda expressions, and you can directly use the Stream API anyway so you don't need Guava for that.
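For comparison, here is roughly what the Java 8 version of the imperative filter could look like. This is a sketch, not the answerer's code: `Place` is a hypothetical stand-in for SomeObject, and only the `getMyCity()` accessor is taken from the snippet above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for SomeObject.
class Place {
    private final String myCity;

    Place(String myCity) {
        this.myCity = myCity;
    }

    String getMyCity() {
        return myCity;
    }
}

class StreamFilterSketch {
    // Keeps the elements whose city equals the given value.
    static List<Place> filterOnMyCity(List<Place> list, String value) {
        return list.stream()
                .filter(o -> value.equals(o.getMyCity())) // the lambda closes over 'value'
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Place> places = Arrays.asList(
                new Place("New York"), new Place("London"), new Place("New York"));
        System.out.println(filterOnMyCity(places, "New York").size()); // 2
    }
}
```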
Either you use a final String or subclass of Predicate :
final String testStr = textbox.getText(); //for example
Predicate<String> myCity = new Predicate<String>() {
@Override public boolean apply(String city) {
return city.equals(testStr);
}
};
or
public class CityPredicate implements Predicate<String>{
String cityName;
public CityPredicate(String cityName){
this.cityName = cityName;
}
@Override public boolean apply(String city) {
return city.equals(cityName);
}
}
//Use example :
Predicate<String> myCity = new CityPredicate("New York");
And as @Sotirios Delimanolis told you, always compare Strings with equals()
EDIT : example with Frank Pavageau's solution :
Given your class:
public class City{
String cityName;
public City(String cityName){
this.cityName=cityName;
}
}
// A City is never equal to a String, so extract the name first and compare that
Function<City, String> toName = new Function<City, String>() {
@Override public String apply(City city) {
return city.cityName;
}
};
List<City> cities = Lists.newArrayList(new City("New York"),new City("Chicago"));
Predicate<City> myCityPredicate = Predicates.compose(Predicates.equalTo("New York"), toName);
final List<City> res = Lists.newArrayList(Iterables.filter(cities , myCityPredicate));
//res.size() will be 1 and containing only your City("New York")
//To check whether it is present you can do :
final boolean isIn = Iterables.any(cities, myCityPredicate);
You can construct the predicate by closing over some other variable:
final String cityName = "New York";
Predicate<String> myCity = new Predicate<String>() {
@Override public boolean apply(String city) {
return city.equals(cityName);
}
};
Note how I compare strings with their equals method rather than the == reference equality operator. See the question "How do I compare strings in Java?" for why.
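A small sketch of the difference: `==` checks whether two references point to the same object, while `equals` compares the characters. The class and method names are made up for illustration.

```java
class StringCompareSketch {
    // True only when both references point to the very same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // True when both strings contain the same characters.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "New York";
        String built = new String("New York"); // a distinct object with equal content
        System.out.println(sameReference(literal, built)); // false
        System.out.println(sameContent(literal, built));   // true
    }
}
```

A string read from a text box or a file is a separate object like `built`, which is why the `==` version of the predicate tends to match nothing.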

Return a sublist based on member variable or mapping function

I have a list of pojo's List<Pojo> pojoList; and pojo.getColour(); returns an Enum instance.
And I want to do this :
List<Pojo> newList = new ArrayList<Pojo>();
for(Pojo pojo:pojoList){
if(pojo.getColour() == Colour.Red){
newList.add(pojo);
}
}
I could see myself using a similar function for lists of other types, so rather than repeating a lot of code, is there a way to make it generic and/or functional, so that I could create sublists of different types based upon a different rule?
First of all, I should note that if you just want a new ArrayList containing the matching elements, the way you did it in your example is just fine. Until Java has lambda expressions, you're not going to get it simpler or better looking than that.
Since you tagged this with guava, here's how you could do this with Guava. You're basically filtering the original list on the composition of a predicate (== Colour.Red) and a function (pojo.getColour()). So if you had a static final Function<Pojo, Colour> called COLOUR on Pojo (like this):
public static final Function<Pojo, Colour> COLOUR =
new Function<Pojo, Colour>() {
@Override public Colour apply(Pojo input) {
return input.getColour();
}
};
you could create that combination like this:
Predicate<Pojo> isRedPojo = Predicates.compose(
Predicates.equalTo(Colour.Red), Pojo.COLOUR);
You can then create a filtered view of the original list:
Iterable<Pojo> redPojos = Iterables.filter(pojoList, isRedPojo);
And you could copy that filtered view into an ArrayList if you want:
List<Pojo> copy = Lists.newArrayList(redPojos);
You'd have to make your type implement a common interface for the check:
public interface Candidate {
public boolean isAddable();
}
The loop then would look like this
List<Candidate> newList = new ArrayList<Candidate>();
for(Candidate pojo:pojoList){
if(pojo.isAddable()){
newList.add(pojo);
}
}
and the Pojo class would have to implement the interface:
public class Pojo implements Candidate {
// ...
@Override
public boolean isAddable() {
return isRed();
}
}
Depending on how often you use it / how many different filters (only red, only green etc.) you are using, it could make sense to create a Filter interface - if it is only to check isRed then it is probably too much code and you are better off with a simple static method.
The good thing about this design is you can use it with any objects that you want to filter (see example with String below).
public static void main(String[] args) {
List<Pojo> originalList = Arrays.asList(new Pojo(true), new Pojo(false), new Pojo(false));
List<Pojo> filteredList = Utils.getFilteredList(originalList, new Filter<Pojo>() {
@Override
public boolean match(Pojo candidate) {
return candidate.isRed();
}
});
System.out.println(originalList.size()); //3
System.out.println(filteredList.size()); //1
//Now with strings
List<String> originalStringList = Arrays.asList("abc", "abd", "def");
List<String> filteredStringList = Utils.getFilteredList(originalStringList, new Filter<String>() {
@Override
public boolean match(String candidate) {
return candidate.contains("a");
}
});
System.out.println(originalStringList.size()); //3
System.out.println(filteredStringList.size()); //2
}
public static class Utils {
public static <T> List<T> getFilteredList(List<T> list, Filter<T> filter) {
List<T> selected = new ArrayList<>();
for (T t : list) {
if (filter.match(t)) {
selected.add(t);
}
}
return selected;
}
}
public static class Pojo {
private boolean isRed;
public Pojo(boolean isRed) {
this.isRed = isRed;
}
public boolean isRed() {
return isRed;
}
}
public interface Filter<T> {
/**
* When passed a candidate object, match returns true if it matches the filter conditions,
* or false if it does not.
* @param candidate the item checked against the filter
* @return true if the item matches the filter criteria
*/
boolean match(T candidate);
}
Make a generic filter interface:
public interface Filter<T>{
public boolean match(T item);
}
Make a method using the filter:
public <T> List<T> getFilteredList(List<T> oldList, Filter<T> filter){
List<T> newlist = new ArrayList<T>();
for(T item:oldList){
if(filter.match(item)){
newlist.add(item);
}
}
return newlist;
}
Put it all together:
List<Pojo> myList = ..
List<Pojo> redList = getFilteredList(myList,new Filter<Pojo>(){
public boolean match(Pojo item){ return item.isRed(); }
});
List<Pojo> blueList = getFilteredList(myList,new Filter<Pojo>(){
public boolean match(Pojo item){ return item.COLOR == Color.BLUE; }
});
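On Java 8 or later, the hand-written Filter interface can probably be dropped in favour of the standard `java.util.function.Predicate`, with lambdas replacing the anonymous classes. A minimal sketch under that assumption; `PredicateFilterSketch` is an illustrative name, and the method mirrors `getFilteredList` above rather than any library API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateFilterSketch {
    // Generic filter: works for any element type and any rule.
    static <T> List<T> getFilteredList(List<T> list, Predicate<? super T> filter) {
        return list.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("abc", "abd", "def");
        List<String> withA = getFilteredList(words, s -> s.contains("a"));
        System.out.println(withA); // [abc, abd]
    }
}
```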
