TreeSet storing duplicate custom objects - java

Hello, I probably overlooked something, but here it goes.
I have a TreeSet<CustomObject> and I do not want to have duplicates in the Set. My CustomObject class looks like this.
class CustomObject implements Comparable<CustomObject> {
String product;
String ean;
public CustomObject(String ean){
this.ean = ean;
// product is getting set via setter and can be null
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
CustomObject that = (CustomObject) o;
return ean.equals(that.ean);
}
@Override
public int hashCode() {
return ean.hashCode();
}
@Override
public int compareTo(CustomObject another) {
if(equals(another)) return 0;
if(product != null && another.product == null) return -1;
if(product == null) return 1;
return product.compareToIgnoreCase(another.product);
}
}
Now I have an add function for new objects.
private final TreeSet<CustomObject> productCatalog;
public void addObject(SomeData tag) {
CustomObject p = new CustomObject(tag.getEan());
if (productCatalog.contains(p)) { // <-- This checks only one entry of the Set.
for (CustomObject temp : productCatalog) {
if (temp.equals(p)) {
p = temp; // I do stuff with that later which is irrelevant here
}
}
} else {
productCatalog.add(p);
}
}
The method productCatalog.contains(p) calls the compareTo method from the Comparable interface to do the comparison. The issue is that it apparently checks only one object in the set (the last one, I think). So what happens is that only one unique CustomObject entry is present.
This is the scenario when I follow it with the debugger:
1. productCatalog.contains(p)
2. calls compareTo
3. calls equals to check if ean.equals(that.ean)
4. returns true once, but false every other time, since it only checks the last object
How can I get it to check not only one object in step 4, but all the objects present in the Set? What am I missing?
Thx!
EDIT: Here is some sample data. For simplicity, SomeData tag is basically a String.
First run:
addObject("Ean1") // success added
addObject("Ean2") // success added
addObject("Ean3") // success added
addObject("Ean4") // success added
Everything gets added into the TreeSet.
Second run:
addObject("Ean1") // failed already in the map
addObject("Ean2") // failed already in the map
addObject("Ean3") // failed already in the map
addObject("Ean5") // success added
addObject("Ean4") // success added
addObject("Ean4") // success added
For testing purposes, I manually set product names depending on the String ean.
public CustomObject(String ean){
this.ean = ean;
switch(ean){
case "Ean1": product = "TestProduct"; break;
case "Ean2": product = "ProductTest";break;
case "Ean3": product = "Product";break;
}
}
The TreeSet acts as a cache.
Edit2: This is how I solved it.
for (CustomObject temp : productCatalog) {
if (temp.equals(p)) {
p = temp; // I do stuff with that later which is irrelevant here
}
}
I removed the if statement with the contains method, since the comparison behind it would always return 1 or -1 in my special case. Now I simply iterate over the Set and use the equals method directly, since the TreeSet uses compareTo() when checking the elements in the Set.
The Java Docs state the following
Note that the ordering maintained by a set (whether or not an explicit
comparator is provided) must be consistent with equals if it is to
correctly implement the Set interface. (See Comparable or Comparator
for a precise definition of consistent with equals.) This is so
because the Set interface is defined in terms of the equals operation,
but a TreeSet instance performs all element comparisons using its
compareTo (or compare) method, so two elements that are deemed equal
by this method are, from the standpoint of the set, equal. The
behavior of a set is well-defined even if its ordering is inconsistent
with equals; it just fails to obey the general contract of the Set
interface.

The main problem:
compareTo does return 1 if both product and other.product are null. This is wrong because they are actually equal. You probably forgot to set product names for the higher ean values, like "Ean4" and "Ean5".
Old answer:
Your implementations of equals and compareTo do not fit together.
equals works on the ean and compareTo on the product. This only works if you implicitly assume that equal ean values imply equal products. If this is not true in your test cases, the result will be wrong.
In either case, it is not a good implementation, because it can lead to a < b and b < c while a "equals" c.
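A minimal sketch of a compareTo that is consistent with equals, assuming the ean alone identifies an entry. The names Item and catalogOf are hypothetical stand-ins, not the asker's code; sorting by product name would then be handled separately, for example with a Comparator when the catalog is displayed.

```java
import java.util.TreeSet;

class ConsistentOrderingDemo {
    // Hypothetical stand-in for the question's CustomObject.
    static final class Item implements Comparable<Item> {
        final String ean;
        String product; // may stay null; it no longer affects the ordering

        Item(String ean) {
            this.ean = ean;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Item && ean.equals(((Item) o).ean);
        }

        @Override
        public int hashCode() {
            return ean.hashCode();
        }

        // Ordering uses the same field as equals, so compareTo returns 0
        // exactly when equals returns true.
        @Override
        public int compareTo(Item other) {
            return ean.compareTo(other.ean);
        }
    }

    // Builds a catalog from the given EANs; the TreeSet rejects duplicates.
    static TreeSet<Item> catalogOf(String... eans) {
        TreeSet<Item> catalog = new TreeSet<>();
        for (String ean : eans) {
            catalog.add(new Item(ean));
        }
        return catalog;
    }

    public static void main(String[] args) {
        TreeSet<Item> catalog = catalogOf("Ean1", "Ean2", "Ean3", "Ean5", "Ean4", "Ean4");
        System.out.println(catalog.size());                     // 5
        System.out.println(catalog.contains(new Item("Ean4"))); // true
    }
}
```

With this ordering, contains(p) can be trusted again, and the manual loop is only needed to fetch the cached instance.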

Related

arraylist.contains() method returns false [duplicate]

Say I create one object and add it to my ArrayList. If I then create another object with exactly the same constructor input, will the contains() method evaluate the two objects to be the same? Assume the constructor doesn't do anything funny with the input, and the variables stored in both objects are identical.
ArrayList<Thing> basket = new ArrayList<Thing>();
Thing thing = new Thing(100);
basket.add(thing);
Thing another = new Thing(100);
basket.contains(another); // true or false?
class Thing {
public int value;
public Thing (int x) {
value = x;
}
equals (Thing x) {
if (x.value == value) return true;
return false;
}
}
Is this how the class should be implemented to have contains() return true?
ArrayList implements the List Interface.
If you look at the Javadoc for List at the contains method you will see that it uses the equals() method to evaluate if two objects are the same.
I think the right implementation would be:
public class Thing
{
public int value;
public Thing (int x)
{
this.value = x;
}
@Override
public boolean equals(Object object)
{
boolean sameSame = false;
if (object != null && object instanceof Thing)
{
sameSame = this.value == ((Thing) object).value;
}
return sameSame;
}
}
The ArrayList uses the equals method implemented in the class (your case Thing class) to do the equals comparison.
Generally you should also override hashCode() each time you override equals(). ArrayList.contains() does not call it, but hash-based collections such as HashSet and HashMap use hashCode() to decide which 'bucket' an object goes into, so any two objects for which equals() returns true must return the same hashCode() value. The default hashCode() inherited from Object is identity-based, so two distinct but equal objects will usually get different hash codes, and lookups in those collections will fail. I do remember a bunch of times when my code failed because I forgot to override hashCode() though. :)
It uses the equals method on the objects. So unless Thing overrides equals and uses the variables stored in the objects for comparison, it will not return true on the contains() method.
class Thing {
public int value;
public Thing (int x) {
value = x;
}
equals (Thing x) {
if (x.value == value) return true;
return false;
}
}
You must write:
class Thing {
public int value;
public Thing (int x) {
value = x;
}
public boolean equals (Object o) {
if (!(o instanceof Thing)) return false; // also rejects null
Thing x = (Thing) o;
return x.value == value;
}
}
Now it works ;)
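The difference between the two versions is overloading versus overriding: equals(Thing) adds a second method next to Object.equals(Object), and ArrayList.contains only ever calls the latter. A small sketch that shows this; the class names Overloaded and Overridden are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class EqualsOverloadDemo {
    // equals(Overloaded) is an overload; Object.equals(Object) is inherited unchanged.
    static class Overloaded {
        final int value;

        Overloaded(int value) {
            this.value = value;
        }

        public boolean equals(Overloaded other) {
            return other != null && other.value == value;
        }
    }

    // equals(Object) is a real override, so collections will call it.
    static class Overridden {
        final int value;

        Overridden(int value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Overridden && ((Overridden) o).value == value;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(value);
        }
    }

    static boolean listFindsOverloaded() {
        List<Overloaded> basket = new ArrayList<>();
        basket.add(new Overloaded(100));
        return basket.contains(new Overloaded(100));
    }

    static boolean listFindsOverridden() {
        List<Overridden> basket = new ArrayList<>();
        basket.add(new Overridden(100));
        return basket.contains(new Overridden(100));
    }

    public static void main(String[] args) {
        System.out.println(listFindsOverloaded()); // false
        System.out.println(listFindsOverridden()); // true
    }
}
```

Putting @Override on the method is a cheap safeguard: the compiler then rejects the accidentally overloaded signature.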
Just wanted to note that the following implementation is wrong when value is not a primitive type:
public class Thing
{
public Object value;
public Thing (Object x)
{
this.value = x;
}
@Override
public boolean equals(Object object)
{
boolean sameSame = false;
if (object != null && object instanceof Thing)
{
sameSame = this.value == ((Thing) object).value;
}
return sameSame;
}
}
In that case I propose the following:
public class Thing {
public Object value;
public Thing (Object x) {
value = x;
}
@Override
public boolean equals(Object object) {
if (object != null && object instanceof Thing) {
Thing thing = (Thing) object;
if (value == null) {
return (thing.value == null);
}
else {
return value.equals(thing.value);
}
}
return false;
}
}
Other posters have addressed the question about how contains() works.
An equally important aspect of your question is how to properly implement equals(). And the answer to this is really dependent on what constitutes object equality for this particular class. In the example you provided, if you have two different objects that both have x=5, are they equal? It really depends on what you are trying to do.
If you are only interested in object equality, then the default implementation of .equals() (the one provided by Object) uses identity only (i.e. this == other). If that's what you want, then just don't implement equals() on your class (let it inherit from Object). The code you wrote, while kind of correct if you are going for identity, would never appear in a real class b/c it provides no benefit over using the default Object.equals() implementation.
If you are just getting started with this stuff, I strongly recommend the Effective Java book by Joshua Bloch. It's a great read, and covers this sort of thing (plus how to correctly implement equals() when you are trying to do more than identity based comparisons)
Shortcut from JavaDoc:
boolean contains(Object o)
Returns true if this list contains the specified element. More formally,
returns true if and only if this list contains at least one element e such
that (o==null ? e==null : o.equals(e))
record overrides equals
You said:
another object with exactly the same constructor input
… and …
Assume the constructor doesn't do anything funny with the input, and the variables stored in both objects are identical.
As other Answers explain, you must override the Object#equals method for List#contains to work.
In Java 16+, the record feature automatically overrides that method for you.
A record is a brief way to write a class whose main purpose is to communicate data transparently and immutably. By default, you simply declare the member fields. The compiler implicitly creates the constructor, getters, equals & hashCode, and toString.
The logic of equals by default is to compare each and every member field of one object to the counterpart in another object of the same class. Likewise, the default implementations of hashCode and toString methods also consider each and every member field.
record Thing( int amount ) {} ;
That’s it, that is all the code you need for a fully-functioning read-only class with none of the usual boilerplate code.
Example usage.
Thing x = new Thing( 100 ) ;
Thing y = new Thing( 100 ) ;
boolean parity = x.equals( y ) ;
When run.
parity = true
Back to your List#contains question.
Thing x = new Thing( 100 );
List < Thing > things =
List.of(
new Thing( 100 ) ,
new Thing( 200 ) ,
new Thing( 300 )
);
boolean foundX = things.contains( x );
When run.
foundX = true
Bonus feature: A record can be declared locally, within a method. Or like a conventional class you can declare a record as a nested class, or as a separate class.

HashSet.contains(object) returns false for instance modified after insertion

According to the JavaDoc of java.util.HashSet.contains(), the method does the following:
Returns true if this set contains the specified element. More
formally, returns true if and only if this set contains an element e
such that (o==null ? e==null : o.equals(e)).
However, this does not seem to work for the following code:
public static void main(String[] args) {
HashSet<DemoClass> set = new HashSet<DemoClass>();
DemoClass toInsert = new DemoClass();
toInsert.v1 = "test1";
toInsert.v2 = "test2";
set.add(toInsert);
toInsert.v1 = null;
DemoClass toCheck = new DemoClass();
toCheck.v1 = null;
toCheck.v2 = "test2";
System.out.println(set.contains(toCheck));
System.out.println(toCheck.equals(toInsert));
}
private static class DemoClass {
String v1;
String v2;
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((v1 == null) ? 0 : v1.hashCode());
result = prime * result + ((v2 == null) ? 0 : v2.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
DemoClass other = (DemoClass) obj;
if (v1 == null) {
if (other.v1 != null)
return false;
} else if (!v1.equals(other.v1))
return false;
if (v2 == null) {
if (other.v2 != null)
return false;
} else if (!v2.equals(other.v2))
return false;
return true;
}
}
prints out:
false
true
So although the equals method returns true, HashSet.contains() returns false.
I guess this is because I modified the toInsert instance AFTER it was added to the collection.
However, this is not documented anywhere (or at least I wasn't able to find it). Also, according to the documentation referenced above, the equals method should be used, but that does not seem to be the case.
When an object is stored in a HashSet, it's put in a data structure that's easily (read: efficiently) searchable by the object's hashCode(). Modifying an object may change its hashCode() (depending on how you implemented it), but does not update its location in the HashSet, as the object has no way of knowing it's contained in one.
There are a couple of things you can do here:
Modify the implementation of hashCode() so it isn't affected by the field you're changing. Assuming this field is important to the object's state, and participates in the equals(Object) method, this is somewhat of a code smell, and should probably be avoided.
Before modifying the object, remove it from the set, and then re-add it once you're done modifying it:
Set<DemoClass> mySet = ...;
DemoClass demo = ...;
boolean wasInSet = mySet.remove(demo);
demo.setV1("new v1");
demo.setV2("new v2");
if (wasInSet) {
mySet.add(demo);
}
HashSet and HashMap use both the hashCode and equals methods to locate an object in their inner structure. hashCode is used to find the correct bucket, and then equals is consulted to distinguish between different objects with the same hashCode, as the latter is not guaranteed to be unique. In almost all cases it is an extremely bad idea to modify an object that serves as a key in a HashMap or is stored in a HashSet. If those modifications change either the hashCode or the semantics of the equals method, your object will not be found.
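The effect can be reproduced in a few lines. This sketch uses a hypothetical Key class with one mutable field and assumes the hash code actually changes on mutation, as it does for the two strings used here.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutatedKeyDemo {
    // Hypothetical mutable element whose hash code depends on its field.
    static final class Key {
        String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && Objects.equals(name, ((Key) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Mutating in place: the element stays filed under its old hash code.
    static boolean foundAfterInPlaceChange() {
        Set<Key> set = new HashSet<>();
        Key key = new Key("test1");
        set.add(key);
        key.name = "test2";
        return set.contains(key);
    }

    // Remove, mutate, re-add: the element is filed again under its new hash code.
    static boolean foundAfterRemoveAndReAdd() {
        Set<Key> set = new HashSet<>();
        Key key = new Key("test1");
        set.add(key);
        set.remove(key);
        key.name = "test2";
        set.add(key);
        return set.contains(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterInPlaceChange());  // false
        System.out.println(foundAfterRemoveAndReAdd()); // true
    }
}
```

Note that in the first case even the very same instance is not found, because the lookup starts from the new hash code.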
This is by-design behavior.
HashSet uses hashes to identify objects it holds.
So, if you change an object after it was placed into collection, it may be unable to find it.
You should either hold immutable objects only, or make mutable only the part of an object that does not affect the hash.
I think it is better to use a HashMap, which clearly separates the mutable and immutable parts.
It is quite clear: you are changing toInsert.v1 after adding the object to the set, and because DemoClass computes its hashCode from the v1 and v2 attributes, the set will not find the element under its changed hashCode.

How to compare private data fields of objects to ensure they are the same (Java)?

Just to start off here, this is homework/a lab and I'm looking for advice. I am developing a very small program that is essentially a counter with a min/max value constraint, one method that pushes the value up, and another that rolls the value back to zero. So, the private data fields I have for my Counter class are:
private int minimum;
private int maximum;
private int currentValue;
The trouble I am having here is with a method that compares my Counter Class to another theoretical object based off the same class. In this case, we're looking to see that the data fields between the two objects are the same. I have researched several ways of doing this including using reflections and the famous EqualsBuilder, but am having trouble implementing each.
Here's the code that they've given me.
public boolean equals(Object otherObject)
{
boolean result = true;
if (otherObject instanceof Counter)
{
}
return result;
}
Assuming your equals method is in the Counter class, it has access to all the private members of that class, even if they are members of a different instance of that class.
public boolean equals(Object otherObject)
{
if (otherObject instanceof Counter)
{
Counter ocounter = (Counter) otherObject;
if (this.minimum != ocounter.minimum)
return false;
...
} else {
return false;
}
return true;
}
Implementing the equals-method can be a real pain, especially if you have a lot of properties in your class.
The JavaDoc for the equals-method states
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
And, if you check the JavaDoc for the hashCode-method.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
Therefore, it is typically recommended that you implement both methods (equals and hashCode). The following shows one way of doing this that is based on the java.util.Objects-class that came with Java 7. The method Objects.equals(Object, Object) handles null checks which makes the code simpler and easier to read. Furthermore, the hash-method is a convenient way of creating values that can be used with hashCode.
So, to answer your question. In order to access the attributes of your other object, simply perform a type cast. After that you can access the other object's private properties. But, remember to always do this after you have checked the type using instanceof.
@Override
public boolean equals(Object other) {
if (other instanceof Counter) { // Always check the type to be safe
// Cast to a Counter-object
final Counter c = (Counter) other;
// Now, you can access the private properties of the other object
return Objects.equals(minimum, c.minimum) &&
Objects.equals(maximum, c.maximum) &&
Objects.equals(currentValue, c.currentValue);
}
return false; // If it is not the same type, always return false
}
@Override
public int hashCode() {
return Objects.hash(currentValue, maximum, minimum);
}
As equals is a method of Counter, you can access all the private fields of Counter, so you can do something like this:
if (otherObject instanceof Counter)
{
if (this.minimum != ((Counter) otherObject).minimum) {
result = false;
}
// [...]
}
Assuming your class is called Counter and you created getters for all the private fields (which you should do):
@Override
public boolean equals(Object other) {
boolean result = false;
if (other instanceof Counter) {
Counter c = (Counter) other;
result = (this.getMinimum() == c.getMinimum() &&
this.getMaximum() == c.getMaximum() &&
this.getCurrentValue() == c.getCurrentValue());
}
return result;
}

Java HashSet contains Object

I made my own class with an overridden equals method which just checks, if the names (attributes in the class) are equal. Now I store some instances of that class in a HashSet so that there are no instances with the same names in the HashSet.
My question: How is it possible to check whether the HashSet contains such an object? .contains() won't work in that case, because it works with the .equals() method. I want to check whether it is really the same object.
edit:
package testprogram;
import java.util.HashSet;
import java.util.Set;
public class Example {
private static final Set<Example> set = new HashSet<Example>();
private final String name;
private int example;
public Example(String name, int example) {
this.name = name;
this.example = example;
set.add(this);
}
public boolean isThisInList() {
return set.contains(this);
//will return true if this is just equal to any instance in the list
//but it should not
//it should return true if the object is really in the list
}
public boolean remove() {
return set.remove(this);
}
//Override equals and hashCode
}
Sorry, my english skills are not very well. Please feel free to ask again if you don't understand what I mean.
In your situation, the only way to tell if a particular instance of an object is contained in the HashSet, is to iterate the contents of the HashSet, and compare the object identities ( using the == operator instead of the equals() method).
Something like:
boolean isObjectInSet(Object object, Set<? extends Object> set) {
boolean result = false;
for(Object o : set) {
if(o == object) {
result = true;
break;
}
}
return result;
}
The way to check if objects are the same object is by comparing them with == to see that the object references are equal.
Kind Greetings,
Frank
You will have to override the hashCode method also.
try this..
Considering only one property 'name' of your Objects to maintain uniqueness.
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + (name == null ? 0 : name.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
User other = (User) obj;
if (name == null) {
if (other.name != null) {
return false;
}
} else if (!name.equals(other.name)) {
return false;
}
return true;
}
I made my own class with an overridden equals method which just checks, if the names (attributes in the class) are equal.
This breaks the contract of .equals, and you must never do it no matter how convenient it seems.
Instead, if you want to index and look up elements by a certain attribute such as the name, use a HashMap<Name, YourType> to find them. Alternatively, use a TreeSet and pass it a Comparator that compares the name only. You can then remove the incorrect equals method.
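A rough sketch of the two alternatives named above, using a hypothetical Person class that keeps the default identity-based equals. The class, its fields, and the helper names are assumptions made for the illustration.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class IndexByNameDemo {
    // Hypothetical element type; equals and hashCode are deliberately not overridden.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // A TreeSet whose comparator looks at the name only, so names stay unique.
    static TreeSet<Person> uniqueByName(Person... people) {
        TreeSet<Person> set = new TreeSet<>(Comparator.comparing((Person p) -> p.name));
        for (Person p : people) {
            set.add(p); // ignored if an element with the same name is already present
        }
        return set;
    }

    // A HashMap keyed by name; putIfAbsent keeps the first instance per name.
    static Map<String, Person> indexByName(Person... people) {
        Map<String, Person> byName = new HashMap<>();
        for (Person p : people) {
            byName.putIfAbsent(p.name, p);
        }
        return byName;
    }

    public static void main(String[] args) {
        Person first = new Person("Ann", 30);
        Person second = new Person("Ann", 41);
        System.out.println(uniqueByName(first, second).size());             // 1
        System.out.println(indexByName(first, second).get("Ann") == first); // true
    }
}
```

With the map, comparing the stored value against a candidate with == answers whether that exact instance is the one that was kept.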
There are then three ways if you want to find objects by reference equality:
Your objects have no inherent or useful notion of equality.
Don't implement equals. Leave it to its default. You can then use a HashSet to look for reference equality, and a HashMap or TreeSet to index them by any specific attributes.
Your objects do have a useful, universal notion of equality, but you want to find equivalent instances efficiently anyways.
This is almost never the case. However, you can use e.g. an Apache IdentityMap.
You don't care about efficiency.
Use a for loop and == every element.
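For the second case, the JDK itself offers java.util.IdentityHashMap, which compares keys with == instead of equals. Wrapping it with Collections.newSetFromMap gives an identity-based set. This is a substitute for the Apache class mentioned above, not the same API, and the Named class and newIdentitySet helper below are hypothetical.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Objects;
import java.util.Set;

class IdentitySetDemo {
    // Hypothetical class whose equals compares by name only.
    static final class Named {
        final String name;

        Named(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Named && Objects.equals(name, ((Named) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Returns a set that treats two elements as the same only if they are the same reference.
    static <T> Set<T> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<>());
    }

    public static void main(String[] args) {
        Named stored = new Named("a");
        Named lookalike = new Named("a"); // equal by name, but a different instance

        Set<Named> instances = newIdentitySet();
        instances.add(stored);

        System.out.println(instances.contains(stored));    // true
        System.out.println(instances.contains(lookalike)); // false
    }
}
```

Such a set does not enforce unique names, so it would be kept alongside the name-based collection rather than replace it.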
HashSet contains uses the equals method to determine if the object is contained - and duplicates are not kept within the HashSet.
Assuming your equals and hashcode are only using a name field...
HashSet<MyObject> objectSet = new HashSet<MyObject>();
MyObject name1Object = new MyObject("name1");
objectSet.add(new MyObject("name1"));
objectSet.add(name1Object);
objectSet.add(new MyObject("name2"));
//HashSet now contains 2 objects, name1Object and the new name2 object
//HashSets do not hold duplicate objects (name1Object and the new object with name1 would be considered duplicates)
objectSet.contains(new MyObject("name1")) // returns true
objectSet.contains(name1Object) // returns true
objectSet.contains(new MyObject("name2")) // returns true
objectSet.contains(new MyObject("name3")) // returns false
If you wanted to check if the object in the HashSet is the exact object you are comparing you would have to pull it out and compare it directly using ==
for (MyObject o : objectSet)
{
if (o == name1Object)
{
return true;
}
}
If you do this a lot for specific objects, it might be easier to use a HashMap so you don't have to iterate through the set to grab a specific named object. It may be worth looking into, because then you could do something like this:
(objectMap.get("name") == myNameObject) // with a HashMap<String, MyNameObject> where "name" is the key string.

Hash function for a generic object

How do you come up with a hash function for a generic object? There is the constraint that two objects need to have the same hash value if they are "equal" as defined by the user. How does Java accomplish this?
I just found the answer to my own question. The way Java does it is that it defines a hashCode for every object, and by default that hashCode is based on object identity, so two references are only guaranteed to have the same hash code if they refer to the same object. So when the client of the hashtable overrides the equals() method for a class, they should also override the method that computes the hash code, such that if a.equals(b) is true, then a.hashCode() also equals b.hashCode(). This way, it is assured that equal objects have the same hash code.
First, basically you define the hash function of a class by overriding the hashCode() method. The Javadoc states:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
So the more important question is: What makes two of your objects equal? Or vice versa: What properties make your objects unique? If you have an answer to that, create an equals() method that compares all of the properties and returns true if they're all the same and false otherwise.
The hashCode() method is a bit more involved, I would suggest that you do not create it yourself but let your IDE do it. In Eclipse, you can select Source and then Generate hashCode() and equals() from the menu. This also guarantees that the requirements from above hold.
Here is a small (and simplified) example where the two methods have been generated using Eclipse. Notice that I chose not to include the city property since the zipCode already uniquely identifies the city within a country.
public class Address {
private String streetAndNumber;
private String zipCode;
private String city;
private String country;
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((country == null) ? 0 : country.hashCode());
result = prime * result
+ ((streetAndNumber == null) ? 0 : streetAndNumber.hashCode());
result = prime * result + ((zipCode == null) ? 0 : zipCode.hashCode());
return result;
}
@Override
public boolean equals(final Object obj) {
if(this == obj)
return true;
if(obj == null)
return false;
if(!(obj instanceof Address))
return false;
final Address other = (Address) obj;
if(country == null) {
if(other.country != null)
return false;
}
else if(!country.equals(other.country))
return false;
if(streetAndNumber == null) {
if(other.streetAndNumber != null)
return false;
}
else if(!streetAndNumber.equals(other.streetAndNumber))
return false;
if(zipCode == null) {
if(other.zipCode != null)
return false;
}
else if(!zipCode.equals(other.zipCode))
return false;
return true;
}
}
Java doesn't do that. If the hashCode() and equals() are not explicitly implemented, JVM will generate different hashCodes for meaningfully equal instances. You can check Effective Java by Joshua Bloch. It's really helpful.
Several options:
read Effective Java, by Joshua Bloch. It contains a good algorithm for hash codes
let your IDE generate the hashCode method
Java SE 7 and greater: use Objects.hash
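The third option could look roughly like this for a class with a few nullable fields. The Address fields follow the Eclipse-generated example earlier in this thread; treat it as a sketch rather than the one correct way.

```java
import java.util.Objects;

class ObjectsHashDemo {
    static final class Address {
        private final String streetAndNumber;
        private final String zipCode;
        private final String country;

        Address(String streetAndNumber, String zipCode, String country) {
            this.streetAndNumber = streetAndNumber;
            this.zipCode = zipCode;
            this.country = country;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Address)) return false;
            Address other = (Address) obj;
            // Objects.equals is null-safe, so no explicit null checks are needed.
            return Objects.equals(streetAndNumber, other.streetAndNumber)
                    && Objects.equals(zipCode, other.zipCode)
                    && Objects.equals(country, other.country);
        }

        // Hash exactly the fields that equals compares.
        @Override
        public int hashCode() {
            return Objects.hash(streetAndNumber, zipCode, country);
        }
    }

    public static void main(String[] args) {
        Address a = new Address("Main St 1", "12345", "DE");
        Address b = new Address("Main St 1", "12345", "DE");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```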
The class java.lang.Object cheats. It defines equality (as is determined by equals) as being object identity (as can be determined by ==). So, unless you override equals in your subclass, two instances of your class are "equal", if they happen to be the same object.
The associated hash code for this is implemented by the system function System.identityHashCode (which is no longer really based on object addresses -- was it ever? -- but can be thought of as being implemented this way).
If you override equals, then this implementation of hashCode no longer makes sense.
Consider the following example:
class Identifier {
private final int lower;
private final int upper;
public boolean equals(Object any) {
if (any == this) return true;
else if (!(any instanceof Identifier)) return false;
else {
final Identifier id = (Identifier)any;
return lower == id.lower && upper == id.upper;
}
}
}
Two instances of this class are considered equal, if their "lower" and "upper" members have the same values. Since equality is now determined by object members, we need to define hashCode in a compatible way.
public int hashCode() {
return lower * 31 + upper; // possible implementation, maybe not too sophisticated though
}
As you can see, we use the same fields in hashCode which we also use when we determine equality. It is generally a good idea to base the hash code on all members, which are also considered when comparing for equality.
Consider this example instead:
class EmailAddress {
private final String mailbox;
private final String displayName;
public boolean equals(Object any) {
if (any == this) return true;
else if (!(any instanceof EmailAddress)) return false;
else {
final EmailAddress id = (EmailAddress)any;
return mailbox.equals(id.mailbox);
}
}
}
Since here, equality is only determined by the mailbox member, the hash code should also only be based on that member:
public int hashCode() {
return mailbox.hashCode();
}
Hashing of an object is defined by the hashCode() method, which the developer can override.
Prime numbers such as 31 are commonly used in hand-written and IDE-generated hashCode() implementations.
If equals() and hashCode() aren't overridden, the class inherits the identity-based implementations from java.lang.Object.
