Which method does Set.removeAll() use underneath: equals or compareTo? - java

Consider the code:
class A {
private int i;
@Override
public boolean equals(Object t) {
if (this == t)
return true;
if (!(t instanceof A))
return false;
// cast is needed: Object has no field i
return this.i == ((A) t).i;
}
}
Map<String,A> orig;
Map<String,B> dup;
I am trying to do this
orig.entrySet().removeAll(dup.entrySet());
I see that the equals method is called; is this always true, or might it call compareTo instead?

Yes, it calls equals(). compareTo() could only be used if the Set knew that it contained Comparable objects (sorted sets, for instance, might possibly do this).

It depends on the implementation.
For instance, a HashSet will use hashCode and equals. A TreeSet will probably use compareTo. Ultimately, so long as your types behave appropriately it shouldn't matter.
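A minimal sketch of that difference, assuming a TreeSet built with the JDK's case-insensitive String comparator. The class and method names are made up for illustration and are not from the question:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetLookupDemo {
    // HashSet locates elements with hashCode() and then equals().
    static boolean hashSetFinds(String stored, String probe) {
        Set<String> set = new HashSet<>();
        set.add(stored);
        return set.contains(probe);
    }

    // TreeSet locates elements with its comparator; a result of 0 means "same element".
    static boolean treeSetFinds(String stored, String probe) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.add(stored);
        return set.contains(probe);
    }

    public static void main(String[] args) {
        // false: "hello".equals("HELLO") is false
        System.out.println(hashSetFinds("hello", "HELLO"));
        // true: the comparator returns 0 for these two strings
        System.out.println(treeSetFinds("hello", "HELLO"));
    }
}
```

The same probe is found in one set and missed in the other, which is why it only "shouldn't matter" when compareTo (or the comparator) agrees with equals.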

The TreeSet uses the compareTo, try this:
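import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class A {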
private int i;
A(int i) {
this.i = i;
}
@Override
public boolean equals(Object t) {
if (this == t)
return true;
if (!( t instanceof A))
return false;
return (this.i == ((A)t).i);
}
public static void main(String[] args) {
List<A> remove = Arrays.asList(new A(123), new A(789));
Set<A> set = new TreeSet<A>(new Comparator<A>() {
@Override
public int compare(A o1, A o2) {
return Integer.compare(o1.i, o2.i); // avoids the overflow that o1.i - o2.i can cause
// return 0; // everything gets removed
}
});
set.add(new A(123));
set.add(new A(456));
set.add(new A(789));
set.add(new A(999));
set.removeAll(remove);
for (A a : set) {
System.out.println(a.i);
}
System.out.println("done");
}
}
Make the Comparator always return 0 and everything will be removed! The same happens if you implement Comparable instead of using a Comparator.
The TreeSet is based on a TreeMap which uses the compareTo in getEntry.
In the Javadoc of the TreeSet you can (finally) read:
...the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method...

http://java.sun.com/j2se/1.5.0/docs/api/java/util/Collection.html
"Implementations are free to implement optimizations whereby the equals invocation is avoided, for example, by first comparing the hash codes of the two elements."
It will most likely use equals, but given the statement above, you cannot fully rely on equals() being called. Remember that it's always a good idea to override hashCode() whenever you override equals().

Some Set implementations rely on hashCode (e.g. HashSet). That is why you should always override hashCode too when you override equals.

The only implementation within the Java library that I am aware of that won't do this is IdentityHashMap. TreeMap for instance does not have an appropriate Comparator.

I don't see where compareTo is used; the javadoc for remove() for the Map interface says "More formally, if this map contains a mapping from key k to value v such that (key==null ? k==null : key.equals(k)), that mapping is removed." While for the Set interface it similarly says "More formally, removes an element e such that (o==null ? e==null : o.equals(e)), if the set contains such an element."
Note that removeAll()'s javadoc doesn't say how it operates, which means, as others have said, that it's an implementation detail.
In Sun's Java, according to Bloch in his Effective Java (if I remember correctly), it iterates over the collection and calls remove(), but he stresses that you must never assume that's how it's always done.
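For the entrySet() case in the question, here is a small sketch, assuming both maps are HashMaps and using a hypothetical value class Val that is not in the original code. In the OpenJDK versions I have looked at, an entry is removed only when the key matches and the value's equals() returns true, but as noted above that should be treated as an implementation detail:

```java
import java.util.HashMap;
import java.util.Map;

class EntrySetRemoveAllDemo {
    // Hypothetical value type with value-based equals() and hashCode().
    static final class Val {
        final int i;
        Val(int i) { this.i = i; }
        @Override public boolean equals(Object o) {
            return o instanceof Val && ((Val) o).i == i;
        }
        @Override public int hashCode() { return Integer.hashCode(i); }
    }

    static Map<String, Val> removeCommonEntries() {
        Map<String, Val> orig = new HashMap<>();
        orig.put("a", new Val(1));
        orig.put("b", new Val(2));
        orig.put("c", new Val(3));

        Map<String, Val> dup = new HashMap<>();
        dup.put("a", new Val(1));   // same key, equal value: entry is removed from orig
        dup.put("b", new Val(99));  // same key, different value: entry stays in orig

        orig.entrySet().removeAll(dup.entrySet());
        return orig;
    }

    public static void main(String[] args) {
        // "b" and "c" remain; "a" was removed because key and value both matched
        System.out.println(removeCommonEntries().keySet());
    }
}
```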

Related

How do I change the Java Set's retainall method to use the equals method instead of the == operator?

I am trying to test whether two HashSets of Strings contain identical Strings. The retainAll() method of Java Sets (which, as I understand it, implements the Collection interface) is a good way to check the intersection of two Sets. However, this method seems to test for equality using the == style check for whether they are references to the same memory object, rather than using the String's equals() method to check whether the contents are the same. Is there a way to get something that works like retainAll but that uses the equals() method?
I am trying to write code that checks whether a String contains a substring over a certain length from a certain other String. My strategy was to create a HashSet of each String containing all substrings of that length, then check whether the Sets contain Strings in common.
My current solution was to create my own static method that does what I want the retainAll method to do.
static boolean containsEqualElement(Set SetOne, Set SetTwo) {
Iterator it = SetOne.iterator();
while (it.hasNext()) {
Object thisComp = it.next();
Iterator it2 = SetTwo.iterator();
while (it2.hasNext()) {
if (it2.next().equals(thisComp)) {
return true;
}
}
}
return false;
}
I'm not sure how the efficiency of this method compares to the retainAll method.
This statement from your question:
However, this method seems to test for equality using the == style check for whether they are references to the same memory object, rather than using the String's equals() method to check whether the contents are the same
is wrong. retainAll does use contains, which in turn uses equals.
I don't fully understand your use case, but I think you might find the Collections.disjoint method more useful than retainAll. From the docs:
Returns true if the two specified collections have no elements in common.
You could use it like this:
if (!Collections.disjoint(setOne, setTwo)) {
// sets have at least one element in common
}
I'm proposing this method because retainAll modifies the set on which it's invoked: it removes all the elements from this collection that are not contained in the argument collection. And from your code, it doesn't seem like you want this behavior.
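Applied to the substring problem described in the question, this could look roughly like the sketch below. The helper names substringsOfLength and sharesSubstring are my own invention, and I am assuming the length n is positive:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SharedSubstringDemo {
    // Collects every substring of s that has exactly n characters.
    static Set<String> substringsOfLength(String s, int n) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + n <= s.length(); i++) {
            result.add(s.substring(i, i + n));
        }
        return result;
    }

    // True if a and b have at least one length-n substring in common.
    // Neither set is modified, unlike with retainAll.
    static boolean sharesSubstring(String a, String b, int n) {
        return !Collections.disjoint(substringsOfLength(a, n), substringsOfLength(b, n));
    }

    public static void main(String[] args) {
        System.out.println(sharesSubstring("hello world", "yellow", 4)); // true ("ello")
        System.out.println(sharesSubstring("abcdef", "uvwxyz", 3));      // false
    }
}
```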
Actually, retainAll uses contains, which itself uses equals, at least in the standard version. Maybe you actually used an IdentityHashMap instead, which would indeed use the memory reference for equality, but that would be because you asked for it.
public boolean retainAll(Collection<?> c) {
boolean modified = false;
Iterator<E> e = iterator();
while (e.hasNext()) {
if (!c.contains(e.next())) {
e.remove();
modified = true;
}
}
return modified;
}
public boolean contains(Object o) {
Iterator<E> e = iterator();
if (o==null) {
while (e.hasNext())
if (e.next()==null)
return true;
} else {
while (e.hasNext())
if (o.equals(e.next()))
return true;
}
return false;
}
Next time, please consider using the debugger to double-check (even code from the JDK), or search for it (for example, "HashSet.retainAll source code"); you would find something like this: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashSet.java
This is what I did to respond to your question.
If you check OpenJDK9 source code, you can see that retainAll() uses AbstractCollection.contains(Object o):
public boolean retainAll(Collection<?> c) {
Objects.requireNonNull(c);
boolean modified = false;
Iterator<E> it = iterator();
while (it.hasNext()) {
if (!c.contains(it.next())) {
it.remove();
modified = true;
}
}
return modified;
}
Documentation of contains() says:
Returns true if this collection contains the specified element. More formally, returns true if and only if this collection contains at least one element e such that (o==null ? e==null : o.equals(e)).
Hence retainAll() is based on equals() check, which is what you want.
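A quick way to convince yourself, sketched with Strings that are deliberately created as distinct objects (the class name is just for illustration):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class RetainAllEqualsDemo {
    static Set<String> intersection() {
        // new String(...) forces distinct objects, so "foo" == "foo" would be false here.
        Set<String> first = new HashSet<>(Arrays.asList(new String("foo"), new String("bar")));
        Set<String> second = new HashSet<>(Arrays.asList(new String("foo"), new String("baz")));
        first.retainAll(second); // keeps only the elements for which second.contains(...) is true
        return first;
    }

    public static void main(String[] args) {
        System.out.println(intersection()); // prints [foo]
    }
}
```

If retainAll compared references, the result would be empty; instead "foo" survives because the two Strings are equal by content.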

Equal elements and Tree set

I created a static nested class that implements Comparable and overrides Object.equals such that e1.compareTo(e2)==0 and e1.equals(e2)==true are not synonymous.
Then I add the objects to a TreeSet and a HashSet respectively, using the add method.
I expected that inserting multiple such objects into either a TreeSet or a HashSet would succeed, since both claim to rely on equals to determine uniqueness, but I found that inserting multiple such objects into a TreeSet fails while inserting them into a HashSet succeeds.
import java.util.HashSet;
import java.util.TreeSet;

public class Test {
/*
* This inner class deliberately has a compareTo method that is not
* consistent with equals
*/
static class TestObject implements Comparable<TestObject> {
@Override
public int compareTo(TestObject arg0) {
// No two of these objects can be ordered
return 0;
}
@Override
public boolean equals(Object arg0) {
// No two of these objects are ever equal to each other
return false;
}
}
public static void printSuccess(boolean success) {
if (success)
System.out.println(" Success");
else
System.out.println(" Failure");
}
public static void main(String[] args) {
TreeSet<TestObject> testTreeSet = new TreeSet<TestObject>();
HashSet<TestObject> testHashSet = new HashSet<TestObject>();
System.out.println("Adding to the HashSet:");
printSuccess(testHashSet.add(new TestObject()));
printSuccess(testHashSet.add(new TestObject()));
printSuccess(testHashSet.add(new TestObject()));
System.out.println("Copying to the TreeSet:");
for (TestObject to : testHashSet) {
printSuccess(testTreeSet.add(to));
}
}
}
Output of above program is
Adding to the HashSet:
Success
Success
Success
Copying to the TreeSet:
Success
Failure
Failure
Can someone tell me why TreeSet is behaving like this?
"a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal". https://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html
And your compareTo says they are all equal.
A return value of 0 from compareTo tells the TreeSet that the objects are equal. The Comparable documentation strongly recommends (but does not require) that e1.compareTo(e2) == 0 holds if and only if e1.equals(e2) is true.
TreeSet guarantees ordering, so it uses the compareTo (or compare) method; HashSet does not, so it uses hashCode and equals. Try changing the compareTo method to return a positive or negative number instead.
You can read more about this in the Javadoc of the Comparable interface.
Also, the Javadoc of java.util.Comparator explicitly describes this case and warns that a Comparator used with a SortedSet should be consistent with equals:
For example, suppose one adds two elements a and b such that
(a.equals(b) && c.compare(a, b) != 0) to an empty TreeSet with
comparator c. The second add operation will return true (and the size
of the tree set will increase) because a and b are not equivalent from
the tree set's perspective, even though this is contrary to the
specification of the Set.add method.
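For contrast, here is a sketch of an element type whose compareTo is consistent with equals (and hashCode). Item and sizeAfterAdds are hypothetical names, not part of the question's code. With such a type, TreeSet and HashSet agree on what counts as a duplicate:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class ConsistentOrderingDemo {
    // equals, hashCode and compareTo are all based on the same field.
    static final class Item implements Comparable<Item> {
        final int id;
        Item(int id) { this.id = id; }
        @Override public int compareTo(Item other) { return Integer.compare(id, other.id); }
        @Override public boolean equals(Object o) { return o instanceof Item && ((Item) o).id == id; }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    static int sizeAfterAdds(Set<Item> set) {
        set.add(new Item(1));
        set.add(new Item(1)); // duplicate under both equals and compareTo
        set.add(new Item(2));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAdds(new HashSet<Item>())); // 2
        System.out.println(sizeAfterAdds(new TreeSet<Item>())); // 2
    }
}
```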

using HashSet with overridden .equals() [duplicate]

Here is my code:
import java.util.HashSet;

public class testGui {
public static void main(String[] arg){
class TESTS{
String t;
public TESTS(String t){
this.t = t;
}
@Override
public boolean equals(Object x){
System.out.println("My method is called...");
if(x instanceof TESTS){
TESTS zzz = (TESTS) x;
return zzz.t.compareTo(t)==0;
}
else return false;
}
}
HashSet<TESTS> allItems = new HashSet<TESTS>();
allItems.add(new TESTS("a"));
allItems.add(new TESTS("a"));
System.out.println(allItems.contains(new TESTS("a")));
}
}
I do not get why the HashSet contains method is not calling my equals method, as mentioned in its specification:
More formally, adds the specified
element, o, to this set if this set
contains no element e such that
(o==null ? e==null : o.equals(e))
My code is returning false and not going into my equals method.
Thanks a lot for answering!
When you override equals, you must also override hashCode. Otherwise, equal objects will have different hash codes and be considered unequal.
It is also strongly recommended not to override only hashCode. But this is not essential, as unequal objects can have the same hash code.
A HashSet depends on the hash code of each object. The hashCode method is called before the equals method; only if the hash codes are equal does the HashSet go on to evaluate the equals method.
Implement a hashcode method such that if a.equals(b) == true, then a.hashCode() == b.hashCode()
and it should start working as you would expect.
You should also implement hashCode so that it is consistent with equals. HashSet uses the hashCode method to decide which bucket to put an item into, and calls equals only when the hash codes of two items are the same.
Effective Java, 2nd Edition discusses this rule (and the consequences of breaking it) in Item 9: Always override hashCode when you override equals.
As most of the comments have said, just override the hashCode method (sample below) and you should be good.
@Override
public int hashCode() {
return t.hashCode()*31;
}
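To see the effect end to end, here is a self-contained sketch along the lines of the question's class. I renamed it to Item and added an equalsCalls counter purely for illustration; neither is in the original code:

```java
import java.util.HashSet;
import java.util.Set;

class HashCodeFixDemo {
    // Counts how often equals() is invoked, standing in for the println in the question.
    static int equalsCalls = 0;

    static final class Item {
        final String t;
        Item(String t) { this.t = t; }

        @Override public boolean equals(Object x) {
            equalsCalls++;
            return x instanceof Item && ((Item) x).t.equals(t);
        }

        // Equal items now land in the same bucket, so equals() gets a chance to run.
        @Override public int hashCode() { return t.hashCode(); }
    }

    static boolean containsEqualItem() {
        Set<Item> allItems = new HashSet<>();
        allItems.add(new Item("a"));
        allItems.add(new Item("a")); // rejected as a duplicate
        return allItems.size() == 1 && allItems.contains(new Item("a"));
    }

    public static void main(String[] args) {
        System.out.println(containsEqualItem()); // true
        System.out.println(equalsCalls > 0);     // true: equals() was reached
    }
}
```

Without the hashCode override, the lookup would normally stop at the hash comparison and never reach equals(), which matches what the question observed.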

HashSet equality called with passed object instead of stored item

In the following code, the output shows CONTAINS for each object, whereas commenting out the anonymous object's equals() method results in MISSING, which leads me to believe the second equality pass (hashCode() -> equals()) actually calls the equality method of the supplied object instead of the object within the collection being tested.
List<String> strings = Arrays.asList("Hello", "there", "Qix");
HashSet<String> set = new HashSet<>(strings);
for(final String s : strings)
{
boolean contains = set.contains(new Object(){
@Override
public int hashCode() {
return s.hashCode();
}
@Override
public boolean equals(Object obj) {
return true;
}
});
System.out.format("%s: %s\n",
s,
contains ? "CONTAINS" : "MISSING");
}
Why is this? Is it because the equals() method, by principle, should be symmetric between the two objects?
The HashSet either has to do a.equals(b) or b.equals(a). And as they should be written to be symmetric*, it shouldn't matter which it chooses.
But for reference, the documentation states:
returns true if and only if this set contains an element e such that (o==null ? e==null : o.equals(e))
* See http://docs.oracle.com/javase/6/docs/api/java/lang/Object.html#equals(java.lang.Object).
It's an implementation detail that we shouldn't worry about, since the public API never says how it uses equals. As you said, it's supposed to be symmetric anyway. If we look into the source, we'll see that it really is passedObject.equals(storedObject).
Because, AFAIK, the default Object.equals implementation uses reference comparison, so it checks whether the two references refer to the same object.
Here you are comparing instances of String with instances of another custom class; they can't be equal, whichever way this is done.

contains() method in java.util.HashSet doesn't behave as i expected from it

This is the java main() method:
public static void main(String[] args) {
HashSet set = new HashSet();
Mapper test = new Mapper("asd", 0);
set.add(test);
System.out.println(new Mapper("asd", 0).equals(test));
System.out.println(set.contains(new Mapper("asd", 0)));
}
and my Mapper class is :
class Mapper {
String word;
Integer counter;
Mapper (String word, Integer counter) {
this.word = word;
this.counter = counter;
}
public boolean equals(Object o) {
if ((o instanceof Mapper) && this.word.equals(((Mapper) o).word)) {
return true;
}
return false;
}
}
and the result is :
true
false
From HashSet specifications, at this method I read this : "Returns true if this set contains the specified element. More formally, returns true if and only if this set contains an element e such that (o==null ? e==null : o.equals(e)). "
So, can anyone explain to me where I'm wrong? Or ...?
Thanks.
You need to implement a proper hashCode() function.
public int hashCode() {
// equal items should return the same hash code;
// equals() compares only word, so hash only word
return word.hashCode();
}
The Java utilities in java.util contain a lot of classes that rely on hashing. Being allowed to override equals() as you see fit means that you must also properly override hashCode() to match.
A correct implementation of hashCode() returns the same hash for any two objects for which equals() returns true. Hash-based collections compare the hashes before checking whether the objects are equal (the equals check is what resolves hash collisions).
The hashCode contract says that if two objects are equal, they should have the same hash code. Collections like HashSet assume that this is upheld.
Object's implementations of equals and hashCode are based on addresses (or object IDs). If you override equals to do a content comparison, you should override hashCode to generate a hash based on content.
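As a sketch of what that could look like for the Mapper class from the question: the wrapper class and the lookupSucceeds helper are hypothetical, and I am assuming that only word should take part in equality, as in the original equals. The key point is that hashCode uses exactly the fields that equals compares:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MapperDemo {
    static final class Mapper {
        final String word;
        final Integer counter;

        Mapper(String word, Integer counter) {
            this.word = word;
            this.counter = counter;
        }

        // Content comparison on word only; Objects.equals is null-safe.
        @Override public boolean equals(Object o) {
            return o instanceof Mapper && Objects.equals(((Mapper) o).word, word);
        }

        // Hash only the field that equals() looks at, so equal objects share a hash code.
        @Override public int hashCode() {
            return Objects.hashCode(word);
        }
    }

    static boolean lookupSucceeds(String word, int counter) {
        Set<Mapper> set = new HashSet<>();
        set.add(new Mapper("asd", 0));
        return set.contains(new Mapper(word, counter));
    }

    public static void main(String[] args) {
        System.out.println(lookupSucceeds("asd", 0)); // true
        System.out.println(lookupSucceeds("asd", 5)); // true: counter is ignored by equals and hashCode
        System.out.println(lookupSucceeds("xyz", 0)); // false
    }
}
```

Had hashCode also mixed in counter, the second lookup would typically fail even though equals() says the objects are equal.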
