In Java, a Set cannot contain two unique objects where as a List doesn't have that restriction. What is the mechanism in a Set which identifies unique objects. Most probably it might be the implementation of the equals method or hashCode method or both of the objects which are being added. Does anyone know what's the actually mechanism of identifying unique objects. Is it the equals method, hashcode method or both the methods or something else ?
It depends on the Set implementation. For HashSet and LinkedHashSet, it uses both equals and hashCode methods. For TreeSet, it uses the natural Comparator of the object or a specific provided Comparator.
Basically, the Set javadoc explains this (emphasis mine):
A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.
The Set interface places additional stipulations, beyond those inherited from the Collection interface, on the contracts of all constructors and on the contracts of the add, equals and hashCode methods.
TreeSet has another behavior since it implements SortedSet (this interface extends the Set interface). From its javadoc (emphasis mine):
A Set that further provides a total ordering on its elements. The elements are ordered using their natural ordering, or by a Comparator typically provided at sorted set creation time. The set's iterator will traverse the set in ascending element order.)
Note that the ordering maintained by a sorted set (whether or not an explicit comparator is provided) must be consistent with equals if the sorted set is to correctly implement the Set interface (See the Comparable interface or Comparator interface for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a sorted set performs all element comparisons using its compareTo (or compare) method.
HashSet Source: seems the elements are stored as keys in a HashMap (which requires unique keys) and the put method in HashMap checks the following:
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
So it compares both the hash and runs equals.
TreeSet uses a TreeMap to back its data and TreeMaps put uses a comparator.
Related
I have a data structure which takes an optional Comparator to customise ordering (in this case it is a TreeSet but really it doesn't matter, I can swap out for a PriorityQueue without breaking my code). At present, it is ordered by a field price on the object which the structure is storing.
When 2 objects have the same price, I want timestamp to be the tie-breaker, timestamp being System.CurrentTime. To specify this in the comparator I have to use:
if (Object1.getPrice == Object2.getPrice && Object1.Timestamp > Object2.Timestamp) return 1
The problem is that this breaks the equals case when I do TreeSet.floor() or TreeSet.ceiling() - the method no longer recognises that 2 objects are of equal price but will still recognise if the price is higher/lower. How do I mitigate this?
With the Comparator you decide what is the order of the set (actually as well their "uniqueness"!).
So you have to decide if the objects are equal:
if the Price is the same
if the Price and Timestamp is the same
maybe using different data-strucure for this different usages would be an option or a different that could cover both. Depending on the other requirements and boundaries of your code.
Take a look here: https://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html
A NavigableSet implementation based on a TreeMap. The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
So you provide the Comparator.
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal.
So if there is already a equals object in the Set (consulting the comparator) it want be added again as this would be placed at the very same position in the Set according to the order (if you don't like that don't use a Set). They should be implemented consistent (even tough it is not forced by the compiler). So instead of using direct equals it used the comparator.
Internally the comparator could use the equals operator of the objects (inside the if statement like if(obj1.equals(obj2)){return 0;}).
But how to implement the equals - internally it would rely on some property comparison (like you do now in the comparator).
If you don't like that for your data structure maybe don't use a (Tree)Set.
Depending on the needs a Map or a List might be fine.
Choosing the right Collection:
http://www.javapractices.com/topic/TopicAction.do?Id=65
For the difference between == and equals:
== is comparing the references, lets say the pointer to the container, if true it is the same instance.
equals is comparing the object, lets say the content of the object container, if true it holds same value(s) (in its properties)
But out there are probably millions of exhausting articles on that.
Note: equals of an Object should be implemented consistent with the hashCode() method. And for a (Tree)Set the comparator should be implemented consistent with equals (maybe use something like if(obj1.equals(obj2)){return 0;}). Furthermore, other Set implementations use the hasCode() for that.
In Java, when should I implement Comparable<Something> versus implementing the equals method? I understand every time I implement equals I also have to implement hash code.
EDIT
Based on answers I am getting below:
Is it safe to say that if I implement Comparable then I don't need to implement equals and hashCode? As in: whatever I can accomplish with equal is already included in compareTo? For an example, I want to be able to compare two BSTs for equality. Implementing a hashCode for that seems daunting; so would comparable be sufficient?
If you only ever need to compare them for equality (or put them in a HashMap or HashSet which is effectively the same) you only need to implement equals and hashcode.
If your objects have an implicit order and you indend to sort them (or put them in a TreeMap or TreeSet which is effectively sorting) then you must implement Comparable or provide a Comparator.
Comparable is typically used for ordering items (like sorting) and equals is used for checking if two items are equal. You could use comparable to check for equality but it doesn't have to be. From the docs, "It is strongly recommended (though not required) that natural orderings be consistent with equals"
equals (and hashCode) are used for equality tests. The Comparable interface can be used for equality checks too, but it is in practice used for sorting elements or comparing their order.
That is, equals deals only with equality (similar to == and !=), while compareTo allows you to check many different types of inequality (similar to <, <=, ==, >=, > and !=). Therefore, if your class has a partial or total ordering of some kind, you may want to implement Comparable.
The relationship between compareTo and equals is briefly mentioned in the javadocs for Comparable:
It is strongly recommended, but not strictly required that (x.compareTo(y)==0) == (x.equals(y)). Generally speaking, any class that implements the Comparable interface and violates this condition should clearly indicate this fact. The recommended language is "Note: this class has a natural ordering that is inconsistent with equals."
Comparable is used by TreeSet to compare and sort anything you insert into it, and it is used by List#sort(null) to sort a list. It will be used in other places too, but those are the first that come to mind.
You should consider when you will use the object.
If for example in TreeMap and TreeSet, then you'll need Comparable. Also any case, when you will have to sort elements(especially sorted collections), implementing Comparable is mandatory. Overriding equals and hashcode will be needed in a very large amount of cases, most of them.
As per the javadocs:
This interface imposes a total ordering on the objects of each class that implements it. This ordering is referred to as the class's natural ordering, and the class's compareTo method is referred to as its natural comparison method.
Lists (and arrays) of objects that implement this interface can be sorted automatically by Collections.sort (and Arrays.sort). Objects that implement this interface can be used as keys in a sorted map or as elements in a sorted set, without the need to specify a comparator.
The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C. Note that null is not an instance of any class, and e.compareTo(null) should throw a NullPointerException even though e.equals(null) returns false
In short, equals() and hashcode() are for comparing equality whereas Comparable is for sorting
If you update the equals method, you should always update hashCode or you will be setting a boobytrap for yourself or others when they try to use it in a HashMap or HasSet or similar collections (very commonly used).
Comparable is required only when the interfaces you are working with require it. You can implement it for use with sorting and sorted collections but it's not actually required in most cases. An alternate approach is to pass in a Comparator to the sort or collection constructor. For example the String class has a case insensitive Comparator that is very useful and eliminates the need to create your own String class (i.e. String cannot be extended.)
There is a java bean Car that might contains two values: model and price.
Now suppose I override equals() and hashcode() checking only for model in that way:
public boolean equals(Object o) {
return this.model.equals(o.model);
}
public int hashCode() {
return model.hashCode();
}
This permit me to check if an arraylist already contain an item Car of the same model (and doesn't matter the price), in that way:
List<Car> car = new ArrayList<Car>();
car.add(new Car("carA",100f));
car.add(new Car("carB",101f));
car.add(new Car("carC",110f));
System.out.println(a.contains(new Car("carB",111f)));
It returns TRUE. That's fine, because the car already exist!
But now I decide that the ArrayList is not good, because I want to maintain the items ordered, so I substitute it with a TreeSet in this way:
Set<Car> car = new TreeSet<Car>(new Comparator<Car>() {
#Override
public int compare(Car car1, Car car2) {
int compPrice = - Float.compare(car1.getPrice(), car2.getPrice());
if (compPrice > 0 || compPrice < 0)
return compPrice;
else
return car1.getModel().compareTo(car2.getModel());
}});
car.add(new Car("carA",100f));
car.add(new Car("carB",101f));
car.add(new Car("carC",110f));
System.out.println(a.contains(new Car("carB",111f)));
But now there is a problem, it return FALSE... why?
It seems that when I invoke contains() using an arrayList the method equals() is invoked.
But it seems that when I invoke contains() using a TreeSet with a comparator, the comparator is used instead.
Why does that happen?
TreeSet forms a binary tree keeping elements according to natural (or not) orders, so in order to search quickly one specific element is the collection, TreeSet uses Comparable or Comparator instead of equals().
As TreeSet JavaDoc precises:
Note that the ordering maintained by a set (whether or not an explicit
comparator is provided) must be consistent with equals if it is to
correctly implement the Set interface. (See Comparable or Comparator
for a precise definition of consistent with equals.) This is so
because the Set interface is defined in terms of the equals operation,
but a TreeSet instance performs all element comparisons using its
compareTo (or compare) method, so two elements that are deemed equal
by this method are, from the standpoint of the set, equal. The
behavior of a set is well-defined even if its ordering is inconsistent
with equals; it just fails to obey the general contract of the Set
interface.
We can find a similarity with the HashCode/Equals contract:
If equals() returns true, hashcode() has to return true too in order to be found during search.
Likewise with TreeSet:
If contains() (using Comparator or Comparable) returns true, equals() has to return true too in order to be consistent with equals().
THEREFORE: Fields used within TreeSet.equals() method have to be exactly the same (no more, no less) than within your Comparator implementation.
A TreeSet is implicitly sorted, and it uses a Comparator for this sorting. The equals() method can only tell you if two objects are the same or different, not how they should be ordered for sorting. Only a Comparator can do that.
More to the point, a TreeSet also uses comparisons for searching. This is sort of the whole point of tree-based map/set. When the contains() method is called, a binary search is performed and the target is either found or not found, based on how the comparator is defined. The comparator defines not only logical order but also logical identity. If you are relying on logical identity defined by an inconsistent equals() implementation, then confusion will probably ensue.
The reason for the different behaviour is, that you consider the price member in the compare method, but ignore it in equals.
new Car("carB",101f) // what you add to the list
new Car("carB",111f) // what you are looking for
Both instances are "equals" (sorry...) since their model members are equal (and the implementation stops after that test). They "compare" as different, though, because that implementation also checks the price member.
Like for a object to be inserted into a HashMap the object should implement the equals() and the hashcode() method(not necessarily).
Are there any special conditions for an object to be inserted in a TreeMap ?
Unless a Comparator which mutually compares the keys is provided in the TreeMap's constructor, the keys must implement Comparable.
See the javadocs on TreeMap constructors for more information: http://download.oracle.com/javase/6/docs/api/java/util/TreeMap.html
EDIT: As #MeBigFatGuy points out it is highly recommended for keys to override equals() as well, in such a way that the implementation is consistent with the comparison. From the TreeMap javadoc:
Note that the ordering maintained by a sorted map (whether or not an explicit comparator is provided) must be consistent with equals if this sorted map is to correctly implement the Map interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Map interface is defined in terms of the equals operation, but a map performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the sorted map, equal. The behavior of a sorted map is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Map interface.
The type/class should (again, not necessarily) implement the Comparable interface (and override the compareTo method), so as to decide the ordering within the TreeMap.
I used a TreeSet with a self written Comparator. Now when I'm adding elements to the TreeSet and the Comparator's compare methods returns 0, it seems like the TreeSet contains only one of the Object with equal ranking.
I didn't see that this behaviour is documented in the javadocs. Maybe I miss something. Can you confirm this behaviour?
I edited the Comparator. Now it never returns 0 and the TreeSet contains all the Objects with equal ranking.
Is that the way it has to be, if I want to have multiple Objects with equal ranking?
That's the way it has to be, as a set is defined as including equal objects only once.
When your Comparator returns 0, two objects are considered equal, therefore only one (probably the first) of all equal objects is included in the set.
Yes, this is documented in the JavaDoc for TreeSet:
Note that the ordering maintained by a
set (whether or not an explicit
comparator is provided) must be
consistent with equals if it is to
correctly implement the Set interface.
(See Comparable or Comparator for a
precise definition of consistent with
equals.) This is so because the Set
interface is defined in terms of the
equals operation, but a TreeSet
instance performs all element
comparisons using its compareTo (or
compare) method, so two elements that
are deemed equal by this method are,
from the standpoint of the set, equal.
The behavior of a set is well-defined
even if its ordering is inconsistent
with equals; it just fails to obey the
general contract of the Set interface. (Emphasis mine)
If you want a sorted collection that can hold multiple objects which are equal to each other, then the TreeMultiset from Google Collections would probably do the trick.