Objects disappearing from Java TreeSet

I have a class that I'd like to put in a TreeSet, which implements Comparable to sort them by priority. Here's a small segment:
public abstract class PacketListener implements Comparable<PacketListener> {
public enum ListenerPriority {
LOWEST, LOW, NORMAL, HIGH, HIGHEST
}
private final ListenerPriority priority; // Initialized in constructor
// ... class body ...
@Override
public final int compareTo(PacketListener o) {
return priority.compareTo(o.priority);
}
}
The idea is obviously for the TreeSet to sort the objects by priority, allowing me to iterate through the listeners in order. However, I've discovered that, for some reason, I can't add a second PacketListener to the set object. After adding two different PacketListener objects, the size of the set remains at 1.
Should I not be using TreeSet?

The API docs for TreeSet contain this important information:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. [...] This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal.
In other words, a TreeSet can accommodate multiple instances of your PacketListener class, but only so long as each has a different priority than all the others, so that for every pair of elements A and B, exactly one of these holds: either A == B or A.compareTo(B) != 0.
If you must accommodate multiple PacketListener instances with the same priority in the same collection, then you need a different type of collection. HashSet is perfectly good with classes that use the equals() and hashCode() methods inherited from Object, provided that that is indeed the desired sense of instance equality. You could also consider a LinkedHashSet if you want some kind of guarantee about iteration order, or maybe a PriorityQueue if you want to order by priority but you are willing to use a different mechanism to avoid duplicates.

TreeSet treats two objects for which compareTo returns 0 as equal. This means you can never have two objects with the same priority in your tree set with your current implementation.
One way to fix your problem is to make your compareTo method consider all values that you care about (i.e. only objects that are actually equal return 0).
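As a minimal sketch of that idea: the Listener class below is a hypothetical stand-in for PacketListener, and the id field (a per-instance counter) is an assumption added purely as a tie-breaker. It is not part of the original class.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;

class TieBreakDemo {
    enum Priority { LOWEST, LOW, NORMAL, HIGH, HIGHEST }

    // Hypothetical stand-in for PacketListener; the id field is an assumption.
    static final class Listener implements Comparable<Listener> {
        private static final AtomicLong NEXT_ID = new AtomicLong();

        final Priority priority;
        final long id = NEXT_ID.getAndIncrement();

        Listener(Priority priority) {
            this.priority = priority;
        }

        @Override
        public int compareTo(Listener o) {
            int byPriority = priority.compareTo(o.priority);
            // Fall back to the unique id, so only the same instance compares as 0.
            return byPriority != 0 ? byPriority : Long.compare(id, o.id);
        }
    }

    static int sizeAfterAddingTwoWithSamePriority() {
        Set<Listener> set = new TreeSet<>();
        set.add(new Listener(Priority.NORMAL));
        set.add(new Listener(Priority.NORMAL));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingTwoWithSamePriority()); // prints 2
    }
}
```

Because equals() is still the identity-based one inherited from Object, this ordering stays consistent with equals: compareTo returns 0 only for the very same instance.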

A Set is a collection of unique elements. As you are using the priority to compare PacketListener instances, I assume that there are at most five instances in your TreeSet, one for each priority.
If the structure allows you, you can find a secondary key to compare PacketListener, in case they have the same priority. If you can't, then TreeSet is the wrong way to go.

Related

Comparator Data Structure ordering - how to maintain insertion order whilst ordered on another field?

I have a data structure which takes an optional Comparator to customise ordering (in this case it is a TreeSet but really it doesn't matter, I can swap out for a PriorityQueue without breaking my code). At present, it is ordered by a field price on the object which the structure is storing.
When 2 objects have the same price, I want timestamp to be the tie-breaker, timestamp being System.CurrentTime. To specify this in the comparator I have to use:
if (Object1.getPrice == Object2.getPrice && Object1.Timestamp > Object2.Timestamp) return 1
The problem is that this breaks the equals case when I do TreeSet.floor() or TreeSet.ceiling() - the method no longer recognises that 2 objects are of equal price but will still recognise if the price is higher/lower. How do I mitigate this?

With the Comparator you decide the order of the set (and, in fact, the "uniqueness" of its elements as well).
So you have to decide if the objects are equal:
if the Price is the same
if the Price and Timestamp is the same
Maybe using a different data structure for each of these usages would be an option, or a different one that could cover both, depending on the other requirements and boundaries of your code.
Take a look here: https://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html
A NavigableSet implementation based on a TreeMap. The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
So you provide the Comparator.
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal.
So if there is already an equal object in the Set (as judged by the comparator), it won't be added again, as it would be placed at the very same position in the Set according to the order (if you don't like that, don't use a Set). The two should be implemented consistently (even though this is not enforced by the compiler). So instead of using equals directly, the set uses the comparator.
Internally the comparator could use the equals method of the objects (inside the if statement, like if(obj1.equals(obj2)){return 0;}).
But how to implement the equals - internally it would rely on some property comparison (like you do now in the comparator).
If you don't like that for your data structure maybe don't use a (Tree)Set.
Depending on the needs a Map or a List might be fine.
Choosing the right Collection:
http://www.javapractices.com/topic/TopicAction.do?Id=65
For the difference between == and equals:
== compares the references, let's say the pointer to the container; if true, it is the same instance.
equals compares the objects, let's say the content of the object container; if true, they hold the same value(s) (in their properties).
But there are probably millions of exhaustive articles on that out there.
Note: equals of an Object should be implemented consistently with the hashCode() method. And for a (Tree)Set the comparator should be implemented consistently with equals (maybe use something like if(obj1.equals(obj2)){return 0;}). Other Set implementations, such as HashSet, use hashCode() for that instead.
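To illustrate the "a Map might be fine" suggestion, here is a rough sketch that keys a TreeMap by price and keeps the objects at each price in insertion order. The Order and PriceBook names are hypothetical, and this is only one possible layout, not what the asker's code looks like.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PriceBookDemo {
    // Hypothetical element type; the field names are assumptions.
    static final class Order {
        final String id;
        final double price;

        Order(String id, double price) {
            this.id = id;
            this.price = price;
        }
    }

    static final class PriceBook {
        // Price decides the ordering; each list keeps insertion order as the tie-breaker.
        private final TreeMap<Double, List<Order>> byPrice = new TreeMap<>();

        void add(Order o) {
            byPrice.computeIfAbsent(o.price, p -> new ArrayList<>()).add(o);
        }

        // Orders at the greatest price <= the given price, oldest first; empty if none.
        List<Order> floor(double price) {
            Map.Entry<Double, List<Order>> e = byPrice.floorEntry(price);
            return e == null ? Collections.<Order>emptyList() : e.getValue();
        }
    }

    static PriceBook sample() {
        PriceBook book = new PriceBook();
        book.add(new Order("a", 10.0));
        book.add(new Order("b", 10.0)); // same price as "a", added later
        book.add(new Order("c", 12.0));
        return book;
    }

    public static void main(String[] args) {
        for (Order o : sample().floor(11.0)) {
            System.out.println(o.id + " @ " + o.price);
        }
    }
}
```

With this layout, equality of price is decided by the map key alone, so floor/ceiling lookups are not affected by the timestamp tie-breaker.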

Why is the comparator used instead of the equals() in Collections?

There is a Java bean Car that contains two values: model and price.
Now suppose I override equals() and hashCode() checking only the model, like this:
public boolean equals(Object o) {
if (!(o instanceof Car)) return false; // also rejects null
return this.model.equals(((Car) o).model);
}
public int hashCode() {
return model.hashCode();
}
This permits me to check whether an ArrayList already contains a Car of the same model (regardless of the price), like this:
List<Car> car = new ArrayList<Car>();
car.add(new Car("carA",100f));
car.add(new Car("carB",101f));
car.add(new Car("carC",110f));
System.out.println(car.contains(new Car("carB",111f)));
It returns TRUE. That's fine, because the car already exists!
But now I decide that the ArrayList is not good, because I want to keep the items ordered, so I substitute it with a TreeSet in this way:
Set<Car> car = new TreeSet<Car>(new Comparator<Car>() {
@Override
public int compare(Car car1, Car car2) {
int compPrice = - Float.compare(car1.getPrice(), car2.getPrice());
if (compPrice > 0 || compPrice < 0)
return compPrice;
else
return car1.getModel().compareTo(car2.getModel());
}});
car.add(new Car("carA",100f));
car.add(new Car("carB",101f));
car.add(new Car("carC",110f));
System.out.println(car.contains(new Car("carB",111f)));
But now there is a problem: it returns FALSE... why?
It seems that when I invoke contains() on an ArrayList, the method equals() is invoked.
But it seems that when I invoke contains() on a TreeSet with a comparator, the comparator is used instead.
Why does that happen?

TreeSet forms a binary tree, keeping elements according to their natural ordering (or a supplied one), so in order to quickly find one specific element in the collection, TreeSet uses Comparable or Comparator instead of equals().
As the TreeSet Javadoc states:
Note that the ordering maintained by a set (whether or not an explicit
comparator is provided) must be consistent with equals if it is to
correctly implement the Set interface. (See Comparable or Comparator
for a precise definition of consistent with equals.) This is so
because the Set interface is defined in terms of the equals operation,
but a TreeSet instance performs all element comparisons using its
compareTo (or compare) method, so two elements that are deemed equal
by this method are, from the standpoint of the set, equal. The
behavior of a set is well-defined even if its ordering is inconsistent
with equals; it just fails to obey the general contract of the Set
interface.
We can find a similarity with the hashCode/equals contract:
If equals() returns true, hashCode() has to return the same value for both objects, so that the element can be found during a search.
Likewise with TreeSet:
If the Comparator or Comparable reports two elements as equal (returns 0, so contains() returns true), equals() has to return true too, so that the ordering is consistent with equals().
THEREFORE: the fields used in your class's equals() method have to be exactly the same (no more, no less) as those used in your Comparator implementation.

A TreeSet is implicitly sorted, and it uses a Comparator for this sorting. The equals() method can only tell you if two objects are the same or different, not how they should be ordered for sorting. Only a Comparator can do that.
More to the point, a TreeSet also uses comparisons for searching. This is sort of the whole point of tree-based map/set. When the contains() method is called, a binary search is performed and the target is either found or not found, based on how the comparator is defined. The comparator defines not only logical order but also logical identity. If you are relying on logical identity defined by an inconsistent equals() implementation, then confusion will probably ensue.

The reason for the different behaviour is that you consider the price member in the compare method, but ignore it in equals.
new Car("carB",101f) // what you add to the list
new Car("carB",111f) // what you are looking for
Both instances are "equals" (sorry...) since their model members are equal (and the implementation stops after that test). They "compare" as different, though, because that implementation also checks the price member.
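A small self-contained reproduction of the difference, assuming a Car class along the lines of the one in the question (equals/hashCode on model only) and the same comparator logic written as a lambda:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CarContainsDemo {
    static final class Car {
        final String model;
        final float price;

        Car(String model, float price) {
            this.model = model;
            this.price = price;
        }

        // Equality looks at the model only, as in the question.
        @Override
        public boolean equals(Object o) {
            return o instanceof Car && model.equals(((Car) o).model);
        }

        @Override
        public int hashCode() {
            return model.hashCode();
        }
    }

    // Same ordering as in the question: price descending, then model.
    static final Comparator<Car> BY_PRICE_DESC_THEN_MODEL = (a, b) -> {
        int byPrice = Float.compare(b.price, a.price);
        return byPrice != 0 ? byPrice : a.model.compareTo(b.model);
    };

    static void fill(Collection<Car> cars) {
        cars.add(new Car("carA", 100f));
        cars.add(new Car("carB", 101f));
        cars.add(new Car("carC", 110f));
    }

    static boolean listContains() {
        List<Car> cars = new ArrayList<>();
        fill(cars);
        return cars.contains(new Car("carB", 111f)); // uses equals(): model only
    }

    static boolean treeSetContains() {
        Set<Car> cars = new TreeSet<>(BY_PRICE_DESC_THEN_MODEL);
        fill(cars);
        return cars.contains(new Car("carB", 111f)); // uses the comparator: price differs
    }

    public static void main(String[] args) {
        System.out.println(listContains());    // true
        System.out.println(treeSetContains()); // false
    }
}
```

Looking up new Car("carB", 101f) in the TreeSet instead would succeed, because then both the price and the model compare as equal.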

Contains and equals

I'm a little befuddled by some code:
for (AbstractItem item : mSetOfItems) {
if (item.equals(pPrimaryItem))
{
System.out.println("Contains? " + mSetOfItems.contains(pPrimaryItem));
}
}
How could it be possible that item.equals(pPrimaryItem) resolves as true, and mSetOfItems.contains(pPrimaryItem) resolves as false? Because that's what I'm seeing in my code.
In other words, if I iterate through my set, I can find an element equal to my test element. But if I use contains, my test elements is not reported being in the set. I'm baffled because I thought contains used equals. What could I be overlooking?

You didn't give the type of mSetOfItems, but I'm guessing that AbstractItem overrides equals() but not hashCode(). This is bad.
If mSetOfItems uses the hash code for lookup, which it could depending on its type, you'll get the behavior you described.
Your assumption is that contains() is implemented with iteration and equals(). Nothing in the Set interface guarantees that.

What is the implementation of mSetOfItems?
If it's a tree, it could be that your comparison function returns inconsistent values.
If it's a hash, it could be that your equals() returns true for objects with different hash codes, or that the object's hashCode() has changed since it was inserted into the set.

If your set is a TreeSet or some other set where you're using a custom comparator, then you could see this if the comparator was broken, either by not returning a valid sorted order or by having objects that are actually equal compare unequal. When the set internally looks up an element and uses the comparator, it would make a wrong choice and not see the element.
If your set is a HashSet, your hash function could be broken and cause two objects that are equal to have different hash codes. Since the HashSet internally uses the object's hash code to figure out where to look, it might end up looking in the wrong bucket.
Alternatively, if you store objects in a Set of any sort and then modify them, you might end up breaking some internal invariant of the Set. For instance, if you store something in a HashSet and then change its value, it will be in the wrong bucket, and if you have a TreeSet and change the value it may appear in the wrong spot in sorted order.
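The "modified after insertion" case is easy to reproduce. The sketch below uses a hypothetical mutable Item class (not the asker's AbstractItem) whose hash code depends on a field that is changed while the object sits in a HashSet:

```java
import java.util.HashSet;
import java.util.Set;

class MutatedKeyDemo {
    // Hypothetical element type with a mutable field used by equals/hashCode.
    static final class Item {
        String name;

        Item(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Item && name.equals(((Item) o).name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Returns { foundByIteration, foundByContains }.
    static boolean[] run() {
        Set<Item> set = new HashSet<>();
        Item item = new Item("a");
        set.add(item);

        // The hash code changes, but the set still files the item under the old hash.
        item.name = "b";

        boolean foundByIteration = false;
        for (Item i : set) {
            if (i.equals(item)) {
                foundByIteration = true;
            }
        }
        boolean foundByContains = set.contains(item);
        return new boolean[] { foundByIteration, foundByContains };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("iteration: " + r[0] + ", contains: " + r[1]);
    }
}
```

Iteration finds the element because it visits every entry and calls equals(), while contains() goes by the hash code and never reaches the entry stored under the old hash.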
If you are concurrently modifying the set, it's possible that you might have added the element in another thread but not had any guarantees that the operation that made that change be visible in another thread. The second thread would then not see the element even if it were added.

Check the hashcode() method of your class

If mSetOfItems is a java.util.Hashtable (or a similar hash-based Collection, Set, etc.) then you must implement hashCode() as well. boolean contains(Object elem) will first try to find the passed object by calculating its hash and looking it up in the Collection. Once contains finds something, it will then use the equals method to verify that the two objects are the same according to your implementation.
If not overridden, hashCode() returns an int that is typically derived from the object's identity (often described as the integer representation of its internal address). This will almost always be different for two distinct objects, no matter the values of their instance variables. In that case, contains won't be able to find objects that are merely equal to the one you pass in...
When implementing hashCode(), remember that:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Also, make sure that you have properly overridden the equals method by respecting its signature:
public boolean equals(Object obj);
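A minimal sketch of a matching equals()/hashCode() pair, using a hypothetical Item class with two fields (the names are assumptions for illustration only):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsHashCodeDemo {
    // Hypothetical value class; both methods use exactly the same fields.
    static final class Item {
        private final String name;
        private final int version;

        Item(String name, int version) {
            this.name = name;
            this.version = version;
        }

        // Note the parameter type is Object; equals(Item) would only be an overload.
        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof Item)) {
                return false;
            }
            Item other = (Item) obj;
            return version == other.version && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, version);
        }
    }

    static boolean containsEqualCopy() {
        Set<Item> set = new HashSet<>();
        set.add(new Item("x", 1));
        // A distinct but equal instance is found, because the hash codes match.
        return set.contains(new Item("x", 1));
    }

    public static void main(String[] args) {
        System.out.println(containsEqualCopy()); // true
    }
}
```

The @Override annotation is worth keeping: if the signature were accidentally written as equals(Item), the compiler would reject the annotation instead of silently creating an overload.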

Java SortedSet + Comparator, consistency with equals() question

I'd like to have a SortedSet of Collections (Sets themselves, in this case, but not necessarily in general), that sorts by Collection size. This seems to violate the proscription to have the Comparator be consistent with equals() - i.e., two collections may be unequal (by having different elements), but compare to the same value (because they have the same number of elements).
Notionally, I could also put into the Comparator ways to sort sets of equal sizes, but the use of the sort wouldn't take advantage of that, and there's not really a useful + intuitive way to compare the Collections of equal size (at least, in my particular case), so that seems like a waste.
Does this case of inconsistency seem like a problem?

SortedSet interface extends core Set and thus should conform to the contract outlined in Set specification.
The only possible way to achieve that is to have your element's equals() method behave consistently with your Comparator. The reason for that is that the core Set operates based on equality whereas SortedSet operates based on comparison.
For example, the add() method defined in the core Set interface specifies that you can't add an element to the set if there already is an element whose equals() method would return true with this new element as argument. Well, SortedSet doesn't use equals(), it uses compareTo(). So if your compareTo() returns a nonzero value, your element WILL be added even if equals() were to return true, thus breaking the Set contract.
None of this is a practical problem per se, however. SortedSet behavior is always consistent, even if compare() vs equals() are not.

As ChssPly76 wrote in a comment, you can use hashCode to decide the compareTo call in the case where two Collections have the same size but are not equal. This works fine, except in the rare case where you have two Collections that have the same size and are not equal, but have the same hashCode. Admittedly, the chances of that happening are pretty small, but it is conceivable. If you want to be more careful, use System.identityHashCode instead of hashCode. This gives you an identity-based number for each Collection; collisions are unlikely, although the value is not strictly guaranteed to be unique.
At the end of the day, this gives you the functionality of having the Collections in the Set sorted by size, with arbitrary ordering in the case of two Collections with matching size. If this is all you need, it's not much slower than the usual comparison would be. If you need the ordering to be consistent between different JVM instances, this won't work and you'll have to do it some other way.
pseudocode:
if (a.equals(b)) {
return 0;
} else if (a.size() > b.size()) {
return 1;
} else if (b.size() > a.size()) {
return -1;
} else {
return System.identityHashCode(a) > System.identityHashCode(b) ? 1 : -1;
}

This seems to violate the proscription
to have the Comparator be consistent
with equals() - i.e., two collections
may be unequal (by having different
elements), but compare to the same
value (because they have the same
number of elements).
There is no requirement, either stated (in the Javadoc) or implied, that a Comparator be consistent with an object's implementation of boolean equals(Object).
Note that Comparable and Comparator are distinct interfaces with different purposes. Comparable is used to define a 'natural' order for a class. In that context, it would be a bad idea for equals and compareTo to be inconsistent. By contrast, a Comparator is used when you want to use a different order to the natural order of a class.
EDIT: Here's the complete paragraph from the Javadoc for SortedSet.
Note that the ordering maintained by a
sorted set (whether or not an explicit
comparator is provided) must be
consistent with equals if the sorted
set is to correctly implement the Set
interface. (See the Comparable
interface or Comparator interface for
a precise definition of consistent
with equals.) This is so because the
Set interface is defined in terms of
the equals operation, but a sorted
set performs all element comparisons
using its compareTo (or compare)
method, so two elements that are
deemed equal by this method are, from
the standpoint of the sorted set,
equal. The behavior of a sorted set is
well-defined even if its ordering is
inconsistent with equals; it just
fails to obey the general contract of
the Set interface.
I have highlighted the final sentence. The point is that such a SortedSet will work as you would most likely expect, but the behavior of some operations won't exactly match the Set specification ... because the specification defines their behavior in terms of the equals method.
So in fact, there is a stated requirement for consistency (my mistake), but the consequences of ignoring it are not as bad as you might think. Of course, it is up to you to decide if you should do that. In my estimation, it should be OK, provided that you comment the code thoroughly and make sure that the SortedSet does not 'leak'.
However, it is not clear to me that a Comparator for collections that only looks at a collection's "size" is going to work ... from a semantic perspective. I mean, do you really want to say that all collections with (say) 2 elements are equal? This will mean that your set can only ever contain one collection of any given size ...

There is no reason why a Comparator should return the same results as equals(). In fact, the Comparator API was introduced because equals() just isn't enough: If you want to sort a collection, you must know whether two elements are lesser or greater.

It's a little bit odd that SortedSet as a part of the standard API breaks the contract defined in the Set interface and uses the Comparator to define equality instead of the equals method, but that's how it is.
If your actual problem is to sort a collection of collections according to the contained collections' size, you are better off with a List, which you can sort using Collections.sort(List<T>, Comparator<? super T>).
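For instance, a short sketch of the List approach with made-up sample data; List.sort with a Comparator is used here, which is equivalent to calling Collections.sort with the same comparator:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SortBySizeDemo {
    static List<Set<String>> sortedBySize() {
        List<Set<String>> sets = new ArrayList<>();
        sets.add(new HashSet<>(Arrays.asList("a", "b", "c")));
        sets.add(new HashSet<>(Arrays.asList("x", "y")));
        sets.add(new HashSet<>(Arrays.asList("p", "q"))); // same size as the previous one
        sets.add(new HashSet<>(Arrays.asList("z")));

        // The sort is stable: sets of equal size keep their relative order,
        // and nothing is dropped just because two sizes compare as equal.
        sets.sort(Comparator.comparingInt(Set::size));
        return sets;
    }

    public static void main(String[] args) {
        for (Set<String> s : sortedBySize()) {
            System.out.println(s.size() + " " + s);
        }
    }
}
```

Unlike a SortedSet with a size-only comparator, the list keeps both two-element sets.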

TreeSet Comparator

I used a TreeSet with a self-written Comparator. Now when I'm adding elements to the TreeSet and the Comparator's compare method returns 0, it seems like the TreeSet contains only one of the Objects with equal ranking.
I didn't see this behaviour documented in the Javadocs. Maybe I missed something. Can you confirm this behaviour?
I edited the Comparator. Now it never returns 0 and the TreeSet contains all the Objects with equal ranking.
Is that the way it has to be, if I want to have multiple Objects with equal ranking?

That's the way it has to be, as a set is defined as including equal objects only once.
When your Comparator returns 0, two objects are considered equal, therefore only one (probably the first) of all equal objects is included in the set.

Yes, this is documented in the JavaDoc for TreeSet:
Note that the ordering maintained by a
set (whether or not an explicit
comparator is provided) must be
consistent with equals if it is to
correctly implement the Set interface.
(See Comparable or Comparator for a
precise definition of consistent with
equals.) This is so because the Set
interface is defined in terms of the
equals operation, but a TreeSet
instance performs all element
comparisons using its compareTo (or
compare) method, so two elements that
are deemed equal by this method are,
from the standpoint of the set, equal.
The behavior of a set is well-defined
even if its ordering is inconsistent
with equals; it just fails to obey the
general contract of the Set interface. (Emphasis mine)

If you want a sorted collection that can hold multiple objects which are equal to each other, then the TreeMultiset from Google Collections would probably do the trick.
