Coming from a C++ world, I find the HashSet documentation somewhat hard to read:
https://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html
In C++, you would have:
http://en.cppreference.com/w/cpp/container/set
which in turn points to:
http://en.cppreference.com/w/cpp/concept/Compare
That makes the requirement on the element type handled by a std::set obvious. My question is: what are the requirements for the type (E) of elements maintained by a Set in Java?
Here is a short example which I fail to understand:
import gdcm.Tag;
import java.util.Set;
import java.util.HashSet;
public class TestTag
{
public static void main(String[] args) throws Exception
{
Tag t1 = new Tag(0x8,0x8);
Tag t2 = new Tag(0x8,0x8);
if( t1 == t2 )
throw new Exception("Instances are identical" );
if( !t1.equals(t2) )
throw new Exception("Instances are different" );
if( t1.hashCode() != t2.hashCode() )
throw new Exception("hashCodes are different" );
Set<Tag> s = new HashSet<Tag>();
s.add(t1);
s.add(t2);
if( s.size() != 1 )
throw new Exception("Invalid size: " + s.size() );
}
}
The above simple code fails with:
Exception in thread "main" java.lang.Exception: Invalid size: 2
    at TestTag.main(TestTag.java:42)
From my reading of the documentation, only the equals method needs to be implemented for Set:
https://docs.oracle.com/javase/7/docs/api/java/util/Set.html
What am I missing from the documentation?
I just tried to reproduce your issue, and maybe you just didn't override equals and/or hashCode correctly.
Take a look at my incorrect implementation of Tag:
public class Tag {
private int x, y;
public Tag(int x, int y) {
this.x = x;
this.y = y;
}
public boolean equals(Tag tag) {
if (x != tag.x) return false;
return y == tag.y;
}
@Override
public int hashCode() {
int result = x;
result = 31 * result + y;
return result;
}
}
Looks quite OK, doesn't it? The problem is that I did not actually override the correct equals method; I overloaded it with my own implementation that takes a Tag instead of an Object.
To work correctly, equals has to look like this:
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Tag tag = (Tag) o;
if (x != tag.x) return false;
return y == tag.y;
}
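The difference between overloading and overriding can be sketched with two stand-in classes. This is only an illustration: the class names Overloaded and Overridden and their fields are made up here and are not the real gdcm.Tag.

```java
import java.util.HashSet;
import java.util.Set;

public class OverloadVsOverrideDemo {

    // Hypothetical class that only OVERLOADS equals: the parameter is Overloaded, not Object.
    static final class Overloaded {
        final int x, y;
        Overloaded(int x, int y) { this.x = x; this.y = y; }

        // HashSet never calls this method; it calls equals(Object).
        public boolean equals(Overloaded other) {
            return other != null && x == other.x && y == other.y;
        }

        @Override
        public int hashCode() { return 31 * x + y; }
    }

    // Hypothetical class that properly OVERRIDES equals(Object).
    static final class Overridden {
        final int x, y;
        Overridden(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Overridden other = (Overridden) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() { return 31 * x + y; }
    }

    static int sizeWithOverload() {
        Set<Overloaded> s = new HashSet<>();
        s.add(new Overloaded(8, 8));
        s.add(new Overloaded(8, 8));
        return s.size(); // both instances are kept
    }

    static int sizeWithOverride() {
        Set<Overridden> s = new HashSet<>();
        s.add(new Overridden(8, 8));
        s.add(new Overridden(8, 8));
        return s.size(); // the second add is rejected as a duplicate
    }

    public static void main(String[] args) {
        System.out.println("overloaded equals: " + sizeWithOverload());
        System.out.println("overridden equals: " + sizeWithOverride());
    }
}
```

With the overload, the set falls back to Object.equals (reference identity), so it ends up with two elements even though the hash codes match.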
What am I missing from the documentation ?
You are looking at the wrong part of the documentation.
The C++ set is a "sorted set of unique objects", and sets are "usually implemented as red-black trees."
In Java, Set is a more abstract concept (it's an interface, not a class) with multiple implementations, most notably the HashSet and the TreeSet (ignoring concurrent implementations).
As you can probably guess from the name alone, the Java TreeSet is the equivalent of the C++ set.
As for requirements, HashSet uses the hashCode() and equals() methods. They are defined on the Object class and need to be overridden in classes whose instances are stored in a HashSet or used as keys in a HashMap.
For TreeSet and keys of TreeMap, you have two options: Provide a Comparator when creating the TreeSet (similar to C++), or have the objects implement the Comparable interface.
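The two TreeSet options can be sketched as follows. The classes Tag and PlainTag and their group/element fields are hypothetical stand-ins chosen for this example, not the SWIG-generated class.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

public class TreeSetRequirementsDemo {

    // Option 1: the element type defines its own natural ordering via Comparable.
    static final class Tag implements Comparable<Tag> {
        final int group, element;
        Tag(int group, int element) { this.group = group; this.element = element; }

        @Override
        public int compareTo(Tag other) {
            int c = Integer.compare(group, other.group);
            return c != 0 ? c : Integer.compare(element, other.element);
        }
    }

    // Option 2: a type with no natural ordering, ordered by an external Comparator.
    static final class PlainTag {
        final int group, element;
        PlainTag(int group, int element) { this.group = group; this.element = element; }
    }

    static int sizeWithComparable() {
        Set<Tag> s = new TreeSet<>();
        s.add(new Tag(0x8, 0x8));
        s.add(new Tag(0x8, 0x8)); // compareTo returns 0, so this is a duplicate
        return s.size();
    }

    static int sizeWithComparator() {
        Comparator<PlainTag> order = Comparator
                .comparingInt((PlainTag t) -> t.group)
                .thenComparingInt(t -> t.element);
        Set<PlainTag> s = new TreeSet<>(order);
        s.add(new PlainTag(0x8, 0x8));
        s.add(new PlainTag(0x8, 0x8)); // the comparator returns 0, so this is a duplicate
        return s.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithComparable());
        System.out.println(sizeWithComparator());
    }
}
```

Neither class overrides equals or hashCode here; a TreeSet decides on duplicates purely from the ordering.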
I guess this was simply a combination of bad luck and a misunderstanding of the HashSet requirements. Thanks to @christophe for the help. I realized the issue when I tried adding the following to my SWIG-generated Tag.java class:
@Override
public boolean equals(Object o) {
}
I got the following error message:
gdcm/Tag.java:78: error: method does not override or implement a method from a supertype
@Override
^
1 error
1 warning
Which meant my error was simply this:
I had the wrong signature in the first place: boolean equals(Object o) != boolean equals(Tag t)
The hint was simply to use the @Override annotation.
For those asking for the upstream code, the Java code is generated by SWIG. The original C++ code is here:
https://github.com/malaterre/GDCM/blob/master/Source/DataStructureAndEncodingDefinition/gdcmTag.h
I am trying to build a search auto-complete feature in an Android app. There is a list of business names and a list of business types, which I put into one ArrayList, score with a fuzzy search algorithm, and then sort by score against the search term. I want a business type that scores the same as a business name to appear first. I wrap the instances of Business and BusinessType in this class as they are scored, add them to the list, and then sort:
public class SearchMatch<T extends NameMatcher> implements Comparable<SearchMatch> {
public T data;
public int score;
public SearchMatch(T data, int score) {
this.data = data;
this.score = score;
}
@Override
public int compareTo(SearchMatch o) {
if(this.score == o.score && this.data instanceof BusinessType
&& o.data instanceof Business){
return -1;
}
return o.score - this.score;
}
}
... but this does not work. I get a "Comparison method violates its general contract." error from Collections.sort, and nothing else in logcat.
I can't see what is wrong with it or how it violates transitivity (as in other similar posts). The strange thing is that if I return 1 instead of -1 I don't get the error, but I get the wrong order of preference.
Thanks
Solved
public class SearchMatch<T extends NameMatcher> implements Comparable<SearchMatch> {
public final T data;
public final int score;
public SearchMatch(final T data,final int score) {
this.data = data;
this.score = score;
}
@Override
public int compareTo(SearchMatch o) {
if(this.score == o.score && this.data instanceof BusinessType
&& o.data instanceof Business){
return -1;
}
if(this.score == o.score && this.data instanceof Business
&& o.data instanceof BusinessType){
return 1;
}
return o.score - this.score;
}
}
Consider this:
Suppose you have two objects b and bt that have the same score; one wraps a Business and the other a BusinessType.
bt.compareTo(b) -> -1
b.compareTo(bt) -> 0 // Incorrect! This should be > 0.
Another potential problem is that o.score - this.score does not take account of integer overflow.
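The overflow risk can be illustrated with a minimal sketch. The helper names bySubtraction and byIntegerCompare are made up for this example; both are meant to order higher scores first, as the question's compareTo does.

```java
public class ScoreCompareDemo {

    // Subtraction-based descending comparison, as in the question's compareTo.
    static int bySubtraction(int thisScore, int otherScore) {
        return otherScore - thisScore; // may overflow for scores far apart
    }

    // Overflow-safe descending comparison.
    static int byIntegerCompare(int thisScore, int otherScore) {
        return Integer.compare(otherScore, thisScore);
    }

    public static void main(String[] args) {
        int low = Integer.MIN_VALUE;
        int high = 1;
        // "this" has the lower score, so a descending order needs a positive result.
        System.out.println("subtraction positive?     " + (bySubtraction(low, high) > 0));
        System.out.println("Integer.compare positive? " + (byIntegerCompare(low, high) > 0));
    }
}
```

For scores that are known to stay in a small range the subtraction is harmless, but Integer.compare costs nothing and removes the question entirely.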
It is also possible that you are changing the values of score and/or data, or that the values are inconsistent due to incorrect synchronization. I recommend that you declare the two fields to be final. That will make the class immutable, and will also remove the need for synchronization (in this respect at least).
(Please refer to the other answers for the explanation of the "contract" that you are violating.)
You have violated the comparison contract. As the Comparator Javadoc puts it:
sgn(compare(x, y)) == -sgn(compare(y, x))
Essentially, you return -1 when x wraps a BusinessType and y wraps a Business with the same score, but 0 when the two are swapped, so the signs do not mirror each other. This is a violation of the contract. You should either return 0 in that case or return a positive value for the mirrored case.
Here is the documentation for Comparable. It states the following:
The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y. (This implies that x.compareTo(y) must throw an exception if y.compareTo(x) throws an exception.)
The implementor must also ensure that the relation is transitive:
(x.compareTo(y)>0 && y.compareTo(z)>0) implies x.compareTo(z)>0.
Your example does not seem to satisfy the first of these rules, hence the exception.
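The sign rule quoted above can be checked mechanically on sample data. The sketch below uses hypothetical stand-ins (BrokenMatch, FixedMatch, and a boolean isType in place of the BusinessType/Business distinction); it is not the original SearchMatch class.

```java
import java.util.Arrays;
import java.util.List;

public class CompareContractCheck {

    /** True if sgn(a.compareTo(b)) == -sgn(b.compareTo(a)) holds for every pair in items. */
    static <T extends Comparable<T>> boolean isAntisymmetric(List<T> items) {
        for (T a : items) {
            for (T b : items) {
                if (Integer.signum(a.compareTo(b)) != -Integer.signum(b.compareTo(a))) {
                    return false;
                }
            }
        }
        return true;
    }

    // Mirrors the question's logic: special-cases only one direction of the tie.
    static final class BrokenMatch implements Comparable<BrokenMatch> {
        final boolean isType; // true stands for BusinessType, false for Business
        final int score;
        BrokenMatch(boolean isType, int score) { this.isType = isType; this.score = score; }

        @Override
        public int compareTo(BrokenMatch o) {
            if (score == o.score && isType && !o.isType) return -1;
            return Integer.compare(o.score, score);
        }
    }

    // Handles the tie symmetrically: higher score first, then types before businesses.
    static final class FixedMatch implements Comparable<FixedMatch> {
        final boolean isType;
        final int score;
        FixedMatch(boolean isType, int score) { this.isType = isType; this.score = score; }

        @Override
        public int compareTo(FixedMatch o) {
            int c = Integer.compare(o.score, score);
            return c != 0 ? c : Boolean.compare(o.isType, isType);
        }
    }

    static boolean brokenIsAntisymmetric() {
        return isAntisymmetric(Arrays.asList(new BrokenMatch(true, 5), new BrokenMatch(false, 5)));
    }

    static boolean fixedIsAntisymmetric() {
        return isAntisymmetric(Arrays.asList(new FixedMatch(true, 5), new FixedMatch(false, 5)));
    }

    public static void main(String[] args) {
        System.out.println("broken: " + brokenIsAntisymmetric());
        System.out.println("fixed:  " + fixedIsAntisymmetric());
    }
}
```

A check like this only proves a violation on the samples it is given; passing it does not prove that the ordering is also transitive.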
This question already has answers here:
Java 1.7 Override of hashCode() not behaving as I would expect
(2 answers)
Closed 6 years ago.
I seem to be getting duplicate keys in the standard Java HashMap. By "duplicate", I mean the keys are equal by their equals() method. Here is the problematic code:
import java.util.Map;
import java.util.HashMap;
public class User {
private String userId;
public User(String userId) {
this.userId = userId;
}
public boolean equals(User other) {
return userId.equals(other.getUserId());
}
public int hashCode() {
return userId.hashCode();
}
public String toString() {
return userId;
}
public static void main(String[] args) {
User arvo1 = new User("Arvo-Part");
User arvo2 = new User("Arvo-Part");
Map<User,Integer> map = new HashMap<User,Integer>();
map.put(arvo1,1);
map.put(arvo2,2);
System.out.println("arvo1.equals(arvo2): " + arvo1.equals(arvo2));
System.out.println("map: " + map.toString());
System.out.println("arvo1 hash: " + arvo1.hashCode());
System.out.println("arvo2 hash: " + arvo2.hashCode());
System.out.println("map.get(arvo1): " + map.get(arvo1));
System.out.println("map.get(arvo2): " + map.get(arvo2));
System.out.println("map.get(arvo2): " + map.get(arvo2));
System.out.println("map.get(arvo1): " + map.get(arvo1));
}
}
And here is the resulting output:
arvo1.equals(arvo2): true
map: {Arvo-Part=1, Arvo-Part=2}
arvo1 hash: 164585782
arvo2 hash: 164585782
map.get(arvo1): 1
map.get(arvo2): 2
map.get(arvo2): 2
map.get(arvo1): 1
As you can see, the equals() method on the two User objects is returning true and their hash codes are the same, yet they each form a distinct key in map. Furthermore, map continues to distinguish between the two User keys in the last four get() calls.
This directly contradicts the documentation:
More formally, if this map contains a mapping from a key k to a value v such that (key==null ? k==null : key.equals(k)), then this method returns v; otherwise it returns null. (There can be at most one such mapping.)
Is this a bug? Am I missing something here? I'm running Java version 1.8.0_92, which I installed via Homebrew.
EDIT: This question has been marked as a duplicate of this other question, but I'll leave this question as is because it identifies a seeming inconsistency with equals(), whereas the other question assumes the error lies with hashCode(). Hopefully the presence of this question will make this issue more easily searchable.
The issue lies in your equals() method. The signature of Object.equals() is equals(Object), but yours is equals(User), so these are two completely different methods, and the HashMap calls the one with the Object parameter. You can verify that by putting an @Override annotation on your equals: it will generate a compiler error.
The equals method should be:
@Override
public boolean equals(Object other) {
if(other instanceof User){
User user = (User) other;
return userId.equals(user.userId);
}
return false;
}
As a best practice you should always put @Override on the methods you override. It can save you a lot of trouble.
Your equals method does not override equals, and the types in the Map are erased at runtime, so the actual equals method called is equals(Object). Your equals should look more like this:
@Override
public boolean equals(Object other) {
if (!(other instanceof User))
return false;
User u = (User)other;
return userId.equals(u.userId);
}
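Why the map never reaches equals(User) can be sketched as follows: the overload is chosen at compile time from the static type of the argument, and HashMap only ever holds its keys as Object. The User class below is a trimmed stand-in for the one in the question.

```java
public class OverloadDispatchDemo {

    static final class User {
        private final String userId;
        User(String userId) { this.userId = userId; }

        // Overload, as in the question: the parameter type is User, not Object.
        public boolean equals(User other) {
            return other != null && userId.equals(other.userId);
        }
    }

    static boolean viaUserReference() {
        User a = new User("Arvo-Part");
        User b = new User("Arvo-Part");
        return a.equals(b); // static type User: the compiler picks equals(User)
    }

    static boolean viaObjectReference() {
        User a = new User("Arvo-Part");
        Object b = new User("Arvo-Part");
        return a.equals(b); // static type Object: the compiler picks Object.equals(Object)
    }

    public static void main(String[] args) {
        System.out.println("via User reference:   " + viaUserReference());
        System.out.println("via Object reference: " + viaObjectReference());
    }
}
```

This is why the question's own println shows true while the map still treats the two keys as different.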
OK, so first of all, the code doesn't compile. This method is missing:
other.getUserId()
But aside from that, you'll need to @Override the equals method. An IDE like Eclipse can also help generate equals and hashCode, by the way.
@Override
public boolean equals(Object obj)
{
if(this == obj)
return true;
if(obj == null)
return false;
if(getClass() != obj.getClass())
return false;
User other = (User) obj;
if(userId == null)
{
if(other.userId != null)
return false;
}
else if(!userId.equals(other.userId))
return false;
return true;
}
As others have answered, you had a problem with the equals method signature. According to Java best practice, you should implement equals like the following:
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
User user = (User) o;
return userId.equals(user.userId);
}
The same applies to the hashCode() method. See "Overriding equals() and hashCode() method in Java".
The Second Problem
You don't have duplicates anymore, but you have a new problem: your HashMap now contains only one element:
map: {Arvo-Part=2}
This is because both User objects hold userId strings with the same content, so from the HashMap's perspective the two keys are the same: their hash codes match and equals() returns true. (String interning is not required for this; any two strings with equal content behave the same way.) So when you put the second object into the HashMap, its value replaces the first one's.
To avoid this, make sure each User has a unique ID.
A simple demonstration on your users :
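What happens once equals is overridden correctly can be sketched like this. The userId strings are deliberately created with new String so that they are distinct objects, which suggests the outcome depends on the strings' content rather than on both users sharing one String instance. The User class is a trimmed stand-in for the one in the question.

```java
import java.util.HashMap;
import java.util.Map;

public class EqualKeysDemo {

    static final class User {
        private final String userId;
        User(String userId) { this.userId = userId; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return userId.equals(((User) o).userId);
        }

        @Override
        public int hashCode() { return userId.hashCode(); }

        @Override
        public String toString() { return userId; }
    }

    static Map<User, Integer> build() {
        Map<User, Integer> map = new HashMap<>();
        // Two distinct String objects with equal content.
        map.put(new User(new String("Arvo-Part")), 1);
        map.put(new User(new String("Arvo-Part")), 2); // equal key: replaces the value 1
        return map;
    }

    public static void main(String[] args) {
        System.out.println("map: " + build());
    }
}
```

A single entry is the expected behaviour for equal keys; it only counts as a problem if the two users were meant to be different people.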
As a relative Java noob, I was baffled to find out the following:
Point.java:
public class Point {
...
public boolean equals(Point other) {
return x == other.x && y == other.y;
}
...
}
Edge.java:
public class Edge {
public final Point a, b;
...
public boolean equals(Edge other) {
return a.equals(other.a) && b.equals(other.b);
}
...
}
main snippet:
private Set blockedEdges;
public Program(...) {
...
blockedEdges = new HashSet<Edge>();
for (int i = 0; ...) {
for (int j = 0; ...) {
Point p = new Point(i, j);
for (Point q : p.neighbours()) {
Edge e = new Edge(p, q);
Edge f = new Edge(p, q);
blockedEdges.add(e);
// output for each line is:
// contains e? true; e equals f? true; contains f? false
System.out.println("blocked edge from "+p+"to " + q+
"; contains e? " + blockedEdges.contains(e)+
" e equals f? "+ f.equals(e) +
"; contains f? " + blockedEdges.contains(f));
}
}
}
}
Why is this surprising? Because I checked the documentation before I coded this to rely on equality and it says:
Returns true if this set contains the specified element. More
formally, returns true if and only if this set contains an element e
such that (o==null ? e==null : o.equals(e))
This sentence is very clear and it states that nothing more than equality is needed. f.equals(e) returns true as shown in the output. So clearly the set does indeed contain an element e such that o.equals(e), yet contains(o) returns false.
While it is certainly understandable that a hash set also depends on the hash values being the same, this fact is mentioned neither in the docs of HashSet itself, nor is any such possibility mentioned in the docs of Set.
Thus, HashSet doesn't adhere to its specification. This looks like a very serious bug to me. Am I completely on the wrong track here? Or how come behaviour like this is accepted?
You're not overriding equals (you're overloading it). equals needs to accept an Object as its argument.
Do something like
@Override
public boolean equals(Object o) {
if (!(o instanceof Point))
return false;
Point other = (Point) o;
return x == other.x && y == other.y;
}
(and same for Edge)
It's also important to always override hashCode when you're overriding equals. See for instance Why do I need to override the equals and hashCode methods in Java?
Note that this mistake would have been caught by the compiler if you had used @Override. This is why it's good practice to always use it where possible.
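Putting both pieces of advice together, a minimal sketch of Point and Edge with matching equals and hashCode might look like this. The fields follow the question; the use of Objects.hash is just one convenient choice and not the only valid one.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class EdgeSetDemo {

    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    static final class Edge {
        final Point a, b;
        Edge(Point a, Point b) { this.a = a; this.b = b; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Edge)) return false;
            Edge other = (Edge) o;
            return a.equals(other.a) && b.equals(other.b);
        }

        @Override
        public int hashCode() { return Objects.hash(a, b); } // built from the Points' hash codes
    }

    static boolean containsEqualEdge() {
        Set<Edge> blockedEdges = new HashSet<>();
        blockedEdges.add(new Edge(new Point(0, 0), new Point(0, 1)));
        // A separately constructed but equal edge is now found.
        return blockedEdges.contains(new Edge(new Point(0, 0), new Point(0, 1)));
    }

    public static void main(String[] args) {
        System.out.println("contains f? " + containsEqualEdge());
    }
}
```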
Of course, the definition of "empty" can differ. I'm used to PHP's empty, though, which treats everything that evaluates to false as empty. I'd like to treat these things as empty in my Java application:
null
String of zero length
0 Integer, Float or Double
false
Any array of zero length
Empty ArrayList or HashMap
Java has, for example, the toString convention: every object is guaranteed to give you some string representation. In my Settings class I work with a HashMap<String, Object>. My empty method currently looks like this:
public boolean empty(String name) {
Object val = settings.get(name);
if(val!=null) {
return false;
}
return true;
}
I'd like to extend it in a conventional manner, rather than if(val instanceof XXX) chain.
No, there is no standard convention for this in Java. Also, in Java there is no such thing as "evaluate to false" (except for booleans and Booleans, of course).
You will have to write a method (or rather, a series of overloaded methods for each type you need it for) which implements your notion of "empty". For example:
public static boolean isEmpty(String s) {
return (s == null) || (s.isEmpty());
}
public static boolean isEmpty(int i) {
return i == 0;
}
...
You could use overloading to describe all the "empty" objects:
public static boolean empty(Object o) {
return o == null;
}
public static boolean empty(Object[] array) {
return array == null || array.length == 0;
}
public static boolean empty(int[] array) { //do the same for other primitives
return array == null || array.length == 0;
}
public static boolean empty(String s) {
return s == null || s.isEmpty();
}
public static boolean empty(Number n) {
return n == null || n.doubleValue() == 0;
}
public static boolean empty(Collection<?> c) {
return c == null || c.isEmpty();
}
public static boolean empty(Map<?, ?> m) {
return m == null || m.isEmpty();
}
Examples:
public static void main(String[] args) {
Object o = null;
System.out.println(empty(o));
System.out.println(empty(""));
System.out.println(empty("as"));
System.out.println(empty(new int[0]));
System.out.println(empty(new int[] { 1, 2}));
System.out.println(empty(Collections.emptyList()));
System.out.println(empty(Arrays.asList("s")));
System.out.println(empty(0));
System.out.println(empty(1));
}
AFAIK there is no such convention. It's fairly common to see project specific utility classes with methods such as:
public static boolean isEmpty(String s) {
return s == null || s.isEmpty();
}
However I personally think its use is a bit of a code smell in Java. There's a lot of badly written Java around, but well written Java shouldn't need null checks everywhere, and you should know enough about the type of an object to apply type-specific definitions of "empty".
The exception would be reflection-oriented code that works with Object variables whose type you don't know at compile time. That code should be so isolated that it's not appropriate to have a util method to support it.
Python's duck-typing means the rules are sort of different.
How about creating an interface EmptinessComparable or something similar, and having all your classes implement that? So you can just expect that, and not have to ask instanceof every time.
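A minimal sketch of that idea follows. The interface name is taken from the suggestion above; the isEmpty method, the trimmed Settings class, and the empty helper are assumptions made for illustration. Note that JDK types such as String or Integer cannot be made to implement your own interface, so this only covers classes you control.

```java
import java.util.HashMap;
import java.util.Map;

public class EmptinessDemo {

    /** Hypothetical interface; each implementing class decides what "empty" means for it. */
    interface EmptinessComparable {
        boolean isEmpty();
    }

    /** Hypothetical application class that opts in to the convention. */
    static final class Settings implements EmptinessComparable {
        private final Map<String, Object> values = new HashMap<>();

        void put(String name, Object value) { values.put(name, value); }

        @Override
        public boolean isEmpty() { return values.isEmpty(); }
    }

    /** Null-safe check that needs no instanceof chain. */
    static boolean empty(EmptinessComparable value) {
        return value == null || value.isEmpty();
    }

    static boolean freshSettingsIsEmpty() {
        return empty(new Settings());
    }

    static boolean filledSettingsIsEmpty() {
        Settings s = new Settings();
        s.put("debug", Boolean.TRUE);
        return empty(s);
    }

    public static void main(String[] args) {
        System.out.println(empty(null));
        System.out.println(freshSettingsIsEmpty());
        System.out.println(filledSettingsIsEmpty());
    }
}
```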
Java does not, but Groovy does. Groovy runs on the Java VM alongside Java code and provides many shortcuts and convenient conventions such as this. A good approach is to write the foundation and critical project components in Java and use Groovy for less critical, higher-level components.
If you want a single approach, I would overload a utility method:
public class MyUtils {
public static boolean isEmpty(String s) {
return s == null || s.isEmpty();
}
public static boolean isEmpty(Boolean b) {
return b == null || !b;
}
// add other versions of the method for other types
}
Then your code always looks like:
if (MyUtils.isEmpty(something))
If the type you're checking isn't supported, you'll get a compiler error, and you can implement another version as you like.
There are ways to establish the notion of emptiness but it's not standardized across all Java classes. For example, the Map (implementation) provides the Map#containsKey() method to check if a key exists or not. The List and String (implementations) provide the isEmpty() method but the List or String reference itself could be null and hence you cannot avoid a null check there.
You could however come up with a utility class of your own that takes an Object and using instanceof adapts the empty checks accordingly.
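import java.util.Collection;

public final class DataUtils {
    public static boolean isEmpty(Object data) {
        // null counts as empty, matching the definition in the question
        if (data == null) {
            return true;
        }
        if (data instanceof String) {
            return ((String) data).isEmpty();
        }
        if (data instanceof Collection) {
            return ((Collection<?>) data).isEmpty();
        }
        // any other type is treated as non-empty; add further instanceof checks as needed
        return false;
    }
}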
The Guava libraries already contain a Defaults class that does just that.
Calling defaultValue will return the default value for any primitive type (as specified by the JLS), and null for any other type.
You can use it like shown below:
import com.google.common.base.Defaults;
Defaults.defaultValue(Integer.TYPE); //will return 0
Below is example code on how to use it:
import com.google.common.base.Defaults;
public class CheckingFieldsDefault
{
public static class MyClass {
private int x;
private int y = 2;
}
public static void main(String[] args) {
MyClass my = new MyClass();
System.out.println("x is default: " + (my.x == Defaults.defaultValue(box(my.x).TYPE)));
System.out.println("y is default: " + (my.y == Defaults.defaultValue(box(my.y).TYPE)));
}
private static <T extends Object> T box(T t) {
return t;
}
}
So I've been struggling with a problem for a while now, figured I might as well ask for help here.
I'm adding Ticket objects to a TreeSet. Ticket implements Comparable and overrides the equals(), hashCode() and compareTo() methods. I need to check whether an object is already in the TreeSet using contains(). After adding 2 elements to the set it all checks out fine, yet after adding a third it gets messed up.
I run this little piece of code after adding a third element to the TreeSet (verkoopLijst); Ticket temp2 is the object I'm checking for.
Ticket temp2 = new Ticket(boeking, TicketType.STANDAARD, 1,1);
System.out.println(verkoop.getVerkoopLijst().first().hashCode());
System.out.println(temp2.hashCode());
System.out.println(verkoop.getVerkoopLijst().first().equals(temp2));
System.out.println(verkoop.getVerkoopLijst().first().compareTo(temp2));
System.out.println(verkoop.getVerkoopLijst().contains(temp2));
returns this:
22106622
22106622
true
0
false
Now my question would be how this is even possible?
Edit:
public class Ticket implements Comparable{
private int rijNr, stoelNr;
private TicketType ticketType;
private Boeking boeking;
public Ticket(Boeking boeking, TicketType ticketType, int rijNr, int stoelNr){
//setters
}
@Override
public int hashCode(){
return boeking.getBoekingDatum().hashCode();
}
@Override
@SuppressWarnings("EqualsWhichDoesntCheckParameterClass")
public boolean equals(Object o){
Ticket t = (Ticket) o;
if(this.boeking.equals(t.getBoeking())
&&
this.rijNr == t.getRijNr() && this.stoelNr == t.getStoelNr()
&&
this.ticketType.equals(t.getTicketType()))
{
return true;
}
else return false;
}
/* I adjusted compareTo this way because I need to make sure there are no duplicate Tickets in my TreeSet. TreeSet seems to call compareTo() to check for equality before adding an object to the set, instead of equals().
*/
@Override
public int compareTo(Object o) {
int output = 0;
if (boeking.compareTo(((Ticket) o).getBoeking())==0)
{
if(this.equals(o))
{
return output;
}
else return 1;
}
else output = boeking.compareTo(((Ticket) o).getBoeking());
return output;
}
//Getters & Setters
On compareTo contract
The problem is in your compareTo. Here's an excerpt from the documentation:
Implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y.
Your original code is reproduced here for reference:
// original compareTo implementation with bug marked
@Override
public int compareTo(Object o) {
int output = 0;
if (boeking.compareTo(((Ticket) o).getBoeking())==0)
{
if(this.equals(o))
{
return output;
}
else return 1; // BUG!!!! See explanation below!
}
else output = boeking.compareTo(((Ticket) o).getBoeking());
return output;
}
Why is the return 1; a bug? Consider the following scenario:
Given Ticket t1, t2
Given t1.boeking.compareTo(t2.boeking) == 0
Given t1.equals(t2) return false
Now we have both of the following:
t1.compareTo(t2) returns 1
t2.compareTo(t1) returns 1
That last consequence is a violation of the compareTo contract.
Fixing the problem
First and foremost, you should have taken advantage of the fact that Comparable<T> is a parameterizable generic type. That is, instead of:
// original declaration; uses raw type!
public class Ticket implements Comparable
it'd be much more appropriate to instead declare something like this:
// improved declaration! uses parameterized Comparable<T>
public class Ticket implements Comparable<Ticket>
Now we can write our compareTo(Ticket) (no longer compareTo(Object)). There are many ways to rewrite this, but here's a rather simplistic one that works:
@Override public int compareTo(Ticket t) {
    int v;
    v = this.boeking.compareTo(t.boeking);
    if (v != 0) return v;
    v = compareInt(this.rijNr, t.rijNr);
    if (v != 0) return v;
    v = compareInt(this.stoelNr, t.stoelNr);
    if (v != 0) return v;
    // ticketType is not an int; this assumes TicketType is an enum (and therefore Comparable)
    v = this.ticketType.compareTo(t.ticketType);
    if (v != 0) return v;
    return 0;
}
private static int compareInt(int i1, int i2) {
if (i1 < i2) {
return -1;
} else if (i1 > i2) {
return +1;
} else {
return 0;
}
}
Now we can also define equals(Object) in terms of compareTo(Ticket) instead of the other way around:
@Override public boolean equals(Object o) {
return (o instanceof Ticket) && (this.compareTo((Ticket) o) == 0);
}
Note the structure of the compareTo: it has multiple return statements, but in fact, the flow of logic is quite readable. Note also how the priority of the sorting criteria is explicit, and easily reorderable should you have different priorities in mind.
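The same priority order can also be sketched with a Comparator chain (available since Java 8). This is an alternative formulation, not the code from the answer: boekingId is a hypothetical int standing in for the Boeking object, and VIP is a made-up second enum constant.

```java
import java.util.Comparator;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

public class TicketOrderDemo {

    enum TicketType { STANDAARD, VIP } // VIP is a made-up second constant

    static final class Ticket implements Comparable<Ticket> {
        // Sorting criteria listed in priority order.
        private static final Comparator<Ticket> ORDER = Comparator
                .comparingInt((Ticket t) -> t.boekingId)
                .thenComparingInt(t -> t.rijNr)
                .thenComparingInt(t -> t.stoelNr)
                .thenComparing((Ticket t) -> t.ticketType);

        final int boekingId; // hypothetical stand-in for the Boeking object
        final TicketType ticketType;
        final int rijNr, stoelNr;

        Ticket(int boekingId, TicketType ticketType, int rijNr, int stoelNr) {
            this.boekingId = boekingId;
            this.ticketType = ticketType;
            this.rijNr = rijNr;
            this.stoelNr = stoelNr;
        }

        @Override
        public int compareTo(Ticket t) { return ORDER.compare(this, t); }

        @Override
        public boolean equals(Object o) {
            return (o instanceof Ticket) && compareTo((Ticket) o) == 0;
        }

        @Override
        public int hashCode() { return Objects.hash(boekingId, rijNr, stoelNr, ticketType); }
    }

    static boolean containsAfterThreeAdds() {
        Set<Ticket> verkoopLijst = new TreeSet<>();
        verkoopLijst.add(new Ticket(1, TicketType.STANDAARD, 1, 1));
        verkoopLijst.add(new Ticket(1, TicketType.STANDAARD, 1, 2));
        verkoopLijst.add(new Ticket(1, TicketType.VIP, 2, 1));
        return verkoopLijst.contains(new Ticket(1, TicketType.STANDAARD, 1, 1));
    }

    static boolean signsMirror() {
        Ticket a = new Ticket(1, TicketType.STANDAARD, 1, 1);
        Ticket b = new Ticket(1, TicketType.STANDAARD, 1, 2);
        return Integer.signum(a.compareTo(b)) == -Integer.signum(b.compareTo(a));
    }

    public static void main(String[] args) {
        System.out.println(containsAfterThreeAdds());
        System.out.println(signsMirror());
    }
}
```

Because every criterion is itself a consistent ordering, the chained comparator keeps the sign rule without any special cases.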
Related questions
What is a raw type and why shouldn't we use it?
How to sort an array or ArrayList ASC first by x and then by y?
Should a function have only one return statement?
This could happen if your compareTo method isn't consistent. I.e. if a.compareTo(b) > 0, then b.compareTo(a) must be < 0. And if a.compareTo(b) > 0 and b.compareTo(c) > 0, then a.compareTo(c) must be > 0. If those aren't true, TreeSet can get all confused.
Firstly, if you are using a TreeSet, the actual behavior of your hashCode methods won't affect the results. TreeSet does not rely on hashing.
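That a TreeSet consults only its ordering can be shown with a small sketch using the JDK's own String.CASE_INSENSITIVE_ORDER comparator; the method name and sample strings are made up for the example.

```java
import java.util.Set;
import java.util.TreeSet;

public class TreeSetUsesOrderingDemo {

    static boolean containsIgnoringCase() {
        Set<String> names = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        names.add("ticket");
        // "TICKET".equals("ticket") is false, but the comparator returns 0,
        // so the TreeSet treats the two strings as the same element.
        return names.contains("TICKET");
    }

    public static void main(String[] args) {
        System.out.println("equals:   " + "TICKET".equals("ticket"));
        System.out.println("contains: " + containsIgnoringCase());
    }
}
```

The reverse also holds: if the ordering reports a non-zero result for two objects that are equal by equals(), the TreeSet will not find one when asked for the other.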
Really we need to see more code; e.g. the actual implementations of the equals and compareTo methods, and the code that instantiates the TreeSet.
However, if I was to guess, it would be that you have overloaded the equals method by declaring it with the signature boolean equals(Ticket other). That would lead to the behavior that you are seeing. To get the required behavior, you must override the method; e.g.
@Override
public boolean equals(Object other) { ...
(It is a good idea to put in the @Override annotation to make it clear that the method overrides a method in the superclass, or implements a method in an interface. If your method isn't actually an override, then you'll get a compilation error ... which would be a good thing.)
EDIT
Based on the code that you have added to the question, the problem is not overload vs override. (As I said, I was only guessing ...)
It is most likely that compareTo and equals are incorrect. It is still not entirely clear exactly where the bug is, because the semantics of both methods depend on the compareTo and equals methods of the Boeking class.
The first if statement of the Ticket.compareTo looks highly suspicious. It looks like the return 1; could cause t1.compareTo(t2) and t2.compareTo(t1) to both return 1 for some tickets t1 and t2 ... and that would definitely be wrong.