I have an object that represents an UNKNOWN value, or "Null object".
As in SQL, this object should never be equal to anything, including another UNKNOWN, so (UNKNOWN == UNKNOWN) -> false.
The object is used, however, in hashtables and the type is Comparable, so I created a class as follows:
public class Tag implements Comparable {
final static UNKNOWN = new Tag("UNKNOWN");
String id;
public Tag(String id) {
this.id = id;
}
public int hashCode(){
return id.hashCode();
}
public String toString(){
return id;
}
public boolean equals(Object o){
if (this == UNKNOWN || o == UNKNOWN || o == null || !(o instanceof Tag))
return false;
return this.id.equals(((Tag)o).id);
}
public int compareTo(Tag o){
if (this == UNKNOWN)
return -1;
if (o == UNKNOWN || o == null)
return 1;
return this.id.compareTo(o.id);
}
}
But now compareTo() seems "inconsistent"?
Is there a better way to implement compareTo()?
The documentation for compareTo mentions this situation:
It is strongly recommended, but not strictly required that
(x.compareTo(y)==0) == (x.equals(y))
Generally speaking, any class that implements the Comparable interface and violates this condition should clearly indicate this fact. The recommended language is "Note: this class has a natural ordering that is inconsistent with equals."
Therefore, if you want your object to be Comparable and yet still not allow two UNKNOWN objects to be equal via the equals method, you must make your compareTo "Inconsistent with equals."
An appropriate implementation would be:
public int compareTo(Tag t) {
return this.id.compareTo(t.id);
}
Otherwise, you could make it explicit that UNKNOWN values in particular are not Comparable:
public static boolean isUnknown(Tag t) {
return t == UNKNOWN || (t != null && "UNKNOWN".equals(t.id));
}
public int compareTo(Tag t) {
if (isUnknown(this) || isUnknown(t)) {
throw new IllegalStateException("UNKNOWN is not Comparable");
}
return this.id.compareTo(t.id);
}
You're correct that your compareTo() method is now inconsistent. It violates several of the requirements for this method. The compareTo() method must provide a total order over the values in the domain. In particular, as mentioned in the comments, a.compareTo(b) < 0 must imply that b.compareTo(a) > 0. Also, a.compareTo(a) == 0 must be true for every value.
If your compareTo() method doesn't fulfil these requirements, then various pieces of the API will break. For example, if you sort a list containing an UNKNOWN value, then you might get the dreaded "Comparison method violates its general contract!" exception.
How does this square with the SQL requirement that null values aren't equal to each other?
For SQL, the answer is that it bends its own rules somewhat. There is a section in the Wikipedia article you cited that covers the behavior of things like grouping and sorting in the presence of null. While null values aren't considered equal to each other, they are also considered "not distinct" from each other, which allows GROUP BY to group them together. (I detect some specification weasel wording here.) For sorting, SQL requires ORDER BY clauses to have additional NULLS FIRST or NULLS LAST in order for sorting with nulls to proceed.
So how does Java deal with IEEE 754 NaN which has similar properties? The result of any comparison operator applied to NaN is false. In particular, NaN == NaN is false. This would seem to make it impossible to sort floating point values, or to use them as keys in maps. It turns out that Java has its own set of special cases. If you look at the specifications for Double.compareTo() and Double.equals(), they have special cases that cover exactly these situations. Specifically,
Double.NaN == Double.NaN // false
Double.valueOf(Double.NaN).equals(Double.NaN) // true!
Also, Double.compareTo() is specified so that it considers NaN equal to itself (it is consistent with equals) and NaN is considered larger than every other double value including POSITIVE_INFINITY.
There is also a utility method Double.compare(double, double) that compares two primitive double values using these same semantics.
These special cases let Java sorting, maps, and so forth work perfectly well with Double values, even though this violates IEEE 754. (But note that primitive double values do conform to IEEE 754.)
How should this apply to your Tag class and its UNKNOWN value? I don't think you need to follow SQL's rules for null here. If you're using Tag instances in Java data structures and with Java class libraries, you'd better make it conform to the requirements of the compareTo() and equals() methods. I'd suggest making UNKNOWN equal to itself, to have compareTo() be consistent with equals, and to define some canonical sort order for UNKNOWN values. Usually this means sorting it higher than or lower than every other value. Doing this isn't terribly difficult, but it can be subtle. You need to pay attention to all the rules of compareTo().
The equals() method might look something like this. Fairly conventional:
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
return obj instanceof Tag && id.equals(((Tag)obj).id);
}
Once you have this, then you'd write compareTo() in a way that relies on equals(). (That's how you get the consistency.) Then, special-case the unknown values on the left or right-hand sides, and finally delegate to comparison of the id field:
public int compareTo(Tag o) {
if (this.equals(o)) {
return 0;
}
if (this.equals(UNKNOWN)) {
return -1;
}
if (o.equals(UNKNOWN)) {
return 1;
}
return id.compareTo(o.id);
}
I'd recommend implementing equals(), so that you can do things like filter UNKNOWN values of a stream, store it in collections, and so forth. Once you've done that, there's no reason not to make compareTo consistent with equals. I wouldn't throw any exceptions here, since that will just make standard libraries hard to use.
The simple answer is: you shouldn't.
You have contradiction requirements here. Either your tag objects have an implicit order (that is what Comparable expresses) OR you can have such "special" values that are not equal to anything, not even themselves.
As the other excellent answer and the comments point out: yes, you can somehow get there; for example by simply allowing for a.compare(b) < 0 and b.compare(a) < 0 at the same time; or by throwing an exception.
But I would simply be really careful about this. You are breaking a well established contract. And the fact that some javadoc says: "breaking the contract is OK" is not the point - breaking that contract means that all the people working on this project have to understand this detail.
Coming from there: you could go forward and simply throw an exception within compareTo() if a or b are UNKNOWN; by doing so you make at least clear that one shouldn't try to sort() a List<Tag> for example. But hey, wait - how would you find out that UNKNOWN is present in your list? Because, you know, UNKNOWN.equals(UNKNOWN) returns false; and contains() is using equals.
In essence: while technically possible, this approach causes breakages wherever you go. Meaning: the fact that SQL supports this concept doesn't mean that you should force something similar into your java code. As said: this idea is very much "off standards"; and is prone to surprise anybody looking at it. Aka "unexpected behavior" aka bugs.
A couple seconds of critical thinking:
There is already a null in Java and you can not use it as a key for a reason.
If you try and use a key that is not equal to anything else including
itself you can NEVER retrieve the value associated with that key!
Related
The following compares two enum values using ==:
MyEnum enum1 = blah(); // could return null
MyEnum enum2 = blahblah() // could return null
if (enum1 == enum2) {
// ...
}
But PMD gives a CompareObjectsWithEquals warning on line 3:
Use equals() to compare object references
Not sure I understand the source code for this check but thought it was OK to compare two enums using == so am wondering whether my code could be improved or the check is incorrect.
This is indeed accepted as bug:
http://sourceforge.net/p/pmd/bugs/1028/
http://sourceforge.net/p/pmd/bugs/909/
However, it seems to be tricky to catch all possible cases (quote from the newer bug):
That one is a bit tricky, as in order to determine, whether a type is
a Enum, we need type resolution.
I was able to adjust the rule to check, whether the type of the
variables is an Enum. This works only, if the Enum types are on the
"auxclasspath" of pmd, so that the type resolution can find it.
Your example in isolation would still trigger this false positive, as
PMD doesn't know what ProcessingStatus is. I verified it with
java.math.RoundingMode, which is always on the classpath and will be
resolved.
("Your example" refers to the ticket author, not the OP on Stack Overflow)
Your case might work with PMD 5, the source you linked belongs to PMD 4.
Update: The current source contains the additional check for enums:
// skip, if it is an enum
if (type0.getType() != null && type0.getType().equals(type1.getType()) && type0.getType().isEnum()) {
return data;
}
It's fine to use .equals(), because under the hoods, what happens is that the instances are compared with ==.
public final boolean equals(Object other) {
return this==other;
}
Note that this implementation of .equals() is final, which means you cannot override it in your enum.
EDIT: Heart of the Matter
When would an Identity Test pass, when the rest of a traditional equals method would fail? Is this added just to save the time of doing extra work?
Original Post
I am utilizing the CompareToBuilder from org.apache.commons.lang3.builder.CompareToBuilder in a class I am testing. I notice that the EqualsBuilder requires the following code to be explicitly called BEFORE calling the equals builder.
if (obj == this) { return true; } // Identity test
Such logic also appears in the Eclipse auto-generated equals method. I am attempting to use DRY methodology by having my equals method simply call my compareTo method and test equivalence to 0.
One question is whether I need to include the above code in my equals method, add it to my compareTo method or if it is already covered by CompareToBuilder. I notice that CompareToBuilder checks the equivalence of the parameters passed but does not recieve any direct references to the original lhs (this) and rhs (Object obj). That leads me to think that it is an oversight I should correct in my compareTo method.
My largest issue is that I cannot seem to devise a potential test case in which obj == this but this.compareTo(obj) != 0. Only thing that comes to mind is in an improperly implemented compareTo, sending in obj with one of its instance variables null could return a non-zero number if it is not first checked to see if the corresponding variable in this is also null. (Ran into this yesterday).
Sample equals method:
#Override
public boolean equals(Object obj) {
if (obj != null && obj instanceof MyClass)
return this.compareTo((MyClass)obj) == 0;
return false;
}
Sample compareTo method
#Override
public int compareTo(MyClass other) {
return new CompareToBuilder()
.append(this.getParam1(), other.getParam1())
.append(this.getParam2(), other.getParam2())
.toComparison();
}
Looking in the documentation for Comparable, compareTo() requires that
x.compareTo(y) == -y.compareTo(x)
Therefore, if x == y, x.compareTo(x) == -x.compareTo(x) and thus x.compareTo(x) == 0. So the only way to make x == y && x.compareTo(y) != 0 is to break the compareTo() contract.
Note that, in my understanding, the obj == this test tends to be added to equals() methods as an optimization to keep from evaluating a deep comparison when unnecessary and does not affect the result of the call. (See Effective Java 2nd Ed, Item 8.) I assume the same holds true for the compareTo() method here.
When would an Identity Test pass, when the rest of a traditional equals method would fail? Is this added just to save the time of doing extra work?
The short answer is: never.
The identity test is a cheap optimization intended as a quick way out of the rest of the equals() method. In fact, Joshua Bloch recommends that every equals() method begin with just such a test as an optimization (Effective Java, 2nd Ed.).
Mr. Bloch recommends the following general structure for equals():
Use the == operator to check if the argument is a reference to this object. If it is, return true.
Use the instanceof operator to check if the argument is of the correct type. If it is not, then return false.
Cast the argument to the correct type.
For each significant field in the class, check if that field of the argument matches the corresponding field of this object.
and etc.
When would an identity test pass when a traditional equals fail?
Technically, if "traditional" also means "naive", then an object graph with a loop in it (e.g. A has an ivar that points to B which has an ivar that points to A) would pass the identity test, but a traditional equals implementation would go into an infinite loop.
I'm not sure if the Apache library would protect against this.
I'm implementing a value object for these interfaces:
interface FooConsumer
{
public void setFoo(FooKey key, Foo foo);
public Foo getFoo(FooKey key);
}
// intent is for this to be a value object with equivalence based on
// name and serial number
interface FooKey
{
public String getName();
public int getSerialNumber();
}
and from what I've read (e.g. in Enforce "equals" in an interface and toString(), equals(), and hashCode() in an interface) it looks like the recommendation is to provide an abstract base class, e.g.
abstract class AbstractFooKey
{
final private String name;
final private int serialNumber
public AbstractFooKey(String name, int serialNumber)
{
if (name == null)
throw new NullPointerException("name must not be null");
this.name = name;
this.serialNumber = serialNumber;
}
#Override public boolean equals(Object other)
{
if (other == this)
return true;
if (!(other instanceof FooKey))
return false;
return getName().equals(other.getName()
&& getSerialNumber() == other.getSerialNumber()
&& hashCode() == other.hashCode(); // ***
}
#Override public int hashCode()
{
return getName().hashCode() + getSerialNumber()*37;
}
}
My question is about the last bit I added here, and how to deal with the situation where AbstractFooKey.equals(x) is called with a value for x that is an instance of a class that implements FooKey but does not subclass AbstractFooKey. I'm not sure how to handle this; on the one hand I feel like the semantics of equality should just depend on the name and serialNumber being equal, but it appears like the hashCodes have to be equal as well in order to satisfy the contract for Object.equals().
Should I be:
really lax and just forget about the line marked ***
lax and keep what I have
return false from equals() if the other object is not an AbstractFooKey
be really strict and get rid of the interface FooKey and replace it with a class that is final?
something else?
Document the required semantics as part of the contract.
Ideally you'd actually have a single implementation which is final, which kind of negates the need of an interface for this particular purpose. You may have other reasons for wanting an interface for the type.
The contract requirements of Object is actually from hashCode: If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
You don't need to include hashCode in the equals computation, rather you need to include all properties involved in equals in the hashCode calculation. In this case I'd simply compare serialNumber and name in both equals and hashCode.
Keep it simple unless you have a real reason to complicate it.
Start with a final, immutable class.
If you need an interface, create one to match, and document the semantics and default implementation.
For the equals and hashmap, there are strict contracts:
Reflexive - It simply means that the object must be equal to itself, which it would be at any given instance; unless you intentionally override the equals method to behave otherwise.
Symmetric - It means that if object of one class is equal to another class object, the other class object must be equal to this class object. In other words, one object can not unilaterally decide whether it is equal to another object; two objects, and consequently the classes to which they belong, must bilaterally decide if they are equal or not. They BOTH must agree.
Transitive - It means that if the first object is equal to the second object and the second object is equal to the third object; then the first object is equal to the third object. In other words, if two objects agree that they are equal, and follow the symmetry principle, one of them can not decide to have a similar contract with another object of different class. All three must agree and follow symmetry principle for various permutations of these three classes.
Consistent - It means that if two objects are equal, they must remain equal as long as they are not modified. Likewise, if they are not equal, they must remain non-equal as long as they are not modified. The modification may take place in any one of them or in both of them.
null comparison - It means that any instantiable class object is not equal to null, hence the equals method must return false if a null is passed to it as an argument. You have to ensure that your implementation of the equals method returns false if a null is passed to it as an argument.
Contract for hashCode():
Consistency during same execution - Firstly, it states that the hash code returned by the hashCode method must be consistently the same for multiple invocations during the same execution of the application as long as the object is not modified to affect the equals method.
Hash Code & Equals relationship - The second requirement of the contract is the hashCode counterpart of the requirement specified by the equals method. It simply emphasizes the same relationship - equal objects must produce the same hash code. However, the third point elaborates that unequal objects need not produce distinct hash codes.
(From: Technofundo: Equals and Hash Code)
However, using instanceof in equals is not the right thing to do. Joshua Bloch detailed this in Effective Java, and your concerns regarding the validity of your equals implementation is valid. Most likely, problems arising from using instanceof are going to violate the transitivity part in the contract when used in connection with descendants of the base class - unless the equals function is made final.
(Detailed a bit better than I could ever do here: Stackoverflow: Any reason to prefer getClass() over instanceof when generating .equals()?)
Also read:
Java API equals contract
Java API hashCode contract
If the equality of a FooKey is such that two FooKeys with the same name and serial numbers are considered to be equal then you can remove the line in the equals() clause that compares the hashcodes.
Or you could leave it in, it does not really matter assuming that all implementors of the FooKey interface have a correct implementation of equals and gethashcode but I would recommend removing it since otherwise a reader of the code could get the impression that it is there because it makes a difference when in reality it does not.
You can also get rid of the '*37' in the gethashcode method, it is unlikely it would contribute to better hashcode distribution.
In terms of your question 3, I would say no, don't do that, unless the equality contract for FooKey is not controlled by you (in which case trying to enforce an equality contract for the interface is questionable anyway)
Eclipse source menu has a "generate hashCode / equals method" which generates functions like the one below.
String name;
#Override
public int hashCode()
{
final int prime = 31;
int result = 1;
result = prime * result + ((name == null) ? 0 : name.hashCode());
return result;
}
#Override
public boolean equals(Object obj)
{
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
CompanyRole other = (CompanyRole) obj;
if (name == null)
{
if (other.name != null)
return false;
} else if (!name.equals(other.name))
return false;
return true;
}
If I select multiple fields when generating hashCode() and equals() Eclipse uses the same pattern shown above.
I am not an expert on hash functions and I would like to know how "good" the generated hash function is? What are situations where it will break down and cause too many collisions?
You can see the implementation of hashCode function in java.util.ArrayList as
public int hashCode() {
int hashCode = 1;
Iterator<E> i = iterator();
while (i.hasNext()) {
E obj = i.next();
hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
}
return hashCode;
}
It is one such example and your Eclipse generated code follows a similar way of implementing it. But if you feel that you have to implement your hashCode by your own, there are some good guidelines given by Joshua Bloch in his famous book Effective Java. I will post those important points from Item 9 of that book. Those are,
Store some constant nonzero value, say, 17, in an int variable called result.
For each significant field f in your object (each field taken into account by the equals method, that is), do the following:
a. Compute an int hash code c for the field:
i. If the field is a boolean, compute (f ? 1 : 0).
ii. If the field is a byte, char, short, or int, compute (int) f.
iii. If the field is a long, compute (int) (f ^ (f >>> 32)).
iv. If the field is a float, compute Float.floatToIntBits(f).
v. If the field is a double, compute Double.doubleToLongBits(f), and then hash the resulting long as in step 2.a.iii.
vi. If the field is an object reference and this class’s equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If a more complex comparison is required, compute a “canonical representation” for this field and invoke hashCode on the canonical representation. If the value of the field is null, return 0 (or some other constant, but 0 is traditional)
vii. If the field is an array, treat it as if each element were a separate field.
That is, compute a hash code for each significant element by applying
these rules recursively, and combine these values per step 2.b. If every
element in an array field is significant, you can use one of the
Arrays.hashCode methods added in release 1.5.
b. Combine the hash code c computed in step 2.a into result as follows:
result = 31 * result + c;
Return result.
When you are finished writing the hashCode method, ask yourself whether
equal instances have equal hash codes. Write unit tests to verify your intuition!
If equal instances have unequal hash codes, figure out why and fix the problem.
Java language designers and Eclipse seem to follow similar guidelines I suppose. Happy coding. Cheers.
Since Java 7 you can use java.util.Objects to write short and elegant methods:
class Foo {
private String name;
private String id;
#Override
public int hashCode() {
return Objects.hash(name,id);
}
#Override
public boolean equals(Object obj) {
if (obj instanceof Foo) {
Foo right = (Foo) obj;
return Objects.equals(name,right.name) && Objects.equals(id,right.id);
}
return false;
}
}
Generally it is good, but:
Guava does it somehow better, I prefer it. [EDIT: It seems that as of JDK7 Java provides a similar hash function].
Some frameworks can cause problems when accessing fields directly instead of using setters/getters, like Hibernate for example. For some fields that Hibernate creates lazy, it creates a proxy not the real object. Only calling the getter will make Hibernate go for the real value in the database.
Yes, it is perfect :) You will see this approach almost everywhere in the Java source code.
It's a standard way of writing hash functions. However, you can improve/simplify it if you have some knowledge about the fields. E.g. you can ommit the null check, if your class guarantees that the field never be null (applies to equals() as well). Or you can of delegate the field's hash code if only one field is used.
I would also like to add a reference to Item 9, in Effective Java 2nd Edition by Joshua Bloch.
Here is a recipe from Item 9 : ALWAYS OVERRIDE HASHCODE WHEN YOU OVERRIDE EQUALS
Store some constant nonzero value, say, 17, in an int variable called result.
For each significant field f in your object (each field taken into account by the equals method, that is), do the following:
a. Compute an int hash code c for the field:
i. If the field is a boolean, compute (f ? 1 : 0).
ii. If the field is a byte, char, short, or int, compute (int) f.
iii. If the field is a long,compute(int)(f^(f>>>32)).
iv. If the field is a float, compute Float.floatToIntBits(f).
v. If the field is a double, compute Double.doubleToLongBits(f), and then hash the resulting long as in step 2.a.iii.
vi. If the field is an object reference and this class’s equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If a more complex comparison is required, compute a “canonical representation” for this field and invoke hashCode on the canonical representation. If the value of the field is null, return 0 (or some other constant, but 0 is traditional).
vii. If the field is an array, treat it as if each element were a separate field. That is, compute a hash code for each significant element by applying these rules recursively, and combine these values per step 2.b. If every element in an array field is significant, you can use one of the Arrays.hashCode methods added in release 1.5.
b. Combine the hash code c computed in step 2.a into result as follows: result = 31 * result + c;
3. Return result.
4. When you are finished writing the hashCode method, ask yourself whether equal instances have equal hash codes. Write unit tests to verify your intuition! If equal instances have unequal hash codes, figure out why and fix the problem.
If you are using Apache Software Foundation (commons-lang library) then
below classes will help you to generate hashcode/equals/toString methods using reflection.
You don't need to worry about regenerating hashcode/equals/toString methods when you add/remove instance variables.
EqualsBuilder - This class provides methods to build a good equals method for any class. It follows rules laid out in Effective Java , by Joshua Bloch. In particular the rule for comparing doubles, floats, and arrays can be tricky. Also, making sure that equals() and hashCode() are consistent can be difficult.
HashCodeBuilder - This class enables a good hashCode method to be built for any class. It follows the rules laid out in the book Effective Java by Joshua Bloch. Writing a good hashCode method is actually quite difficult. This class aims to simplify the process.
ReflectionToStringBuilder - This class uses reflection to determine the fields to append. Because these fields are usually private, the class uses AccessibleObject.setAccessible(java.lang.reflect.AccessibleObject[], boolean) to change the visibility of the fields. This will fail under a security manager, unless the appropriate permissions are set up correctly.
Maven Dependency:
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>${commons.lang.version}</version>
</dependency>
Sample Code:
import org.apache.commons.lang.builder.EqualsBuilder;
import org.apache.commons.lang.builder.HashCodeBuilder;
import org.apache.commons.lang.builder.ReflectionToStringBuilder;
public class Test{
instance variables...
....
getter/setter methods...
....
#Override
public String toString() {
return ReflectionToStringBuilder.toString(this);
}
#Override
public int hashCode() {
return HashCodeBuilder.reflectionHashCode(this);
}
#Override
public boolean equals(Object obj) {
return EqualsBuilder.reflectionEquals(this, obj);
}
}
One potential drawback is that all objects with null fields will have a hash code of 31, thus there could be many potential collisions between objects that only contain null fields. This would make for slower lookups in Maps.
This can occur when you have a Map whose key type has multiple subclasses. For example, if you had a HashMap<Object, Object>, you could have many key values whose hash code was 31. Admittedly, this won't occur that often. If you like, you could randomly change the values of the prime to something besides 31, and lessen the probability of collisions.
The contract of equals with regards to null, is as follows:
For any non-null reference value x, x.equals(null) should return false.
This is rather peculiar, because if o1 != null and o2 == null, then we have:
o1.equals(o2) // returns false
o2.equals(o1) // throws NullPointerException
The fact that o2.equals(o1) throws NullPointerException is a good thing, because it alerts us of programmer error. And yet, that error would not be catched if for various reasons we just switched it around to o1.equals(o2), which would just "silently fail" instead.
So the questions are:
Why is it a good idea that o1.equals(o2) should return false instead of throwing NullPointerException?
Would it be a bad idea if wherever possible we rewrite the contract so that anyObject.equals(null) always throw NullPointerException instead?
On comparison with Comparable
In contrast, this is what the Comparable contract says:
Note that null is not an instance of any class, and e.compareTo(null) should throw a NullPointerException even though e.equals(null) returns false.
If NullPointerException is appropriate for compareTo, why isn't it for equals?
Related questions
Comparable and Comparator contract with regards to null
A purely semantical argument
These are the actual words in the Object.equals(Object obj) documentation:
Indicates whether some other object is "equal to" this one.
And what is an object?
JLS 4.3.1 Objects
An object is a class instance or an array.
The reference values (often just references) are pointers to these objects, and a special null reference, which refers to no object.
My argument from this angle is really simple.
equals tests whether some other object is "equal to" this
null reference gives no other object for the test
Therefore, equals(null) should throw NullPointerException
To the question of whether this asymmetry is inconsistent, I think not, and I refer you to this ancient Zen kōan:
Ask any man if he's as good as the next man and each will say yes.
Ask any man if he's as good as nobody and each will say no.
Ask nobody if it's as good as any man and you'll never get a reply.
At that moment, the compiler reached enlightenment.
An exception really should be an exceptional situation. A null pointer might not be a programmer error.
You quoted the existing contract. If you decide to go against convention, after all this time, when every Java developer expects equals to return false, you'll be doing something unexpected and unwelcome that will make your class a pariah.
I could't disagree more. I would not rewrite equals to throw an exception all the time. I'd replace any class that did that if I were its client.
Think of how .equals is related to == and .compareTo is related to the comparison operators >, <, >=, <=.
If you're going to argue that using .equals to compare an object to null should throw a NPE, then you'd have to say that this code should throw one as well:
Object o1 = new Object();
Object o2 = null;
boolean b = (o1 == o2); // should throw NPE here!
The difference between o1.equals(o2) and o2.equals(o1) is that in the first case you're comparing something to null, similar to o1 == o2, while in the second case, the equals method is never actually executed so there's no comparison happening at all.
Regarding the .compareTo contract, comparing a non-null object with a null object is like trying do this:
int j = 0;
if(j > null) {
...
}
Obviously this won't compile. You can use auto-unboxing to make it compile, but you get a NPE when you do the comparison, which is consistent with the .compareTo contract:
Integer i = null;
int j = 0;
if(j > i) { // NPE
...
}
Not that this is neccessarily an answer to your question, it is just an example of when I find it useful that the behaviour is how it is now.
private static final String CONSTANT_STRING = "Some value";
String text = getText(); // Whatever getText() might be, possibly returning null.
As it stands I can do.
if (CONSTANT_STRING.equals(text)) {
// do something.
}
And I have no chance of getting a NullPointerException. If it were changed as you suggested, I would be back to having to do:
if (text != null && text.equals(CONSTANT_STRING)) {
// do something.
}
Is this a good enough reason for the behaviour to be as it is?? I don't know, but it is a useful side-effect.
If you take object oriented concepts into account, and consider the whole sender and receiver roles, I'd say that behaviour is convenient. See in the first case you're asking an object if he is equal to nobody. He SHOULD say "NO, I'm not".
In the second case though, you don't have a reference to anyone So you aren't really asking anyone. THIS should throw an exception, the first case shouldn't.
I think it's only asymmetric if you kind of forget about object orientation and treat the expression as a mathematical equality. However, in this paradigm both ends play different roles, so it is to be expected that order matters.
As one final point. A null pointer exception should be raised when there's an error in your code. However, Asking an object if he is nobody, shouldn't be considered a programming flaw. I think it's perfectly ok to ask an object if he isn't null. What if you don't control the source that provides you with the object? and this source sends you null. Would you check if the object is null and only afterwards see if they are equals? Wouldn't it be more intuitive to just compare the two and whatever the second object is the comparison will be carried out without exceptions?
In all honesty, I would be pissed if an equals method within its body returns a null pointer exception on purpose. Equals is meant to be used against any sort of object, so it shouldn't be so picky on what it receives. If an equals method returned npe, the last thing on my mind would be that it did that on purpose. Specially considering it's an unchecked exception. IF you did raise an npe a guy would have to remember to always check for null before calling your method, or even worse, surround the call to equals in a try/catch block (God I hate try/catch blocks) But oh well...
Personally, I'd rather it perform as it does.
The NullPointerException identifies that the problem is in the object against which the equals operation is being performed.
If the NullPointerException was used as you suggest and you tried the (sort of pointless) operation of...
o1.equals(o1) where o1= null...
Is the NullPointerException thrown because your comparison function is screwed or because o1 is null but you didn't realise?
An extreme example, I know, but with current behaviour I feel you can tell easily where the problem lies.
In the first case o1.equals(o2) returns false because o1 is not equal to o2, which is perfectly fine. In the second case, it throws NullPointerException because o2 is null. One cannot call any method on a null. It may be a limitation of programming languages in general, but we have to live with it.
It is also not a good idea to throw NullPointerException you are violating the contract for the equals method and making things more complex than it has to be.
There are many common situations where null is not in any way exceptional, e.g. it may simply represent the (non-exceptional) case where a key has no value, or otherwise stand for “nothing”. Hence, doing x.equals(y) with an unknown y is also quite common, and having to always check for null first would be just wasted effort.
As for why null.equals(y) is different, it is a programming error to call any instance method on a null reference in Java, and therefore worthy of an exception. The ordering of x and y in x.equals(y) should be chosen such that x is known to not be null. I would argue that in almost all cases this reordering can be done based on what is known about the objects beforehand (e.g., from their origin, or by checking against null for other method calls).
Meanwhile if both objects are of unknown “nullness”, then other code almost certainly requires checking at least one of them, or not much can be done with the object without risking the NullPointerException.
And since this is the way it is specified, it is a programming error to break the contract and raise an exception for a null argument to equals. And if you consider the alternative of requiring an exception to be thrown, then every implementation of equals would have to make a special case of it, and every call to equals with any potentially null object would have to check before calling.
It could have been specified differently (i.e., the precondition of equals would require the argument to be non-null), so this is not to say that your argumentation is invalid, but the current specification makes for a simpler and more practical programming language.
Note that the contract is "for any non-null reference x". So the implementation will look like:
if (x != null) {
if (x.equals(null)) {
return false;
}
}
x need not be null to be deemed equal to null because the following definition of equals is possible:
public boolean equals(Object obj) {
// ...
// If someMember is 0 this object is considered as equal to null.
if (this.someMember == 0 and obj == null) {
return true;
}
return false;
}
I think it's about convenience and more importantly consistency - allowing nulls to be part of the comparison avoids having to do a null check and implement the semantics of that each time equals is called. null references are legal in many collection types, so it makes sense they can appear as the right side of the comparison.
Using instance methods for equality, comparison etc., necessarily makes the arrangement asymmetric - a little hassle for the huge gain of polymorphism. When I don't need polymorphism, I sometimes create a symmetric static method with two arguments, MyObject.equals(MyObjecta, MyObject b). This method then checks whether one or both arguments are null references. If I specifically want to exclude null references, then I create an additional method e.g. equalsStrict() or similar, that does a null check before delegating to the other method.
You should return false if the parameter is null.
To show that this is the standard, see 'Objects.equals(Object, Object) from java.util, that performs an assymetric null check on the first parameter only (on which equals(Object) will be called). From the OpenJDK SE 11 source code (SE 1.7 contains exactly the same):
public static boolean equals(Object a, Object b) {
return (a == b) || (a != null && a.equals(b));
}
This also handles two null values as equal.
This is a tricky question. For backward compatability you can't do so.
Imagine the following scenario
void m (Object o) {
if (one.equals (o)) {}
else if (two.equals (o)) {}
else {}
}
Now with equals returning false else clause will get executed, but not when throwing an exception.
Also null is not really equal to say "2" so it makes perfect sense to return false. Then it is probably better to insist null.equals("b") to return also false :))
But this requirement does make a strange and non symmetric equals relation.