Overriding broken equals


I have some third-party code that has a lot of classes with broken equals() and hashCode() implementations. I cannot change the third-party code, but I badly need a working equals method. To overcome this I came up with the following approaches:
1) Create an EqualsUtility which has a bunch of overloaded static equals() methods.
Problem: the class will become very large as the third-party code grows.
2) Create adapter classes for all the third party classes and write an equals method.
Problem: Too many new classes are created.
Is there a third, cleaner way to do this?

You could check object equality with a third-party library, for example EqualsBuilder from Apache Commons Lang. That may not be a very good solution, though, since it uses reflection for the comparison. Furthermore, it does not by itself help with the hashCode() implementation.
IMO, extending the base classes and overriding the equals() and hashCode() methods is preferable. Any other solution (aspects, third-party libraries for deep object comparison, proxy objects) costs performance and, in some cases, makes your code harder to understand.
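As a minimal sketch of the subclassing approach: VendorPoint below is a hypothetical stand-in for a third-party class, and FixedPoint is an invented subclass that adds value-based equality. It assumes the vendor class is not final, exposes its state through getters, and that you control where instances are created.

```java
import java.util.Objects;

// Hypothetical stand-in for a third-party class that cannot be modified.
class VendorPoint {
    private final int x;
    private final int y;

    VendorPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }
    // equals() and hashCode() are inherited from Object: identity only.
}

// Our own subclass supplies value-based equality on top of the vendor class.
class FixedPoint extends VendorPoint {
    FixedPoint(int x, int y) {
        super(x, y);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        // Comparing classes keeps the relation symmetric: a plain VendorPoint
        // never considers itself equal to a FixedPoint.
        if (other == null || getClass() != other.getClass()) return false;
        FixedPoint that = (FixedPoint) other;
        return getX() == that.getX() && getY() == that.getY();
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals().
        return Objects.hash(getX(), getY());
    }
}

class FixedPointDemo {
    public static void main(String[] args) {
        System.out.println(new VendorPoint(1, 2).equals(new VendorPoint(1, 2))); // false
        System.out.println(new FixedPoint(1, 2).equals(new FixedPoint(1, 2)));   // true
    }
}
```

This does not help for objects that the library itself instantiates and hands back to you; those would still need wrapping or an external comparison.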

Check out AspectJ. It's an aspect-oriented programming extension for Java that lets you do exactly what you want. You describe an entity called a pointcut, which in this case corresponds roughly to an invocation of the equals() method on the library's objects.
You then write code (called advice) that gets executed when that pointcut is hit. There are different types of advice: you can have your code execute before the equals() method, after the equals() method, or around the equals() method. With around() advice you can choose to handle the call yourself, or do some work and then call the original method.
You could write around() advice that replaces the equals method with one that is correct for your situation.
Very powerful stuff.

Related

Effective Java: should I override equals() and hashCode() if the objects I'm creating are never compared with each other?

If the objects I create are not used for comparisons such as list.contains(new Employee("MM")), and those objects will only be stored in Lists returned from a database, such as List<Employee> employeeList = employeeService.getEmployeeList();, do I need to override equals() and hashCode() in the Employee class?

No, you do not need to override .equals() and .hashCode() if you don't need a custom definition of equality. As long as you intend to treat every instance of your class as unequal to every other instance, the defaults will work fine. You can store such objects in Lists and even in hash-based collections such as HashMaps and HashSets; both classes have no problems with the default Object notion of equivalence.
Furthermore for many classes you shouldn't override these methods. Many common design patterns will include classes that aren't intended to ever be equivalent, such as factories, singletons, and state machines. Defining a custom notion of equality for such classes can introduce strange bugs, or at a minimum simply be unnecessary boilerplate.
On the other hand, value types, or classes intended specifically to be a structured representation of some sort of data, should almost always override .equals() and .hashCode() (and possibly implement Comparable as well), because that is what users of this sort of class are likely to expect. The AutoValue project makes creating such value types really painless; if that's the type of class you're constructing, I'd strongly encourage you to use it.
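For illustration, a hand-written value type might look like the sketch below. The EmployeeId class and its fields are invented for this example; it uses only java.util.Objects rather than any code generator.

```java
import java.util.Objects;

// Hypothetical value type: two instances with the same fields are equal.
final class EmployeeId implements Comparable<EmployeeId> {
    private final String department;
    private final int number;

    EmployeeId(String department, int number) {
        this.department = Objects.requireNonNull(department);
        this.number = number;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EmployeeId)) return false;
        EmployeeId other = (EmployeeId) o;
        return number == other.number && department.equals(other.department);
    }

    @Override
    public int hashCode() {
        // Built from the same fields that equals() compares.
        return Objects.hash(department, number);
    }

    @Override
    public int compareTo(EmployeeId other) {
        // Ordering is consistent with equals(): 0 only when both fields match.
        int byDepartment = department.compareTo(other.department);
        return byDepartment != 0 ? byDepartment : Integer.compare(number, other.number);
    }
}

class EmployeeIdDemo {
    public static void main(String[] args) {
        EmployeeId a = new EmployeeId("ENG", 7);
        EmployeeId b = new EmployeeId("ENG", 7);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.compareTo(b));               // 0
    }
}
```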

If you know you will never be using the object as a key in a HashMap or will never be putting it in any sort of Set, or never doing anything with it where you will do any object comparison other than "are these references referring to literally same instance or not", then you do not have to override equals() and hashCode().
And if that's not the case and you do have to override them, then do consider letting your IDE generate the overrides rather than doing it manually -- especially for hashCode(). And be aware that when having the IDE generate these, you can tell the IDE which fields to include and which fields not to include, which even further reduces any need to write the overrides manually.

As QuantumMechanic said above, you do not need to override equals() and hashCode().
However, if the Employee class is going to be shared with other people, it's a good idea to add equals() and hashCode() so that it is easier for them to use.
Also, Eclipse can generate these methods: right-click -> Source -> Generate hashCode() and equals().
Good luck!

Why override the equals method:
If your object needs to be stored in a Collection such as a List, you should override the equals method, because API methods such as indexOf and lastIndexOf use equals internally. If you don't override equals, you might not get those objects back from the collection, since identity checking is usually not the right way to look an object up in a collection.
Why override the hashCode method:
If your object needs to be stored in a Set or used as a key in a Map, you must override both hashCode and equals, because both methods are used to get the object back from those collections.
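The difference can be seen in a small sketch. PlainTag and ValueTag are hypothetical classes invented for this example; the first inherits the defaults from Object, the second overrides both methods.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// No overrides: inherits identity-based equals() and hashCode() from Object.
class PlainTag {
    final String label;

    PlainTag(String label) {
        this.label = label;
    }
}

// Overrides both methods, using the same field in each. Assumes label is non-null.
class ValueTag {
    final String label;

    ValueTag(String label) {
        this.label = Objects.requireNonNull(label);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ValueTag && label.equals(((ValueTag) o).label);
    }

    @Override
    public int hashCode() {
        return label.hashCode();
    }
}

class LookupDemo {
    public static void main(String[] args) {
        List<PlainTag> plain = new ArrayList<>();
        plain.add(new PlainTag("a"));
        System.out.println(plain.indexOf(new PlainTag("a"))); // -1

        List<ValueTag> values = new ArrayList<>();
        values.add(new ValueTag("a"));
        System.out.println(values.indexOf(new ValueTag("a"))); // 0

        Set<ValueTag> set = new HashSet<>();
        set.add(new ValueTag("a"));
        System.out.println(set.contains(new ValueTag("a"))); // true
    }
}
```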

Extend JUnit assertEquals

I want to create a new assertEquals which takes in as input a custom pojo for expected and actual.
How would I go about extending assertEquals of the JUnit library?
What I could do is implement a compare method which returns a boolean and have this as the input to the assertEquals or even assertTrue but creating my own assertEquals seems more elegant.
Would it simply be the case of returning true if equal or raising a AssertionError?

As @Makoto commented, you could use a custom Hamcrest Matcher.
The disadvantage of the other common answer here (just change the definition of Object#equals for your class), is that you would have one and only one way of comparing your objects, and it would have to match exactly what is needed by the test rather than what would be needed by users of the class. The two needs may or may not be identical. Often in testing, I only need to assert one or two values, sometimes several, but usually not what gets tested by the "natural" #equals method of my class. Furthermore, I work with a lot of classes that don't even have an explicit override of #equals. In these cases you would have to define one that works simply for the case of your test, whereas it semantically might not represent the domain very well.

There is an overload of assertEquals that takes objects. It (eventually) calls the equals method on both objects. So, the "compare" method you need to write is an override of Object's equals method.
Once you have that written, then you can call assertEquals("Failure message", yourObject1, yourObject2). There is no need to extend JUnit for this case.
As an aside, if you override equals, then you should override hashCode also, in a consistent way.

Asserts are static methods, defined in Assert class. TestCase derives from that class, hence all asserts are available there too.
java.lang.Object
  |
  +--junit.framework.Assert
        |
        +--junit.framework.TestCase
Create another derived class, such as class MyTestCase extends TestCase, and define your customized asserts there the same way they are defined in Assert.
Use MyTestCase instead of TestCase
Profit!
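A customized assert of this kind can be sketched without any JUnit dependency: it returns normally when the objects match and throws AssertionError otherwise, which is what JUnit's own asserts do. Person and PersonAssert are hypothetical names for this example; in practice the static method could live in a class like MyTestCase.

```java
import java.util.Objects;

// Hypothetical POJO used only for illustration.
class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

final class PersonAssert {
    private PersonAssert() {}

    // Returns normally when both are null or their fields match;
    // otherwise throws AssertionError with a descriptive message.
    static void assertPersonEquals(Person expected, Person actual) {
        if (expected == actual) return;
        boolean matches = expected != null && actual != null
                && Objects.equals(expected.name, actual.name)
                && expected.age == actual.age;
        if (!matches) {
            throw new AssertionError(
                    "expected " + describe(expected) + " but was " + describe(actual));
        }
    }

    private static String describe(Person p) {
        return p == null ? "null" : "Person(" + p.name + ", " + p.age + ")";
    }
}

class PersonAssertDemo {
    public static void main(String[] args) {
        // Passes silently.
        PersonAssert.assertPersonEquals(new Person("Ann", 30), new Person("Ann", 30));
        try {
            PersonAssert.assertPersonEquals(new Person("Ann", 30), new Person("Bob", 30));
        } catch (AssertionError e) {
            System.out.println(e.getMessage()); // expected Person(Ann, 30) but was Person(Bob, 30)
        }
    }
}
```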
Another approach is to define proper equals() method and use standard assertEquals(). You can use EqualsBuilder to get a reasonable implementation automagically using reflection.
http://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/builder/EqualsBuilder.html
If you don't like Apache Commons, you can find a similar tool in Guava.

Java equals and hashCode methods - technical constraints

I've got a question about Java's equals(Object o) and hashCode() methods. What are the technical constraints on implementing these two methods? Is there anything I can't do while implementing these methods?

None. They are just two methods in the Object class. You could even change an object's state within these methods; that would freak out every developer and system, but it's still valid from a technical point of view.

You can technically do anything inside them that you can do in any other method.
Instead, what you should concern yourself with are the practical and contractual obligations of the methods.
Good rules of thumb:
If you override one, override the other.
Variables used in one should be used in the other.

A given object must consistently report the same hash value.
Two objects which equals() says are equal must report the same hash value, so no timestamps in the hash code :).
Two unequal objects can also have the same hash code, though it is better to make such collisions rare.
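These three points can be checked directly. The sketch below uses a hypothetical Version class for the first two and relies on the well-known collision between the strings "Aa" and "BB" for the third.

```java
import java.util.Objects;

// Hypothetical class whose equals() and hashCode() use the same two fields.
final class Version {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Version)) return false;
        Version other = (Version) o;
        return major == other.major && minor == other.minor;
    }

    @Override
    public int hashCode() {
        return Objects.hash(major, minor);
    }
}

class HashContractDemo {
    public static void main(String[] args) {
        Version a = new Version(1, 2);
        Version b = new Version(1, 2);

        // 1. The same object reports the same hash on repeated calls.
        System.out.println(a.hashCode() == a.hashCode()); // true

        // 2. Equal objects report equal hashes.
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true

        // 3. Unequal objects may still collide: both strings hash to 2112.
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
    }
}
```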

All you need to remember is:
those constraints are very important
all are very well documented in the Javadoc for Object.hashCode() and Object.equals()
Make sure you understand them every time you override either of those methods.

Is there anything I can't do while implementing these methods?
Well, as a rule of thumb (and as @RHT already mentioned above and @Antti Sykäri explains here), do your future self a favor and always use EqualsBuilder and HashCodeBuilder from the Apache Commons Lang library. Because, quite frankly, at least in my case, I can never remember all the nitty-gritty details that a correct implementation requires. ;-)

Checking for deep equality in JUnit tests

I am writing unit tests for objects that are cloned, serialized, and/or written to an XML file. In all three cases I would like to verify that the resulting object is the "same" as the original one. I have gone through several iterations in my approach and having found fault with all of them, was wondering what other people did.
My first idea was to manually implement the equals method in all the classes, and use assertEquals. I abandoned this approach after deciding that overriding equals to perform a deep compare on mutable objects is a bad thing, as you almost always want collections to use reference equality for mutable objects they contain[1].
Then I figured I could just rename the method to contentEquals or something. However, after thinking more, I realized this wouldn't help me find the sort of regressions I was looking for. If a programmer adds a new (mutable) field, and forgets to add it to the clone method, then he will probably forget to add it to the contentEquals method too, and all these regression tests I'm writing will be worthless.
I then wrote a nifty assertContentEquals function that uses reflection to check the value of all the (non-transient) members of an object, recursively if necessary. This avoids the problems with the manual compare method above, since it assumes by default that all fields must be preserved and the programmer must explicitly declare fields to skip. However, there are legitimate cases when a field really shouldn't be the same after cloning[2]. I put in an extra parameter to assertContentEquals that lists which fields to ignore, but since this list is declared in the unit test, it gets real ugly real fast in the case of recursive checking.
So I am now thinking of moving back to including a contentEquals method in each class being tested, but this time implemented using a helper function similar to the assertContentEquals described above. This way, when operating recursively, the exemptions will be defined in each individual class.
Any comments? How have you approached this issue in the past?
Edited to expound on my thoughts:
[1] I got the rationale for not overriding equals on mutable classes from this article. Once you stick a mutable object in a Set/Map, if a field changes then its hash will change but its bucket will not, breaking things. So the options are to not override equals/hashCode on mutable objects, or to have a policy of never changing a mutable object once it has been put into a collection.
I didn't mention that I am implementing these regression tests on an existing codebase. In this context, the idea of changing the definition of equals, and then having to find all instances where it could change the behavior of the software, frightens me. I feel like I could easily break more than I fix.
[2] One example in our code base is a graph structure, where each node needs a unique identifier that is used to link the nodes when they are eventually written to XML. When we clone these objects we want the identifier to be different, but everything else to remain the same. After ruminating about it more, it seems like the questions "is this object already in this collection" and "are these objects defined the same" use fundamentally different concepts of equality in this context. The first is asking about identity, and I would want the ID included if doing a deep compare, while the second is asking about similarity, and I don't want the ID included. This is making me lean more against implementing the equals method.
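The failure mode described in [1] can be reproduced in a few lines. MutableName is a hypothetical class invented for this sketch; its hash code depends on a field that is changed after the object has been added to a HashSet.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mutable class whose equality depends on a mutable field.
class MutableName {
    String value;

    MutableName(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableName && Objects.equals(value, ((MutableName) o).value);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(value);
    }
}

class MutableKeyDemo {
    public static void main(String[] args) {
        Set<MutableName> names = new HashSet<>();
        MutableName name = new MutableName("before");
        names.add(name);
        System.out.println(names.contains(name)); // true

        // Mutating the field changes the hash code, but the set still
        // files the entry under the hash computed at insertion time.
        name.value = "after";
        System.out.println(names.contains(name)); // false
        System.out.println(names.size());         // 1
    }
}
```

The object is still in the set (its size is 1), but it can no longer be found by lookup until the field is restored or the entry is removed through an iterator.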
Do you guys agree with this decision, or do you think that implementing equals is the better way to go?

I would go with the reflection approach and define a custom Annotation with RetentionPolicy.RUNTIME to allow the implementers of the tested classes to mark the fields that are expected to change after cloning. You can then check the annotation with reflection and skip the marked fields.
This way you can keep your test code generic and simple and have a convenient means to mark exceptions directly in the code without affecting the design or runtime behavior of the code that needs to be tested.
The annotation could look like this:
import java.lang.annotation.*;

// Marks fields that are expected to differ between an object and its clone.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD})
public @interface ChangesOnClone
{
}
This is how it can be used in the code that is to be tested:
class ABC
{
    private String name;

    @ChangesOnClone
    private Cache cache;
}
And finally the relevant part of the test code:
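for ( Field field : fields )
{
    // getAnnotation() returns the annotation or null, not a boolean,
    // so use isAnnotationPresent() to skip the marked fields
    if( field.isAnnotationPresent( ChangesOnClone.class ) )
        continue;
    // else test it
}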
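Put together, the idea might look like the following self-contained sketch. GraphNode and ContentAssert are hypothetical names, and the comparison is deliberately shallow: it only looks at the fields declared directly on the class and compares them with Objects.equals, so a real helper would still need recursion and superclass handling.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Objects;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ChangesOnClone {}

// Hypothetical class under test: the id is expected to differ after cloning.
class GraphNode {
    private final String name;

    @ChangesOnClone
    private final long id;

    GraphNode(String name, long id) {
        this.name = name;
        this.id = id;
    }
}

final class ContentAssert {
    private ContentAssert() {}

    // Shallow comparison of declared, non-static, non-transient fields,
    // skipping any field marked with @ChangesOnClone.
    static void assertContentEquals(Object expected, Object actual) {
        if (expected.getClass() != actual.getClass()) {
            throw new AssertionError("different classes");
        }
        for (Field field : expected.getClass().getDeclaredFields()) {
            int mods = field.getModifiers();
            if (field.isSynthetic() || Modifier.isStatic(mods) || Modifier.isTransient(mods)
                    || field.isAnnotationPresent(ChangesOnClone.class)) {
                continue;
            }
            try {
                field.setAccessible(true);
                Object e = field.get(expected);
                Object a = field.get(actual);
                if (!Objects.equals(e, a)) {
                    throw new AssertionError(
                            "field '" + field.getName() + "': expected " + e + " but was " + a);
                }
            } catch (IllegalAccessException ex) {
                throw new IllegalStateException(ex);
            }
        }
    }
}

class ContentAssertDemo {
    public static void main(String[] args) {
        // Passes: only the annotated id differs.
        ContentAssert.assertContentEquals(new GraphNode("a", 1L), new GraphNode("a", 2L));
    }
}
```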

AssertJ offers a recursive comparison function (here copy and original stand for the two objects being compared):
assertThat(copy).usingRecursiveComparison().isEqualTo(original);
See the AssertJ documentation for details: https://assertj.github.io/doc/#basic-usage
Prerequisites for using AssertJ:
import:
import static org.assertj.core.api.Assertions.*;
maven dependency:
<!-- test -->
<dependency>
<groupId>org.assertj</groupId>
<artifactId>assertj-core</artifactId>
<version>3.19.0</version>
<scope>test</scope>
</dependency>

Additional methods in the Object Class

The Object class has a number of methods such as equals, hashCode, notify, wait etc.
What methods do you think are missing from the Object class and why? Are there any additional methods you wish it had?

I don't think it should have any extra methods... in fact, I think various methods that are there shouldn't be there in the first place.
One problem in Java is that a lot of types (such as HashMap) always use the hash code and equality methods on the keys directly - it would be much better if Java had an interface like .NET's IEqualityComparer<T> that anything wanting to perform hashing/equality comparisons could delegate to.
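To illustrate what such an interface could look like in Java, here is a rough sketch. EqualityComparer, Keyed, and CaseInsensitive are hypothetical names invented for this example, not part of the JDK. Because HashSet and HashMap always call equals() and hashCode() on the key itself, the sketch needs a wrapper object that forwards those calls to the comparer.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical counterpart to .NET's IEqualityComparer<T>.
interface EqualityComparer<T> {
    boolean equal(T a, T b);
    int hash(T value);
}

// Wrapper that forwards equals()/hashCode() to a comparer, so that
// standard hash-based collections can use a custom notion of equality.
final class Keyed<T> {
    private final T value;
    private final EqualityComparer<T> comparer;

    Keyed(T value, EqualityComparer<T> comparer) {
        this.value = value;
        this.comparer = comparer;
    }

    T get() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Keyed)) return false;
        Keyed<?> other = (Keyed<?>) o;
        if (comparer != other.comparer) return false;
        // Assumption: sharing the same comparer instance implies the same element type.
        @SuppressWarnings("unchecked")
        T otherValue = (T) other.value;
        return comparer.equal(value, otherValue);
    }

    @Override
    public int hashCode() {
        return comparer.hash(value);
    }
}

// Example comparer: strings are equal when their lower-cased forms match.
final class CaseInsensitive implements EqualityComparer<String> {
    static final CaseInsensitive INSTANCE = new CaseInsensitive();

    @Override
    public boolean equal(String a, String b) {
        return a.toLowerCase(Locale.ROOT).equals(b.toLowerCase(Locale.ROOT));
    }

    @Override
    public int hash(String value) {
        return value.toLowerCase(Locale.ROOT).hashCode();
    }
}

class ComparerDemo {
    public static void main(String[] args) {
        Set<Keyed<String>> seen = new HashSet<>();
        seen.add(new Keyed<>("Hello", CaseInsensitive.INSTANCE));
        System.out.println(seen.contains(new Keyed<>("HELLO", CaseInsensitive.INSTANCE))); // true
    }
}
```

The same wrapper trick is one way to get usable equality for classes whose own equals() cannot be changed, at the cost of an extra allocation per key.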
