What are the alternatives to comparing the equality of two objects? - java

http://leepoint.net/notes-java/data/expressions/22compareobjects.html
It turns out that defining equals()
isn't trivial; in fact it's moderately
hard to get it right, especially in
the case of subclasses. The best
treatment of the issues is in
Horstmann's Core Java Vol 1.
If equals() must always be overridden, then what is a good approach for not being cornered into having to do object comparison? What are some good "design" alternatives?
EDIT:
I'm not sure this is coming across the way that I had intended. Maybe the question should be more along the lines of "Why would you want to compare two objects?" Based upon your answer to that question, is there an alternative solution to comparison? I don't mean, a different implementation of equals. I mean, not using equality at all. I think the key point is to start with that question, why would you want to compare two objects.

If equals() must always be overridden,
then what is a good approach for not
being cornered into having to do
object comparison?
You are mistaken. You should override equals as seldom as possible.
All this info comes from Effective Java, Second Edition (Josh Bloch). The first edition chapter on this is still available as a free download.
From Effective Java:
The easiest way to avoid problems is
not to override the equals method, in
which case each instance of the class
is equal only to itself.
The problem with arbitrarily overriding equals/hashCode is inheritance. Some equals implementations advocate testing it like this:
if (this.getClass() != other.getClass()) {
return false; //inequal
}
In fact, the Eclipse (3.4) Java editor does just this when you generate the method using the source tools. According to Bloch, this is a mistake as it violates the Liskov substitution principle.
From Effective Java:
There is no way to extend an
instantiable class and add a value
component while preserving the equals
contract.
Two ways to minimize equality problems are described in the Classes and Interfaces chapter:
Favour composition over inheritance
Design and document for inheritance or else prohibit it
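As a rough sketch of the first recommendation: instead of extending a value class to add a field, hold an instance of it. The names Point, ColorPoint and asPoint are made up for illustration and are not from the question.

```java
import java.util.Objects;

// Hypothetical value class; final, so no subclass can break its equals contract.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) {
            return false;
        }
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

// Adds a colour by holding a Point rather than extending it.
final class ColorPoint {
    private final Point point;
    private final String color;

    ColorPoint(int x, int y, String color) {
        this.point = new Point(x, y);
        this.color = Objects.requireNonNull(color);
    }

    // Callers who only care about position compare this view explicitly.
    Point asPoint() {
        return point;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ColorPoint)) {
            return false;
        }
        ColorPoint cp = (ColorPoint) o;
        return point.equals(cp.point) && color.equals(cp.color);
    }

    @Override
    public int hashCode() {
        return Objects.hash(point, color);
    }
}
```

Because ColorPoint is not a Point, a ColorPoint and a Point are never equal to each other in either direction, so symmetry cannot be broken by accident.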
As far as I can see, the only alternative is to test equality in a form external to the class, and how that would be performed would depend on the design of the type and the context you were trying to use it in.
For example, you might define an interface that documents how it was to be compared. In the code below, Service instances might be replaced at runtime with a newer version of the same class - in which case, having different ClassLoaders, equals comparisons would always return false, so overriding equals/hashCode would be redundant.
import java.util.HashMap;
import java.util.Map;
public class Services {
private static final Map<String, Service> SERVICES = new HashMap<String, Service>();
static interface Service {
/** Services with the same name are considered equivalent */
public String getName();
}
public static synchronized void installService(Service service) {
SERVICES.put(service.getName(), service);
}
public static synchronized Service lookup(String name) {
return SERVICES.get(name);
}
}
"Why would you want to compare two objects?"
The obvious example is to test if two Strings are the same (or two Files, or URIs). For example, what if you wanted to build up a set of files to parse. By definition, the set contains only unique elements. Java's Set type relies on the equals/hashCode methods to enforce uniqueness of its elements.

I don't think it's true that equals should always be overridden. The rule as I understand it is that overriding equals is only meaningful in cases where you're clear on how to define semantically equivalent objects. In that case, you override hashCode() as well so that you don't have objects that you've defined as equivalent returning different hashcodes.
If you can't define meaningful equivalence, I don't see the benefit.

How about just do it right?
Here's my equals template which is knowledge applied from Effective Java by Josh Bloch. Read the book for more details:
@Override
public boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }
    // only do this if you are a subclass and care about equals of parent
    if (!super.equals(obj)) {
        return false;
    }
    if (obj == null || getClass() != obj.getClass()) {
        return false;
    }
    final YourTypeHere other = (YourTypeHere) obj;
    if (!instanceMember1.equals(other.instanceMember1)) {
        return false;
    }
    // ... rest of instanceMembers in same pattern as above ...
    return true;
}
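An equals template like this needs a matching hashCode. A minimal sketch of the pair, assuming a made-up class Account with two fields (owner and number are illustrative names) and java.util.Objects from Java 7 or later:

```java
import java.util.Objects;

// Hypothetical class used only to show equals and hashCode side by side.
final class Account {
    private final String owner;
    private final long number;

    Account(String owner, long number) {
        this.owner = owner;
        this.number = number;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        Account other = (Account) obj;
        // Objects.equals is null-safe, unlike calling owner.equals directly.
        return number == other.number && Objects.equals(owner, other.owner);
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields that equals compares.
        return Objects.hash(owner, number);
    }
}
```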

Mmhh
In some scenarios you can make the object unmodifiable (read-only) and have it created from a single point (a factory method).
If two objects with the same input data (creation parameters) are needed, the factory returns the same instance reference, and then using "==" is enough.
This approach is useful only under certain circumstances, and most of the time it would look like overkill.
Take a look at this answer to see how to implement such a thing.
Warning: it is a lot of code.
For a short version, see how the wrapper classes work since Java 1.5:
Integer a = Integer.valueOf( 2 );
Integer b = Integer.valueOf( 2 );
a == b
is true while
new Integer( 2 ) == new Integer( 2 )
is false.
Integer.valueOf internally keeps a cache of instances (by default for values from -128 to 127) and returns the cached reference when the input value is in that range.
As you know, Integer is read-only.
Something similar happens with the String class, which is what that question was about.
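A minimal sketch of such a factory, assuming a made-up immutable type CurrencyCode (the class and its of method are illustrative, not from any library):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable type whose instances are only created through of().
final class CurrencyCode {
    private static final Map<String, CurrencyCode> CACHE = new ConcurrentHashMap<>();

    private final String code;

    private CurrencyCode(String code) {
        this.code = code;
    }

    // Returns the single shared instance for a given code, creating it on first use.
    static CurrencyCode of(String code) {
        return CACHE.computeIfAbsent(code, CurrencyCode::new);
    }

    String code() {
        return code;
    }
}
```

Since the constructor is private, CurrencyCode.of("EUR") == CurrencyCode.of("EUR") holds and the inherited identity-based equals is already correct. The cache in this sketch never shrinks, which is one reason the approach only suits types with a small set of distinct values.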

Maybe I'm missing the point but the only reason to use equals as opposed to defining your own method with a different name is because many of the Collections (and probably other stuff in the JDK or whatever it's called these days) expect the equals method to define a coherent result. But beyond that, I can think of three kinds of comparisons that you want to do in equals:
The two objects really ARE the same instance. It makes no sense to use equals for this because you can use ==. Also, the default equals method in Object does exactly this: it compares references with == and does not involve hash codes.
The two objects have references to the same instances, but are not the same instance. This is useful, uh, sometimes... particularly if they are persisted objects and refer to the same object in the DB. You would have to define your equals method to do this.
The two objects have references to objects that are equal in value, though they may or may not be the same instances (in other words, you compare values all the way through the hierarchy).
Why would you want to compare two objects? Well, if they're equal, you would want to do one thing, and if they're not, you would want to do something else.
That said, it depends on the case at hand.
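The three kinds of comparison can be told apart with plain arrays, as a small sketch (the variable names are made up for illustration):

```java
import java.util.Arrays;

final class ComparisonKinds {
    public static void main(String[] args) {
        String[] shared = {"a", "b"};
        Object[] first = {shared};
        Object[] second = {shared};                  // different array, same element reference
        Object[] third = {new String[] {"a", "b"}};  // different array, equal-valued element

        // 1. Same instance?
        System.out.println(first == second);                 // false
        // 2. References to the same instances?
        System.out.println(Arrays.equals(first, second));    // true
        System.out.println(Arrays.equals(first, third));     // false, String[] uses identity equals
        // 3. Equal values all the way down?
        System.out.println(Arrays.deepEquals(first, third)); // true
    }
}
```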

The main reason to override equals() in most cases is to check for duplicates within certain Collections. For example, if you want to use a Set to contain an object you have created you need to override equals() and hashCode() within your object. The same applies if you want to use your custom object as a key in a Map.
This is critical as I have seen many people make the mistake in practice of adding their custom objects to Sets or Maps without overriding equals() and hashCode(). The reason this can be especially insidious is the compiler will not complain and you can end up with multiple objects that contain the same data but have different references in a Collection that does not allow duplicates.
For example, if you had a simple bean called NameBean with a single String attribute 'name', you could construct two instances of NameBean (e.g. name1 and name2), each with the same 'name' attribute value (e.g. "Alice"). You could then add both name1 and name2 to a Set and the set would have size 2 rather than the intended size 1. Likewise, if you have a Map such as Map<NameBean, String> that maps the name bean to some other object, and you first mapped name1 to the string "first" and later mapped name2 to the string "second", you will have both key/value pairs in the map (e.g. name1->"first", name2->"second"). So a map lookup returns the value mapped to the exact reference you pass in: "first" for name1, "second" for name2, and null for any other reference with the name "Alice".
Here is a concrete example preceded by the output of running it:
Output:
Adding duplicates to a map (bad):
Result of map.get(bean1):first
Result of map.get(bean2):second
Result of map.get(new NameBean("Alice"): null
Adding duplicates to a map (good):
Result of map.get(bean1):second
Result of map.get(bean2):second
Result of map.get(new ImprovedNameBean("Alice"): second
Code:
// This bean cannot safely be used as a key in a Map
public class NameBean {
private String name;
public NameBean() {
}
public NameBean(String name) {
this.name = name;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@Override
public String toString() {
return name;
}
}
// This bean can safely be used as a key in a Map
public class ImprovedNameBean extends NameBean {
public ImprovedNameBean(String name) {
super(name);
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if(obj == null || getClass() != obj.getClass()) {
return false;
}
return this.getName().equals(((ImprovedNameBean)obj).getName());
}
@Override
public int hashCode() {
return getName().hashCode();
}
}
public class MapDuplicateTest {
public static void main(String[] args) {
MapDuplicateTest test = new MapDuplicateTest();
System.out.println("Adding duplicates to a map (bad):");
test.withDuplicates();
System.out.println("\nAdding duplicates to a map (good):");
test.withoutDuplicates();
}
public void withDuplicates() {
NameBean bean1 = new NameBean("Alice");
NameBean bean2 = new NameBean("Alice");
java.util.Map<NameBean, String> map
= new java.util.HashMap<NameBean, String>();
map.put(bean1, "first");
map.put(bean2, "second");
System.out.println("Result of map.get(bean1):"+map.get(bean1));
System.out.println("Result of map.get(bean2):"+map.get(bean2));
System.out.println("Result of map.get(new NameBean(\"Alice\"): "
+ map.get(new NameBean("Alice")));
}
public void withoutDuplicates() {
ImprovedNameBean bean1 = new ImprovedNameBean("Alice");
ImprovedNameBean bean2 = new ImprovedNameBean("Alice");
java.util.Map<ImprovedNameBean, String> map
= new java.util.HashMap<ImprovedNameBean, String>();
map.put(bean1, "first");
map.put(bean2, "second");
System.out.println("Result of map.get(bean1):"+map.get(bean1));
System.out.println("Result of map.get(bean2):"+map.get(bean2));
System.out.println("Result of map.get(new ImprovedNameBean(\"Alice\"): "
+ map.get(new ImprovedNameBean("Alice")));
}
}

Equality is fundamental to logic (see law of identity), and there's not much programming you can do without it. As for comparing instances of classes that you write, well that's up to you. If you need to be able to find them in collections or use them as keys in Maps, you'll need equality checks.
If you've written more than a few nontrivial libraries in Java, you'll know that equality is hard to get right, especially when the only tools in the chest are equals and hashCode. Equality ends up being tightly coupled with class hierarchies, which makes for brittle code. What's more, no type checking is provided since these methods just take parameters of type Object.
There's a way of making equality checking (and hashing) a lot less error-prone and more type-safe. In the Functional Java library, you'll find Equal<A> (and a corresponding Hash<A>) where equality is decoupled into a single class. It has methods for composing Equal instances for your classes from existing instances, as well as wrappers for Collections, Iterables, HashMap, and HashSet, that use Equal<A> and Hash<A> instead of equals and hashCode.
What's best about this approach is that you can never forget to write equals and hash method when they are called for. The type system will help you remember.
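To show the general idea of equality living outside the class, here is a hand-rolled sketch. It is NOT Functional Java's API: the Equivalence interface, its by and and methods, and the Person class are all hypothetical names invented for this example.

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical interface: equality defined outside the class being compared.
interface Equivalence<T> {
    boolean equal(T a, T b);

    // Builds an Equivalence for T by comparing a derived key (null-safe).
    static <T, K> Equivalence<T> by(Function<? super T, ? extends K> key) {
        return (a, b) -> Objects.equals(key.apply(a), key.apply(b));
    }

    // Combines two equivalences; both must hold.
    default Equivalence<T> and(Equivalence<T> other) {
        return (a, b) -> equal(a, b) && other.equal(a, b);
    }
}

// Hypothetical class that does not override equals at all.
final class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

final class EquivalenceDemo {
    public static void main(String[] args) {
        Equivalence<Person> byName = Equivalence.by((Person p) -> p.name);
        Equivalence<Person> byNameAndAge = byName.and(Equivalence.by((Person p) -> p.age));

        Person a = new Person("Alice", 30);
        Person b = new Person("Alice", 41);
        System.out.println(byName.equal(a, b));       // true
        System.out.println(byNameAndAge.equal(a, b)); // false
    }
}
```

The parameters are typed as T rather than Object, so comparing a Person with something that is not a Person is a compile error instead of a silent false.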

Related

Reason behind JVM's default Object.HashCode() implementation

I am trying to understand why the JVM's default implementation does not return the same hashCode() value for all objects...
I have written a program where I have overridden equals() but not hashCode(), and the consequences are scary.
HashSet adds two objects even though they are equal according to equals().
TreeSet throws an exception with a Comparable implementation...
And many more...
Had the default Object.hashCode() implementation returned the same int value, all these issues could have been avoided...
I understand there's a lot written and discussed about hashCode() and equals(), but I am not able to understand why things can't be handled by default. This is error prone, and the consequences could be really bad and scary...
Here's my sample program..
import java.util.HashSet;
import java.util.Set;
public class HashcodeTest {
public static void main(String...strings ) {
Car car1 = new Car("honda", "red");
Car car2 = new Car("honda", "red");
Set<Car> set = new HashSet<Car>();
set.add(car1);
set.add(car2);
System.out.println("size of Set : "+set.size());
System.out.println("hashCode for car1 : "+car1.hashCode());
System.out.println("hashCode for car2 : "+car2.hashCode());
}
}
class Car{
private String name;
private String color;
public Car(String name, String color) {
super();
this.name = name;
this.color = color;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getColor() {
return color;
}
public void setColor(String color) {
this.color = color;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Car other = (Car) obj;
if (color == null) {
if (other.color != null)
return false;
} else if (!color.equals(other.color))
return false;
if (name == null) {
if (other.name != null)
return false;
} else if (!name.equals(other.name))
return false;
return true;
}
}
Output:
size of Set : 2
hashCode for car1 : 330932989
hashCode for car2 : 8100393
It seems that you are proposing to calculate hashCode by default by taking all the object's fields and combining their hashCodes using some formula. Such an approach is wrong and may lead to many unpleasant consequences. In your case it would work, because your object is very simple. But real-life objects are much more complex. A few examples:
Objects are connected into a doubly-linked list (every object has previous and next fields). How would the default implementation calculate the hashCode? If it has to check the fields, it will end up in infinite recursion.
OK, suppose that we can detect infinite recursion. Let's just have a singly-linked list. In this case, should the hashCode of every node be calculated from all the successor nodes? What if this list contains millions of nodes? Should all of them be checked to generate the hashCode?
Suppose you have two HashSet objects. The first is created like this:
HashSet<Integer> a = new HashSet<>();
a.add(1);
The second is created like this:
HashSet<Integer> b = new HashSet<>();
for(int i=1; i<1000; i++) b.add(i);
for(int i=2; i<1000; i++) b.remove(i);
From the user's point of view both contain only one element. But programmatically the second one holds a big hash table inside (like an array of 2048 entries of which only one is not null), because the hash table was resized when you added many elements. In contrast, the first one holds a small hash table inside (e.g. 16 entries). So programmatically the objects are very different: one has a big array, the other has a small array. But they are equal and have the same hashCode, thanks to the custom implementation of hashCode and equals.
Suppose you have different List implementations, for example ArrayList and LinkedList. Both contain the same elements, and from the user's point of view they are equal and should have the same hashCode. And they are indeed equal and have the same hashCode. However, their internal structure is completely different: ArrayList contains an array, while LinkedList contains pointers to the objects representing the head and tail. So you cannot just generate the hashCode based on their fields: it will surely be different.
Some objects may contain a field which is lazily initialized (initialized to null and calculated from other fields only when necessary). What if you have two otherwise equal objects and one has its lazy field initialized while the other does not? We should exclude this lazy field from the hashCode calculation.
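The HashSet and List examples above can be checked directly. A small sketch using only java.util (the class and variable names are made up):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

final class LogicalEqualityDemo {
    public static void main(String[] args) {
        // Different internal structure, same elements in the same order.
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(arrayList.equals(linkedList));                  // true
        System.out.println(arrayList.hashCode() == linkedList.hashCode()); // true

        // Same single element, but the second set was grown and then emptied again.
        Set<Integer> small = new HashSet<>();
        small.add(1);
        Set<Integer> grownThenShrunk = new HashSet<>();
        for (int i = 1; i < 1000; i++) grownThenShrunk.add(i);
        for (int i = 2; i < 1000; i++) grownThenShrunk.remove(i);
        System.out.println(small.equals(grownThenShrunk));                  // true
        System.out.println(small.hashCode() == grownThenShrunk.hashCode()); // true
    }
}
```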
So, there are many cases where a universal hashCode approach would not work and may even cause problems (like making your program crash with StackOverflowError or get stuck enumerating all the linked objects). Because of this, the simplest implementation was selected, which is based on object identity. Note that the contract of hashCode and equals requires them to be consistent, and the default implementation fulfils that. If you redefine equals, you simply must redefine hashCode as well.
You broke the contract.
hashCode and equals should be written in such a way that when equals returns true, the objects have the same hash code.
If you override equals then you must provide a hashCode that works properly.
The default implementation can't handle this, because it doesn't know which fields are important. An automatic implementation would not do it efficiently either: the purpose of the hash code is to speed up operations like lookup in data structures, and if it is implemented improperly, performance will suffer.
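For the Car class in the question, a minimal sketch of the fix could look like this. FixedCar is a trimmed, renamed copy made up for the example; the use of java.util.Objects (Java 7+) is my choice, not something from the question.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Trimmed version of the question's Car, with hashCode added.
final class FixedCar {
    private final String name;
    private final String color;

    FixedCar(String name, String color) {
        this.name = name;
        this.color = color;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        FixedCar other = (FixedCar) obj;
        return Objects.equals(name, other.name) && Objects.equals(color, other.color);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals, so equal cars land in the same bucket.
        return Objects.hash(name, color);
    }

    public static void main(String[] args) {
        Set<FixedCar> set = new HashSet<>();
        set.add(new FixedCar("honda", "red"));
        set.add(new FixedCar("honda", "red"));
        System.out.println("size of Set : " + set.size()); // size of Set : 1
    }
}
```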
From the Docs
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
From documentation:
If two objects are equal according to the equals(Object)
method, then calling the hashCode method on each of
the two objects must produce the same integer result.
So if you override how equals() behaves, you must override hashCode() as well.
Also, from docs of equals() -
Note that it is generally necessary to override the hashCode
method whenever this method is overridden, so as to maintain the
general contract for the hashCode method, which states
that equal objects must have equal hash codes.
From javadoc of Object class:
Returns a hash code value for the object. This method is supported for the benefit of hash tables such as those provided by HashMap.
Thus, if the default implementation returned the same hash for every object, it would defeat the purpose.
And a default implementation cannot assume that every class is a value class, hence the last sentence from the doc:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects.

data members to consider while overriding hashcode and equals

I know that (per the contract) we need to override hashCode when equals is overridden.
Why should I use the same fields for computing the hash code as for the equals comparison?
Is it to improve performance, by avoiding too many objects mapping to the same bucket, as in the case below?
I.e. all objects created on the same "date" would map to the same bucket, and the linear comparison with equals() would take time when checking whether an object exists?
If my statement above is true, what other potential issues come with the code below, apart from the performance issue? Is that the only reason we should use the same fields/members in hashCode as in equals? Please share. Thanks.
class MyClass {
int date;
int pay;
int id;
public boolean equals(Object o) {
//null and same class instance check
MyClass obj = (MyClass) o;
return (date == obj.date && pay == obj.pay && id == obj.id);
}
public int hashCode() {
int hash = 7;
return (31 * hash + date);
}
}
//please pardon syntax errors, I typed without using ide.
***My intention is to use all fields in equals, and to know why the same fields should be used in hashCode, and what happens if only a few fields are used.
Clarification:
With only "date" used to compute the hash code, the lookup goes to the right bucket (do you agree?). Then I get the list of items in that bucket, and the collection iterates over it, using equals to check whether a particular object exists. My definition of equals is "all fields must be the same". With this, I believe my code works fine, and the only issue is performance. Please point out where I am wrong. Thank you.
For your example, I suggest you use just id for equality, and that you annotate the methods as overrides. Also, I like to override toString():
@Override
public boolean equals(Object o) {
    if (o instanceof MyClass) {
        return (id == ((MyClass) o).id);
    }
    return false;
}
@Override
public int hashCode() {
    return id;
}
@Override
public String toString() {
    return String.format("MyClass (id=%d, date=%d, pay=%d)", id, date, pay);
}
That way you can update the date and/or the pay without having to recreate the hash structure. Also, that's what appears to be unique about instances.
I found the answer in Effective Java by Joshua Bloch, 2nd edition, page 49: "Do not be tempted to exclude significant parts of an object from the hash code computation to improve performance". A poor-quality hash function may degrade hash tables' performance.
So my guess was right, multiple hashes will map to same bucket.
Additional information:
http://www.javaranch.com/journal/2002/10/equalhash.html
Since the class members/variables num and data do participate in the
equals method comparison, they should also be involved in the
calculation of the hash code. Though, this is not mandatory. You can
use subset of the variables that participate in the equals method
comparison to improve performance of the hashCode method. Performance
of the hashCode method indeed is very important.
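To make the trade-off concrete, here is a sketch of the question's class with the elided checks filled in and hashCode still based on date alone. DateOnlyHash is a made-up name and the sample values are arbitrary.

```java
import java.util.HashSet;
import java.util.Set;

// equals uses all three fields; hashCode deliberately uses only date.
final class DateOnlyHash {
    final int date;
    final int pay;
    final int id;

    DateOnlyHash(int date, int pay, int id) {
        this.date = date;
        this.pay = pay;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DateOnlyHash)) {
            return false;
        }
        DateOnlyHash other = (DateOnlyHash) o;
        return date == other.date && pay == other.pay && id == other.id;
    }

    @Override
    public int hashCode() {
        return 31 * 7 + date;
    }

    public static void main(String[] args) {
        Set<DateOnlyHash> set = new HashSet<>();
        set.add(new DateOnlyHash(20240101, 100, 1));
        set.add(new DateOnlyHash(20240101, 200, 2));

        // Still correct: unequal objects are kept apart, equal ones are found.
        System.out.println(set.size());                                        // 2
        System.out.println(set.contains(new DateOnlyHash(20240101, 100, 1))); // true
        System.out.println(set.contains(new DateOnlyHash(20240101, 100, 3))); // false
    }
}
```

The contract is satisfied because equal objects always share a date and therefore a hash code. The cost is that every object with the same date collides, so lookups within that bucket fall back to calling equals on each candidate.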

What is a 'canonical representation' of a field meant to be for equals() method (Joshua Bloch)

In chapter 3, item 8:
public final class CaseInsensitiveString {
private final String s;
public CaseInsensitiveString(String s) {
if (s == null)
throw new NullPointerException();
this.s = s;
}
@Override public boolean equals(Object o) {
return o instanceof CaseInsensitiveString &&
((CaseInsensitiveString) o).s.equalsIgnoreCase(s);
}
// remainder omitted
}
After describing issues surrounding the equals() method, he goes on to talk about this class in the context of comparing fields.
For some classes, such as CaseInsensitiveString above, field comparisons are more complex than simple equality tests. If this is the case, you may want to store a canonical form of the field, so the equals() method can do cheap exact comparisons on these canonical forms rather than more costly inexact comparisons. This technique is most appropriate for immutable classes; if the object can change, you must keep the canonical form up-to-date.
So my question (and I double-checked what 'canonical' means): what is Bloch talking about? What would the canonical form be? I'm ready to be told that the answer is very simple (presumably otherwise his editor would have told him to add more) but I want to see other people say so.
He also mentions the same thing for hashCode() in the next item 9.
To give it in context, he also discusses a bad version of the equals() method for CaseInsensitiveString:
// Broken - violates symmetry
@Override public boolean equals(Object o) {
if (o instanceof CaseInsensitiveString)
return s.equalsIgnoreCase(
((CaseInsensitiveString) o).s);
if (o instanceof String) // one-way interoperability!
return s.equalsIgnoreCase((String) o);
return false;
}
You should add another final field and store the value s.toUpperCase() in it.
This new field will be the canonical representation of the s field. The new implementation of equals() (see code below) will be cheaper. This approach only works for immutable classes.
Another point: do not forget to override hashCode() if you override equals().
public final class CaseInsensitiveString {
private final String s;
private final String sForEquals; // field added for a simpler equals method
public CaseInsensitiveString(String s) {
if (s == null) {
throw new IllegalArgumentException(); //NullPointerException() - bad practice
}
this.s = s;
this.sForEquals = s.toUpperCase();
}
@Override
public boolean equals(Object o) {
return o instanceof CaseInsensitiveString &&
((CaseInsensitiveString) o).sForEquals.equals(this.sForEquals);
}
@Override
public int hashCode(){
return sForEquals.hashCode();
}
// remainder omitted
}
The term canonical has several usages. It refers to values that have several representations (or maybe several varying values that are equal). Often one specific representation (or value) is then chosen as the canonical one.
Example: Sets of integers: canonical { 2, 3, 5 } = { 3, 5, 2 } = { 2, 2, 5, 3 } = .... .
For the plain Java String there is an issue too. The same text in Unicode can be represented differently: ĉ either as one code point "\u0109" (SMALL-LETTER-C-WITH-CIRCUMFLEX), or as two code points, c (SMALL-LETTER-C) and a zero-width ^ (COMBINED-DIACRITICAL-MARK-CIRCUMFLEX), i.e. "\u0063\u0302".
So even a plain String should be canonicalized in some cases:
String s = "...";
String s1 = Normalizer.normalize(s, Normalizer.Form.NFKD);
This uses Normalizer to decompose a string. This has the advantage that one can sort so that "c" and "ĉ" stay together. One could also remove the combining diacritical marks with a regex to get an ASCII version.
In fact, different operating systems handle Unicode names differently, and version control systems do not always respect a cross-platform canonicalisation.
Only after Normalizer.normalize does a comparison with String.equals really indicate Unicode text equality.
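A small sketch of that last point, using the ĉ example. The helper name sameText is made up, and NFC is used here only for illustration: any one normal form works as long as both sides are normalized the same way.

```java
import java.text.Normalizer;

final class NormalizeDemo {
    // Compares two strings after bringing both to the same normal form.
    static boolean sameText(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String composed = "\u0109";     // single code point
        String decomposed = "c\u0302";  // c followed by a combining circumflex
        System.out.println(composed.equals(decomposed));    // false
        System.out.println(sameText(composed, decomposed)); // true
    }
}
```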
Your question had two parts:
Canonical form means "standardised form", in this case a lowercase version of the field, used for comparison. Every time the value changes, the lowercase copy would have to be updated, so there's an overhead to this design choice. Further, this idea is an optimization for performance only, and frankly is not recommended as it's "premature optimisation".
Non-symmetry of equals allows code such that a.equals(b) but not b.equals(a), thus violating the equals contract. In your example, it's possible for an instance of your class to be equal to a String, because its equals() method allows that, but the implementation of equals() in the String class does not allow a String to be considered equal to an instance of your class.
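The asymmetry is easy to reproduce. This sketch copies the broken equals from the question into a renamed class (BrokenCaseInsensitiveString is a made-up name, and the hashCode is added only so the class is complete):

```java
import java.util.Locale;

// The "broken" equals from the question, reduced to a runnable demo.
final class BrokenCaseInsensitiveString {
    private final String s;

    BrokenCaseInsensitiveString(String s) {
        if (s == null)
            throw new NullPointerException();
        this.s = s;
    }

    @Override
    public boolean equals(Object o) {
        if (o instanceof BrokenCaseInsensitiveString)
            return s.equalsIgnoreCase(((BrokenCaseInsensitiveString) o).s);
        if (o instanceof String) // one-way interoperability
            return s.equalsIgnoreCase((String) o);
        return false;
    }

    @Override
    public int hashCode() {
        return s.toLowerCase(Locale.ROOT).hashCode();
    }

    public static void main(String[] args) {
        BrokenCaseInsensitiveString cis = new BrokenCaseInsensitiveString("Polish");
        String plain = "polish";
        System.out.println(cis.equals(plain)); // true
        System.out.println(plain.equals(cis)); // false, String knows nothing about this class
    }
}
```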

In Java how can I check if an object is in a linked list?

Below is my class. The insertSymbol method is supposed to add an object to the linked list which is then added to a hash table. But when I print the contents of the hash table it has double entries. I tried to correct this by using "if(temp.contains(value)){return;}" but it isn't working. I read that I need to use @Override in a couple of places. Could anyone help me know how and where to use the overrides? Thank you!
import java.util.*;
public class Semantic {
String currentScope;
Stack theStack = new Stack();
HashMap<String, LinkedList> SymbolTable= new HashMap<String, LinkedList>();
public void insertSymbol(String key, SymbolTableItem value){
LinkedList<SymbolTableItem> temp = new LinkedList<SymbolTableItem>();
if(SymbolTable.get(key) == null){
temp.addLast(value);
SymbolTable.put(key, temp);
}else{
temp = SymbolTable.get(key);
if(temp.contains(value)){
return;
}else{
temp.addLast(value);
SymbolTable.put(key, temp);
}
}
}
public String printValues(){
return SymbolTable.toString();
}
public boolean isBoolean(){
return true;
}
public boolean isTypeMatching(){
return true;
}
public void stackPush(String theString){
theStack.add(theString);
}
}
You have multiple options here. At the very least, you'll need to add an equals method (and therefore also a hashCode method) to your class.
However, if you want your collection to only contain unique items, why not use a Set instead?
If you still want to use a List, you can keep your current approach. It's just that a Set guarantees that all its items are unique, so a Set might make sense here.
Adding an equals method can be done quite easily. Apache Commons Lang's EqualsBuilder is a good approach for this.
You don't need the 2nd line when you add a new value with the same key:
temp.addLast(value);
SymbolTable.put(key, temp); // <-- Not needed. It's already in there.
Let me explain something that @ErikPragt alludes to regarding this code:
if(temp.contains(value)){
What do you suppose that means?
If you look in the javadocs for LinkedList you will find that if a value in the list is non-null, it uses the equals() method on the value object to see if the list element is the same.
What that means, in your case, is that your class SymbolTableItem needs an equals() method that will compare two of these objects to see if they are the same, whatever that means in your case.
Let's assume the instances will be considered the same if the names are the same. You will need a method like this in the 'SymbolTableItem' class:
@Override
public boolean equals(Object that) {
    // Also rejects null, since null is never an instance of anything.
    if (!(that instanceof SymbolTableItem)) {
        return false;
    }
    SymbolTableItem other = (SymbolTableItem) that;
    if (this.getName() == null) {
        return other.getName() == null;
    }
    return this.getName().equals(other.getName());
}
If it depends on more fields, the equals method will be correspondingly more complex.
NOTE: One more thing. If you add an equals method to a class, it is good programming practice to add a hashCode() method too. The rule is that if two instances are equal, they must have the same hash code; if they are not equal, they don't have to have different hash codes, but it would be very nice if they did.
If you use your existing code, where only equals is used, you don't strictly need a hashCode method. But if you don't add one it could be a problem someday. Maybe today.
In the case where the name is all that matters, your hashCode method could just return this.getName().hashCode().
Again, if there are more things to compare to tell if they are equal, the hashCode method will be more complex.
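Putting the suggestions from this thread together, a sketch might look like the following. The question never shows SymbolTableItem, so the version here (a single name field) is an assumption, and SymbolTable and count are made-up names for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for the question's SymbolTableItem; only a name is assumed.
final class SymbolTableItem {
    private final String name;

    SymbolTableItem(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object that) {
        if (!(that instanceof SymbolTableItem)) {
            return false;
        }
        return Objects.equals(name, ((SymbolTableItem) that).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

final class SymbolTable {
    // LinkedHashSet rejects duplicates and keeps insertion order.
    private final Map<String, Set<SymbolTableItem>> table = new HashMap<>();

    // Returns true if the item was added, false if an equal item was already present.
    boolean insertSymbol(String key, SymbolTableItem value) {
        return table.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(value);
    }

    int count(String key) {
        Set<SymbolTableItem> items = table.get(key);
        return items == null ? 0 : items.size();
    }
}
```

With a Set per key there is no need for the contains check or the second put: add already reports whether the item was new.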

Is there a Java utility to do a deep comparison of two objects? [closed]

How to "deep"-compare two objects that do not implement the equals method based on their field values in a test?
Original Question (closed because lack of precision and thus not fulfilling SO standards), kept for documentation purposes:
I'm trying to write unit tests for a variety of clone() operations inside a large project and I'm wondering if there is an existing class somewhere that is capable of taking two objects of the same type, doing a deep comparison, and saying if they're identical or not?
Unitils has this functionality:
Equality assertion through reflection, with different options like ignoring Java default/null values and ignoring order of collections
I love this question! Mainly because it is hardly ever answered or answered badly. It's like nobody has figured it out yet. Virgin territory :)
First off, don't even think about using equals. The contract of equals, as defined in the javadoc, is an equivalence relation (reflexive, symmetric, and transitive), not an equality relation. For that, it would also have to be antisymmetric. The only implementation of equals that is (or ever could be) a true equality relation is the one in java.lang.Object. Even if you did use equals to compare everything in the graph, the risk of breaking the contract is quite high. As Josh Bloch pointed out in Effective Java, the contract of equals is very easy to break:
"There is simply no way to extend an instantiable class and add an aspect while preserving the equals contract"
Besides what good does a boolean method really do you anyway? It'd be nice to actually encapsulate all the differences between the original and the clone, don't you think? Also, I'll assume here that you don't want to be bothered with writing/maintaining comparison code for each object in the graph, but rather you're looking for something that will scale with the source as it changes over time.
Soooo, what you really want is some kind of state comparison tool. How that tool is implemented is really dependent on the nature of your domain model and your performance restrictions. In my experience, there is no generic magic bullet. And it will be slow over a large number of iterations. But for testing the completeness of a clone operation, it'll do the job pretty well. Your two best options are serialization and reflection.
Some issues you will encounter:
Collection order: Should two collections be considered similar if they hold the same objects, but in a different order?
Which fields to ignore: Transient? Static?
Type equivalence: Should field values be of exactly the same type? Or is it ok for one to extend the other?
There's more, but I forget...
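For the collection-order question, one possible policy is "same elements, any order". The sketch below is only an illustration under assumptions: UnorderedCompare and sameElementsIgnoringOrder are hypothetical names, and the caller supplies the element comparison as a BiPredicate so that the elements themselves do not need a usable equals. The greedy matching is only reliable if that predicate behaves like an equivalence relation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical helper: order-insensitive comparison with a caller-supplied test.
final class UnorderedCompare {

    static <T> boolean sameElementsIgnoringOrder(
            Collection<T> left, Collection<T> right, BiPredicate<T, T> same) {
        if (left.size() != right.size()) {
            return false;
        }
        // Copy the right side so matched elements can be crossed off,
        // which keeps duplicates from being counted twice.
        List<T> unmatched = new ArrayList<>(right);
        for (T l : left) {
            boolean found = false;
            for (Iterator<T> it = unmatched.iterator(); it.hasNext(); ) {
                if (same.test(l, it.next())) {
                    it.remove();
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }
}
```

This is quadratic in the collection size, which is usually acceptable in a unit test but not in a hot path.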
XStream is pretty fast and combined with XMLUnit will do the job in just a few lines of code. XMLUnit is nice because it can report all the differences, or just stop at the first one it finds. And its output includes the xpath to the differing nodes, which is nice. By default it doesn't allow unordered collections, but it can be configured to do so. Injecting a special difference handler (Called a DifferenceListener) allows you to specify the way you want to deal with differences, including ignoring order. However, as soon as you want to do anything beyond the simplest customization, it becomes difficult to write and the details tend to be tied down to a specific domain object.
My personal preference is to use reflection to cycle through all the declared fields and drill down into each one, tracking differences as I go. Word of warning: Don't use recursion unless you like stack overflow exceptions. Keep things in scope with a stack (use a LinkedList or something). I usually ignore transient and static fields, and I skip object pairs that I've already compared, so I don't end up in infinite loops if someone decided to write self-referential code (However, I always compare primitive wrappers no matter what, since the same object refs are often reused). You can configure things up front to ignore collection ordering and to ignore special types or fields, but I like to define my state comparison policies on the fields themselves via annotations. This, IMHO, is exactly what annotations were meant for, to make meta data about the class available at runtime. Something like:
@StatePolicy(unordered=true, ignore=false, exactTypesOnly=true)
private List<StringyThing> _mylist;
I think this is actually a really hard problem, but totally solvable! And once you have something that works for you, it is really, really, handy :)
So, good luck. And if you come up with something that's just pure genius, don't forget to share!
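A rough sketch of the reflection approach described above, under several assumptions: StateComparer, sameState and DemoNode are hypothetical names, the annotation-driven policies are left out, and JDK types (anything in a java.* package) and enums are treated as leaves compared with equals() instead of being opened up reflectively. It uses an explicit stack rather than recursion, skips static and transient fields, and remembers already-compared pairs so cycles terminate.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Tiny self-referential type used only to exercise the sketch.
final class DemoNode {
    int value;
    DemoNode next;
    DemoNode(int value) { this.value = value; }
}

// Hypothetical helper, not part of any library.
final class StateComparer {

    // A pair of references; equality is by identity so it can key the visited set.
    private static final class Pair {
        final Object left;
        final Object right;
        Pair(Object left, Object right) { this.left = left; this.right = right; }
        @Override public boolean equals(Object o) {
            return o instanceof Pair && ((Pair) o).left == left && ((Pair) o).right == right;
        }
        @Override public int hashCode() {
            return 31 * System.identityHashCode(left) + System.identityHashCode(right);
        }
    }

    static boolean sameState(Object a, Object b) {
        Deque<Pair> pending = new ArrayDeque<>();   // explicit stack, no recursion
        Set<Pair> visited = new HashSet<>();        // pairs already expanded
        pending.push(new Pair(a, b));
        while (!pending.isEmpty()) {
            Pair p = pending.pop();
            Object l = p.left;
            Object r = p.right;
            if (l == r) continue;
            if (l == null || r == null || l.getClass() != r.getClass()) return false;
            Class<?> type = l.getClass();
            if (type.isArray()) {
                int n = Array.getLength(l);
                if (n != Array.getLength(r)) return false;
                for (int i = 0; i < n; i++) {
                    pending.push(new Pair(Array.get(l, i), Array.get(r, i)));
                }
                continue;
            }
            // Simplification: JDK types and enums are compared with equals().
            if (type.isEnum() || type.getName().startsWith("java.")) {
                if (!l.equals(r)) return false;
                continue;
            }
            if (!visited.add(p)) continue;          // already compared: breaks cycles
            // Walk the class hierarchy; state inherited from JDK classes is ignored.
            for (Class<?> c = type; c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    int m = f.getModifiers();
                    if (Modifier.isStatic(m) || Modifier.isTransient(m)) continue;
                    f.setAccessible(true);
                    try {
                        pending.push(new Pair(f.get(l), f.get(r)));
                    } catch (IllegalAccessException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        return true;
    }
}
```

Returning a boolean keeps the sketch short; collecting the field path of each mismatch instead of returning false at the first one would get closer to the "track differences" idea.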
In AssertJ, you can do:
Assertions.assertThat(expectedObject).isEqualToComparingFieldByFieldRecursively(actualObject);
It probably won't work in all cases, but it will work in more cases than you'd think.
Here's what the documentation says:
Assert that the object under test (actual) is equal to the given
object based on recursive a property/field by property/field
comparison (including inherited ones). This can be useful if actual's
equals implementation does not suit you. The recursive property/field
comparison is not applied on fields having a custom equals
implementation, i.e. the overridden equals method will be used instead
of a field by field comparison.
The recursive comparison handles cycles. By default floats are
compared with a precision of 1.0E-6 and doubles with 1.0E-15.
You can specify a custom comparator per (nested) fields or type with
respectively usingComparatorForFields(Comparator, String...) and
usingComparatorForType(Comparator, Class).
The objects to compare can be of different types but must have the
same properties/fields. For example if actual object has a name String
field, it is expected the other object to also have one. If an object
has a field and a property with the same name, the property value will
be used over the field.
Override The equals() Method
You can simply override the equals() method of the class using the EqualsBuilder.reflectionEquals() as explained here:
public boolean equals(Object obj) {
return EqualsBuilder.reflectionEquals(this, obj);
}
Just had to implement comparison of two entity instances revised by Hibernate Envers. I started writing my own differ but then found the following framework.
https://github.com/SQiShER/java-object-diff
You can compare two objects of the same type and it will show changes, additions and removals. If there are no changes, then the objects are equal (in theory). Annotations are provided for getters that should be ignored during the check. The framework has far wider applications than equality checking; for example, I am using it to generate a change log.
Its performance is OK. When comparing JPA entities, be sure to detach them from the entity manager first.
I am using XStream:
/**
* @see java.lang.Object#equals(java.lang.Object)
*/
@Override
public boolean equals(Object o) {
XStream xstream = new XStream();
String oxml = xstream.toXML(o);
String myxml = xstream.toXML(this);
return myxml.equals(oxml);
}
/**
* @see java.lang.Object#hashCode()
*/
@Override
public int hashCode() {
XStream xstream = new XStream();
String myxml = xstream.toXML(this);
return myxml.hashCode();
}
http://www.unitils.org/tutorial-reflectionassert.html
public class User {
private long id;
private String first;
private String last;
public User(long id, String first, String last) {
this.id = id;
this.first = first;
this.last = last;
}
}
User user1 = new User(1, "John", "Doe");
User user2 = new User(1, "John", "Doe");
assertReflectionEquals(user1, user2);
Hamcrest has the Matcher samePropertyValuesAs. But it relies on the JavaBeans Convention (uses getters and setters). Should the objects that are to be compared not have getters and setters for their attributes, this will not work.
import static org.hamcrest.beans.SamePropertyValuesAs.samePropertyValuesAs;
import static org.junit.Assert.assertThat;
import org.junit.Test;
public class UserTest {
@Test
public void asfd() {
User user1 = new User(1, "John", "Doe");
User user2 = new User(1, "John", "Doe");
assertThat(user1, samePropertyValuesAs(user2)); // all good
user2 = new User(1, "John", "Do");
assertThat(user1, samePropertyValuesAs(user2)); // will fail
}
}
The user bean - with getters and setters
public class User {
private long id;
private String first;
private String last;
public User(long id, String first, String last) {
this.id = id;
this.first = first;
this.last = last;
}
public long getId() {
return id;
}
public void setId(long id) {
this.id = id;
}
public String getFirst() {
return first;
}
public void setFirst(String first) {
this.first = first;
}
public String getLast() {
return last;
}
public void setLast(String last) {
this.last = last;
}
}
If your objects implement Serializable you can use this:
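import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.Arrays;
public static boolean deepCompare(Object o1, Object o2) {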
try {
ByteArrayOutputStream baos1 = new ByteArrayOutputStream();
ObjectOutputStream oos1 = new ObjectOutputStream(baos1);
oos1.writeObject(o1);
oos1.close();
ByteArrayOutputStream baos2 = new ByteArrayOutputStream();
ObjectOutputStream oos2 = new ObjectOutputStream(baos2);
oos2.writeObject(o2);
oos2.close();
return Arrays.equals(baos1.toByteArray(), baos2.toByteArray());
} catch (IOException e) {
throw new RuntimeException(e);
}
}
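A self-contained sketch of the same idea, for trying it out. SerializedStateDemo, Account and sameSerializedForm are hypothetical names chosen for this example. One caveat worth knowing: equal byte streams are a stricter test than "equal field values", since the stream also encodes reference sharing and, for hash-based collections, iteration order, so two objects you would consider equal can still serialize differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class, not from any library.
final class SerializedStateDemo {

    // Small Serializable type without an equals() override.
    static final class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String owner;

        Account(long id, String owner) {
            this.id = id;
            this.owner = owner;
        }
    }

    // Serializes one object graph into a byte array.
    static byte[] toBytes(Object o) {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(o);
            oos.flush(); // push buffered data into baos before reading it
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Two graphs count as the same when their serialized forms match byte for byte.
    static boolean sameSerializedForm(Object a, Object b) {
        return Arrays.equals(toBytes(a), toBytes(b));
    }
}
```

Two Account instances built with the same id and owner compare as the same here even though Account does not override equals().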
Your Linked List example is not that difficult to handle. As the code traverses the two object graphs, it places visited objects in a Set or Map. Before traversing into another object reference, this set is tested to see if the object has already been traversed. If so, no need to go further.
I agree with the person above who said to use a LinkedList (like a Stack, but without synchronized methods, so it is faster). Traversing the object graph using a stack, while using reflection to get each field, is the ideal solution. Written once, this "external" equals() and "external" hashCode() is what all equals() and hashCode() methods should call. Never again do you need a custom equals() method.
I wrote a bit of code that traverses a complete object graph, listed over at Google Code. See json-io (http://code.google.com/p/json-io/). It serializes a Java object graph into JSON and deserializes from it. It handles all Java objects, with or without public constructors, Serializable or not, etc. This same traversal code will be the basis for the external "equals()" and external "hashcode()" implementation. Btw, the JsonReader / JsonWriter (json-io) is usually faster than the built-in ObjectInputStream / ObjectOutputStream.
This JsonReader / JsonWriter could be used for comparison, but it will not help with hashcode. If you want a universal hashcode() and equals(), it needs its own code. I may be able to pull this off with a generic graph visitor. We'll see.
Other considerations: static fields are easy. They can be skipped, because static fields are shared across all instances, so every instance sees the same value.
As for transient fields, that will be a selectable option. Sometimes you may want transients to count, other times not. "Sometimes you feel like a nut, sometimes you don't."
Check back to the json-io project (for my other projects) and you will find the external equals() / hashcode() project. I don't have a name for it yet, but it will be obvious.
I think the easiest solution, inspired by Ray Hulha's solution, is to serialize the object and then deep-compare the raw result.
The serialized form could be bytes, JSON, XML, a simple toString(), etc. toString() seems to be the cheapest. Lombok generates an easily customizable toString() for us for free. See the example below.
@ToString @Getter @Setter
class foo{
boolean foo1;
String foo2;
public boolean deepCompare(Object other) { //for cohesiveness
return other != null && this.toString().equals(other.toString());
}
}
I guess you know this, but in theory, you're supposed to always override .equals to assert that two objects are truly equal. This would imply that they check the overridden .equals methods on their members.
This kind of thing is why .equals is defined in Object.
If this were done consistently you wouldn't have a problem.
A halting guarantee for such a deep comparison might be a problem. What should the following do? (If you implement such a comparator, this would make a good unit test.)
LinkedListNode a = new LinkedListNode();
a.next = a;
LinkedListNode b = new LinkedListNode();
b.next = b;
System.out.println(DeepCompare(a, b));
Here's another:
LinkedListNode c = new LinkedListNode();
LinkedListNode d = new LinkedListNode();
c.next = d;
d.next = c;
System.out.println(DeepCompare(c, d));
Apache Commons Lang gives you something: convert both objects to strings and compare the strings, but you have to override toString().
obj1.toString().equals(obj2.toString())
Override toString()
If all fields are primitive types:
import org.apache.commons.lang3.builder.ReflectionToStringBuilder;
@Override
public String toString() {
return ReflectionToStringBuilder.toString(this);
}
If you have non-primitive fields, collections, and/or maps:
// Within class
import org.apache.commons.lang3.builder.ReflectionToStringBuilder;
@Override
public String toString() {
return ReflectionToStringBuilder.toString(this, new MultipleRecursiveToStringStyle());
}
// New class extended from Apache ToStringStyle
import org.apache.commons.lang3.builder.ReflectionToStringBuilder;
import org.apache.commons.lang3.builder.ToStringStyle;
import java.util.*;
public class MultipleRecursiveToStringStyle extends ToStringStyle {
private static final int INFINITE_DEPTH = -1;
private int maxDepth;
private int depth;
public MultipleRecursiveToStringStyle() {
this(INFINITE_DEPTH);
}
public MultipleRecursiveToStringStyle(int maxDepth) {
setUseShortClassName(true);
setUseIdentityHashCode(false);
this.maxDepth = maxDepth;
}
@Override
protected void appendDetail(StringBuffer buffer, String fieldName, Object value) {
if (value.getClass().getName().startsWith("java.lang.")
|| (maxDepth != INFINITE_DEPTH && depth >= maxDepth)) {
buffer.append(value);
} else {
depth++;
buffer.append(ReflectionToStringBuilder.toString(value, this));
depth--;
}
}
@Override
protected void appendDetail(StringBuffer buffer, String fieldName,
Collection<?> coll) {
for(Object value: coll){
if (value.getClass().getName().startsWith("java.lang.")
|| (maxDepth != INFINITE_DEPTH && depth >= maxDepth)) {
buffer.append(value);
} else {
depth++;
buffer.append(ReflectionToStringBuilder.toString(value, this));
depth--;
}
}
}
@Override
protected void appendDetail(StringBuffer buffer, String fieldName, Map<?, ?> map) {
for(Map.Entry<?,?> kvEntry: map.entrySet()){
Object value = kvEntry.getKey();
if (value.getClass().getName().startsWith("java.lang.")
|| (maxDepth != INFINITE_DEPTH && depth >= maxDepth)) {
buffer.append(value);
} else {
depth++;
buffer.append(ReflectionToStringBuilder.toString(value, this));
depth--;
}
value = kvEntry.getValue();
if (value.getClass().getName().startsWith("java.lang.")
|| (maxDepth != INFINITE_DEPTH && depth >= maxDepth)) {
buffer.append(value);
} else {
depth++;
buffer.append(ReflectionToStringBuilder.toString(value, this));
depth--;
}
}
}
}
