Creating .hashcode() and .equals() methods for a large object


I have a class with many (about 100) fields. I want to implement hashCode() and equals() methods for these fields, is there any alternative to doing this manually?

There's no great answer. Here are a few suggestions. As others have commented, 100 fields is far too many. Your best bet is to refactor the class. But, if you must keep it all together:
Could you use a Map (or other Collection) to hold many of the fields?
If so, you can use their built-in hashCode() and equals() methods. (Or Guava etc., as pointed out by @dimo414.)
hashCode() should only consider immutable fields (or at least fields that seldom change).
If only a few of your fields are immutable, that will greatly simplify your hashCode() code. And, more importantly, make it correct. :-)
With 100+ fields, what's the realistic chance that two instances will ever be equal?
If the answer is "extremely rarely", ask yourself if you could get away with using the basic Object equality (in effect, using ==)?
Do you already have an informative toString() method?
If so, you can sometimes use that String as an inefficient, but easy to code, hashCode() and equals(). e.g.:
public int hashCode() { return this.toString().hashCode(); }
public boolean equals(Object o) {
return (o instanceof MyClass) &&
(this.toString().equals(o.toString()));
}

Others have pointed out that an object this large is likely not a great pattern to follow, so I'll assume you know that already and have decided to proceed anyways. I'm also going to assume this object is (mostly) immutable, since implementing .hashCode() for mutable objects is generally a bad plan (at the very least, you have to be careful about putting mutable objects in a HashSet or as keys in a HashMap).
If you have a class with a large number of fields you can avoid defining complex .hashCode(), .equals(), and .toString() methods by taking advantage of existing functionality that does the same thing. An easy option is to construct a List or Map of your fields, and simply call the respective methods of that Collection. You can even cache the return values of these functions, rather than hold onto the whole Collection, if you want.
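As a rough illustration of the List idea, here is a minimal sketch. The class Person and its three fields are hypothetical stand-ins for a much larger class; the only point is that the field list lives in one place and the collection's own equals(), hashCode() and toString() do the work.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical value class; the field names are illustrative only.
final class Person {
    private final String name;
    private final int age;
    private final String city;

    Person(String name, int age, String city) {
        this.name = name;
        this.age = age;
        this.city = city;
    }

    // Single place that lists the fields taking part in equality.
    // Primitives are boxed here, and null values are allowed.
    private List<Object> fields() {
        return Arrays.<Object>asList(name, age, city);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        // List.equals() compares element by element and handles nulls.
        return fields().equals(((Person) o).fields());
    }

    @Override
    public int hashCode() {
        return fields().hashCode();
    }

    @Override
    public String toString() {
        return "Person" + fields();
    }
}
```

Adding a field then means touching only the constructor and fields(). The cost is a small allocation on every call, which may or may not matter for your workload.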
There are also many useful utilities to make these methods easier; there's way too many to list, but I'll try to call out a couple of particularly useful ones:
Stock JDK:
Hashing: Objects.hash(), Arrays.hashCode()
Equals: Arrays.equals()
ToString: Arrays.toString()
Guava:
Hashing: Hashing.combineOrdered() and a whole batch of powerful hashing utilities.
Equals: Iterables.elementsEqual()
ToString: MoreObjects.toStringHelper()
AutoValue: Awesome tool, does everything you want for you as long as your object is conceptually a value type.
Additionally, you could use reflection to get all the fields in your object at runtime. This would be slower than a hard-coded implementation, but it would likely be faster to write. If you aren't overly concerned about speed, this is a good option.
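A minimal sketch of the reflection option, using only the JDK. ReflectiveEquality and Point3 are hypothetical names invented for this example. The sketch assumes getDeclaredFields() returns fields in the same order for the same class within one JVM run, looks only at fields declared directly on the class (not on superclasses), and compares array fields by reference.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class ReflectiveEquality {
    private ReflectiveEquality() {}

    // Collects the values of the instance fields declared directly on the
    // object's class (superclass fields are not visited in this sketch).
    static List<Object> fieldValues(Object target) {
        List<Object> values = new ArrayList<>();
        for (Field f : target.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            f.setAccessible(true);
            try {
                values.add(f.get(target));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return values;
    }

    static boolean equal(Object a, Object b) {
        if (a == b) return true;
        if (a == null || b == null || a.getClass() != b.getClass()) return false;
        return fieldValues(a).equals(fieldValues(b));
    }

    static int hash(Object target) {
        return fieldValues(target).hashCode();
    }
}

// Hypothetical class that delegates to the helper.
final class Point3 {
    private final int x, y, z;

    Point3(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }

    @Override public boolean equals(Object o) { return ReflectiveEquality.equal(this, o); }
    @Override public int hashCode() { return ReflectiveEquality.hash(this); }
}
```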

I think it is better to use Eclipse's code generation for hashCode() and equals(), which does a pretty good job of implementing these methods.

While it's not a good practice to have an object with that many fields, sometimes legacy constraints trap you in a bad situation.
Regardless of size, I find the easiest way to override these methods is by using the Apache Commons Lang library. It uses reflection to generate the values from the instance, and there are a number of ways to configure the results. I also never have to remember to regenerate the methods when the fields change, unlike the Eclipse-generated methods.
@Override
public final boolean equals(final Object obj) {
    if (obj == this) {
        return true;
    }
    if (obj != null && obj.getClass() == this.getClass()) {
        // Third argument: also compare transient fields.
        return EqualsBuilder.reflectionEquals(this, obj, true);
    }
    return false;
}
@Override
public final int hashCode() {
    return HashCodeBuilder.reflectionHashCode(this);
}
@Override
public final String toString() {
    return ToStringBuilder.reflectionToString(this, ToStringStyle.MULTI_LINE_STYLE);
}

Related

Is there any benefit to use methods of Objects.java? [duplicate]

This question already has an answer here:
Purpose of Objects.isNull(...) / Objects.nonNull(...)
(1 answer)
Closed 4 years ago.
I examined the methods of Objects.java, but I couldn't find much that is useful about them. For example, this is the code that runs when I use Objects.isNull:
public static boolean isNull(Object obj) {
return obj == null;
}
There are two ways to check whether an object is null:
if(o == null)
if(Objects.isNull(o))
So there is not much difference between them. Another example is the code that runs when I use Objects.toString:
public static String toString(Object o) {
return String.valueOf(o);
}
When I use it, it calls the object's toString() behind the scenes. (The only difference is that it returns "null" if the object is null, because it uses String.valueOf().)
And Objects.equals :
public static boolean equals(Object a, Object b) {
return (a == b) || (a != null && a.equals(b));
}
It performs a null check on every call (whether or not it is necessary).
Am I wrong? If I am, why should I use these methods and the other methods of Objects.java?
EDIT
I did not ask this question only about Objects.isNull and Objects.nonNull; I want to know the purpose, usability (beyond lambdas as well) and benefits of the Objects class and its methods. The Javadoc only says that Objects.isNull and Objects.nonNull are intended for use with lambdas (as a predicate, filter(Objects::isNull)). I want to know about the others as well.

Objects.isNull(), and the more useful Objects.nonNull(), exist for the purpose of being used in lambda expressions. Objects.toString() was introduced for null safety (as pointed out by @davidxxx), but is also very useful in lambdas.
For example, list.stream().filter(Objects::nonNull).map(Objects::toString) will give you a Stream<String> with the results of calling toString() on all the elements in list that are not null.
Objects.equals() is useful precisely when you know that the objects you're comparing might be null, as it saves you some typing.
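The stream pipeline described above can be sketched as a small, self-contained helper. The class and method names (ObjectsInStreams, nonNullStrings) are made up for this example.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

final class ObjectsInStreams {
    private ObjectsInStreams() {}

    // Drops null elements, then converts each remaining element to its
    // string form, using the Objects methods as method references.
    static List<String> nonNullStrings(List<?> input) {
        return input.stream()
                .filter(Objects::nonNull)
                .map(Objects::toString)
                .collect(Collectors.toList());
    }
}
```

For instance, passing a list holding 1, null and "two" yields a list holding "1" and "two".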

You seem to be asking 3 separate questions, so I'll address them separately:
isNull() and its companion nonNull() were added in Java 8 to be used as method references, similarly to Integer.sum() and Boolean.logicalOr(). For example:
// Print only non-null elements
list.stream()
.filter(Objects::nonNull)
.forEach(System.out::println);
I don't see any advantage in calling Objects.toString() over String.valueOf(). Maybe it was included for uniformity with the other null-safe helpers.
If you know the objects are non-null, go ahead and use Object.equals(). Objects.equals() is meant to be used when either of them might be null.
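To make the difference concrete, here is a small sketch contrasting the two calls. NullSafeEquals and its method names are hypothetical; only Objects.equals() and String.equals() are real API.

```java
import java.util.Objects;

final class NullSafeEquals {
    private NullSafeEquals() {}

    // Plain equals(): throws NullPointerException when 'a' is null.
    static boolean plain(String a, String b) {
        return a.equals(b);
    }

    // Objects.equals(): true if both are null, false if exactly one is
    // null, otherwise delegates to a.equals(b).
    static boolean nullSafe(String a, String b) {
        return Objects.equals(a, b);
    }
}
```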

In some cases these methods don't add much value, and you can safely ignore them.
1) But when you work with classes that lack some checks (a null check to prevent NullPointerException) or "optimizations" (checking reference equality first in equals(), for example), using these Objects methods protects you from those gaps and keeps your client code robust without writing all these checks yourself.
2) Another interesting use is in a lambda body, where you want to use a method reference.
3) Finally, it gives you a uniform way to perform these very common operations.
These 3 operations rely on 3 different mechanisms:
String.valueOf(o);
if(o == null){...}
a.equals(b);
While these rely on a single mechanism: utility methods defined in Objects.
Objects.toString(o);
if(Objects.isNull(o)){...}
if(Objects.equals(a, b)){...}

How to compare Java function object to a specific method? [duplicate]

Say I have a List of object which were defined using lambda expressions (closures). Is there a way to inspect them so they can be compared?
The code I am most interested in is
List<Strategy> strategies = getStrategies();
Strategy a = (Strategy) this::a;
if (strategies.contains(a)) { // ...
The full code is
import java.util.Arrays;
import java.util.List;
public class ClosureEqualsMain {
interface Strategy {
void invoke(/*args*/);
default boolean equals(Object o) { // doesn't compile
return Closures.equals(this, o);
}
}
public void a() { }
public void b() { }
public void c() { }
public List<Strategy> getStrategies() {
return Arrays.asList(this::a, this::b, this::c);
}
private void testStrategies() {
List<Strategy> strategies = getStrategies();
System.out.println(strategies);
Strategy a = (Strategy) this::a;
// prints false
System.out.println("strategies.contains(this::a) is " + strategies.contains(a));
}
public static void main(String... ignored) {
new ClosureEqualsMain().testStrategies();
}
enum Closures {;
public static <Closure> boolean equals(Closure c1, Closure c2) {
// This doesn't compare the contents
// like others immutables e.g. String
return c1.equals(c2);
}
public static <Closure> int hashCode(Closure c) {
return // a hashCode which can detect duplicates for a Set<Strategy>
}
public static <Closure> String asString(Closure c) {
return // something better than Object.toString();
}
}
public String toString() {
return "my-ClosureEqualsMain";
}
}
It would appear the only solution is to define each lambda as a field and only use those fields. If you want to print out the method called, you are better off using Method. Is there a better way with lambda expressions?
Also, is it possible to print a lambda and get something human readable? If you print this::a instead of
ClosureEqualsMain$$Lambda$1/821270929@3f99bd52
get something like
ClosureEqualsMain.a()
or even use this.toString and the method.
my-ClosureEqualsMain.a();

This question could be interpreted relative to the specification or the implementation. Obviously, implementations could change, but you might be willing to rewrite your code when that happens, so I'll answer both.
It also depends on what you want to do. Are you looking to optimize, or are you looking for ironclad guarantees that two instances are (or are not) the same function? (If the latter, you're going to find yourself at odds with computational physics, in that even problems as simple as asking whether two functions compute the same thing are undecidable.)
From a specification perspective, the language spec promises only that the result of evaluating (not invoking) a lambda expression is an instance of a class implementing the target functional interface. It makes no promises about the identity, or degree of aliasing, of the result. This is by design, to give implementations maximal flexibility to offer better performance (this is how lambdas can be faster than inner classes; we're not tied to the "must create unique instance" constraint that inner classes are.)
So basically, the spec doesn't give you much, except obviously that two lambdas that are reference-equal (==) are going to compute the same function.
From an implementation perspective, you can conclude a little more. There is (currently, may change) a 1:1 relationship between the synthetic classes that implement lambdas, and the capture sites in the program. So two separate bits of code that capture "x -> x + 1" may well be mapped to different classes. But if you evaluate the same lambda at the same capture site, and that lambda is non-capturing, you get the same instance, which can be compared with reference equality.
If your lambdas are serializable, they'll give up their state more easily, in exchange for sacrificing some performance and security (no free lunch.)
One area where it might be practical to tweak the definition of equality is with method references because this would enable them to be used as listeners and be properly unregistered. This is under consideration.
I think what you're trying to get to is: if two lambdas are converted to the same functional interface, are represented by the same behavior function, and have identical captured args, they're the same
Unfortunately, this is both hard to do (for non-serializable lambdas, you can't get at all the components of that) and not enough (because two separately compiled files could convert the same lambda to the same functional interface type, and you wouldn't be able to tell.)
The EG discussed whether to expose enough information to be able to make these judgments, as well as discussing whether lambdas should implement more selective equals/hashCode or more descriptive toString. The conclusion was that we were not willing to pay anything in performance cost to make this information available to the caller (bad tradeoff, punishing 99.99% of users for something that benefits .01%).
A definitive conclusion on toString was not reached but left open to be revisited in the future. However, there were some good arguments made on both sides on this issue; this is not a slam-dunk.
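The workaround mentioned in the question (evaluate each method reference once, keep it in a field, and reuse that instance) can be sketched like this. StrategyHolder and its members are hypothetical names. Only the field-based lookup is guaranteed to succeed; the result for a freshly evaluated this::a is left unspecified, in line with the answer above.

```java
import java.util.Arrays;
import java.util.List;

final class StrategyHolder {
    interface Strategy {
        void invoke();
    }

    void a() { }
    void b() { }

    // Each method reference is evaluated once and kept in a field,
    // so every later use refers to the same instance.
    final Strategy strategyA = this::a;
    final Strategy strategyB = this::b;

    List<Strategy> strategies() {
        return Arrays.asList(strategyA, strategyB);
    }

    boolean containsA() {
        // The very same object is in the list, so this lookup succeeds.
        return strategies().contains(strategyA);
    }

    boolean containsFreshA() {
        // A newly evaluated this::a carries no identity guarantee,
        // so this result is implementation-dependent.
        Strategy fresh = this::a;
        return strategies().contains(fresh);
    }
}
```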

To compare lambdas I usually let the interface extend Serializable and then compare the serialized bytes. Not very nice, but it works in most cases.
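A minimal sketch of that idea, assuming static method references and using only java.io. All names here (SerializedLambdaCompare, Strategy, refA, refB) are hypothetical. Lambda expressions written at different places in the source are typically compiled to separate synthetic methods, so their serialized forms may differ even when the bodies are identical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class SerializedLambdaCompare {
    // Functional interface that also extends Serializable.
    interface Strategy extends Serializable {
        void invoke();
    }

    static void a() { }
    static void b() { }

    static Strategy refA() { return SerializedLambdaCompare::a; }
    static Strategy refB() { return SerializedLambdaCompare::b; }

    // Serializes the object into a byte array with standard Java serialization.
    static byte[] toBytes(Serializable s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Two strategies count as "the same" here if their serialized forms match.
    static boolean sameSerializedForm(Strategy x, Strategy y) {
        return Arrays.equals(toBytes(x), toBytes(y));
    }
}
```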

I don't see a way to get this information from the closure itself.
Closures don't expose their state.
But you can use Java reflection if you want to inspect and compare the methods.
Of course that is not a very elegant solution, because of the performance cost and the exceptions that have to be caught. But this way you get the meta-information.
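A short sketch of the reflection route: java.lang.reflect.Method implements equals() and has a readable toString(), so Method objects can be compared and printed where lambdas cannot. MethodCompare and lookup are hypothetical names.

```java
import java.lang.reflect.Method;

final class MethodCompare {
    public void a() { }
    public void b() { }

    // Looks up a no-argument method declared on this class by name and
    // wraps the checked exception that has to be caught.
    static Method lookup(String name) {
        try {
            return MethodCompare.class.getDeclaredMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + name, e);
        }
    }
}
```

Two separate lookups of "a" return Method objects that are equal to each other, and printing one shows the declaring class and the method name.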

Using intermediate array for hashCode and equals

As it's a pain to handle structural changes of the class in two places, I often do:
class A {
class C{}
class B{}
private B bChild;
private C cChild;
private Object[] structure() {
return new Object[]{bChild, cChild};
}
public int hashCode() {
return Arrays.hashCode(structure());
}
public boolean equals(Object that) {
//type check here
return Arrays.equals(this.structure(), ((A)that).structure());
}
}
What's bad about this approach besides boxing of primitives?
Can it be improved?

It's a clever way to reuse library methods, which is generally a good idea; but it does a great deal of excess allocation and array manipulation, which might be terribly inefficient in such frequently used methods. All in all, I'd say it's cute, but it wouldn't pass a review.

In JDK 7 they added the java.util.Objects class. It actually implements a hash and an equals utility in a manner reminiscent of what you wrote. The point being that this approach is actually sanctioned by the JDK developers. Ernest Friedman-Hill has a point, but in the majority of cases I don't think that the extra few machine instructions are worth saving at the expense of readability.
For example: the hash utility method is implemented as:
public static int hash(Object... values) {
return Arrays.hashCode(values);
}
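Applied to a class like the one in the question, the java.util.Objects utilities might be used as in the following sketch. Node is a hypothetical stand-in, with String and Integer replacing the B and C child types so that the example is self-contained.

```java
import java.util.Objects;

// Hypothetical stand-in for the class in the question.
final class Node {
    private final String bChild;
    private final Integer cChild;

    Node(String bChild, Integer cChild) {
        this.bChild = bChild;
        this.cChild = cChild;
    }

    @Override
    public int hashCode() {
        // Objects.hash() packs its arguments into an array and calls Arrays.hashCode().
        return Objects.hash(bChild, cChild);
    }

    @Override
    public boolean equals(Object that) {
        if (this == that) return true;
        if (!(that instanceof Node)) return false;   // the type check
        Node other = (Node) that;
        // Field-by-field, null-safe comparison with no intermediate array.
        return Objects.equals(bChild, other.bChild)
            && Objects.equals(cChild, other.cChild);
    }
}
```

The field list still appears twice, so this trades the single structure() method for comparisons that avoid building two arrays per equals() call.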

Someone familiarizing themselves with the code will have a bit more difficulty seeing what's going on. It's less "obvious" than listing the individual fields, as demonstrated by my previously erroneous answer. It is true, that "equals" is generally implemented with an "Object" passed in, so it's debatable, but the input is cast after the reference equality check. That is not the case here.
One improvement might be to store the array as a private data member rather than create it with the structure method, sacrificing a bit of memory to avoid the boxing.

Can anything warn me against type.equals(incompatibleType)?

Is there any tool that can warn me against the following sort of code:
if ( someClass.equals( someString ))
For example:
if ( myObject.getClass().equals( myClassName ))
Such a thing is legal Java (equals takes an Object) but will never evaluate to true (a class can never equal a String) so is almost certainly a bug.
I have checked Eclipse, FindBugs and PMD but none seem to support this feature?

Yes, IntelliJ IDEA has such an inspection that I believe is enabled by default. It flags the following:
Class<?> clazz = String.class;
if (clazz.equals("foo")) {
//...
}
With the warning:
'equals()' between objects of inconvertible types.
The inspection can be enabled/disabled through Settings->Project Settings->Inspections, then under Probable Bugs check/uncheck "'equals()' between objects of inconvertible types."
FindBugs also should catch this with the "EC: Call to equals() comparing different types" bug check. It can be integrated with Eclipse as it appears you are aware.
Neither is a silver bullet though; they can't read your mind. The best you can hope for is that it will favour false positives rather than false negatives.

This is the idea behind the IEquatable<T> interface in .NET: providing a mechanism for types to implement what I'll call strongly typed equality. There is also the IEqualityComparer<T> interface for allowing this logic to be implemented in a separate type.
According to this StackOverflow question (answered by Jon Skeet, who generally seems to know what he's talking about), there doesn't seem to be any equivalent in Java.
Of course, you can always implement such a thing yourself for your own types, but it won't do you much good with types that are part of Java's base class libraries. For compile-time detection of such issues, your best bet is some sort of analysis tool (Mark Peters indicates there is apparently one built in to IntelliJ IDEA) that can suggest to you that certain code might be suspect. In general, assuming you aren't one to ignore warnings, this ought to be good enough.

What you are checking for is not necessarily a "problem": equals() is declared in the Object class, and takes an Object as its parameter. Classes override this method, and their implementation may well allow an object of a different class to "equal" the target object.
I have done this a few times myself, for example for allowing an object to "equal" another object if the other object (say a String) matches the key field of the target:
class MyClass {
private String id;
public boolean equals(Object obj) {
// Compare as if "this" is the id field
return id.equals(obj instanceof MyClass ? ((MyClass)obj).id : obj);
}
public int hashCode() {
return id.hashCode(); // so hashCode() agrees with equals()
}
}
It's actually pretty handy, because the following code will work as desired:
List<MyClass> list = new ArrayList<MyClass>();
// collection methods will work with instances:
list.contains(someInstance);
list.remove(someInstance);
list.indexOf(someInstance);
// and with keys!
// handy if you can only get the key, for example from a web url parameter
list.contains("somekey");
list.remove("somekey");
list.indexOf("somekey");
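One caveat is worth checking before relying on this pattern. The equals() contract asks for symmetry, and String.equals() never returns true for a non-String argument. List.contains(o) is documented in terms of o.equals(e), so with java.util.ArrayList the key-based lookup is evaluated from the String side. The sketch below, using a hypothetical KeyedItem class with the same equals() logic, shows which calls succeed under those rules.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyedItem {
    private final String id;

    KeyedItem(String id) { this.id = id; }

    @Override
    public boolean equals(Object obj) {
        // Same idea as above: compare as if "this" were its id.
        return id.equals(obj instanceof KeyedItem ? ((KeyedItem) obj).id : obj);
    }

    @Override
    public int hashCode() { return id.hashCode(); }

    // Builds an ArrayList of items from the given ids.
    static List<KeyedItem> listOf(String... ids) {
        List<KeyedItem> list = new ArrayList<>();
        for (String s : ids) list.add(new KeyedItem(s));
        return list;
    }
}
```

With this class, new KeyedItem("k1").equals("k1") is true, while "k1".equals(new KeyedItem("k1")) is false, and KeyedItem.listOf("k1", "k2").contains("k1") is false on ArrayList. Looking up by instance still works.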

Java equals(): to reflect or not to reflect

This question is specifically related to overriding the equals() method for objects with a large number of fields. First off, let me say that this large object cannot be broken down into multiple components without violating OO principles, so telling me "no class should have more than x fields" won't help.
Moving on, the problem came to fruition when I forgot to check one of the fields for equality. Therefore, my equals method was incorrect. Then I thought to use reflection:
--code removed because it was too distracting--
The purpose of this post isn't necessarily to refactor the code (this isn't even the code I am using), but instead to get input on whether or not this is a good idea.
Pros:
If a new field is added, it is automatically included
The method is much more terse than 30 if statements
Cons:
If a new field is added, it is automatically included, sometimes this is undesirable
Performance: This has to be slower, I don't feel the need to break out a profiler
Whitelisting certain fields to ignore in the comparison is a little ugly
Any thoughts?

If you did want to whitelist for performance reasons, consider using an annotation to indicate which fields to compare. Also, this implementation won't work if your fields don't have good implementations for equals().
P.S. If you go this route for equals(), don't forget to do something similar for hashCode().
P.P.S. I trust you already considered HashCodeBuilder and EqualsBuilder.

Use Eclipse, FFS!
Delete the hashCode and equals methods you have.
Right click on the file.
Select Source -> Generate hashCode() and equals()...
Done! No more worries about reflection.
Repeat for each field added: just use the outline view to delete your two methods, and then let Eclipse autogenerate them.

If you do go the reflection approach, EqualsBuilder is still your friend:
public boolean equals(Object obj) {
return EqualsBuilder.reflectionEquals(this, obj);
}

Here's a thought if you're worried about:
1/ Forgetting to update your big series of if-statements for checking equality when you add/remove a field.
2/ The performance of doing this in the equals() method.
Try the following:
a/ Revert back to using the long sequence of if-statements in your equals() method.
b/ Have a single function which contains a list of the fields (in a String array) and which will check that list against reality (i.e., the reflected fields). It will throw an exception if they don't match.
c/ In your constructor for this object, have a synchronized run-once call to this function (similar to a singleton pattern). In other words, if this is the first object constructed by this class, call the checking function described in (b) above.
The exception will make it immediately obvious when you run your program if you haven't updated your if-statements to match the reflected fields; then you fix the if-statements and update the field list from (b) above.
Subsequent construction of objects will not do this check and your equals() method will run at its maximum possible speed.
Try as I might, I haven't been able to find any real problems with this approach (greater minds may exist on StackOverflow) - there's an extra condition check on each object construction for the run-once behaviour but that seems fairly minor.
If you try hard enough, you could still get your if-statements out of step with your field-list and reflected fields but the exception will ensure your field list matches the reflected fields and you just make sure you update the if-statements and field list at the same time.
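A minimal sketch of steps (a) to (c). It swaps the synchronized run-once constructor call for a static initializer, which the JVM runs exactly once per class. FieldListCheck and Account are hypothetical names, and the two fields stand in for a much longer list.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class FieldListCheck {
    private FieldListCheck() {}

    // Throws if the hand-maintained list differs from the instance
    // fields actually declared on the class.
    static void verify(Class<?> type, String... expectedFields) {
        Set<String> actual = new TreeSet<>();
        for (Field f : type.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                actual.add(f.getName());
            }
        }
        Set<String> expected = new TreeSet<>(Arrays.asList(expectedFields));
        if (!actual.equals(expected)) {
            throw new IllegalStateException(type.getName()
                    + ": equals() covers " + expected + " but the class declares " + actual);
        }
    }
}

final class Account {
    private final String owner;
    private final long balance;

    // Runs once, when the class is initialized; keep this list in step with equals().
    static {
        FieldListCheck.verify(Account.class, "owner", "balance");
    }

    Account(String owner, long balance) {
        this.owner = owner;
        this.balance = balance;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Account)) return false;
        Account other = (Account) o;
        // The hand-written comparisons, with no reflection on this path.
        return owner.equals(other.owner) && balance == other.balance;
    }

    @Override
    public int hashCode() {
        return 31 * owner.hashCode() + Long.hashCode(balance);
    }
}
```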

You can always annotate the fields you do/do not want in your equals method, that should be a straightforward and simple change to it.
Performance is obviously related to how often the object is actually compared, but a lot of frameworks use hash maps, so your equals may be being used more than you think.
Also, speaking of hash maps, you have the same issue with the hashCode method.
Finally, do you really need to compare all of the fields for equality?

You have a few bugs in your code.
You cannot assume that this and obj are the same class. Indeed, it's explicitly allowed for obj to be any other class. You could start with if ( !(obj instanceof MyClass) ) return false; however this is still not correct because obj could be an instance of a subclass with additional fields that might matter.
You have to support null values for obj with a simple if ( obj == null ) return false;
You can't treat null and empty string as equal. Instead treat null specially. Simplest way here is to start by comparing Field.get(obj) == Field.get(this). If they are both null or both happen to point to the same object, this is fast. (Note: This is also an optimization, which you need since this is a slow routine.) If this fails, you can use the fast if ( Field.get(obj) == null || Field.get(this) == null ) return false; to handle cases where exactly one is null. Finally you can use the usual equals().
You're not using foundMismatch.
I agree with Hank that HashCodeBuilder and EqualsBuilder are a better way to go. It's easy to maintain, not a lot of boilerplate code, and you avoid all these issues.

You could use Annotations to exclude fields from the check
e.g.
@IgnoreEquals
String fieldThatShouldNotBeCompared;
And then of course you check the presence of the annotation in your generic equals method.
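A minimal sketch of such a generic equals method, using only the JDK. The annotation IgnoreEquals, the helper AnnotatedEquals and the example class Memo are all hypothetical. The annotation needs RUNTIME retention, otherwise reflection cannot see it, and hashCode() has to skip the same fields so the two methods stay consistent.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Objects;

// Hypothetical marker annotation for fields to leave out of equals().
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface IgnoreEquals { }

final class AnnotatedEquals {
    private AnnotatedEquals() {}

    // Compares all instance fields declared on the class except the annotated ones.
    static boolean equal(Object a, Object b) {
        if (a == b) return true;
        if (a == null || b == null || a.getClass() != b.getClass()) return false;
        try {
            for (Field f : a.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()
                        || f.isAnnotationPresent(IgnoreEquals.class)) {
                    continue;
                }
                f.setAccessible(true);
                // Objects.equals() treats two nulls as equal and one null as unequal.
                if (!Objects.equals(f.get(a), f.get(b))) return false;
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class Memo {
    private final String title;
    @IgnoreEquals
    private final long lastViewedMillis;

    Memo(String title, long lastViewedMillis) {
        this.title = title;
        this.lastViewedMillis = lastViewedMillis;
    }

    @Override public boolean equals(Object o) { return AnnotatedEquals.equal(this, o); }

    // Uses only the fields that equals() compares.
    @Override public int hashCode() { return Objects.hashCode(title); }
}
```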

If you have access to the names of the fields, why don't you make it a standard that fields you don't want to include always start with "local" or "nochk" or something like that.
Then you blacklist all fields that begin with this (code is not so ugly then).
I don't doubt it's a little slower. You need to decide whether you want to swap ease-of-updates against execution speed.

Take a look at org.apache.commons.lang3.builder.EqualsBuilder:
http://commons.apache.org/proper/commons-lang/javadocs/api-3.2/org/apache/commons/lang3/builder/EqualsBuilder.html
