Using intermediate array for hashCode and equals - java

As it's a pain to handle structural changes of the class in two places, I often do:
class A {
class C{}
class B{}
private B bChild;
private C cChild;
private Object[] structure() {
return new Object[]{bChild, cChild};
}
public int hashCode() {
return Arrays.hashCode(structure());
}
public boolean equals(Object that) {
//type check here
return Arrays.equals(this.structure(), ((A)that).structure());
}
}
What's bad about this approach besides boxing of primitives?
Can it be improved?

It's a clever way to reuse library methods, which is generally a good idea; but it does a great deal of excess allocation and array manipulation, which might be terribly inefficient in such frequently used methods. All in all, I'd say it's cute, but it wouldn't pass a review.

In JDK 7 they added the java.util.Objects class. It implements hash and equals utilities in a manner that resembles what you wrote. The point being that this approach is actually sanctioned by the JDK developers. Ernest Friedman-Hill has a point, but in the majority of cases I don't think that the extra few machine instructions are worth saving at the expense of readability.
For example: the hash utility method is implemented as:
public static int hash(Object... values) {
return Arrays.hashCode(values);
}
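Applied to the class from the question, a field-by-field version built on java.util.Objects might look like the sketch below. The class name Pair and its String fields are hypothetical stand-ins for A, bChild and cChild; only Objects.hash and Objects.equals come from the JDK.

```java
import java.util.Objects;

// Hypothetical stand-in for class A from the question.
final class Pair {
    private final String bChild;
    private final String cChild;

    Pair(String bChild, String cChild) {
        this.bChild = bChild;
        this.cChild = cChild;
    }

    @Override
    public int hashCode() {
        // Varargs call: allocates an Object[] and delegates to Arrays.hashCode.
        return Objects.hash(bChild, cChild);
    }

    @Override
    public boolean equals(Object that) {
        if (this == that) {
            return true;
        }
        if (!(that instanceof Pair)) {
            return false;
        }
        Pair other = (Pair) that;
        // Objects.equals is null-safe, so null fields need no special casing.
        return Objects.equals(bChild, other.bChild)
                && Objects.equals(cChild, other.cChild);
    }

    public static void main(String[] args) {
        Pair p = new Pair("b", null);
        Pair q = new Pair("b", null);
        System.out.println(p.equals(q));
        System.out.println(p.hashCode() == q.hashCode());
    }
}
```

The fields are still listed twice, which is exactly what the structure() trick avoids, so this trades a little duplication for an equals that does not allocate.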

Someone familiarizing themselves with the code will have a bit more difficulty seeing what's going on. It's less "obvious" than listing the individual fields, as demonstrated by my previously erroneous answer. It is true, that "equals" is generally implemented with an "Object" passed in, so it's debatable, but the input is cast after the reference equality check. That is not the case here.
One improvement might be to store the array as a private data member rather than create it with the structure method, sacrificing a bit of memory to avoid the boxing.

Related

Why does the instance method `hashCode` on `java.lang.Integer` make an extra jump to the static class method to simply return its own integer value?

I was just exploring different kinds of implementations to the hashCode() method. I opened up the java.lang.Integer class and found this implementation for hashCode():
public int hashCode() {
return Integer.hashCode(value);
}
public static int hashCode(int value) {
return value;
}
My question is, why can't the implementation be as simple as:
public int hashCode(){
return this.value;
}
What is the need to create an additional static method to pass around the value and return the same? Am I overlooking any important detail here?
That code does look odd when viewed on its own.
But notice that the static method java.lang.Integer.hashCode:
was added later, in Java 8
is public
The source code in Java 14 shows no comments to explain why this static method was added. Because the method is public, I presume this new static method plays a part in some new feature in Java 8, perhaps related to streams, called elsewhere in the OpenJDK codebase.
As noted in the Javadoc, the source code of the existing Integer::hashCode instance method was rewritten to call the static hashCode simply for consistency. This way there is only one place where the hash code is actually being generated. Having only one place is wise for review and maintenance of the codebase.
Making hashCode static is certainly unusual. The purpose of the hashCode method is to help distinguish one object of a class from another in collections such as HashSet or HashMap. Given that we are comparing instances with this method, it makes sense for hashCode to be an instance method rather than a static one.
An optimizing JIT compiler, such as the one in HotSpot or OpenJ9, is likely to inline the hashCode method calls, making the instance-method versus static-method arrangement in the source code moot.
@Basil Bourque's answer covers just about everything. But he leaves open the question of why the public static int hashCode(int) was added.
The change was made in this changeset in November 2012
http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/be1fb42ef696/src/share/classes/java/lang/Integer.java
The title and summary for the changeset say this:
7088913: Add compatible static hashCode(primitive) to primitive wrapper classes
Summary: Adds static utility methods to each primitive wrapper class to allow calculation of a hashCode value from an unboxed primitive.
Note that the changeset does not document the motivation for the change.
I infer that one purpose of the enhancement is to spare the application programmer from having to know how the hash codes of the primitive wrapper classes are computed. Prior to Java 8, to compute the wrapper-compatible hash code for a primitive int, the programmer would have had to write either
int value = ...
int hash = ((Integer) value).hashCode(); // Facially inefficient (depending on
// JIT compiler's ability to get
// rid of the box/unbox sequence)
or
int value = ...
int hash = value; // Hardwires knowledge of how
// Integer.hashCode() is computed.
While the "knowledge" is trivial for int / Integer, consider the case of double / Double where the hash code computation is:
long bits = doubleToLongBits(value);
return (int)(bits ^ (bits >>> 32));
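As a small check of the point above, the sketch below compares the static helpers added in Java 8 with the boxed instance methods and with the formula just quoted. The class and method names (WrapperHashDemo, manualDoubleHash) are made up for illustration; Double.hashCode(double), Integer.hashCode(int) and Double.doubleToLongBits are the real JDK methods.

```java
final class WrapperHashDemo {
    // The formula quoted above, written out by hand.
    static int manualDoubleHash(double value) {
        long bits = Double.doubleToLongBits(value);
        return (int) (bits ^ (bits >>> 32));
    }

    public static void main(String[] args) {
        double d = 3.25;
        // Static helper versus boxing followed by the instance method.
        System.out.println(Double.hashCode(d) == Double.valueOf(d).hashCode());
        // Static helper versus the hand-written formula.
        System.out.println(Double.hashCode(d) == manualDoubleHash(d));
        // For int, the hash code is the value itself.
        System.out.println(Integer.hashCode(42) == 42);
    }
}
```

With the static helpers, code that hashes primitives does not need to box or to hard-code either formula.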
It seems likely that this changeset was also motivated by the Streams project; e.g. so that Integer::hashCode can be used in a stream of integers.
However, the changeset that added sum, min and max for use in stream reductions happened a couple of months after this one. So we cannot definitively make the connection ... based on this evidence.

How to compare Java function object to a specific method? [duplicate]

Say I have a List of object which were defined using lambda expressions (closures). Is there a way to inspect them so they can be compared?
The code I am most interested in is
List<Strategy> strategies = getStrategies();
Strategy a = (Strategy) this::a;
if (strategies.contains(a)) { // ...
The full code is
import java.util.Arrays;
import java.util.List;
public class ClosureEqualsMain {
interface Strategy {
void invoke(/*args*/);
default boolean equals(Object o) { // doesn't compile
return Closures.equals(this, o);
}
}
public void a() { }
public void b() { }
public void c() { }
public List<Strategy> getStrategies() {
return Arrays.asList(this::a, this::b, this::c);
}
private void testStrategies() {
List<Strategy> strategies = getStrategies();
System.out.println(strategies);
Strategy a = (Strategy) this::a;
// prints false
System.out.println("strategies.contains(this::a) is " + strategies.contains(a));
}
public static void main(String... ignored) {
new ClosureEqualsMain().testStrategies();
}
enum Closures {;
public static <Closure> boolean equals(Closure c1, Closure c2) {
// This doesn't compare the contents
// like others immutables e.g. String
return c1.equals(c2);
}
public static <Closure> int hashCode(Closure c) {
return // a hashCode which can detect duplicates for a Set<Strategy>
}
public static <Closure> String asString(Closure c) {
return // something better than Object.toString();
}
}
public String toString() {
return "my-ClosureEqualsMain";
}
}
It would appear the only solution is to define each lambda as a field and only use those fields. If you want to print out the method called, you are better off using Method. Is there a better way with lambda expressions?
Also, is it possible to print a lambda and get something human readable? If you print this::a instead of
ClosureEqualsMain$$Lambda$1/821270929@3f99bd52
get something like
ClosureEqualsMain.a()
or even use this.toString and the method.
my-ClosureEqualsMain.a();
This question could be interpreted relative to the specification or the implementation. Obviously, implementations could change, but you might be willing to rewrite your code when that happens, so I'll answer at both.
It also depends on what you want to do. Are you looking to optimize, or are you looking for ironclad guarantees that two instances are (or are not) the same function? (If the latter, you're going to find yourself at odds with computational physics, in that even problems as simple as asking whether two functions compute the same thing are undecidable.)
From a specification perspective, the language spec promises only that the result of evaluating (not invoking) a lambda expression is an instance of a class implementing the target functional interface. It makes no promises about the identity, or degree of aliasing, of the result. This is by design, to give implementations maximal flexibility to offer better performance (this is how lambdas can be faster than inner classes; we're not tied to the "must create unique instance" constraint that inner classes are.)
So basically, the spec doesn't give you much, except obviously that two lambdas that are reference-equal (==) are going to compute the same function.
From an implementation perspective, you can conclude a little more. There is (currently, may change) a 1:1 relationship between the synthetic classes that implement lambdas, and the capture sites in the program. So two separate bits of code that capture "x -> x + 1" may well be mapped to different classes. But if you evaluate the same lambda at the same capture site, and that lambda is non-capturing, you get the same instance, which can be compared with reference equality.
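Since reference equality is the only comparison the specification supports, one workable pattern is the one the question already hints at: evaluate each method reference once, keep it in a field, and compare those instances. A minimal sketch, with the hypothetical class name StrategyFields and field names strategyA, strategyB and strategyC:

```java
import java.util.Arrays;
import java.util.List;

final class StrategyFields {
    interface Strategy {
        void invoke();
    }

    // Each method reference is evaluated exactly once per StrategyFields instance.
    final Strategy strategyA = this::a;
    final Strategy strategyB = this::b;
    final Strategy strategyC = this::c;

    void a() { }
    void b() { }
    void c() { }

    List<Strategy> getStrategies() {
        return Arrays.asList(strategyA, strategyB, strategyC);
    }

    public static void main(String[] args) {
        StrategyFields owner = new StrategyFields();
        List<Strategy> strategies = owner.getStrategies();

        // true: the list holds the very instance stored in the field.
        System.out.println(strategies.contains(owner.strategyA));

        // Evaluating the method reference again may yield a different object,
        // so the specification gives no guarantee about this result.
        Strategy fresh = owner::a;
        System.out.println(strategies.contains(fresh));
    }
}
```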
If your lambdas are serializable, they'll give up their state more easily, in exchange for sacrificing some performance and security (no free lunch.)
One area where it might be practical to tweak the definition of equality is with method references because this would enable them to be used as listeners and be properly unregistered. This is under consideration.
I think what you're trying to get to is: if two lambdas are converted to the same functional interface, are represented by the same behavior function, and have identical captured args, they're the same
Unfortunately, this is both hard to do (for non-serializable lambdas, you can't get at all the components of that) and not enough (because two separately compiled files could convert the same lambda to the same functional interface type, and you wouldn't be able to tell.)
The EG discussed whether to expose enough information to be able to make these judgments, as well as discussing whether lambdas should implement more selective equals/hashCode or more descriptive toString. The conclusion was that we were not willing to pay anything in performance cost to make this information available to the caller (bad tradeoff, punishing 99.99% of users for something that benefits .01%).
A definitive conclusion on toString was not reached but left open to be revisited in the future. However, there were some good arguments made on both sides on this issue; this is not a slam-dunk.
To compare lambdas, I usually let the interface extend Serializable and then compare the serialized bytes. Not very nice, but it works in most cases.
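A rough sketch of that serialization approach follows. The names SerializedLambdaCompare, toBytes and sameSerializedForm are hypothetical, and static method references are used so that there are no captured arguments that would themselves need to be Serializable. The serialized form is an implementation detail of the compiler and runtime, so treat the result as a heuristic rather than a guarantee.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class SerializedLambdaCompare {
    // Extending Serializable makes every lambda of this type serializable.
    interface Strategy extends Serializable {
        void invoke();
    }

    static void a() { }
    static void b() { }

    // Serializes the lambda with standard Java serialization.
    static byte[] toBytes(Serializable s) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Two lambdas count as "the same" here if their serialized bytes match.
    static boolean sameSerializedForm(Strategy x, Strategy y) {
        return Arrays.equals(toBytes(x), toBytes(y));
    }

    public static void main(String[] args) {
        Strategy first = SerializedLambdaCompare::a;
        Strategy second = SerializedLambdaCompare::a;
        Strategy other = SerializedLambdaCompare::b;
        System.out.println(sameSerializedForm(first, second));
        System.out.println(sameSerializedForm(first, other));
    }
}
```

Lambda bodies written out at two different places (as opposed to method references) are compiled to different synthetic methods, so this comparison would typically treat them as different even when the code is identical.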
I don't see a way to get that information from the closure itself.
Closures don't expose their state.
But you can use Java reflection if you want to inspect and compare the methods.
Of course, that is not a very elegant solution, because of the performance cost and the exceptions that have to be caught. But this way you get the metadata.
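For illustration, comparing java.lang.reflect.Method objects instead of lambdas could look like the sketch below. The class MethodCompare and its lookup helper are hypothetical; Class.getMethod and Method.equals are standard reflection API.

```java
import java.lang.reflect.Method;

final class MethodCompare {
    public void a() { }
    public void b() { }

    // Looks up a public no-argument method by name, wrapping the checked exception.
    static Method lookup(String name) {
        try {
            return MethodCompare.class.getMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + name, e);
        }
    }

    // A readable label such as "MethodCompare.a()".
    static String describe(Method m) {
        return m.getDeclaringClass().getSimpleName() + "." + m.getName() + "()";
    }

    public static void main(String[] args) {
        Method first = lookup("a");
        Method second = lookup("a");
        // Method.equals compares declaring class, name, return type and parameter types.
        System.out.println(first.equals(second));
        System.out.println(first.equals(lookup("b")));
        System.out.println(describe(first));
    }
}
```

Unlike a lambda, a Method also yields a human-readable name, which addresses the toString part of the question.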

Java safe return type container

I am a C++ developer by day, and I am used to the convention of const return types. I am aware that there is no facility similar to this in Java.
I have a specific situation and was wondering the best immutable collection for my task. In C++ I would just use std::vector.
I have a WavFile class that currently has a float[] data, I would like to replace this with something that could be immutable.
Some important stipulations about the container is that its size is known at creation, and it does not need to dynamically grow or shrink at all. Secondly, it should be O(1) to index into the container.
And most importantly, like the topic alludes to, I want to be able to have a getter that returns an immutable version of this container.
What container type am I looking for? Is this something that is possible in Java?
If you can live with the boxing cost, a Collections.unmodifiableList() or Guava ImmutableList will work.
If not, try Trove, which provides TFloatArrayList and an easy way to make them unmodifiable.
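A minimal sketch of the boxed variant, assuming a hypothetical WavSamples class in place of the asker's WavFile: the samples are copied once into an ArrayList (O(1) indexing) and only an unmodifiable view is ever handed out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the WavFile class from the question.
final class WavSamples {
    private final List<Float> data;

    WavSamples(float[] samples) {
        // Copy (and box) once at construction; the size never changes afterwards.
        List<Float> copy = new ArrayList<>(samples.length);
        for (float s : samples) {
            copy.add(s);
        }
        this.data = Collections.unmodifiableList(copy);
    }

    // Callers can read by index but cannot modify the list.
    List<Float> getData() {
        return data;
    }

    public static void main(String[] args) {
        WavSamples wav = new WavSamples(new float[] {0.5f, -0.25f});
        System.out.println(wav.getData().get(1));
        try {
            wav.getData().set(0, 1.0f);
        } catch (UnsupportedOperationException e) {
            System.out.println("modification rejected");
        }
    }
}
```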
IMHO, it is better to tackle the problem by design. Returning a collection or the array, even if it is const, is still unsatisfactory OO design, as you are exposing internal data (although not as bad as exposing it as a public member variable).
You may further think of how people are supposed to use the data.
If you expect people to simply iterate through each data point, then you may provide something like
public class WaveFile {
public DoubleStream dataStream() {...}
public void forEachDataPoint(DoubleConsumer consumer) {...}
}
If you really want to allow people to do random access, you may provide
public class WaveFile {
public float dataPointAt(int index) {...}
public DoubleStream dataStream(int fromIndex, int toIndex) {...}
}
This kind of encapsulation avoids a lot of unexpected ways of using the data, and gives you a lot of flexibility in how you represent the data internally. For example, you may do lazy loading from a file on disk, or you may make a "WaveFile" whose data points are generated on the fly by some formula, etc.
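One way the elided method bodies above might be filled in, assuming the samples are kept in a private float[]. The constructor and the size() accessor are additions for the sake of a runnable sketch and are not part of the answer's API.

```java
import java.util.function.DoubleConsumer;
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;

final class WaveFile {
    private final float[] data;

    WaveFile(float[] samples) {
        // Defensive copy so that later changes to the caller's array are not visible here.
        this.data = samples.clone();
    }

    // Hypothetical helper so callers know the valid index range.
    public int size() {
        return data.length;
    }

    // O(1) random access without exposing the array.
    public float dataPointAt(int index) {
        return data[index];
    }

    public DoubleStream dataStream() {
        return dataStream(0, data.length);
    }

    // Streams the samples in [fromIndex, toIndex), widening each float to double.
    public DoubleStream dataStream(int fromIndex, int toIndex) {
        return IntStream.range(fromIndex, toIndex).mapToDouble(i -> data[i]);
    }

    public void forEachDataPoint(DoubleConsumer consumer) {
        dataStream().forEach(consumer);
    }

    public static void main(String[] args) {
        WaveFile wave = new WaveFile(new float[] {1f, 2f, 3f});
        System.out.println(wave.dataPointAt(1));
        System.out.println(wave.dataStream(1, 3).sum());
    }
}
```

There is no FloatStream in the JDK, which is presumably why the answer uses DoubleStream for float samples.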

Creating .hashcode() and .equals() methods for a large object

I have a class with many (about 100) fields. I want to implement hashCode() and equals() methods for these fields, is there any alternative to doing this manually?
There's no great answer. Here are a few suggestions. As others have commented, 100 fields are far too many. Your best bet is to refactor the class. But, if you must keep it all together:
Could you use a Map (or other Collection) to hold many of the fields?
If so, you can use their built-in hashCode() and equals() methods. (Or Guava etc., as pointed out by @dimo414.)
hashCode() should only consider immutable fields (or at least fields that seldom change).
If only a few of your fields are immutable, that will greatly simplify your hashCode() code. And, more importantly, make it correct. :-)
With 100+ fields, what's the realistic chance that two instances will ever be equal?
If the answer is "extremely rarely", ask yourself if you could get away with using the basic Object equality (in effect, using ==)?
Do you already have an informative toString() method?
If so, you can sometimes use that String as an inefficient, but easy to code, hashCode() and equals(). e.g.:
public int hashCode() { return this.toString().hashCode(); }
public boolean equals(Object o) {
return (o instanceof MyClass) &&
(this.toString().equals(o.toString()));
}
Others have pointed out that an object this large is likely not a great pattern to follow, so I'll assume you know that already and have decided to proceed anyways. I'm also going to assume this object is (mostly) immutable, since implementing .hashCode() for mutable objects is generally a bad plan (at the very least, you have to be careful about putting mutable objects in a HashSet or as keys in a HashMap).
If you have a class with a large number of fields you can avoid defining complex .hashCode(), .equals(), and .toString() methods by taking advantage of existing functionality that does the same thing. An easy option is to construct a List or Map of your fields, and simply call the respective methods of that Collection. You can even cache the return values of these functions, rather than hold onto the whole Collection, if you want.
There are also many useful utilities to make these methods easier; there's way too many to list, but I'll try to call out a couple of particularly useful ones:
Stock JDK:
Hashing: Objects.hash(), Arrays.hashCode()
Equals: Arrays.equals()
ToString: Arrays.toString()
Guava:
Hashing: Hashing.combineOrdered() and a whole batch of powerful hashing utilities.
Equals: Iterables.elementsEqual()
ToString: MoreObjects.toStringHelper()
AutoValue: Awesome tool, does everything you want for you as long as your object is conceptually a value type.
Additionally, you could use reflection to get all the fields in your object at runtime. This would be slower than a hard-coded implementation, but it would likely be faster to write. If you aren't overly concerned about speed, this is a good option.
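A hand-rolled sketch of the reflection idea, using only java.lang.reflect. The class ReflectiveValue and its two fields are hypothetical; a real class would have many more. The fields are sorted by name because getDeclaredFields does not promise any particular order.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ReflectiveValue {
    private final int id;
    private final String name;

    ReflectiveValue(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // Collects the values of all instance fields, ordered by field name.
    // Array-typed fields would be compared by identity here, not by content.
    private List<Object> fieldValues() {
        Field[] fields = ReflectiveValue.class.getDeclaredFields();
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        List<Object> values = new ArrayList<>();
        for (Field field : fields) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            try {
                values.add(field.get(this));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return values;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ReflectiveValue
                && fieldValues().equals(((ReflectiveValue) o).fieldValues());
    }

    @Override
    public int hashCode() {
        return fieldValues().hashCode();
    }

    public static void main(String[] args) {
        ReflectiveValue x = new ReflectiveValue(1, "a");
        ReflectiveValue y = new ReflectiveValue(1, "a");
        System.out.println(x.equals(y));
        System.out.println(x.hashCode() == y.hashCode());
    }
}
```

New fields are picked up automatically, at the price of a reflective lookup and a list allocation on every call.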
I think it is better to use Eclipse's code generation for hashCode() and equals(), which does a pretty good job of implementing these methods.
While it's not a good practice to have an object with that many fields, sometimes legacy constraints trap you in a bad situation.
Regardless of size, I find the easiest way to override these methods is by using the Apache Commons library. It uses reflection to generate the values from the instance, and there are a number of ways to configure the results. I also never have to remember to regenerate the methods when the fields change, unlike with the Eclipse-generated methods.
// Assumes Apache Commons Lang 3 (org.apache.commons.lang3.builder.*) is on the classpath.
@Override
public final boolean equals(final Object obj) {
if (obj == this) {
return true;
}
if (obj != null && obj.getClass() == this.getClass()) {
// Third argument (testTransients = true): transient fields are compared as well.
return EqualsBuilder.reflectionEquals(this, obj, true);
}
return false;
}
@Override
public final int hashCode() {
return HashCodeBuilder.reflectionHashCode(this);
}
@Override
public final String toString() {
return ToStringBuilder.reflectionToString(this, ToStringStyle.MULTI_LINE_STYLE);
}

LinkedList insert tied to inserted object

I have code that looks like this:
public class Polynomial {
List<Term> term = new LinkedList<Term>();
and whenever I do something like term.add(anotherTerm), with anotherTerm being another Term object, it seems anotherTerm references the same thing as what I've just inserted into term, so that whenever I try to change anotherTerm, term.get(2) (let's say) gets changed too.
How can I prevent this from happening?
Since code was requested:
//since I was lazy and didn't want to go through the extra step of Polynomial.term.add
public void insert(Term inserting) {
term.add(inserting);
}
Code calling the insert method:
poly.insert(anotherTerm);
Code creating the anotherTerm Term:
Term anotherTerm = new Term(3, 7.6); //sets coefficient and power to 3 and 7.6
New code calling the insert method:
poly.insert((Term)anotherTerm.clone());
Unfortunately, this still doesn't work, because clone() has protected access in java.lang.Object, even after declaring public class Term implements Cloneable {
The solution is simple: make Term immutable.
Effective Java 2nd Edition, Item 15: Minimize mutability:
Immutable objects are simple.
Immutable objects can be shared freely.
Immutable objects make great building blocks for other objects.
Classes should be immutable unless there's a very good reason to make them mutable.
If a class cannot be made immutable, limit its mutability as much as possible.
Make every field final unless there is a compelling reason to make it non-final
Something as simple and small as Term really should be made immutable. It's a much better overall design, and you wouldn't have to worry about things like you were asking in your question.
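A minimal sketch of what an immutable Term could look like. The field names and the withCoefficient helper are assumptions made for illustration; the question only shows a two-argument constructor taking a coefficient and a power.

```java
import java.util.LinkedList;
import java.util.List;

final class Term {
    private final double coefficient;
    private final double exponent;

    Term(double coefficient, double exponent) {
        this.coefficient = coefficient;
        this.exponent = exponent;
    }

    double coefficient() {
        return coefficient;
    }

    double exponent() {
        return exponent;
    }

    // "Changing" a term produces a new object; the original is never touched.
    Term withCoefficient(double newCoefficient) {
        return new Term(newCoefficient, exponent);
    }

    public static void main(String[] args) {
        List<Term> terms = new LinkedList<>();
        Term anotherTerm = new Term(3, 7.6);
        terms.add(anotherTerm);

        // Rebinding the variable does not affect the element stored in the list.
        anotherTerm = anotherTerm.withCoefficient(4);
        System.out.println(terms.get(0).coefficient());
        System.out.println(anotherTerm.coefficient());
    }
}
```

Because no Term can change after construction, sharing the same instance between the list and the caller is harmless, and neither clone() nor a copy constructor is needed.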
See also
What is meant by immutable?
This advice becomes even more compelling since the other answers are suggesting that you use clone().
Effective Java 2nd Edition, Item 11: Override clone judiciously
Because of the many shortcomings, some expert programmers simply choose to never override the clone method and never invoke it except, perhaps, to copy arrays.
From an interview with author Josh Bloch:
If you've read the item about cloning in my book, especially if you read between the lines, you will know that I think clone is deeply broken.
DO NOT make Term implement Cloneable. Make it immutable instead.
See also
How to properly override clone method?
Why people are so afraid of using clone() (on collection and JDK classes) ?
OK, replacing my old answer with this, now that I understand the question and behavior better.
You can do this if you like:
public void insertTerm(Term term) {
polynomial.insert(new Term(term));
}
and then create a new Term constructor like this:
public Term(Term term) {
this.coefficient = term.coefficient;
this.exponent = term.exponent;
}
That should work.
EDIT: Ok, I think I see what it is you're doing now. If you have this class:
public class Polynomial
{
List<Term> term = new LinkedList<Term>();
public void insert(Term inserting)
{
term.add(inserting);
}
}
And then you do this:
Polynomial poly = new Polynomial();
Term term = new Term();
poly.insert(term);
term.coefficient = 4;
...then the object term is the same object as poly.term.get(0). "term" and "poly.term.get(0)" are both references to the same object, so a change made through one is visible through the other.
The question is not so clear, but as a first try: when you add the objects, add anotherTerm.clone().
It sounds like you are not instantiating new Objects, just referencing the same one. You should instantiate a new Term, either with Term term = new Term(); or by cloning term.clone().
EDIT: To be cloneable, Term needs to implement the Cloneable interface. That means that you are responsible for how the new copy of a Term should be defined.
Hard to tell without seeing the code that calls the insert method, but sounds like that is the problem.
