Creating Java classes usable as keys for a hashmap - java

I have made a class called Coordinates which simply holds some x and y integers. I want to use this as a key for a HashMap.
However, I noticed that when you create two different instances of Coordinates with the same x and y values, they are used as different keys by the hash map. That is, you can put two entries even though both of them have the same coordinates.
I have overridden equals():
    public boolean equals(Object obj) {
        if (!(obj instanceof Coord)) {
            return false;
        } else if (obj == this) {
            return true;
        }
        Coord other = (Coord) obj;
        return (x == other.x && y == other.y);
    }
But the HashMap still uses the two instances as if they were different keys. What do I do?
And I know I could use an integer array of two elements instead. But I want to use this class.

You need to override hashCode. Java 7 provides a utility method for this.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

You should also override hashCode() so that two equal instances have the same hashCode(). E.g.:
    @Override
    public int hashCode() {
        int result = x;
        result = 31 * result + y;
        return result;
    }
Note that it is not strictly required for two instances that are not equal to have different hash codes, but the fewer collisions you have, the better performance you'll get from your HashMap.
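To see the effect end to end, here is a minimal sketch. The names CoordKeyDemo and sizeAfterTwoPuts are made up for this illustration; Coord mirrors the class from the question, and using Objects.hash for the hash code is just one of the options shown above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CoordKeyDemo {

    // Value class mirroring the question; equals and hashCode use the same fields.
    static final class Coord {
        private final int x;
        private final int y;

        Coord(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == this) return true;
            if (!(obj instanceof Coord)) return false;
            Coord other = (Coord) obj;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // Puts two distinct but equal instances and reports how many entries result.
    static int sizeAfterTwoPuts() {
        Map<Coord, String> map = new HashMap<>();
        map.put(new Coord(1, 2), "first");
        map.put(new Coord(1, 2), "second"); // replaces the value, adds no entry
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterTwoPuts()); // prints 1
    }
}
```

If the hashCode override is removed from Coord, the two keys will usually end up as separate entries, which is the behaviour described in the question.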

A hash map uses the hashCode method of objects to determine which bucket to put the object into.
If your object doesn't implement hashCode, it inherits the default implementation from Object. From the docs:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
As such, each object will appear to be distinct.
Note that different objects may return the same hashCode. That's called a collision. When that happens, then in addition to the hashCode, the hash map implementation will use the equals method to determine if two objects are equal.
Note that most IDEs offer to generate the equals and hashCode methods from the fields defined in your class. In fact, IntelliJ encourages you to define these two methods at the same time, and for good reason: the two methods are intimately related, and whenever you change, implement, or override one of them, you must review (and most probably change) the other one too.
The methods in this class are 100% generated code (by IntelliJ):
    class Coord {
        private int x;
        private int y;

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Coord coord = (Coord) o;
            if (x != coord.x) return false;
            if (y != coord.y) return false;
            return true;
        }

        @Override
        public int hashCode() {
            int result = x;
            result = 31 * result + y;
            return result;
        }
    }

You probably did not override the hashCode method. Why is that required? To answer this, you must understand how a hash table works.
A hash table is basically an array of linked lists. Each bucket in the array corresponds to a particular value of hashCode % numberOfBuckets. All the objects with the same hashCode % numberOfBuckets are stored in a linked list in the associated bucket and are recognized (during a lookup, for instance) by their equals method. Therefore, the exact requirement is a.hashCode() != b.hashCode() => !a.equals(b), which is equivalent to a.equals(b) => a.hashCode() == b.hashCode().
If you use the default implementation of hashCode, which is based on the object's identity, then two objects that are equal but are different instances (and so, most probably, have different hash codes) will usually be stored in different buckets, resulting in a duplicate key.
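The bucket selection described above can be sketched as follows. This is a simplification under stated assumptions: BucketIndexDemo and bucketIndex are hypothetical names, and the actual java.util.HashMap computes its index differently in detail. Math.floorMod is used because hash codes can be negative.

```java
public class BucketIndexDemo {

    // Simplified bucket selection: the hash code reduced modulo the bucket count.
    // floorMod keeps the index non-negative even for negative hash codes.
    static int bucketIndex(Object key, int numberOfBuckets) {
        return Math.floorMod(key.hashCode(), numberOfBuckets);
    }

    public static void main(String[] args) {
        // Two equal strings have equal hash codes, so they share a bucket.
        String a = new String("key");
        String b = new String("key");
        System.out.println(bucketIndex(a, 16) == bucketIndex(b, 16)); // prints true

        // Two plain Objects use the identity-based default; their hash codes
        // usually differ, so no expected output is given for this line.
        System.out.println(new Object().hashCode() == new Object().hashCode());
    }
}
```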

Related

Is there any chance for the hash codes of two different objects of being same? [duplicate]

In Java, obj.hashCode() returns some value. What is the use of this hash code in programming?
hashCode() is used for bucketing in hash-based implementations like HashMap, Hashtable, HashSet, etc.
The value received from hashCode() is used to compute the bucket number for storing elements of the set/map. This bucket number is the position of the element inside the set/map's internal array.
When you call contains(), it takes the hash code of the element, then looks in the bucket that the hash code points to. Because several elements can share a bucket (multiple objects can have the same hash code), it then uses the equals() method on the elements in that bucket to evaluate whether the objects are equal, and so decides whether contains() is true or false, or whether the element can be added to the set.
From the Javadoc:
Returns a hash code value for the object. This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java programming language.)
hashCode() is a function that takes an object and outputs a numeric value. The hashcode for an object is always the same if the object doesn't change.
Classes like HashMap, Hashtable, HashSet, etc. that need to store objects use the hash code modulo the size of their internal array to choose in what "memory position" (i.e. array position) to store the object.
There are some cases where collisions may occur (two objects end up with the same hashcode), and that, of course, needs to be solved carefully.
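A small sketch of such a collision, using two strings whose hash codes are known to coincide under the formula documented for String.hashCode(). The class and method names are made up for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {

    // Stores two keys whose hash codes collide and returns the map size.
    static int sizeWithCollidingKeys() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map.size();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112 and 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints true
        System.out.println("Aa".equals("BB"));                  // prints false
        // equals() tells the colliding keys apart, so both entries survive.
        System.out.println(sizeWithCollidingKeys());            // prints 2
    }
}
```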
The value returned by hashCode() is the object's hash code, which is the object's memory address in hexadecimal.
By definition, if two objects are equal, their hash code must also be equal. If you override the equals() method, you change the way two objects are equated and Object's implementation of hashCode() is no longer valid. Therefore, if you override the equals() method, you must also override the hashCode() method as well.
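This answer is from the official Java SE 8 tutorial documentation. Note that the memory-address description is at most a typical implementation detail of the default Object.hashCode(); as the Javadoc quoted above says, it is not required by the language.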
A hashcode is a number generated from any object.
This is what allows objects to be stored/retrieved quickly in a Hashtable.
Imagine the following simple example:
On the table in front of you, you have nine boxes, each marked with a number 1 to 9. You also have a pile of wildly different objects to store in these boxes, but once they are in there you need to be able to find them as quickly as possible.
What you need is a way of instantly deciding which box you have put each object in. It works like an index. You decide to find the cabbage, so you look up which box the cabbage is in, then go straight to that box to get it.
Now imagine that you don't want to bother with the index, you want to be able to find out immediately from the object which box it lives in.
In the example, let's use a really simple way of doing this - the number of letters in the name of the object. So the cabbage goes in box 7, the pea goes in box 3, the rocket in box 6, the banjo in box 5 and so on.
What about the rhinoceros, though? It has 10 characters, so we'll change our algorithm a little and "wrap around" so that 10-letter objects go in box 1, 11 letters in box 2 and so on. That should cover any object.
Sometimes a box will have more than one object in it, but if you are looking for a rocket, it's still much quicker to compare a peanut and a rocket, than to check a whole pile of cabbages, peas, banjos, and rhinoceroses.
That's a hash code: a way of getting a number from an object so it can be stored in a Hashtable. In Java, a hash code can be any integer, and each object type is responsible for generating its own. Look up the "hashCode" method of Object.
Source - here
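The box analogy can be written down in a few lines. This is only a sketch of the analogy, not of how Java hashes strings; BoxDemo and boxFor are hypothetical names, and the method assumes a non-empty name.

```java
public class BoxDemo {

    private static final int BOX_COUNT = 9;

    // Box number from the analogy: the name length, wrapped around into 1..9.
    // Assumes a non-empty name.
    static int boxFor(String name) {
        return (name.length() - 1) % BOX_COUNT + 1;
    }

    public static void main(String[] args) {
        String[] names = {"cabbage", "pea", "rocket", "banjo", "rhinoceros"};
        for (String name : names) {
            System.out.println(name + " -> box " + boxFor(name));
        }
        // prints boxes 7, 3, 6, 5 and 1 respectively
    }
}
```

The peanut and the rocket both have six letters, so they share box 6; that is the "more than one object in a box" case, where a direct comparison is still needed.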
Although the hash code has nothing to do with your business logic, you have to take care of it in most cases, because when your object is put into a hash-based container (HashSet, HashMap, ...), the container uses the element's hash code to store and retrieve it.
hashCode() returns an integer code for an object; unless the class overrides it, the JVM supplies a default value for every object. The code is not guaranteed to be unique.
We use hashCode() in hashing-based data structures such as Hashtable, HashMap, etc.
The advantage of hashCode() is that it makes searching easy: the hash code narrows the search down to the bucket where the object can be found.
But we can't say hashCode() is the address of an object. It is a code generated by the JVM for every object.
That is why hashing is such a popular lookup technique.
One of the uses of hashCode() is building a caching mechanism.
Look at this example:
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class Point
    {
        public int x, y;

        public Point(int x, int y)
        {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o)
        {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Point point = (Point) o;
            if (x != point.x) return false;
            return y == point.y;
        }

        @Override
        public int hashCode()
        {
            int result = x;
            result = 31 * result + y;
            return result;
        }
    }

    class Line
    {
        public Point start, end;

        public Line(Point start, Point end)
        {
            this.start = start;
            this.end = end;
        }

        @Override
        public boolean equals(Object o)
        {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Line line = (Line) o;
            if (!start.equals(line.start)) return false;
            return end.equals(line.end);
        }

        @Override
        public int hashCode()
        {
            int result = start.hashCode();
            result = 31 * result + end.hashCode();
            return result;
        }
    }

    class LineToPointAdapter implements Iterable<Point>
    {
        private static int count = 0;
        // the line's hash code serves as the cache key
        private static Map<Integer, List<Point>> cache = new HashMap<>();
        private int hash;

        public LineToPointAdapter(Line line)
        {
            hash = line.hashCode();
            if (cache.get(hash) != null) return; // we already have it
            System.out.println(
                String.format("%d: Generating points for line [%d,%d]-[%d,%d] (no caching)",
                    ++count, line.start.x, line.start.y, line.end.x, line.end.y));
        }
        // ... (the rest of this class is cut off in the source)
    }

Hash function for a generic object

How do you come up with a hash function for a generic object? There is the constraint that two objects need to have the same hash value if they are "equal" as defined by the user. How does Java accomplish this?
I just found the answer to my own question. The way Java does it is that it defines a hashCode for every object. By default two objects are equal only if they are the same object in memory, and the default hashCode is consistent with that: the same object always returns the same value, and distinct objects typically return distinct values. So when the client of the hash table overrides the equals() method for an object, they should also override the method that computes the hash code such that if a.equals(b) is true, then a.hashCode() must also equal b.hashCode(). This way, it is assured that equal objects have the same hash code.
First, basically you define the hash function of a class by overriding the hashCode() method. The Javadoc states:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
So the more important question is: What makes two of your objects equal? Or vice versa: What properties make your objects unique? If you have an answer to that, create an equals() method that compares all of the properties and returns true if they're all the same and false otherwise.
The hashCode() method is a bit more involved, I would suggest that you do not create it yourself but let your IDE do it. In Eclipse, you can select Source and then Generate hashCode() and equals() from the menu. This also guarantees that the requirements from above hold.
Here is a small (and simplified) example where the two methods have been generated using Eclipse. Notice that I chose not to include the city property since the zipCode already uniquely identifies the city within a country.
    public class Address {
        private String streetAndNumber;
        private String zipCode;
        private String city;
        private String country;

        @Override
        public int hashCode() {
            final int prime = 31;
            int result = 1;
            result = prime * result + ((country == null) ? 0 : country.hashCode());
            result = prime * result
                    + ((streetAndNumber == null) ? 0 : streetAndNumber.hashCode());
            result = prime * result + ((zipCode == null) ? 0 : zipCode.hashCode());
            return result;
        }

        @Override
        public boolean equals(final Object obj) {
            if (this == obj)
                return true;
            if (obj == null)
                return false;
            if (!(obj instanceof Address))
                return false;
            final Address other = (Address) obj;
            if (country == null) {
                if (other.country != null)
                    return false;
            }
            else if (!country.equals(other.country))
                return false;
            if (streetAndNumber == null) {
                if (other.streetAndNumber != null)
                    return false;
            }
            else if (!streetAndNumber.equals(other.streetAndNumber))
                return false;
            if (zipCode == null) {
                if (other.zipCode != null)
                    return false;
            }
            else if (!zipCode.equals(other.zipCode))
                return false;
            return true;
        }
    }
Java doesn't do that. If hashCode() and equals() are not explicitly implemented, the JVM will generally produce different hash codes for meaningfully equal instances. You can check Effective Java by Joshua Bloch. It's really helpful.
Several options:
read Effective Java, by Joshua Bloch. It contains a good algorithm for hash codes
let your IDE generate the hashCode method
Java SE 7 and greater: use Objects.hash
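As a rough sketch of the third option, here is how an Address-like class could use java.util.Objects for both methods. The wrapper class AddressDemo, the constructor, and the helper method are assumptions made for this illustration; Objects.equals and Objects.hash both tolerate null fields.

```java
import java.util.Objects;

public class AddressDemo {

    static final class Address {
        private final String streetAndNumber;
        private final String zipCode;
        private final String country;

        Address(String streetAndNumber, String zipCode, String country) {
            this.streetAndNumber = streetAndNumber;
            this.zipCode = zipCode;
            this.country = country;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Address)) return false;
            Address other = (Address) obj;
            // Objects.equals is null-safe, so no explicit null checks are needed.
            return Objects.equals(country, other.country)
                    && Objects.equals(streetAndNumber, other.streetAndNumber)
                    && Objects.equals(zipCode, other.zipCode);
        }

        @Override
        public int hashCode() {
            // Same fields as in equals, so equal objects get equal hash codes.
            return Objects.hash(country, streetAndNumber, zipCode);
        }
    }

    // Checks the contract on two equal instances, one field being null.
    static boolean equalAddressesShareHash() {
        Address a = new Address("Main St 1", null, "US");
        Address b = new Address("Main St 1", null, "US");
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(equalAddressesShareHash()); // prints true
    }
}
```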
The class java.lang.Object cheats. It defines equality (as is determined by equals) as being object identity (as can be determined by ==). So, unless you override equals in your subclass, two instances of your class are "equal", if they happen to be the same object.
The associated hash code for this is implemented by the system function System.identityHashCode (which is no longer really based on object addresses -- was it ever? -- but can be thought of as being implemented this way).
If you override equals, then this implementation of hashCode no longer makes sense.
Consider the following example:
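    class Identifier {
        private final int lower;
        private final int upper;

        // constructor added so that the final fields are initialised
        Identifier(int lower, int upper) {
            this.lower = lower;
            this.upper = upper;
        }

        @Override
        public boolean equals(Object any) {
            if (any == this) return true;
            else if (!(any instanceof Identifier)) return false;
            else {
                final Identifier id = (Identifier) any;
                return lower == id.lower && upper == id.upper;
            }
        }
    }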
Two instances of this class are considered equal, if their "lower" and "upper" members have the same values. Since equality is now determined by object members, we need to define hashCode in a compatible way.
    @Override
    public int hashCode() {
        return lower * 31 + upper; // possible implementation, maybe not too sophisticated though
    }
As you can see, we use the same fields in hashCode which we also use when we determine equality. It is generally a good idea to base the hash code on all members, which are also considered when comparing for equality.
Consider this example instead:
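    class EmailAddress {
        private final String mailbox;
        private final String displayName;

        // constructor added so that the final fields are initialised
        EmailAddress(String mailbox, String displayName) {
            this.mailbox = mailbox;
            this.displayName = displayName;
        }

        @Override
        public boolean equals(Object any) {
            if (any == this) return true;
            else if (!(any instanceof EmailAddress)) return false;
            else {
                final EmailAddress id = (EmailAddress) any;
                return mailbox.equals(id.mailbox);
            }
        }
    }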
Since here, equality is only determined by the mailbox member, the hash code should also only be based on that member:
    @Override
    public int hashCode() {
        return mailbox.hashCode();
    }
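A short usage sketch of that idea: with equality and hash code both based on the mailbox alone, a HashSet treats two addresses with the same mailbox as one element, whatever their display names. The wrapper class, the constructor, and the sample addresses are assumptions for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class EmailAddressDemo {

    static final class EmailAddress {
        private final String mailbox;
        private final String displayName;

        EmailAddress(String mailbox, String displayName) {
            this.mailbox = mailbox;
            this.displayName = displayName;
        }

        @Override
        public boolean equals(Object any) {
            if (any == this) return true;
            if (!(any instanceof EmailAddress)) return false;
            return mailbox.equals(((EmailAddress) any).mailbox);
        }

        @Override
        public int hashCode() {
            return mailbox.hashCode(); // only the field that equals() looks at
        }

        @Override
        public String toString() {
            return displayName + " <" + mailbox + ">";
        }
    }

    // Adds three addresses, two of which share a mailbox, and counts the set.
    static int distinctCount() {
        Set<EmailAddress> set = new HashSet<>();
        set.add(new EmailAddress("jane@example.com", "Jane"));
        set.add(new EmailAddress("jane@example.com", "J. Doe"));
        set.add(new EmailAddress("joe@example.com", "Joe"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // prints 2
    }
}
```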
The hash of an object is defined by its hashCode() method, which the developer can override.
Many hashCode implementations in Java (IDE-generated ones as well as library classes such as String) multiply by a prime, typically 31.
If equals() and hashCode() aren't implemented, the class inherits the identity-based defaults from Object. (Separately, for Serializable classes without an explicit serialVersionUID, one is computed automatically.)

How to properly implement equals in Java

I need to implement the equals method in some class A. Class A has an ordered collection of an Enum type, and the behaviour I want to achieve is that equals returns true for two instances of class A that have exactly the same Enum values in the collection (in exactly the same positions of the collection).
As I'm new to Java, I'm having problems with this, and I don't know how to properly implement the equals or hashCode methods, so any help would be good :)
If you're using Eclipse (NetBeans has similar features, as do most Java IDEs), you can simply go to the "Source" menu and choose "Generate hashCode() and equals()". Then you select the fields you want to be considered (in your case, the list of enum values).
That being said, assuming you already have the enum, here's the code that Eclipse generated for me. Note that hashCode usually involves a prime number, as well as multiplication and addition. This tends to give you a somewhat decent distribution of values.
    import java.util.List;

    public class Foo {
        private List<FooEnum> enumValues;

        @Override
        public int hashCode() {
            final int prime = 31;
            int result = 1;
            result = prime * result
                    + ((enumValues == null) ? 0 : enumValues.hashCode());
            return result;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj)
                return true;
            if (obj == null)
                return false;
            if (getClass() != obj.getClass())
                return false;
            Foo other = (Foo) obj;
            if (enumValues == null) {
                if (other.enumValues != null)
                    return false;
            }
            else if (!enumValues.equals(other.enumValues))
                return false;
            return true;
        }
    }
The overridden equals method will look like this
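    @Override
    public boolean equals(Object o) {
        // yourtype and propertyToTest are placeholders for your own class and field;
        // == is only appropriate for primitives and enums, use equals() for other objects
        if ((o instanceof yourtype) &&
                (((yourtype) o).getPropertyToTest() == this.propertyToTest)) {
            return true;
        }
        else {
            return false;
        }
    }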
The overridden hashCode method will look like this
    @Override
    public int hashCode() { return anIntRepresentingTheHashCode; }
Pulling from the javadocs, your equals method must meet the following criteria:
reflexive - x.equals(x) is true
symmetric - if x.equals(y) then y.equals(x)
transitive - if x.equals(y) and y.equals(z) then x.equals(z)
consistent - if x.equals(y) is true, then it's always true unless the object is modified
null - x.equals(null) is false
Also, if two objects are equal based on the equals method, they must have identical hash codes.
The reverse is not true. If two objects are not equal, they may or may not have identical hash codes
Use EnumSet. It retains the natural order, as per the Java docs, and it is optimized for enums only.
The iterator returned by the iterator method traverses the elements in their natural order (the order in which the enum constants are declared). The returned iterator is weakly consistent: it will never throw ConcurrentModificationException and it may or may not show the effects of any modifications to the set that occur while the iteration is in progress.
You can use EnumSet as below:
    import java.util.EnumSet;

    public enum Direction {
        LEFT,
        RIGHT,
        ABOVE,
        BELOW;

        private static EnumSet<Direction> someDirection = EnumSet.of(Direction.LEFT, Direction.RIGHT);
    }
Now, because you are using EnumSet, the equals and hashCode methods are provided by AbstractSet, which is the parent class of EnumSet, so you don't have to care about them.
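One caveat worth checking against the question, which asks for equality "in exactly the same positions": an EnumSet is a set, so it ignores insertion order and duplicates, whereas List.equals compares element by element. The sketch below contrasts the two; the class and method names are made up for the illustration.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;

public class EnumCollectionsDemo {

    enum Direction { LEFT, RIGHT, ABOVE, BELOW }

    // Lists compare position by position, so a different order is not equal.
    static boolean reorderedListsAreEqual() {
        List<Direction> a = Arrays.asList(Direction.LEFT, Direction.RIGHT);
        List<Direction> b = Arrays.asList(Direction.RIGHT, Direction.LEFT);
        return a.equals(b);
    }

    // EnumSets compare by membership only, so insertion order does not matter.
    static boolean reorderedEnumSetsAreEqual() {
        EnumSet<Direction> a = EnumSet.of(Direction.LEFT, Direction.RIGHT);
        EnumSet<Direction> b = EnumSet.of(Direction.RIGHT, Direction.LEFT);
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(reorderedListsAreEqual());    // prints false
        System.out.println(reorderedEnumSetsAreEqual()); // prints true
    }
}
```

So EnumSet fits when only membership matters; if positions matter, a List field with the generated equals and hashCode is the closer match.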

Finding same objects in Java

I produce a bunch of objects in Java. Each object has attribute area and a set of integers. I want to store those objects for example in a map(keys should be integers in a growing order). Two objects are the same if their area is equal and their sets are the same.
If two objects don't have the same area then there is no need for me to check whether their sets are the same.
What is the best practice for implementing this in Java? How should I compose hash and equal functions?
Here's a sample pair of hashCode/equals methods generated by an IDE:
    import java.util.Set;

    class Sample {
        final int area;
        final Set<Integer> someData;

        // constructor added so that the final fields are initialised
        Sample(int area, Set<Integer> someData) {
            this.area = area;
            this.someData = someData;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Sample sample = (Sample) o;
            if (area != sample.area) return false;
            if (!someData.equals(sample.someData)) return false;
            return true;
        }

        @Override
        public int hashCode() {
            int result = area;
            result = 31 * result + someData.hashCode();
            return result;
        }
    }
This code assumes someData can't be null, to simplify things. You can see that equality of types is checked first, then area equality is checked, and then equality of the Set<Integer> is checked. Note that the built-in equals of Set is used here, so you get to reuse that method. This is the idiomatic way to test compound types for equality.
You just need your object to implement the Comparable interface and code your logic in the compareTo method. Here's a good link to help you achieve that.
A rule of thumb is that you should compare all relevant fields in your equals() implementation (fastest first, so compare your areas up front, then the integer sets) and use THE SAME fields in your hashCode(). If in doubt, use Eclipse's Source - Generate hashCode() and equals()... feature (and then fix the equals() code to compare the areas first.)
Just compare their area first in equals (after the == and type check of course), and return false if these differ. If the areas equal, go on and compare the sets.
For implementing equals (and hashCode) in general, here is a relevant thread and a good article (including several further references).
