The behaviour of equals() method in Java [duplicate]

The behaviour of equals() method in Java [duplicate] - java

This question already has answers here:
Weird Integer boxing in Java
(12 answers)
What is the difference between == and equals() in Java?
(26 answers)
Closed 9 years ago.
Consider the following Java code:
Object a = new Integer(2);
Object b = new Integer(2);
System.out.println(a.equals(b));
Object x = new Object();
Object y = new Object();
System.out.println(x.equals(y));
The first print statement prints true and the second false.
If this is an intentional behavior, how this helps programming in Java?
If this is not an intentional behavior, is this a defect in Java?

I'm going to answer your question with reservations, but you should know that you are hurting yourself if the intent of the question was to get you to learn and your solution was to ask StackOverflow. That aside...
This behavior is intentional.
The default equals() method on java.lang.Object compares memory addresses, which means that all objects are different from each other (only two references to the same object will return true).
java.lang.Integer overrides this to compare the value of the Integers, so two different Integers both representing the number two compare equal. If you used == instead, you would get false for both cases.
Standard practice in Java is to override the equals method to return true for objects which have the same logical value, even if they were created at different times (or even with different parameters). It's not very useful to have objects representing numbers if you don't have a way to ask, "do these two things represent the same value?".
Incidentally, and this is a tangent here, Java actually keeps a cache of Integer objects for small values. So sometimes you may get two Integer objects where even the == operator will return true, despite you getting them from two different sources. You can even get code that behaves differently for larger integers than it does for smaller, without having it look at the integral values!

This is intended behaviour.
Object.equals() considers the object identity (i.e. an object is only equal to itself), which is the only thing you can do for generic objects.
Integer overrides the method to depend on the integer value, since two Integer objects with the same value are logically equal. Many other classes also override equals() since it's a core mechanism of the standard API and a lot of functionaliy e.g. in the collections framework depends on it.
Why do are you puzzled by the behaviour anyway? Most people are only confused by the == operator not behaving like that (it works like Object.equals()).

The equals method in Java serves a specific purpose: it determines if the objects are logically equal, i.e. their content is the same, whatever that may mean in the context of each specific class. This is in contrast to the objects being the same: two different objects could be logically equivalent.
Going back to your example, a and b are different objects that represent the same logical entity - an integer value of 2. They model the same concept - an integer number, and integer numbers with the same value are identical to each other. a and b are, therefore, equal.
The x and y objects, on the other hand, do not represent the same logical entity (in fact, they do not represent anything). That's why they are neither the same nor are equivalent.

It is intentional, of course.
When comparing Integer objects, equals returns true if their values (in your case, 1) are equal.
Applied on different objects of type Object, it returns false.
x.equals(x)
would return true.

See also Object.equals() doc. By the way, consider using Integer.valueOf(2) rather than new Integer(2) as it reduces memory footprint.
One last funny thing Integer.valueOf(2)==Integer.valueOf(2) will return true but Integer.valueOf(2000)==Integer.valueOf(2000) will return false because in the first case you will receive twice the same instance of the Integer object (there is a cache behind) but not in the second case because the cache is only for values between -127 to 128

Related

Why does Java need equals() if there is hashCode()?

If two objects return same hashCode, doesn't it mean that they are equal? Or we need equals to prevent collisions?
And can I implement equals by comparing hashCodes?

If two objects have the same hashCode then they are NOT necessarily equal. Otherwise you will have discovered the perfect hash function. But the opposite is true - if the objects are equal, then they must have the same hashCode.

hashCode and Equals are different information about objects
Consider the analogy to Persons where hashcode is the Birthday,
in that escenario, you and many other people have the same b-day (same hashcode), all you are not the same person however..

Why does Java need equals() if there is hashCode()?
Java needs equals() because it is the method through which object equality is tested by examining classes, fields, and other conditions the designer considers to be part of an equality test.
The purpose of hashCode() is to provide a hash value primarily for use by hash tables; though it can also be used for other purposes. The value returned is based on an object's fields and hash codes of its composite and/or aggregate objects. The method does not take into account the class or type of object.
The relationship between equals() and hashCode() is an implication.
Two objects that are equal implies that the have the same hash code.
Two objects having the same hash code does not imply that they are equal.
The latter does not hold for several reasons:
There is a chance that two distinct objects may return the same hash code. Keep in mind that a hash value folds information from a large amount of data into a smaller number.
Two objects from different classes with similar fields will most likely use the same type of hash function, and return equal hash values; yet, they are not the same.
hashCode() can be implementation-specific returning different values on different JVMs or JVM target installations.
Within the same JVM, hashCode() can be used as a cheap precursor for equality by testing for a known hash code first and only if the same testing actual equality; provided that the equality test is significantly more expensive than generating a hash code.
And can I implement equals by comparing hashCodes?
No. As mentioned, equal hash codes does not imply equal objects.

The hashCode method as stated in the Oracle Docs is a numeric representation of an object in Java. This hash code has limited possible values (represented by the values which can be stored in an int).
For a more complex class, there is a high possibility that you will find two different objects which have the same hash code value. Also, no one stops you from doing this inside any class.
class Test {
#Override
public int hashCode() {
return 0;
}
}
So, it is not recommended to implement the equals method by comparing hash codes. You should use them for comparison only if you can guarantee that each object has an unique hash code. In most cases, your only certainty is that if two objects are equal using o1.equals(o2) then o1.hashCode() == o2.hashCode().
In the equals method you can define a more complex logic for comparing two objects of the same class.

If two objects return same hashCode, doesn't it mean that they are equal?
No it doesn't mean that.
The javadocs for Object state this:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently
return the same integer, provided no information used in equals
comparisons on the object is modified. ...
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must
produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCodemethod on
each of the two objects must produce distinct integer results. ...
Note the highlighted statement. It plainly says "No" to your question.
There is another way to look at this.
The hashCode returns an int.
There are only 232 distinct values that an int can take.
If a.hashCode() == b.hashCode() implies a.equals(b), then there can be only 232 distinct (i.e. mutually unequal) objects at any given time in a running Java application.
That last point is plainly not true. Indeed, it is demonstrably not true if you have a large enough heap to hold 232 instances of java.lang.Object ... in a 64-bit JVM.
And a third way is to some well-known examples where two different two character strings have the same hashcode.
Given that your assumption is incorrect, the reasoning that follows from it is also incorrect.
Java does need an equals method.
You generally cannot implement equals using just hashCode.
You may be able to use hashCode to implement a faster equals method, but only if calling hashCode twice is faster than comparing two objects. It generally isn't.

hashCodes are equal -> Objects might be equal -> further comparision is required
hashCodes are different -> Object are not equal (if hashCode is implemented right)
That's how equals method are implemented. At first you check if hashCodes are equal. If yes, you need to check class fields to see if it represents the exact same object. If hashCodes are different, you can be sure that objects are not equal.

Sometimes (very often?) you don't!
These answers are not untrue. But they don't tell the whole story.
One example would be where you are creating a load of objects of class SomeClass, and each instance that is created is given a unique ID by incrementing a static variable, nInstanceCount, or some such, in the constructor:
iD = nInstanceCount++;
Your hash function could then be
int hashCode(){
return iD;
}
and your equals could then be
boolean equals( Object obj ){
if( ! ( obj instanceof SomeClass )){
return false;
}
return hashCode() == obj.hashCode();
}
... under such circumstances your idea that "equals is superfluous" is effectively true: if all classes behaved like this, Java 10 (or Java 23) might say, ah, let's just get rid of silly old equals, what's the point? (NB backwards compatibility would then go out the window).
There are two essential points:
you couldn't then create more than MAXINT instances of SomeClass. Or... you could ... if you set up a system for reassigning the IDs of previously destroyed instances. IDs are typically long rather than int ... but this wouldn't work because hashCode() returns int.
none of these objects could then be "equal" to another one, since equality = identity for this particular class, as you have defined it. Often this is desirable. Often it shuts off whole avenues of possibilities...
The necessary implication of your question is, perhaps, what's the use of these two methods which, in a rather annoying way, have to "cooperate"? Frelling, in his/her answer, alluded to the crucial point: hash codes are needed for sorting into "buckets" with classes like HashMap. It's well worth reading up on this: the amount of advanced maths that has gone into designing efficient "bucket" mechanisms for classes like HashMap is quite frightening. After reading up on it you may come to have (like me) a bit of understanding and reverence about how and why you should bother implementing hashCode() with a bit of thought!

Hashcode returning same values for different references [duplicate]

This question already has answers here:
Why can hashCode() return the same value for different objects in Java?
(6 answers)
Closed 7 years ago.
I was thinking hashcodes are only implemented in HashMap, Hashtable. In my understanding hashCode value will be same for both on object level also.
Hence
String str="Niks";
String str1=new String("Niks");
System.out.println(str.hashCode());
System.out.println(str1.hashCode());
The same has codes are returned because on object level the hashcode will be implemented as below. Correct me if i am wrong.
result = prime * result + ((str == null) ? 0 : str.hashCode());
result2 = prime * result + ((str1 == null) ? 0 : str1.hashCode());
Output:
75268767
75268767

Generally if I'm using strings as keys in a hashmap, I don't want to have to worry about whether the string I'm using as a key for a lookup is exactly the same reference as what I may have inserted previously, that would complicate things immensely. I want to be able to create a string as a key and know that, if the map has a string with the same value as what I'm using, that the map will find it. So comparing references is not what I want, I want to compare by value.
For objects like Strings or numbers (java.lang.Integer, java.math.BigInteger, java.math.BigDecimal), typically comparing references is not useful; all people are interested in is the value. These are called value objects, and equality and hashCode are based strictly on the object's value, not on comparing references.
From a definition posted by Martin Fowler:
In P of EAA I described Value Object as a small object such as a Money or date range object. Their key property is that they follow value semantics rather than reference semantics.
You can usually tell them because their notion of equality isn't based on identity, instead two value objects are equal if all their fields are equal. Although all fields are equal, you don't need to compare all fields if a subset is unique - for example currency codes for currency objects are enough to test equality.
A general heuristic is that value objects should be entirely immutable. If you want to change a value object you should replace the object with a new one and not be allowed to update the values of the value object itself - updatable value objects lead to aliasing problems.

The hashCode() method of objects is used when you insert them into a HashTable, HashMap or HashSet. Just as important is the equals() method.
If two objects are equal, like you have demonstrated in your example (Two strings with the value 'Niks') (see this answer for more information) then they will have the same Hash, an important point to note is that just because two Objects have the same hash, they may not be equal!

How do I check in Java whether two references are holding the same object? [duplicate]

This question already has answers here:
Comparing references in Java
(5 answers)
Closed 7 years ago.
I am looking for a Java equivalent of pointer-comparison in C/C++.
Suppose I am invoking a getSomething() method of an object of a class from a third-party. How do I know if the implementation of the getSomething() is just returning the same instance everytime or returning a different instance of the object?
I am not looking to check whether two objects are identical, but need to check if they are the same instance or not. The motivation is, if the 3rd party implementation is creating a new instance everytime, probably I can optimize my code by not invoking the method everytime. Assume getSomething() is invoked by me 1000s of times a second at run-time
From what I read from various articles, I shouldn't rely on hashCode(). In that case what is the way to do this?

Reference types are "pointing to" the same object, when they are equivalent according to the == (identity) operator.
The question says:
I am not looking to check whether two objects are identical, but need to check if they are the same instance or not (emphasis mine)
You actually mixed equality with identity (as I did initially in the answer). Being the same instance is identity. Being the same value is equality.

The == operator compares the values held by the two operands. If the operands are primitive, the actual values are checked for equality. If the operands are of object type, then == checks for equality of identity, functioning in this role just like Python's is keyword.
Note that there is bit of fudginess caused by autboxing and autounboxing between primitive and wrapper object types.

== double equal operator are use to check whether two reference variable hold the same object or not.
Eg.
String s1 = new String("hello");
String s2 = s1;
String s3 = new String("hello");
System.out.println(s1==s2); // it prints true
System.out.println(s1==s3); // it prints false
s1 and s2 pointing to the same object.
s3 holds newly created object and hence s1 == s3 prints false

I think you're not looking for the ==operator. To compare two instances of an object, for example, I use the equals() Object method. Here are two useful links about comparisons:
java ==, equals(), compare(), compareTo()
Javadocs equals() method

You can use == operator to compare two objects if they are having same reference or not.

Item-9: "Always override hashCode() when you override equals"

With respect to 3 contracts mentioned below:
1) Whenever hashCode() is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
From this statement, i understand that, In a single execution of an application, if hashCode() is used one or more times on same object it should return same value.
2) If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same integer result.
From this statement, i understand that, to perform the equality operation(in broad scope) in your subclass, There are at least four different degrees of equality.
(a) Reference equality(==), comparing the internal address of two reference type objects.
(b) Shallow structural equality: two objects are "equals" if all their fields are ==.
{ For example, two SingleLinkedList whose "size" fields are equal and whose "head" field point to the same SListNode.}
(c) Deep structural equality: two objects are "equals" if all their fields are "equals".
{For example, two SingleLinkedList that represent the same sequence of "items" (though the SListNodes may be different).}
(d) Logical equality. {Two examples:
(a) Two "Set" objects are "equals" if they contain the same elements, even if the underlying lists store the elements in different orders.
(b) The Fractions 1/3 and 2/6 are "equals", even though their numerators and denominators are all different.}
Based on above four categories of equality, second contract will hold good only: if(Say) equals() method returns truth value based on logical_equality between two objects then hashCode() method must also consider logical_equality amidst computation before generating the integer for every new object instead of considering internal address of a new object.
But i have a problem in understanding this third contract.
3) IT IS NOT REQUIRED that if two objects are unequal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
In second contract, As we are saying that hashCode() method should be accordingly[for ex: considering logical_equality before generating integer] implemented, I feel, It is not true to say that, if two objects are unequal according to equals(Object) then hashCode() method may produce same integer results as mentioned in third contract? As per the argument in second contract, hashCode() must produce distinct integer results. One just writing return 42 in hashCode() is breaking second contract!
please help me understand this point!

It would be impossible for hashCode() to always return different values for unequal objects. For example, there are 2^64 different Long values, but only 2^32 possible int values. Therefore the hashCode() method for Long has to have some repeats. In situations like this you have to try hard to ensure that your hashCode() method distributes values as evenly as possible, and is unlikely to produce repeats for the instances you are most likely to use in practice.
The second condition just says that two equal() instances must return the same hashCode() value, so this program must print true:
Long a = Long.MIN_VALUE;
Long b = Long.MIN_VALUE;
System.out.println(a.hashCode() == b.hashCode()); // a.equals(b), so must print true.
However this program also prints true:
Long c = 0L;
Long d = 4294967297L;
System.out.println(c.hashCode() == d.hashCode()); // prints true even though !c.equals(d)

hashCode() does not have to produce a distinct result. return 0; is a perfectly legal implementation of hashCode() - it ensures that two equal objects will have the same hash code. But it will ensure dismal performance when using HashMaps and HashSets.
It's preferable that hashCode() return values will be distinct (i.e., objects that are not equal should have different hash codes), but it's not required.

The second contract states what happens when equals() returns true. It does not say anything about the case when equals() returns false.
The third contract is just a reminder about that fact. It reminds you that when equals() is false for two objects, there is no connection between their hash codes. They may be same or different, as the implementation happens to make them.

The third point means that you can have many unequal objects with the same hashcode. . For example 2 string objects can have the same hashcode. The second point states that two equal objects must have the same hashcode. . return 5 is a valid hash implementation because it returns the same value for 2 equal objects.

How to test for equivalence in a generic class?

Scheme knows three different equivalence operators: eq?, eqv?, equal?. Sere here for details. In short: eq? tests references, eqv? tests values and equal? recursively tests the values of lists. I would like to write a Java generic which needs the functionality of Scheme´s equal?.
I tried to use Java´s equals() method, because I thought it does a value comparison, because for a reference comparison the == operator exists and there is no need for equals to do the same. But this assumption is completely wrong, because equals in Java is completely unreliable. Sometimes it does a value comparison and sometimes it does a reference comparison. And one can not be sure which class does a reference comparison and which class does a value comparison.
This means that equals can not be used in a generic, because the generic would not do the same for all types. And it is also not possible to restrict the generic type in a way that only types are acceptable which implement the correct value comparison.
So the question is: how to do a reliable value comparison in a generic? Do I have to write it on my own from scratch?
By the way: I think Java´s equal failure does not start with Array. It starts already with Object. I think it is wrong that equals for two objects returns false. It must return true, because if you do a value comparison of something which does not have a value the values can not differ and therefor they must be the same. Scheme does it in that way and it is perfectly reasonable: (equal? (vector) (vector)) -> #t.

In Scheme, list equivalence is based completely on the structure of the items.
In Java by comparison, equality is context-dependent depending on the type of object, and may use some or all of the internal structure in its equivalence calculation. What it means for two objects of the same type to be "equal" is up to the object type to determine, so long as the general contract for equals is met (most notably that it forms an equivalence relation with all other objects).
Assuming all types used in a program have a reasonable equals definition, they should have a "reliable" value comparison, at least in the sense of the object oriented paradigm.
Returning to the analogous Java equal? implementation. It's a bit hard to piece together from the question's phrasing, but from context clues it appears that this is also attempting to operate on lists of items. The equals method on Java's List type already implements behavior directly analogous to Scheme's equals? operation:
Compares the specified object with this list for equality. Returns true if and only if the specified object is also a list, both lists have the same size, and all corresponding pairs of elements in the two lists are equal. (Two elements e1 and e2 are equal if Objects.equals(e1, e2).) In other words, two lists are defined to be equal if they contain the same elements in the same order.
This definition also means that recursive list structures also work in a similar manner as Scheme's equals? operation.
Note that the List behavior is notably different from that of Java's array type (which you mention in your question). Arrays in Java are a fairly low-level type, and do not support much of the typical object-oriented functionality one might expect. Of particular note, for equality, arrays are compared by object reference rather than by a structural comparison of the items in the array. There are ways to do sensible equality comparison on arrays using methods in the Arrays class (e.g. Arrays.equals and Arrays.deepEquals).
As an aside, to address your postscript about the equality of two bare Objects.
assert !(new Object().equals(new Object()))
From an object-oriented perspective, it is sensible that two bare objects be equal only if they're the same reference. First, as mentioned above, there is not a direct relation between an object's internal structure and its equality, so there's no need for them to be equal. There is virtually no context as to what two different instances of Object represent from a object modeling perspective, so there's no inherent conceptual way to tell that these two objects are logically the "same" thing.
In summary, assuming all the types in your list have a sensible version of equals() defined per their object's type, Java's List.equals() behaves directly analogously to Scheme's equals? operation.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.