Does java.util.HashSet not Adhere to its Specification? - java

As a relative Java noob, I was baffled to find out the following:
Point.java:
public class Point {
...
public boolean equals(Point other) {
return x == other.x && y == other.y;
}
...
}
Edge.java:
public class Edge {
public final Point a, b;
...
public boolean equals(Edge other) {
return a.equals(other.a) && b.equals(other.b);
}
...
}
main snippet:
private Set blockedEdges;
public Program(...) {
...
blockedEdges = new HashSet<Edge>();
for (int i = 0; ...) {
for (int j = 0; ...) {
Point p = new Point(i, j);
for (Point q : p.neighbours()) {
Edge e = new Edge(p, q);
Edge f = new Edge(p, q);
blockedEdges.add(e);
// output for each line is:
// contains e? true; e equals f? true; contains f? false
System.out.println("blocked edge from "+p+"to " + q+
"; contains e? " + blockedEdges.contains(e)+
" e equals f? "+ f.equals(e) +
"; contains f? " + blockedEdges.contains(f));
}
}
}
}
Why is this surprising? Because I checked the documentation before I coded this to rely on equality and it says:
Returns true if this set contains the specified element. More
formally, returns true if and only if this set contains an element e
such that (o==null ? e==null : o.equals(e))
This sentence is very clear and it states that nothing more than equality is needed. f.equals(e) returns true as shown in the output. So clearly the set does indeed contain an element e such that o.equals(e), yet contains(o) returns false.
While it is certainly understandable that a hash set also depends on the hash values being the same, this fact is mentioned neither in the docs of HashSet itself, nor is any such possibility mentioned in the docs of Set.
Thus, HashSet doesn't adhere to its specification. This looks like a very serious bug to me. Am I completely on the wrong track here? Or how come behaviour like this is accepted?

You're not overriding equals (you're overloading it). equals needs to accept an Object as its argument.
Do something like
@Override
public boolean equals(Object o) {
if (!(o instanceof Point))
return false;
Point other = (Point) o;
return x == other.x && y == other.y;
}
(and same for Edge)
It's also important to always override hashCode when you're overriding equals. See for instance Why do I need to override the equals and hashCode methods in Java?
Note that this mistake would have been caught by the compiler if you had used @Override. This is why it's good practice to always use it where possible.
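To make the difference concrete, here is a minimal sketch of a class that overrides both methods and then behaves as the Set documentation describes. GridPoint and OverrideDemo are hypothetical names standing in for the question's Point class, not the asker's actual code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class OverrideDemo {
    // Returns whether a set holding one point reports an equal,
    // separately constructed point as contained.
    static boolean containsEqualInstance() {
        Set<GridPoint> points = new HashSet<>();
        points.add(new GridPoint(1, 2));
        return points.contains(new GridPoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println("contains equal instance? " + containsEqualInstance());
    }
}

// Hypothetical stand-in for the question's Point class.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Takes Object, so it really overrides Object.equals.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridPoint)) return false;
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    // Equal points must produce equal hash codes, or HashSet looks in the wrong bucket.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

With equals(GridPoint) instead of equals(Object), the same lookup would fall back to Object.equals and Object.hashCode, which is what the question observed.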

Related

Java: HashSet what is the Compare concept?

Coming from a c++ world, I find reading of the HashSet documentation somewhat hard:
https://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html
In c++, you would have:
http://en.cppreference.com/w/cpp/container/set
which in turns points to:
http://en.cppreference.com/w/cpp/concept/Compare
Which makes it obvious the requirement for the type of element handled by a std::set. My question is: What are the requirements for the type (E) of elements maintained by a Set in Java ?
Here is a short example which I fail to understand:
import gdcm.Tag;
import java.util.Set;
import java.util.HashSet;
public class TestTag
{
public static void main(String[] args) throws Exception
{
Tag t1 = new Tag(0x8,0x8);
Tag t2 = new Tag(0x8,0x8);
if( t1 == t2 )
throw new Exception("Instances are identical" );
if( !t1.equals(t2) )
throw new Exception("Instances are different" );
if( t1.hashCode() != t2.hashCode() )
throw new Exception("hashCodes are different" );
Set<Tag> s = new HashSet<Tag>();
s.add(t1);
s.add(t2);
if( s.size() != 1 )
throw new Exception("Invalid size: " + s.size() );
}
}
The above simple code fails with:
Exception in thread "main" java.lang.Exception: Invalid size: 2 at TestTag.main(TestTag.java:42)
From my reading of the documentation only the equals operator needs to be implemented for Set:
https://docs.oracle.com/javase/7/docs/api/java/util/Set.html
What am I missing from the documentation ?
I just tried to reproduce your issue, and maybe you just didn't override equals and/or hashCode correctly.
Take a look at my incorrect implementation of Tag:
public class Tag {
private int x, y;
public Tag(int x, int y) {
this.x = x;
this.y = y;
}
public boolean equals(Tag tag) {
if (x != tag.x) return false;
return y == tag.y;
}
@Override
public int hashCode() {
int result = x;
result = 31 * result + y;
return result;
}
}
Looks quite OK, doesn't it? But the problem is that I did not actually override the correct equals method; I overloaded it with my own implementation.
To work correctly, equals has to look like this:
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Tag tag = (Tag) o;
if (x != tag.x) return false;
return y == tag.y;
}
What am I missing from the documentation ?
You are looking at the wrong part of the documentation.
The C++ set is a "sorted set of unique objects" and is "usually implemented as red-black trees."
In Java, Set is a more abstract concept (it's an interface, not a class) with multiple implementations, most notably the HashSet and the TreeSet (ignoring concurrent implementations).
As you can probably guess from the name alone, the Java TreeSet is the equivalent of the C++ set.
As for requirements, HashSet uses the hashCode() and equals() methods. They are defined on the Object class and need to be overridden in any class whose instances are stored in a HashSet or used as keys in a HashMap.
For TreeSet and keys of TreeMap, you have two options: Provide a Comparator when creating the TreeSet (similar to C++), or have the objects implement the Comparable interface.
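As a rough illustration of the Comparator option, the sketch below puts a type with no equals, hashCode or Comparable implementation into a TreeSet. SimpleTag and TreeSetDemo are hypothetical names; this is not the swig-generated gdcm.Tag.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDemo {
    // Hypothetical element type: it defines neither equals/hashCode nor Comparable.
    static final class SimpleTag {
        final int group;
        final int element;

        SimpleTag(int group, int element) {
            this.group = group;
            this.element = element;
        }
    }

    static int sizeAfterAddingTwoEquivalentTags() {
        // The comparator plays the role that Compare plays for std::set.
        Comparator<SimpleTag> byGroupThenElement =
                Comparator.comparingInt((SimpleTag t) -> t.group)
                          .thenComparingInt(t -> t.element);
        Set<SimpleTag> tags = new TreeSet<>(byGroupThenElement);
        tags.add(new SimpleTag(0x8, 0x8));
        tags.add(new SimpleTag(0x8, 0x8)); // compares as 0 to the first tag, so it is not added
        return tags.size();
    }

    public static void main(String[] args) {
        System.out.println("size: " + sizeAfterAddingTwoEquivalentTags());
    }
}
```

A TreeSet decides uniqueness through the comparator alone, so the element type's equals is never consulted here; a HashSet would still need equals(Object) and hashCode().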
I guess this was simply a combination of bad luck and a misunderstanding of the HashSet requirements. Thanks to @christophe for the help; I realized the issue when I tried adding this to my swig-generated Tag.java class:
@Override
public boolean equals(Object o) {
}
I got the following error message:
gdcm/Tag.java:78: error: method does not override or implement a method from a supertype
@Override
^
1 error
1 warning
Which meant my error was simply:
I had the wrong signature in the first place: boolean equals(Object o) != boolean equals(Tag t)
The hint was simply to use the @Override annotation.
For those asking for the upstream code, the Java code is generated by swig. The original c++ code is here:
https://github.com/malaterre/GDCM/blob/master/Source/DataStructureAndEncodingDefinition/gdcmTag.h

Checking if a value is already in a List won't work

I am doing a small program that holds shelves in a library list. If the number of shelf was already entered before, you can't enter it again. However, it's not working.
Here is my code in the main class:
Shelf s = new Shelf(1);
Shelf s2 = new Shelf(1);
Library l = new Library();
l.Addshelf(s);
l.Addshelf(s2);
As you can see I entered 1 in both objects as the shelf number so this code below should then run from the library class
public void Addshelf(Shelf s)
{
List li = new ArrayList();
if(li.contains(s))
{
System.out.println("already exists");
} else {
li.add(s);
}
}
The problem must be in the above method. I want to know how to check whether that shelf number already exists in the list, in which case it should prompt me with the above statement: "already exists".
You'll have to override equals method in Shelf in order to get the behavior you desire.
Without overriding equals, ArrayList::contains, which calls ArrayList::indexOf, would use the default implementation of Object::equals, which compares object references.
@Override
public boolean equals (Object anObject)
{
if (this == anObject)
return true;
if (anObject instanceof Shelf) {
Shelf anotherShelf = (Shelf) anObject;
return this.getShelfNumber() == anotherShelf.getShelfNumber(); // assuming this
// is a primitive
// (if not, use equals)
}
return false;
}
If you look at the Javadoc for the contains method of List, you will see that it uses the equals() method to decide whether two objects are the same. So you have to override equals in your Shelf class.
Example:
public class Shelf
{
public int a;
public Shelf (int x)
{
this.a= x;
}
@Override
public boolean equals(Object object)
{
boolean isEqual= false;
if (object != null && object instanceof Shelf)
{
isEqual = (this.a == ((Shelf) object).a);
}
return isEqual;
}
}
Make sure that you have overridden the equals() method in Shelf.
From Java doc. How contains() works?
Returns true if this list contains the specified element. More
formally, returns true if and only if this list contains at least one
element e such that (o==null ? e==null : o.equals(e)).
^^
Try overriding methods hashCode() and equals(Object obj) in your Shelf class and then call contains.
Equals and HashCode tutorial
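One more thing worth checking in the question's code: li is created inside Addshelf, so it starts out empty on every call and contains can never find an earlier shelf, even with a correct equals. A minimal sketch that keeps the list as a field might look like this; LibraryDemo, the field names and the boolean return value are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LibraryDemo {
    // Hypothetical Shelf with value-based equality on its number.
    static final class Shelf {
        private final int number;

        Shelf(int number) {
            this.number = number;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Shelf && number == ((Shelf) o).number;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(number);
        }
    }

    static final class Library {
        // A field, so the shelves survive between addShelf calls.
        private final List<Shelf> shelves = new ArrayList<>();

        // Returns false (and adds nothing) when an equal shelf is already present.
        boolean addShelf(Shelf s) {
            if (shelves.contains(s)) {
                return false;
            }
            shelves.add(s);
            return true;
        }
    }

    static boolean secondAddIsRejected() {
        Library library = new Library();
        boolean first = library.addShelf(new Shelf(1));
        boolean second = library.addShelf(new Shelf(1));
        return first && !second;
    }

    public static void main(String[] args) {
        System.out.println("second add rejected? " + secondAddIsRejected());
    }
}
```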

Correct way to implement Map<MyObject,ArrayList<MyObject>>

I was asked this in an interview. Using Google Guava or MultiMap is not an option.
I have a class
public class Alpha
{
String company;
int local;
String title;
}
I have many instances of this class (in order of millions). I need to process them and at the end find the unique ones and their duplicates.
e.g.
instance --> instance1, instance5, instance7 (instance1 has instance5 and instance7 as duplicates)
instance2 --> instance2 (no duplicates for instance 2)
My code works fine
declare datastructure
HashMap<Alpha,ArrayList<Alpha>> hashmap = new HashMap<Alpha,ArrayList<Alpha>>();
Add instances
for (Alpha x : arr)
{
ArrayList<Alpha> list = hashmap.get(x); ///<<<<---- doubt about this. comment#1
if (list == null)
{
list = new ArrayList<Alpha>();
hashmap.put(x, list);
}
list.add(x);
}
Print instances and their duplicates.
for (Alpha x : hashmap.keySet())
{
ArrayList<Alpha> list = hashmap.get(x); //<<< doubt about this. comment#2
System.out.println(x + "<---->");
for(Alpha y : list)
{
System.out.print(y);
}
System.out.println();
}
Question: My code works, but why? When I call hashmap.get(x) (comment#1 in the code), it is possible that two different instances have the same hash code. In that case, I would add 2 different objects to the same List.
When I retrieve the list (comment#2) and iterate over it, I should then see at least one instance that is not a duplicate of the key but still exists in the list. I don't. Why? I even tried returning a constant value from my hashCode function, and it still works fine.
If you want to see my implementation of equals and hashCode,let me know.
Bonus question: Any way to optimize it?
Edit:
@Override
public boolean equals(Object obj) {
if (obj==null || obj.getClass()!=this.getClass())
return false;
if (obj==this)
return true;
Alpha guest = (Alpha)obj;
return guest.getLocal()==this.getLocal()
&& guest.getCompany() == this.getCompany()
&& guest.getTitle() == this.getTitle();
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + (title==null?0:title.hashCode());
result = prime * result + local;
result = prime * result + (company==null?0:company.hashCode());
return result;
}
it is possible that two different instances might have same hashcode
Yes, but hashCode method is used to identify the index to store the element. Two or more keys could have the same hashCode but that's why they are also evaluated using equals.
From Map#containsKey javadoc:
Returns true if this map contains a mapping for the specified key. More formally, returns true if and only if this map contains a mapping for a key k such that (key==null ? k==null : key.equals(k)). (There can be at most one such mapping.)
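The constant-hashCode experiment from the question can be reproduced with a small sketch. Name and CollisionDemo are hypothetical; the point is only that every key lands in the same bucket and equals still keeps the keys apart.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Hypothetical key type whose hashCode collides for every instance.
    static final class Name {
        final String value;

        Name(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Name && value.equals(((Name) o).value);
        }

        @Override
        public int hashCode() {
            return 42; // deliberately poor: legal, but all keys share one bucket
        }
    }

    static int distinctKeys() {
        Map<Name, Integer> counts = new HashMap<>();
        for (String s : new String[] {"a", "b", "a"}) {
            // Same hash for all three keys; equals decides which entry is updated.
            counts.merge(new Name(s), 1, Integer::sum);
        }
        return counts.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct keys: " + distinctKeys());
    }
}
```

A constant hash code stays correct but costs performance, since every lookup has to search the single shared bucket.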
Some enhancements to your current code:
Code oriented to interfaces. Use Map and instantiate it by HashMap. Similar to List and ArrayList.
Compare Strings and Objects in general using the equals method. == compares references, while equals compares the data stored in the object, depending on how the method is implemented. So, change the code in Alpha#equals:
@Override
public boolean equals(Object obj) {
if (obj==null || obj.getClass()!=this.getClass())
return false;
if (obj==this)
return true;
Alpha guest = (Alpha)obj;
// local is an int, so == is correct here; the Strings need equals
// (this assumes company and title are never null)
return guest.getLocal() == this.getLocal()
&& guest.getCompany().equals(this.getCompany())
&& guest.getTitle().equals(this.getTitle());
}
When navigating through all the elements of a map in pairs, use Map#entrySet instead, you can save the time used by Map#get (since it is supposed to be O(1) you won't save that much but it is better):
// assumes hashmap is declared as Map<Alpha, List<Alpha>>
for (Map.Entry<Alpha, List<Alpha>> entry : hashmap.entrySet()) {
List<Alpha> list = entry.getValue();
System.out.println(entry.getKey() + "<---->");
for(Alpha y : list) {
System.out.print(y);
}
System.out.println();
}
Use equals along with hashCode to solve the collision state.
Steps:
First compare on the basis of title in hashCode()
If the title is same then look into equals() based on company name to resolve the collision state.
Sample code
class Alpha {
String company;
int local;
String title;
public Alpha(String company, int local, String title) {
this.company = company;
this.local = local;
this.title = title;
}
@Override
public int hashCode() {
return title.hashCode();
}
@Override
public boolean equals(Object obj) {
if (obj instanceof Alpha) {
return this.company.equals(((Alpha) obj).company);
}
return false;
}
}
...
Map<Alpha, ArrayList<Alpha>> hashmap = new HashMap<Alpha, ArrayList<Alpha>>();
hashmap.put(new Alpha("a", 1, "t1"), new ArrayList<Alpha>());
hashmap.put(new Alpha("b", 2, "t1"), new ArrayList<Alpha>());
hashmap.put(new Alpha("a", 3, "t1"), new ArrayList<Alpha>());
System.out.println("Size : "+hashmap.size());
Output
Size : 2

AssertEquals with Collections with non primitive template parameters

I have a class for a string-number pair. This class has the method compareTo implemented.
A method of another class returns a collection of elements of the pair type.
I wanted to perform a unit test on this method, and therefore wrote the following:
@Test
public void testWeight() {
Collection<StringNumber<BigDecimal>> expected = new Vector<StringNumber<BigDecimal>>();
expected.add(new StringNumber<BigDecimal>("a", BigDecimal.ONE));
expected.add(new StringNumber<BigDecimal>("b", BigDecimal.ONE));
Collection<StringNumber<BigDecimal>> actual = new Vector<StringNumber<BigDecimal>>();
actual.add(new StringNumber<BigDecimal>("a", BigDecimal.ONE));
actual.add(new StringNumber<BigDecimal>("b", BigDecimal.ONE));
//Collection<StringNumber<BigDecimal>> actual = A.f();
assertEquals(expected, actual);
}
But as you can see, the assertion fails, even though the elements in the collections are identical. What can be the reason?
The error I get is
java.lang.AssertionError: expected: java.util.Vector<[a:1, b:1]>
but was: java.util.Vector<[a:1, b:1]>
Which does not make sense to me.
Your StringNumber class requires equals() method. Then it will work. Assuming this class contains string and number fields (auto-generated by my IDE):
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (!(o instanceof StringNumber)) {
return false;
}
StringNumber that = (StringNumber) o;
if (number != null ? !number.equals(that.number) : that.number != null) {
return false;
}
return !(string != null ? !string.equals(that.string) : that.string != null);
}
@Override
public int hashCode() {
int result = string != null ? string.hashCode() : 0;
result = 31 * result + (number != null ? number.hashCode() : 0);
return result;
}
Few remarks:
Two Vectors (why are you using such an archaic data structure?) are equal if:
both [...] have the same size, and all corresponding pairs of elements in the two lists are equal. (Two elements e1 and e2 are equal if (e1==null ? e2==null : e1.equals(e2)).)
That's why overriding equals() is required.
When implementing equals() you should also implement hashCode(). It is not required here, but better safe than sorry: What issues should be considered when overriding equals and hashCode in Java?.
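The effect of the element type's equals on list equality can be seen in a small sketch. Plain and Pair are hypothetical stand-ins, since the question does not show its StringNumber class.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class ListEqualsDemo {
    // Hypothetical element without equals: falls back to reference comparison.
    static final class Plain {
        final String name;

        Plain(String name) {
            this.name = name;
        }
    }

    // Hypothetical element with value-based equals and hashCode.
    static final class Pair {
        final String name;
        final BigDecimal number;

        Pair(String name, BigDecimal number) {
            this.name = name;
            this.number = number;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Pair)) return false;
            Pair other = (Pair) o;
            return Objects.equals(name, other.name) && Objects.equals(number, other.number);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, number);
        }
    }

    static boolean plainListsEqual() {
        List<Plain> expected = Arrays.asList(new Plain("a"));
        List<Plain> actual = Arrays.asList(new Plain("a"));
        return expected.equals(actual); // element-wise Object.equals, i.e. identity
    }

    static boolean pairListsEqual() {
        List<Pair> expected = Arrays.asList(new Pair("a", BigDecimal.ONE));
        List<Pair> actual = Arrays.asList(new Pair("a", BigDecimal.ONE));
        return expected.equals(actual); // element-wise Pair.equals
    }

    public static void main(String[] args) {
        System.out.println("plain lists equal? " + plainListsEqual());
        System.out.println("pair lists equal? " + pairListsEqual());
    }
}
```

Note that implementing compareTo, as the question's class does, has no influence on List.equals; only equals(Object) is consulted.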

Strange Java HashMap behavior - can't find matching object

I've been encountering some strange behavior when trying to find a key inside a java.util.HashMap, and I guess I'm missing something. The code segment is basically:
HashMap<Key, Value> data = ...
Key k1 = ...
Value v = data.get(k1);
boolean bool1 = data.containsKey(k1);
for (Key k2 : data.keySet()) {
boolean bool2 = k1.equals(k2);
boolean bool3 = k2.equals(k1);
boolean bool4 = k1.hashCode() == k2.hashCode();
break;
}
That strange for loop is there because for a specific execution I happen to know that data contains only one item at this point and it is k1, and indeed bool2, bool3 and bool4 will be evaluated to true in that execution. bool1, however, will be evaluated to false, and v will be null.
Now, this is part of a bigger program - I could not reproduce the error on a smaller sample - but still it seems to me that no matter what the rest of the program does, this behavior should never happen.
EDIT: I have manually verified that the hash code does not change between the time the object was inserted into the map and the time it was queried. I'll keep checking this avenue, but is there any other option?
This behavior could happen if the hash code of the key were changed after it was inserted into the map.
Here's an example with the behavior you described:
public class Key
{
int hashCode = 0;
@Override
public int hashCode() {
return hashCode;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Key other = (Key) obj;
return hashCode == other.hashCode;
}
public static void main(String[] args) throws Exception {
HashMap<Key, Integer> data = new HashMap<Key, Integer>();
Key k1 = new Key();
data.put(k1, 1);
k1.hashCode = 1;
boolean bool1 = data.containsKey(k1);
for (Key k2 : data.keySet()) {
boolean bool2 = k1.equals(k2);
boolean bool3 = k2.equals(k1);
boolean bool4 = k1.hashCode() == k2.hashCode();
System.out.println("bool1: " + bool1);
System.out.println("bool2: " + bool2);
System.out.println("bool3: " + bool3);
System.out.println("bool4: " + bool4);
break;
}
}
}
From the API description of the Map interface:
Note: great care must be exercised if
mutable objects are used as map keys.
The behavior of a map is not specified
if the value of an object is changed
in a manner that affects equals
comparisons while the object is a key
in the map. A special case of this
prohibition is that it is not
permissible for a map to contain
itself as a key. While it is
permissible for a map to contain
itself as a value, extreme caution is
advised: the equals and hashCode
methods are no longer well defined on
such a map.
Also, there are very specific requirements on the behavior of equals() and hashCode() for types used as Map keys. Failure to follow the rules here will result in all sorts of undefined behavior.
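If a key really has to change while it lives in a map, one way to stay within those rules is to remove the entry before mutating the key and put it back afterwards. The sketch below assumes a hypothetical MutableKey class; making the key immutable is usually the simpler option.

```java
import java.util.HashMap;
import java.util.Map;

class RekeyDemo {
    // Hypothetical mutable key whose equals and hashCode depend on id.
    static final class MutableKey {
        int id;

        MutableKey(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && id == ((MutableKey) o).id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    static boolean foundAfterRekey() {
        Map<MutableKey, String> data = new HashMap<>();
        MutableKey key = new MutableKey(0);
        data.put(key, "value");

        String value = data.remove(key); // remove while the stored hash still matches
        key.id = 1;                      // mutate only while the key is outside the map
        data.put(key, value);            // re-insert under the new hash

        return data.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println("found after rekey? " + foundAfterRekey());
    }
}
```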
If you're certain the hash code does not change between the time the key is inserted and the time you do the contains check, then there is something seriously wrong somewhere. Are you sure you're using a java.util.HashMap and not a subclass of some sort? Do you know what implementation of the JVM you are using?
Here's the source code for java.util.HashMap.getEntry(Object key) from Sun's 1.6.0_20 JVM:
final Entry<K,V> getEntry(Object key) {
int hash = (key == null) ? 0 : hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
}
return null;
}
As you can see, it retrieves the hashCode, goes to the corresponding slot in the table, then does an equals check on each element in that slot. If this is the code you're running and the hash code of the key has not changed, then it must be doing an equals check which must be failing.
The next step would be for you to give us some more code or context - the hashCode and equals methods of your Key class at a minimum.
Alternatively, I would recommend hooking up to a debugger if you can. Watch what bucket your key is hashed to, and step through the containsKey check to see where it's failing.
Is this application multi-threaded? If so, another thread could change the data between the data.containsKey(k1) call and the data.keySet() call.
If equals() returns true for two objects, then hashCode() must return the same value for both. The converse is not required: if equals() returns false, the hash codes may still be equal, although distinct values generally improve hash table performance.
For Reference:
http://www.ibm.com/developerworks/java/library/j-jtp05273.html
Perhaps the Key class looks like
class Key
{
boolean equals = false ;
public boolean equals ( Object oth )
{
try
{
return ( equals ) ;
}
finally
{
equals = true ;
}
}
}
