Unordered pair in Java

I need a generic Java class to represent unordered pairs of any type. So far I see two solutions:
a HashSet to store the pair elements
a Pair class with overridden hashCode and equals (to make Pair(a, b) and Pair(b, a) equal).
Does it make sense? What would you suggest?

In your place I would roll out my own class. As long as you are interested in sets of only two objects, using HashMap, HashSet (which, incidentally, uses a HashMap internally anyway) or any other class designed for sets of arbitrary cardinality is a waste of resources and adds unneeded complexity.
Just create your own class with proper equals() and hashCode() implementations. Having a contains() operation, or even implementing parts of the Set interface, might also make sense.
One important note: make sure you document your class extensively - at least specify whether equals() performs an identity or an equality comparison for the contained objects, and what is the meaning of a null contained reference...
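
A minimal sketch of such a class follows. The name UnorderedPair and the contains() method are illustrative choices, not an existing library API. This version assumes equality (not identity) comparison of the contained objects and permits null elements, which is exactly the kind of decision the note above says to document.

```java
import java.util.Objects;

/**
 * Unordered pair: (a, b) is equal to (b, a).
 * Elements are compared with equals(); null elements are permitted.
 * Hypothetical illustration class, not part of any library.
 */
final class UnorderedPair<T> {
    private final T first;
    private final T second;

    UnorderedPair(T first, T second) {
        this.first = first;
        this.second = second;
    }

    /** Returns true if either element equals the argument (null-safe). */
    boolean contains(Object o) {
        return Objects.equals(first, o) || Objects.equals(second, o);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof UnorderedPair)) return false;
        UnorderedPair<?> other = (UnorderedPair<?>) obj;
        // Equal if the elements match in either order.
        return (Objects.equals(first, other.first) && Objects.equals(second, other.second))
            || (Objects.equals(first, other.second) && Objects.equals(second, other.first));
    }

    @Override
    public int hashCode() {
        // Addition is commutative, so the result does not depend on element order.
        return Objects.hashCode(first) + Objects.hashCode(second);
    }

    @Override
    public String toString() {
        return "{" + first + ", " + second + "}";
    }
}
```

With this sketch, new UnorderedPair<>("a", "b") and new UnorderedPair<>("b", "a") are equal and share a hash code, so they behave as a single key in a HashMap or HashSet.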

Related

How can I take into consideration the object itself when calculating a hash for an object in Java?

I was working on some algorithmic problems when I got to this and it seemed interesting to me. If I have two lists (so two different objects), with the same values, the hashcode is the same. After some reading, I understand that this is how it should behave. For example:
List<String> lst1 = new LinkedList<>(Arrays.asList("str1", "str2"));
List<String> lst2 = new LinkedList<>(Arrays.asList("str1", "str2"));
System.out.println(lst1.hashCode() + " " + lst2.hashCode());
...........
Result: 2640541 2640541
My purpose would be to differentiate between lst1 and lst2 in a list for example.
Is there a structure (like a HashSet for example) that takes into consideration the actual object and not only the values inside the object when calculating the hashcode for something?

Yes, you can use Java's java.util.IdentityHashMap, or Guava's identity hash set (Sets.newIdentityHashSet()).
The hashes of the two lists must be equal, because the objects are equal. But the identity map and set above are based on the identity of the list objects, not their hash.
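
As a rough illustration using only the JDK, an identity-based set can be built by wrapping an IdentityHashMap with Collections.newSetFromMap. The class name IdentitySetDemo and the sizes() helper are made up for this sketch; the lists are the ones from the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

class IdentitySetDemo {
    /** Returns { size of an equality-based set, size of an identity-based set }. */
    static int[] sizes() {
        List<String> lst1 = new LinkedList<>(Arrays.asList("str1", "str2"));
        List<String> lst2 = new LinkedList<>(Arrays.asList("str1", "str2"));

        // Ordinary set: lst2 equals lst1, so the second add is ignored.
        Set<List<String>> byEquality = new HashSet<>();
        byEquality.add(lst1);
        byEquality.add(lst2);

        // Identity-based set: membership is decided with ==, so both lists are kept.
        Set<List<String>> byIdentity = Collections.newSetFromMap(new IdentityHashMap<>());
        byIdentity.add(lst1);
        byIdentity.add(lst2);

        return new int[] { byEquality.size(), byIdentity.size() };
    }

    public static void main(String[] args) {
        int[] s = sizes();
        System.out.println(s[0] + " " + s[1]); // prints "1 2"
    }
}
```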

If I have two lists (so two different objects), with the same values, the hashcode is the same. After some reading, I understand that this is how it should behave.
Yes, this is part of the specification of java.util.List.
Is there a structure (like a HashSet for example) that takes into consideration the actual object and not only the values inside the object when calculating the hashcode for something?
My purpose would be to differentiate between lst1 and lst2 in a list for example
It is unclear what "in a list" means here. For example, Collection.contains() and List.equals() are defined in terms of members' equals() methods, and so is the behavior of List.remove(Object). Although distinct objects, your two Lists will compare equal to each other, so those methods will not distinguish between them, neither directly nor as members of another list. You can always compare them for reference equality (==), however, to determine that they are not the same object despite being equals() to each other.
As far as a collection that takes members' object identity into account, you could consider java.util.IdentityHashMap. Two such maps having keys and associated values that are pairwise equals() to each other but not identical will not compare equals() to each other. Such maps will typically have different hash codes from each other, though that cannot be guaranteed. Note well, however, the warnings throughout the documentation of IdentityHashMap that although it implements the Map API, many of the behavioral details are inconsistent with the requirements of that interface.
Note also that
most of the above is relevant only for collections whose members are of a type that overrides equals() and hashCode(). The implementations of or inherited from Object differentiate between objects on a reference-equality basis, so the ordinary collections classes have no surprises for you there.
identical string literals are not required to represent distinct objects, so the lst1 and lst2 in your example code may in fact contain identical elements, in the reference equality sense.

Not generally in collections, because you generally want two collections with all the same items to be equal (which is why they implement it like this- equals will return true and the hash codes are the same).
You can subclass a list and have it not do that; it would just not be widely useful and would cause a lot of confusion if other programmers read your code. In that case, you'd just want equals to return the result of == and hashCode to return System.identityHashCode(this) (the same behavior that Object provides by default).

When is hashcode useful if I am never using hashtable?

Let's say I am implementing a class called Car, with two member variables: int numDoors and String color.
In a hypothetical case, I am never going to use such a Car in a Hashtable, a HashMap, or any other structure that needs a hash.
Now, why is it still required to override hashCode along with equals?
Note: all the answers I have checked involve use in a Hashtable or HashMap. I have searched extensively for this answer, so please don't mark it as a duplicate. Thanks.

It's the general convention:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
However, it's not entirely enforceable.
There are times in which you would believe that you don't need to have hashCode defined and implemented for your object, and if you don't use any structure that relies on a hash to store or reference it, you'd be correct.
But there are third-party libraries with which your object may come into contact, and they may very well be using a Map or Set to do their work, and they'd have the expectation that you followed conventions.
It's up to you to not implement hashCode along with equals - you're certainly not forced to (although many would argue that this is a bug), but beware that your object may not work as well with a third party library for this reason.

The only conceivable types which would not be able to override the hashCode method in a fashion consistent with the hashCode and equals contract would be those which are unable to override hashCode at all [e.g. because a base class declared it final]. There is thus almost never any reason for a type not to legitimately implement hashCode(). Even if a type cannot guarantee that instances which are unequal won't spontaneously become equal, the author of the type may still legitimately implement hashCode() by picking a 32-bit int value [e.g. 8675309] and implementing hashCode() as @Override public int hashCode() { return 8675309; }. Doing this will allow all of the hash-table-based collection types to work correctly. Storing very many such items into a hash table will severely degrade performance, but hash tables with just a few items will work just fine and generally perform decently. By contrast, if one overrides equals but not hashCode, then a hash table will likely work incorrectly even if only a single item is stored into it.
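
A small sketch of the constant-hash approach, using the Car class from the question. The field-by-field equals() shown here is an assumption about how the asker would define equality; the constant is the arbitrary value mentioned above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class Car {
    private final int numDoors;
    private final String color;

    Car(int numDoors, String color) {
        this.numDoors = numDoors;
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Car)) return false;
        Car other = (Car) o;
        return numDoors == other.numDoors && Objects.equals(color, other.color);
    }

    @Override
    public int hashCode() {
        // Legal but slow for large tables: every Car lands in the same bucket,
        // and equals() alone decides which entries match.
        return 8675309;
    }
}

class ConstantHashDemo {
    public static void main(String[] args) {
        Set<Car> cars = new HashSet<>();
        cars.add(new Car(4, "red"));
        cars.add(new Car(4, "red"));   // duplicate by equals(), so not added again
        cars.add(new Car(2, "blue"));
        System.out.println(cars.size());                        // prints 2
        System.out.println(cars.contains(new Car(2, "blue")));  // prints true
    }
}
```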
Incidentally, in some cases there may be advantages to implementing hashCode even when not using hashed collections. For example, some immutable collection types which support deep comparison might call hashCode() on the items stored therein. If a collection is large, and/or comparison operations on the items stored within it are expensive, the efficiency of testing two collections for equality ("do they contain equal items") may be enhanced by using a three-step process:
Compare the aggregate hashcode of two collections. If they're not equal, no reason to look any further. Will often yield instant results, no matter the size of the collections.
Compare the cached hash codes of all the items. If the collections' contents match except for the last couple items, and if comparisons between items may be expensive (e.g. the items are thousand-character strings) this will often avoid the need to compare all but one of the items for equality [note that if all but one of the items matched, and its hash code differed, then the aggregate hash code would differ and we wouldn't have gotten this far].
If all the hash codes match, then call equals on each pair of items that don't compare reference-equal.
Note that if two collections contain distinct items with equal content, a comparison is going to need to deeply examine all of the items; hashCode can't do anything to help with that case. On the other hand, in most cases where things are compared they are not equal, and using cached hashCode() values may facilitate orders-of-magnitude speedups in those cases.
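
The three steps above could be sketched roughly as follows. HashedList is a hypothetical wrapper written for this illustration, and it assumes the stored items are immutable, so that the hash codes cached at construction time stay valid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical immutable list wrapper that caches hash codes for fast equality tests. */
final class HashedList<T> {
    private final List<T> items;
    private final int[] itemHashes;   // cached hash code of each item
    private final int aggregateHash;  // cached hash code of the whole list

    HashedList(List<T> source) {
        this.items = new ArrayList<>(source); // defensive copy
        this.itemHashes = new int[items.size()];
        for (int i = 0; i < itemHashes.length; i++) {
            itemHashes[i] = Objects.hashCode(items.get(i));
        }
        this.aggregateHash = Arrays.hashCode(itemHashes);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof HashedList)) return false;
        HashedList<?> other = (HashedList<?>) o;
        // Step 1: compare the aggregate hash codes.
        if (aggregateHash != other.aggregateHash) return false;
        // Step 2: compare the cached per-item hash codes (this also checks the sizes).
        if (!Arrays.equals(itemHashes, other.itemHashes)) return false;
        // Step 3: call equals only on pairs that are not reference-equal.
        for (int i = 0; i < items.size(); i++) {
            Object a = items.get(i);
            Object b = other.items.get(i);
            if (a != b && !Objects.equals(a, b)) return false;
        }
        return true;
    }

    @Override
    public int hashCode() {
        return aggregateHash;
    }
}
```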

Java: external class for determining equivalence?

Java has a Comparator<T> for providing comparison of objects external to the class itself, to allow for multiple/alternate methods of doing ordered comparisons.
But the only standard way of doing unordered comparisons is to override equals() within a class.
What should I do when I want to provide multiple/alternate unordered comparisons external to a class? (Obvious use case is partitioning a collection into equivalence classes based on particular properties.)
Assuming the end use is for unordered checking (e.g. not for sorting or indexing), is it ever OK to implement Comparator<T> that just checks for equality, returning 0 if two objects are equal, and a value != 0 when two objects are unequal? (note: the only reason I don't jump on this solution, is that technically it can break the contract for Comparator by not providing a relation that satisfies transitivity and symmetry.)
It seems like there should have been an EqualsComparator<T> standard class or something.
(Does Guava handle anything like this?)

Yes, Guava has the Equivalence type and the Equivalences class (the latter was removed in Guava release 14.0).
(And yes, it's something which is very useful and sadly lacking in Java. We really should have options around this for HashMap, HashSet etc...)
While Comparator<T> may be okay in some situations, it doesn't provide the hashCode method which would be important for hash-based collections.
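
To show why the hash half matters, here is a rough sketch of an external equivalence used to partition a collection with a hash-based map. The Equivalence interface, Wrapper, and Partitioner below are hypothetical types written for this example, only loosely inspired by the idea behind Guava's Equivalence; they are not Guava's API.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical external equality strategy: equivalence test plus a consistent hash. */
interface Equivalence<T> {
    boolean equivalent(T a, T b);
    int hash(T value);
}

/** Adapts a value so that equals()/hashCode() delegate to the given Equivalence. */
final class Wrapper<T> {
    private final Equivalence<T> equivalence;
    private final T value;

    Wrapper(Equivalence<T> equivalence, T value) {
        this.equivalence = equivalence;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Wrapper)) return false;
        @SuppressWarnings("unchecked")
        Wrapper<T> other = (Wrapper<T>) o;
        // Only wrappers built with the same strategy instance are comparable.
        return equivalence == other.equivalence
            && equivalence.equivalent(value, other.value);
    }

    @Override
    public int hashCode() {
        return equivalence.hash(value);
    }
}

final class Partitioner {
    /** Groups items into equivalence classes, preserving first-seen order. */
    static <T> Collection<List<T>> partition(Collection<T> items, Equivalence<T> eq) {
        Map<Wrapper<T>, List<T>> groups = new LinkedHashMap<>();
        for (T item : items) {
            groups.computeIfAbsent(new Wrapper<>(eq, item), k -> new ArrayList<>()).add(item);
        }
        return groups.values();
    }
}
```

For instance, an Equivalence<String> that treats strings of equal length as equivalent (with the length as the hash) would partition "a", "bb", "c" into the groups [a, c] and [bb].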

Should compareTo and equals be consistent if the Object is mutable and will not be used in sorted Set, etc

The Java documentation for Comparable recommends that compareTo be consistent with equals (because of the behavior in sorted sets or sorted maps).
I have a Player object which is mutable. Player will not be used in any kind of sorted set or sorted map.
I want to have an array of Players (Player[]) and assign each player a random order of play (a member variable m_nPlayOrder), then sort them in that array based on their m_nPlayOrder member variable.
I will implement the Comparable interface and the compareTo function to achieve this.
My question is, is it OK in this case if Player has a compareTo that is NOT consistent with equals? It will not be consistent with equals unless I also override equals (and hashCode) and have it return true if the Players' m_nPlayOrder values are equal. I do not want to do that.
UPDATE:
Thanks everyone for the replies! I'm going with implementing a Comparator instead of Comparable.

You could implement a java.util.Comparator (a stand-alone object that "knows" your Player class and compares accordingly) and use that instead of implementing Comparable in the class.
Comparator has the advantage in that you can implement any number of them for different types of sorts, rather than the single Comparable.
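
A minimal sketch of that approach, assuming a Player class shaped roughly like the one described in the question. The field playOrder stands in for m_nPlayOrder, and the accessor names and the PlayerOrdering holder class are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

class Player {
    private final String name;
    private int playOrder; // plays the role of m_nPlayOrder in the question

    Player(String name) {
        this.name = name;
    }

    String getName() { return name; }
    int getPlayOrder() { return playOrder; }
    void setPlayOrder(int playOrder) { this.playOrder = playOrder; }
}

final class PlayerOrdering {
    // Stand-alone ordering; Player itself keeps the default equals() and hashCode().
    static final Comparator<Player> BY_PLAY_ORDER =
            Comparator.comparingInt(Player::getPlayOrder);

    /** Sorts the array in place by ascending play order. */
    static void sortByPlayOrder(Player[] players) {
        Arrays.sort(players, BY_PLAY_ORDER);
    }
}
```

Because the ordering lives outside Player, the class has no natural ordering that could disagree with equals(), and further comparators (by name, by score, and so on) can be added later without touching it.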

It is not a good practice to implement equals and compareTo that are inconsistent.
Player will not be used in any kind of sorted set or sorted map.
The problem is that someone else trying to maintain your code at some time in the future may not realize that you have implemented inconsistent methods, and may try to put a Player into a sorted set/map or some other data structure that expects consistent methods.
There is a good alternative available to you in the form of standalone Comparator objects.

I do not think you have the problem that you think you have, but perhaps I'm missing something (wouldn't be the first time).
The requirement here is that 2 objects that are equal should also return 0 for compareTo(), meaning that they occupy the same rank for sorting purposes. There is a separate requirement that 2 equal objects should also have the same hashCode. But this isn't a concern for you, because you're overriding neither equals() nor hashCode(); the Java platform will satisfy this one for you.
Since your objects will be equal based on reference comparison (not custom, logical comparison), you won't have a problem unless your objects don't compare equal to themselves. In other words, as long as a single object, when asked to compareTo() itself, always returns 0, you're fine.

The Comparable API says, "It is strongly recommended (though not required) that natural orderings be consistent with equals." If you don't, be certain to document the limitations you noted. Alternatively, it may be easy to implement Comparable fully, as suggested in this answer.

hash of java hashtable

Is the hashCode of a Java Hashtable element always unique?
If not, how can I guarantee that one search will give me the right element?

Not necessarily. Two distinct (and not-equal) objects can have the same hashcode.

First things first.
You should consider using HashMap instead of Hashtable, as the latter is considered obsolete (it enforces implicit synchronization, which is not required most of the time; if you need a synchronized HashMap, it is easily doable).
Now, regarding your question.
A hash code is not guaranteed to be unique, mathematically speaking; however, when you're using HashMap (or Hashtable), it does not matter.
If two keys generate the same hash code, equals is automatically invoked on the candidate keys to guarantee that the correct object will be retrieved.
If you're using a String as your key, you're worry-free, but if you're using your own object as the key, you should override the equals and the hashCode methods.
The equals method is mandatory for the proper operation of HashMap, whereas the hashCode method should be coded such that the hash table will be relatively sparse (otherwise your HashMap will behave like one long list).
If you're using Eclipse there's an easy way to generate hashCode and equals, it basically does all the work for you.
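
A quick way to see the collision handling in action: the strings "Aa" and "BB" have the same hashCode (2112), yet a HashMap still keeps them apart because it falls back on equals(). The class name CollisionDemo is just for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    /** Builds a map whose two keys share the same hash code. */
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map;
    }

    public static void main(String[] args) {
        // Both keys hash to 2112, so they collide.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints true

        Map<String, String> map = build();
        // equals() disambiguates the colliding keys during lookup.
        System.out.println(map.get("Aa")); // prints first
        System.out.println(map.get("BB")); // prints second
    }
}
```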

From the Java documentation:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
So yes, you can typically expect the default hashCode for an Object to be unique. However, if the method has been overridden by the class you are storing in the Hashtable, all bets are off.

Ideally, yes. In reality, collisions do occasionally happen.

Is the hashCode of a Java Hashtable element always unique?
They should. At least within the same class.
If not, how can I guarantee that one search will give me the right element?
By providing a good hashCode implementation for your class yourself: override equals() and hashCode().
