I have a factory that creates objects of class MyClass, returning already generated ones when they exist. Since the creation method (getOrCreateMyClass) takes multiple parameters, what is the best way to use a Map to store and retrieve the objects?
My current solution is the following, but it doesn't sound too clear to me.
I use the hashCode method (slightly modified) of class MyClass to build an int based on the parameters of class MyClass, and I use it as the key of the Map.
import java.util.HashMap;
import java.util.Map;
public class MyClassFactory {
static Map<Integer, MyClass> cache = new HashMap<Integer, MyClass>();
private static class MyClass {
private String s;
private int i;
public MyClass(String s, int i) {
this.s = s;
this.i = i;
}
public static int getHashCode(String s, int i) {
final int prime = 31;
int result = 1;
result = prime * result + i;
result = prime * result + ((s == null) ? 0 : s.hashCode());
return result;
}
@Override
public int hashCode() {
return getHashCode(this.s, this.i);
}
}
public static MyClass getOrCreateMyClass(String s, int i) {
int hashCode = MyClass.getHashCode(s, i);
MyClass a = cache.get(hashCode);
if (a == null) {
a = new MyClass(s, i);
cache.put(hashCode, a);
}
return a;
}
}
Your getOrCreateMyClass doesn't seem to add to the cache if it creates.
I think this will also not perform correctly when hashcodes collide. Identical hashcodes do not imply equal objects. This could be the source of the bug you mentioned in a comment.
You might consider creating a generic Pair class with actual equals and hashCode methods and using Pair<String, Integer> class as the map key for your cache.
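For illustration, a minimal sketch of such a Pair is shown below (the class and field names are my own; Objects.hash and Objects.equals need Java 7+):

    import java.util.Objects;

    // Illustrative generic Pair key with value-based equals/hashCode.
    public final class Pair<A, B> {
        private final A first;
        private final B second;

        public Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Pair)) return false;
            Pair<?, ?> other = (Pair<?, ?>) o;
            return Objects.equals(first, other.first) && Objects.equals(second, other.second);
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }
    }

The cache then becomes a Map<Pair<String, Integer>, MyClass>, and getOrCreateMyClass looks up new Pair<>(s, i) instead of a raw hash code.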
Edit:
The issue of extra memory consumption by storing both a Pair<String, Integer> key and a MyClass value might be best dealt with by making the Pair<String, Integer> into a field of MyClass and thereby having only one reference to this object.
With all of this though, you might have to worry about threading issues that don't seem to be addressed yet, and which could be another source of bugs.
And whether it is actually a good idea at all depends on whether the creation of MyClass is much more expensive than the creation of the map key.
Another Edit:
ColinD's answer is also reasonable (and I've upvoted it), as long as the construction of MyClass is not expensive.
Another approach that might be worth consideration is to use a nested map Map<String, Map<Integer, MyClass>>, which would require a two-stage lookup and complicate the cache updating a bit.
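A rough sketch of that two-stage lookup (using Java 8's computeIfAbsent for brevity, and the s/i fields from the question; not thread-safe, per the threading caveat above):

    static Map<String, Map<Integer, MyClass>> cache = new HashMap<>();

    public static MyClass getOrCreateMyClass(String s, int i) {
        // First stage: one inner map per String value.
        Map<Integer, MyClass> byInt = cache.computeIfAbsent(s, key -> new HashMap<>());
        // Second stage: create the instance only if this (s, i) combination is new.
        return byInt.computeIfAbsent(i, key -> new MyClass(s, i));
    }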
You really shouldn't be using the hashcode as the key in your map. A class's hashcode is not intended to necessarily guarantee that it will not be the same for any two non-equal instances of that class. Indeed, your hashcode method could definitely produce the same hashcode for two non-equal instances. You do need to implement equals on MyClass to check that two instances of MyClass are equal based on the equality of the String and int they contain. I'd also recommend making the s and i fields final to provide a stronger guarantee of the immutability of each MyClass instance if you're going to be using it this way.
Beyond that, I think what you actually want here is an interner.... that is, something to guarantee that you'll only ever store at most 1 instance of a given MyClass in memory at a time. The correct solution to this is a Map<MyClass, MyClass>... more specifically a ConcurrentMap<MyClass, MyClass> if there's any chance of getOrCreateMyClass being called from multiple threads. Now, you do need to create a new instance of MyClass in order to check the cache when using this approach, but that's inevitable really... and it's not a big deal because MyClass is easy to create.
Guava has something that does all the work for you here: its Interner interface and corresponding Interners factory/utility class. Here's how you might use it to implement getOrCreateMyClass:
private static final Interner<MyClass> interner = Interners.newStrongInterner();
public static MyClass getOrCreateMyClass(String s, int i) {
return interner.intern(new MyClass(s, i));
}
Note that using a strong interner will, like your example code, keep each MyClass it holds in memory as long as the interner is in memory, regardless of whether anything else in the program has a reference to a given instance. If you use newWeakInterner instead, when there isn't anything elsewhere in your program using a given MyClass instance, that instance will be eligible for garbage collection, helping you not waste memory with instances you don't need around.
If you choose to do this yourself, you'll want to use a ConcurrentMap cache and use putIfAbsent. You can take a look at the implementation of Guava's strong interner for reference I imagine... the weak reference approach is much more complicated though.
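For reference, a minimal do-it-yourself strong interner along those lines might look like the following sketch (this is not Guava's implementation, and it relies on MyClass implementing equals and hashCode as described above):

    private static final ConcurrentMap<MyClass, MyClass> cache = new ConcurrentHashMap<>();

    public static MyClass getOrCreateMyClass(String s, int i) {
        MyClass candidate = new MyClass(s, i);           // cheap to construct
        MyClass existing = cache.putIfAbsent(candidate, candidate);
        // putIfAbsent returns the previously mapped instance, or null if candidate was stored.
        return existing != null ? existing : candidate;
    }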
Related
A few weeks back I wrote a Java class with the following behavior:
Each object contains a single final integer field
The class contains a static Map (Key: Integer, Content: MyClass)
Whenever an object of the class is instantiated, a look-up is done: if an object with the wanted integer field already exists in the static map, return it; otherwise create one and put it in the map.
As code:
public class MyClass
{
private static Map<Integer, MyClass> map;
private final int field;
static
{
map = new HashMap<>();
}
private MyClass(int field)
{
this.field = field;
}
public static MyClass get(int field)
{
synchronized (map)
{
return map.computeIfAbsent(field, MyClass::new);
}
}
}
This way I can be sure that only one object exists for each integer (as field). I'm currently concerned that this will prevent the GC from collecting objects which I no longer need, since the objects are always stored in the map (a reference exists)...
If I wrote a looping function like this:
public void myFunction() {
for (int i = 0; i < Integer.MAX_VALUE; i++) {
MyClass c = MyClass.get(i);
// DO STUFF
}
}
I would end up with Integer.MAX_VALUE objects in memory after calling the method. Is there a way I can check, whether references to objects in the map exists and otherwise remove them?
This looks like a typical case of the multiton pattern: You want to have at most one instance of MyClass for a given key. However, you also seem to want to limit the amount of instances created. This is very easy to do by lazily instantiating your MyClass instances as you need them. Additionally, you want to clean up unused instances:
Is there a way I can check, whether references to objects in the map exists and otherwise remove them?
This is exactly what the JVM's garbage collector is for; there is no reason to try to implement your own form of "garbage collection" when the Java core library already provides tools for marking certain references as "not strong", i.e. references that keep an object reachable only as long as a strong reference (in Java, a "normal" reference) to it exists somewhere.
Implementation using Reference objects
Instead of a Map<Integer, MyClass>, you should use a Map<Integer, WeakReference<MyClass>> or a Map<Integer, SoftReference<MyClass>>: Both WeakReference and SoftReference allow the MyClass instances they refer to to be garbage-collected if there are no strong (read: "normal") references to the object. The difference between the two is that the former releases the reference on the next garbage collection action after all strong references are gone, while the latter one only releases the reference when it "has to", i.e. at some point which is convenient for the JVM (see related SO question).
Plus, there is no need to synchronize your entire Map: You can simply use a ConcurrentHashMap (which implements ConcurrentMap), which handles multi-threading in a way much better than by locking all access to the entire map. Therefore, your MyClass.get(int) could look like this:
private static final ConcurrentMap<Integer, Reference<MyClass>> INSTANCES = new ConcurrentHashMap<>();
public static MyClass get(final int field) {
// ConcurrentHashMap.compute(...) is atomic <https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#compute-K-java.util.function.BiFunction->
final Reference<MyClass> ref = INSTANCES.compute(field, (key, oldValue) -> {
final Reference<MyClass> newValue;
if (oldValue == null) {
// No instance has yet been created; Create one
newValue = new SoftReference<>(new MyClass(key));
} else if (oldValue.get() == null) {
// The old instance has already been deleted; Replace it with a
// new reference to a new instance
newValue = new SoftReference<>(new MyClass(key));
} else {
// The existing instance has not yet been deleted; Re-use it
newValue = oldValue;
}
return newValue;
});
return ref.get();
}
Finally, in a comment above, you mentioned that you would "prefer to cache maybe up to say 1000 objects and after that only cache, what is currently required/referenced". Although I personally see little (good) reason for it, it is possible to perform eager instantiation on the "first"† 1000 objects by adding them to the INSTANCES map on creation:
private static final ConcurrentMap<Integer, Reference<MyClass>> INSTANCES = createInstanceMap();
private static ConcurrentMap<Integer, Reference<MyClass>> createInstanceMap() {
// The set of keys to eagerly initialize instances for
final Stream<Integer> keys = IntStream.range(0, 1000).boxed();
final Collector<Integer, ?, ConcurrentMap<Integer, Reference<MyClass>>> mapFactory = Collectors
.toConcurrentMap(Function.identity(), key -> new SoftReference<>(new MyClass(key)));
return keys.collect(mapFactory);
}
†How you define which objects are the "first" ones is up to you; Here, I'm just using the natural order of the integer keys because it's suitable for a simple example.
Your function for examining your cache is cringeworthy. First, as you said, it creates all the cache objects. Second, it iterates Integer.MAX_VALUE times.
Better would be:
public void myFunction() {
for(MyClass c : map.values()) {
// DO STUFF
}
}
To the issue at hand: Is it possible to find out whether an Object has references to it?
Yes. It is possible. But you won't like it.
http://docs.oracle.com/javase/1.5.0/docs/guide/jvmti/jvmti.html
jvmtiError
IterateOverReachableObjects(jvmtiEnv* env,
jvmtiHeapRootCallback heap_root_callback,
jvmtiStackReferenceCallback stack_ref_callback,
jvmtiObjectReferenceCallback object_ref_callback,
void* user_data)
Loop over all reachable objects in the heap. If a MyClass object is reachable, then, well, it is reachable.
Of course, by storing the object in your cache, you are making it reachable, so you'd have to change your cache to WeakReferences, and see if you can exclude those from the iteration.
And you're no longer using pure Java, and jvmti may not be supported by all VM's.
As I said, you won't like it.
While trying to model polynomials, in particular their multiplication, I ran into the following problem. During the multiplication, the individual monomials of the two polynomials are multiplied, and of course it can happen that I have (3x^2 y + 5x y^2) * (x + y). The result contains 3x^2 y^2 and 5x^2 y^2, which I want to combine by addition right away.
Naturally I would like to use the part x^2 y^2 of the monomial as a key in a (hash) map to add up the different coefficients (3 and 5 in the example). But the monomial object as I envisage it should naturally also contain the coefficient, which should not be part of the map key.
Of course I could write equals/hashcode of the monomial object such that they ignore the coefficient. But this feels just so wrong, because mathematically a monomial clearly is only equal to another one if also the coefficients are equal.
Introducing a coefficient-free monomial object for intermediate operations does also not look right.
Instead of using the map, I could use a list and use a binary search with a dedicated comparator that ignores the coefficient.
Short of implementing a map which does not use the keys' equals/hashcode, but a dedicated one, are there any better ideas of how to fuse the monomials?
Since the JDK implementation of [Linked]HashMap does not permit you to override the equals/hashCode implementation, the only other ways are:
a wrapping object like this:
class A {
private final String fieldA; // equals/hashCode based on that field.
private final String fieldB; // equals/hashCode based on that field.
}
class B {
private A a;
public int hashCode() {return a.fieldA.hashCode();}
public boolean equals(Object o) {... the same ... }
}
Map<B, Value> map = new HashMap<B, Value>();
map.put(new B(new A("fieldA", "fieldB")), new Value(0));
Well, with more getters/constructors.
This can be annoying, and perhaps there exists some library (like Guava) that allows an equals/hashCode method to be given like you can give a Comparator to TreeMap.
You'll find below a sample implementation that points out what to do to decorate an existing map.
use a TreeMap with a specific Comparator. The other answer points this out, but I'd say you need to define the Comparator carefully, because this could lead to problems: if your compare method returns 0 when equality is reached and 1 in every other case, it does not define a consistent ordering. You should try to find one, or use the wrapper object.
If you want to take the challenge, you can create a basic implementation using delegation/decoration over another HashMap (this could be another kind of map, like LinkedHashMap):
public class DelegatingHashMap<K,V> implements Map<K,V> {
private final BiPredicate<K,Object> equalsHandler;
private final ToIntFunction<K> hashCodeHandler;
private final Map<Wrapper<K>,V> impl = new HashMap<>();
public DelegatingHashMap(
BiPredicate<K,Object> equalsHandler,
ToIntFunction<K> hashCodeHandler
) {
this.equalsHandler = requireNonNull(equalsHandler, "equalsHandler");
this.hashCodeHandler = requireNonNull(hashCodeHandler, "hashCodeHandler");
}
public V get(K key) {
Wrapper<K> wrap = new Wrapper<>(key, equalsHandler, hashCodeHandler);
return impl.get(wrap);
}
...
static class Wrapper<K2> {
private final K2 key;
private final BiPredicate<K2,Object> equalsHandler;
private final ToIntFunction<K2> hashCodeHandler;
Wrapper(K2 key, BiPredicate<K2,Object> equalsHandler, ToIntFunction<K2> hashCodeHandler) {
this.key = key;
this.equalsHandler = equalsHandler;
this.hashCodeHandler = hashCodeHandler;
}
public int hashCode() {return hashCodeHandler.applyAsInt(key);}
public boolean equals(Object o) {
// the map compares Wrapper instances with each other, so unwrap before delegating
Object other = (o instanceof Wrapper) ? ((Wrapper<?>) o).key : o;
return equalsHandler.test(key, other);
}
}
}
And the code using the map:
DelegatingHashMap<String, Integer> map = new DelegatingHashMap<>(
(key, old) -> key.equalsIgnoreCase(Objects.toString(old, "")),
key -> key.toLowerCase().hashCode()
);
map.put("Foobar", 1);
map.put("foobar", 2);
System.out.println(map); // a single entry remains, with value 2
But perhaps the best (for the memory) would be to rewrite the HashMap to directly use the handler instead of a wrapper.
You could use a TreeMap with a custom comparator:
TreeMap(Comparator<? super K> comparator)
Constructs a new, empty tree map, ordered according to the given comparator.
(Source)
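For the monomial use case, that might look something like the following sketch (Monomial and its getExponents() accessor are hypothetical, and Arrays.compare for arrays needs Java 9+):

    // Comparator that orders monomials by their exponent vectors only,
    // deliberately ignoring the coefficient.
    Comparator<Monomial> byVariablePart =
            (a, b) -> Arrays.compare(a.getExponents(), b.getExponents());

    TreeMap<Monomial, Integer> coefficients = new TreeMap<>(byVariablePart);

Two monomials whose exponent vectors compare equal then map to the same entry, so their coefficients can be summed, for example with merge().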
Consider using a TreeMap, which is a SortedMap and thus also a Map. You can provide a Comparator to its constructor. The sorted map will use that Comparator for sorting the map keys. But importantly, for your case, it will consider keys to be equal if the Comparator returns 0. In your case that will require a Comparator that is not consistent with equals, which could cause you problems if you are not careful.
Another option is to introduce another class, which acts as an adapter for a Monomial and can be used as a map key with the properties you desire.
I think it may be better to separate the monomial into two parts: the coefficient and the variable part. That way you can use the variable part in your map as the key and the coefficient as the value (which can then be updated).
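A sketch of that idea (the Exponents key type is hypothetical, standing for whatever represents the variable part with proper equals/hashCode):

    // Map the variable part of each term to its accumulated coefficient.
    private final Map<Exponents, BigInteger> terms = new HashMap<>();

    void addTerm(Exponents variablePart, BigInteger coefficient) {
        // merge() adds the coefficient to any existing term with the same
        // variable part, or creates a new entry if the variable part is new.
        terms.merge(variablePart, coefficient, BigInteger::add);
    }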
All this code should be implementation details inside a Polynomial object
I'm not sure why you think a coefficient-free monomial does not look right. You don't have to expose the object to the outside if you don't want. But it might be a nice way to have getters on your Polynomial to get the coefficients for each monomial.
http://www.javapractices.com/topic/TopicAction.do?Id=29
Above is the article which I am looking at. Immutable objects greatly simplify your program, since they:
allow hashCode to use lazy initialization, and to cache its return value
Can anyone explain what the author is trying to say in the line above?
Is my class immutable if it is marked final but its instance variables are not final? And vice versa: what if my instance variables are final but the class itself is not?
As explained by others, because the state of the object won't change, the hashcode needs to be calculated only once.
The easy solution is to precalculate it in the constructor and place the result in a final variable (which guarantees thread safety).
If you want to have a lazy calculation (hashcode only calculated if needed) it is a little more tricky if you want to keep the thread safety characteristics of your immutable objects.
The simplest way is to declare a private volatile int hash; and run the calculation if it is 0. You will get laziness except for objects whose hashcode really is 0 (1 in 4 billion if your hash method is well distributed).
Alternatively you could couple it with a volatile boolean but need to be careful about the order in which you update the two variables.
Finally for extra performance, you can use the methodology used by the String class which uses an extra local variable for the calculation, allowing to get rid of the volatile keyword while guaranteeing correctness. This last method is error prone if you don't fully understand why it is done the way it is done...
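A sketch of the volatile-field variant described above (expensiveHashComputation() is a placeholder for whatever the real calculation is):

    private volatile int hash; // 0 means "not computed yet"

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // Racy but harmless: concurrent threads may recompute, but they all
            // produce the same value because the object is immutable.
            h = expensiveHashComputation();
            hash = h;
        }
        return h;
    }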
If your object is immutable it can't change its state, and therefore its hashcode can't change. That allows you to calculate the value once you need it and to cache it, since it will always stay the same. It's in fact a very bad idea to implement your own hashCode function based on mutable state, since e.g. HashMap assumes that the hash can't change and will break if it does.
The benefit of lazy initialization is that the hashcode calculation is delayed until it is required. Many objects don't need it at all, so you save some calculations. Especially expensive hash calculations, like on long Strings, benefit from that.
class FinalObject {
private final int a, b;
public FinalObject(int value1, int value2) {
a = value1;
b = value2;
}
// not calculated at the beginning - lazy once required
private int hashCode;
@Override
public int hashCode() {
int h = hashCode; // read
if (h == 0) {
h = a + b; // calculation
hashCode = h; // write
}
return h; // return local variable instead of second read
}
}
Edit: as pointed out by @assylias, using unsynchronized / non-volatile code is only guaranteed to work if there is only one read of hashCode, because every subsequent read of that field could return 0 even though the first read already saw a different value. The version above fixes the problem.
Edit2: replaced with more obvious version, slightly less code but roughly equivalent in bytecode
public int hashCode() {
int h = hashCode; // only read
return h != 0 ? h : (hashCode = a + b);
// ^- just a (racy) write to hashCode, no read
}
What that line means is, since the object is immutable, then the hashCode has to only be computed once. Further, it doesn't have to be computed when the object is constructed - it only has to be computed when the function is first called. If the object's hashCode is never used then it is never computed. So the hashCode function can look something like this:
@Override public int hashCode() {
synchronized (this) {
if (!this.computedHashCode) {
this.hashCode = expensiveComputation();
this.computedHashCode = true;
}
}
return this.hashCode;
}
And to add to other answers.
An immutable object cannot be changed. The final keyword works for basic data types such as int, but for custom objects it doesn't mean that; immutability has to be enforced internally in your implementation:
The following code would result in a compilation error, because you are trying to change a final reference/pointer to an object.
final MyClass m = new MyClass();
m = new MyClass();
However this code would work.
final MyClass m = new MyClass();
m.changeX();
I made a vector set in order to avoid thrashing the GC with iterator allocations and the like
(you get a new/free each for both the set reference and the set iterator for each traversal of a HashSet's values or keys).
Anyway, supposedly the Object.hashCode() method is a unique id per object (wouldn't that fail on a 64-bit VM?).
But in any case it is overridable and therefore not guaranteed unique, nor unique per object instance.
If I want to create an "ObjectSet" how do I get a guaranteed unique ID for each instance of an object??
I just found this, which answers it:
How to get the unique ID of an object which overrides hashCode()?
The simplest solution is to add a field to the object. This is the fastest and most efficient solution and avoid any issues of objects failing to be cleaned up.
abstract class Ided {
static final AtomicLong NEXT_ID = new AtomicLong(0);
final long id = NEXT_ID.getAndIncrement();
public long getId() {
return id;
}
}
If you can't modify the class, you can use an IdentityHashMap like @glowcoder's deleted solution.
private static final Map<Object, Long> registry = new IdentityHashMap<Object, Long>();
private static long nextId = 0;
public static long idFor(Object o) {
Long l = registry.get(o);
if (l == null)
registry.put(o, l = nextId++);
return l;
}
public static void remove(Object o) {
registry.remove(o);
}
No, that's not how hashCode() works. The returned value does not have to be unique. The exact contract is spelled out in the documentation.
Also,
supposedly the Object.hashCode() method is a unique id per object
is not true. To quote the documentation:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects.
java.lang.System.identityHashCode(obj); will do this for you, if you really need it and understand the repercussions. It gets the identity hashcode, even if the method to provide the hashcode has been overridden.
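A quick illustration (the overridden hashCode value here is arbitrary):

    Object key = new Object() {
        @Override
        public int hashCode() { return 42; } // overridden hashCode
    };

    System.out.println(key.hashCode());               // 42
    System.out.println(System.identityHashCode(key)); // identity-based value, ignoring the override
    // Note: the identity hash code is still not guaranteed to be unique across objects.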
Trying to outperform the java GC sounds like premature optimization to me.
The GC is already tuned to handle small short-lived objects. If you have performance issues with GC, you ought to help the GC, not re-implement it (IMNSHO)
http://leepoint.net/notes-java/data/expressions/22compareobjects.html
It turns out that defining equals() isn't trivial; in fact it's moderately hard to get it right, especially in the case of subclasses. The best treatment of the issues is in Horstmann's Core Java Vol 1.
If equals() must always be overridden, then what is a good approach for not being cornered into having to do object comparison? What are some good "design" alternatives?
EDIT:
I'm not sure this is coming across the way that I had intended. Maybe the question should be more along the lines of "Why would you want to compare two objects?" Based upon your answer to that question, is there an alternative solution to comparison? I don't mean a different implementation of equals; I mean not using equality at all. I think the key point is to start with that question: why would you want to compare two objects?
If equals() must always be overridden, then what is a good approach for not being cornered into having to do object comparison?
You are mistaken. You should override equals as seldom as possible.
All this info comes from Effective Java, Second Edition (Josh Bloch). The first edition chapter on this is still available as a free download.
From Effective Java:
The easiest way to avoid problems is not to override the equals method, in which case each instance of the class is equal only to itself.
The problem with arbitrarily overriding equals/hashCode is inheritance. Some equals implementations advocate testing it like this:
if (this.getClass() != other.getClass()) {
return false; //inequal
}
In fact, the Eclipse (3.4) Java editor does just this when you generate the method using the source tools. According to Bloch, this is a mistake as it violates the Liskov substitution principle.
From Effective Java:
There is no way to extend an instantiable class and add a value component while preserving the equals contract.
Two ways to minimize equality problems are described in the Classes and Interfaces chapter:
Favour composition over inheritance
Design and document for inheritance or else prohibit it
As far as I can see, the only alternative is to test equality in a form external to the class, and how that would be performed would depend on the design of the type and the context you were trying to use it in.
For example, you might define an interface that documents how it was to be compared. In the code below, Service instances might be replaced at runtime with a newer version of the same class - in which case, having different ClassLoaders, equals comparisons would always return false, so overriding equals/hashCode would be redundant.
public class Services {
private static Map<String, Service> SERVICES = new HashMap<String, Service>();
static interface Service {
/** Services with the same name are considered equivalent */
public String getName();
}
public static synchronized void installService(Service service) {
SERVICES.put(service.getName(), service);
}
public static synchronized Service lookup(String name) {
return SERVICES.get(name);
}
}
"Why would you want to compare two objects?"
The obvious example is to test if two Strings are the same (or two Files, or URIs). For example, what if you wanted to build up a set of files to parse? By definition, the set contains only unique elements. Java's Set type relies on the equals/hashCode methods to enforce uniqueness of its elements.
I don't think it's true that equals should always be overridden. The rule as I understand it is that overriding equals is only meaningful in cases where you're clear on how to define semantically equivalent objects. In that case, you override hashCode() as well so that you don't have objects that you've defined as equivalent returning different hashcodes.
If you can't define meaningful equivalence, I don't see the benefit.
How about just doing it right?
Here's my equals template which is knowledge applied from Effective Java by Josh Bloch. Read the book for more details:
@Override
public boolean equals(Object obj) {
if(this == obj) {
return true;
}
// only do this if you are a subclass and care about equals of parent
if(!super.equals(obj)) {
return false;
}
if(obj == null || getClass() != obj.getClass()) {
return false;
}
final YourTypeHere other = (YourTypeHere) obj;
if(!instanceMember1.equals(other.instanceMember1)) {
return false;
}
... rest of instanceMembers in same pattern as above....
return true;
}
Mmhh
In some scenarios you can make the object unmodifiable (read-only) and have it created from a single point (a factory method).
If two objects with the same input data (creation parameters) are needed, the factory will return the same instance, and then using "==" would be enough.
This approach is useful under certain circumstances only. And most of the times would look overkill.
Take a look at this answer to know how to implement such a thing.
Warning: it is a lot of code.
For a short example, see how the wrapper classes have worked since Java 1.5:
Integer a = Integer.valueOf( 2 );
Integer b = Integer.valueOf( 2 );
a == b
is true while
new Integer( 2 ) == new Integer( 2 )
is false.
It internally keeps the reference and returns it if the input value is the same (at least for small values, which Integer caches).
As you know, Integer is read-only.
Something similar happens with the String class from which that question was about.
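A rough sketch of such a factory for your own read-only class (the class and names are purely illustrative) could be:

    public final class Temperature {
        private static final Map<Integer, Temperature> CACHE = new HashMap<>();

        private final int degrees;

        private Temperature(int degrees) { this.degrees = degrees; }

        // Single creation point: the same input always yields the same cached instance,
        // so callers can safely compare instances with ==.
        public static synchronized Temperature valueOf(int degrees) {
            return CACHE.computeIfAbsent(degrees, Temperature::new);
        }
    }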
Maybe I'm missing the point but the only reason to use equals as opposed to defining your own method with a different name is because many of the Collections (and probably other stuff in the JDK or whatever it's called these days) expect the equals method to define a coherent result. But beyond that, I can think of three kinds of comparisons that you want to do in equals:
The two objects really ARE the same instance. In this case it makes no sense to use equals because you can use ==. Also, if I remember how it works in Java correctly, the default equals method does exactly this: it compares references.
The two objects have references to the same instances, but are not the same instance. This is useful, uh, sometimes... particularly if they are persisted objects and refer to the same object in the DB. You would have to define your equals method to do this.
The two objects have references to objects that are equal in value, though they may or may not be the same instances (in other words, you compare values all the way through the hierarchy).
Why would you want to compare two objects? Well, if they're equal, you would want to do one thing, and if they're not, you would want to do something else.
That said, it depends on the case at hand.
The main reason to override equals() in most cases is to check for duplicates within certain Collections. For example, if you want to use a Set to contain an object you have created you need to override equals() and hashCode() within your object. The same applies if you want to use your custom object as a key in a Map.
This is critical as I have seen many people make the mistake in practice of adding their custom objects to Sets or Maps without overriding equals() and hashCode(). The reason this can be especially insidious is the compiler will not complain and you can end up with multiple objects that contain the same data but have different references in a Collection that does not allow duplicates.
For example if you had a simple bean called NameBean with a single String attribute 'name', you could construct two instances of NameBean (e.g. name1 and name2), each with the same 'name' attribute value (e.g. "Alice"). You could then add both name1 and name2 to a Set and the set would be size 2 rather than size 1, which is what was intended. Likewise, if you have a Map such as Map<NameBean, String> in order to map the name bean to some other object, and you first mapped name1 to the string "first" and later mapped name2 to the string "second", you will have both key/value pairs in the map (e.g. name1->"first", name2->"second"). So when you do a map lookup it will return the value mapped to the exact reference you pass in, which is either name1, name2, or another reference with name "Alice" that will return null.
Here is a concrete example preceded by the output of running it:
Output:
Adding duplicates to a map (bad):
Result of map.get(bean1):first
Result of map.get(bean2):second
Result of map.get(new NameBean("Alice"): null
Adding duplicates to a map (good):
Result of map.get(bean1):second
Result of map.get(bean2):second
Result of map.get(new ImprovedNameBean("Alice"): second
Code:
// This bean cannot safely be used as a key in a Map
public class NameBean {
private String name;
public NameBean() {
}
public NameBean(String name) {
this.name = name;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@Override
public String toString() {
return name;
}
}
// This bean can safely be used as a key in a Map
public class ImprovedNameBean extends NameBean {
public ImprovedNameBean(String name) {
super(name);
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if(obj == null || getClass() != obj.getClass()) {
return false;
}
return this.getName().equals(((ImprovedNameBean)obj).getName());
}
@Override
public int hashCode() {
return getName().hashCode();
}
}
public class MapDuplicateTest {
public static void main(String[] args) {
MapDuplicateTest test = new MapDuplicateTest();
System.out.println("Adding duplicates to a map (bad):");
test.withDuplicates();
System.out.println("\nAdding duplicates to a map (good):");
test.withoutDuplicates();
}
public void withDuplicates() {
NameBean bean1 = new NameBean("Alice");
NameBean bean2 = new NameBean("Alice");
java.util.Map<NameBean, String> map
= new java.util.HashMap<NameBean, String>();
map.put(bean1, "first");
map.put(bean2, "second");
System.out.println("Result of map.get(bean1):"+map.get(bean1));
System.out.println("Result of map.get(bean2):"+map.get(bean2));
System.out.println("Result of map.get(new NameBean(\"Alice\"): "
+ map.get(new NameBean("Alice")));
}
public void withoutDuplicates() {
ImprovedNameBean bean1 = new ImprovedNameBean("Alice");
ImprovedNameBean bean2 = new ImprovedNameBean("Alice");
java.util.Map<ImprovedNameBean, String> map
= new java.util.HashMap<ImprovedNameBean, String>();
map.put(bean1, "first");
map.put(bean2, "second");
System.out.println("Result of map.get(bean1):"+map.get(bean1));
System.out.println("Result of map.get(bean2):"+map.get(bean2));
System.out.println("Result of map.get(new ImprovedNameBean(\"Alice\"): "
+ map.get(new ImprovedNameBean("Alice")));
}
}
Equality is fundamental to logic (see law of identity), and there's not much programming you can do without it. As for comparing instances of classes that you write, well that's up to you. If you need to be able to find them in collections or use them as keys in Maps, you'll need equality checks.
If you've written more than a few nontrivial libraries in Java, you'll know that equality is hard to get right, especially when the only tools in the chest are equals and hashCode. Equality ends up being tightly coupled with class hierarchies, which makes for brittle code. What's more, no type checking is provided since these methods just take parameters of type Object.
There's a way of making equality checking (and hashing) a lot less error-prone and more type-safe. In the Functional Java library, you'll find Equal<A> (and a corresponding Hash<A>) where equality is decoupled into a single class. It has methods for composing Equal instances for your classes from existing instances, as well as wrappers for Collections, Iterables, HashMap, and HashSet, that use Equal<A> and Hash<A> instead of equals and hashCode.
What's best about this approach is that you can never forget to write equals and hash method when they are called for. The type system will help you remember.