Can hashCode() have dynamically changeable content? - java

In my implementation, I have a class A that overrides equals(Object) and hashCode(). My doubt is this: when an instance of A is added to a HashSet/HashMap, hashCode() returns x; some time later, hashCode() on the same instance returns y. Will that affect anything?

The hash code mustn't change after it's been added to a map / set. It's okay for it to change before that, although it generally makes the type easier to work with if it doesn't change.
If the hash code changes, the key won't be found in the map / set: even if it ends up in the same bucket, the hash code will be checked first.
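A minimal sketch of that failure, assuming a hypothetical class MutableKey whose hash code is derived from a mutable name field (neither the class nor LostKeyDemo comes from the question):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key class whose hash code depends on a mutable field.
final class MutableKey {
    private String name;

    MutableKey(String name) { this.name = name; }

    void setName(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
    }

    @Override
    public int hashCode() { return Objects.hashCode(name); }
}

final class LostKeyDemo {
    // Returns whether the set still finds the key after its hash code changed.
    static boolean foundAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey("x");
        set.add(key);
        key.setName("y");          // hashCode() now differs from the value used at insertion
        return set.contains(key);  // looks up the very same instance
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());
    }
}
```

With these particular values the lookup fails even though the instance is still physically in the set, and set.size() still reports 1.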

When the return value of hashCode() or equals() changes while the object is contained in HashMap/HashSet etc., the behavior is undefined (you could get all kinds of strange behavior). So one must avoid such mutation of keys while the object is contained in such collections etc.
It is considered best to use only immutable objects as keys (or as elements of a HashSet etc.). In fact Python, for example, does not allow its built-in mutable containers (such as list and dict) to be used as dictionary keys. It is permitted, and common, to use mutable objects as keys in Java, but in that case it is advisable to make such objects "effectively immutable", i.e. do not change their state at all after instantiation.
To give an example, using a list as a key in a Map is usually considered okay, but you should avoid mutating such lists at any point of your application to avoid getting bitten by nasty bugs.
As long as you don't change the return value of hashCode() and equals() while the objects are in the container, you should be ok on paper. But one could easily introduce nasty, hard to find bugs by mistake so it's better to avoid the situation altogether.
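One way to keep a list key "effectively immutable" is to store an unmodifiable defensive copy. This is only a sketch: ListKeyDemo and frozenCopy are hypothetical names, not something from the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ListKeyDemo {
    // Wraps a defensive copy, so nobody holding the original list can alter the key.
    static List<String> frozenCopy(List<String> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    static String lookupAfterSourceChange() {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b"));
        Map<List<String>, String> map = new HashMap<>();
        map.put(frozenCopy(source), "value");

        source.add("c"); // changes only the original list, not the key stored in the map

        // Lists compare by content, so an equal list finds the entry.
        return map.get(Arrays.asList("a", "b"));
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterSourceChange());
    }
}
```

Had the map stored source itself, the add("c") call would have changed the key's hash code while it was inside the map.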

Yes, the hash code of an object must not change during its lifetime. If it does, you need to notify the container (if that's possible); otherwise you can get wrong results.
Edit: As pointed out, it depends on the container. Obviously, if the container never uses your hashCode or equals methods, nothing will go wrong. But as soon as it tries to compare things for equality (all maps and sets), you'll get yourself in trouble.
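The standard collections have no "notify" hook, so in practice this usually means removing the object before the change and adding it back afterwards. A sketch under that assumption, with the hypothetical names Tag and renameSafely:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical element type with a mutable field that feeds equals() and hashCode().
final class Tag {
    private String label;

    Tag(String label) { this.label = label; }

    void rename(String label) { this.label = label; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Tag && Objects.equals(label, ((Tag) o).label);
    }

    @Override
    public int hashCode() { return Objects.hashCode(label); }
}

final class ReinsertDemo {
    static void renameSafely(Set<Tag> set, Tag tag, String newLabel) {
        boolean wasPresent = set.remove(tag); // remove while the old hash code is still valid
        tag.rename(newLabel);
        if (wasPresent) {
            set.add(tag);                     // re-add so it is filed under the new hash code
        }
    }

    static boolean demo() {
        Set<Tag> set = new HashSet<>();
        Tag tag = new Tag("x");
        set.add(tag);
        renameSafely(set, tag, "y");
        return set.contains(tag) && set.contains(new Tag("y")) && !set.contains(new Tag("x"));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```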

Yes. Many people have answered the question here; I just want to add an analogy. A hash code is something like an address in a hash-based collection:
Imagine you check in to a hotel under the name "Mike", and afterwards you change your name to "GreatMike" on the check-in form. Then when someone looks for you under the name "Mike", they cannot find you anymore.

Related

When is hashcode useful if I am never using hashtable?

Let's say I am implementing a class called Car, with 2 member variables: int numDoors and String color.
In a hypothetical case, I am never, ever going to use such a car in a hashtable or hashmap or any structure that needs a hash.
Now, why is it still required to override hashCode along with equals?
Note: all the answers I have checked involve use in a hashtable / hashmap. I have searched extensively for this answer, so as a request, don't mark it as a duplicate. Thanks
It's the general convention:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
However, it's not entirely enforceable.
There are times in which you would believe that you don't need to have hashCode defined and implemented for your object, and if you don't use any structure that relies on a hash to store or reference it, you'd be correct.
But there are third-party libraries with which your object may come into contact, and they may very well be using a Map or Set to do their work, and they'd have the expectation that you followed conventions.
It's up to you to not implement hashCode along with equals - you're certainly not forced to (although many would argue that this is a bug), but beware that your object may not work as well with a third party library for this reason.
The only conceivable types which would not be able to override the hashCode method in a fashion consistent with the hashCode and equals contract would be those which are unable to override hashCode at all [e.g. because a base class declared it final]. There is thus almost never any reason for a type not to legitimately implement hashCode(). Even if a type cannot guarantee that instances which are unequal won't spontaneously become equal, the author of the type may still legitimately implement hashCode() by picking a 32-bit int value [e.g. 8675309] and implementing hashCode() as @Override public int hashCode() { return 8675309; }. Doing this will allow all of the hash-table-based collection types to work correctly. Storing very many such items in a hash table will severely degrade performance, but hash tables with just a few items will work just fine and generally perform decently. By contrast, if one overrides equals without overriding hashCode, then a hash table will likely work incorrectly even if only a single item is stored in it.
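As a sketch of the constant-hash approach, here is the Car class from the question with the arbitrary constant from this answer. The equals body and the ConstantHashDemo class are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class Car {
    private final int numDoors;
    private final String color;

    Car(int numDoors, String color) {
        this.numDoors = numDoors;
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Car)) return false;
        Car other = (Car) o;
        return numDoors == other.numDoors && Objects.equals(color, other.color);
    }

    // Legal but slow: every Car lands in the same bucket.
    @Override
    public int hashCode() { return 8675309; }
}

final class ConstantHashDemo {
    static int distinctCars() {
        Set<Car> cars = new HashSet<>();
        cars.add(new Car(4, "red"));
        cars.add(new Car(4, "red"));  // equal to the first, so not added again
        cars.add(new Car(2, "blue"));
        return cars.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCars());
    }
}
```

The set ends up with two elements, because equal cars always share a hash code and equals then decides. If the hashCode override is deleted, the two red cars almost certainly get different identity hash codes and the set holds three elements.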
Incidentally, in some cases there may be advantages to implementing hashCode even when not using hashed collections. For example, some immutable collection types which support deep comparison might call hashCode() on the items stored therein. If a collection is large, and/or comparison operations on the items stored within it are expensive, the efficiency of testing two collections for equality ("do they contain equal items") may be enhanced by using a three-step process:
Compare the aggregate hashcode of two collections. If they're not equal, no reason to look any further. Will often yield instant results, no matter the size of the collections.
Compare the cached hash codes of all the items. If the collections' contents match except for the last couple items, and if comparisons between items may be expensive (e.g. the items are thousand-character strings) this will often avoid the need to compare all but one of the items for equality [note that if all but one of the items matched, and its hash code differed, then the aggregate hash code would differ and we wouldn't have gotten this far].
If all the hash codes match, then call equals on each pair of items that don't compare reference-equal.
Note that if two collections contain distinct items with equal content, a comparison is going to need to deeply examine all of the items; hashCode can't do anything to help with that case. On the other hand, in most cases where things are compared they are not equal, and using cached hashCode() values may facilitate orders-of-magnitude speedups in those cases.
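The three steps could look roughly like this. HashedList is a hypothetical immutable wrapper written for this sketch, not a class from the JDK or from any library mentioned here, and it assumes non-null items.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical immutable collection that caches hash codes at construction time.
final class HashedList<T> {
    private final Object[] items;
    private final int[] itemHashes;  // cached hashCode() of each item
    private final int aggregateHash; // combined hash of all items

    HashedList(List<? extends T> source) {
        items = source.toArray();
        itemHashes = new int[items.length];
        int h = 1;
        for (int i = 0; i < items.length; i++) {
            itemHashes[i] = items[i].hashCode(); // null items are not supported in this sketch
            h = 31 * h + itemHashes[i];
        }
        aggregateHash = h;
    }

    @Override
    public int hashCode() { return aggregateHash; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof HashedList)) return false;
        HashedList<?> other = (HashedList<?>) o;
        // Step 1: compare the aggregate hash codes (and the sizes).
        if (aggregateHash != other.aggregateHash || items.length != other.items.length) {
            return false;
        }
        // Step 2: compare the cached per-item hash codes.
        if (!Arrays.equals(itemHashes, other.itemHashes)) return false;
        // Step 3: call equals() only on pairs that are not reference-equal.
        for (int i = 0; i < items.length; i++) {
            if (items[i] != other.items[i] && !items[i].equals(other.items[i])) {
                return false;
            }
        }
        return true;
    }
}
```

Steps 1 and 2 never call equals() on an item, so a mismatch is detected from the cached integers alone. Only collections whose hash codes all agree pay for the item-by-item comparison.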

Are there any constraints on the hash map data type

Are there any constraints on the key type in a hash map or hash table? (Interview question.)
I think yes we can customize it as needed.
Technically, no. Generally, you want to use an object that implements equals() and hashCode() although that is not strictly necessary. If you don't, then it will use the base implementations defined by Object which compare object identity. A lot of times, that is not appropriate but sometimes it's fine.
Technically the key doesn't need to be immutable as long as the values used in the equals() and hashCode() implementations are immutable. For example, if your class Foo uses a string "foo" as part of its hash then that value "foo" must not change. That's because hash maps put the keys into buckets based on the hashCode() value for efficiency reasons. If the hashCode suddenly changes, the hash map is unaware of it, the key now lives in the wrong bucket, and you'll run into nasty bugs because it's then possible to have "duplicate" objects in your map. Hope that makes sense.
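A small sketch of how such "duplicates" arise. Only the class name Foo and the value "foo" come from the answer above; the field name, the setter and DuplicateKeyDemo are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical Foo whose hash is derived from a mutable string.
final class Foo {
    private String name;

    Foo(String name) { this.name = name; }

    void setName(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Foo && Objects.equals(name, ((Foo) o).name);
    }

    @Override
    public int hashCode() { return Objects.hashCode(name); }
}

final class DuplicateKeyDemo {
    static int sizeAfterMutation() {
        Map<Foo, String> map = new HashMap<>();
        Foo key = new Foo("foo");
        map.put(key, "first");

        key.setName("bar");                // the entry stays filed under the hash of "foo"
        map.put(new Foo("bar"), "second"); // equal to the mutated key, yet added as a new entry

        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterMutation());
    }
}
```

The map now holds two entries whose keys are equal to each other, which a correctly used HashMap never allows.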
Several things to consider:
Regarding the "type" itself: you cannot use a primitive type. This is a language constraint of Java, e.g. HashMap<int, Foo> is not valid; you need to use HashMap<Integer, Foo>.
Based on the way HashMap works, the key should have a meaningful implementation of hashCode() and equals(). What counts as "meaningful" depends on your needs. The default implementation in Object may already serve your needs, but you need to be aware of it.
Once an object instance is put into the Map as a key, its hashCode() and equals() should stay consistent. You should never put a key into a map and then change the state of that instance so that hashCode()/equals() return different values. The easiest way to ensure this is, of course, to use an immutable object as the key. It is still fine to use a mutable object, as long as your code ensures that the state of the keys does not change.

Practical example for Immutable class

It is obvious that immutability increases re-usability, since a new object is created on each state change. Can somebody tell me a practical scenario where we need an immutable class?
Consider java.lang.String. If it weren't immutable, every time you ever have a string you want to be confident wouldn't change underneath you, you'd have to create a copy.
Another example is collections: it's nice to be able to accept or return a genuinely immutable collection (e.g. from Guava - not just an immutable view on a mutable collection) and have confidence that it won't be changed.
Whether those count as "needs" or not, I don't know - but I wouldn't want to develop without them.
A good example is related to hashing. A class overrides the equals() and hashCode() methods so that it can be used in data structures like HashSet and (as keys in) HashMap, and the hash code is typically derived from some identifying member attributes. However, if these attributes were to change then so would the object's hash code, so the object would no longer be usable in a hashing data structure.
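A minimal sketch of an immutable class that is safe to use as a hash key. Rgb and its withRed method are hypothetical, chosen only to illustrate the pattern.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable value class: all fields are final and there are no setters.
final class Rgb {
    private final int red;
    private final int green;
    private final int blue;

    Rgb(int red, int green, int blue) {
        this.red = red;
        this.green = green;
        this.blue = blue;
    }

    // A "state change" returns a new instance and leaves this one untouched.
    Rgb withRed(int newRed) { return new Rgb(newRed, green, blue); }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Rgb)) return false;
        Rgb other = (Rgb) o;
        return red == other.red && green == other.green && blue == other.blue;
    }

    @Override
    public int hashCode() { return 31 * (31 * red + green) + blue; }
}

final class ImmutableKeyDemo {
    static String demo() {
        Map<Rgb, String> names = new HashMap<>();
        Rgb black = new Rgb(0, 0, 0);
        names.put(black, "black");

        Rgb red = black.withRed(255); // black itself is unchanged, so its entry stays reachable
        names.put(red, "red");

        return names.get(new Rgb(0, 0, 0)) + "," + names.get(new Rgb(255, 0, 0));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because no field can change after construction, the hash code of a stored key can never drift away from the bucket it was filed in.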
Java provides a nice example: String.
This article has a good color example (since color definitions don't change).
http://www.ibm.com/developerworks/java/library/j-jtp02183/index.html

Java HashMap or IdentityHashMap

There are some cases where the key objects used in a map do not override hashCode() and equals() from Object, for example when using a socket Connection or java.lang.Class as keys.
Is there any potential defect in using these objects as keys in a HashMap?
Should I use IdentityHashMap in these cases?
If equals() and hashCode() are not overridden on key objects, HashMap and IdentityHashMap should have the same semantics. The default equals() implementation uses reference semantics, and the default hashCode() is the system identity hash code of the object.
This is only harmful in cases where different instances of an object can be considered logically equal. For example, you would not want to use IdentityHashMap if your keys were:
new Integer(1)
and
new Integer(1)
Since these are technically different instances of the Integer class. (You should really be using Integer.valueOf(1), but that's getting off-topic.)
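The difference can be sketched as follows. The example uses new String("key") instead of new Integer(1), because the Integer constructor is deprecated in recent Java versions; the class name IdentityDemo is made up for this sketch.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class IdentityDemo {
    // Returns { size of the HashMap, size of the IdentityHashMap }.
    static int[] sizes() {
        String k1 = new String("key"); // two distinct instances with equal content
        String k2 = new String("key");

        Map<String, Integer> hashMap = new HashMap<>();
        Map<String, Integer> identityMap = new IdentityHashMap<>();

        hashMap.put(k1, 1);
        hashMap.put(k2, 2);     // k2.equals(k1), so this overwrites the first entry

        identityMap.put(k1, 1);
        identityMap.put(k2, 2); // k1 != k2, so this becomes a second entry

        return new int[] { hashMap.size(), identityMap.size() };
    }

    public static void main(String[] args) {
        int[] s = sizes();
        System.out.println(s[0] + " " + s[1]);
    }
}
```

The HashMap ends up with one entry and the IdentityHashMap with two.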
Class objects as keys should be okay, except in very special circumstances (for example, the Hibernate ORM library generates subclasses of your classes at runtime in order to implement a proxy). As a developer I would be skeptical of code which stores Connection objects in a Map as keys (maybe you should be using a connection pool, if you are managing database connections?). Whether or not they will work depends on the implementation (since Connection is just an interface).
Also, it's important to note that HashMap expects the equals() and hashCode() determination to remain constant. In particular, if you implement some custom hashCode() which uses mutable fields on the key object, changing a key field may make the key get 'lost' in the wrong hashtable bucket of the HashMap. In these cases, you may be able to use IdentityHashMap (depending on the object and your particular use case), or you might just need a different equals()/hashCode() implementation.
From a mobile code security point of view, there are situations where using IdentityHashMap or similar becomes necessary. Malicious implementations of non-final key classes can override hashCode and equals to be malicious. They can, for instance, claim equality to different instances, acquire a reference to other instances they are compared to, etc. I suggest breaking with standard practice by staying safe and using IdentityHashMap where you want identity semantics. There is rarely a good reason to change the meaning of equality in a subclass where the superclass is already being compared. I guess the most likely scenario is a broken, non-symmetric proxy.
The implementation of IdentityHashMap is quite different than HashMap. It uses linear probing rather than Entry objects as links in a chain. This leads to a slight reduction in the number of objects, although a pretty small difference in total memory use. I don't have any good performance statistics I can cite. There used to be a performance difference between using (non-overridden) Object.hashCode and System.identityHashCode, but that got cleared up a few years ago.
In the situation you describe, the behavior of HashMap and IdentityHashMap is identical.
By contrast, if the keys override equals() and hashCode(), the behavior of the two maps differs.
See java.util.IdentityHashMap's javadoc below.
This class implements the Map interface with a hash table, using reference-equality in place of object-equality when comparing keys (and values). In other words, in an IdentityHashMap, two keys k1 and k2 are considered equal if and only if (k1==k2). (In normal Map implementations (like HashMap) two keys k1 and k2 are considered equal if and only if (k1==null ? k2==null : k1.equals(k2)).)
In summary, my answer is that:
Is there any potential defect to use these objects as keys in a HashMap?
--> No
Should I use IdentityHashMap in these cases? --> No
While there's no theoretical problem, you should avoid IdentityHashMap unless you have an explicit reason to use it. It provides no appreciable performance or other benefit in the general case, and when you inevitably start introducing objects into the map that do override equals() and hashCode(), you'll end up with subtle, hard-to-diagnose bugs.
If you think you need IdentityHashMap for performance reasons, use a profiler to confirm that suspicion before you make the switch. My guess is you'll find plenty of other opportunities for optimization that are both safer and make a bigger difference.
As far as I know, the only problem with a hashmap with bad keys would be with very big hashmaps: your keys could be very bad and you get O(n) retrieval time instead of O(1). If it does break anything else, I would be interested to hear about it though :)

Why is getEntry(Object key) not exposed on HashMap?

Here is my use case: I have an object that is logically equal to my HashMap key but is not the same object (not ==). I need to get the actual key object out of the HashMap so that I can synchronise on it. I am aware that I can iterate over the keySet, but this is slow in comparison to hashing.
Looking through the java.util.HashMap implementation I see a getEntry(Object key) method that is exactly what I need. Any idea why this has not been exposed?
Can you think of any other way I can get the key out?
I think you would be better off putting in an extra layer of indirection on the value. The key should also be a "pure" value. Instead of:
Map<ReferenceObjectKey,Thing> map;
Use:
Map<ValueObjectKey,ReferenceObject<Thing>> map;
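One possible reading of that suggestion, sketched with a plain String as the value key and an Integer counter as the Thing. ReferenceObject, Registry and holderFor are hypothetical names, and the use of ConcurrentHashMap is an assumption of this sketch rather than part of the answer.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical holder: exactly one instance exists per key, so it can serve as the lock.
final class ReferenceObject<T> {
    private T value;

    ReferenceObject(T value) { this.value = value; }

    T get() { return value; }

    void set(T value) { this.value = value; }
}

final class Registry {
    private final Map<String, ReferenceObject<Integer>> map = new ConcurrentHashMap<>();

    // Any key that equals() an existing one yields the same holder instance.
    ReferenceObject<Integer> holderFor(String key) {
        return map.computeIfAbsent(key, k -> new ReferenceObject<>(0));
    }

    void increment(String key) {
        ReferenceObject<Integer> holder = holderFor(key);
        synchronized (holder) { // lock on the canonical holder, not on the key
            holder.set(holder.get() + 1);
        }
    }
}
```

Callers can then pass any key that is equal to the stored one and never need the original key instance, because the object they synchronise on is reached through the value side of the map.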
I can't answer your actual question (why is the method not exposed) beyond the rather obvious, "because the authors decided not to expose it."
However your question leads me to believe that you have a rather strange synchronization scheme going on; from my understanding you're only trying to call it to get a canonical representation of equal objects for synchronization. That sounds like a really bad idea, as I noted in my comment to the question.
A better approach would be to revisit how and why you want to synchronize on these key objects, and rework your synchronization to be clearer and saner, preferably at a level higher up or by using an alternative approach altogether.
It might help if you posted a code snippet of what you want to do with this synchronization so that others can give their opinions on a cleaner way to implement it. One example would simply be to use a thread-safe map class (such as ConcurrentHashMap), if this is indeed what you're trying to achieve here.
Edit: Have a look at How To Ask Questions The Smart Way, in particular the bullet point I've linked as this is a classic example of that deficiency. It seems likely that your overall design is a bit off and needs to go in a different direction; so while you're stuck on this specific issue it's a symptom of a larger problem. Giving us the broader context will lead to you getting much better overall answers.
Actually, the method the caller is asking for would have been useful. It was arguably a mistake that it, or something like it, was not included.
As it is, supposing you wish to increment the Integer value that's mapped from key "a" -- you end up having to do a hash lookup on "a" twice. Supposing you want to distinguish between a value being not present and the value being present but mapped to null -- again, two hash lookups.
In practice the world hasn't ended because of this, though.
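For the increment case specifically, Java 8 later added Map.merge, which expresses the update as a single call. A sketch comparing the two styles, where CounterDemo and its method names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

final class CounterDemo {
    // Two separate map operations: one get() and one put().
    static void incrementTwoLookups(Map<String, Integer> counts, String key) {
        Integer old = counts.get(key);
        counts.put(key, old == null ? 1 : old + 1);
    }

    // Map.merge (Java 8 and later): inserts 1 if absent, otherwise sums the old value and 1.
    static void incrementWithMerge(Map<String, Integer> counts, String key) {
        counts.merge(key, 1, Integer::sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        incrementTwoLookups(counts, "a");
        incrementWithMerge(counts, "a");
        System.out.println(counts.get("a"));
    }
}
```

Whether merge actually saves a hash lookup depends on the map implementation, so treat that as something to measure rather than assume.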
I stumbled upon this problem myself recently. When I boiled the problem down enough, it was that I was essentially using 2 different methods to associate data with the part of the key object that was used for determining equality:
With the value the key mapped to, via the Map
With the data contained with the key object, but that wasn't used in the .equals()/hashCode methods, via composition.
I was using a List in the key class to determine equality and hashcode, and there were 3 other fields in it - a boolean, and 2 Strings. In the end, I remade the map as a Map<List<String>, ...> and refactored the other 3 fields into their own class, then had the original class as a composition of the List and the new class. I felt that the code seemed better after this.
This sounds like a deeper problem you're having. Why do you need such a thing? Why is the key not unique to its object?
What do you mean by "so that I can synchronise on it"?
I'm sorry, but you seem to have a conceptual break here.
If your problem is that you "hold" an equivalent object (.equals() is true but == is false) to a key, and need to find the key, using the Object variant of get would not help you, because the only .equals that Object supports is identity (==).
What you need to do is implement equals() and, of course, hashCode() in your key class.
This will make it trivial to obtain the entry.
