Should custom key objects be immutable? If yes, then why? - java

Okay, I want to have custom user-defined objects as keys in my HashMap instead of, say, String. Should the candidate objects be immutable? I read somewhere that the best practice is to make them immutable, but I cannot figure out the reason myself.

If you have a mutable key in a HashMap and change it after insertion, it will end up in the wrong bucket, which totally breaks the Map:
insert key, hashCode() is called, bucket assigned
change key, hashCode changes, no longer matches the bucket
look up by (new) key, hashCode() leads to the wrong bucket, value not found
look up by (old) key, hashCode() leads to the "correct" bucket, but now the key found there is no longer equal (because it is the "new" key now), so it is also discarded
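The four steps above can be reproduced with a short sketch. MutableName and MutableKeyDemo are hypothetical names made up for this illustration; the only assumption is that equals() and hashCode() both depend on the mutable field.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical mutable key class, used only for illustration.
class MutableName {
    String value;

    MutableName(String value) { this.value = value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableName && Objects.equals(value, ((MutableName) o).value);
    }

    @Override
    public int hashCode() { return Objects.hashCode(value); }
}

class MutableKeyDemo {
    // Returns { lookup by new key, lookup by old key, map size } after mutating the key in place.
    static String[] lookupsAfterMutation() {
        Map<MutableName, String> phones = new HashMap<>();
        MutableName key = new MutableName("John");
        phones.put(key, "555-5555");   // hashCode() of "John" decides where the entry is stored

        key.value = "Smith";           // hashCode() changes, but the entry stays where it was

        // New key: the hash of "Smith" does not match the hash stored with the entry.
        String byNewKey = phones.get(new MutableName("Smith"));
        // Old key: the hash matches, but equals() against the mutated key fails.
        String byOldKey = phones.get(new MutableName("John"));
        return new String[] { byNewKey, byOldKey, String.valueOf(phones.size()) };
    }

    public static void main(String[] args) {
        for (String s : lookupsAfterMutation()) {
            System.out.println(s);
        }
    }
}
```

Both lookups come back empty even though the map still holds one entry; the value is only reachable by iterating over the map.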
If you have a mutable key in a TreeMap, then it will end up in the wrong position of the tree, which is supposed to be sorted (and that happens at insertion-time). Basically same flow as above.
And since we like similes around here, this is like changing your name in an existing phonebook with a magic marker without printing a whole new book: So your new name "Smith" will still be listed between "John" and "Johnston" (where no one will look for it), and no one will find it between "Smart" and "Smithers" (where they are looking for it). TreeMap works just like a phonebook.
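The phonebook effect can be sketched the same way for TreeMap. MutableEntry and PhonebookDemo are hypothetical names for this illustration, and the three names are chosen only so that the tree shape is easy to follow.

```java
import java.util.TreeMap;

// Hypothetical mutable, comparable key (illustration only).
class MutableEntry implements Comparable<MutableEntry> {
    String name;

    MutableEntry(String name) { this.name = name; }

    @Override
    public int compareTo(MutableEntry other) { return name.compareTo(other.name); }
}

class PhonebookDemo {
    // Returns { lookup by new name, lookup by old name, first name in iteration order }.
    static String[] lookupsAfterRename() {
        TreeMap<MutableEntry, String> book = new TreeMap<>();
        MutableEntry renamed = new MutableEntry("Johnson");
        book.put(new MutableEntry("Johnston"), "111");
        book.put(renamed, "222");                       // sorted in before "Johnston"
        book.put(new MutableEntry("Smithers"), "333");

        renamed.name = "Smith";                         // the tree is not re-sorted

        String byNewName = book.get(new MutableEntry("Smith"));   // searched near "Smithers"
        String byOldName = book.get(new MutableEntry("Johnson")); // position found, name differs
        return new String[] { byNewName, byOldName, book.firstKey().name };
    }

    public static void main(String[] args) {
        for (String s : lookupsAfterRename()) {
            System.out.println(s);
        }
    }
}
```

Neither lookup finds the entry, and iteration still lists "Smith" ahead of "Johnston", where it was sorted at insertion time.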

Yes they should be immutable as they would not function well as keys if they could be changed. Imagine buying a lock and key for your house, but then deciding you'd like to make the key prettier by hammering it into a different shape. It wouldn't work very well, would it? The same principles apply here.

Yes. If you update the key from somewhere else, then you can no longer look up the value stored for that key.

Related

Can I change the inner structure of objects in a HashTable while iterating over it?

Like the title says: can I change the inner structure of objects in a Hashtable while iterating over its keys? I know I can't change the Map itself, or at least that it is risky to do so, but despite Google searches I haven't found any clear or simple answer as to whether or not it is OK to change the attributes of the objects themselves in the hash map. My gut feeling says no, since this would probably change the hash, but it would be good to know for certain. I am also interested in replacing the value for the keys while iterating over them. Is this possible?
Apologies if this has been answered a lot of times before.
To be short, will these two methods work as expected?
import java.awt.image.BufferedImage;
import java.util.Hashtable;
import javax.swing.JSlider;

// MyClassA and MyClassB are the asker's own key classes (not shown).
public class Manager {
    private Hashtable<MyClassA, BufferedImage> ht1;
    private Hashtable<MyClassB, JSlider> ht2;

    private void method1() {
        for (MyClassB mcb : ht2.keySet()) {
            mcb.calculateStuff(ht2.get(mcb).getValue());
            // calculateStuff() doesn't change anything, but if it takes long, the JSliders might be
            // changed by the user or a timer, resulting in a new hashCode(), and potentially problems.
        }
    }

    private void method2() {
        for (MyClassA mca : ht1.keySet()) {
            mca.changeInnerStructureOfA(); // Changes the fields of the object mca.
            ht1.put(mca, mca.calculateNewImage()); // put() takes the key and the new value
        }
    }
}
You must not mutate the keys of a hash-based container in any situation, not only while iterating over the container. The reason is that any mutation that changes the value of the hash function leaves your container in an invalid state, in which the key sits in a hash bucket that does not correspond to the key's current hash value.
This is the reason behind a strong recommendation of using only immutable classes as keys in hash-based containers.
I am also interested in replacing the value for the keys while iterating over them. Is this possible?
No, replacing a key in place is not possible. In order to replace a key in a container with another key, you need to remove the item first and then re-insert it with the new key. Doing this through the container while iterating over it would, however, trigger a ConcurrentModificationException.
If you need to replace a significant number of keys, the best approach would be making a new hash container and populating it with key-value pairs as you iterate the original container.
If you need to replace only a small number of keys, make a list of objects describing the change (old key, new key, value), populate the list as you iterate the original container, and then walk the list of changes to make the alterations to the original container.
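A minimal sketch of the change-list approach, assuming String keys and a made-up rule (dropping a "tmp_" prefix) to decide which keys get replaced. KeyChange, RekeyDemo and stripTmpPrefix are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RekeyDemo {
    // Hypothetical holder for one pending change: (old key, new key, value).
    static final class KeyChange {
        final String oldKey;
        final String newKey;
        final Integer value;

        KeyChange(String oldKey, String newKey, Integer value) {
            this.oldKey = oldKey;
            this.newKey = newKey;
            this.value = value;
        }
    }

    // Renames every key that starts with "tmp_" by dropping the prefix.
    static void stripTmpPrefix(Map<String, Integer> map) {
        List<KeyChange> changes = new ArrayList<>();

        // Pass 1: only read from the map while iterating.
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            if (e.getKey().startsWith("tmp_")) {
                changes.add(new KeyChange(e.getKey(), e.getKey().substring(4), e.getValue()));
            }
        }

        // Pass 2: iteration is over, so structural changes are safe now.
        for (KeyChange c : changes) {
            map.remove(c.oldKey);
            map.put(c.newKey, c.value);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("tmp_a", 1);
        map.put("b", 2);
        stripTmpPrefix(map);
        System.out.println(map);
    }
}
```

If a new key equals a key that is already present, the put in the second pass overwrites that entry, so a real implementation would need to decide how to handle such clashes.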

100% Accurate key,value HashMap

According to the webpage http://www.javamex.com/tutorials/collections/hash_codes_advanced.shtml
"hash codes do not uniquely identify an object. They simply narrow down the choice of matching items, but it is expected that in normal use, there is a good chance that several objects will share the same hash code. When looking for a key in a map or set, the fields of the actual key object must therefore be compared to confirm a match."
First, does this mean that keys used in a hash map may point to more than one value as well? I assume that it does.
If this is the case, how can I create an "always accurate" HashMap or similar key/value object?
My key needs to be a String and my value needs to be a String as well. I need around 4,000 to 10,000 key-value pairs.
A standard hashmap will guarantee unique keys. A hashcode is not equivalent to a key. It is just a means of quickly reducing the set of possible values down to objects (strings in your case) that have a specific hashcode.
First, let it be noted: Java's HashMaps work. Assuming the hash function is implemented correctly, you'll always get the same value for the same key.
Now, in a hash map, the key's hash code determines the bucket in which the value will be placed (read about hash tables if you're not familiar with the term). The performance of the map depends on how well the hash codes are distributed, and how balanced the number of values in every bucket is. Since you're using String, rest assured: HashMap will be "Always Accurate".
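To make this concrete, here is a small sketch using two strings that are known to share a hash code, "Aa" and "BB". CollisionDemo is a hypothetical name for this illustration. Because HashMap falls back to equals() when hash codes match, each key still maps to its own value.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Builds a map whose two keys have identical hash codes.
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println("Aa".hashCode() == "BB".hashCode()); // the hash codes collide
        System.out.println(map.get("Aa"));                      // still the value stored for "Aa"
        System.out.println(map.get("BB"));                      // still the value stored for "BB"
    }
}
```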

Can I have HashSets as the keys in a HashMap? Suggest an alternative if not

Edit: explained the problem properly now.
I have a HashMap where I want to store sets of words seen together (key) and the lines in which they were seen together (value). This is the structure I came up with:
HashMap<HashSet<String>, HashSet<Integer>> hm= ...
for inputs:
mango, banana, apple
apple, banana
peach, walrus
walrus, peach
As I read this, line by line, I make new temporary keys (HashSets not yet inserted into the HashMap) from the combination of words in the line. Each temporary key is a HashSet of a subset of the words in the line. If a temporary key already exists in my HashMap, which I check by
if(hashmap.containsKey(hashset))
I simply add the new line to that key's corresponding value; if not, I make a new entry in the HashMap and take care of it.
At no point do I change an existing key. I only update their corresponding values in the HashMap.
My HashMap, at the end of reading the file, should look something like this:
[apple, banana]=[1,2]
[peach, walrus]=[3,4]
...
The problem is that the
if(hashmap.containsKey(hashset))
piece of code doesn't always detect existing keys. Why is this? Is this structure not allowed?
Thank you
This should work, but you need to watch out for mutability of the keys. If you ever change the contents of one of the keys, its hashcode will change, and your map will start doing strange things. From the javadoc for Map:
Note: great care must be exercised if mutable objects are used as map
keys. The behavior of a map is not specified if the value of an object
is changed in a manner that affects equals comparisons while the
object is a key in the map. A special case of this prohibition is that
it is not permissible for a map to contain itself as a key. While it
is permissible for a map to contain itself as a value, extreme caution
is advised: the equals and hashCode methods are no longer well defined
on such a map.
To avoid this, wrap the keys with Collections.unmodifiableSet() immediately upon creation, or just use ImmutableSet from Guava.
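A minimal sketch of that idea, using only the JDK. WordSetIndex, record and linesFor are hypothetical names, and for brevity the sketch keys each line by its full word set rather than by every subset as in the question. Note that Collections.unmodifiableSet() only returns a read-only view, so the sketch wraps a private copy that nothing else can reach.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WordSetIndex {
    private final Map<Set<String>, Set<Integer>> linesByWords = new HashMap<>();

    // Records that the given words were seen together on the given line.
    void record(int lineNumber, String... words) {
        // Private copy wrapped as unmodifiable: the key cannot change after insertion.
        Set<String> key = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(words)));
        linesByWords.computeIfAbsent(key, k -> new HashSet<>()).add(lineNumber);
    }

    // Looks up the lines for a word set; the order of the words does not matter.
    Set<Integer> linesFor(String... words) {
        Set<String> probe = new HashSet<>(Arrays.asList(words));
        return linesByWords.getOrDefault(probe, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        WordSetIndex index = new WordSetIndex();
        index.record(1, "mango", "banana", "apple");
        index.record(2, "apple", "banana");
        index.record(3, "peach", "walrus");
        index.record(4, "walrus", "peach");
        System.out.println(index.linesFor("peach", "walrus"));
        System.out.println(index.linesFor("banana", "apple"));
    }
}
```

The temporary probe set used for the lookup may be a plain HashSet, because it is never stored in the map.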
You can, but once you have added a HashSet as a key to a HashMap you shouldn't modify it again, as the HashSet.hashCode() might change and you'll never find your HashSet again. In other words, if you're doing something like that, be sure that your keys are immutable HashSets (see also Matt's answer here)
An alternative is to use the MultiKeyMap along with a MultiKey from commons collections
The problem you have is well explained by @Lukas and @Matt.
I think you could get away with extending HashSet, or using a decorator pattern, to create a set that overrides equals and hashCode in a way that is independent of the contents.
This way you can avoid introducing dependencies on third-party jars just for a specific problem.

Can hashCode() have dynamically changeable content?

In my implementation, I have a class A which overrides equals(Object) and hashCode(). But I have a small doubt: while adding the instance of A to a HashSet/HashMap, the value of hashCode() is x; after some time, the value of hashCode() for the same instance changes to y. Will it affect anything?
The hash code mustn't change after it's been added to a map / set. It's okay for it to change before that, although it generally makes the type easier to work with if it doesn't change.
If the hash code changes, the key won't be found in the map / set, as even if it ends up in the same bucket, the stored hash code will be checked first.
When the return value of hashCode() or equals() changes while the object is contained in HashMap/HashSet etc., the behavior is undefined (you could get all kinds of strange behavior). So one must avoid such mutation of keys while the object is contained in such collections etc.
It is considered best to use only immutable objects for keys (or for elements placed in a HashSet etc.). In fact, Python, for example, does not allow its built-in mutable containers (such as list or dict) to be used as dictionary keys, because they are unhashable. It is permitted and common to use mutable objects as keys in Java, but in such cases it is advisable to make such objects "effectively immutable", i.e. do not change the state of such objects at all after instantiation.
To give an example, using a list as a key in a Map is usually considered okay, but you should avoid mutating such lists at any point of your application to avoid getting bitten by nasty bugs.
As long as you don't change the return value of hashCode() and equals() while the objects are in the container, you should be ok on paper. But one could easily introduce nasty, hard to find bugs by mistake so it's better to avoid the situation altogether.
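One way to avoid the situation altogether is a key class that cannot change after construction. The sketch below is only an illustration: PersonKey, withLastName and renameKey are hypothetical names, and "changing" a key is modelled as removing the old entry and inserting a new one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical immutable key: final class, final fields, no setters.
final class PersonKey {
    private final String firstName;
    private final String lastName;

    PersonKey(String firstName, String lastName) {
        this.firstName = Objects.requireNonNull(firstName);
        this.lastName = Objects.requireNonNull(lastName);
    }

    // "Modification" returns a new object and leaves this one untouched.
    PersonKey withLastName(String newLastName) {
        return new PersonKey(firstName, newLastName);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PersonKey)) return false;
        PersonKey other = (PersonKey) o;
        return firstName.equals(other.firstName) && lastName.equals(other.lastName);
    }

    @Override
    public int hashCode() { return Objects.hash(firstName, lastName); }
}

class ImmutableKeyDemo {
    // Moves the value stored under oldKey to newKey, if oldKey is present.
    static <V> void renameKey(Map<PersonKey, V> map, PersonKey oldKey, PersonKey newKey) {
        if (map.containsKey(oldKey)) {
            map.put(newKey, map.remove(oldKey));
        }
    }

    public static void main(String[] args) {
        Map<PersonKey, String> phones = new HashMap<>();
        PersonKey john = new PersonKey("John", "Johnson");
        phones.put(john, "555-5555");
        renameKey(phones, john, john.withLastName("Smith"));
        System.out.println(phones.get(new PersonKey("John", "Smith")));
    }
}
```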
Yes, the hash code of an object must not change during its lifetime. If it does, you need to notify the container (if that's possible); otherwise you can get wrong results.
Edit: As pointed out, it depends on the container. Obviously, if the container never uses your hashCode or equals methods, nothing will go wrong. But as soon as it tries to compare things for equality (all maps and sets), you'll get yourself in trouble.
Yes. Many people have answered the question here; I just want to add an analogy. A hash code is something like an address in a hash-based collection:
Imagine you check in to a hotel under the name "Mike", and afterwards you change your name to "GreatMike" on the check-in form. Then when someone looks for you under the name "Mike", they cannot find you anymore.

Use a Java hash map even when there is no "mapping"?

I want to store some objects and then be able to retrieve them later as efficiently as possible. I will also remove some of them under certain conditions. It seems a hash map would be the right choice.
But, from what I've seen, hash maps always associate a value with another? For example, "john" and "555-5555", his phone number.
Now, my situation. Suppose I have a bunch of people, and each person is connected to other people. So, I need each person to store its contacts.
What I'm doing is have each person have a hashmap, and then I'd add to the hash otherPerson, otherPerson. Basically, the key is the value. Am I doing it wrong?
EDIT I don't think the HashSet would solve my problem because I have to retrieve the value to update it and there is no get method. Remove returns a boolean, so I can't even remove it to put it back again, which would probably be a bad idea anyway.
If all you need is to check whether A is one of B's contacts, then Set is the right choice. It has contains() for that purpose.
Otherwise, the most suitable might be Map, as you need an efficient retrieval operation. You said you currently use the same object as key and value, but I'm not sure how you get the key in the first place. Say you'd like to get contact A from B's contacts, and you use something like 'B.contacts.get(A)': where do you get A from? If you already have A, why get it from the map again? (Maybe there are multiple instances of the same person?)
Unless there are multiple instances of the same person, I'd say for each Person, define an ID-like unique attribute, and use that as the key for the contacts map. Also, do you define equals()/hashCode() for the Person class? Map/Set uses hashCode() and equals() for finding the match. Depending on your usage, you might need to consider rewriting them for efficiency.
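A minimal sketch of that suggestion, assuming an int id and a phone number as the detail that gets updated. The field and method names here are made up for illustration and are not taken from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Person whose identity is an immutable id.
class Person {
    private final int id;
    private String phone; // mutable detail, deliberately not part of equals()/hashCode()
    private final Map<Integer, Person> contacts = new HashMap<>();

    Person(int id, String phone) {
        this.id = id;
        this.phone = phone;
    }

    int getId() { return id; }
    String getPhone() { return phone; }
    void setPhone(String phone) { this.phone = phone; }

    void addContact(Person other) { contacts.put(other.getId(), other); }
    Person getContact(int contactId) { return contacts.get(contactId); }
    boolean removeContact(int contactId) { return contacts.remove(contactId) != null; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Person && ((Person) o).id == id;
    }

    @Override
    public int hashCode() { return Integer.hashCode(id); }

    public static void main(String[] args) {
        Person alice = new Person(1, "555-0001");
        Person bob = new Person(2, "555-0002");
        alice.addContact(bob);
        alice.getContact(2).setPhone("555-9999"); // retrieve by id, then update in place
        System.out.println(bob.getPhone());
    }
}
```

Because the map key is the id, updating a contact's phone number never touches anything the map hashes on.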
I don't think the HashSet would solve my problem because I have to retrieve the value to update it and there is no get method.
This is a puzzling statement. Why would you want to retrieve a value using a get method to update it? Surely, if you know which object you need to retrieve from the set/map, you don't need to retrieve it.
For example:
HashSet<Person> relations = ...
Person p = ...
if (relations.remove(p)) {
// we removed an object such that p.equals(obj) is true.
}
Now if you are worried that the object that was removed was equal to, but not identical to p, it seems to me that something is wrong with your design. Either:
you should not be creating multiple Person instances that are equal, or
you should not be caring that Person instances are not identical, or
you should not have overridden equals(Object).
In short, the problem is that you are not managing object identity properly.
Well, the data structure you'd be looking for here, would be a HashSet (or some other kind of set), I think (if your framework/library offers it). A set just says "I have the following items" instead of "I have the following items mapped to the following values". Which would be what you're modeling here.
As for HashSet vs. other implementations (if present): That all depends on what you're doing. If you need fast lookup, i. e. "is this element in the set?" questions, then hashing is a good thing. Other underlying data structures are perhaps better optimized for other set operations, such as union, intersection, etc.
A hash table/map simply requires that you have a way to get the values you're interested in looking up later; that's what the key is for.
However, in your specific case, it sounds like you're looking for a way to store relationships between people, and what you're keeping track of is whether or not person A has a relationship with person B. A better representation for that sort of thing is an adjacency list.
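As a rough sketch of what an adjacency list could look like here, the class below keeps, for each person, the set of people they are connected to. ContactGraph and its methods are hypothetical names, people are identified by plain strings, and relationships are assumed to be symmetric.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ContactGraph {
    // person -> the set of people that person is connected to
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    // Adds a symmetric relationship between a and b.
    void connect(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Removes the relationship in both directions, if it exists.
    void disconnect(String a, String b) {
        adjacency.getOrDefault(a, Collections.<String>emptySet()).remove(b);
        adjacency.getOrDefault(b, Collections.<String>emptySet()).remove(a);
    }

    boolean connected(String a, String b) {
        return adjacency.getOrDefault(a, Collections.<String>emptySet()).contains(b);
    }

    public static void main(String[] args) {
        ContactGraph graph = new ContactGraph();
        graph.connect("john", "mary");
        System.out.println(graph.connected("mary", "john"));
        graph.disconnect("john", "mary");
        System.out.println(graph.connected("john", "mary"));
    }
}
```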
Am I missing something or don't you simply need an ArrayList<Person>?
I would just store the contacts in a List<Person>. E.g.
public class Person {
private List<Person> contacts;
}
With regard to editing the individual contact, it is really not the parent Person's responsibility to do that. It should at most add/remove contacts. You can do that perfectly well with contacts.add(otherPerson) or contacts.remove(otherPerson).
When you want to edit an individual Person, which may be one of the contacts, just get a handle to it independently, e.g. personDAO.find(personId), and then update it accordingly. It's actually also the Person's own responsibility to edit its own details. With a good ORM under the hood, the changes will be reflected in the contact list of other Persons.
If you need to iterate through the people, or require them to have ordering, consider TreeMap or TreeSet instead of hashing.
