Comparing key elements of a Map using their hash codes [closed] - java

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I want to compare the elements of a map using HashCode. Is it possible to do so?
For example, my HashMap looks like this:
HashMap<Integer,String> map=new HashMap<Integer, String>();
map.put(123,"ABC");
map.put(345,"Abc");
map.put(245,"abc");
I assume that all the values must have the same hash code, so that I can compare them and get all the keys (123, 345, 245).
Is my assumption correct? Can I use hash codes in order to compare the keys?

It seems to me that what you really want is to make the name the key of the map and the corresponding set of telephone numbers the value.
You also want your keys to be case-insensitive. String instances that differ in case have different hash codes, so you cannot use them as keys as they are. What you need to do is transform names into a canonical form, say all lowercase, when accessing the map, so that differences in case are no longer relevant.
There are a few ways to go about this; extending HashMap to suit your needs is an elegant one.
It is better to store phone numbers as Strings, since they often contain non-numeric characters.
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;

public class PhoneBook extends HashMap<String, Set<String>> {
    public PhoneBook() { }
    public PhoneBook(int initialCapacity) { super(initialCapacity); }

    // Use this method to add numbers to the phone book.
    // Returns true if the phone directory changed as a result of the call.
    public boolean add(String name, String number) {
        String canonicalName = name.toLowerCase();
        // Look up by the canonical (lower-case) name, the same form used for storage.
        Set<String> existingNumbers = super.get(canonicalName);
        if (existingNumbers == null)
            // Give an estimated capacity per name, in this example 10.
            super.put(canonicalName, existingNumbers = new HashSet<>(10));
        return existingNumbers.add(number);
    }

    @Override
    public Set<String> put(String name, Set<String> numberSet) {
        throw new UnsupportedOperationException("you must use add(String, String) to add numbers");
    }

    // HashMap.get takes an Object, so the override has to use that parameter type.
    @Override
    public Set<String> get(Object name) {
        if (!(name instanceof String)) return Collections.emptySet();
        String canonicalName = ((String) name).toLowerCase();
        Set<String> existingNumbers = super.get(canonicalName);
        return existingNumbers == null ? Collections.<String>emptySet() : existingNumbers;
    }
}
You may need to override some other operations from Map/HashMap (for example containsKey and remove) to make sure consistency is preserved.

You don't need to do that explicitly.
HashMap already hashes its keys.
Just invoke: map.keySet() to get the Set of your keys.
Example
HashMap<Integer,String> map = new HashMap<Integer, String>();
map.put(123,"ABC");
map.put(345,"Abc");
map.put(245,"abc");
System.out.println(map.keySet());
Output
[245, 123, 345]
What if I need the keys sorted (here, in their natural order)?
Two solutions.
Use a TreeMap instead
Map<Integer,String> map = new TreeMap<Integer, String>();
Wrap the Set around a TreeSet later on
System.out.println(new TreeSet<Integer>(map.keySet()));

There are different ways to achieve this, i.e. a single key mapped to multiple values:
Using a HashMap combined with an ArrayList
Using Guava's Multimap
Using the MultiMap provided by Apache Commons Collections
Here is a detailed explanation:
http://java.dzone.com/articles/hashmap-%E2%80%93-single-key-and
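A minimal sketch of the first option (a HashMap whose values are lists), applied to the data from the question. The names KeysByValue and groupKeysIgnoringCase are hypothetical, and lower-casing the value with Locale.ROOT is an assumption about what "the same value" should mean here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class KeysByValue {
    // Groups the keys of the input map under the lower-cased form of their value.
    static Map<String, List<Integer>> groupKeysIgnoringCase(Map<Integer, String> source) {
        Map<String, List<Integer>> grouped = new HashMap<>();
        for (Map.Entry<Integer, String> e : source.entrySet()) {
            String canonical = e.getValue().toLowerCase(Locale.ROOT);
            // Create the list on first use, then append the key to it.
            grouped.computeIfAbsent(canonical, k -> new ArrayList<>()).add(e.getKey());
        }
        return grouped;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        map.put(123, "ABC");
        map.put(345, "Abc");
        map.put(245, "abc");
        // All three keys end up in the list stored under "abc";
        // the order inside the list follows HashMap iteration order.
        System.out.println(groupKeysIgnoringCase(map));
    }
}
```

No hash codes are compared explicitly: the map does the hashing, and the grouping depends only on the canonical value.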

Your assumption is not correct.
From Wikipedia:
In the Java programming language, every class implicitly or explicitly
provides a hashCode() method, which digests the data stored in an
instance of the class into a single hash value (a 32-bit signed
integer). This hash is used by other code when storing or
manipulating the instance (...). This property is important to the
performance of hash tables and other data structures that store
objects in groups ("buckets") based on their computed hash values.
Now, each of your instances has a hash code that may be the same as or different from the hash code of other objects. The Map uses the hash code of your keys (via the hashCode() method) to store data in buckets internally (if you want more details on this, look here).
As mentioned above, this does not guarantee that objects stored in the same bucket have the same hash code, nor that different objects have different hash codes.
UPDATE: Why do you want to compare the objects using their hash codes? If you want to compare the keys or the values of the map, I suggest that you compare the values or objects themselves by iterating over them (you can get them with map.keySet() or map.values() respectively) rather than using hash codes. You cannot be sure whether the value from hashCode() will be the same or different for different objects.
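To make the point concrete, here is a small sketch showing that equal hash codes do not imply equal objects, and that the three values from the question do not share a hash code anyway. The class and method names are made up for illustration.

```java
class HashCodeVsEquals {
    // True when two strings share a hash code although they are not equal.
    static boolean sameHashButDifferent(String a, String b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        // The values from the question differ in case, so they are not equal,
        // and their hash codes differ as well.
        System.out.println("ABC".hashCode() == "abc".hashCode()); // false
        // "Aa" and "BB" are different strings that share the hash code 2112.
        System.out.println(sameHashButDifferent("Aa", "BB"));     // true
        // A case-insensitive comparison has to be done on the values themselves.
        System.out.println("ABC".equalsIgnoreCase("abc"));        // true
    }
}
```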

Related

Java - multiple hashmaps pointing to the same key

I have multiple files which contains key=value string pairs. The keys are the same between the files, but the values differs. Each file can have 1000 plus of such pairs.
I want to store each file in a separate hashmap, ie map<KeyString, ValueString>, so if there are five files, then there will be five hashmaps.
To avoid duplicating the keys across each hashmap, is it possible to have each map reference the same key? Note that once the keys are added to the map, it will not be deleted.
I considered making the first file the 'base' as in the flyweight pattern, this base would be the intrinsic set of keys/values. The other remaining files would be the extrinsic set of values, but I don't know how to relate the values back to the base (intrinsic) keys without duplicating the keys?
I am open to a simpler/better approach.
I can think of a simpler approach. Instead of a Map<String, String>, consider a Map<String, List<String>> or Guava's Multimap<String, String>.
If each key is in each file and all have values, you could store values from first file at 0th index, from the second at 1st index etc.
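The index-per-file idea could look roughly like the sketch below. ValuesByFileIndex is a hypothetical class, and it assumes the number of files is known up front; slots for files that have no value for a key stay null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ValuesByFileIndex {
    // One map entry per key; the list position identifies the file the value came from.
    private final Map<String, List<String>> values = new HashMap<>();
    private final int fileCount;

    ValuesByFileIndex(int fileCount) {
        this.fileCount = fileCount;
    }

    void put(int fileIndex, String key, String value) {
        // The key object is stored once, no matter how many files contain it.
        List<String> perFile = values.computeIfAbsent(
                key, k -> new ArrayList<>(Collections.<String>nCopies(fileCount, null)));
        perFile.set(fileIndex, value);
    }

    String get(int fileIndex, String key) {
        List<String> perFile = values.get(key);
        return perFile == null ? null : perFile.get(fileIndex);
    }
}
```

With five files this keeps a single map of keys and one five-element list per key, instead of five maps that each hold their own copy of every key.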
If that doesn't work, I recommend a Collection<Map<String, String>>, so you can iterate over your Maps. Then, when you want to add a value to one of the Maps, go through all the keySets and, if one of them contains that key, put the value using the key object from that keySet.
Another solution would be to keep a HashSet of the keys that have already been put. This would be more efficient.
After reading in the keys, you can use String.intern().
When called, what it does is either:
add the String to the internal pool if it didn't exist already;
return the equivalent String from the pool if it already existed.
String#intern Javadoc
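A short sketch of how interning makes separate maps share one key object. The helper putInterned and the sample key are hypothetical; new String(...) is used only to simulate keys that were parsed independently from two files.

```java
import java.util.HashMap;
import java.util.Map;

class InternedKeys {
    // Stores the value under the pooled instance of the key.
    static void putInterned(Map<String, String> map, String key, String value) {
        map.put(key.intern(), value);
    }

    public static void main(String[] args) {
        Map<String, String> file1 = new HashMap<>();
        Map<String, String> file2 = new HashMap<>();
        putInterned(file1, new String("timeout"), "30");
        putInterned(file2, new String("timeout"), "60");
        String k1 = file1.keySet().iterator().next();
        String k2 = file2.keySet().iterator().next();
        // true: both maps reference the same String object
        System.out.println(k1 == k2);
    }
}
```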
First of all, I don't see the problem with storing multiple instances of your String keys. 5 HashMaps * 1000 keys is a very small number, and you shouldn't have memory issues.
That said, if you still want to avoid duplicating the Strings, you can create the first HashMap and then reuse the exact same key objects for the other HashMaps.
For example, suppose map1 is the first HashMap and it is already populated with the contents of the first file.
You can write something like this to populate the 2nd HashMap:
for (String key : map1.keySet()) {
map2.put (key, someValue);
}
Of course you will have to find for each key of the first map the corresponding value of the second map. If the keys are not stored in the same order in the input files, this may require some preliminary sorting step.
Perhaps you could hold a static Map<> to map your keys to unique Integers and use those Integers for the keys to your map?
Something like:
class KeySharedMap<K,V> {
// The next key to use. Using Atomics for the auto-increment.
static final AtomicInteger next = new AtomicInteger(0);
// Static mapping of keys to unique Integers.
static final ConcurrentMap<Object,Integer> keys = new ConcurrentHashMap<>();
// The map indexed by Integer from the `keys`.
Map<Integer, V> map = new HashMap<>();
public V get(Object key) {
return map.get(keys.get(key));
}
public V put(Object key, V value) {
// Associate a unique integer for each unique key.
keys.computeIfAbsent(key,x -> next.getAndIncrement());
// Put it in my map.
return map.put(keys.get(key),value);
}
}
Yes, I realise that K is not used here but I suspect it would be necessary if you wish to implement Map<K,V>.

Is there an efficient way of checking if HashMap contains keys that map to the same value?

I basically need to know if my HashMap has different keys that map to the same value. I was wondering if there is a way other than checking each keys value against all other values in the map.
Update:
Just some more information that will hopefully clarify what I'm trying to accomplish. Consider the String "azza". Say that I'm iterating over this String and storing each character as a key, and its corresponding value is some other String. Let's say I eventually get to the last occurrence of 'a' and the value is already in the map. This would be fine if the key corresponding to the value that is already in the map is also 'a'. My issue occurs when 'a' and 'z' both map to the same value, that is, only when different keys map to the same value.
Sure, the fastest to both code and execute is:
boolean hasDupeValues = new HashSet<>(map.values()).size() != map.size();
which executes in O(n) time.
Sets don't allow duplicates, so the set will be smaller than the values list if there are dupes.
Very similar to EJP's and Bohemian's answer above but with streams:
boolean hasDupeValues = map.values().stream().distinct().count() != map.size();
You could create a HashMap that maps values to lists of keys. This would take more space and require (slightly) more complex code, but with the benefit of greatly higher efficiency (amortized O(1) vs. O(n) for the method of just looping all values).
For example, say you currently have HashMap<Key, Value> map1, and you want to know which keys have the same value. You create another map, HashMap<Value, List<Key>> map2.
Then you just modify map1 and map2 together.
map1.put(key, value);
if(!map2.containsKey(value)) {
map2.put(value, new ArrayList<Key>());
}
map2.get(value).add(key);
Then to get all keys that map to value, you just do map2.get(value).
If you need to put/remove in many different places, to make sure that you don't forget to use map2 you could create your own data structure (i.e. a separate class) that contains 2 maps and implement put/remove/get/etc. for that.
Edit: I may have misunderstood the question. If you don't need an actual list of keys, just a simple "yes/no" answer to "does the map already contain this value?", and you want something better than O(n), you could keep a separate HashMap<Value, Integer> that simply counts up how many times the value occurs in the map. This would take considerably less space than a map of lists.
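A rough sketch of the counting variant described in the edit above. ValueCountingMap is a hypothetical wrapper; it assumes all writes go through it so that the counters stay in sync with the main map.

```java
import java.util.HashMap;
import java.util.Map;

class ValueCountingMap<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Map<V, Integer> valueCounts = new HashMap<>();

    // Returns true if, after this put, more than one key maps to the given value.
    boolean put(K key, V value) {
        if (map.containsKey(key)) {
            decrement(map.get(key)); // the old value loses one key
        }
        map.put(key, value);
        return valueCounts.merge(value, 1, Integer::sum) > 1;
    }

    void remove(K key) {
        if (map.containsKey(key)) {
            decrement(map.remove(key));
        }
    }

    private void decrement(V value) {
        // Drop the counter entirely once it reaches zero.
        valueCounts.computeIfPresent(value, (v, n) -> n == 1 ? null : n - 1);
    }
}
```

Re-putting the same key with the same value does not count as a duplicate, which matches the "azza" case where the second 'a' may map to the value the first 'a' already has.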
You can check whether a map contains a value already by calling map.values().contains(value). This is not as efficient as looking up a key in the map, but still, it's O(n), and you don't need to create a new set just in order to count its elements.
However, what you seem to need is a BiMap. There is no such thing in the Java standard library, but you can build one relatively easily by using two HashMaps: one which maps keys to values and one which maps values to keys. Every time you map a key to a value, you can then check in amortized O(1) whether the value already is mapped to, and if it isn't, map the key to the value in the one map and the value to the key in the other.
If it is an option to create a new dependency for your project, some third-party libraries contain ready-made bimaps, such as Guava (BiMap) and Apache Commons (BidiMap).
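If adding a dependency is not an option, the two-HashMap approach might be sketched as follows. SimpleBiMap and putIfValueFree are hypothetical names, and rejecting the put (rather than throwing or overwriting) when the value belongs to another key is a design assumption, not the behaviour of Guava's BiMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class SimpleBiMap<K, V> {
    private final Map<K, V> forward = new HashMap<>();
    private final Map<V, K> backward = new HashMap<>();

    // Maps key to value unless the value already belongs to a different key.
    // Returns false (and changes nothing) in that case.
    boolean putIfValueFree(K key, V value) {
        if (backward.containsKey(value) && !Objects.equals(backward.get(value), key)) {
            return false;
        }
        if (forward.containsKey(key)) {
            backward.remove(forward.get(key)); // release the key's previous value
        }
        forward.put(key, value);
        backward.put(value, key);
        return true;
    }

    V get(K key) {
        return forward.get(key);
    }

    K keyOf(V value) {
        return backward.get(value);
    }
}
```

Both lookups are single hash lookups, so the check "is this value already mapped by another key?" is amortized O(1).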
You could iterate over the keys and save the current value in the Set.
But, before inserting that value in a Set, check if the Set already contains that value.
If this is true, it means that a previous key already contains the same value.
Map<Integer, String> map = new HashMap<>();
Set<String> values = new HashSet<>();
Set<Integer> keysWithSameValue = new HashSet<>();
for(Integer key : map.keySet()) {
if(values.contains(map.get(key))) {
keysWithSameValue.add(key);
}
values.add(map.get(key));
}

fast static key-value mapping

I have a set of unique key-value pairs; both key and value are strings. The number of pairs is very large and finding the value of a certain string is extremely time-critical.
The pairs are computed beforehand and are given for a certain program. So I could just write a method containing:
public String getValue(String key)
{
    // repeat for every pair
    if (key.equals("abc"))
    {
        return "def";
    }
    return null; // no matching pair
}
but I am talking about more than 250,000 pairs, and perhaps sorting them could be faster...
I have a class that contains the getValue() method and can use its constructor, but it has no connection to any database/file system etc. So every pair has to be defined inside the class.
Do you have any ideas that could be faster than a huge series of if-statements? Perhaps using a sorted map that gets the pairs presorted? Perhaps improving constructor time by deserializing an already created map?
I would like your answers to contain a basic code example of your approach. I will comment on answers with the time they took on a set of pairs!
Time-frame: for one constructor call and 20 calls of getValue() 1000 milliseconds.
Keys have a size of 256 and values have a size < 16
This is exactly what a hash table is made for. It provides O(1) lookup if implemented correctly, which means that as long as you have enough memory and your hash function and collision strategy are good, it can get values for keys in constant time on average. Java has multiple hash-backed data structures; from the sound of things, a HashMap<String, String> would do the trick for you.
You can construct it like this:
Map<String, String> myHashTable = new HashMap<String, String>();
add values like this:
myHashTable.put("abcd","value corresponding to abcd");
and get the value for a key like this:
myHashTable.get("abcd");
You were on the right track when you had the intuition that running through all of the keys and checking was not efficient. That would be an O(n) approach, since you'd have to run through all n elements.

Comparing TreeMap contents gives incorrect answer

I use a TreeMap as a 'key' inside another TreeMap
ie
TreeMap<TreeMap<String, String>, Object>
In my code 'object' is a personal construct, but for this instance I have used a string.
I have created a pair of TreeMaps to test the TreeMap compareTo() and hashCode() methods. This starts with the following...
public class TreeMapTest {
public void testTreeMap()
{
TreeMap<String, String> first = new TreeMap<String, String>();
TreeMap<String, String> second = new TreeMap<String, String>();
first.put("one", "une");
first.put("two", "deux");
first.put("three", "trois");
second.put("une", "one");
second.put("deux", "two");
second.put("trois", "three");
TreeMap<TreeMap<String, String>, String> english = new TreeMap<TreeMap<String, String>, String>();
TreeMap<TreeMap<String, String>, String> french = new TreeMap<TreeMap<String, String>, String>();
english.put(first, "english");
french.put(second, "french");
From here I now call the english item to see if it contains the key
if (english.containsKey(second))
{
System.out.println("english contains the key");
//throws error of ClassCastException: Java.util.TreeMap cannot be cast to
//Java.Lang.Comparable, reading the docs suggests this is the feature if the key is
//not of a supported type.
//this error does not occur if I use a HashMap structure for all maps, why is
//this key type supported for one map structure but not another?
}
However, I should note that both HashMap and TreeMap point to the same hashCode() method in the AbstractMap parent.
My first thought was to convert my TreeMap to a HashMap, but this seemed a bit sloppy! So I decided to apply the hashCode() method to the two TreeMap objects.
int hc1 = first.hashCode();
int hc2 = second.hashCode();
if(hc1 == hc2)
{
System.out.println("values are equal " + hc1 + " " + hc2);
}
prints the following
values are equal 3877431 & 3877431
For me the hashcode should be different as the key values are different, I can't find details on the implementation difference of the hashCode() method between HashMap and TreeMap.
Please note the following.
Changing only the keys to HashMap doesn't stop the ClassCastException. Changing all the maps to HashMap does. So there is something with the containsKey() method in TreeMap that isn't working properly, or I have misunderstood. Can anyone explain what?
The section where I get the hashCode of the first and second map objects always produces the same output (no matter whether I use a HashMap or a TreeMap here). However, the if (english.containsKey(second)) doesn't print any message when HashMaps are used, so there is obviously something in the HashMap implementation that is different for the compareTo() method.
My principal questions are:
Where can I find details of the types of keys for use in TreeMap objects (to prevent future ClassCastException errors)?
If I can't use a certain type of object as a key, why am I allowed to insert it as a key into the TreeMap in the first place? (Surely if I can insert it I should be able to check whether the key exists?)
Can anyone suggest another construct that has ordered insert/retrieval to replace my TreeMap key objects?
Or have I potentially found strange behaviour? From my understanding I should be able to do a drop-in replacement of TreeMap for HashMap, or have I stumbled upon a fringe scenario?
Thanks in advance for your comments.
David.
P.S. The problem isn't a problem in my code, as I use a personal utility to create a hash that depends on the key and value pairs (i.e. I calculate key hash values differently from value hash values... sorry if that is a confusing sentence!). I assume that the hashCode method just sums all the values together without considering whether an item is a key or a value.
pps. I'm not sure if this is a good question or not, any pointers on how to improve it?
Edit.
from the responses people seem to think I'm doing some sort of fancy language dictionary stuff, not a surprise from my example, so sorry for that. I used this as an example as it came easily to my brain, was quick to write and demonstrated my question.
The real problem is as follows.
I'm accessing a legacy DB structure, and it doesn't talk nicely to anything (result sets aren't forward and reverse readable etc). So I grab the data and create objects from them.
The smallest object represents a single row in a table (this is the object for which I used the string value 'english' or 'french' in the above example).
I have a collection of these rowObjects; each row has an obvious key (this is the TreeMap that points to the related rowObject).
I don't know if that makes things any clearer!
Edit 2.
I feel I need to elaborate a little further on my choice of originally using
HashMap<HashMap<String, String>, dataObject>
for my data structure, then converting to TreeMap to gain an ordered view.
In edit 1 I said that the legacy DB doesn't play nicely (this is an issue with the JDBC.ODBC, I suspect, and I'm not about to acquire a JDBC to communicate with the DB). The truth is that I apply some modifications to the data as I create my Java 'dataObject'. This means that although the DB may spit out the results in ascending or descending order, I have no way of knowing in what order they are inserted into my dataObject. Using a LinkedHashMap seems like a nice solution (see duffymo's suggestion), but I later need to extract the data in an ordered fashion, not just consecutively (LinkedHashMap only preserves insertion order), and I'm not inclined to mess around with ordering everything and making copies when I need to insert a new item between two others. TreeMap would do this for me... but if I create a specific object for the key, it will simply contain a TreeMap as a member, and obviously I will then need to supply a compareTo and a hashCode method. So why not just extend TreeMap (although duffymo has a point about throwing that solution out)!
This is not a good idea. Map keys must be immutable to work properly, and yours are not.
What are you really trying to do? When I see people doing things like this with data structures, it makes me think that they really need an object but have forgotten that Java's an object-oriented language.
Looks like you want a crude dictionary to translate between languages. I'd create a LanguageLookup class that embedded those Maps and provide some methods to make it easier for users to interact with it. Better abstraction and encapsulation, more information hiding. Those should be your design objectives. Think about how to add other languages besides English and French so you can use it in other contexts.
public class LanguageLookup {
private Map<String, String> dictionary;
public LanguageLookup(Map<String, String> words) {
this.dictionary = ((words == null) ? new HashMap<String, String>() : new HashMap<String, String>(words));
}
public String lookup(String from) {
return this.dictionary.get(from);
}
public boolean hasWord(String word) {
return this.dictionary.containsKey(word);
}
}
In your case, it looks like you want to translate an English word to French and then see if the French dictionary contains that word:
Map<String, String> englishToFrenchWords = new HashMap<String, String>();
englishToFrenchWords.put("one", "une");
Map<String, String> frenchToEnglishWords = new HashMap<String, String>();
frenchToEnglishWords.put("une", "one");
LanguageLookup englishToFrench = new LanguageLookup(englishToFrenchWords);
LanguageLookup frenchToEnglish = new LanguageLookup(frenchToEnglishWords);
String french = englishToFrench.lookup("one");
boolean hasUne = frenchToEnglish.hasWord(french);
Your TreeMap is not Comparable, so you can't add it to a SortedMap, and it's not immutable, so it is not safe to use as a HashMap key. You could use an IdentityHashMap, but I suspect an EnumMap is a better choice.
enum Language { ENGLISH, FRENCH }
Map<Language, Map<Language, Map<String, String>>> dictionaries =
new EnumMap<>(Language.class);
Map<Language, Map<String, String>> fromEnglishMap = new EnumMap<>(Language.class);
dictionaries.put(Language.ENGLISH, fromEnglishMap);
fromEnglishMap.put(Language.FRENCH, first);
Map<Language, Map<String, String>> fromFrenchMap = new EnumMap<>(Language.class);
dictionaries.put(Language.FRENCH, fromFrenchMap);
fromFrenchMap.put(Language.ENGLISH, second);
Map<String, String> fromEnglishToFrench= dictionaries.get(Language.ENGLISH)
.get(Language.FRENCH);
As to why HashMap works and TreeMap does not:
A TreeMap is a "sorted map", meaning that the entries are sorted according to the key. This means that the key must be comparable, by implementing the Comparable interface (or the TreeMap must be given a Comparator). Maps usually do NOT implement this, and I would highly suggest you do not create a custom type to add this feature. As duffymo mentions, using maps as keys is a BAD idea.
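Both observations from the question can be reproduced with a short sketch. It also shows why the two hash codes match: per the Map.Entry contract an entry's hash code is the key's hash XOR the value's hash, and AbstractMap.hashCode() sums the entry hashes, so swapping keys and values leaves the total unchanged. MapKeyDemo and rejectedAsTreeMapKey are hypothetical names.

```java
import java.util.TreeMap;

class MapKeyDemo {
    // True if using the given map as a TreeMap key is rejected with a ClassCastException.
    static boolean rejectedAsTreeMapKey(TreeMap<String, String> key) {
        TreeMap<TreeMap<String, String>, String> outer = new TreeMap<>();
        try {
            outer.put(key, "value");  // Java 7 and later check the key type on the first put
            outer.containsKey(key);   // older versions fail only once a comparison is needed
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        TreeMap<String, String> first = new TreeMap<>();
        first.put("one", "une");
        TreeMap<String, String> second = new TreeMap<>();
        second.put("une", "one");
        System.out.println(rejectedAsTreeMapKey(first));           // true
        System.out.println(first.hashCode() == second.hashCode()); // true
        System.out.println(first.equals(second));                  // false
    }
}
```

Equal hash codes with equals() returning false is legal: a HashMap keyed by these maps still treats them as different keys, which is why containsKey(second) finds nothing there.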

java hashtable with collision resolution

I want to get all the values (multiple) of a particular key, but I'm getting only one value. I don't know how to print all the values. It would be a great help if someone could correct the code; I did not get any help from a Google search.
import java.util.*;
public class hashing
{
public static void main(String args[])
{
String[] ary=new String[4];
String key;
char[] chrary;
ary[0]=new String("abcdef");
ary[1]=new String("defabc");
ary[2]=new String("ghijkl");
ary[3]=new String("jklghi");
Hashtable<String, String> hasht = new Hashtable<String, String>();
for(int i=0;i<4;i++){
chrary=ary[i].toCharArray();
Arrays.sort(chrary);
key=new String(chrary);
hasht.put(key,ary[i]);
}
Enumeration iterator = hasht.elements();
while(iterator.hasMoreElements()) {
String temp = (String)iterator.nextElement();
System.out.println(temp);
}
}
}
PS: the output is defabc jklghi. I want abcdef defabc ghijkl jklghi.
Hashtables can only contain one value per key. To store multiple values, you should either
Store a collection (e.g. List<String> or array) per key. Note that you'll have to initialise the collection prior to insertion of the first value corresponding to that key
Use a MultiMap
Note that many MultiMap implementations exist. The Oracle docs provide a simple implementation too (see here, and search for MultiMap)
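A sketch of the first option (a collection per key) applied to the anagram example from the question. It uses the standard Collectors.groupingBy collector rather than a Hashtable, which is a substitution made here for brevity; AnagramGroups and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AnagramGroups {
    // Sorts the letters of a word so that all anagrams share one key.
    static String sortedLetters(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Collects every word under its sorted-letter key instead of overwriting.
    static Map<String, List<String>> group(String... words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(AnagramGroups::sortedLetters,
                                               TreeMap::new,
                                               Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(group("abcdef", "defabc", "ghijkl", "jklghi"));
        // prints {abcdef=[abcdef, defabc], ghijkl=[ghijkl, jklghi]}
    }
}
```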
The way HashMaps work is that there is only one value for a given key. So if you call:
map.put(key, value1);
map.put(key, value2);
the second line will override the value corresponding to the key.
Regarding your comment about collisions, that means something different. Internally, a HashMap stores the key/value pairs in buckets that are chosen based on the hash code of the key (hence the name: hash map). In the case (of low probability if the hash function is good) where two non-equal keys have the same hash code, the implementation needs to make sure that querying the map with either of those keys returns the correct value. That is where hash collisions need to be handled.
That's not what collision resolution is meant to do. Collision resolution lets you handle the case where two objects with different keys would go into the same "bucket" in the hash map. How this resolution happens is an internal detail of the hash map implementation, not something that would be exposed to you.
Actually, in your case, it's not a collision; it's the same key with the same hash code. In general, a collision occurs only when two different keys generate the same hash code. Frequent collisions can be the result of a poor hashCode() implementation.
Yes, java.util.HashMap handles hash collisions. If you look at the source code of HashMap, each bucket holds a linked list of entries. That means that if two different keys with the same hash code come in, both entries go into the same bucket, but as two different nodes in the linked list.
Found this link online, which explain How hash map works in detail.
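The difference between a collision and a repeated key can be seen with a key type whose hash code is deliberately constant. BadKey is a made-up class for this sketch only.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // A key type whose hash code is deliberately constant, so every instance collides.
    static final class BadKey {
        final String name;

        BadKey(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return 42;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
    }

    public static void main(String[] args) {
        Map<BadKey, String> map = new HashMap<>();
        map.put(new BadKey("a"), "first");
        map.put(new BadKey("b"), "second");   // same hash, different key: kept separately
        map.put(new BadKey("a"), "replaced"); // equal key: the value is overwritten
        System.out.println(map.size());               // 2
        System.out.println(map.get(new BadKey("a"))); // replaced
        System.out.println(map.get(new BadKey("b"))); // second
    }
}
```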
If the key is the same, the value will be updated. The map will not add a second key/value pair for the same key.
Your Hashtable<String, String> maps one string to one string. So put replaces the value that was previously linked to a specific key.
If you want multiple values, you can make a Hashtable<String, String[]> or a Hashtable<String, List<String>>.
A cleaner solution would be to use Google's Multimap, which lets you associate multiple values with one key:
A collection similar to a Map, but which may associate multiple values
with a single key. If you call put(K, V) twice, with the same key but
different values, the multimap contains mappings from the key to both
values.
You are only putting one String for each key:
hasht.put(key,ary[i]);
So if i=1 that means you put defabc, why do you expect to get multiple values for same key?
Hashtable, like all Maps, keeps only one value per key: the last value you set.
If you want to keep all the values, just print the original array.
String[] ary = "abcdef,defabc,ghijkl,jklghi".split(",");
System.out.println(Arrays.toString(ary));
prints
[abcdef, defabc, ghijkl, jklghi]
