I am looking for an appropriate data structure for my problem. I would like to select node objects as efficiently as possible using two keys. Insertion and deletion also need to be efficient. Every node object has a pair of keys. The pairs are unique, but the individual keys are not. I need to be able to select the group of nodes that have a particular value for one of the two keys.
Example:
Node1 has keys a1 and b1
Node2 has keys a1 and b2
Node3 has keys a2 and b2
For example, I would like to be able to select the node with the key pair (a1, b1), but also all nodes that have b2 as their second key.
I could of course make two HashMaps (one for each key) but this is kind of an ugly solution because when I add or remove something I would have to do it in both maps. Since there will be a lot of adding and removing going on I would prefer to do this in one go. Does anyone have any ideas about how to do this?
Obviously having a single key that merges the two keys together does not solve the problem because I also need to be able to search for a single key without having to search through the whole map. That wouldn't be very efficient. The problem is an efficiency problem. I could just search every entry in the map for a particular key but instead I would like to use a hash so that I can select multiple node objects using either one of the two keys instantly.
I am not looking for something like the MultiKeyMap because in this data-structure the first key always stays the same, you can only add keys instead of replacing the first key with a different key. I want to be able to switch between the first and the second key.
I do and do not want to store multiple objects with the same key. If you look at the example you can see that the two keys together are always unique. This can be seen as a single key therefore I would not store multiple objects under the same key. But if you look at the individual keys, these are not unique therefore I do want to store multiple objects referenced by the individual keys.
If you can use a library, take a look at the Table interface of Guava. It associates a row and a column with a value. The row and columns may be your first and second keys. Also you can have searches by row or by column.
One of the implementations of this interface is hash based.
You have to create a key class (equality works like that of a Point: two keys are equal when both fields match):
public class Key {
    int field1;
    int field2;
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Key)) return false; // also false when o is null
        Key other = (Key) o;
        return field1 == other.field1 && field2 == other.field2;
    }
    @Override
    public int hashCode() {
        // Different keys may share a hash code; this just collides less often than field1 * field2.
        return 31 * field1 + field2;
    }
}
For selecting keys with one specific value in the first field:
public List<Key> getKeysWithField1EqualsTo(int value) {
List<Key> result = new ArrayList<>();
for (Key k: map.keySet()) {
if (k.field1 == value) result.add(k);
}
return result;
}
Since this is rather specific to your problem at hand, you will very likely need to develop your own collection. I would wrap two MultiMaps from Apache Commons into my own collection class that deals with updates of both multi-maps at the same time, and use my class to perform inserts and queries.
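A minimal sketch of such a wrapper, using only java.util maps rather than the Apache Commons multimaps mentioned above. The class name TwoKeyIndex and its method names are made up for illustration, and it assumes non-null keys. Every update touches both indexes in one call, so callers never handle the two maps separately:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-key index: each (k1, k2) pair maps to at most one value. */
final class TwoKeyIndex<K1, K2, V> {
    private final Map<K1, Map<K2, V>> byFirst = new HashMap<>();
    private final Map<K2, Map<K1, V>> bySecond = new HashMap<>();

    /** Stores the value under the pair in both indexes; returns the previous value or null. */
    V put(K1 k1, K2 k2, V value) {
        bySecond.computeIfAbsent(k2, k -> new HashMap<>()).put(k1, value);
        return byFirst.computeIfAbsent(k1, k -> new HashMap<>()).put(k2, value);
    }

    /** Looks up the single value stored under the full pair. */
    V get(K1 k1, K2 k2) {
        Map<K2, V> row = byFirst.get(k1);
        return row == null ? null : row.get(k2);
    }

    /** All values whose first key equals k1. */
    Collection<V> getByFirst(K1 k1) {
        return byFirst.getOrDefault(k1, Collections.emptyMap()).values();
    }

    /** All values whose second key equals k2. */
    Collection<V> getBySecond(K2 k2) {
        return bySecond.getOrDefault(k2, Collections.emptyMap()).values();
    }

    /** Removes the pair from both indexes; returns the removed value or null. */
    V remove(K1 k1, K2 k2) {
        Map<K2, V> row = byFirst.get(k1);
        if (row == null || !row.containsKey(k2)) return null;
        V removed = row.remove(k2);
        if (row.isEmpty()) byFirst.remove(k1);
        Map<K1, V> column = bySecond.get(k2);
        column.remove(k1);
        if (column.isEmpty()) bySecond.remove(k2);
        return removed;
    }
}
```

With the nodes from the question, get("a1", "b1") would return Node1 and getBySecond("b2") would return Node2 and Node3, each through hash lookups rather than a scan of all entries.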
Write a simple class that is able to contain two values (the keys) and override equals(..) and hashCode() for equality checks used by the map. Use this simple class as the key for the map.
Here you can find a hashmap compatible pair class (2nd answer):
What is the equivalent of the C++ Pair<L,R> in Java?
Since a HashMap can only index each entry by one hash, you will never be able to select the distinct lists 'out of the box'. What I would suggest is using a tuple of the two keys as the map key, then iterating over the HashMap and selecting only those elements that have tuple.key1 = X. Note that this scan is O(n).
A HashMap can use any object as a key, so why not create a class with two fields and use it as your key? Override equals() (and hashCode() along with it) to define when two keys are equal.
I think we can do it in this way: For each key, we can compute the corresponding hashcode.
key1 -> hashcode1
key2 -> hashcode2
Then we have a 2-d array, with N columns and N rows.
key1 -> rowIndex: hashcode1 MOD N
key2 -> columnIndex: hashcode2 MOD N
Then we store the item in array[rowIndex][columnIndex].
In this implementation, you can get all the entries with a target key1, and any key2. You can also get all the entries with a target key2, and any key1.
This array may expand when there are a lot of collisions, just like what you do with the ordinary hashmap.
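One way the grid idea above might look in code. This is only a sketch under assumptions: the name GridIndex is invented, N is fixed at construction (no resizing), keys are non-null, and each cell is a small list to absorb collisions. Note that a lookup by a single key still scans N cells of one row or column, so it is cheaper than scanning everything but not a single hash probe:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical N x N grid of buckets; a real version would grow N as buckets fill up. */
final class GridIndex<K1, K2, V> {
    private static final class Entry<K1, K2, V> {
        final K1 k1;
        final K2 k2;
        final V value;
        Entry(K1 k1, K2 k2, V value) { this.k1 = k1; this.k2 = k2; this.value = value; }
    }

    private final int n;
    private final List<List<List<Entry<K1, K2, V>>>> cells = new ArrayList<>();

    GridIndex(int n) {
        this.n = n;
        for (int r = 0; r < n; r++) {
            List<List<Entry<K1, K2, V>>> row = new ArrayList<>();
            for (int c = 0; c < n; c++) {
                row.add(new ArrayList<>());
            }
            cells.add(row);
        }
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    private int index(Object key) {
        return Math.floorMod(key.hashCode(), n);
    }

    void put(K1 k1, K2 k2, V value) {
        List<Entry<K1, K2, V>> cell = cells.get(index(k1)).get(index(k2));
        cell.removeIf(e -> e.k1.equals(k1) && e.k2.equals(k2)); // key pairs stay unique
        cell.add(new Entry<>(k1, k2, value));
    }

    /** Scans one row: every cell whose row index matches k1. */
    List<V> getByKey1(K1 k1) {
        List<V> result = new ArrayList<>();
        for (List<Entry<K1, K2, V>> cell : cells.get(index(k1))) {
            for (Entry<K1, K2, V> e : cell) {
                if (e.k1.equals(k1)) result.add(e.value);
            }
        }
        return result;
    }

    /** Scans one column: the cell at k2's column index in every row. */
    List<V> getByKey2(K2 k2) {
        List<V> result = new ArrayList<>();
        int column = index(k2);
        for (List<List<Entry<K1, K2, V>>> row : cells) {
            for (Entry<K1, K2, V> e : row.get(column)) {
                if (e.k2.equals(k2)) result.add(e.value);
            }
        }
        return result;
    }
}
```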
I have multiple files that contain key=value string pairs. The keys are the same across the files, but the values differ. Each file can have more than 1000 such pairs.
I want to store each file in a separate hashmap, ie map<KeyString, ValueString>, so if there are five files, then there will be five hashmaps.
To avoid duplicating the keys across each hashmap, is it possible to have each map reference the same key? Note that once the keys are added to the map, it will not be deleted.
I considered making the first file the 'base' as in the flyweight pattern, this base would be the intrinsic set of keys/values. The other remaining files would be the extrinsic set of values, but I don't know how to relate the values back to the base (intrinsic) keys without duplicating the keys?
I am open to a simpler/better approach.
I can think of a simpler approach. Instead of a Map<String, String>, consider a Map<String, List<String>>, or directly a Multimap<String, String> from Guava.
If each key is in each file and all have values, you could store values from first file at 0th index, from the second at 1st index etc.
If that doesn't work, I recommend a Collection<Map<String, String>>, so that you can iterate through your Maps. Then, when you want to add a value to one of the Maps, go through all the keySets, and if one of them already contains that key, put the value using the key object stored there.
Other solution would be to have a HashSet of keys that have already been put. This would be more efficient.
After reading in the keys, you can use String.intern().
When called, what it does is either:
add the String to the internal pool if it didn't exist already;
return the equivalent String from the pool if it already existed.
String#intern Javadoc
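A small illustration of the effect. The class name InternDemo and the key text are made up; new String(...) is used only to simulate the same key text being read separately from two files:

```java
final class InternDemo {
    /** Returns true when both strings resolve to the same pooled object after interning. */
    static boolean sameInstanceAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String fromFile1 = new String("timeout"); // distinct objects with equal contents
        String fromFile2 = new String("timeout");
        System.out.println(fromFile1 == fromFile2);                        // false
        System.out.println(sameInstanceAfterIntern(fromFile1, fromFile2)); // true
    }
}
```

So calling key.intern() before each put would let all the maps share one String object per distinct key.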
First of all, I don't see the problem with storing multiple instances of your String keys. 5 HashMaps * 1000 keys is a very small number, and you shouldn't have memory issues.
That said, if you still want to avoid duplicating the Strings, you can create the first HashMap and then use the exact same key objects for the other HashMaps.
For example, suppose map1 is the first HashMap and it is already populated with the contents of the first file.
You can write something like this to populate the 2nd HashMap:
for (String key : map1.keySet()) {
map2.put (key, someValue);
}
Of course you will have to find for each key of the first map the corresponding value of the second map. If the keys are not stored in the same order in the input files, this may require some preliminary sorting step.
Perhaps you could hold a static Map<> to map your keys to unique Integers and use those Integers for the keys to your map?
Something like:
class KeySharedMap<K,V> {
// The next key to use. Using Atomics for the auto-increment.
static final AtomicInteger next = new AtomicInteger(0);
// Static mapping of keys to unique Integers.
static final ConcurrentMap<Object,Integer> keys = new ConcurrentHashMap<>();
// The map indexed by Integer from the `keys`.
Map<Integer, V> map = new HashMap<>();
public V get(Object key) {
return map.get(keys.get(key));
}
public V put(Object key, V value) {
// Associate a unique integer for each unique key.
keys.computeIfAbsent(key,x -> next.getAndIncrement());
// Put it in my map.
return map.put(keys.get(key),value);
}
}
Yes, I realise that K is not used here but I suspect it would be necessary if you wish to implement Map<K,V>.
When defining a composite key for a hash map such as:
public class Key {
    EnumA a; // EnumA, EnumB and EnumC stand in for the three enum types
    EnumB b;
    EnumC c;
}
where equals and hashCode are overridden to compare these values (a, b, c):
Are there techniques to store these keys so that they can be found by querying the three values, rather than creating a new key each time? For example, storing a list of keys, or reusing a key?
I expect a lot of garbage collection, as these keys will be generated on every method call.
This is what we would like to avoid:
public Val update(Obj obj) {
    Key key = new Key(obj.a, obj.b, obj.c);
    // assume the entry is already in the map (otherwise add it)
    Val val = hashmap.get(key);
    val.update(obj.newvalues); // do some calculation
    return val;
    // key becomes unreachable after the get, so lots of garbage collection?
    // if so, should it explicitly be set to key = null?
}
There is no easy way to do this as you expected. It might be needed to create a Map that takes three keys.
If creating a new Key instance just for a query is a concern, consider using an Integer as the HashMap key. Keep in mind that an Integer is also an object. It could be faster for the HashMap to compare, and it might use less memory than a self-defined key instance, but it does not avoid creating a key instance (unless the Integer instance is cached by the JVM, which is another story).
About using Integer as key:
If the key consists of three enums, try using an Integer as the key and apply an arithmetic mapping that guarantees each distinct combination of the three enums yields a distinct integer.
For example, suppose enum a, enum b and enum c each have 16 values. Each one then fits in 4 bits, so Java's bitwise operators (shift and OR) can combine the three into a single int that represents the combination. Then use that Integer to get the value from the map.
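A sketch of that packing, assuming each enum has at most 16 constants. The enum types A, B and C and the class name EnumKeyPacker are placeholders for illustration, not types from the question:

```java
final class EnumKeyPacker {
    // Placeholder enums; any enum with at most 16 constants fits in 4 bits.
    enum A { A0, A1, A2 }
    enum B { B0, B1, B2 }
    enum C { C0, C1, C2 }

    static final int BITS = 4; // bits reserved per enum ordinal

    /** Packs the three ordinals into one int: [a: 4 bits][b: 4 bits][c: 4 bits]. */
    static int pack(A a, B b, C c) {
        return (a.ordinal() << (2 * BITS)) | (b.ordinal() << BITS) | c.ordinal();
    }

    public static void main(String[] args) {
        System.out.println(pack(A.A1, B.B2, C.C1)); // 289 = (1 << 8) | (2 << 4) | 1
    }
}
```

The packed int is still autoboxed to an Integer when used as a HashMap key, as noted above, so this mainly makes the key cheaper to hash and compare.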
Hope this helps.
I basically need to know if my HashMap has different keys that map to the same value. I was wondering if there is a way other than checking each keys value against all other values in the map.
Update:
Just some more information that will hopefully clarify what I'm trying to accomplish. Consider the String "azza". Say that I'm iterating over this String and storing each character as a key, with some other String as its corresponding value. Let's say I eventually get to the last occurrence of 'a' and its value is already in the map. This would be fine if the key corresponding to the value that is already in the map is also 'a'. My issue occurs when 'a' and 'z' both map to the same value, that is, only when different keys map to the same value.
Sure, the fastest to both code and execute is:
boolean hasDupeValues = new HashSet<>(map.values()).size() != map.size();
which executes in O(n) time.
Sets don't allow duplicates, so the set will be smaller than the values list if there are dupes.
Very similar to EJP's and Bohemian's answer above but with streams:
boolean hasDupeValues = map.values().stream().distinct().count() != map.size();
You could create a HashMap that maps values to lists of keys. This would take more space and require (slightly) more complex code, but with the benefit of greatly higher efficiency (amortized O(1) vs. O(n) for the method of just looping all values).
For example, say you currently have HashMap<Key, Value> map1, and you want to know which keys have the same value. You create another map, HashMap<Value, List<Key>> map2.
Then you just modify map1 and map2 together.
map1.put(key, value);
if(!map2.containsKey(value)) {
map2.put(value, new ArrayList<Key>());
}
map2.get(value).add(key);
Then to get all keys that map to value, you just do map2.get(value).
If you need to put/remove in many different places, to make sure that you don't forget to use map2 you could create your own data structure (i.e. a separate class) that contains 2 maps and implement put/remove/get/etc. for that.
Edit: I may have misunderstood the question. If you don't need an actual list of keys, just a simple "yes/no" answer to "does the map already contain this value?", and you want something better than O(n), you could keep a separate HashMap<Value, Integer> that simply counts up how many times the value occurs in the map. This would take considerably less space than a map of lists.
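A sketch of the counting variant, wrapped in its own class so the two maps cannot drift apart. The class name ValueCountingMap and its method names are invented for illustration, and it assumes non-null values:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical wrapper that tracks how many keys currently map to each value. */
final class ValueCountingMap<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Map<V, Integer> valueCounts = new HashMap<>();

    V put(K key, V value) {
        boolean hadKey = map.containsKey(key);
        V old = map.put(key, value);
        if (hadKey) decrement(old);                // the old value loses one key
        valueCounts.merge(value, 1, Integer::sum); // the new value gains one key
        return old;
    }

    V remove(K key) {
        if (!map.containsKey(key)) return null;
        V old = map.remove(key);
        decrement(old);
        return old;
    }

    /** O(1) on average, unlike HashMap.containsValue, which scans all entries. */
    boolean containsValue(V value) {
        return valueCounts.containsKey(value);
    }

    /** True when at least two different keys share a value. */
    boolean hasDuplicateValues() {
        return valueCounts.size() != map.size();
    }

    private void decrement(V value) {
        // Returning null from computeIfPresent removes the entry once the count hits zero.
        valueCounts.computeIfPresent(value, (v, n) -> n == 1 ? null : n - 1);
    }
}
```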
You can check whether a map already contains a value by calling map.values().contains(value), or equivalently map.containsValue(value). This is not as efficient as looking up a key in the map, since it is O(n), but you don't need to create a new set just to count its elements.
However, what you seem to need is a BiMap. There is no such thing in the Java standard library, but you can build one relatively easily by using two HashMaps: one which maps keys to values and one which maps values to keys. Every time you map a key to a value, you can then check in amortized O(1) whether the value already is mapped to, and if it isn't, map the key to the value in the one map and the value to the key in the other.
If it is an option to create a new dependency for your project, some third-party libraries contain ready-made bimaps, such as Guava (BiMap) and Apache Commons (BidiMap).
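If you would rather not add a dependency, the two-HashMap version described above could be sketched roughly like this. SimpleBiMap and putIfValueFree are hypothetical names, and the sketch assumes non-null keys and values:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical minimal bimap built from two HashMaps. */
final class SimpleBiMap<K, V> {
    private final Map<K, V> forward = new HashMap<>();
    private final Map<V, K> backward = new HashMap<>();

    /**
     * Maps key to value unless a different key already maps to that value.
     * Returns true if the mapping was stored, false if the value was taken.
     */
    boolean putIfValueFree(K key, V value) {
        K owner = backward.get(value);
        if (owner != null && !owner.equals(key)) return false;
        V old = forward.put(key, value);
        if (old != null) backward.remove(old); // drop the key's previous reverse mapping
        backward.put(value, key);
        return true;
    }

    V get(K key) {
        return forward.get(key);
    }

    K keyOf(V value) {
        return backward.get(value);
    }
}
```

For the "azza" case from the question, a false return from putIfValueFree signals exactly the situation where 'a' and 'z' would map to the same value.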
You could iterate over the keys and save the current value in the Set.
But, before inserting that value in a Set, check if the Set already contains that value.
If this is true, it means that a previous key already contains the same value.
Map<Integer, String> map = new HashMap<>();
Set<String> values = new HashSet<>();
Set<Integer> keysWithSameValue = new HashSet<>();
for(Integer key : map.keySet()) {
if(values.contains(map.get(key))) {
keysWithSameValue.add(key);
}
values.add(map.get(key));
}
I want to get all the values (multiple) for a particular key, but I am only getting one value. I don't know how to print all of them. It would be a great help if someone could correct the code; a Google search did not turn up anything.
import java.util.*;
public class hashing
{
public static void main(String args[])
{
String[] ary=new String[4];
String key;
char[] chrary;
ary[0]=new String("abcdef");
ary[1]=new String("defabc");
ary[2]=new String("ghijkl");
ary[3]=new String("jklghi");
Hashtable<String, String> hasht = new Hashtable<String, String>();
for(int i=0;i<4;i++){
chrary=ary[i].toCharArray();
Arrays.sort(chrary);
key=new String(chrary);
hasht.put(key,ary[i]);
}
Enumeration iterator = hasht.elements();
while(iterator.hasMoreElements()) {
String temp = (String)iterator.nextElement();
System.out.println(temp);
}
}
}
PS: The output is defabc jklghi. I want abcdef defabc ghijkl jklghi.
Hashtables can only contain one value per key. To store multiple values, you should either
Store a collection (e.g. List<String> or array) per key. Note that you'll have to initialise the collection prior to insertion of the first value corresponding to that key
Use a MultiMap
Note that many MultiMap implementations exist. The Oracle docs provide a simple implementation too (see here, and search for MultiMap)
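As a sketch of the first option, here is the question's anagram grouping with a List<String> per key. The class name AnagramGroups is made up; computeIfAbsent creates the list the first time a key is seen, and a TreeMap is used only so that the printed order is predictable:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class AnagramGroups {
    /** Groups words by their sorted letters, keeping every word for each key. */
    static Map<String, List<String>> group(String... words) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String word : words) {
            char[] chars = word.toCharArray();
            Arrays.sort(chars);
            String key = new String(chars);
            // Create the list on first use, then append to it.
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(word);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(group("abcdef", "defabc", "ghijkl", "jklghi"));
        // prints {abcdef=[abcdef, defabc], ghijkl=[ghijkl, jklghi]}
    }
}
```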
The way HashMaps work is that there is only one value for a given key. So if you call:
map.put(key, value1);
map.put(key, value2);
the second line will override the value corresponding to the key.
Regarding your comment about collisions: that means something different. Internally, a HashMap stores the key/value pairs in buckets that are chosen based on the hash code of the key (hence the name: hashmap). In the case where two non-equal keys have the same hash code (low probability if the hashCode function is good), the implementation needs to make sure that querying the hashmap with one of those keys returns the correct value. That is where hash collisions need to be handled.
That's not what collision resolution is meant to do. Collision resolution lets you handle the case when two object with different keys would go into the same "bucket" in the hash map. How this resolution happens is an internal detail of the hash map implementation, not something that would be exposed to you.
Actually, in your case it's not a collision; it's the same key with the same hash code. In general, a collision occurs only when two different keys generate the same hash code. A poor hashCode() implementation makes this more frequent, but it can happen with a good one too.
Yes, java.util.HashMap handles hash collisions. If you look at the source code of HashMap, each bucket holds a linked list of entries (since Java 8, a long list may be converted into a balanced tree). That means that if two different keys with the same hash code come in, both entries go into the same bucket, but as two different nodes in the list.
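To see the difference between a collision and an equal key, here is a small sketch. It relies on the fact that the strings "Aa" and "BB" have the same hashCode() (2112) without being equal; the class name CollisionDemo is made up:

```java
import java.util.HashMap;
import java.util.Map;

final class CollisionDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");   // same hash code as "Aa", but a different key: kept separately
        map.put("Aa", "replaced"); // equal key: the old value is overwritten
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(map.size());                         // 2
        System.out.println(map.get("Aa"));                      // replaced
        System.out.println(map.get("BB"));                      // second
    }
}
```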
Found this link online, which explain How hash map works in detail.
If the key is the same, the value will be updated. The map will not store a second key/value pair for an equal key.
Your Hashtable<String, String> maps one string to one string. So put replaces the value that was before linked to a specific key.
If you want multiple values, you can make a Hashtable<String, String[]> or a Hashtable<String, List<String>>.
A cleaner solution would be to use Google's Multimap, which lets you associate multiple values with one key:
A collection similar to a Map, but which may associate multiple values
with a single key. If you call put(K, V) twice, with the same key but
different values, the multimap contains mappings from the key to both
values.
You are only putting one String for each key:
hasht.put(key,ary[i]);
So if i=1 that means you put defabc, why do you expect to get multiple values for same key?
A Hashtable, like every Map, keeps only one value per key: the last value you set.
If you want to keep all the values, just print the original array.
String[] ary = "abcdef,defabc,ghijkl,jklghi".split(",");
System.out.println(Arrays.toString(ary));
prints
[abcdef, defabc, ghijkl, jklghi]
I usually come across scenarios while using HashMap in Java as follows :
I've a list of Objects of class A (List<A>)
A has fields int f1, int f2 and other fields.
I've to construct a map from List to perform O(1) lookup for the Objects of A. The key is combination of f1 and f2 (both being integers).
Now which of the following would be the best practice to use for the map
case 1 : in general
case 2 : f2 can take only 2 to 3 different values, while f1 can take large number of values.
Map<Integer, Map<Integer, List<A>>> // construction of map is cumbersome
Map<String, List<A>> //(key : String f1 + "_" + f2)
Map<Integer, List<A>> //(I tend to use this for case 2)
I forgot to clarify one thing here: f1 and f2 don't uniquely identify objects of A. I have corrected the map definitions.
If those two fields tend to be immutable (they don't change once set), you can override the equals() and hashCode() methods of A, and simply store a:
Set<A> //(key: fields f1 and f2, via hashCode() method)
If they are not immutable, you cannot use them for the key anyway, since they might change.
I think a Map is suitable for case 1. For case 2, I recommend a List: since it would only have 2-3 elements, you can map an index to each specific field value.
Why use a map at all? If you don't really need Key-Value pairs, you can just use a HashSet<A>. The lookup is still O(1) and you don't have to bother getting a value from the key.
Of course, the HashSet is probably just a HashMap with null values, but you don't have to invent keys and values.
I don't like using Strings as composite keys. Some blogger out there put it well: Strings are good for things that are text, and not good for things that aren't text.
Why not just create a simple IntPair class with two int fields, and appropriate hashCode() and equals(Object) overrides? It'll take you two seconds in an IDE (not much longer without one), and you'll have a more specific, semantically meaningful key type.
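A sketch of what that could look like, together with the Map<IntPair, List<A>> lookup the question asks for. The helper class PairLookupDemo and its add method are invented for illustration; the fields are final so a key cannot change after it has been put in the map:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Immutable pair of ints usable as a HashMap key. */
final class IntPair {
    final int first;
    final int second;

    IntPair(int first, int second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IntPair)) return false;
        IntPair other = (IntPair) o;
        return first == other.first && second == other.second;
    }

    @Override
    public int hashCode() {
        return 31 * first + second;
    }

    @Override
    public String toString() {
        return "(" + first + ", " + second + ")";
    }
}

final class PairLookupDemo {
    /** Adds an item under (f1, f2); several items may share the same pair. */
    static <T> void add(Map<IntPair, List<T>> index, int f1, int f2, T item) {
        index.computeIfAbsent(new IntPair(f1, f2), k -> new ArrayList<>()).add(item);
    }
}
```

A lookup is then index.get(new IntPair(f1, f2)), which returns the list of matching objects or null when nothing was stored under that pair.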
Keys are unique in a HashMap. Internally, each key is stored as
final K key
in the static Entry class, and put() replaces the value when an equal key is already present.
That's why keys are unique: the map won't allow duplicates.