I have a hash map which uses linear probing to deal with collisions, and I would like to traverse it. Conceptually this is quite easy; however, the use of generics is throwing me off.
The entries in the internal array of the hash map hold their key-value pairs as generics, like this:
public entry(K key, V value) {
    this.key = key;
    this.value = value;
}
These entries are stored in an entry array, like this:
private entry[] entries;
I would like to traverse the hash map beginning at a certain key, I will reach the end of the internal array, then go back to the beginning of the array up to the key, in a circular fashion so the whole array is covered.
public V traverse(K k) {
    // look from the current key to the end of the array
    for (int i = (int) k; i < entries.length; i++) {
        // visit node
    }
    // go back to the start, and look up to the key
    for (int i = 0; i < (int) k; i++) {
        // visit node
    }
}
I've realized that type casting the key to an integer was sort of stupid, but I'm struggling to find a working way to actually do this traversal.
If I understood correctly, maybe you can create a LinkedHashMap from your HashMap (new LinkedHashMap<>(map)) and traverse that, since it has a predictable iteration order.
A HashMap does not store the entries in the order you expect. The ordering is based on the hash of the key, and organized as an array of linked-lists.
If you want a defined ordering you should use TreeMap, which uses the natural ordering of the key. If you need a custom ordering you can provide a Comparator via the constructor. Then just iterate over map.keySet().
As for what you need (starting the iteration somewhere in the middle): with a plain iterator you would start at the first node, do nothing until you reach your desired key, do your work with the rest of the entries, then iterate again from the start until you reach your desired key. With a TreeMap, however, tailMap(key) and headMap(key) give you those two halves directly.
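For completeness, a TreeMap can in fact start its iteration mid-way: tailMap(k) covers the keys from k onward and headMap(k) the keys before it, which together give the circular order the question asks for. A sketch with made-up Integer keys:

```java
import java.util.TreeMap;

public class CircularTraversal {
    public static void main(String[] args) {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(1, "a"); map.put(3, "b"); map.put(5, "c"); map.put(7, "d");

        int start = 5;
        StringBuilder order = new StringBuilder();
        // keys >= start first (tailMap is inclusive of its argument)...
        for (Integer k : map.tailMap(start).keySet()) order.append(k).append(' ');
        // ...then wrap around to the keys < start (headMap is exclusive)
        for (Integer k : map.headMap(start).keySet()) order.append(k).append(' ');
        System.out.println(order.toString().trim()); // prints "5 7 1 3"
    }
}
```

The same idea does not transfer to a plain HashMap, whose iteration order is unrelated to the keys.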
I have to implement a hash table which uses an array; however, I have to follow a guide and create functions for each procedure.
I would appreciate it if anyone could help me complete this, as I am having some trouble.
public class HashTable {
    // public for testing purposes
    public int buckets[];

    public HashTable(long _a, long _c, long _m) {
    }

    public void insert(int key) {
    }
}
What I've got so far:
public class HashTable {
    // public for testing purposes
    public int buckets[];

    public HashTable(long _a, long _c, long _m) {
        table = new Node[];
    }

    public void insert(int key) {
        Node<T> newNode = new Node(key);
        int posPosition = calPosition(key);
    }
}
I have included what I have done so far. Maybe I'm going about it the wrong way. I understand the concept but cannot seem to write the code for the hash table so far. Would appreciate any help. Thanks again.
A hash table is simply a list or array of buckets.
Each bucket holds all the items whose keys hash to that particular bucket.
Those items are entries that contain the key and the value you are seeking.
When putting something in a hash table, you use the key/value pair. If the buckets are in an array, use the hash code of the key to get the proper index of the array. If you use a linked list you might have to count to the location.
Then use another array or linked list to store the entry at that cell. Linked lists are better, imo, because they can grow without worrying about exceeding the size of an array. New entries can just be added to the front like a regular linked list.
When adding a value, create the entry, hash the key and find the bucket. Then add the entry to the bucket.
When retrieving a value, hash the key, get to the bucket and do a linear search on the bucket to find the entry for the key you are looking for. Then return the value for that entry.
Note: As with most hash tables, you cannot have duplicate keys. And any Object which is used as a key must override equals and hashCode for this to work.
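A minimal sketch of the buckets-of-entries idea described above (the class and method names are mine, purely for illustration):

```java
import java.util.LinkedList;

// A tiny chained hash table: an array of buckets, each bucket a linked list of entries.
public class SimpleHashTable<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    public SimpleHashTable(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new LinkedList<>();
    }

    // Map a key's hash code to a bucket index (floorMod handles negative hashes).
    private int indexFor(K key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    // Put: hash the key to find the bucket; replace the value on a duplicate key,
    // otherwise add a new entry to the front of the bucket's list.
    public void put(K key, V value) {
        for (Entry<K, V> e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) { e.value = value; return; }
        }
        buckets[indexFor(key)].addFirst(new Entry<>(key, value));
    }

    // Get: hash the key, then do a linear search within that one bucket.
    public V get(K key) {
        for (Entry<K, V> e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}
```

Note the duplicate-key check in put, matching the rule above that a hash table cannot hold the same key twice.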
I'm iterating through a huge file, reading a key and a value from every line. I need to obtain a specific number (say 100k) of elements with the highest values. To store them, I figured I need a collection that lets me check the minimum value in O(1) or O(log(n)); if the value currently being read is higher, remove the element with the minimum value and insert the new one. What collection enables me to do that? Values are not unique, so BiMap is probably not adequate here.
EDIT:
Ultimate goal is to obtain best [key, value] that will be used later. Say my file looks like below (first column - key, second value):
3 6
5 9
2 7
1 6
4 5
Let's assume I'm looking for the best two elements and an algorithm to achieve that. I figured that I'll use a key-based collection to store the best elements. The first two elements (<3, 6>, <5, 9>) will obviously be added to the collection, as its capacity is 2. But when I get to the third line I need to check if <2, 7> is eligible to be added to the collection, so I need to be able to check if 7 is higher than the minimum value currently in the collection (6).
It sounds like you don't actually need a structure because you are simply looking for the largest N values with their corresponding keys, and the keys are not actually used for sorting or retrieval for the purpose of this problem.
I would use the PriorityQueue, with the minimum value at the root. This allows you to retrieve the smallest element in constant time, and if your next value is larger, removal and insertion in O(log N) time.
class V {
    int key;
    int value;
}

class ComparatorV implements Comparator<V> {
    @Override
    public int compare(V a, V b) {
        return Integer.compare(a.value, b.value);
    }
}
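A usage sketch of that min-at-root queue for keeping the N largest values (N = 2 here, and the {key, value} pairs are the example data from the question):

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class TopN {
    public static void main(String[] args) {
        int n = 2;
        int[][] pairs = { {3, 6}, {5, 9}, {2, 7}, {1, 6}, {4, 5} }; // {key, value}

        // Min-heap ordered by value: peek() is the smallest value retained so far.
        PriorityQueue<int[]> best = new PriorityQueue<>(Comparator.comparingInt(p -> p[1]));
        for (int[] p : pairs) {
            if (best.size() < n) {
                best.add(p);                 // still filling up
            } else if (p[1] > best.peek()[1]) {
                best.poll();                 // drop the current minimum, O(log n)
                best.add(p);                 // insert the better pair, O(log n)
            }
        }
        for (int[] p : best) System.out.println(p[0] + " " + p[1]); // keys 5 and 2 remain
    }
}
```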
For your specific situation, you can use a TreeSet, and to get around the uniqueness of elements in a set you can store pairs which are comparable but which never appear equal when compared. This will allow you to violate the contract with Set which specifies that the Set not contain equal values.
The documentation for TreeSet contains:
The behavior of a set is well-defined even if its ordering is
inconsistent with equals; it just fails to obey the general contract
of the Set interface
So using the TreeSet with the Comparable inconsistent with equals should be fine in this situation. If you ever need to compare your chess pairs for a different reason (perhaps some other algorithm you are also running in this app) where the comparison should be consistent with equals, then provide a Comparator for the other use. Notice that TreeSet has a constructor which takes a Comparator, so you can use that instead of having ChessPair implement Comparable.
Notice: A TreeSet provides more flexibility than a PriorityQueue in general because of all of its utility methods, but by violating the "comparable consistent with equals" contract of Set, some of the functionality of the TreeSet is lost. For example, you can still remove the first element of the set using pollFirst() (a NavigableSet method), but you cannot remove an arbitrary element using remove(), since that relies on elements comparing as equal.
Per your "n or at worst log(n)" requirement, the documentation also states:
This implementation provides guaranteed log(n) time cost for the basic
operations (add, remove and contains).
Also, I provide an optimization below which reduces the minimum-value query to O(1).
Example
Set<ChessPair> s = new TreeSet<>();
and
public class ChessPair implements Comparable<ChessPair>
{
    final int location;
    final int value;

    public ChessPair(final int location, final int value)
    {
        this.location = location;
        this.value = value;
    }

    @Override
    public int compareTo(ChessPair o)
    {
        // deliberately never returns 0, so pairs with equal values are both kept
        if(value < o.value) return -1;
        return 1;
    }
}
Now you have an ordered set containing your pairs of numbers, ordered by value; you can have duplicate values, and you can get the associated locations. You can also easily grab the first element (set.first()), the last (set.last()), get a sub-set (set.subSet(a, b)), or iterate over the first (or last, using descendingSet()) n elements. This provides everything you asked for.
Example Use
You specified wanting to keep the 100 000 best elements. So I would use one algorithm for the first 100 000 possibilities which simply adds every time.
for(int i = 0; i < 100000 && dataSource.hasNext(); i += 1)
{
    ChessPair p = dataSource.next(); // or whatever you do to get the next line
    set.add(p);
}
and then a different one after that
while(dataSource.hasNext())
{
    ChessPair p = dataSource.next();
    if(p.value > set.first().value)
    {
        set.pollFirst(); // pollFirst() already removes the smallest element
        set.add(p);
    }
}
Optimization
In your case, you can insert an optimization into the algorithm where you compare against the lowest value. The above, simple version performs an O(log(n)) operation every time it compares against minimum-value since set.first() is O(log(n)). Instead, you can store the minimum value in a local variable.
This optimization works well for scaling this algorithm because the impact is negligible - no gain, no loss - when n is close to the total data set size (ie: you want best 100 values out of 110), but when the total data set is vastly larger than n (ie: best 100 000 out of 100 000 000 000) the query for the minimum value is going to be your most common operation and will now be constant.
So now we have (after loading the initial n values)...
int minimum = set.first().value;
while(dataSource.hasNext())
{
    ChessPair p = dataSource.next();
    if(p.value > minimum)
    {
        set.pollFirst(); // removes the smallest element
        set.add(p);
        minimum = set.first().value;
    }
}
Now your most common operation - querying the minimum value - is constant time (O(1)), your second most common operation - add - is worst case log(n) time, and your least common operation - remove - is worst case log(n) time.
For arbitrarily large data sets, almost every input (anything not beating the current minimum) is now processed in constant O(1) time.
See java.util.TreeSet
Previous answer (now obsolete)
Based on question edits and discussion in the question's comments, I no longer believe my original answer to be correct. I am leaving it below for reference.
If you want a Map collection which allows fast access to elements based on order, then you want an ordered Map, for which there is a sub-interface SortedMap. Fortunately for you, Java has a great implementation of SortedMap: it's TreeMap, a Map which is backed by a "red-black" tree structure which is an ordered tree.
Red-black-trees are nice since they rotate branches in order to keep the tree balanced. That is, you will not end up with a tree that branches n times in one direction, yielding n layers, just because your data may already have been sorted. You are guaranteed to have approximately log(n) layers in the tree, so it is always fast and guarantees log(n) query even for worst-case.
For your situation, try out the java.util.TreeMap. On the page linked in the previous sentence, there are links also to Map and SortedMap. You should check out the one for SortedMap too, so you can see where TreeMap gets some of the specific functionality that you are looking for. It allows you to get the first key, the last key, and a sub-map that fetches a range from within this map.
For your situation though, it is probably sufficient to just grab an iterator from the TreeMap and iterate over the first n pairs, where n is the number of lowest (or highest) values that you want.
Use a TreeSet, which offers O(log n) insertion and O(log n) retrieval of either the highest or lowest scored item.
Your item class must:
Implement Comparable
Not implement equals() (so items that compare as equal are still kept as distinct elements)
To keep the top 100K items only, use this code:
Item item; // to add
if (treeSet.size() == 100_000) {
    if (treeSet.first().compareTo(item) < 0) {
        treeSet.remove(treeSet.first());
        treeSet.add(item);
    }
} else {
    treeSet.add(item);
}
If you want a collection ordered by values, you can use a TreeSet which stores tuples of your keys and values. A TreeSet has O(log(n)) access times.
class KeyValuePair<K, V extends Comparable<V>> implements Comparable<KeyValuePair<K, V>> {
    K key;
    V value;

    KeyValuePair(K key, V value) {
        this.key = key;
        this.value = value;
    }

    @Override
    public int compareTo(KeyValuePair<K, V> other) {
        return this.value.compareTo(other.value);
    }
}
or instead of implementing Comparable, you can pass a Comparator to the set at creation time.
You can then retrieve the first value using treeSet.first().value.
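If you take the Comparator route, here is a sketch using the JDK's AbstractMap.SimpleEntry as the pair type; the comparator falls back to the key so that two pairs with equal values never compare as 0, and both survive in the set:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Comparator;
import java.util.TreeSet;

public class ByValueSet {
    public static void main(String[] args) {
        // Order by value first, then by key, so equal values do not collapse.
        TreeSet<SimpleEntry<Integer, Integer>> set = new TreeSet<>(
                Comparator.<SimpleEntry<Integer, Integer>>comparingInt(SimpleEntry::getValue)
                          .thenComparingInt(SimpleEntry::getKey));
        set.add(new SimpleEntry<>(3, 6));
        set.add(new SimpleEntry<>(1, 6)); // same value, different key: both kept
        set.add(new SimpleEntry<>(5, 9));
        System.out.println(set.size());             // prints 3
        System.out.println(set.first().getValue()); // prints 6, the smallest value
    }
}
```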
Something like this?
An entry for your data structure that can be sorted based on the value:
class Entry implements Comparable<Entry> {
    public final String key;
    public final long value;

    public Entry(String key, long value) {
        this.key = key;
        this.value = value;
    }

    public int compareTo(Entry other) {
        return Long.compare(this.value, other.value); // a plain subtraction would not fit in an int
    }

    public int hashCode() {
        // hashCode based on the same values on which equals works
        return Objects.hash(key, value);
    }
}
Actual code that works with a PriorityQueue. The sorting is based on the value, not on the key as with a TreeMap; this is because of the compareTo method defined in Entry. If the set grows above 100000, the entry with the lowest value is removed.
public class ProcessData {
    private int maxSize;
    private PriorityQueue<Entry> largestEntries;

    public ProcessData(int maxSize) {
        this.maxSize = maxSize;
        // create the queue here: at the field declaration maxSize would still be 0
        this.largestEntries = new PriorityQueue<>(maxSize);
    }

    public void addKeyValue(String key, long value) {
        largestEntries.add(new Entry(key, value));
        if (largestEntries.size() > maxSize) {
            largestEntries.poll();
        }
    }
}
This question was asked to me in a job interview and I still don't know the answer, so I'm asking here. Let's say hashCode() of the key object returns a fixed integer, so the HashMap would look like a LinkedList.
How would a duplicate element be found and replaced by new value in map?
e.g. if following 1001 puts are performed in order listed below,
put(1000,1000), put(1,1), put(2,2), put(3,3) ... put(999,999), put(1000,1000)
Would the map be traversed all the way to the end, and then the new entry inserted at the head, when the last put(1000,1000) is performed?
OR
Does the map have some other way to locate and replace duplicate keys?
The first case is correct.
In your case, hashCode() returns the same hash value for all keys. In the Java HashMap, the key and value are both stored in the bucket as a Map.Entry object. When you perform a second or further put() operation on the map, it traverses all the elements in the bucket to check whether the key is already present. If the key is not found, the new key and value pair is added to the linked list. If the key is found in the list, its value is updated.
Details explanation about java HashMap working: How HashMap works in Java
Take this sample code and run in the debug mode and observe how the new Key and Value pair are inserted into the Map.
In the class you will need hashCode() (we want to control how the hash codes are generated for Node), toString() (just to output the Node value in System.out), and equals() (which defines the equality of keys based on the value of the Node's Integer member, used for updating values) to get this working.
public class HashMapTest {

    static class Node {
        Integer n;

        public Node(int n) {
            this.n = n;
        }

        @Override
        public int hashCode() {
            return n % 3;
        }

        @Override
        public boolean equals(Object object) {
            Node node = (Node) object;
            return this.n.equals(node.n);
        }

        @Override
        public String toString() {
            return n.toString();
        }
    }

    public static void main(String[] args) {
        Map<Node, String> map = new HashMap<>();
        for (int i = 0; i < 6; i++) {
            map.put(new Node(i), "" + i); // <-- Debug Point
        }
        map.put(new Node(0), "xxx");
    } // <-- Debug Point
}
First 3 entries in the map (hash code is n % 3), then three more values; debugger screenshots omitted.
Don't be confused by the ordering of the nodes: I executed this on Java 1.8, where HashMap uses TreeNode, an implementation of a red-black tree, as per the code documentation. This can be different in different versions of Java.
Now let's update the value of key 0 (screenshot omitted).
When the hash code is the same, the hash map compares objects using the equals method.
For example, let's say you put a bunch of elements in the hash map:
put(1000,1000), put(1,1), put( 2, 2), put ( 3,3 ) ....put(999,999)
And then you do this:
put(1000,1000 )
Since 1000 is already in the map, the hash code is the same and the keys are equal in terms of the equals method, so the existing entry's value is replaced; there is no need to iterate further.
Now if you do:
put(1234, 1234)
1234 is not yet in the map. All the elements are in a linked list, due to the fixed hash code. The hash map will iterate over the elements, comparing them using equals. It will be false for all elements, the end of the list will be reached, and the entry will be appended.
JDK implementations change over time!
In JDK 8, if hashCode() returns a constant value, the implementation converts a long bucket into a tree rather than a linked list, in order to protect against DoS attacks.
I'm trying to come up with an efficient way to return the key in my HashMap that has the lowest value in the data structure. Is there a quick and efficient way to do this besides looping through the entire HashMap?
For example, if I have a hashmap that looks like this:
1: 200
3: 400
5: 1
I want to return the key, 5.
No, you have to loop over all the entries in a HashMap to find the smallest value. If this is an important operation, consider a map that keeps order: a TreeMap keeps its keys sorted, so firstKey() gives you the lowest key, but since it sorts by key rather than value, finding the lowest value still requires a scan (or a second structure indexed by value).
As others have mentioned HashMap itself does not provide this.
So your options are to either compute it on-demand or pre-compute.
To compute it on-demand, you would iterate the HashMap.entrySet()
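Such an on-demand scan can be written compactly with Collections.min and the JDK's ready-made value comparator (example data from the question):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class MinByValue {
    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        map.put(1, 200); map.put(3, 400); map.put(5, 1);

        // O(n) scan over entrySet(), returning the entry with the smallest value
        Map.Entry<Integer, Integer> min =
                Collections.min(map.entrySet(), Map.Entry.comparingByValue());
        System.out.println(min.getKey()); // prints 5
    }
}
```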
Depending on the size of the map, frequency of its change and frequency of requiring the key-with-lowest-value, pre-computing (caching) may be more efficient. Something as follows:
class HashMapWithLowestValueCached<K, V extends Comparable<V>> extends HashMap<K, V> {
    V lowestValue;
    K lowestValueKey;

    @Override
    public V put(K k, V v) {
        // note: removals, or overwriting the current lowest entry, would invalidate this cache
        if (lowestValue == null || v.compareTo(lowestValue) < 0) {
            lowestValue = v;
            lowestValueKey = k;
        }
        return super.put(k, v);
    }

    K lowestValueKey() { return lowestValueKey; }
}
No, there is no way of doing this. You need to iterate over all the elements in the HashMap to find the one with the lowest value.
The reason why we have different kinds of storage is that they support different kinds of operations with different efficiency. HashMap is not designed to retrieve elements efficiently based on their value. The kind of storage class you need for this will depend on what other operations you need to be able to do quickly. Assuming that you probably also want to be able to retrieve items quickly based on their key, the following might work:
Write a wrapper around your HashMap that keeps track of all the elements being added to it, and remembers which one is the smallest. This is really only useful if retrieving the smallest is the only access-by-value you need.
Store all your data twice - once in a HashMap and once in a data structure that sorts by value - for example, a SortedMap with key and value reversed.
If you find you don't need to retrieve by key, just reverse key and value.
No, there is no quick and efficient way of doing that - you need to loop through the entire hash map. The reason for it is that the keys and values in hash maps do not observe any particular order.
No, because otherwise there would exist a sorting algorithm faster than O(n log n) (probabilistic, though): add all elements to the hash map in O(n), then extract the lowest one by one.
// create hashmap
HashMap<Integer, String> yourHashmap = new HashMap<>();

// add your values here
yourHashmap.put(1, "200");
yourHashmap.put(3, "400");
yourHashmap.put(5, "1");

// loop once over the entries, tracking the key whose value is the smallest
Integer lowestKey = null;
int lowestValue = Integer.MAX_VALUE;
for (Map.Entry<Integer, String> entry : yourHashmap.entrySet()) {
    int value = Integer.parseInt(entry.getValue()); // the values are numeric strings
    if (value < lowestValue) {
        lowestValue = value;
        lowestKey = entry.getKey();
    }
}
System.out.println("lowest value = " + lowestKey + " : " + lowestValue);
I need to get, from a HashMap, the lists of keys whose values are equal.
For example , my hashmap contains the below elements.
Key Value
1 a,b
2 e,c
3 a,b
4 f
5 e,c
6 c
We need to evaluate as
1,3 contains value (a,b)
2,5 contains value (e,c)
4 contains value (f)
6 contains value (c)
Thx
You could invert your hash: build a new hash with the key type being the type of your current map's value, and the value type being a list of your current map's key type.
Iterate over your current map's keys, and push them to the right slot in your new map. You'll then have exactly the mapping you are asking for.
If the values in your current map aren't directly comparable right now, you'll need to find a representation that is. This depends completely on the nature of the data.
One simple approach is to sort the list and use its toString representation as your new key. This only works if the toString representation of the underlying objects is sane for this purpose.
You can create another map where your keys are used as values and values as keys. If, for example, your source map is defined as Map<Integer, String>, create a Map<String, List<Integer>>. Each list of integers will contain the keys (from your source map) that share a certain value.
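A sketch of that inversion using computeIfAbsent, with the example data from the question:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InvertMap {
    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        map.put(1, "a,b"); map.put(2, "e,c"); map.put(3, "a,b");
        map.put(4, "f");   map.put(5, "e,c"); map.put(6, "c");

        // value -> list of all keys that share it
        Map<String, List<Integer>> inverted = new HashMap<>();
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            inverted.computeIfAbsent(e.getValue(), v -> new ArrayList<>()).add(e.getKey());
        }
        System.out.println(inverted.get("a,b")); // the keys that map to "a,b": 1 and 3
    }
}
```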
Building on Mat's answer, if you need to do this operation frequently, use one of the bidirectional map classes from Guava or Apache Commons Collections; e.g. HashBiMap<K,V> or DualHashBidiMap or DualTreeBidiMap. These data structures maintain a pair of maps that represent the forward and inverse mappings.
Alternatively, for a once off computation:
Extract the Map.entrySet() collection into an array.
Sort the array in order of the values.
Iterate the array, and extract the entry keys for which subsequent entry values are equal.
(This should be O(N log N) in time and require O(N) extra space, depending on the sort algorithm used.)
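The once-off sort-and-sweep can be sketched like this (same example data; after sorting, runs of adjacent entries with equal values form the groups):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupByValue {
    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        map.put(1, "a,b"); map.put(2, "e,c"); map.put(3, "a,b");
        map.put(4, "f");   map.put(5, "e,c"); map.put(6, "c");

        // 1. Extract the entries and 2. sort them by value
        List<Map.Entry<Integer, String>> entries = new ArrayList<>(map.entrySet());
        entries.sort(Map.Entry.comparingByValue());

        // 3. Sweep once: each run of equal values becomes one group of keys
        List<List<Integer>> groups = new ArrayList<>();
        for (int i = 0; i < entries.size(); ) {
            List<Integer> group = new ArrayList<>();
            String value = entries.get(i).getValue();
            while (i < entries.size() && entries.get(i).getValue().equals(value)) {
                group.add(entries.get(i).getKey());
                i++;
            }
            groups.add(group);
        }
        System.out.println(groups); // groups of keys that share a value
    }
}
```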
The most basic method will be to:
Get the first key of the HashMap and iterate over the map checking for keys with the same value.
If found, remove that key from the map and store the key in another collection (maybe a Vector).
Then after all other keys are checked, add the current key to that collection.
If no other keys are found, add the current key to that collection.
Then add the keys in that collection to another map with the relevant value. Clear the collection.
Proceed to the next key and do the same.
After doing this, you will end up with what you want.
EDIT: The Code:
HashMap<String, String> comp = new HashMap<>(); // Calculations Done
Vector<Integer> v = new Vector<>(); // Temporary List To Store Keys
HashSet<Integer> used = new HashSet<>(); // Keys Already Placed In A Group

// Get The List Of Keys
Vector<Integer> keys = new Vector<>(hm.keySet());

// For Every Key In Map...
for (int i = 0; i < keys.size(); i++) {
    int key = keys.get(i);
    if (used.contains(key)) continue; // Skip Keys Already Grouped (This Avoids Duplicates)
    v.add(key); // Add The Current Key To The Temporary List
    used.add(key);
    // Check If Other Keys Have The Same Value
    for (int j = i + 1; j < keys.size(); j++) {
        int nkey = keys.get(j);
        if (hm.get(key).equals(hm.get(nkey))) {
            v.add(nkey);
            used.add(nkey);
        }
    }
    // Join The Keys With Commas, Put The Keys As Key And The Shared Value As Value
    String val = hm.get(key);
    StringBuilder cKey = new StringBuilder();
    for (int x = 0; x < v.size(); x++) {
        if (x > 0) cKey.append(',');
        cKey.append(v.get(x));
    }
    comp.put(cKey.toString(), val);
    // Clear The Temporary List
    v.clear();
}
Because already-grouped keys are skipped, each group appears only once. The output using your example (iteration order may vary):
{1,3=a,b, 2,5=e,c, 4=f, 6=c}