Compare 2 keys in Java HashMap - java

I've made a BinaryTree< HashMap<String, String> >.
How can I compare the two keys so I can correctly insert the two elements (HashMaps) into the ordered BinaryTree?
Here's what I've got so far:
public class MyMap<K extends Comparable<K>, V> extends HashMap<K, V> implements Comparable< MyMap<K, V> >
{
@Override
public int compareTo(MyMap<K, V> mapTwo)
{
if ( (this.keySet().equals(mapTwo.keySet())) ) return 0;
//How can I check for greater than/less than and keep my generics?
}
EDIT: There is only one key in each HashMap (it's a very simple language translation system), so sorting the keys shouldn't be necessary. I would have liked to use the String.compareTo() method, but because of my generics, the compiler doesn't know that K is a String

I think you've picked a bad data structure.
HashMaps are not naturally ordered. The keys in the set for a HashMap have an unpredictable order that is sensitive to the sequence of operations that populated the map. This makes it unsuitable for comparing two HashMaps.
In order to compare a pair of HashMaps, you need to extract the respective key sets, sort them and then compare the sorted sets. In other words, a compareTo method for HashMap-derived classes is going to be O(N log N) on average.
FWIW, a compareTo implementation would look something like this, assuming that the method is to order the HashMaps based on the sorted lists of keys in their respective key sets. Obviously, there are other orderings based on the key sets.
public int compareTo(MyMap<K, V> other) {
List<K> myKeys = new ArrayList<K>(this.keySet());
List<K> otherKeys = new ArrayList<K>(other.keySet());
Collections.sort(myKeys);
Collections.sort(otherKeys);
final int minSize = Math.min(myKeys.size(), otherKeys.size());
for (int i = 0; i < minSize; i++) {
int cmp = myKeys.get(i).compareTo(otherKeys.get(i));
if (cmp != 0) {
return cmp;
}
}
return (myKeys.size() - otherKeys.size());
}
If there is only ever one key / value pair in the map, then you should replace it with a simple Pair<K,V> class. Using a HashMap to represent a single pair is ... crazy.
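As a rough sketch of the Pair idea, assuming the key type is Comparable and that ordering by key alone is what the tree needs: the class name Pair and its accessors below are illustrative and not taken from any library.

```java
import java.util.Objects;

// Hypothetical immutable key/value pair that is ordered by its key only.
final class Pair<K extends Comparable<K>, V> implements Comparable<Pair<K, V>> {
    private final K key;
    private final V value;

    Pair(K key, V value) {
        this.key = Objects.requireNonNull(key, "key");
        this.value = value;
    }

    K getKey() { return key; }

    V getValue() { return value; }

    // Delegates to the key's natural ordering, so with K = String this is String.compareTo.
    @Override
    public int compareTo(Pair<K, V> other) {
        return key.compareTo(other.key);
    }
}
```

With this, a BinaryTree<Pair<String, String>> can order its elements without the compiler needing to know that K is a String; the bound K extends Comparable<K> is enough.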

Related

TreeMap<int[],Double> initialisation and sorting by value

I have to make a sorted map whose key is an int[] and whose value is a double. Key and value can't be swapped because the double values will contain duplicates. Moreover, the map will be sorted by value and the last x entries will be deleted.
I tried to make
Map<int[],Double> map = new TreeMap<>();
int[] i = {0,1,1,0};
map.put(i,8.5); // ERROR HERE Organisms.java:46
i = new int[]{0,0,0,0};
map.put(i,30.0);
System.out.println("sorted" + sortByValue(map));
Exception in thread "AWT-EventQueue-0" java.lang.ClassCastException:
[I cannot be cast to java.lang.Comparable at
java.util.TreeMap.compare(TreeMap.java:1294) at
java.util.TreeMap.put(TreeMap.java:538) at
com.pszt_organism.Organisms.test(Organisms.java:46) <-- MARKED ERROR
I found method sortByValue(Map<K, V> map) in this topic: java8 example by Carter Page
I suppose that TreeMap has a problem with sorting an int array. How can I solve that?
EDIT:
private <K, V extends Comparable<? super V>> Map<K, V> sortByValue(Map<K, V> map) {
return map.entrySet()
.stream()
.sorted(Map.Entry.comparingByValue(/*Collections.reverseOrder()*/))
.collect(Collectors.toMap(
Map.Entry::getKey,
Map.Entry::getValue,
(e1, e2) -> e1,
LinkedHashMap::new
));
}
The problem is that Java array types do not implement Comparable.
The solution: implement a Comparator that will compare your int[] keys, and pass an instance as a parameter to the TreeMap constructor.
However, this will only work if the int[] objects are not mutated while they are in use as keys. If you mutate them, then you will "break" the TreeMap and Map operations will behave incorrectly.
You could also wrap the int[] objects in a class that implements Comparable, implement compareTo, equals and hashCode. The same caveat about mutation applies with this approach as well.
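A minimal sketch of the wrapper approach, assuming Java 9 or later for Arrays.compare(int[], int[]). The class name IntArrayKey is hypothetical; the defensive copies are one way to sidestep the mutation caveat.

```java
import java.util.Arrays;

// Hypothetical wrapper that makes an int[] usable as a TreeMap or HashMap key.
final class IntArrayKey implements Comparable<IntArrayKey> {
    private final int[] values;

    IntArrayKey(int... values) {
        // Defensive copy so later changes to the caller's array cannot affect the key.
        this.values = values.clone();
    }

    int[] toArray() {
        return values.clone();
    }

    @Override
    public int compareTo(IntArrayKey other) {
        // Lexicographic comparison; Arrays.compare(int[], int[]) needs Java 9+.
        return Arrays.compare(values, other.values);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof IntArrayKey && Arrays.equals(values, ((IntArrayKey) o).values);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(values);
    }
}
```

A Map<IntArrayKey, Double> built on TreeMap then accepts map.put(new IntArrayKey(0, 1, 1, 0), 8.5) without the ClassCastException.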
int[] arrays are poorly suited for use as keys in a Map (see this Q&A for a brief explanation; it talks about array lists, but the same logic applies to arrays as well).
If you are set on using arrays as keys anyway, proceed with caution:
Since int[] is not Comparable, you cannot use it as a key without supplying a piece of logic for comparing arrays.
Here is how you can do it:
Map<int[],Double> map = new TreeMap<>(
new Comparator<int[]>() {
@Override public int compare(int[] lhs, int[] rhs) {
int len = Math.min(lhs.length, rhs.length);
for (int i = 0 ; i != len ; i++) {
if (lhs[i] != rhs[i]) {
return Integer.compare(lhs[i], rhs[i]);
}
}
// If we're here, common elements match up;
// hence, the longer of the two arrays wins.
return Integer.compare(lhs.length, rhs.length);
}
}
);
You don't need to use TreeMap to use that sorting method. Just use a different Map. That sorting method creates a new LinkedHashMap for the result, so the Map which is passed as an argument is just a temporary container.

Why do different keys with duplicate values disappear when transferring a HashMap to a TreeMap?

I use the code below to sort my HashMap by its values. But the result is confusing, because it keeps only one entry per value and removes the other entries that have a duplicate value.
Here is the Comparator code:
class ValueComparator implements Comparator {
Map map;
public ValueComparator(Map map) {
this.map = map;
}
public int compare(Object keyA, Object keyB) {
Comparable valueA = (Comparable) map.get(keyA);
Comparable valueB = (Comparable) map.get(keyB);
return valueB.compareTo(valueA);
}
}
And here is how I use it:
TreeMap sortedMap=new TreeMap(new ValueComparator(allCandidateMap));
sortedMap.putAll(allCandidateMap);
That makes perfect sense. You've declared that if two keys map to equal values in allCandidateMap, they should be considered equal, since your compare will be returning 0.
What this comes down to is that your comparator has almost reversed the roles of key and value. If you try doing other operations you will probably find that the values of the map often behave like keys. Methods like get and containsKey will act as if they're looking up the values, not the keys (but then get will return the value you passed in, so values are still values as well). The comparator defines the behaviour of the TreeMap, and you've asked for very weird behaviour.

Is there any data structure that offers fast key/value access, but that is also ordered and can be accessed by a position index?

Basically, I need something like a TreeMap but that would allow me to get the element at the position X efficiently.
You can use a ListOrderedMap from Apache Commons Collections.
It gives you a get(int index) method to retrieve the key at position index on top of the usual Map methods.
A balanced tree can be used for both lookups by key and by index, both in O(log N) time, if you store a "size" field in each node which tracks how many key/value pairs are contained in the node and all its descendants.
The code for looking up a value by index would look something like this (in pseudocode):
def at(index)
if index == this.left.size
return this.value
else if index < this.left.size
return this.left.at(index)
else
return this.right.at(index - this.left.size - 1)
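The pseudocode above can be turned into Java roughly as follows. This is only a sketch: the class name SizedTree is made up, and the tree is a plain unbalanced binary search tree to keep it short, so the O(log N) bound holds only if a balancing scheme (which would also have to maintain the size fields during rotations) is added.

```java
// Sketch of a search tree whose nodes track subtree sizes (no balancing).
final class SizedTree<K extends Comparable<K>, V> {
    private final class Node {
        final K key;
        V value;
        Node left, right;
        int size = 1; // entries in this node's subtree, including the node itself

        Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private Node root;

    private int size(Node n) {
        return n == null ? 0 : n.size;
    }

    int size() {
        return size(root);
    }

    void put(K key, V value) {
        root = put(root, key, value);
    }

    private Node put(Node n, K key, V value) {
        if (n == null) return new Node(key, value);
        int cmp = key.compareTo(n.key);
        if (cmp < 0) n.left = put(n.left, key, value);
        else if (cmp > 0) n.right = put(n.right, key, value);
        else n.value = value; // existing key: replace the value, size is unchanged
        n.size = 1 + size(n.left) + size(n.right);
        return n;
    }

    // Returns the value at the given 0-based position in key order.
    V at(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        Node n = root;
        while (true) {
            int leftSize = size(n.left);
            if (index == leftSize) return n.value;
            if (index < leftSize) {
                n = n.left;
            } else {
                index -= leftSize + 1; // skip the left subtree and this node
                n = n.right;
            }
        }
    }
}
```

Unlike the pseudocode, the size helper treats a missing child as size 0, which avoids dereferencing a null left subtree.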
TreeMap's get is O(log n), so I guess lookup performance is not a concern.
As far as I know an array gives O(1) access by position, but your other requirements match a TreeMap.
The SortedMap interface with a TreeMap implementation should suit you :).
It wouldn't be difficult to combine some data structures to provide this. Assuming there are no duplicates, the sketch below could work. If you do need to support duplicates, wrap your objects in something that provides unique hashes, like the default Object class does.
I have no idea how you want to use the positional data, so I didn't add any methods that relate to reordering or iterating, but it wouldn't be difficult. Apache's ListOrderedMap sounds like a good choice too.
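import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

public class OrderedMap<K, V> {
    private final ArrayList<V> values = new ArrayList<>();
    private final HashMap<K, V> map = new HashMap<>();
    private final HashMap<K, Integer> keysToIndices = new HashMap<>();

    public void put(K key, V value) {
        Integer index = keysToIndices.get(key);
        if (index == null) {
            // New key: append and remember its position.
            values.add(value);
            keysToIndices.put(key, values.size() - 1);
        } else {
            // Existing key: overwrite in place so the position is kept.
            values.set(index, value);
        }
        map.put(key, value);
    }

    public V get(K key) {
        return map.get(key);
    }

    public V remove(K key) {
        Integer index = keysToIndices.remove(key);
        if (index == null) {
            return null;
        }
        map.remove(key);
        // Unbox explicitly so that List.remove(int) is called, not remove(Object).
        V removed = values.remove(index.intValue());
        // Later elements shifted left by one, so their stored indices must follow (O(n)).
        for (Map.Entry<K, Integer> e : keysToIndices.entrySet()) {
            if (e.getValue() > index) {
                e.setValue(e.getValue() - 1);
            }
        }
        return removed;
    }
}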

TreeMap sort is not working on equal values

I have written a method that sorts the TreeMap by its values
public TreeMap<String, Integer> sortByValues(final TreeMap<String, Integer> map) {
Comparator<String> valueComparator = new Comparator<String>() {
public int compare(String k1, String k2) {
int compare = map.get(k1).compareTo(map.get(k2));
return compare;
}
};
TreeMap<String, Integer> sortedByValues = new TreeMap<String, Integer>(valueComparator);
sortedByValues.putAll(map);
return sortedByValues;
}
The above method works fine in the normal case but fails when there is a duplicate value present in the TreeMap. Any entry with a duplicate value is removed from the Map. After googling it I found the following solution:
public Map<String, Integer> sortByValuesTree(final Map<String, Integer> map) {
Comparator<String> valueComparator = new Comparator<String>() {
public int compare(String k1, String k2) {
int compare = map.get(k1).compareTo(map.get(k2));
if (compare == 0) return 1;
else return compare;
}
};
Map<String, Integer> sortedByValues = new TreeMap<String, Integer>(valueComparator);
sortedByValues.putAll(map);
return sortedByValues;
}
The above works fine, but I am not able to understand why the first method didn't work. Why did it remove the entries with duplicate values? Can someone please let me know?
Why did it remove duplicate value entry?
Because that's the very definition of a Map: for a given key, it stores one value. If you put another value for a key that is already in the map, the new value replaces the old one for this key. And since you told the map that a key is equal to another one when their associated values are equal, the map considers two keys to be equal when their values are equal.
Note that your solution is a bad one that probably works by accident, because your comparator doesn't respect the contract of the Comparator interface. When two values are equal, you arbitrarily decide that the first argument is bigger than the second. This means that your comparator claims both A > B and B > A at the same time, which is inconsistent.
Sorting a TreeMap by value just looks like an absurdity to me. You won't be able to add any new value to the map anyway, since it would require the entry to already exist in the old map. Shouldn't you simply have a sorted list of map entries?
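A sorted list of map entries could be produced along these lines. This is a sketch for Java 8 or later; the class and method names are illustrative, and breaking ties by key is an assumption made so that the order is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class EntrySorting {
    // Returns the map's entries in a new list ordered by value, ties broken by key.
    static List<Map.Entry<String, Integer>> sortedByValue(Map<String, Integer> map) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(map.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue()
                .thenComparing(Map.Entry.comparingByKey()));
        return entries;
    }
}
```

Because the result is a list, entries with equal values are all kept; nothing is treated as a duplicate key.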
Actually, especially in Java 7 onwards, even your second method is not going to work. (See below for why.) Anyway, the reason is that maps must have distinct keys, and when you are using your value as key, two equal values would be treated as equal keys.
The proper fix, by the way, is to sort by value, then by key:
int compare = map.get(k1).compareTo(map.get(k2));
if (compare == 0) {
compare = k1.compareTo(k2);
}
return compare;
Proper comparators must follow three rules:
compare(a, a) == 0 for all values of a.
signum(compare(a, b)) == -signum(compare(b, a)) for all values of a and b.
if signum(compare(a, b)) == signum(compare(b, c)), then signum(compare(a, c)) must also have the same value, for all values of a, b, and c.
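Put together, the value-then-key comparison might be used like this. It is a sketch only (the class name ValueThenKey is made up), and it keeps the limitation that the comparator can only handle keys present in the original map.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

final class ValueThenKey {
    static Map<String, Integer> sortByValues(final Map<String, Integer> map) {
        // Order by value first; fall back to the key so distinct keys never compare as equal.
        Comparator<String> byValueThenKey = (k1, k2) -> {
            int cmp = map.get(k1).compareTo(map.get(k2));
            return cmp != 0 ? cmp : k1.compareTo(k2);
        };
        Map<String, Integer> sorted = new TreeMap<>(byValueThenKey);
        sorted.putAll(map);
        return sorted;
    }
}
```

This comparator returns 0 only when both keys are the same string, so it satisfies the three rules listed above and no entries are dropped. Looking up a key that is absent from the source map would still fail with a NullPointerException inside the comparator.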

Automatically sorted by values map in Java

I need an automatically sorted-by-values map in Java, so that it stays sorted at all times while I'm adding new key-value pairs, updating the value of an existing key-value pair, or even deleting an entry.
Please also have in mind that this map is going to be really big (100's of thousands, or even 10's of millions of entries in size).
So basically I'm looking for the following functionality:
Supposed that we had a class 'SortedByValuesMap' that implements the aforementioned functionality
and we have the following code:
SortedByValuesMap<String,Long> sorted_map = new SortedByValuesMap<String, Long>();
sorted_map.put("apples", 4L);
sorted_map.put("oranges", 2L);
sorted_map.put("bananas", 1L);
sorted_map.put("lemons", 3L);
sorted_map.put("bananas", 6L);
for (String key : sorted_map.keySet()) {
System.out.println(key + ":" + sorted_map.get(key));
}
the output should be:
bananas:6
apples:4
lemons:3
oranges:2
In particular, what's really important for me, is to be able to get the entry with the
lowest value at any time - using a command like:
smallestItem = sorted_map.lastEntry();
which should give me the 'oranges' entry
EDIT: I am a Java newbie so please elaborate a bit in your answers - thanks
EDIT2: This might help: I am using this for counting words (for those who are familiar: n-grams in particular) in huge text files. So I need to build a map where keys are words and values are the frequencies of those words. However, due to limitations (like RAM), I want to keep only the X most frequent words, but of course you can't know beforehand which words are going to be the most frequent. So the way I thought it might work (as an approximation) is to start counting words, and when the map reaches an upper limit (like 1 million entries), delete the least frequent entry so that the map's size always stays at 1 million.
Keep 2 data structures:
A dictionary of words -> count. Just use an ordinary HashMap<String, Long>.
An "array" to keep track of order, such that list[count] holds a Set<String> of words with that count.
I'm writing this as though it were an array as a notational convenience. In fact, you probably don't know an upper bound on the number of occurrences, so you need a resizable data structure. Implement using a Map<Long, Set<String>>. Or, if that uses too much memory, use an ArrayList<Set<String>> (you'll have to test for count == size() - 1, and if so, use add() instead of set(count + 1)).
To increment the number of occurrences for a word (pseudocode):
// assumes data structures are in instance variables dict and arr
public void tally(final String word)
{
final long count = this.dict.get(word) or 0 if absent;
this.dict.put(word, count + 1);
// move word up one place in arr
this.arr[count].remove(word); // This is why we use a Set: for fast deletion here.
this.arr[count + 1].add(word);
}
To iterate over words in order (pseudocode):
for(int count = 0; count < arr.size; count++)
for(final String word : this.arr[count])
process(word, count);
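The pseudocode above might translate to Java roughly as follows. This sketch uses the Map<Long, Set<String>> variant, and assumes a TreeMap for it so that the lowest count is always available as the first key; the class name WordTally and the evictLeastFrequent method are illustrative additions for the size-limit use case.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical class: counts words and groups them by their current count.
final class WordTally {
    private final Map<String, Long> dict = new HashMap<>();
    // Sorted by count, so the first entry always holds the least frequent words.
    private final TreeMap<Long, Set<String>> byCount = new TreeMap<>();

    void tally(String word) {
        long old = dict.getOrDefault(word, 0L);
        dict.put(word, old + 1);
        if (old > 0) {
            // Move the word out of its old bucket; a Set makes this removal cheap.
            Set<String> bucket = byCount.get(old);
            bucket.remove(word);
            if (bucket.isEmpty()) byCount.remove(old); // drop empty buckets
        }
        byCount.computeIfAbsent(old + 1, c -> new HashSet<>()).add(word);
    }

    long count(String word) {
        return dict.getOrDefault(word, 0L);
    }

    int size() {
        return dict.size();
    }

    // Removes and returns one word with the lowest count, or null if nothing is stored.
    String evictLeastFrequent() {
        if (byCount.isEmpty()) return null;
        Map.Entry<Long, Set<String>> lowest = byCount.firstEntry();
        String word = lowest.getValue().iterator().next();
        lowest.getValue().remove(word);
        if (lowest.getValue().isEmpty()) byCount.remove(lowest.getKey());
        dict.remove(word);
        return word;
    }
}
```

Iterating byCount (or byCount.descendingMap()) visits the words in ascending (or descending) order of frequency; which word is evicted among several with the same lowest count is unspecified.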
How about using additional index or only TreeMap<Long, TreeSet<String>> or TreeMap<Long, String> if Long values are distinct?
You can also write a Heap.
Guava BiMap Solution:
//Prepare original data
BiMap<String, Integer> biMap = HashBiMap.create();
biMap.put("apples" , 4);
biMap.put("oranges", 2);
biMap.put("bananas", 1);
biMap.put("lemons" , 3);
biMap.put("bananas", 6);
//Create a desc order SortedMap
SortedMap<Integer, String> sortedMap = new TreeMap<Integer, String>(new Comparator<Integer>(){
@Override public int compare(Integer o1, Integer o2) {
return Integer.compare(o2, o1); // descending; avoids the overflow risk of o2-o1
}});
//Put inversed map
sortedMap.putAll(biMap.inverse());
for (Map.Entry<Integer, String> e: sortedMap.entrySet()) {
System.out.println(e);
}
System.out.println(sortedMap.lastKey());
Try the solution posted on http://paaloliver.wordpress.com/2006/01/24/sorting-maps-in-java/ . You have the flexibility of doing sorting ascending or descending too.
Here is what they say
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
public class MapValueSort {
/** inner class to do sorting of the map **/
private static class ValueComparer implements Comparator<String> {
private Map<String, String> _data = null;
public ValueComparer (Map<String, String> data){
super();
_data = data;
}
public int compare(String o1, String o2) {
String e1 = (String) _data.get(o1);
String e2 = (String) _data.get(o2);
return e1.compareTo(e2);
}
}
public static void main(String[] args){
Map<String, String> unsortedData = new HashMap<String, String>();
unsortedData.put("2", "DEF");
unsortedData.put("1", "ABC");
unsortedData.put("4", "ZXY");
unsortedData.put("3", "BCD");
SortedMap<String, String> sortedData = new TreeMap<String, String>(new MapValueSort.ValueComparer(unsortedData));
printMap(unsortedData);
sortedData.putAll(unsortedData);
System.out.println();
printMap(sortedData);
}
private static void printMap(Map<String, String> data) {
for (Iterator<String> iter = data.keySet().iterator(); iter.hasNext();) {
String key = (String) iter.next();
System.out.println("Value/key:"+data.get(key)+"/"+key);
}
}
}
Outputs
Value/key:BCD/3
Value/key:DEF/2
Value/key:ABC/1
Value/key:ZXY/4
Value/key:ABC/1
Value/key:BCD/3
Value/key:DEF/2
Value/key:ZXY/4
I found the need of a similar structure to keep a list of objects ordered by associated values. Based on the suggestion from Mechanical snail in this thread, I coded up a basic implementation of such a map. Feel free to use.
import java.util.*;
/**
* A map where {@link #keySet()} and {@link #entrySet()} return sets ordered
* with ascending associated values with respect to the comparator provided
* at construction. The order of two or more keys with identical values is not
* defined.
* <p>
* Several contracts of the Map interface are not satisfied by this minimal
* implementation.
*/
public class ValueSortedMap<K, V> extends HashMap<K, V> {
protected Map<V, Collection<K>> valueToKeysMap;
public ValueSortedMap() {
this((Comparator<? super V>) null);
}
public ValueSortedMap(Comparator<? super V> valueComparator) {
this.valueToKeysMap = new TreeMap<V, Collection<K>>(valueComparator);
}
public boolean containsValue(Object o) {
return valueToKeysMap.containsKey(o);
}
public V put(K k, V v) {
V oldV = null;
if (containsKey(k)) {
oldV = get(k);
valueToKeysMap.get(oldV).remove(k);
}
super.put(k, v);
if (!valueToKeysMap.containsKey(v)) {
Collection<K> keys = new ArrayList<K>();
keys.add(k);
valueToKeysMap.put(v, keys);
} else {
valueToKeysMap.get(v).add(k);
}
return oldV;
}
public void putAll(Map<? extends K, ? extends V> m) {
for (Map.Entry<? extends K, ? extends V> e : m.entrySet())
put(e.getKey(), e.getValue());
}
public V remove(Object k) {
V oldV = null;
if (containsKey(k)) {
oldV = get(k);
super.remove(k);
valueToKeysMap.get(oldV).remove(k);
}
return oldV;
}
public void clear() {
super.clear();
valueToKeysMap.clear();
}
public Set<K> keySet() {
LinkedHashSet<K> ret = new LinkedHashSet<K>(size());
for (V v : valueToKeysMap.keySet()) {
Collection<K> keys = valueToKeysMap.get(v);
ret.addAll(keys);
}
return ret;
}
public Set<Map.Entry<K, V>> entrySet() {
LinkedHashSet<Map.Entry<K, V>> ret = new LinkedHashSet<Map.Entry<K, V>>(size());
for (Collection<K> keys : valueToKeysMap.values()) {
for (final K k : keys) {
final V v = get(k);
ret.add(new Map.Entry<K,V>() {
public K getKey() {
return k;
}
public V getValue() {
return v;
}
public V setValue(V v) {
throw new UnsupportedOperationException();
}
});
}
}
return ret;
}
}
This implementation does not honor all the contracts of the Map interface such as reflecting value changes and removals in the returned key set and entry sets in the actual map, but such a solution would be a bit large to include in a forum like this. Perhaps I will work on one and make it available via github or something similar.
Update: You cannot sort maps by values, sorry.
You can use a SortedMap implementation such as TreeMap with a Comparator that defines the order by values (instead of the default, by keys).
Or, even better, you can put the elements into a PriorityQueue with a predefined comparator that orders by value. It should be faster and take less memory than a TreeMap.
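One way the PriorityQueue idea could look is sketched below for picking the k entries with the highest values out of an already counted map. The class and method names are made up. Note that a PriorityQueue does not reorder itself when a value changes after insertion, so this fits a finished count rather than a live one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class TopK {
    // Returns the k keys with the highest values, highest first.
    static List<String> topK(Map<String, Long> counts, int k) {
        // Min-heap on value: the head is the weakest of the current candidates.
        PriorityQueue<Map.Entry<String, Long>> heap =
                new PriorityQueue<>(Map.Entry.<String, Long>comparingByValue());
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            heap.offer(e);
            if (heap.size() > k) heap.poll(); // evict the smallest candidate
        }
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) result.add(heap.poll().getKey());
        Collections.reverse(result); // polled lowest first, so flip to highest first
        return result;
    }
}
```

The heap never holds more than k + 1 entries, which is where the memory saving over a full sorted map comes from.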
You may refer to the implementation of java.util.LinkedHashMap.
The basic idea is to use an internal linked list to store the order. Here are some details:
Extend HashMap. In HashMap, each entry has a key and a value. You can add next and prev pointers to each entry to keep the entries in order by value, plus head and tail pointers to reach the first and last entry. For every modification (add, remove, update), you add your own code to update the list order. It is no more than a linear search and a pointer switch.
It will certainly be slow for add/update if there are many entries, because it is a linked list and not an array. But as long as the list is kept sorted, I believe there are many ways to speed up the search.
So here is what you get: a map that is as fast as a HashMap when retrieving an entry by key, and a linked list that stores the entries in order.
We can discuss this further if this solution meets your requirement.
to jtahlborn:
As I said, it will surely be slow without any optimization. Since we are now talking about performance rather than implementation, a lot can be done.
One solution is to use a tree, such as a red-black tree, instead of a linked list. Then iterate over the tree instead of the map.
Getting the smallest value is easier. Just use a member variable to store the smallest entry, and update it when an element is added or updated. On delete, search the tree for the smallest (this is very fast).
If a tree is too complex, it is also possible to use another list/array to mark certain positions in the list, for example every 100 elements. Then, when searching, search the position list first and then the real list. This position list also needs to be maintained; it would be reasonable to rebuild it after a certain number of modifications, maybe 100.
if all you need is the "min" value, then just use a normal map and keep track of the "min" value anytime it is modified.
EDIT:
so, if you really need value ordering and you want to use out-of-the-box solutions, you basically need 2 collections. One normal map (e.g. HashMap), and one SortedSet (e.g. TreeSet). you can traverse ordered elements via the TreeSet, and find frequencies by key using the HashMap.
obviously, you could always code up something yourself sort of like a LinkedHashMap, where the elements are locatable by key and traversable by order, but that's pretty much going to be entirely custom code (i doubt anything that specific already exists, but i could be wrong).
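A minimal sketch of the two-collection approach, assuming the sorted set holds immutable (word, count) entries ordered by count and then by word. The class name FrequencyIndex and its methods are hypothetical; the element type of the set is a guess, since the answer does not spell it out.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

final class FrequencyIndex {
    private final Map<String, Long> counts = new HashMap<>();
    // Immutable (word, count) snapshots ordered by count, then by word.
    private final TreeSet<Map.Entry<String, Long>> ordered = new TreeSet<>(
            Map.Entry.<String, Long>comparingByValue().thenComparing(Map.Entry.comparingByKey()));

    void increment(String word) {
        Long old = counts.get(word);
        if (old != null) {
            // Drop the stale snapshot before the count changes.
            ordered.remove(new SimpleImmutableEntry<>(word, old));
        }
        long updated = (old == null ? 0L : old) + 1;
        counts.put(word, updated);
        ordered.add(new SimpleImmutableEntry<>(word, updated));
    }

    long count(String word) {
        return counts.getOrDefault(word, 0L);
    }

    // Entry with the lowest count, or null when nothing has been counted.
    Map.Entry<String, Long> leastFrequent() {
        return ordered.isEmpty() ? null : ordered.first();
    }

    // Entries from the highest count down to the lowest.
    Iterable<Map.Entry<String, Long>> descending() {
        return ordered.descendingSet();
    }
}
```

Each update costs O(log n) in the set plus an expected O(1) in the map. The key point is that an entry is removed from the set under its old count and re-added under the new one, so the set's ordering is never invalidated in place.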
