ArrayList sorting, application stuck - java

I have an ArrayList filled with words from a text file that I need to sort by number of occurrences, from the most frequent word to the least frequent. I copy the original ArrayList of words to another ArrayList, appending the number of occurrences to each word. So a word in the new ArrayList would look like, for example:
"password:125", where "password" is the word and "125" is the number of occurrences in the ArrayList.
for (int i=0;i<sorter.size();i++) {
sorter2.add(sorter.get(i)+":"+Collections.frequency(sorter, sorter.get(i)));
}
Afterwards I sort the ArrayList with this class:
public class RepeatFormulaCounter implements Comparator<String> {
@Override
public int compare(String o1, String o2) {
if (findValue(o2) != findValue(o1)) {
return findValue(o2) - findValue(o1);
}
return o2.compareTo(o1);
}
public int findValue(String find){
int result=0;
String spliter[]=find.split(":");
result=Integer.parseInt(spliter[1]);
return result;
}
}
I have 5 text files filled with words: 3 of them contain around 45 000 words each and 2 contain more than 1 000 000. The files with around 45 000 words are sorted and displayed without any problems, but when I start to sort the ones with more than 1 000 000 words the application gets stuck. Why does this happen, and how can I fix it?
Please note that I am using a GUI application to display the results. I am also using 2 similar sort classes for sorting by other criteria, and they run and display without any problems.

Why do you store words as "password:125"? That is a very inefficient way to work. Use an efficient data structure for your word statistics: use the Map interface and choose a suitable implementation to store each word with its number of occurrences.
Map<String, Integer> wordsMap = new HashMap<String, Integer>();
/* Fill the wordsMap with data, then use the function below to sort.
Putting and updating a value by key is simple:
wordsMap.put(key, 50); <-- put value
wordsMap.put(key, wordsMap.get(key) + 1); <--- update value
For example:
wordsMap.put("google", 0); <-- put value
wordsMap.put("google", wordsMap.get("google") + 1); <--- increment value by 1
*/
public static <K, V extends Comparable<? super V>> Map<K, V>
sortByValue( Map<K, V> map )
{
List<Map.Entry<K, V>> list =
new LinkedList<>( map.entrySet() );
Collections.sort( list, new Comparator<Map.Entry<K, V>>()
{
@Override
public int compare( Map.Entry<K, V> o1, Map.Entry<K, V> o2 )
{
return (o1.getValue()).compareTo( o2.getValue() );
}
} );
Map<K, V> result = new LinkedHashMap<>();
for (Map.Entry<K, V> entry : list)
{
result.put( entry.getKey(), entry.getValue() );
}
return result;
}
// sortByValue(wordsMap);
Additionally, you can read about the classes Hashtable, LinkedHashMap and TreeMap, and then choose the one with the best performance for your case. They implement the same Map interface but have different asymptotic costs for put(), get() and other methods.
The Javadocs from Sun for each collection class will generally tell you exactly what you want.
HashMap, for example:
This implementation provides constant-time performance for the basic
operations (get and put), assuming the hash function disperses the
elements properly among the buckets. Iteration over collection views
requires time proportional to the "capacity" of the HashMap instance
(the number of buckets) plus its size (the number of key-value
mappings).
TreeMap:
This implementation provides guaranteed log(n) time cost for the
containsKey, get, put and remove operations.
TreeSet:
This implementation provides guaranteed log(n) time cost for the basic
operations (add, remove and contains).
Read more about this.
If it is still slower than you expect, you can use multithreading. If you have a processor with 8 cores, you can split your file into 8 pieces, count words in 8 threads, merge the results, and then run the sort.
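As a rough sketch of the Map-based approach above: the loop in the question calls Collections.frequency once per list element, and each call scans the whole list, so the work grows quadratically with the number of words. Counting in a single pass and sorting only the distinct words avoids that. The class and method names (WordCountSketch, countWords, sortedByCount) and the sample words are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {
    // Counts every word in a single pass; merge() adds 1 to an existing count or starts at 1.
    public static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Returns "word:count" strings, most frequent first; ties are broken alphabetically.
    public static List<String> sortedByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.comparingByKey()));
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            result.add(e.getKey() + ":" + e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("password", "admin", "password", "root", "admin", "password");
        System.out.println(sortedByCount(countWords(words))); // [password:3, admin:2, root:1]
    }
}
```

With this shape the sort only touches one entry per distinct word, which is usually far fewer than the total number of words in the file.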

Most likely a memory issue. Try increasing your JVM heap size. You create lots of temporary strings, and the garbage collector will go crazy at large data sizes.

I think the problem might be outside of the code shown, but you could try to reduce object thrashing by reducing the number of findValue calls and the number of objects created (currently, each findValue call creates 3 new objects, and you call findValue up to 4 times in compare):
@Override
public int compare(String o1, String o2) {
int f2 = findValue(o2);
int f1 = findValue(o1);
if (f2 != f1) {
return f2 - f1;
}
return o2.compareTo(o1);
}
public int findValue(String find){
int result = 0;
int cut = find.lastIndexOf(':');
result = Integer.parseInt(find.substring(cut + 1));
return result;
}
This can probably be improved some more by getting rid of substring...
Probably a better option would be to hand in the map that you use for counting to the comparator constructor, and then use it in the comparator:
public class CountComparator implements Comparator<String> {
Map<String, Integer> counts;
public CountComparator(Map<String, Integer> counts) {
this.counts = counts;
}
public int compare(String o1, String o2) {
int f2 = counts.get(o2);
int f1 = counts.get(o1);
if (f1 != f2) {
return f2 - f1;
}
return o2.compareTo(o1);
}
}

Make use of streams that were introduced in Java 8. They are great for processing data.
HashMap<String, Integer> occurences = new HashMap<>();
...
Stream<String> stream = occurences.entrySet().stream()
.sorted((a, b) -> b.getValue() - a.getValue())
.map(kv -> kv.getKey());
String[] sortedWords = stream.toArray(size -> new String[size]);
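The snippet above assumes the occurrences map is already filled. A minimal sketch of doing both steps with streams, assuming the words are available as a List<String>; the class and method names are hypothetical. It uses Map.Entry.comparingByValue rather than subtraction, since Collectors.counting() produces Long values.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class StreamCountSketch {
    // Groups equal words, counts each group, and returns the distinct words, most frequent first.
    public static List<String> wordsByFrequency(List<String> words) {
        Map<String, Long> counts = words.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.comparingByKey()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(wordsByFrequency(Arrays.asList("b", "a", "b", "c", "a", "b"))); // [b, a, c]
    }
}
```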

Related

How to sort a HashMap by Key and then Value using Comparator (pasting all code, post will be long)

Starting from scratch so that people's eyes don't bleed, here is my most recent progress:
Set<Entry<GregorianCalendar, Event>> map = (MyCalendarTester.myCal.getMyCalHash().entrySet());
LinkedList<Map.Entry<GregorianCalendar, Event>> list = new LinkedList<Map.Entry<GregorianCalendar,Event>>(map);
Collections.sort(list, new Comparator<Map.Entry<GregorianCalendar, Event>>() {
public int compare(Map.Entry<GregorianCalendar, Event> e1, Map.Entry<GregorianCalendar,Event> e2) {
int r;
r = e1.getKey().compareTo(e2.getKey());
if (r!=0) return r;
r = e1.getValue().compareTo(e2.getValue());
return r;
}
});
Iterator<Map.Entry<GregorianCalendar,Event>> i2 = list.iterator();
while (i2.hasNext()) {
System.out.println(i2.next() + " , ");
}
And here is how my compareTo works:
@Override
public int compareTo(Object e) {
// TODO Auto-generated method stub
int hour = ((Event) e).getStartTime().get(Calendar.HOUR_OF_DAY);
int minute = ((Event) e).getStartTime().get(Calendar.MINUTE);
int anotherHour = this.startTime.get(Calendar.HOUR_OF_DAY);
int anotherMinute = this.startTime.get(Calendar.MINUTE);
if(anotherHour - hour == 0 ){
return anotherMinute - minute;
}else{
return anotherHour - hour;
}
}
By creating the following days and events:
Test4
04/16/2016
2:23,5:56
Test5
03/15/2015
1:11
Test6
08/29/2017
7:51,23:59
Test7
04/16/2016
1:23,5:56
My program produces:
java.util.GregorianCalendar[time=?,areFieldsSet=false,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2015,MONTH=3,WEEK_OF_YEAR=12,WEEK_OF_MONTH=3,DAY_OF_MONTH=15,DAY_OF_YEAR=75,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=3,AM_PM=0,HOUR=9,HOUR_OF_DAY=9,MINUTE=30,SECOND=38,MILLISECOND=260,ZONE_OFFSET=-28800000,DST_OFFSET=3600000]=Event#791d9ad ,
java.util.GregorianCalendar[time=?,areFieldsSet=false,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2016,MONTH=4,WEEK_OF_YEAR=12,WEEK_OF_MONTH=3,DAY_OF_MONTH=16,DAY_OF_YEAR=75,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=3,AM_PM=0,HOUR=9,HOUR_OF_DAY=9,MINUTE=30,SECOND=38,MILLISECOND=260,ZONE_OFFSET=-28800000,DST_OFFSET=3600000]=Event#7869f0bc ,
java.util.GregorianCalendar[time=?,areFieldsSet=false,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2017,MONTH=8,WEEK_OF_YEAR=12,WEEK_OF_MONTH=3,DAY_OF_MONTH=29,DAY_OF_YEAR=75,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=3,AM_PM=0,HOUR=9,HOUR_OF_DAY=9,MINUTE=30,SECOND=38,MILLISECOND=462,ZONE_OFFSET=-28800000,DST_OFFSET=3600000]=Event#733c0466 ,
(sorry, not sure how to nicely format that one)
The issue is that (1) they are all identical except for the DAY_OF_MONTH, (2) it eats up the duplicate day, and (3) I'm not sure how to pull the relevant information out of the Calendar using the LinkedList (basically, how to format it).
Again, sorry this is taking so long, but this is just beyond my programming capabilities to figure out on my own.
Use a TreeMap instead, it is sorted by key. You can't have the best of both worlds at the same time (O(1) access time and sorted keys). There are obviously ways to iterate in a sorted manner but those rely on copying all keys to a list, sorting that list, and then getting all values in that order. But really, if you want a map that is sorted by key, you should use a TreeMap. It's made specifically for that purpose.
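A minimal sketch of the TreeMap suggestion, under some assumptions: the Event class from the question is not shown, so events are represented here by their titles only, and each day maps to a list so that a second event on the same day is not swallowed. The class and method names are invented; SimpleDateFormat is one way to print only the date instead of the full Calendar dump.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.GregorianCalendar;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class CalendarTreeMapSketch {
    // Keeps a list per day so that a second event on the same day is not overwritten.
    public static void addEvent(TreeMap<GregorianCalendar, List<String>> eventsByDay,
                                GregorianCalendar day, String title) {
        eventsByDay.computeIfAbsent(day, d -> new ArrayList<>()).add(title);
    }

    // Formats each entry as "MM/dd/yyyy -> [titles]"; a TreeMap iterates in ascending key order.
    public static List<String> describe(TreeMap<GregorianCalendar, List<String>> eventsByDay) {
        SimpleDateFormat format = new SimpleDateFormat("MM/dd/yyyy", Locale.US);
        List<String> lines = new ArrayList<>();
        for (Map.Entry<GregorianCalendar, List<String>> e : eventsByDay.entrySet()) {
            lines.add(format.format(e.getKey().getTime()) + " -> " + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        TreeMap<GregorianCalendar, List<String>> eventsByDay = new TreeMap<>();
        // Months are zero-based in GregorianCalendar: 3 is April, 2 is March, 7 is August.
        addEvent(eventsByDay, new GregorianCalendar(2016, 3, 16), "Test4");
        addEvent(eventsByDay, new GregorianCalendar(2015, 2, 15), "Test5");
        addEvent(eventsByDay, new GregorianCalendar(2017, 7, 29), "Test6");
        addEvent(eventsByDay, new GregorianCalendar(2016, 3, 16), "Test7");
        for (String line : describe(eventsByDay)) {
            System.out.println(line);
        }
    }
}
```

Sorting the events within a day by start time would then be a separate sort of each list, using the Event class's own compareTo.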
You can try using a Guava TreeMultimap which will allow you to do two things: (1) have multiple values for each key and (2) keep the keys and the values sorted (according to natural order or provided comparator):
// For natural ordering of both K and V:
TreeMultimap<K, V> mMap = TreeMultimap.create();
// For custom ordering of K and V:
TreeMultimap<K, V> mMap = TreeMultimap.create(
new Comparator<K>() {
@Override
public int compare(K o1, K o2) {
// your implementation here
}
},
new Comparator<V>() {
@Override
public int compare(V o1, V o2) {
// your implementation here
}
}
);
Next, you can iterate the multimap K,V pairs like this:
for (Map.Entry<K, V> e : mMap.entries()) {
// entries are sorted by K first and then by V
K k = e.getKey();
V v = e.getValue();
// do something with k, v
}
More information about MultiMap is available here.

Sorting HashMap by value using a TreeMap and Comparator

I'm using the following code to create a HashMap and then sort the values in the HashMap by using a TreeMap and a comparator. However, the output is rather unexpected.
Any thoughts as to what I'm doing wrong would be helpful.
Code
public static void main(String[] args) {
System.out.println("Most freq"+mostFreq(" i me hello hello hello me"));
}
public static String[] mostFreq(String str){
if ((str==null)||( str.trim().equalsIgnoreCase("")))
return null;
String[] arr = new String[10];
String[] words= str.split(" ");
Map <String,Integer> map = new HashMap<String,Integer>();
for (String word :words)
{
int count =0;
if (map.containsKey(word))
{
count= map.get(word);
map.put(word, count+1);
}
else
map.put(word, 1);
}
MyComparator comp= new MyComparator(map);
Map<String,Integer> newMap= new TreeMap(comp);
newMap.putAll(map);
Iterator it= newMap.entrySet().iterator();
while (it.hasNext())
{
Map.Entry pairs = (Map.Entry) it.next();
System.out.println("Key "+pairs.getKey()+"-- value"+pairs.getValue());
}
return arr;
}
Here's the comparator
package samplecodes;
import java.util.Comparator;
import java.util.Map;
public class MyComparator implements Comparator {
Map map;
public MyComparator(Map map){
this.map=map;
}
@Override
public int compare(Object o1, Object o2) {
return ((Integer)map.get(o1) >(Integer)map.get(o2)? (Integer)map.get(o1):(Integer)map.get(o2));
}
}
And the output is of the form
me-2
hello-3
i-3
Please check the JavaDoc of compare: you do not return the bigger value, but a negative integer if o1 < o2, zero if o1 equals o2, and a positive integer if o1 > o2. So you could write:
@Override
public int compare(Object o1, Object o2) {
return ((Integer) map.get(o1)).compareTo((Integer) map.get(o2));
}
The JavaDoc of TreeMap clearly states:
A Red-Black tree based NavigableMap implementation. The map is sorted
according to the natural ordering of its keys
We should not violate this rule by using a TreeMap to sort by values.
However, to sort by values, we can do the following:
Create a LinkedList of the entries of the map.
Use Collections.sort to sort the entries.
Insert the sorted entries into a LinkedHashMap, which keeps the keys in the order they are inserted, which at that point is sorted by value.
Return the LinkedHashMap as the sorted map.
public static <K, V extends Comparable<? super V>> Map<K, V> sortByValues(Map<K, V> map) {
List<Map.Entry<K, V>> entries = new LinkedList<Map.Entry<K, V>>(map.entrySet());
Collections.sort(entries, new Comparator<Map.Entry<K, V>>() {
@Override
public int compare(Map.Entry<K, V> o1, Map.Entry<K, V> o2) {
return o1.getValue().compareTo(o2.getValue());
}
});
Map<K, V> sortedMap = new LinkedHashMap<K, V>();
for (Map.Entry<K, V> entry : entries) {
sortedMap.put(entry.getKey(), entry.getValue());
}
return sortedMap;
}
Reference: Sorting Map by value
What you are doing is really a misuse of the tools.
I believe what you need to do is:
Have a list/array of input words (it is still fine to get it by splitting the input string)
Create a Map to store the word as key and the frequency as value
Have a collection of unique words, then sort the collection based on the frequency
When producing the output, traverse the sorted unique word list; for each element, get the frequency from the frequency map and output the word + frequency.
Of course you can still make use of something like a TreeMap with the frequency as key, but the value of this map should be a list of words (aka a multimap), instead of writing a problematic comparator which does not follow the contract of Comparator: http://docs.oracle.com/javase/6/docs/api/java/util/Comparator.html#compare%28T,%20T%29 Neither your original implementation nor the one in the comment on one of the answers complies with the rule sgn(compare(x, y)) == -sgn(compare(y, x)) for all x and y (the original one is even worse).
some code snippet just for giving you hints:
List<String> words = ....;
Map<String, Integer> wordFrequencyMap = new HashMap<String, Integer>();
// iterate words and update wordFrequencyMap accordingly
List<String> uniqueWords = new ArrayList<String>(new HashSet<String>(words));
Collections.sort(uniqueWords, new WordFrequencyComparator<String>(wordFrequencyMap));
for (String w : uniqueWords) {
System.out.println("word : " + w + " frequency : " + wordFrequencyMap.get(w));
}
The missing part shouldn't be anything difficult.
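One possible way to fill in the missing part, as a sketch only: the WordFrequencyComparator below is a hypothetical, non-generic version of the class named in the hint, and the alphabetical tie-break is an added assumption so that words with equal counts come out in a deterministic order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

public class WordFrequencySketch {
    // Hypothetical comparator: higher frequency first, then alphabetical for ties.
    static class WordFrequencyComparator implements Comparator<String> {
        private final Map<String, Integer> frequencies;

        WordFrequencyComparator(Map<String, Integer> frequencies) {
            this.frequencies = frequencies;
        }

        @Override
        public int compare(String a, String b) {
            int byCount = Integer.compare(frequencies.get(b), frequencies.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        }
    }

    public static List<String> uniqueWordsByFrequency(String text) {
        List<String> words = Arrays.asList(text.trim().split("\\s+"));
        Map<String, Integer> wordFrequencyMap = new HashMap<>();
        for (String w : words) {
            wordFrequencyMap.merge(w, 1, Integer::sum);
        }
        // Sort a list of unique words; the map is only consulted, never used as a sorted container.
        List<String> uniqueWords = new ArrayList<>(new HashSet<>(words));
        Collections.sort(uniqueWords, new WordFrequencyComparator(wordFrequencyMap));
        return uniqueWords;
    }

    public static void main(String[] args) {
        System.out.println(uniqueWordsByFrequency(" i me hello hello hello me")); // [hello, me, i]
    }
}
```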

Sort arraylist by number of times in arraylist and then remove duplicates

Right now I have something in PHP that looks like this:
$count = array_count_values($result);
arsort($count);
foreach($count as $key => $val){
$result[] = $key;
}
It counts all the items in the array and puts them into key/value pairs, which removes the duplicates, and then I tell it to sort. Then I take each key and store it. Is there a way to do this in Java?
I don't believe Java has an equivalent to the array_count_values function, so you will need to implement that yourself. Something like this:
public static <T> Map<T, Integer> countValues(List<T> values) {
Map<T, Integer> result = new HashMap<T, Integer>();
// iterate through values, and increment the corresponding count in result
for (T value : values) {
Integer count = result.get(value);
result.put(value, count == null ? 1 : count + 1);
}
return result;
}
Then use the java.util.Collections.sort(List list, Comparator c) function to sort the list by the counts. You'll need to implement a Comparator that compares by count.
public class CountComparator<T> implements Comparator<T> {
private Map<T, Integer> counts;
public CountComparator(Map<T, Integer> counts) {
this.counts = counts;
}
public int compare(T o1, T o2) {
// assumes that the map contains all keys
return counts.get(o1).compareTo(counts.get(o2));
}
}
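Note that the CountComparator above orders from lowest to highest count, whereas PHP's arsort orders from highest to lowest. A sketch of the whole PHP snippet in Java, assuming Java 8 or later; the class and method names are illustrative, and a LinkedHashMap is used so that values with equal counts keep their first-seen order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CountValuesSketch {
    // Rough stand-in for array_count_values: value -> number of occurrences,
    // kept in first-seen order by the LinkedHashMap.
    public static <T> Map<T, Integer> countValues(List<T> values) {
        Map<T, Integer> result = new LinkedHashMap<>();
        for (T value : values) {
            result.merge(value, 1, Integer::sum);
        }
        return result;
    }

    // Rough stand-in for arsort plus the foreach: distinct values, highest count first.
    public static <T> List<T> distinctByCountDescending(List<T> values) {
        final Map<T, Integer> counts = countValues(values);
        List<T> distinct = new ArrayList<>(counts.keySet());
        // List.sort is stable, so values with equal counts keep their first-seen order.
        distinct.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        return distinct;
    }

    public static void main(String[] args) {
        System.out.println(distinctByCountDescending(Arrays.asList("z", "y", "x", "y", "x", "x"))); // [x, y, z]
    }
}
```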
How about using a Multiset from the Google Guava library to get counts. It works in a similar way to PHP's array_count_values.
If you want it sorted by the key, then use the TreeMultiset implementation.
If you want it sorted on count, then use Multisets.copyHighestCountFirst

Sorting Java TreeMap by value not working when more than two values have the same sort-property

I want to sort a Java TreeMap based on some attribute of its values. To be specific, I want to sort a TreeMap<Integer, HashSet<Integer>> based on the size of the HashSet<Integer>. To achieve this, I have done the following:
A Comparator class:
private static class ValueComparer implements Comparator<Integer> {
private Map<Integer, HashSet<Integer>> map = null;
public ValueComparer (Map<Integer, HashSet<Integer>> map){
super();
this.map = map;
}
@Override
public int compare(Integer o1, Integer o2) {
HashSet<Integer> h1 = map.get(o1);
HashSet<Integer> h2 = map.get(o2);
int compare = Integer.compare(h2.size(), h1.size());
if (compare == 0 && o1!=o2){
return -1;
}
else {
return compare;
}
}
}
A usage example:
TreeMap<Integer, HashSet<Integer>> originalMap = new TreeMap<Integer, HashSet<Integer>>();
//load keys and values into map
ValueComparer comp = new ValueComparer(originalMap);
TreeMap<Integer, HashSet<Integer>> sortedMap = new TreeMap<Integer, HashSet<Integer>>(comp);
sortedMap.putAll(originalMap);
The problem:
This doesn't work when originalMap contains more than 2 values of the same size. In other cases, it works all right. When more than two values in the map are of the same size, the third value in the new sorted map is null, and a NullPointerException is thrown when I try to access it.
I can't figure out what the problem is. It would be nice if someone could point it out.
Update:
Here's an example that works when two values have the same size: http://ideone.com/iFD9c
In the above example, if you uncomment lines 52-54, this code will fail- that's what my problem is.
Update: You cannot return -1 from ValueComparer just to prevent entries with equal sizes from being removed. Check the contract of Comparator.compare.
When you pass a Comparator to a TreeMap, it is used to compute the ("new") place to put each entry. No (computed) key can exist more than once in a TreeMap.
If you want to sort the originalMap by the size of the value, you can do as follows:
public static void main(String[] args) throws Exception {
TreeMap<Integer, HashSet<Integer>> originalMap =
new TreeMap<Integer, HashSet<Integer>>();
originalMap.put(0, new HashSet<Integer>() {{ add(6); add(7); }});
originalMap.put(1, new HashSet<Integer>() {{ add(6); }});
originalMap.put(2, new HashSet<Integer>() {{ add(9); add(8); }});
ArrayList<Map.Entry<Integer, HashSet<Integer>>> list =
new ArrayList<Map.Entry<Integer, HashSet<Integer>>>();
list.addAll(originalMap.entrySet());
Collections.sort(list, new Comparator<Map.Entry<Integer,HashSet<Integer>>>(){
public int compare(Map.Entry<Integer, HashSet<Integer>> o1,
Map.Entry<Integer, HashSet<Integer>> o2) {
Integer size1 = (Integer) o1.getValue().size();
Integer size2 = (Integer) o2.getValue().size();
return size2.compareTo(size1);
}
});
System.out.println(list);
}
Your comparator logic (I'm not sure I follow why you'd return -1 if the set sizes are equal but the keys are different) shouldn't affect what the Map itself returns when you call get(key).
Are you positive you aren't inserting null values into the initial map? What does this code look like?
Your comparator doesn't respect the Comparator contract: if compare(o1, o2) < 0, then compare(o2, o1) should be > 0. You must find a deterministic way of comparing your elements when both sizes are the same and the integers are not identical. You could perhaps use the System.identityHashCode() of the integers to compare them in this case.
That said, I really wonder what you could do with such a map: you can't create new Integers and use them to get a value out of the map, and you can't modify the sets that it holds.
Side note: your comparator code sample uses map and data to refer to the same map.
A TreeMap can be ordered only by its keys. There is no way of creating a TreeMap ordered by values, because you will get a StackOverflowError.
Think about it. To get an element from the tree, you need to perform comparisons, but to perform comparisons, you need to get elements.
You will have to sort it in another collection or, to use a tree, you will have to encapsulate the integer (from the entry value) in the entry key as well and define a comparator that uses that integer taken from the key.
Assuming you cannot use a comparator that returns 0 with a Set, this might work: Add all the elements in originalMap.entrySet() to an ArrayList and then sort the ArrayList using your ValueComparer, changing it to return 0 as necessary.
Then add all the entries in the sorted ArrayList to a LinkedHashMap.
I had a similar problem to the original poster's. I had a TreeMap I wanted to sort on a value, but when I made a comparator that looked at the value, I had issues because of the broken comparator contract that JB talked about. I was able to use my custom comparator and still observe the contract: when the values I was looking at were equal, I fell back to comparing the keys. I didn't care about the order if the values were equal.
public int compare(String a, String b) {
if(base.get(a)[0] == base.get(b)[0]){ //need to handle when they are equal
return a.compareTo(b);
}else if (base.get(a)[0] < base.get(b)[0]) {
return -1;
} else {
return 1;
} // returning 0 would merge keys
}
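The same fall-back-to-keys idea applied to the TreeMap<Integer, HashSet<Integer>> from the question, as a sketch with made-up class and method names. It only stays valid while the sets in the backing map do not change size after the sorted map is built, because the comparator looks the sizes up in that map.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;

public class SizeThenKeySketch {
    // Orders keys by the size of their set (largest first) and falls back to the key itself,
    // so compare() returns 0 only for the same key and no entry is silently merged.
    public static TreeMap<Integer, HashSet<Integer>> sortedBySetSize(
            final Map<Integer, HashSet<Integer>> original) {
        Comparator<Integer> bySizeThenKey = (k1, k2) -> {
            int bySize = Integer.compare(original.get(k2).size(), original.get(k1).size());
            return bySize != 0 ? bySize : k1.compareTo(k2);
        };
        TreeMap<Integer, HashSet<Integer>> sorted = new TreeMap<>(bySizeThenKey);
        sorted.putAll(original);
        return sorted;
    }

    public static void main(String[] args) {
        Map<Integer, HashSet<Integer>> original = new TreeMap<>();
        original.put(0, new HashSet<>(Arrays.asList(6, 7)));
        original.put(1, new HashSet<>(Arrays.asList(6)));
        original.put(2, new HashSet<>(Arrays.asList(9, 8)));
        original.put(3, new HashSet<>(Arrays.asList(1, 2))); // a third set of size 2
        TreeMap<Integer, HashSet<Integer>> sorted = sortedBySetSize(original);
        System.out.println(sorted.keySet()); // [0, 2, 3, 1]
        System.out.println(sorted.get(3) != null); // true
    }
}
```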

Automatically sorted by values map in Java

I need a map in Java that is automatically sorted by values, so that it stays sorted at all times while I add new key-value pairs, update the value of an existing key-value pair, or even delete an entry.
Please also have in mind that this map is going to be really big (100's of thousands, or even 10's of millions of entries in size).
So basically I'm looking for the following functionality:
Supposed that we had a class 'SortedByValuesMap' that implements the aforementioned functionality
and we have the following code:
SortedByValuesMap<String,Long> sorted_map = new SortedByValuesMap<String, Long>();
sorted_map.put("apples", 4);
sorted_map.put("oranges", 2);
sorted_map.put("bananas", 1);
sorted_map.put("lemons", 3);
sorted_map.put("bananas", 6);
for (String key : sorted_map.keySet()) {
System.out.println(key + ":" + sorted_map.get(key));
}
the output should be:
bananas:6
apples:4
lemons:3
oranges:2
In particular, what's really important for me, is to be able to get the entry with the
lowest value at any time - using a command like:
smallestItem = sorted_map.lastEntry();
which should give me the 'oranges' entry
EDIT: I am a Java newbie so please elaborate a bit in your answers - thanks
EDIT2: This might help: I am using this for counting words (for those who are familiar: n-grams in particular) in huge text files. So I need to build a map where keys are words and values are the frequencies of those words. However, due to limitations (like RAM), I want to keep only the X most frequent words - but you can't know beforehand which are going to be the most frequent words of course. So, the way I thought it might work (as an approximation) is to start counting words and when the map reaches a top-limit (like 1 mil entries) , the least frequent entry will be deleted so as to keep the map's size to 1 mil always.
Keep 2 data structures:
A dictionary of words -> count. Just use an ordinary HashMap<String, Long>.
An "array" to keep track of order, such that list[count] holds a Set<String> of words with that count.
I'm writing this as though it were an array as a notational convenience. In fact, you probably don't know an upper bound on the number of occurrences, so you need a resizable data structure. Implement using a Map<Long, Set<String>>. Or, if that uses too much memory, use an ArrayList<Set<String>> (you'll have to test for count == size() - 1, and if so, use add() instead of set(count + 1)).
To increment the number of occurrences for a word:
// assumes instance fields Map<String, Long> dict and Map<Long, Set<String>> arr (for example a TreeMap)
public void tally(final String word)
{
final long count = this.dict.containsKey(word) ? this.dict.get(word) : 0L;
this.dict.put(word, count + 1);
// move word up one place in arr
if (count > 0)
this.arr.get(count).remove(word); // This is why we use a Set: for fast deletion here.
if (!this.arr.containsKey(count + 1))
this.arr.put(count + 1, new HashSet<String>());
this.arr.get(count + 1).add(word);
}
To iterate over words in order (process is a placeholder for whatever you do with each word):
for (final Map.Entry<Long, Set<String>> entry : this.arr.entrySet())
for (final String word : entry.getValue())
process(word, entry.getKey());
How about using additional index or only TreeMap<Long, TreeSet<String>> or TreeMap<Long, String> if Long values are distinct?
You can also write a Heap.
Guava BiMap Solution:
//Prepare original data
BiMap<String, Integer> biMap = HashBiMap.create();
biMap.put("apples" , 4);
biMap.put("oranges", 2);
biMap.put("bananas", 1);
biMap.put("lemons" , 3);
biMap.put("bananas", 6);
//Create a desc order SortedMap
SortedMap<Integer, String> sortedMap = new TreeMap<Integer, String>(new Comparator<Integer>(){
@Override public int compare(Integer o1, Integer o2) {
return o2-o1;
}});
//Put inversed map
sortedMap.putAll(biMap.inverse());
for (Map.Entry<Integer, String> e: sortedMap.entrySet()) {
System.out.println(e);
}
System.out.println(sortedMap.lastKey());
Try the solution posted on http://paaloliver.wordpress.com/2006/01/24/sorting-maps-in-java/ . You have the flexibility of doing sorting ascending or descending too.
Here is what they say
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
public class MapValueSort {
/** inner class to do sorting of the map **/
private static class ValueComparer implements Comparator<String> {
private Map<String, String> _data = null;
public ValueComparer (Map<String, String> data){
super();
_data = data;
}
public int compare(String o1, String o2) {
String e1 = (String) _data.get(o1);
String e2 = (String) _data.get(o2);
return e1.compareTo(e2);
}
}
public static void main(String[] args){
Map<String, String> unsortedData = new HashMap<String, String>();
unsortedData.put("2", "DEF");
unsortedData.put("1", "ABC");
unsortedData.put("4", "ZXY");
unsortedData.put("3", "BCD");
SortedMap<String, String> sortedData = new TreeMap<String, String>(new MapValueSort.ValueComparer(unsortedData));
printMap(unsortedData);
sortedData.putAll(unsortedData);
System.out.println();
printMap(sortedData);
}
private static void printMap(Map<String, String> data) {
for (Iterator<String> iter = data.keySet().iterator(); iter.hasNext();) {
String key = (String) iter.next();
System.out.println("Value/key:"+data.get(key)+"/"+key);
}
}
}
Outputs
Value/key:BCD/3
Value/key:DEF/2
Value/key:ABC/1
Value/key:ZXY/4
Value/key:ABC/1
Value/key:BCD/3
Value/key:DEF/2
Value/key:ZXY/4
I found the need of a similar structure to keep a list of objects ordered by associated values. Based on the suggestion from Mechanical snail in this thread, I coded up a basic implementation of such a map. Feel free to use.
import java.util.*;
/**
* A map where {@link #keySet()} and {@link #entrySet()} return sets ordered
* with ascending associated values with respect to the comparator provided
* at construction. The order of two or more keys with identical values is not
* defined.
* <p>
* Several contracts of the Map interface are not satisfied by this minimal
* implementation.
*/
public class ValueSortedMap<K, V> extends HashMap<K, V> {
protected Map<V, Collection<K>> valueToKeysMap;
public ValueSortedMap() {
this((Comparator<? super V>) null);
}
public ValueSortedMap(Comparator<? super V> valueComparator) {
this.valueToKeysMap = new TreeMap<V, Collection<K>>(valueComparator);
}
public boolean containsValue(Object o) {
return valueToKeysMap.containsKey(o);
}
public V put(K k, V v) {
V oldV = null;
if (containsKey(k)) {
oldV = get(k);
valueToKeysMap.get(oldV).remove(k);
}
super.put(k, v);
if (!valueToKeysMap.containsKey(v)) {
Collection<K> keys = new ArrayList<K>();
keys.add(k);
valueToKeysMap.put(v, keys);
} else {
valueToKeysMap.get(v).add(k);
}
return oldV;
}
public void putAll(Map<? extends K, ? extends V> m) {
for (Map.Entry<? extends K, ? extends V> e : m.entrySet())
put(e.getKey(), e.getValue());
}
public V remove(Object k) {
V oldV = null;
if (containsKey(k)) {
oldV = get(k);
super.remove(k);
valueToKeysMap.get(oldV).remove(k);
}
return oldV;
}
public void clear() {
super.clear();
valueToKeysMap.clear();
}
public Set<K> keySet() {
LinkedHashSet<K> ret = new LinkedHashSet<K>(size());
for (V v : valueToKeysMap.keySet()) {
Collection<K> keys = valueToKeysMap.get(v);
ret.addAll(keys);
}
return ret;
}
public Set<Map.Entry<K, V>> entrySet() {
LinkedHashSet<Map.Entry<K, V>> ret = new LinkedHashSet<Map.Entry<K, V>>(size());
for (Collection<K> keys : valueToKeysMap.values()) {
for (final K k : keys) {
final V v = get(k);
ret.add(new Map.Entry<K,V>() {
public K getKey() {
return k;
}
public V getValue() {
return v;
}
public V setValue(V v) {
throw new UnsupportedOperationException();
}
});
}
}
return ret;
}
}
This implementation does not honor all the contracts of the Map interface such as reflecting value changes and removals in the returned key set and entry sets in the actual map, but such a solution would be a bit large to include in a forum like this. Perhaps I will work on one and make it available via github or something similar.
Update: You cannot sort maps by values, sorry.
You can use SortedMap implementation like TreeMap with Comparator defining order by values (instead of default - by keys).
Or, even better, you can put elements into a PriorityQueue with predefined comparator by values. It should be faster and take less memory compared to TreeMap.
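A sketch of the PriorityQueue idea, under the assumption that counting has already finished and only the k most frequent entries are wanted; TopKSketch and topK are invented names. A PriorityQueue does not reorder an element whose value changes after insertion, so this fits a final selection step rather than a continuously updated map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopKSketch {
    // Returns the k entries with the highest counts, highest first.
    // A min-heap of at most k entries keeps the smallest retained count at the head,
    // so each new entry only has to beat that one to stay in.
    public static List<Map.Entry<String, Long>> topK(Map<String, Long> counts, int k) {
        PriorityQueue<Map.Entry<String, Long>> heap =
                new PriorityQueue<>(Map.Entry.<String, Long>comparingByValue());
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            heap.offer(e);
            if (heap.size() > k) {
                heap.poll(); // drop the current smallest
            }
        }
        List<Map.Entry<String, Long>> result = new ArrayList<>(heap);
        result.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("apples", 4L);
        counts.put("oranges", 2L);
        counts.put("bananas", 6L);
        counts.put("lemons", 3L);
        System.out.println(topK(counts, 2)); // [bananas=6, apples=4]
    }
}
```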
You may refer to the implementation of java.util.LinkedHashMap.
The basic idea is to use an inner linked list to store the order. Here are some details:
Extend HashMap. In HashMap, each entry has a key and a value; that is the basis. You can add a next and a prev pointer to store the entries in order by value, and a header and a tail pointer to get the first and last entry. For every modification (add, remove, update), you can add your own code to change the list order. It is no more than a linear search and a pointer switch.
Sure, it will be slow for add/update if there are too many entries, because it is a linked list, not an array. But as long as the list is sorted, I believe there are lots of ways to speed up the search.
So here is what you get: a map that has the same speed as HashMap when retrieving an entry by key, and a linked list which stores the entries in order.
We can discuss this further if this solution meets your requirement.
to jtahlborn:
As I said, it surely is slow without any optimization. Since we are now talking about performance, not implementation, lots of things can be done.
One solution is to use a tree instead of a linked list, such as a red-black tree. Then iterate over the tree instead of iterating over the map.
As for the smallest value, that is easier. Just use a member variable to store the smallest; when adding or updating an element, update the smallest value. When deleting, search the tree for the smallest (this is very fast).
If a tree is too complex, it is also possible to use another list/array to mark some positions in the list, for example one every 100 elements. Then, when searching, search the position list first and then the real list. This list also needs to be maintained; it would be reasonable to rebuild the position list after a certain number of modifications, maybe 100.
if all you need is the "min" value, then just use a normal map and keep track of the "min" value anytime it is modified.
EDIT:
So, if you really need value ordering and you want to use out-of-the-box solutions, you basically need 2 collections: one normal map (e.g. HashMap) and one SortedSet (e.g. TreeSet). You can traverse the ordered elements via the TreeSet and find frequencies by key using the HashMap.
Obviously, you could always code up something yourself, sort of like a LinkedHashMap, where the elements are locatable by key and traversable in order, but that's pretty much going to be entirely custom code (I doubt anything that specific already exists, but I could be wrong).
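A minimal sketch of the two-collection idea, with hypothetical names throughout (FrequencyIndexSketch and its methods). The TreeSet orders words by their current count in the HashMap, so a word has to be taken out of the set before its count changes and put back afterwards; otherwise the set can no longer find it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

public class FrequencyIndexSketch {
    private final Map<String, Long> counts = new HashMap<>();
    // Words ordered by count (highest first), then alphabetically so equal counts stay distinct.
    private final TreeSet<String> ordered = new TreeSet<String>((a, b) -> {
        int byCount = Long.compare(counts.get(b), counts.get(a));
        return byCount != 0 ? byCount : a.compareTo(b);
    });

    // Removes the word from the set before its count changes, then re-inserts it.
    public void increment(String word) {
        Long old = counts.get(word);
        if (old != null) {
            ordered.remove(word);
        }
        counts.put(word, old == null ? 1L : old + 1);
        ordered.add(word);
    }

    // Removes from the set first, while the count needed by the comparator is still present.
    public void remove(String word) {
        if (counts.containsKey(word)) {
            ordered.remove(word);
            counts.remove(word);
        }
    }

    public String mostFrequent() {
        return ordered.isEmpty() ? null : ordered.first();
    }

    public String leastFrequent() {
        return ordered.isEmpty() ? null : ordered.last();
    }

    public long count(String word) {
        return counts.getOrDefault(word, 0L);
    }

    public static void main(String[] args) {
        FrequencyIndexSketch index = new FrequencyIndexSketch();
        for (String w : new String[] {"a", "b", "a", "c", "a", "b"}) {
            index.increment(w);
        }
        System.out.println(index.mostFrequent());  // a
        System.out.println(index.leastFrequent()); // c
        index.remove(index.leastFrequent());       // evict the least frequent word
        System.out.println(index.leastFrequent()); // b
    }
}
```

Each update costs O(log n) for the set plus a hash lookup, and evicting the least frequent word when a size limit is reached, as described in the question's second edit, is a leastFrequent() followed by remove().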
