Consider the following Java HashMap.
Map<String, String> unsortMap = new HashMap<String, String>();
unsortMap.put("Z", "z");
unsortMap.put("B", "b");
unsortMap.put("A", "a");
unsortMap.put("C", "c");
Now I wish to sort this Map by Key. One option is for me to use a TreeMap for this purpose.
Map<String, String> treeMap = new TreeMap<String, String>(unsortMap);
Another option is for me to use Java Streams with sorted(), as follows.
Map<String, String> sortedMap = new HashMap<>();
unsortMap.entrySet()
.stream()
.sorted(Map.Entry.comparingByKey())
.forEachOrdered(x -> sortedMap.put(x.getKey(), x.getValue()));
Out of these two, which option is preferred and why (maybe in terms of performance)?
Thank you
As pointed out by others, dumping the sorted stream of entries into a regular HashMap would do nothing... a LinkedHashMap is the logical choice.
However, an alternative to the approaches above is to make full use of the Stream Collectors API.
Collectors has a toMap method that allows you to provide an alternative implementation for the Map. So instead of a HashMap you can ask for a LinkedHashMap like so:
unsortMap.entrySet()
.stream()
.sorted(Map.Entry.comparingByKey())
.collect(Collectors.toMap(
Map.Entry::getKey,
Map.Entry::getValue,
(v1, v2) -> v1, // the merge function is never called, as keys are unique
LinkedHashMap::new
));
Between a TreeMap and a LinkedHashMap, the cost of construction is likely to be the same, something like O(n log n). Obviously the TreeMap solution is the better approach if you plan to keep adding elements to the map, though in that case you should probably have started with a TreeMap in the first place. The LinkedHashMap option has the advantage that lookup is O(1) on either the linked map or the original unsorted map, whereas a TreeMap's lookup is something like O(log n). So with a TreeMap you would need to keep the unsorted map around for efficient lookup, whereas once you have built the LinkedHashMap you can toss the original unsorted map (thus saving some memory).
To make the LinkedHashMap approach a bit more efficient, provide a good estimate of the required size at construction so that there is no need for dynamic resizing: instead of LinkedHashMap::new, use () -> new LinkedHashMap<>(unsortMap.size()).
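Putting both suggestions together, a rough sketch of the full collect with a pre-sized map (the load-factor note is just an aside, not part of the original suggestion):
unsortMap.entrySet()
         .stream()
         .sorted(Map.Entry.comparingByKey())
         .collect(Collectors.toMap(
                 Map.Entry::getKey,
                 Map.Entry::getValue,
                 (v1, v2) -> v1,  // never invoked; map keys are unique
                 // to rule out resizing entirely you would also account for the
                 // default load factor of 0.75, e.g. (int) (unsortMap.size() / 0.75f) + 1
                 () -> new LinkedHashMap<>(unsortMap.size())));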
In my opinion the TreeMap approach is neater, as it keeps the code smaller, so unless there is an actual performance issue that the unsorted-plus-sorted-linked-map approach would address, I would use the TreeMap.
Your stream code won't even sort the map, because it is performing the operation against a HashMap, which is inherently unsorted. To make your second stream example work, you may use LinkedHashMap, which maintains insertion order:
Map<String, String> sortedMap = new LinkedHashMap<>();
unsortMap.entrySet()
.stream()
.sorted(Map.Entry.comparingByKey())
.forEachOrdered(x -> sortedMap.put(x.getKey(), x.getValue()));
But now your two examples are not even the same underlying data structure. A TreeMap is backed by a tree (red-black, if I recall correctly). You would use a TreeMap if you wanted to iterate in a sorted way or search quickly for a key. A LinkedHashMap is a hash map with a linked list running through it. You would use this if you needed to maintain insertion order, for example when implementing a queue.
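A tiny sketch to illustrate the difference, reusing the entries from the question:
Map<String, String> tree = new TreeMap<>();
tree.put("Z", "z");
tree.put("A", "a");
System.out.println(tree.keySet());   // [A, Z] -- always iterates in key order

Map<String, String> linked = new LinkedHashMap<>();
linked.put("Z", "z");
linked.put("A", "a");
System.out.println(linked.keySet()); // [Z, A] -- iterates in insertion order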
The second way does not work because HashMap#put does not preserve insertion order. You need a LinkedHashMap instead.
TreeMap vs. Stream (LinkedHashMap):
Code style. Using a TreeMap is cleaner since you can achieve it in one line.
Space complexity. If the original map is a HashMap, both methods require creating a new Map. If the original map is a LinkedHashMap, then you only need to create a new Map with the first approach; with the second approach you can reuse the LinkedHashMap.
Time complexity. Both should be O(n log n).
Related
I have a stream that processes some strings and collects them in a map.
But I'm getting the following exception:
java.lang.IllegalStateException:
Duplicate key test#yahoo.com
(attempted merging values [test#yahoo.com] and [test#yahoo.com])
at java.base/java.util.stream.Collectors.duplicateKeyException(Collectors.java:133)
I'm using the following code:
Map<String, List<String>> map = emails.stream()
.collect(Collectors.toMap(
Function.identity(),
email -> processEmails(email)
));
The flavor of toMap() you're using in your code (which expects only a keyMapper and a valueMapper) disallows duplicates simply because it's not capable of handling them, and the exception message tells you that explicitly.
Judging by the resulting type Map<String, List<String>> and by the exception message, which shows strings enclosed in square brackets, we can conclude that processEmails(email) produces a List<String> (although it's not obvious from your description and IMO worth specifying).
There are multiple ways to solve this problem, you can either:
Use the other version, toMap(keyMapper, valueMapper, mergeFunction), which requires a third argument, mergeFunction: a function responsible for resolving duplicates.
Map<String, List<String>> map = emails.stream()
.collect(Collectors.toMap(
Function.identity(),
email -> processEmails(email),
(list1, list2) -> list1 // or { list1.addAll(list2); return list1; } depending on how you want to resolve duplicates
));
Make use of the collector groupingBy(classifier, downstream) to preserve all the emails retrieved by processEmails() that are associated with the same key, storing them in a List. As the downstream collector we can use a combination of flatMapping() and toList().
Map<String, List<String>> map = emails.stream()
.collect(Collectors.groupingBy(
Function.identity(),
Collectors.flatMapping(email -> processEmails(email).stream(),
Collectors.toList())
));
Note that the latter option would make sense only if processEmails() somehow generates different results for the same key; otherwise you would end up with a list of repeated values, which doesn't seem useful.
But what you definitely shouldn't do in this case is use distinct(). It will unnecessarily increase memory consumption, because it eliminates the duplicates by maintaining a LinkedHashSet under the hood. That would be wasteful because you're already using a Map, which is capable of dealing with duplicate keys.
You have duplicate emails. The toMap version you're using explicitly doesn't allow duplicate keys. Use the toMap that takes a merge function. How to merge those processEmails results depends on your business logic.
Alternatively, use distinct() before collecting, because otherwise you'll probably end up sending some people multiple emails.
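A rough sketch of that alternative, assuming emails is a collection of strings as in the question and processEmails returns a List<String>:
Map<String, List<String>> map = emails.stream()
        .distinct()                      // drop repeated addresses before collecting
        .collect(Collectors.toMap(
                Function.identity(),
                email -> processEmails(email)));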
Try using
Collectors.toMap(Function keyFunction, Function valueFunction, BinaryOperator mergeFunction)
You obviously have to write your own merge logic; a simple mergeFunction could be
(x1, x2) -> x1
I have a HashMap with Document as Key and a Double Value as Value.
My aim is to sort the HashMap by descending value. The .reversed() should come after comparingByValue(), but it conflicts with sorted(). How do I solve this?
HashMap<Document, Double> sortedMap = Map.entrySet().stream().sorted(Entry.comparingByValue())
.collect(Collectors.toMap(Entry::getKey, Entry::getValue, (e1, e2) -> e1, LinkedHashMap::new));
HashMap is fundamentally unsorted. TreeMap is fundamentally sorted on key. In other words, you can't do this; that's just not how it works. You have two options:
Figure out a way to sort on keys
And then just use a TreeMap with a custom comparator.
Copy the whole thing
Make a tuple class, then copy the entire thing into an ArrayList by turning each k/v pair into a tuple, then sort that list on value as you want, then move the entire thing into a LinkedHashMap, which preserves insertion order (see the stream-based sketch after this list). You cannot modify the map afterwards without going through this entire routine again.
Rethink your architecture
If neither is acceptable you'll have to go back to the drawing board.
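A rough sketch of the copy-into-a-LinkedHashMap route using streams; here scores stands in for the original Map<Document, Double>, whose name the question's snippet doesn't show. The explicit type witness is what typically resolves the .reversed() conflict mentioned in the question (passing Comparator.reverseOrder() to comparingByValue works as well):
LinkedHashMap<Document, Double> sortedByValueDesc = scores.entrySet().stream()
        .sorted(Map.Entry.<Document, Double>comparingByValue().reversed())
        .collect(Collectors.toMap(
                Map.Entry::getKey,
                Map.Entry::getValue,
                (e1, e2) -> e1,          // keys are already unique, never merged
                LinkedHashMap::new));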
I have a Map and want to sort the values by a Comparator, putting the result into a LinkedHashMap.
Map<String, User> sorted = test.getUsers()
.entrySet()
.stream()
.sorted(Map.Entry.comparingByValue(SORT_BY_NAME))
.collect(LinkedHashMap::new, (map, entry) -> map.put(entry.getKey(), entry.getValue()),
LinkedHashMap::putAll);
test.setUsers(sorted);
It all works; however, I wonder if this can be simplified.
Currently I create a new Map and pass that new Map to setUsers(). Can I sort the map directly, without creating a new LinkedHashMap?
With Collections.sort(list, comparator), the list is sorted in place. However, Collections.sort does not work with Maps.
Why not use Collectors.toMap()?
.collect(Collectors.toMap(
Entry::getKey,
Entry::getValue,
(a, b) -> { throw new AssertionError("impossible"); },
LinkedHashMap::new));
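Plugged into the original code, the whole thing would look roughly like this:
Map<String, User> sorted = test.getUsers()
        .entrySet()
        .stream()
        .sorted(Map.Entry.comparingByValue(SORT_BY_NAME))
        .collect(Collectors.toMap(
                Map.Entry::getKey,
                Map.Entry::getValue,
                (a, b) -> { throw new AssertionError("impossible"); },  // keys of a map are unique
                LinkedHashMap::new));
test.setUsers(sorted);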
You cannot perform in-place sorting on a collection that does not support array-style indexing, so for maps it is out of the question (unless you use something like selection sort on a LinkedHashMap, but that would be a bad idea). Creating a new map is unavoidable. You can still simplify the code, though, by following shmosel's answer.
I am new to Java 8 and want to sort a Map by key and then sort each list within the values.
I tried to look for a Java 8 way to sort the keys and also the values.
HashMap<Integer, List<Integer>> map;
map.entrySet().stream().sorted(Map.Entry.comparingByKey())
.collect(Collectors.toMap(Map.Entry::getKey,
Map.Entry::getValue, (e1, e2) -> e2, LinkedHashMap::new));
I am able to sort the Map, and I can separately collect and sort each list value within the map, but is there a Java 8 way in which both can be combined?
To sort by key, you could use a TreeMap. To sort each list in the values, you could iterate over the values of the map using Map.values() and forEach(), and then sort each list using List.sort. Putting it all together:
Map<Integer, List<Integer>> sortedByKey = new TreeMap<>(yourMap);
sortedByKey.values().forEach(list -> list.sort(null)); // null: natural order
This sorts each list in-place, meaning that the original lists are mutated.
If instead you want to create not only a new map, but also new lists for each value, you could do it as follows:
Map<Integer, List<Integer>> sortedByKey = new TreeMap<>(yourMap);
sortedByKey.replaceAll((k, originalList) -> {
List<Integer> newList = new ArrayList<>(originalList);
newList.sort(null); // null: natural order
return newList;
});
EDIT:
As suggested in the comments, you might want to change:
sortedByKey.values().forEach(list -> list.sort(null));
By either:
sortedByKey.values().forEach(Collections::sort);
Or:
sortedByKey.values().forEach(list -> list.sort(Comparator.naturalOrder()));
Either one of the two options above is much more expressive and shows the developer's intention in a better way than using null as the comparator argument to the List.sort method.
Same considerations apply for the approach in which the lists are not modified in-place.
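If you really want the two steps combined into a single Java 8 pipeline, a sketch along these lines should also work (it builds new sorted lists rather than mutating the originals; map is the HashMap from the question):
Map<Integer, List<Integer>> result = map.entrySet().stream()
        .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().stream()
                        .sorted()                 // natural order
                        .collect(Collectors.toList()),
                (a, b) -> a,                      // never merged; keys of a map are unique
                TreeMap::new));                   // TreeMap keeps the keys sorted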
I'm creating a new Map and pushing strings into it (no big deal), but I've noticed that the strings are being re-ordered as the map grows. Is it possible to stop this re-ordering so the items in the map retain the order they were put in with?
Map<String,String> x = new HashMap<String, String>();
x.put("a","b");
x.put("a","c");
x.put("a","d");
x.put("1","2");
x.put("1","3");
x.put("1","4");
//this shows them out of order sadly...
for (Map.Entry<String, String> entry : x.entrySet()) {
System.out.println("IN THIS ORDER ... " + entry.getValue());
}
If you care about order, you can use a SortedMap. The actual class which implements the interface (at least for most scenarios) is a TreeMap. Alternatively, LinkedHashMap also maintains its order, while still utilizing a hashtable-based container.
You can keep it with LinkedHashMap.
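For example, the snippet from the question only needs its declaration changed (note that re-putting an existing key such as "a" still replaces the value rather than adding another entry):
Map<String, String> x = new LinkedHashMap<>();  // preserves insertion order
x.put("a", "b");
x.put("a", "c");
x.put("a", "d");
x.put("1", "2");
x.put("1", "3");
x.put("1", "4");
for (Map.Entry<String, String> entry : x.entrySet()) {
    System.out.println("IN THIS ORDER ... " + entry.getValue());  // prints d, then 4
}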
A HashMap in Java is not sorted (http://download.oracle.com/javase/1.5.0/docs/api/java/util/HashMap.html). If you want predictable iteration order, use a LinkedHashMap instead: http://download.oracle.com/javase/1.4.2/docs/api/java/util/LinkedHashMap.html
Heres a good discussion on the difference: How is the implementation of LinkedHashMap different from HashMap?
The previous answers are correct in that you should use an implementation of Map that maintains ordering; LinkedHashMap and SortedMap both do this.
However, the takeaway point is that not all collections maintain order, and if order is important to you, you should choose the appropriate implementation. Generic HashMaps do not maintain order, do not claim to do so, and cannot be configured to do so.