How to get the key in Collectors.toMap merge function?

When a duplicate key entry is found during Collectors.toMap(), the merge function (o1, o2) is called.
Question: how can I get the key that caused the duplication?
String keyvalp = "test=one\ntest2=two\ntest2=three";
Pattern.compile("\n")
.splitAsStream(keyvalp)
.map(entry -> entry.split("="))
.collect(Collectors.toMap(
split -> split[0],
split -> split[1],
(o1, o2) -> {
//TODO how to access the key that caused the duplicate? o1 and o2 are the values only
//split[0]; //which is the key, cannot be accessed here
},
HashMap::new));
Inside the merge function I want to decide, based on the key, whether to cancel the mapping or to continue and take one of those values.

You need to use a custom collector or use a different approach.
Map<String, String> map = new HashMap<>();
Pattern.compile("\n")
.splitAsStream(keyvalp)
.map(entry -> entry.split("="))
.forEach(arr -> map.merge(arr[0], arr[1], (o1, o2) -> /* decide using arr[0] */ o1));
Writing a custom collector is rather more complicated. You would need a TriConsumer (key and two values) or similar, which is not in the JDK, which is why I am pretty sure there is no built-in function that uses one. ;)
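As a minimal sketch of the forEach approach above, the following self-contained class keeps the first value for keys that start with "keep" and takes the latest value for every other key. The class name, the parse method and the "keep" rule are illustrative assumptions, not part of the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

class MergeWithKeyExample {
    // Hypothetical rule: keys starting with "keep" retain their first value,
    // all other keys take the most recent value.
    static Map<String, String> parse(String keyvalp) {
        Map<String, String> map = new HashMap<>();
        Pattern.compile("\n")
               .splitAsStream(keyvalp)
               .map(entry -> entry.split("="))
               // arr[0] (the key) is in scope inside the merge lambda
               .forEach(arr -> map.merge(arr[0], arr[1],
                       (oldValue, newValue) -> arr[0].startsWith("keep") ? oldValue : newValue));
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> result = parse("keepA=one\nkeepA=two\nb=three\nb=four");
        System.out.println(result.get("keepA")); // one
        System.out.println(result.get("b"));     // four
    }
}
```

Because the map is filled from forEach rather than from a collector, this sketch is only suitable for sequential streams.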

The merge function has no chance to get the key, which is the same issue the built-in function has when you omit the merge function.
The solution is to use a different toMap implementation, which does not rely on Map.merge:
public static <T, K, V> Collector<T, ?, Map<K,V>>
toMap(Function<? super T, ? extends K> keyMapper,
Function<? super T, ? extends V> valueMapper) {
return Collector.of(HashMap::new,
(m, t) -> {
K k = keyMapper.apply(t);
V v = Objects.requireNonNull(valueMapper.apply(t));
if(m.putIfAbsent(k, v) != null) throw duplicateKey(k, m.get(k), v);
},
(m1, m2) -> {
m2.forEach((k,v) -> {
if(m1.putIfAbsent(k, v)!=null) throw duplicateKey(k, m1.get(k), v);
});
return m1;
});
}
private static IllegalStateException duplicateKey(Object k, Object v1, Object v2) {
return new IllegalStateException("Duplicate key "+k+" (values "+v1+" and "+v2+')');
}
(This is basically what Java 9’s implementation of toMap without a merge function will do)
So all you need to do in your code is redirect the toMap call and omit the merge function:
String keyvalp = "test=one\ntest2=two\ntest2=three";
Map<String, String> map = Pattern.compile("\n")
.splitAsStream(keyvalp)
.map(entry -> entry.split("="))
.collect(toMap(split -> split[0], split -> split[1]));
(or ContainingClass.toMap if it is neither in the same class nor statically imported)
The collector supports parallel processing like the original toMap collector, though it’s not very likely to get a benefit from parallel processing here, even with more elements to process.
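To see what the exception looks like, here is a self-contained sketch that repeats the collector above and captures the message for the sample input. The class name and the describeFailure helper are hypothetical, added only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collector;

class DuplicateKeyReport {
    // Same idea as the collector above: putIfAbsent instead of Map.merge,
    // so the key is still at hand when a duplicate is detected.
    static <T, K, V> Collector<T, ?, Map<K, V>> toMap(
            Function<? super T, ? extends K> keyMapper,
            Function<? super T, ? extends V> valueMapper) {
        return Collector.of(HashMap::new,
            (m, t) -> {
                K k = keyMapper.apply(t);
                V v = Objects.requireNonNull(valueMapper.apply(t));
                if (m.putIfAbsent(k, v) != null) throw duplicateKey(k, m.get(k), v);
            },
            (m1, m2) -> {
                m2.forEach((k, v) -> {
                    if (m1.putIfAbsent(k, v) != null) throw duplicateKey(k, m1.get(k), v);
                });
                return m1;
            });
    }

    private static IllegalStateException duplicateKey(Object k, Object v1, Object v2) {
        return new IllegalStateException("Duplicate key " + k + " (values " + v1 + " and " + v2 + ')');
    }

    // Hypothetical helper: returns the exception message, or null when all keys are unique.
    static String describeFailure(String keyvalp) {
        try {
            Map<String, String> map = Pattern.compile("\n")
                    .splitAsStream(keyvalp)
                    .map(entry -> entry.split("="))
                    .collect(toMap(split -> split[0], split -> split[1]));
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // prints: Duplicate key test2 (values two and three)
        System.out.println(describeFailure("test=one\ntest2=two\ntest2=three"));
    }
}
```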
If I understand you correctly and you only want to pick either the older or the newer value in the merge function, based on the actual key, you could do it with a key Predicate like this:
public static <T, K, V> Collector<T, ?, Map<K,V>>
toMap(Function<? super T, ? extends K> keyMapper,
Function<? super T, ? extends V> valueMapper,
Predicate<? super K> useOlder) {
return Collector.of(HashMap::new,
(m, t) -> {
K k = keyMapper.apply(t);
m.merge(k, valueMapper.apply(t), (a,b) -> useOlder.test(k)? a: b);
},
(m1, m2) -> {
m2.forEach((k,v) -> m1.merge(k, v, (a,b) -> useOlder.test(k)? a: b));
return m1;
});
}
Map<String, String> map = Pattern.compile("\n")
.splitAsStream(keyvalp)
.map(entry -> entry.split("="))
.collect(toMap(split -> split[0], split -> split[1], key -> condition));
There are several ways to customize this collector…
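One possible customization, sketched here under assumptions, is to replace the Predicate with a three-argument merge function that receives the key together with both values. The KeyedMerger interface and the class name are hypothetical; nothing like them exists in the JDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Stream;

class KeyAwareToMap {
    /** Hypothetical three-argument merge function: key, existing value, incoming value. */
    @FunctionalInterface
    interface KeyedMerger<K, V> {
        V merge(K key, V oldValue, V newValue);
    }

    static <T, K, V> Collector<T, ?, Map<K, V>> toMap(
            Function<? super T, ? extends K> keyMapper,
            Function<? super T, ? extends V> valueMapper,
            KeyedMerger<K, V> merger) {
        return Collector.of(HashMap::new,
            (m, t) -> {
                K k = keyMapper.apply(t);
                // the key is captured by the lambda handed to Map.merge
                m.merge(k, valueMapper.apply(t), (a, b) -> merger.merge(k, a, b));
            },
            (m1, m2) -> {
                m2.forEach((k, v) -> m1.merge(k, v, (a, b) -> merger.merge(k, a, b)));
                return m1;
            });
    }

    static Map<String, String> demo() {
        return Stream.of("a=1", "a=2", "b=3", "b=4")
                .map(s -> s.split("="))
                .collect(toMap(p -> p[0], p -> p[1],
                        // "a" keeps its first value, every other key takes the latest
                        (key, oldValue, newValue) -> key.equals("a") ? oldValue : newValue));
    }

    public static void main(String[] args) {
        Map<String, String> result = demo();
        System.out.println(result.get("a")); // 1
        System.out.println(result.get("b")); // 4
    }
}
```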

There is, of course, a simple and trivial trick: saving the key in the 'key mapper' function and reading it back in the 'merge' function. So the code may look like the following (assuming the key is an Integer):
final AtomicInteger key = new AtomicInteger();
...collect( Collectors.toMap(
item -> { key.set(item.getKey()); return item.getKey(); }, // key mapper
item -> ..., // value mapper
(v1, v2) -> { log(key.get(), v1, v2); return v1; } // merge function
));
Note: this is not good for parallel processing.
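A runnable variant of this trick with String keys might look like the sketch below. It assumes a sequential stream and relies on the toMap accumulator calling the key mapper immediately before the merge function for the same element, which is how the OpenJDK implementation behaves but is not a documented guarantee. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LastKeyTrick {
    // Returns the keys for which the merge function was invoked, in encounter order.
    static List<String> duplicateKeys(String... keyValuePairs) {
        AtomicReference<String> lastKey = new AtomicReference<>();
        List<String> duplicates = new ArrayList<>();
        Map<String, String> ignored = Stream.of(keyValuePairs)          // sequential on purpose
                .map(pair -> pair.split("="))
                .collect(Collectors.toMap(
                        split -> { lastKey.set(split[0]); return split[0]; }, // remember the key
                        split -> split[1],
                        (v1, v2) -> { duplicates.add(lastKey.get()); return v1; }));
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(duplicateKeys("a=1", "b=2", "a=3", "b=4", "c=5")); // [a, b]
    }
}
```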

Related

Rewrite SortedMap with streams Java

I have this code:
SortedMap<String, Double> starsPerActivity = new TreeMap<>();
for(Product p : products.values()) {
for(Rating r : ratings) {
if(r.getProductName() == p.getName()) {
starsPerActivity.put(p.getActivityName(), this.getStarsOfProduct(p.getName()));
}
}
}
return starsPerActivity;
And I want to rewrite this piece of code with streams.
I tried, but I don't know how.
The method starsPerActivity() returns a map that associates a name of the activity to the average number of stars for the products belonging to that activity, with the activity names sorted alphabetically. Activities whose products have not been rated should not appear in the result.

You can use the third Collectors.toMap overload:
public static <T, K, U, M extends Map<K, U>>
Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper,
Function<? super T, ? extends U> valueMapper,
BinaryOperator<U> mergeFunction,
Supplier<M> mapFactory)
Which allows you to define what map implementation you'd like to use:
SortedMap<String, Double> starsPerActivity = products.values().stream()
.filter(p -> ratings.stream()
.anyMatch(r -> r.getProductName().equals(p.getName())))
.collect(Collectors.toMap(
Product::getActivityName,
p -> getStarsOfProduct(p.getName()),
Double::max, TreeMap::new
));
Also I noticed that you used r.getProductName() == p.getName() in your if statement, which is probably applied to Strings. This is discouraged, see: How do I compare strings in Java
Note that this implementation picks the rating with the highest value. You can change this behaviour by replacing Double::max with whatever logic you like. If you want the same behaviour you currently have, you could use (a, b) -> b, which is close enough: it simply keeps the value that was put last.
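The effect of the merge function can be checked with a small self-contained sketch. The Rated class below is a hypothetical stand-in for "a product together with its star value", since Product and Rating are not shown in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class SortedMergeExample {
    // Hypothetical stand-in for a rated product: an activity name and a star value.
    static final class Rated {
        final String activity;
        final Double stars;
        Rated(String activity, Double stars) { this.activity = activity; this.stars = stars; }
    }

    static SortedMap<String, Double> starsPerActivity(List<Rated> rated,
                                                      BinaryOperator<Double> merge) {
        // TreeMap::new selects the map implementation, merge resolves duplicate activities
        return rated.stream().collect(Collectors.toMap(
                r -> r.activity, r -> r.stars, merge, TreeMap::new));
    }

    static List<Rated> sample() {
        return Arrays.asList(new Rated("swim", 3.0), new Rated("run", 4.0),
                             new Rated("swim", 5.0), new Rated("run", 2.0));
    }

    public static void main(String[] args) {
        System.out.println(starsPerActivity(sample(), Double::max));   // {run=4.0, swim=5.0}
        System.out.println(starsPerActivity(sample(), (a, b) -> b));   // {run=2.0, swim=5.0}
    }
}
```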

SortedMap<String, Double> starsPerActivity = new TreeMap<>();
products.values().forEach(p -> ratings.stream().filter(r -> r.getProductName().equals(p.getName()))
.forEachOrdered(r -> starsPerActivity.put(p.getActivityName(), this.getStarsOfProduct(p.getName()))));

In Java8, when will the merge function be triggered in Collectors.toMap?

I want to transform a HashMap<String, Long> into a TreeMap in order to sort its keys by String.length() (I can't simply use TreeMap.putAll because I have a lot of other logic on insert, and I want to use Java 8).
The code is below. When keys with the same length exist in the initial HashMap, it triggers the merge function, which throws an exception (I throw on purpose, because there will never be two equal strings in my case). I wonder why the merge function is triggered, since the Javadoc of toMap() says: "If the mapped keys contains duplicates (according to Object#equals(Object)), the value mapping function is applied to each equal element, and results are merged using the provided merging function." I think that in my code the "mapped keys" should be the HashMap entries mapped by Entry::getKey, not string.length() from the TreeMap comparator, i.e. "abc" != "def". So it shouldn't trigger the merge. But it does. What the hell?
public class TestToMap {
public static Map<String, Long> map1 = new HashMap<String, Long>() {
{
put("abc", 123L);
put("def", 456L);
}
};
public static void main(String[] args) {
Map<String, Long> priceThresholdMap = map1.entrySet().stream()
.collect(Collectors.toMap(Entry::getKey,
Entry::getValue,
throwingMerger(),
() -> new TreeMap<String, Long>(
(a, b) -> {
return a.length() - b.length();
}))); // this will trigger merge function, why?
//() -> new TreeMap<String, Long>(Comparator.comparingInt(String::length).thenComparing(String::compareTo)))); // but this won't trigger merge function
}
private static <T> BinaryOperator<T> throwingMerger() {
return (u, v) -> {
throw new IllegalStateException(String.format("priceThresholdMap has duplicate v1 %s,v2 %s", u, v));
};
}
}

Of course it should trigger a merge. The merge function is used to merge values having identical keys in the output Map, which in your case is a TreeMap.
In a TreeMap, keys are identical if the Comparator's compare method returns 0, so two keys having the same length are deemed identical, and their corresponding values should be merged.
Note that your Comparator causes the output TreeMap not to implement the Map interface correctly, since the ordering it defines is not consistent with equals():
Note that the ordering maintained by a tree map, like any sorted map, and
whether or not an explicit comparator is provided, must be consistent
with equals if this sorted map is to correctly implement the
Map interface
(From TreeMap Javadoc)
If you want to sort the Strings by length, you can still be consistent with equals:
Instead of
return a.length() - b.length()
use
return a.length() == b.length() ? a.compareTo(b) : Integer.compare(a.length(),b.length())
Now unequal Strings having the same length will be ordered lexicographically, while Strings having different lengths will be ordered by length.
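The difference between the two comparators can be observed directly on a TreeMap, without any stream involved. This is a small sketch with made-up class and method names; it uses thenComparing with the natural order, which yields the same ordering as the conditional expression above.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

class ComparatorConsistency {
    // Inconsistent with equals: any two strings of equal length compare as 0.
    static Comparator<String> byLengthOnly() {
        return Comparator.comparingInt(String::length);
    }

    // Consistent with equals: ties on length are broken lexicographically.
    static Comparator<String> lengthThenNatural() {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        return byLength.thenComparing(Comparator.naturalOrder());
    }

    static Map<String, Long> fill(Comparator<String> comparator) {
        Map<String, Long> map = new TreeMap<>(comparator);
        map.put("abc", 123L);
        map.put("def", 456L); // treated as the same key as "abc" when only length is compared
        return map;
    }

    public static void main(String[] args) {
        System.out.println(fill(byLengthOnly()));      // {abc=456}
        System.out.println(fill(lengthThenNatural())); // {abc=123, def=456}
    }
}
```

With the length-only comparator the second put overwrites the value stored under "abc", which is exactly the situation in which toMap calls the merge function.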

According to the toMap() source code, it creates an accumulator which folds each element from the source stream into the map.
Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper,
Function<? super T, ? extends U> valueMapper,
BinaryOperator<U> mergeFunction,
Supplier<M> mapSupplier) {
BiConsumer<M, T> accumulator
= (map, element) -> map.merge(keyMapper.apply(element),
valueMapper.apply(element), mergeFunction);
return new CollectorImpl<>(mapSupplier, accumulator, mapMerger(mergeFunction), CH_ID);
}
And in Map.merge(), get("def") returns the existing oldValue=123, whose key is "abc", because under the comparator I gave to the TreeMap "def" is equal to "abc". Then oldValue != null, so the merge function is called.
default V merge(K key, V value,
BiFunction<? super V, ? super V, ? extends V> remappingFunction) {
Objects.requireNonNull(remappingFunction);
Objects.requireNonNull(value);
V oldValue = get(key);
V newValue = (oldValue == null) ? value :
remappingFunction.apply(oldValue, value); // call the merge function
if(newValue == null) {
remove(key);
} else {
put(key, newValue);
}
return newValue;
}
ref:Collectors toMap duplicate key

How to apply in Function<T, R> argument from List (not List)

I have a method which returns a map with the company as key and a list of employees as values:
<T> Map<String, List<T>> getUserPerCompany(final Function<User, T> converter).
The method accepts a converter parameter, which in the tests returns a String (name + last name of the employee). It should return Map<String, List<String>>. I created this implementation:
return getUserStream().collect(toMap(Company::getName, c -> converter.apply(c.getUsers())));
Error is:
apply (domain.User) in Function cannot be applied to (java.util.List<domain.User>)
My problem is that I do not know how to pass each employee to 'apply' instead of the whole list.
My other attempts:
return getUserStream().collect(toMap(Company::getName, c -> converter.apply((User) c.getUsers().listIterator())));
return getUserStream().collect(toMap(Company::getName, c -> converter.apply((User) c.getUsers().subList(0, c.getUsers().size()))));
return getUserStream().collect(toMap(Company::getName, c -> converter.apply((User) c.getUsers().iterator())));

I suppose this is what you're looking for
<T> Map<String, List<T>> getUserPerCompany(final Function<User, T> converter) {
return getUserStream().collect(Collectors.toMap(
c -> c.getName(),
c -> c.getUsers()
.stream()
.map(converter)
.collect(Collectors.toList())
));
}
Usage example is
final Map<String, List<String>> users = getUserPerCompany(user -> user.getName() + " " + user.getSurname());
Basically you need to map each User, applying the input Function.
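For readers who want to run this, here is a self-contained sketch. The User and Company classes are hypothetical minimal versions, since the real domain classes are not shown in the question, and the companies are passed in as a list instead of coming from getUserStream().

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class UsersPerCompanyExample {
    // Hypothetical minimal domain classes; the real ones are not shown in the question.
    static final class User {
        final String name, surname;
        User(String name, String surname) { this.name = name; this.surname = surname; }
    }
    static final class Company {
        final String name;
        final List<User> users;
        Company(String name, List<User> users) { this.name = name; this.users = users; }
    }

    static <T> Map<String, List<T>> usersPerCompany(List<Company> companies,
                                                    Function<User, T> converter) {
        return companies.stream().collect(Collectors.toMap(
                company -> company.name,
                // apply the converter to each user, not to the list as a whole
                company -> company.users.stream().map(converter).collect(Collectors.toList())));
    }

    static List<Company> sample() {
        return Arrays.asList(
                new Company("Acme", Arrays.asList(new User("Ann", "Lee"), new User("Bob", "Ray"))),
                new Company("Initech", Arrays.asList(new User("Cy", "Poe"))));
    }

    public static void main(String[] args) {
        Map<String, List<String>> names =
                usersPerCompany(sample(), u -> u.name + " " + u.surname);
        System.out.println(names.get("Acme"));    // [Ann Lee, Bob Ray]
        System.out.println(names.get("Initech")); // [Cy Poe]
    }
}
```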

You can use Collectors.groupingBy() and write a custom Collector:
<T> Map<String, List<T>> getUserPerCompany(final Function<User, T> converter) {
return getUserStream().collect(
Collectors.groupingBy(
c -> c.getName(),
Collector.of(
ArrayList::new, //init accumulator
(list, c)-> c.getUsers() //processing each element
.stream()
.map(converter)
.forEach(list::add),
(result1, result2) -> { //combiner: merges two partial lists
result1.addAll(result2); //only needed in parallel execution
return result1;
}
)
)
);
}

How to preserve order in a Map created from Month.values() with java 8 streams [duplicate]

I am creating a Map from a List as follows:
List<String> strings = Arrays.asList("a", "bb", "ccc");
Map<String, Integer> map = strings.stream()
.collect(Collectors.toMap(Function.identity(), String::length));
I want to keep the same iteration order as was in the List. How can I create a LinkedHashMap using the Collectors.toMap() methods?

The 2-parameter version of Collectors.toMap() uses a HashMap:
public static <T, K, U> Collector<T, ?, Map<K,U>> toMap(
Function<? super T, ? extends K> keyMapper,
Function<? super T, ? extends U> valueMapper)
{
return toMap(keyMapper, valueMapper, throwingMerger(), HashMap::new);
}
To use the 4-parameter version, you can replace:
Collectors.toMap(Function.identity(), String::length)
with:
Collectors.toMap(
Function.identity(),
String::length,
(u, v) -> {
throw new IllegalStateException(String.format("Duplicate key %s", u));
},
LinkedHashMap::new
)
Or to make it a bit cleaner, write a new toLinkedMap() method and use that:
public class MoreCollectors
{
public static <T, K, U> Collector<T, ?, Map<K,U>> toLinkedMap(
Function<? super T, ? extends K> keyMapper,
Function<? super T, ? extends U> valueMapper)
{
return Collectors.toMap(
keyMapper,
valueMapper,
(u, v) -> {
throw new IllegalStateException(String.format("Duplicate key %s", u));
},
LinkedHashMap::new
);
}
}
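A quick way to confirm that the 4-parameter version keeps the encounter order is the sketch below. The class and method names are illustrative; the list is deliberately not sorted so that the preserved order is visible.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class LinkedToMapExample {
    static Map<String, Integer> lengths(List<String> strings) {
        return strings.stream().collect(Collectors.toMap(
                Function.identity(),
                String::length,
                // the merge function only sees the two values, not the key
                (u, v) -> {
                    throw new IllegalStateException("Duplicate key with values " + u + " and " + v);
                },
                LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("ccc", "a", "bb"))); // {ccc=3, a=1, bb=2}
    }
}
```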

Make your own Supplier, Accumulator and Combiner:
List<String> myList = Arrays.asList("a", "bb", "ccc");
// or since java 9 List.of("a", "bb", "ccc");
LinkedHashMap<String, Integer> mapInOrder = myList
.stream()
.collect(
LinkedHashMap::new, // Supplier LinkedHashMap to keep the order
(map, item) -> map.put(item, item.length()), // Accumulator
Map::putAll); // Combiner
System.out.println(mapInOrder); // prints {a=1, bb=2, ccc=3}

The right solution for this problem is to switch to a different toMap overload.
Current: the 2-parameter version
Map<Integer, String> mapping = list.stream().collect(Collectors.toMap(Entity::getId, Entity::getName));
Right: use the 4-parameter version of Collectors.toMap, whose supplier provides a new LinkedHashMap:
Map<Integer, String> mapping = list.stream().collect(Collectors.toMap(Entity::getId, Entity::getName, (u, v) -> u, LinkedHashMap::new));
Note that the merge function (u, v) -> u keeps the first value when two entities share an id.

In Kotlin, toMap() is order-preserving.
fun <K, V> Iterable<Pair<K, V>>.toMap(): Map<K, V>
Returns a new map containing all key-value pairs from the given collection of pairs.
The returned map preserves the entry iteration order of the original collection. If any of two pairs would have the same key the last one gets added to the map.
Here's its implementation:
public fun <K, V> Iterable<Pair<K, V>>.toMap(): Map<K, V> {
if (this is Collection) {
return when (size) {
0 -> emptyMap()
1 -> mapOf(if (this is List) this[0] else iterator().next())
else -> toMap(LinkedHashMap<K, V>(mapCapacity(size)))
}
}
return toMap(LinkedHashMap<K, V>()).optimizeReadOnlyMap()
}
The usage is simply:
val strings = listOf("a", "bb", "ccc")
val map = strings.map { it to it.length }.toMap()
The underlying collection for map is a LinkedHashMap (which is insertion-ordered).

A simple function to map a list of objects by some field:
public static <T, E> Map<E, T> toLinkedHashMap(List<T> list, Function<T, E> someFunction) {
return list.stream()
.collect(Collectors.toMap(
someFunction,
myObject -> myObject,
(first, second) -> first, // on duplicate keys, keep the first object
LinkedHashMap::new)
);
}
Map<String, MyObject> myObjectsByIdMap1 = toLinkedHashMap(
listOfMyObjects,
MyObject::getSomeStringField
);
Map<Integer, MyObject> myObjectsByIdMap2 = toLinkedHashMap(
listOfMyObjects,
MyObject::getSomeIntegerField
);

Since Java 9 you can collect a list of map entries with the same order as in the original list:
List<String> strings = Arrays.asList("a", "bb", "ccc");
List<Map.Entry<String, Integer>> entries = strings.stream()
.map(e -> Map.entry(e, e.length()))
.collect(Collectors.toList());
System.out.println(entries); // [a=1, bb=2, ccc=3]
Or you can collect a list of maps with a single entry in the same way:
List<String> strings = Arrays.asList("a", "bb", "ccc");
List<Map<String, Integer>> maps = strings.stream()
.map(e -> Map.of(e, e.length()))
.collect(Collectors.toList());
System.out.println(maps); // [{a=1}, {bb=2}, {ccc=3}]

How to merge lists of Map with Lists values using Java Streams API?

How can I reduce the Map<X, List<String>> grouping by the X.p and join all the list values at the same time, so that I have Map<Integer, List<String>> at the end?
This is what I've tried so far:
class X {
int p;
int q;
public X(int p, int q) { this.p = p; this.q = q; }
}
Map<X, List<String>> x = new HashMap<>();
x.put(new X(123,5), Arrays.asList("A","B"));
x.put(new X(123,6), Arrays.asList("C","D"));
x.put(new X(124,7), Arrays.asList("E","F"));
Map<Integer, List<String>> z = x.entrySet().stream().collect(Collectors.groupingBy(
entry -> entry.getKey().p,
mapping(Map.Entry::getValue,
reducing(new ArrayList<>(), (a, b) -> { a.addAll(b); return a; }))));
System.out.println("z="+z);
But the result is: z={123=[E, F, A, B, C, D], 124=[E, F, A, B, C, D]}.
I want to have z={123=[A, B, C, D], 124=[E, F]}

Here's one way to do it using two Stream pipelines :
Map<Integer, List<String>> z =
// first process the entries of the original Map and produce a
// Map<Integer,List<List<String>>>
x.entrySet()
.stream()
.collect(Collectors.groupingBy(entry -> entry.getKey().p,
mapping(Map.Entry::getValue,
toList())))
// then process the entries of the intermediate Map and produce a
// Map<Integer,List<String>>
.entrySet()
.stream()
.collect (toMap (Map.Entry::getKey,
e -> e.getValue()
.stream()
.flatMap(List::stream)
.collect(toList())));
Java 9 added a flatMapping Collector that makes this easier (I learned about this feature thanks to Holger).
Output :
z={123=[A, B, C, D], 124=[E, F]}

You are using the reducing collector incorrectly. The first argument must be an identity value to the reduction operation. But you are modifying it by adding values to it, which perfectly explains the result: all values are added to the same ArrayList which is expected to be the invariant identity value.
What you want to do is a Mutable reduction and Collectors.reducing is not appropriate for that. You may create an appropriate collector using the method Collector.of(…):
Map<Integer, List<String>> z = x.entrySet().stream().collect(groupingBy(
entry -> entry.getKey().p, Collector.of(
ArrayList::new, (l,e)->l.addAll(e.getValue()), (a,b)->{a.addAll(b);return a;})));
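The difference between the two approaches can be reproduced with the following sketch. To keep it short, it assumes the input is a list of entries whose key is already the int p from the question's class X; the class and method names are made up for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ReducingVsCollectorOf {
    // Simplified input: the key is already the int p of the question's class X.
    static List<Map.Entry<Integer, List<String>>> sample() {
        List<Map.Entry<Integer, List<String>>> entries = new ArrayList<>();
        entries.add(new SimpleEntry<>(123, Arrays.asList("A", "B")));
        entries.add(new SimpleEntry<>(123, Arrays.asList("C", "D")));
        entries.add(new SimpleEntry<>(124, Arrays.asList("E", "F")));
        return entries;
    }

    // Misuse: the "identity" list is mutated and shared by every group.
    static Map<Integer, List<String>> broken() {
        List<String> identity = new ArrayList<>();
        return sample().stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue,
                        Collectors.reducing(identity, (a, b) -> { a.addAll(b); return a; }))));
    }

    // Mutable reduction: the supplier creates a fresh ArrayList per group.
    static Map<Integer, List<String>> fixed() {
        Collector<Map.Entry<Integer, List<String>>, List<String>, List<String>> concat =
                Collector.of(ArrayList::new,
                        (list, e) -> list.addAll(e.getValue()),
                        (a, b) -> { a.addAll(b); return a; });
        return sample().stream().collect(Collectors.groupingBy(Map.Entry::getKey, concat));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> b = broken();
        System.out.println(b.get(123) == b.get(124)); // true: both groups share one list
        Map<Integer, List<String>> f = fixed();
        System.out.println(f.get(123)); // [A, B, C, D]
        System.out.println(f.get(124)); // [E, F]
    }
}
```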

There is a way to achieve that in one run by writing your own Collector:
Map<Integer, List<String>> z = x.entrySet().stream().collect(
Collectors.groupingBy(entry -> entry.getKey().p,
Collectors.mapping(Entry::getValue,
Collector.of(ArrayList::new, (a, b) -> a.addAll(b), (a, b) -> {
a.addAll(b);
return a;
})
)
)
);

Using the EntryStream class of my StreamEx library such tasks can be solved quite easily:
Map<Integer, List<String>> z = EntryStream.of(x)
.mapKeys(k -> k.p)
.flatMapValues(List::stream)
.grouping();
Internally it's transformed to something like this:
Map<Integer, List<String>> z = x.entrySet().stream()
.map(e -> new AbstractMap.SimpleImmutableEntry<>(e.getKey().p, e.getValue()))
.<Entry<Integer, String>>flatMap(e -> e.getValue().stream()
.map(s -> new AbstractMap.SimpleImmutableEntry<>(e.getKey(), s)))
.collect(Collectors.groupingBy(e -> e.getKey(),
Collectors.mapping(e -> e.getValue(), Collectors.toList())));
So it's actually a single stream pipeline.
If you don't want to use the third-party code, you can simplify the above version a little:
Map<Integer, List<String>> z = x.entrySet().stream()
.<Entry<Integer, String>>flatMap(e -> e.getValue().stream()
.map(s -> new AbstractMap.SimpleEntry<>(e.getKey().p, s)))
.collect(Collectors.groupingBy(e -> e.getKey(),
Collectors.mapping(e -> e.getValue(), Collectors.toList())));
Though it still looks ugly.
Finally, please note that in JDK 9 there's a new standard collector called flatMapping, which can be implemented in the following way:
public static <T, U, A, R>
Collector<T, ?, R> flatMapping(Function<? super T, ? extends Stream<? extends U>> mapper,
Collector<? super U, A, R> downstream) {
BiConsumer<A, ? super U> downstreamAccumulator = downstream.accumulator();
return Collector.of(downstream.supplier(),
(r, t) -> {
try (Stream<? extends U> result = mapper.apply(t)) {
if (result != null)
result.sequential().forEach(u -> downstreamAccumulator.accept(r, u));
}
},
downstream.combiner(), downstream.finisher(),
downstream.characteristics().toArray(new Collector.Characteristics[0]));
}
Using this collector, your task can be solved more simply, without additional libraries:
Map<Integer, List<String>> z = x.entrySet().stream()
.map(e -> new AbstractMap.SimpleImmutableEntry<>(e.getKey().p, e.getValue()))
.collect(Collectors.groupingBy(e -> e.getKey(),
flatMapping(e -> e.getValue().stream(), Collectors.toList())));
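On Java 9 or later the built-in Collectors.flatMapping can be used directly, so the hand-written version is not needed. The sketch below assumes Java 9+ and, to stay short, uses a list of entries whose key is already the int p; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlatMappingExample {
    // Simplified input: the key is already the int p of the question's class X.
    static List<Map.Entry<Integer, List<String>>> sample() {
        List<Map.Entry<Integer, List<String>>> entries = new ArrayList<>();
        entries.add(Map.entry(123, List.of("A", "B")));
        entries.add(Map.entry(123, List.of("C", "D")));
        entries.add(Map.entry(124, List.of("E", "F")));
        return entries;
    }

    static Map<Integer, List<String>> joinByKey(List<Map.Entry<Integer, List<String>>> entries) {
        return entries.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,
                // flatten each entry's list into the group's result list
                Collectors.flatMapping(e -> e.getValue().stream(), Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> z = joinByKey(sample());
        System.out.println(z.get(123)); // [A, B, C, D]
        System.out.println(z.get(124)); // [E, F]
    }
}
```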
