groupingBy and filter in one step - java

I have a Stream<String>, and I want a Map<Integer, String>. Let's call my classifier function getKey(String) - it can be expensive. Sometimes it returns zero, which means that the String should be discarded and not included in the resulting map.
So, I can use this code:
Stream<String> stringStream;
Map<Integer, String> result =
stringStream.collect(Collectors.groupingBy(this::getKey, Collectors.joining()));
result.remove(0);
This first adds the unwanted Strings to the Map keyed by zero, and then removes them. There may be a lot of them. Is there an elegant way to avoid adding them to the map in the first place?
I don't want to add a filter step before grouping, because that would mean executing the decision/classification code twice.

You said that calling getKey is expensive, but you could still map the elements of the stream up-front before filtering them. The call to getKey will be only done once in this case.
Map<Integer, String> result =
stringStream.map(s -> new SimpleEntry<>(this.getKey(s), s))
.filter(e -> e.getKey() != 0)
.collect(groupingBy(Map.Entry::getKey, mapping(Map.Entry::getValue, joining())));
Note that there are no tuple classes in the standard API. You may roll your own or use AbstractMap.SimpleEntry as a substitute.
Alternatively, if you think the first version creates a lot of entries, you can use the collect method where you provide yourself the supplier, accumulator and combiner.
Map<Integer, String> result = stringStream
.collect(HashMap::new,
(m, e) -> {
Integer key = this.getKey(e);
if(key != 0) {
m.merge(key, e, String::concat);
}
},
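(m1, m2) -> m2.forEach((k, v) -> m1.merge(k, v, String::concat))); // merge per key; putAll would overwrite values on a parallel stream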
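To make the single-pass idea concrete, here is a minimal, self-contained sketch. The classifier below (string length, with strings starting with "#" mapped to 0) is only a stand-in for the expensive getKey, and the class and method names are made up for illustration. The combiner merges values per key, so partial results would be concatenated rather than overwritten if the stream were parallel.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

final class GroupAndFilterSketch {

    // Hypothetical stand-in for the expensive classifier: 0 means "discard".
    static int getKey(String s) {
        return s.startsWith("#") ? 0 : s.length();
    }

    // Classifies each element exactly once and never stores the discarded ones.
    static Map<Integer, String> groupNonZero(Stream<String> strings) {
        return strings.collect(
                HashMap::new,
                (map, s) -> {
                    int key = getKey(s);
                    if (key != 0) {
                        map.merge(key, s, String::concat);
                    }
                },
                // Merge per key instead of putAll, so partial maps are combined, not overwritten.
                (left, right) -> right.forEach((k, v) -> left.merge(k, v, String::concat)));
    }

    public static void main(String[] args) {
        System.out.println(groupNonZero(Stream.of("ab", "#skip", "cd", "xyz")));
    }
}
```

The element "#skip" never reaches the map, and getKey runs once per element.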

You may use a stream of pairs like this:
stringStream.map(x -> new Pair<>(getKey(x), x))
.filter(pair -> pair.left != 0) // or whatever predicate
.collect(Collectors.groupingBy(pair -> pair.left,
Collectors.mapping(pair -> pair.right, Collectors.joining())));
This code assumes a simple generic Pair class with two fields, left and right.
Some third-party libraries like my StreamEx provide additional methods to remove the boilerplate:
StreamEx.of(stringStream)
.mapToEntry(this::getKey, x -> x)
.filterKeys(key -> key != 0) // or whatever
.grouping(Collectors.joining());

Related

Create an ImmutableMap using Java 8

How can the following method be written using Java Stream API?
public Map<String, String> getMap() {
Map<String, String> mapB = new HashMap<>();
for (String parameterKey : listOfKeys) {
String parameterValue = mapA.get(parameterKey);
mapB.put(parameterKey, Objects.requireNonNullElse(parameterValue, ""));
}
return ImmutableMap.copyOf(mapB);
}
I tried something like this:
return listOfKeys.stream()
.map(firstMap::get)
.collect( ? )
But I don't know how to continue from here.
Guava
If you need Guava's ImmutableMap, you can make use of the Collector returned by ImmutableMap.toImmutableMap() available since Guava's version 21.0.
Note: the minimum JDK version required by the Guava 21.0 release is Java 8, hence the code below is a valid Java 8 compliant solution.
public Map<String, String> getMap() {
return listOfKeys.stream()
.collect(ImmutableMap.toImmutableMap(
Function.identity(), // keyMapper - generating keys
str -> mapA.getOrDefault(str, ""), // valueMapper - generating values
(left, right) -> left // mergeFunction - resolving duplicates
));
}
If you're using a Guava version earlier than 21.0, you can build the resulting map with the standard JDK collector toMap() and wrap it in an ImmutableMap by means of collectingAndThen().
This approach is cleaner than cramming the stream into the argument of the copyOf() method, as suggested in another answer:
public Map<String, String> getMap() {
return listOfKeys.stream()
.collect(Collectors.collectingAndThen(
Collectors.toMap(
Function.identity(), // keyMapper - generating keys
str -> mapA.getOrDefault(str, ""), // valueMapper - generating values
(left, right) -> right), // mergeFunction - resolving duplicates
ImmutableMap::copyOf
));
}
Note:
If all elements contained in listOfKeys are guaranteed to be unique, then you can remove mergeFunction (the third argument of the collector).
I've replaced Objects.requireNonNullElse() with Map.getOrDefault(), which guards against the case when a key is not present in mapA. If you also want to protect against situations when the value associated with the key is null, then keep requireNonNullElse() instead. But you need to be aware that it's a sign of a faulty design if null values are being stored in the map because they have some special meaning in your business logic (and therefore you can't get rid of them).
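The difference between the two guards is easy to miss, so here is a small sketch of it. The map contents and the class name are invented for illustration; Objects.requireNonNullElse requires Java 9 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class DefaultValueSketch {

    // Guards only against a missing key: a key explicitly mapped to null still yields null.
    static String viaGetOrDefault(Map<String, String> map, String key) {
        return map.getOrDefault(key, "");
    }

    // Guards against both a missing key and a key mapped to null.
    static String viaRequireNonNullElse(Map<String, String> map, String key) {
        return Objects.requireNonNullElse(map.get(key), "");
    }

    public static void main(String[] args) {
        Map<String, String> mapA = new HashMap<>();
        mapA.put("present", "value");
        mapA.put("nullValued", null); // HashMap permits null values

        System.out.println(viaGetOrDefault(mapA, "missing"));          // empty string
        System.out.println(viaGetOrDefault(mapA, "nullValued"));       // null
        System.out.println(viaRequireNonNullElse(mapA, "nullValued")); // empty string
    }
}
```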
Standard JDK
Otherwise, you can use the collector Collectors.toUnmodifiableMap(), available since Java 10.
Internally it combines the collector toMap(), which accumulates the stream data into a map, with collectingAndThen(), which performs the final transformation into a so-called unmodifiable map created via Map.ofEntries().
public Map<String, String> getMap() {
return listOfKeys.stream()
.collect(Collectors.toUnmodifiableMap(
Function.identity(), // keyMapper - generating keys
str -> mapA.getOrDefault(str, ""), // valueMapper - generating values
(left, right) -> left // mergeFunction - resolving duplicates
));
}
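As a quick check of the behaviour described above, the following sketch passes listOfKeys and mapA in as parameters (an assumption made here so the example is self-contained) and shows that the result rejects modification. It needs Java 10 or later.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class UnmodifiableMapSketch {

    static Map<String, String> getMap(List<String> listOfKeys, Map<String, String> mapA) {
        return listOfKeys.stream()
                .collect(Collectors.toUnmodifiableMap(
                        Function.identity(),
                        key -> mapA.getOrDefault(key, ""),
                        (left, right) -> left)); // duplicate keys collapse to a single entry
    }

    public static void main(String[] args) {
        Map<String, String> result = getMap(List.of("a", "b", "a"), Map.of("a", "1"));
        System.out.println(result);
        try {
            result.put("c", "3");
        } catch (UnsupportedOperationException e) {
            System.out.println("result is unmodifiable");
        }
    }
}
```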

Java Map with List value to list using streams?

I am trying to rewrite the method below using streams but I am not sure what the best approach is? If I use flatMap on the values of the entrySet(), I lose the reference to the current key.
private List<String> asList(final Map<String, List<String>> map) {
final List<String> result = new ArrayList<>();
for (final Entry<String, List<String>> entry : map.entrySet()) {
final List<String> values = entry.getValue();
values.forEach(value -> result.add(String.format("%s-%s", entry.getKey(), value)));
}
return result;
}
The best I managed to do is the following:
return map.keySet().stream()
.flatMap(key -> map.get(key).stream()
.map(value -> new AbstractMap.SimpleEntry<>(key, value)))
.map(e -> String.format("%s-%s", e.getKey(), e.getValue()))
.collect(Collectors.toList());
Is there a simpler way without resorting to creating new Entry objects?
A stream is a sequence of values (possibly unordered / parallel). map() is what you use when you want to map a single value in the sequence to some single other value. Say, map "alturkovic" to "ALTURKOVIC". flatMap() is what you use when you want to map a single value in the sequence to 0, 1, or many other values. Hence why a flatMap lambda needs to turn a value into a stream of values. flatMap can thus be used to take, say, a list of lists of string, and turn that into a stream of just strings.
Here, you want to map a single entry from your map (a single key/value pair) into a single element (a string describing it). 1 value to 1 value. That means flatMap is not appropriate. You're looking for just map.
Furthermore, you need both key and value to perform your mapping op, so keySet() is also not appropriate. You're looking for entrySet(), which gives you a set of all k/v pairs, just what we need.
That gets us to:
map.entrySet().stream()
.map(e -> String.format("%s-%s", e.getKey(), e.getValue()))
.collect(Collectors.toList());
Your original code makes no effort to treat a single value from a map (which is a List<String>) as separate values; you just call .toString() on the entire ordeal, and be done with it. This means the produced string looks like, say, [Hello, World] given a map value of List.of("Hello", "World"). If you don't want this, you still don't want flatmap, because streams are also homogenous - the values in a stream are all of the same kind, and thus a stream of 'key1 value1 value2 key2 valueA valueB' is not what you'd want:
map.entrySet().stream()
.map(e -> String.format("%s-%s", e.getKey(), myPrint(e.getValue())))
.collect(Collectors.toList());
public static String myPrint(List<String> in) {
// write your own algorithm here
}
Stream API just isn't the right tool to replace that myPrint method.
A third alternative is that you want to smear out the map; you want each string in a mapvalue's List<String> to first be matched with the key (so that's re-stating that key rather a lot), and then do something to that. NOW flatMap IS appropriate - you want a stream of k/v pairs first, and then do something to that, and each element is now of the same kind. You want to turn the map:
key1 = [value1, value2]
key2 = [value3, value4]
first into a stream:
key1:value1
key1:value2
key2:value3
key2:value4
and take it from there. This explodes a single k/v entry in your map into more than one, thus, flatmapping needed:
return map.entrySet().stream()
.flatMap(e -> e.getValue().stream()
.map(v -> String.format("%s-%s", e.getKey(), v)))
.collect(Collectors.toList());
Going inside-out, it maps a single entry within a list that belongs to a single k/v pair into the string Key-SingleItemFromItsList.
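The flatMap variant can be tried out with the sketch below. The class name and sample data are made up; a LinkedHashMap is used only so that the iteration order is predictable.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class FlattenMapSketch {

    // Expands every (key, [v1, v2, ...]) entry into "key-v1", "key-v2", ...
    static List<String> asList(Map<String, List<String>> map) {
        return map.entrySet().stream()
                .flatMap(e -> e.getValue().stream()
                        .map(v -> String.format("%s-%s", e.getKey(), v)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        map.put("key1", Arrays.asList("value1", "value2"));
        map.put("key2", Arrays.asList("value3", "value4"));
        System.out.println(asList(map));
    }
}
```

The key stays in scope inside the flatMap lambda, so no intermediate Entry objects are needed.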
Adding my two cents to the excellent answer by @rzwitserloot, which already explains flatMap and map.
List<String> resultLists = myMap.entrySet().stream()
.flatMap(mapEntry -> printEntries(mapEntry.getKey(),mapEntry.getValue())).collect(Collectors.toList());
System.out.println(resultLists);
Splitting this to a separate method gives good readability IMO,
private static Stream<String> printEntries(String key, List<String> values) {
return values.stream().map(val -> String.format("%s-%s",key,val));
}

How can I concatenate two Java streams while specifying specific logic for items that don't contain duplicates?

I have two maps that use the same object as keys. I want to merge these two streams by key. When a key exists in both maps, I want the resulting map to run a formula. When a key exists in a single map I want the value to be 0.
Map<MyKey, Integer> map1;
Map<MyKey, Integer> map2;
Map<MyKey, Double> result =
Stream.concat(map1.entrySet().stream(), map2.entrySet().stream())
.collect(Collectors.toMap(
Map.Entry::getKey, Map.Entry::getValue,
(val1, val2) -> (val1 / (double)val2) * 12D));
This will use the formula if the key exists in both maps, but I need an easy way to set the values for keys that only existed in one of the two maps to 0D.
I can do this by doing set math and trying to calculate the inner-join of the two keySets, and then subtracting the inner-join result from the full outer join of them... but this is a lot of work that feels unnecessary.
Is there a better approach to this, or something I can easily do using the Streaming API?
Here is a simple way: stream only the keys and look up the values, leaving the original maps unchanged.
Map<String, Double> result =
Stream.concat(map1.keySet().stream(), map2.keySet().stream())
.distinct()
.collect(Collectors.toMap(k -> k, k -> map1.containsKey(k) && map2.containsKey(k)
? map1.get(k) * 12d / map2.get(k) : 0d));
Test
Map<String, Integer> map1 = new HashMap<>();
Map<String, Integer> map2 = new HashMap<>();
map1.put("A", 1);
map1.put("B", 2);
map2.put("A", 3);
map2.put("C", 4);
// code above here
result.entrySet().forEach(System.out::println);
Output
A=4.0
B=0.0
C=0.0
For this solution to work, your initial maps should be Map<MyKey, Double>. I'll try to find another solution that will work if the values are initially Integer.
You don't even need streams for this! You should simply be able to use Map#replaceAll to modify one of the Maps:
map1.replaceAll((k, v) -> map2.containsKey(k) ? 12D * v / map2.get(k) : 0D);
Now, you just need to add every key to map1 that is in map2, but not map1:
map2.forEach((k, v) -> map1.putIfAbsent(k, 0D));
If you don't want to modify either of the Maps, then you should create a copy of map1 first and apply these operations to the copy.
Stream.concat is not the right approach here, as you are throwing the elements of the two map together, creating the need to separate them afterward.
You can simplify this by directly doing the intended task: apply your function to the intersection of the keys and process the other keys differently. E.g. when you stream over one map instead of the concatenation of two maps, you only have to check for the presence in the other map to either apply the function or use zero. Then, the keys only present in the second map need to be put with zero in a second step:
Map<MyKey, Double> result = map1.entrySet().stream()
.collect(Collectors.collectingAndThen(
Collectors.toMap(Map.Entry::getKey, e -> {
Integer val2 = map2.get(e.getKey());
return val2==null? 0.0: e.getValue()*12.0/val2;
}),
m -> {
Map<MyKey, Double> rMap = m.getClass()==HashMap.class? m: new HashMap<>(m);
map2.keySet().forEach(key -> rMap.putIfAbsent(key, 0.0));
return rMap;
}));
This clearly suffers from the fact that Streams don’t offer convenience methods for processing map entries. Also, we have to deal with the unspecified map type for the second processing step. If we provided a map supplier, we would also have to provide a merge function, making the code even more verbose.
The simpler solution is to use the Collection API rather than the Stream API:
Map<MyKey, Double> result = new HashMap<>(Math.max(map1.size(),map2.size()));
map2.forEach((key, value) -> result.put(key, map1.getOrDefault(key, 0)*12D/value));
map1.keySet().forEach(key -> result.putIfAbsent(key, 0.0));
This is clearly less verbose and potentially more efficient as it omits some of the Stream solution’s processing steps and provides the right initial capacity to the map. It utilizes the fact that the formula evaluates to the desired zero result if we use zero as default for the first map’s value for absent keys. If you want to use a different formula which doesn’t have this property or want to avoid the calculation for absent mappings, you’d have to use
Map<MyKey, Double> result = new HashMap<>(Math.max(map1.size(),map2.size()));
map2.forEach((key, value2) -> {
Integer value1 = map1.get(key);
result.put(key, value1 != null? value1*12D/value2: 0.0);
});
map1.keySet().forEach(key -> result.putIfAbsent(key, 0.0));
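The Collection API variant can be wrapped into a small generic helper and run against the sample data from the earlier test. This is only a sketch: the class and method names are invented, and it assumes map2 contains no zero values (a zero would make the division yield Infinity or NaN).

```java
import java.util.HashMap;
import java.util.Map;

final class MergeMapsSketch {

    // Keys present in both maps get value1 * 12 / value2; keys in only one map get 0.0.
    static <K> Map<K, Double> combine(Map<K, Integer> map1, Map<K, Integer> map2) {
        Map<K, Double> result = new HashMap<>();
        map2.forEach((key, value2) -> {
            Integer value1 = map1.get(key);
            result.put(key, value1 != null ? value1 * 12D / value2 : 0.0);
        });
        map1.keySet().forEach(key -> result.putIfAbsent(key, 0.0));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> map1 = new HashMap<>();
        Map<String, Integer> map2 = new HashMap<>();
        map1.put("A", 1);
        map1.put("B", 2);
        map2.put("A", 3);
        map2.put("C", 4);
        combine(map1, map2).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Neither input map is modified, and the input values may stay Integer.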

Java Stream collect - how to deduce type?

I have been given a stream of words, Stream<String> words, and a class Pair<String,Integer> which realizes a simple tuple for (someString, someInt) with getter and setter methods for both elements called getFirst,setFirst,getSecond,setSecond.
I am now supposed to box each word of the stream into a Pair (word, 1), and then use a Collector to somehow make the whole thing tell me how often each word is in the text. Now I've looked up a Collector that should let me do what I want to, and passed it as .collect(...) to the stream.
But the whole thing looks so complex, and the type inference, deduction and wildcards floating around in that topic aren't making it any easier, so I now have no clue what it is I've created.
I've tried deducing it from the API, and tried all the things I could come up with, but none of it seems to match:
words
.map(x -> new Pair<String,Integer>(x,1))
.collect(Collectors.groupingBy(
x -> x.getFirst(),
Collectors.reducing(
(a,b) -> new Pair<String,Integer>(a.getFirst(), a.getSecond() + b.getSecond())
)
));
Try using Collectors.toMap:
Collection<Pair<String, Integer>> values = words.collect(Collectors.toMap(
Function.identity(),
s -> new Pair<>(s, 1),
(a, b) -> {a.setSecond(a.getSecond() + b.getSecond()); return a;}
)).values();
It creates a map from your stream, using provided:
keyMapper - a mapping function to produce keys
valueMapper - a mapping function to produce values
mergeFunction - a merge function, used to resolve collisions between values associated with the same key
So it groups your Pairs by string value to a map, and then you just call .values() to get a collection of Pairs
The easiest (though not necessarily most efficient) solution would be to group to a map and then convert the entries to pairs:
List<Pair<String, Integer>> pairs = words
.collect(Collectors.groupingBy(x -> x, Collectors.summingInt(x -> 1)))
.entrySet()
.stream()
.map(e -> new Pair<>(e.getKey(), e.getValue()))
.collect(Collectors.toList());
I agree that entering the world of collectors can be a bit frightening at the beginning, particularly if you need to deal with generic type parameters.
There are many ways to solve your problem, both with and without streams.
With streams:
Map<String, Pair<String, Integer>> map = words
.collect(Collectors.toMap(
word -> word,
word -> new Pair<>(word, 1),
(o, n) -> {
o.setSecond(o.getSecond() + n.getSecond());
return o;
}));
Collection<Pair<String, Integer>> result = map.values();
Collectors.toMap works by transforming each element of the stream into the keys (this is the 1st argument word -> word, which means we leave the word as is, so that it will be the key of the map), and by transforming each element of the stream into the values (this is the 2nd argument word -> new Pair<>(word, 1), which means that we've found the word for the first time, so we're creating a new Pair instance for that word with a count of 1).
The 3rd argument is a merge function that is to be used to merge values when the 1st argument returns a key that already belongs to the map. As maps can't have more than one entry for the same key, we need a way to merge the value that is already in the map for that key, with the new value produced by the 2nd argument. In this case, o stands for the old value and n for the new value. The way I merge values is by summing the counts for the word and setting the new count in the Pair instance that corresponds to the old value. There's no need to create a new instance of Pair with the word and the new count, as it's safe to accumulate the count by mutating the old instance of Pair.
Without streams:
Map<String, Pair<String, Integer>> map = new HashMap<>();
words.forEach(word -> map.merge(
word,
new Pair<>(word, 1),
(o, n) -> {
o.setSecond(o.getSecond() + n.getSecond());
return o;
}));
Collection<Pair<String, Integer>> result = map.values();
This uses Map.merge and has similar semantics as the previous code.
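As for the type asked about in the question: the one-argument Collectors.reducing returns a collector producing an Optional, so the expression in the question evaluates to a Map<String, Optional<Pair<String, Integer>>>. The sketch below shows that, next to the three-argument reducing, which takes an identity and a mapper and therefore needs no Optional. The Pair class here is a minimal stand-in for the one described in the question, and the class and method names are made up.

```java
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ReducingTypeSketch {

    // Minimal stand-in for the Pair class from the question (getters only).
    static final class Pair<A, B> {
        private final A first;
        private final B second;
        Pair(A first, B second) { this.first = first; this.second = second; }
        A getFirst() { return first; }
        B getSecond() { return second; }
    }

    // One-argument reducing has no identity value, hence the Optional in the result type.
    static Map<String, Optional<Pair<String, Integer>>> asInQuestion(Stream<String> words) {
        return words
                .map(w -> new Pair<String, Integer>(w, 1))
                .collect(Collectors.groupingBy(
                        Pair::getFirst,
                        Collectors.reducing((a, b) ->
                                new Pair<String, Integer>(a.getFirst(), a.getSecond() + b.getSecond()))));
    }

    // Three-argument reducing: identity 0, mapper to the count, then summation.
    static Map<String, Integer> counts(Stream<String> words) {
        return words
                .map(w -> new Pair<String, Integer>(w, 1))
                .collect(Collectors.groupingBy(
                        Pair::getFirst,
                        Collectors.reducing(0, Pair::getSecond, Integer::sum)));
    }

    public static void main(String[] args) {
        System.out.println(counts(Stream.of("a", "b", "a")));
    }
}
```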

How to use if-else logic in Java 8 stream forEach

What I want to do is shown below in 2 stream calls. I want to split a collection into 2 new collections based on some condition. Ideally I want to do it in 1. I've seen conditions used for the .map function of streams, but couldn't find anything for the forEach. What is the best way to achieve what I want?
animalMap.entrySet().stream()
.filter(pair-> pair.getValue() != null)
.forEach(pair-> myMap.put(pair.getKey(), pair.getValue()));
animalMap.entrySet().stream()
.filter(pair-> pair.getValue() == null)
.forEach(pair-> myList.add(pair.getKey()));
Just put the condition into the lambda itself, e.g.
animalMap.entrySet().stream()
.forEach(
pair -> {
if (pair.getValue() != null) {
myMap.put(pair.getKey(), pair.getValue());
} else {
myList.add(pair.getKey());
}
}
);
Of course, this assumes that both collections (myMap and myList) are declared and initialized prior to the above piece of code.
Update: using Map.forEach makes the code shorter, plus more efficient and readable, as Jorn Vernee kindly suggested:
animalMap.forEach(
(key, value) -> {
if (value != null) {
myMap.put(key, value);
} else {
myList.add(key);
}
}
);
In most cases, when you find yourself using forEach on a Stream, you should rethink whether you are using the right tool for your job or whether you are using it the right way.
Generally, you should look for an appropriate terminal operation doing what you want to achieve or for an appropriate Collector. Now, there are Collectors for producing Maps and Lists, but no out-of-the-box collector for combining two different collectors based on a predicate.
Now, this answer contains a collector for combining two collectors. Using this collector, you can achieve the task as
Pair<Map<KeyType, Animal>, List<KeyType>> pair = animalMap.entrySet().stream()
.collect(conditional(entry -> entry.getValue() != null,
Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue),
Collectors.mapping(Map.Entry::getKey, Collectors.toList()) ));
Map<KeyType,Animal> myMap = pair.a;
List<KeyType> myList = pair.b;
But maybe, you can solve this specific task in a simpler way. One of your results matches the input type; it’s the same map just stripped of the entries which map to null. If your original map is mutable and you don’t need it afterwards, you can just collect the list and remove these keys from the original map as they are mutually exclusive:
List<KeyType> myList=animalMap.entrySet().stream()
.filter(pair -> pair.getValue() == null)
.map(Map.Entry::getKey)
.collect(Collectors.toList());
animalMap.keySet().removeAll(myList);
Note that you can remove mappings to null even without having the list of the other keys:
animalMap.values().removeIf(Objects::isNull);
or
animalMap.values().removeAll(Collections.singleton(null));
If you can’t (or don’t want to) modify the original map, there is still a solution without a custom collector. As hinted in Alexis C.’s answer, partitioningBy is going into the right direction, but you may simplify it:
// Collectors.toMap throws a NullPointerException for null values, so partition the keys rather than the entries
Map<Boolean,List<KeyType>> tmp = animalMap.entrySet().stream()
.collect(Collectors.partitioningBy(pair -> pair.getValue() != null,
Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
List<KeyType> myList = tmp.get(false);
Map<KeyType,Animal> myMap = new HashMap<>(animalMap);
myMap.keySet().removeAll(myList);
The bottom line is, don’t forget about ordinary Collection operations, you don’t have to do everything with the new Stream API.
The problem with using stream().forEach(..) with a call to add or put inside the forEach (so you mutate the external myMap or myList instance) is that you can easily run into concurrency issues if someone makes the stream parallel and the collection you are modifying is not thread-safe.
One approach you can take is to first partition the entries in the original map. Once you have that, grab the corresponding list of entries and collect them in the appropriate map and list.
Map<Boolean, List<Map.Entry<K, V>>> partitions =
animalMap.entrySet()
.stream()
.collect(partitioningBy(e -> e.getValue() == null));
Map<K, V> myMap =
partitions.get(false)
.stream()
.collect(toMap(Map.Entry::getKey, Map.Entry::getValue));
List<K> myList =
partitions.get(true)
.stream()
.map(Map.Entry::getKey)
.collect(toList());
... or if you want to do it in one pass, implement a custom collector (assuming a Tuple2<E1, E2> class exists, you can create your own), e.g:
public static <K,V> Collector<Map.Entry<K, V>, ?, Tuple2<Map<K, V>, List<K>>> customCollector() {
return Collector.of(
() -> new Tuple2<>(new HashMap<>(), new ArrayList<>()),
(pair, entry) -> {
if(entry.getValue() == null) {
pair._2.add(entry.getKey());
} else {
pair._1.put(entry.getKey(), entry.getValue());
}
},
(p1, p2) -> {
p1._1.putAll(p2._1);
p1._2.addAll(p2._2);
return p1;
});
}
with its usage:
Tuple2<Map<K, V>, List<K>> pair =
animalMap.entrySet().parallelStream().collect(customCollector());
You can tune it more if you want, for example by providing a predicate as parameter.
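A possible shape for the predicate-taking variant is sketched below. The Tuple2 class, the collector name splitting and the sample data are all hypothetical; any pair type would work.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collector;

final class SplittingCollectorSketch {

    // Hypothetical minimal tuple holding the two results.
    static final class Tuple2<A, B> {
        final A _1;
        final B _2;
        Tuple2(A first, B second) { _1 = first; _2 = second; }
    }

    // Entries matching keepInMap go into the map; for all other entries only the key is listed.
    static <K, V> Collector<Map.Entry<K, V>, ?, Tuple2<Map<K, V>, List<K>>> splitting(
            Predicate<? super Map.Entry<K, V>> keepInMap) {
        return Collector.of(
                () -> new Tuple2<Map<K, V>, List<K>>(new HashMap<>(), new ArrayList<>()),
                (acc, entry) -> {
                    if (keepInMap.test(entry)) {
                        acc._1.put(entry.getKey(), entry.getValue());
                    } else {
                        acc._2.add(entry.getKey());
                    }
                },
                (left, right) -> {
                    left._1.putAll(right._1);
                    left._2.addAll(right._2);
                    return left;
                });
    }

    public static void main(String[] args) {
        Map<String, String> animalMap = new HashMap<>();
        animalMap.put("cat", "Felix");
        animalMap.put("dog", null);

        Tuple2<Map<String, String>, List<String>> result = animalMap.entrySet().stream()
                .collect(SplittingCollectorSketch.<String, String>splitting(e -> e.getValue() != null));
        System.out.println(result._1 + " " + result._2);
    }
}
```

Because the accumulator uses HashMap.put rather than Collectors.toMap, the predicate is free to route entries with null values into the map as well.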
I think it's possible in Java 9:
animalMap.entrySet().stream()
.forEach(
pair -> Optional.ofNullable(pair.getValue())
.ifPresentOrElse(v -> myMap.put(pair.getKey(), v), () -> myList.add(pair.getKey()))
);
Need the ifPresentOrElse for it to work though. (I think a for loop looks better.)
