What I want to do is shown below in 2 stream calls. I want to split a collection into 2 new collections based on some condition. Ideally I want to do it in 1. I've seen conditions used for the .map function of streams, but couldn't find anything for the forEach. What is the best way to achieve what I want?
animalMap.entrySet().stream()
.filter(pair-> pair.getValue() != null)
.forEach(pair-> myMap.put(pair.getKey(), pair.getValue()));
animalMap.entrySet().stream()
.filter(pair-> pair.getValue() == null)
.forEach(pair-> myList.add(pair.getKey()));
Just put the condition into the lambda itself, e.g.
animalMap.entrySet().stream()
.forEach(
pair -> {
if (pair.getValue() != null) {
myMap.put(pair.getKey(), pair.getValue());
} else {
myList.add(pair.getKey());
}
}
);
Of course, this assumes that both collections (myMap and myList) are declared and initialized prior to the above piece of code.
Update: using Map.forEach makes the code shorter, plus more efficient and readable, as Jorn Vernee kindly suggested:
animalMap.forEach(
(key, value) -> {
if (value != null) {
myMap.put(key, value);
} else {
myList.add(key);
}
}
);
In most cases, when you find yourself using forEach on a Stream, you should rethink whether you are using the right tool for your job or whether you are using it the right way.
Generally, you should look for an appropriate terminal operation doing what you want to achieve or for an appropriate Collector. Now, there are Collectors for producing Maps and Lists, but no out-of-the-box collector for combining two different collectors based on a predicate.
Now, this answer contains a collector for combining two collectors. Using this collector, you can achieve the task as
Pair<Map<KeyType, Animal>, List<KeyType>> pair = animalMap.entrySet().stream()
.collect(conditional(entry -> entry.getValue() != null,
Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue),
Collectors.mapping(Map.Entry::getKey, Collectors.toList()) ));
Map<KeyType,Animal> myMap = pair.a;
List<KeyType> myList = pair.b;
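The referenced answer is not reproduced here. As a rough sketch only, a conditional collector along those lines might look like the code below. The Pair holder with fields a and b, the local Box container and the split helper are assumptions made for illustration, not the code from that answer.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ConditionalCollectorSketch {

    // Hypothetical result holder; any pair type with two fields would do.
    static final class Pair<A, B> {
        final A a;
        final B b;

        Pair(A a, B b) {
            this.a = a;
            this.b = b;
        }
    }

    // Routes each element to one of two downstream collectors, depending on the predicate.
    static <T, A1, A2, R1, R2> Collector<T, ?, Pair<R1, R2>> conditional(
            Predicate<? super T> predicate,
            Collector<? super T, A1, R1> whenTrue,
            Collector<? super T, A2, R2> whenFalse) {

        // Mutable container holding the accumulation state of both downstream collectors.
        class Box {
            A1 left = whenTrue.supplier().get();
            A2 right = whenFalse.supplier().get();
        }

        return Collector.<T, Box, Pair<R1, R2>>of(
                () -> new Box(),
                (box, t) -> {
                    if (predicate.test(t)) {
                        whenTrue.accumulator().accept(box.left, t);
                    } else {
                        whenFalse.accumulator().accept(box.right, t);
                    }
                },
                (b1, b2) -> {
                    // Merge partial results, which only happens for parallel streams.
                    b1.left = whenTrue.combiner().apply(b1.left, b2.left);
                    b1.right = whenFalse.combiner().apply(b1.right, b2.right);
                    return b1;
                },
                box -> new Pair<>(
                        whenTrue.finisher().apply(box.left),
                        whenFalse.finisher().apply(box.right)));
    }

    // Example use: non-null mappings go into a map, keys mapped to null go into a list.
    static <K, V> Pair<Map<K, V>, List<K>> split(Map<K, V> source) {
        return source.entrySet().stream()
                .collect(conditional(
                        entry -> entry.getValue() != null,
                        Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue),
                        Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
    }
}
```

Since the predicate sends only non-null values to toMap, the NullPointerException that toMap throws for null values cannot occur in this usage.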
But maybe you can solve this specific task in a simpler way. One of your results matches the input type; it’s the same map, just stripped of the entries which map to null. If your original map is mutable and you don’t need it afterwards, you can just collect the list and remove these keys from the original map, as they are mutually exclusive:
List<KeyType> myList=animalMap.entrySet().stream()
.filter(pair -> pair.getValue() == null)
.map(Map.Entry::getKey)
.collect(Collectors.toList());
animalMap.keySet().removeAll(myList);
Note that you can remove mappings to null even without having the list of the other keys:
animalMap.values().removeIf(Objects::isNull);
or
animalMap.values().removeAll(Collections.singleton(null));
If you can’t (or don’t want to) modify the original map, there is still a solution without a custom collector. As hinted in Alexis C.’s answer, partitioningBy is going in the right direction, but you may simplify it:
Map<Boolean,Map<KeyType,Animal>> tmp = animalMap.entrySet().stream()
.collect(Collectors.partitioningBy(pair -> pair.getValue() != null,
Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)));
Map<KeyType,Animal> myMap = tmp.get(true);
List<KeyType> myList = new ArrayList<>(tmp.get(false).keySet());
The bottom line is, don’t forget about ordinary Collection operations, you don’t have to do everything with the new Stream API.
The problem with using stream().forEach(..) with a call to add or put inside the forEach (so that you mutate the external myMap or myList instance) is that you can easily run into concurrency issues if someone makes the stream parallel and the collection you are modifying is not thread-safe.
One approach you can take is to first partition the entries in the original map. Once you have that, grab the corresponding list of entries and collect them in the appropriate map and list.
Map<Boolean, List<Map.Entry<K, V>>> partitions =
animalMap.entrySet()
.stream()
.collect(partitioningBy(e -> e.getValue() == null));
Map<K, V> myMap =
partitions.get(false)
.stream()
.collect(toMap(Map.Entry::getKey, Map.Entry::getValue));
List<K> myList =
partitions.get(true)
.stream()
.map(Map.Entry::getKey)
.collect(toList());
... or if you want to do it in one pass, implement a custom collector (assuming a Tuple2<E1, E2> class exists, you can create your own), e.g:
public static <K,V> Collector<Map.Entry<K, V>, ?, Tuple2<Map<K, V>, List<K>>> customCollector() {
return Collector.of(
() -> new Tuple2<>(new HashMap<>(), new ArrayList<>()),
(pair, entry) -> {
if(entry.getValue() == null) {
pair._2.add(entry.getKey());
} else {
pair._1.put(entry.getKey(), entry.getValue());
}
},
(p1, p2) -> {
p1._1.putAll(p2._1);
p1._2.addAll(p2._2);
return p1;
});
}
with its usage:
Tuple2<Map<K, V>, List<K>> pair =
animalMap.entrySet().parallelStream().collect(customCollector());
You can tune it more if you want, for example by providing a predicate as parameter.
I think it's possible in Java 9:
animalMap.entrySet().stream()
.forEach(
pair -> Optional.ofNullable(pair.getValue())
.ifPresentOrElse(v -> myMap.put(pair.getKey(), v), () -> myList.add(pair.getKey())))
);
Need the ifPresentOrElse for it to work though. (I think a for loop looks better.)
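For comparison, the plain for loop mentioned above could look roughly like this. The class name ForLoopSplit and the generic split signature are made up for the example; myMap and myList are passed in already initialized.

```java
import java.util.List;
import java.util.Map;

class ForLoopSplit {

    // Copies non-null mappings into myMap and collects the keys mapped to null in myList.
    static <K, V> void split(Map<K, V> animalMap, Map<K, V> myMap, List<K> myList) {
        for (Map.Entry<K, V> entry : animalMap.entrySet()) {
            if (entry.getValue() != null) {
                myMap.put(entry.getKey(), entry.getValue());
            } else {
                myList.add(entry.getKey());
            }
        }
    }
}
```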
I am trying to rewrite the method below using streams, but I am not sure what the best approach is. If I use flatMap on the values of the entrySet(), I lose the reference to the current key.
private List<String> asList(final Map<String, List<String>> map) {
final List<String> result = new ArrayList<>();
for (final Entry<String, List<String>> entry : map.entrySet()) {
final List<String> values = entry.getValue();
values.forEach(value -> result.add(String.format("%s-%s", entry.getKey(), value)));
}
return result;
}
The best I managed to do is the following:
return map.keySet().stream()
.flatMap(key -> map.get(key).stream()
.map(value -> new AbstractMap.SimpleEntry<>(key, value)))
.map(e -> String.format("%s-%s", e.getKey(), e.getValue()))
.collect(Collectors.toList());
Is there a simpler way without resorting to creating new Entry objects?
A stream is a sequence of values (possibly unordered / parallel). map() is what you use when you want to map a single value in the sequence to some single other value. Say, map "alturkovic" to "ALTURKOVIC". flatMap() is what you use when you want to map a single value in the sequence to 0, 1, or many other values. Hence why a flatMap lambda needs to turn a value into a stream of values. flatMap can thus be used to take, say, a list of lists of string, and turn that into a stream of just strings.
Here, you want to map a single entry from your map (a single key/value pair) into a single element (a string describing it). 1 value to 1 value. That means flatMap is not appropriate. You're looking for just map.
Furthermore, you need both key and value to perform your mapping op, so keySet() is also not appropriate. You're looking for entrySet(), which gives you a set of all k/v pairs, just what we need.
That gets us to:
map.entrySet().stream()
.map(e -> String.format("%s-%s", e.getKey(), e.getValue()))
.collect(Collectors.toList());
Your original code makes no effort to treat a single value from a map (which is a List<String>) as separate values; you just call .toString() on the entire ordeal, and be done with it. This means the produced string looks like, say, [Hello, World] given a map value of List.of("Hello", "World"). If you don't want this, you still don't want flatMap, because streams are also homogeneous: the values in a stream are all of the same kind, and thus a stream of 'key1 value1 value2 key2 valueA valueB' is not what you'd want:
map.entrySet().stream()
.map(e -> String.format("%s-%s", e.getKey(), myPrint(e.getValue())))
.collect(Collectors.toList());
public static String myPrint(List<String> in) {
// write your own algorithm here; joining with ", " is only a placeholder
return String.join(", ", in);
}
Stream API just isn't the right tool to replace that myPrint method.
A third alternative is that you want to smear out the map; you want each string in a mapvalue's List<String> to first be matched with the key (so that's re-stating that key rather a lot), and then do something to that. NOW flatMap IS appropriate - you want a stream of k/v pairs first, and then do something to that, and each element is now of the same kind. You want to turn the map:
key1 = [value1, value2]
key2 = [value3, value4]
first into a stream:
key1:value1
key1:value2
key2:value3
key2:value4
and take it from there. This explodes a single k/v entry in your map into more than one, so flatMap is needed:
return map.entrySet().stream()
.flatMap(e -> e.getValue().stream()
.map(v -> String.format("%s-%s", e.getKey(), v)))
.collect(Collectors.toList());
Going inside-out, it maps a single entry within a list that belongs to a single k/v pair into the string Key-SingleItemFromItsList.
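To make the "smearing out" concrete, here is a small self-contained version run against the example map from above. The class name FlattenMapExample is made up, and a LinkedHashMap is used only so that the output order is predictable.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlattenMapExample {

    // Turns each (key, [v1, v2, ...]) entry into the strings "key-v1", "key-v2", ...
    static List<String> asList(Map<String, List<String>> map) {
        return map.entrySet().stream()
                .flatMap(e -> e.getValue().stream()
                        .map(v -> String.format("%s-%s", e.getKey(), v)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        map.put("key1", Arrays.asList("value1", "value2"));
        map.put("key2", Arrays.asList("value3", "value4"));
        System.out.println(asList(map));
        // prints [key1-value1, key1-value2, key2-value3, key2-value4]
    }
}
```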
Adding my two cents to the excellent answer by @rzwitserloot. flatMap and map are already explained in his answer.
List<String> resultLists = myMap.entrySet().stream()
.flatMap(mapEntry -> printEntries(mapEntry.getKey(),mapEntry.getValue())).collect(Collectors.toList());
System.out.println(resultLists);
Splitting this to a separate method gives good readability IMO,
private static Stream<String> printEntries(String key, List<String> values) {
return values.stream().map(val -> String.format("%s-%s",key,val));
}
I have a class with a collection of Seed elements. One of Seed's methods has the return type Optional<Pair<Boolean, Boolean>>.
I'm trying to loop over all seeds, keeping the return type (Optional<Pair<Boolean, Boolean>>), but I would like to be able to say whether there was at least one true value (in any of the Pairs) and override the result with it. Basically, if the collection is (skipping the Optional wrapper to make things simpler): [Pair<false, false>, Pair<false, true>, Pair<false, false>] I would like to return an Optional of Pair<false, true> because the second element had true. In the end, I'm interested in whether there was a true value and that's about it.
public Optional<Pair<Boolean, Boolean>> hadAnyExposure() {
return seeds.stream()
.map(Seed::hadExposure)
...
}
I was playing with reduce but couldn't come up with anything useful.
My question is related with Java streams directly. I can easily do this with a for loop, but I aimed initially for streams.
Straightforward
Since you're on Java 11, you can use Optional::stream (introduced in Java 9) to get rid of the Optional wrapper. As a terminal operation, reduce is your friend:
public Optional<Pair<Boolean, Boolean>> hadAnyExposure() {
// wherever the seeds come from
Stream<Optional<Pair<Boolean, Boolean>>> seeds = seeds();
return seeds
.flatMap(Optional::stream)
.reduce((pair1, pair2) -> new Pair<>(
pair1.left() || pair2.left(),
pair1.right() || pair2.right())
);
}
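For anyone who wants to try this out, here is a self-contained sketch using the data from the question. The minimal Pair class with left() and right() accessors and the anyExposure helper are assumptions for the example; your own Pair and Seed types will differ. It needs Java 9 or later because of Optional::stream.

```java
import java.util.List;
import java.util.Optional;

class ExposureReduceExample {

    // Hypothetical minimal Pair; substitute whatever pair type you already use.
    static final class Pair<L, R> {
        private final L left;
        private final R right;

        Pair(L left, R right) {
            this.left = left;
            this.right = right;
        }

        L left() {
            return left;
        }

        R right() {
            return right;
        }
    }

    // ORs both sides of all present pairs; the result is empty if no pair is present.
    static Optional<Pair<Boolean, Boolean>> anyExposure(
            List<Optional<Pair<Boolean, Boolean>>> results) {
        return results.stream()
                .flatMap(Optional::stream)
                .reduce((p1, p2) -> new Pair<>(
                        p1.left() || p2.left(),
                        p1.right() || p2.right()));
    }
}
```

For the input [Pair<false, false>, Pair<false, true>, Pair<false, false>] this yields an Optional containing Pair<false, true>. For an empty input, or when every element is an empty Optional, it yields Optional.empty().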
Extended
If you want to go a step further and give your Pair a general way to be folded with another Pair into a new instance, you can make the code a bit more expressive:
public class Pair<LEFT, RIGHT> {
private final LEFT left;
private final RIGHT right;
// constructor, equals, hashCode, toString, ...
public Pair<LEFT, RIGHT> fold(
Pair<LEFT, RIGHT> other,
BinaryOperator<LEFT> combineLeft,
BinaryOperator<RIGHT> combineRight) {
return new Pair<>(
combineLeft.apply(left, other.left),
combineRight.apply(right, other.right));
}
}
// now you can use fold and Boolean::logicalOr
// https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Boolean.html#logicalOr(boolean,boolean)
public Optional<Pair<Boolean, Boolean>> hadAnyExposure() {
Stream<Optional<Pair<Boolean, Boolean>>> seeds = seeds();
return seeds
.flatMap(Optional::stream)
.reduce((pair1, pair2) -> pair1
.fold(pair2, Boolean::logicalOr, Boolean::logicalOr));
}
I probably wouldn't create Pair::fold just for this use case, but I would be tempted. ;)
Your thoughts on reduce look like the right way to go, using || to reduce both sides of each Pair together. (Not exactly sure what your Optional semantics are, so going to filter out empty ones here and that might get what you want, but you may need to adjust):
Optional<Pair<Boolean, Boolean>> result = seeds.stream().map(Seed::hadExposure)
.filter(Optional::isPresent)
.map(Optional::get)
.reduce((a, b) -> new Pair<>(a.first || b.first, a.second || b.second));
As you've tagged this question with java-11, you can make use of the Optional.stream method:
public Optional<Pair<Boolean, Boolean>> hadAnyExposure() {
return Optional.of(
seeds.stream()
.flatMap(seed -> seed.hadExposure().stream())
.collect(
() -> new Pair<Boolean, Boolean>(false, false),
(p, seed) -> {
p.setLeft(p.getLeft() || seed.getLeft());
p.setRight(p.getRight() || seed.getRight());
},
(p1, p2) -> {
p1.setLeft(p1.getLeft() || p2.getLeft());
p1.setRight(p1.getRight() || p2.getRight());
}));
}
This first gets rid of the Optional by means of the Optional.stream method (keeping just the pairs) and then uses Stream.collect to mutably reduce the pairs by means of the OR associative operation.
Note: using Stream.reduce would also work, but it would create a lot of unnecessary intermediate pairs. That's why I've used Stream.collect instead.
Using Collectors.partitioningBy, you can get a Map with Boolean keys; after that, you can easily retrieve the values stored under the key true:
Optional<Pair<Boolean, Boolean>> collect = Arrays.asList(pair1, pair2, pair3).stream()
.filter(Optional::isPresent)
.map(Optional::get)
.collect(Collectors.collectingAndThen(Collectors.partitioningBy(p -> p.getFirst() == true || p.getSecond() == true),
m -> m.get(true).stream().findAny()));
Let's say I have some stream and want to collect to map like this
stream.collect(Collectors.toMap(this::func1, this::func2));
But I want to skip null keys/values. Of course, I can do like this
stream.filter(t -> func1(t) != null)
.filter(t -> func2(t) != null)
.collect(Collectors.toMap(this::func1, this::func2));
But is there a more beautiful/effective solution?
If you want to avoid evaluating the functions func1 and func2 twice, you have to store the results. E.g.
stream.map(t -> new AbstractMap.SimpleImmutableEntry<>(func1(t), func2(t)))
.filter(e -> e.getKey()!=null && e.getValue()!=null)
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
This doesn’t make the code shorter, and even the efficiency depends on the circumstances. This change pays off if the costs of evaluating func1 and func2 are high enough to compensate for the creation of temporary objects. In principle, the temporary object could get optimized away, but this isn’t guaranteed.
Starting with Java 9, you can replace new AbstractMap.SimpleImmutableEntry<>(…) with Map.entry(…). Since this entry type disallows null right from the start, it would need filtering before constructing the entry:
stream.flatMap(t -> {
Type1 value1 = func1(t);
Type2 value2 = func2(t);
return value1!=null && value2!=null? Stream.of(Map.entry(value1, value2)): null;
})
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
Alternatively, you may use a pair type of one of the libraries you’re already using (the Java API itself doesn’t offer such a type).
Another way to avoid evaluating the functions twice. Use a pair class of your choice. Not as concise as Holger's but it's a little less dense which can be easier to read.
stream.map(A::doFuncs)
.flatMap(Optional::stream)
.collect(Collectors.toMap(Pair::getKey, Pair::getValue));
private static Optional<Pair<Bar, Baz>> doFuncs(Foo foo)
{
final Bar bar = func1(foo);
final Baz baz = func2(foo);
if (bar == null || baz == null) return Optional.empty();
return Optional.of(new Pair<>(bar, baz));
}
(Choose proper names - I didn't know what types you were using)
One option is to do as in the other answers, i.e. use a Pair type, or an implementation of Map.Entry. Another approach used in functional programming would be to memoize the functions. According to Wikipedia:
memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
So you could do it by caching the results of the functions in maps:
public static <K, V> Function<K, V> memoize(Function<K, V> f) {
Map<K, V> map = new HashMap<>();
return k -> map.computeIfAbsent(k, f);
}
Then, use the memoized functions in the stream:
Function<E, K> memoizedFunc1 = memoize(this::func1);
Function<E, V> memoizedFunc2 = memoize(this::func2);
stream.filter(t -> memoizedFunc1.apply(t) != null)
.filter(t -> memoizedFunc2.apply(t) != null)
.collect(Collectors.toMap(memoizedFunc1, memoizedFunc2));
Here E stands for the type of the elements of the stream, K stands for the type returned by func1 (which is the type of the keys of the map) and V stands for the type returned by func2 (which is the type of the values of the map).
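To check that memoizing really avoids the second evaluation, here is a small self-contained experiment. The names MemoizeDemo, lengthOrNull and the calls counter are made up for the example, and lengthOrNull merely stands in for an expensive function that may return null.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class MemoizeDemo {

    // Counts how often the "expensive" function is really evaluated.
    static int calls = 0;

    static <K, V> Function<K, V> memoize(Function<K, V> f) {
        Map<K, V> cache = new HashMap<>();
        return k -> cache.computeIfAbsent(k, f);
    }

    // Hypothetical stand-in for func2: null for empty strings, otherwise the length.
    static Integer lengthOrNull(String s) {
        calls++;
        return s.isEmpty() ? null : s.length();
    }

    // Maps each word to its length, skipping words for which the function returns null.
    static Map<String, Integer> lengths(List<String> words) {
        Function<String, Integer> length = memoize(MemoizeDemo::lengthOrNull);
        return words.stream()
                .filter(w -> length.apply(w) != null)
                .collect(Collectors.toMap(Function.identity(), length));
    }
}
```

Two caveats with this sketch: computeIfAbsent does not store a mapping when the function returns null, so null results are recomputed if the same input shows up again, and a plain HashMap cache is not safe to use from a parallel stream.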
This is a naive solution, but does not call functions twice and does not create extra objects:
List<Integer> ints = Arrays.asList(1, null, 2, null, 3);
Map<Integer, Integer> res = ints.stream().collect(LinkedHashMap::new, (lhm, i) -> {
final Integer integer1 = func1(i);
final Integer integer2 = func2(i);
if(integer1 != null && integer2 != null) {
lhm.put(integer1, integer2);
}
}, Map::putAll);
You could create an isFunc1AndFunc2NotNull() method in the current class:
boolean isFunc1AndFunc2NotNull(Foo foo){
return func1(foo) != null && func2(foo) != null;
}
And change your stream as :
stream.filter(this::isFunc1AndFunc2NotNull)
.collect(Collectors.toMap(this::func1, this::func2));
Imagine that I have the following working lambda expression:
Map<Field, String> fields = Arrays.stream(resultClass.getDeclaredFields())
.filter(f -> f.isAnnotationPresent(Column.class))
.collect(toMap(f -> {
f.setAccessible(true);
return f;
}, f -> f.getAnnotation(Column.class).name()));
I would like to create a stream with 2 values before the filter statement. So I want to do a mapping but still keep the original value aside from it. I want to achieve something like this:
this.fields = Arrays.stream(resultClass.getDeclaredFields())
//map to <Field, Annotation> stream
.filter((f, a) -> a != null)
.collect(toMap(f -> {
f.setAccessible(true);
return f;
}, f -> a.name()));
Is this possible with Java 8 streams? I have looked at collect(groupingBy()) but still without success.
You need something like a Pair that holds two values. You can write your own, but here is some code that repurposes AbstractMap.SimpleEntry:
Map<Field, String> fields = Arrays.stream(resultClass.getDeclaredFields())
.map(f -> new AbstractMap.SimpleEntry<>(f, f.getAnnotation(Column.class)))
.filter(entry -> entry.getValue() != null)
.peek(entry -> entry.getKey().setAccessible(true))
.collect(toMap(Map.Entry::getKey, entry -> entry.getValue().name()));
You can do the entire operation in one go during the collect operation without the need of a pair type:
Map<Field, String> fields = Arrays.stream(resultClass.getDeclaredFields())
.collect(HashMap::new, (m,f) -> {
Column c=f.getAnnotation(Column.class);
if(c!=null) {
f.setAccessible(true);
m.put(f, c.name());
}
}, Map::putAll);
Still, to me it looks cleaner to separate the two operations which do not belong together:
Map<Field, String> fields = Arrays.stream(resultClass.getDeclaredFields())
.collect(HashMap::new, (m,f) -> {
Column c=f.getAnnotation(Column.class);
if(c!=null) m.put(f,c.name());
}, Map::putAll);
AccessibleObject.setAccessible(
fields.keySet().stream().toArray(AccessibleObject[]::new), true);
This solution does iterate twice over the fields having the annotation, but since this performs only one security check rather than one check per field, it might still outperform all other solutions.
Generally, you shouldn’t try to optimize unless there really is a performance problem and if you do it, you should measure, not guess about the costs of the operations. The results might be surprising and iterating multiple times over a data set is not necessarily bad.
@Peter Lawrey: I tried your suggestion with an intermediary map. It works now, but it is not really pretty.
this.fields = Arrays.stream(resultClass.getDeclaredFields())
.collect(HashMap<Field, Column>::new, (map, f) -> map.put(f, f.getAnnotation(Column.class)), HashMap::putAll)
.entrySet().stream()
.filter(entry -> entry.getValue() != null)
.peek(entry -> entry.getKey().setAccessible(true))
.collect(toMap(Map.Entry::getKey, entry -> entry.getValue().name()));
I have a Stream<String>, and I want a Map<Integer, String>. Let's call my classifier function getKey(String) - it can be expensive. Sometimes it returns zero, which means that the String should be discarded and not included in the resulting map.
So, I can use this code:
Stream<String> stringStream;
Map<Integer, String> result =
stringStream.collect(Collectors.groupingBy(this::getKey, Collectors.joining()));
result.remove(0);
This first adds the unwanted Strings to the Map keyed by zero, and then removes them. There may be a lot of them. Is there an elegant way to avoid adding them to the map in the first place?
I don't want to add a filter step before grouping, because that would mean executing the decision/classification code twice.
You said that calling getKey is expensive, but you could still map the elements of the stream up front before filtering them. The call to getKey will only be made once per element in this case.
Map<Integer, String> result =
stringStream.map(s -> new SimpleEntry<>(this.getKey(s), s))
.filter(e -> e.getKey() != 0)
.collect(groupingBy(Map.Entry::getKey, mapping(Map.Entry::getValue, joining())));
Note that there are no tuple classes in the standard API. You may roll your own or use AbstractMap.SimpleEntry as a substitute.
Alternatively, if you think the first version creates a lot of entries, you can use the collect method where you provide yourself the supplier, accumulator and combiner.
Map<Integer, String> result = stringStream
.collect(HashMap::new,
(m, e) -> {
Integer key = this.getKey(e);
if(key != 0) {
m.merge(key, e, String::concat);
}
},
Map::putAll);
You may use a stream of pairs like this:
stringStream.map(x -> new Pair<>(getKey(x), x))
.filter(pair -> pair.left != 0) // or whatever predicate
.collect(Collectors.groupingBy(pair -> pair.left,
Collectors.mapping(pair -> pair.right, Collectors.joining())));
This code assumes a simple Pair class with two fields, left and right.
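A minimal self-contained version of that idea might look as follows. The Pair class, the group helper and the getKey stand-in (the string's length, so the empty string gets key 0 and is discarded) are all assumptions for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupWithPairExample {

    // Hypothetical minimal pair type with two fields, left and right.
    static final class Pair<L, R> {
        final L left;
        final R right;

        Pair(L left, R right) {
            this.left = left;
            this.right = right;
        }
    }

    // Stand-in classifier: the string's length, where 0 means "discard".
    static int getKey(String s) {
        return s.length();
    }

    // Calls getKey once per string, drops key 0, and joins the strings sharing a key.
    static Map<Integer, String> group(List<String> strings) {
        return strings.stream()
                .map(x -> new Pair<>(getKey(x), x))
                .filter(pair -> pair.left != 0)
                .collect(Collectors.groupingBy(pair -> pair.left,
                        Collectors.mapping(pair -> pair.right, Collectors.joining())));
    }
}
```

With the input ["ab", "", "cd", "xyz"], the result maps 2 to "abcd" and 3 to "xyz"; the empty string never reaches the map.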
Some third-party libraries like my StreamEx provide additional methods to remove the boilerplate:
StreamEx.of(stringStream)
.mapToEntry(this::getKey, x -> x)
.filterKeys(key -> key != 0) // or whatever
.grouping(Collectors.joining());