Collect to map skipping null key/values

Let's say I have some stream and want to collect to map like this
stream.collect(Collectors.toMap(this::func1, this::func2));
But I want to skip null keys/values. Of course, I can do like this
stream.filter(t -> func1(t) != null)
.filter(t -> func2(t) != null)
.collect(Collectors.toMap(this::func1, this::func2));
But is there more beautiful/effective solution?

If you want to avoid evaluating the functions func1 and func2 twice, you have to store the results. E.g.
stream.map(t -> new AbstractMap.SimpleImmutableEntry<>(func1(t), func2(t)))
.filter(e -> e.getKey()!=null && e.getValue()!=null)
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
This doesn’t make the code shorter, and whether it is more efficient depends on the circumstances. The change pays off if evaluating func1 and func2 is expensive enough to outweigh the cost of creating the temporary objects. In principle, the temporary objects could get optimized away, but this isn’t guaranteed.
Starting with Java 9, you can replace new AbstractMap.SimpleImmutableEntry<>(…) with Map.entry(…). Since this entry type disallows null right from the start, it would need filtering before constructing the entry:
stream.flatMap(t -> {
Type1 value1 = func1(t);
Type2 value2 = func2(t);
return value1!=null && value2!=null? Stream.of(Map.entry(value1, value2)): null;
})
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
Alternatively, you may use a pair type of one of the libraries you’re already using (the Java API itself doesn’t offer such a type).
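As a minimal, self-contained sketch of the entry-based variant: func1 and func2 below are hypothetical stand-ins (the question doesn't show the real ones), and the class name is made up for the example.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NullSkippingToMap {
    // Hypothetical key function: returns null for an empty string.
    static String func1(String s) {
        return s.isEmpty() ? null : s;
    }

    // Hypothetical value function: returns null for strings longer than 3 characters.
    static Integer func2(String s) {
        return s.length() > 3 ? null : s.length();
    }

    static Map<String, Integer> collect(Stream<String> stream) {
        return stream
            // Evaluate both functions exactly once per element and keep the results.
            .map(t -> new AbstractMap.SimpleImmutableEntry<>(func1(t), func2(t)))
            // Drop every pair in which either side is null.
            .filter(e -> e.getKey() != null && e.getValue() != null)
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        System.out.println(collect(Stream.of("a", "", "bb", "long")));
    }
}
```

Here the empty string is dropped because its key is null, and "long" is dropped because its value is null.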

Another way to avoid evaluating the functions twice. Use a pair class of your choice. Not as concise as Holger's but it's a little less dense which can be easier to read.
stream.map(A::doFuncs)
.flatMap(Optional::stream)
.collect(Collectors.toMap(Pair::getKey, Pair::getValue));
private static Optional<Pair<Bar, Baz>> doFuncs(Foo foo)
{
final Bar bar = func1(foo);
final Baz baz = func2(foo);
if (bar == null || baz == null) return Optional.empty();
return Optional.of(new Pair<>(bar, baz));
}
(Choose proper names - I didn't know what types you were using)

One option is to do as in the other answers, i.e. use a Pair type, or an implementation of Map.Entry. Another approach used in functional programming would be to memoize the functions. According to Wikipedia:
memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
So you could do it by caching the results of the functions in maps:
public static <K, V> Function<K, V> memoize(Function<K, V> f) {
Map<K, V> map = new HashMap<>();
return k -> map.computeIfAbsent(k, f);
}
Then, use the memoized functions in the stream:
Function<E, K> memoizedFunc1 = memoize(this::func1);
Function<E, V> memoizedFunc2 = memoize(this::func2);
stream.filter(t -> memoizedFunc1.apply(t) != null)
.filter(t -> memoizedFunc2.apply(t) != null)
.collect(Collectors.toMap(memoizedFunc1, memoizedFunc2));
Here E stands for the type of the elements of the stream, K stands for the type returned by func1 (which is the type of the keys of the map) and V stands for the type returned by func2 (which is the type of the values of the map).
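To see the effect, here is a small sketch that counts how often the key function actually runs. The key function, the counter and the class name are assumptions made for the demo. Two things to keep in mind: computeIfAbsent records no mapping when the function returns null, and a plain HashMap is not thread-safe, so this is meant for sequential streams only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MemoizeDemo {
    // Counts how often the (hypothetical) key function really runs.
    static final AtomicInteger CALLS = new AtomicInteger();

    static <K, V> Function<K, V> memoize(Function<K, V> f) {
        Map<K, V> cache = new HashMap<>();
        // computeIfAbsent stores nothing when f returns null.
        return k -> cache.computeIfAbsent(k, f);
    }

    // Hypothetical key function: null for strings starting with '#'.
    static String key(String s) {
        CALLS.incrementAndGet();
        return s.startsWith("#") ? null : s.toUpperCase();
    }

    static Map<String, Integer> collect(Stream<String> stream) {
        Function<String, String> memoKey = memoize(MemoizeDemo::key);
        return stream
            // First (and only) real evaluation of key for each element.
            .filter(t -> memoKey.apply(t) != null)
            // The key mapper is served from the cache here.
            .collect(Collectors.toMap(memoKey, String::length));
    }
}
```

Elements whose key is null are evaluated once in the filter and then dropped, so they never reach the collector and are not evaluated a second time.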

This is a naive solution, but does not call functions twice and does not create extra objects:
List<Integer> ints = Arrays.asList(1, null, 2, null, 3);
Map<Integer, Integer> res = ints.stream().collect(LinkedHashMap::new, (lhm, i) -> {
final Integer integer1 = func1(i);
final Integer integer2 = func2(i);
if(integer1 != null && integer2 != null) {
lhm.put(integer1, integer2);
}
}, Map::putAll); // the combiner is only used for parallel streams, but it must not be a no-op

You could create an isFunc1AndFunc2NotNull() method in the current class:
boolean isFunc1AndFunc2NotNull(Foo foo){
return func1(foo) != null && func2(foo) != null;
}
And change your stream to:
stream.filter(this::isFunc1AndFunc2NotNull)
.collect(Collectors.toMap(this::func1, this::func2));

Related

Accumulate values and return result in Java stream

I have a class with a collection of Seed elements. One of the method's return type of Seed is Optional<Pair<Boolean, Boolean>>.
I'm trying to loop over all seeds, keeping the return type (Optional<Pair<Boolean, Boolean>>), but I would like to be able to say if there was at least true value (in any of the Pairs) and override the result with it. Basically, if the collection is (skipping the Optional wrapper to make things simpler): [Pair<false, false>, Pair<false, true>, Pair<false, false>] I would like to return and Optional of Pair<false, true> because the second element had true. In the end, I'm interested if there was a true value and that's about it.
public Optional<Pair<Boolean, Boolean>> hadAnyExposure() {
return seeds.stream()
.map(Seed::hadExposure)
...
}
I was playing with reduce but couldn't come up with anything useful.
My question is related with Java streams directly. I can easily do this with a for loop, but I aimed initially for streams.

Straightforward
Since you're on Java 11, you can use Optional::stream (introduced in Java 9) to get rid of the Optional wrapper. As a terminal operation, reduce is your friend:
public Optional<Pair<Boolean, Boolean>> hadAnyExposure() {
// wherever the seeds come from
Stream<Optional<Pair<Boolean, Boolean>>> seeds = seeds();
return seeds
.flatMap(Optional::stream)
.reduce((pair1, pair2) -> new Pair<>(
pair1.left() || pair2.left(),
pair1.right() || pair2.right())
);
}
Extended
If you want to go a step further and give your Pair a general way to be folded with another Pair into a new instance, you can make the code a bit more expressive:
public class Pair<LEFT, RIGHT> {
private final LEFT left;
private final RIGHT right;
// constructor, equals, hashCode, toString, ...
public Pair<LEFT, RIGHT> fold(
Pair<LEFT, RIGHT> other,
BinaryOperator<LEFT> combineLeft,
BinaryOperator<RIGHT> combineRight) {
return new Pair<>(
combineLeft.apply(left, other.left),
combineRight.apply(right, other.right));
}
}
// now you can use fold and Boolean::logicalOr
// https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Boolean.html#logicalOr(boolean,boolean)
public Optional<Pair<Boolean, Boolean>> hadAnyExposure() {
Stream<Optional<Pair<Boolean, Boolean>>> seeds = seeds();
return seeds
.flatMap(Optional::stream)
.reduce((pair1, pair2) -> pair1
.fold(pair2, Boolean::logicalOr, Boolean::logicalOr)
);
}
I probably wouldn't create Pair::fold just for this use case, but I would be tempted. ;)

Your thoughts on reduce look like the right way to go, using || to reduce both sides of each Pair together. (Not exactly sure what your Optional semantics are, so going to filter out empty ones here and that might get what you want, but you may need to adjust):
Optional<Pair<Boolean, Boolean>> result = seeds.stream().map(Seed::hadExposure)
.filter(Optional::isPresent)
.map(Optional::get)
.reduce((a, b) -> new Pair<>(a.first || b.first, a.second || b.second));
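For reference, a self-contained sketch of this reduction. The Pair class here is a minimal stand-in with public fields (the question's actual Pair type isn't shown), and the method takes the Optional results directly instead of Seed objects.

```java
import java.util.List;
import java.util.Optional;

public class ExposureReduce {
    // Minimal stand-in for the Pair type used in the question.
    public static final class Pair {
        public final boolean first;
        public final boolean second;

        public Pair(boolean first, boolean second) {
            this.first = first;
            this.second = second;
        }
    }

    public static Optional<Pair> anyTrue(List<Optional<Pair>> results) {
        return results.stream()
            // Skip empty results (works on Java 8, no Optional::stream needed).
            .filter(Optional::isPresent)
            .map(Optional::get)
            // OR is associative, so it is a valid reduction operator.
            .reduce((a, b) -> new Pair(a.first || b.first, a.second || b.second));
    }
}
```

If every result is empty, the returned Optional is empty as well, because reduce without an identity value returns Optional.empty() for an empty stream.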

As you've tagged this question with java-11, you can make use of the Optional.stream method:
public Optional<Pair<Boolean, Boolean>> hadAnyExposure() {
return Optional.of(
seeds.stream()
.flatMap(seed -> seed.hadExposure().stream())
.collect(
() -> new Pair<Boolean, Boolean>(false, false),
(p, seed) -> {
p.setLeft(p.getLeft() || seed.getLeft());
p.setRight(p.getRight() || seed.getRight());
},
(p1, p2) -> {
p1.setLeft(p1.getLeft() || p2.getLeft());
p1.setRight(p1.getRight() || p2.getRight());
}));
}
This first gets rid of the Optional by means of the Optional.stream method (keeping just the pairs) and then uses Stream.collect to mutably reduce the pairs by means of the OR associative operation.
Note: using Stream.reduce would also work, but it would create a lot of unnecessary intermediate pairs. That's why I've used Stream.collect instead.

Using Collectors.partitioningBy, you get a Map with Boolean keys, from which you can retrieve the pairs stored under the key true:
Optional<Pair<Boolean, Boolean>> collect = Arrays.asList(pair1, pair2, pair3).stream()
.filter(Optional::isPresent)
.map(Optional::get)
.collect(Collectors.collectingAndThen(Collectors.partitioningBy(p -> p.getFirst() == true || p.getSecond() == true),
m -> m.get(true).stream().findAny()));

Using Java 8 stream methods to get the last max value

Given a list of items with properties, I am trying to get the last item to appear with a maximum value of said property.
For example, for the following list of objects:
t i
A: 3
D: 7 *
F: 4
C: 5
X: 7 *
M: 6
I can get one of the Things with the highest i:
Thing t = items.stream()
.max(Comparator.comparingLong(Thing::getI))
.orElse(null);
However, this will get me Thing t = D. Is there a clean and elegant way of getting the last item, i.e. X in this case?
One possible solution is using the reduce function. However, the property is calculated on the fly and it would look more like:
Thing t = items.stream()
.reduce((left, right) -> {
long leftValue = valueFunction.apply(left);
long rightValue = valueFunction.apply(right);
return leftValue > rightValue ? left : right;
})
.orElse(null);
The valueFunction now needs to be called nearly twice as often.
Other obvious roundabout solutions are:
Store the object in a Tuple with its index
Store the object in a Tuple with its computed value
Reverse the list beforehand
Don't use Streams

Remove the equality case from the comparator, i.e. write your own comparator that returns -1 instead of 0 when the compared values are equal:
Thing t = items.stream()
.max((a, b) -> a.getI() > b.getI() ? 1 : -1)
.orElse(null);

Conceptually, you seem to be possibly looking for something like thenComparing using the index of the elements in the list:
Thing t = items.stream()
.max(Comparator.comparingLong(Thing::getI).thenComparing(items::indexOf))
.orElse(null);

To avoid the multiple applications of valueFunction in your reduce solution, simply explicitly calculate the result and put it in a tuple:
Item lastMax = items.stream()
.map(item -> new AbstractMap.SimpleEntry<Item, Long>(item, valueFunction.apply(item)))
.reduce((l, r) -> l.getValue() > r.getValue() ? l : r )
.map(Map.Entry::getKey)
.orElse(null);
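A runnable sketch of the same idea as a generic helper. The class and method names are made up for the example, and it takes a ToLongFunction in place of the question's valueFunction.

```java
import java.util.AbstractMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

public class LastMax {
    public static <T> T lastMax(List<T> items, ToLongFunction<T> valueFunction) {
        return items.stream()
            // Compute the value once per element and carry it along with the element.
            .map(item -> new AbstractMap.SimpleEntry<T, Long>(item, valueFunction.applyAsLong(item)))
            // On a tie the right-hand (later) element wins, which yields the last maximum.
            .reduce((l, r) -> l.getValue() > r.getValue() ? l : r)
            .map(Map.Entry::getKey)
            .orElse(null);
    }
}
```

With the list from the question (A: 3, D: 7, F: 4, C: 5, X: 7, M: 6), this returns X and calls the value function once per element.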

Streams are not necessarily bad if you do things in two steps:
1) Find the maximum i value in the list (as you did)
2) Search for the last element with this i value, starting from the end of items:
Thing t =
items.stream()
.max(Comparator.comparingLong(Thing::getI))
.map(firstMaxThing ->
IntStream.rangeClosed(1, items.size())
.mapToObj(i -> items.get(items.size() - i))
.filter(item -> item.getI() == firstMaxThing.getI())
.findFirst().get()
// here get() cannot fail as max() returned something.
)
.orElse(null);

The valueFunction now needs to be called nearly twice as often.
Note that even when using max, the getI method will be called again and again for every comparison, not just once per element. In your example, it's called 11 times, including 6 times for D, and for longer lists, too, it seems to be called on average twice per element.
How about you just cache the calculated value directly in the Thing instance? If this is not possible, you could use an external Map and use computeIfAbsent to calculate the value only once for each Thing and then use your approach using reduce.
Map<Thing, Long> cache = new HashMap<>();
Thing x = items.stream()
.reduce((left, right) -> {
long leftValue = cache.computeIfAbsent(left, Thing::getI);
long rightValue = cache.computeIfAbsent(right, Thing::getI);
return leftValue > rightValue ? left : right;
})
.orElse(null);
Or a bit cleaner, calculating all the values beforehand:
Map<Thing, Long> cache = items.stream()
.collect(Collectors.toMap(x -> x, Thing::getI));
Thing x = items.stream()
.reduce((left, right) -> cache.get(left) > cache.get(right) ? left : right)
.orElse(null);

You can still use the reduction to get this done. It keeps t1 only if t1 is strictly larger. In all other cases, i.e. when t2 is larger or t1 and t2 are equal, it keeps t2, which matches your requirement.
Thing t = items.stream().
reduce((t1, t2) -> t1.getI() > t2.getI() ? t1 : t2)
.orElse(null);

Your current implementation using reduce looks good, unless your value-extractor function is costly.
Considering later you may want to reuse the logic for different object types & fields, I would extract the logic itself in separate generic method/methods:
public static <T, K, V> Function<T, Map.Entry<K, V>> toEntry(Function<T, K> keyFunc, Function<T, V> valueFunc){
return t -> new AbstractMap.SimpleEntry<>(keyFunc.apply(t), valueFunc.apply(t));
}
public static <ITEM, FIELD extends Comparable<FIELD>> Optional<ITEM> maxBy(Function<ITEM, FIELD> extractor, Collection<ITEM> items) {
return items.stream()
.map(toEntry(identity(), extractor))
.max(comparing(Map.Entry::getValue))
.map(Map.Entry::getKey);
}
The code snippets above can be used like this:
Thing maxThing = maxBy(Thing::getField, things).orElse(null);
AnotherThing maxAnotherThing = maxBy(AnotherThing::getAnotherField, anotherThings).orElse(null);

Transform list to mapping using java streams

I have the following pattern repeated throughout my code:
class X<T, V>
{
V doTransform(T t) {
return null; // dummy implementation
}
Map<T, V> transform(List<T> item) {
return item.stream().map(x->new AbstractMap.SimpleEntry<>(x, doTransform(x))).collect(toMap(x->x.getKey(), x->x.getValue()));
}
}
Requiring the use of AbstractMap.SimpleEntry is messy and clunky. LINQ's use of anonymous types is more elegant.
Is there a simpler way to achieve this using streams?
Thx in advance.

You can call doTransform in the value mapper:
Map<T, V> transform(List<T> item) {
return item.stream().collect(toMap(x -> x, x -> doTransform(x)));
}

Unfortunately, Java doesn't have an exact equivalent of C#'s anonymous types.
In this specific case, you don't need the intermediate map operation, as @Jorn Vernee has suggested. Instead, you can perform the key and value extraction in the toMap collector.
However, for cases where you think you need something like C#'s anonymous types, you may consider:
anonymous objects (may not always be what you want depending on your use case)
Arrays.asList(...), List.of(...) (may not always be what you want depending on your use case)
an array (may not always be what you want depending on your use case)
Ultimately, if you really need to map to something that can contain two different types of elements, then I'd stick with AbstractMap.SimpleEntry.
That said, your current example can be simplified to:
Map<T, V> transform(List<T> items) {
return items.stream().collect(toMap(Function.identity(),this::doTransform));
}
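A self-contained sketch of this simplified form, with a hypothetical doTransform (string length) standing in for the real transformation:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TransformToMap {
    // Hypothetical transformation; stands in for doTransform from the question.
    static Integer doTransform(String s) {
        return s.length();
    }

    static Map<String, Integer> transform(List<String> items) {
        // The two-argument toMap throws IllegalStateException on duplicate keys
        // and NullPointerException when the value mapper returns null.
        return items.stream()
            .collect(Collectors.toMap(Function.identity(), TransformToMap::doTransform));
    }
}
```

If the input list can contain duplicates, the three-argument toMap overload with a merge function is the usual way to decide which value wins.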

In this specific example, there is no need to do the intermediate storage at all:
Map<T, V> transform(List<T> item) {
return item.stream().collect(toMap(x -> x, x -> doTransform(x)));
}
But if you need it, Java 9 offers a simpler factory method,
Map<T, V> transform(List<T> item) {
return item.stream()
.map(x -> Map.entry(x, doTransform(x)))
.collect(toMap(x -> x.getKey(), x -> x.getValue()));
}
as long as you don’t have to deal with null.
You can use an anonymous inner class here,
Map<T, V> transform(List<T> item) {
return item.stream()
.map(x -> new Object(){ T t = x; V v = doTransform(x); })
.collect(toMap(x -> x.t, x -> x.v));
}
but it’s less efficient. It’s an inner class which captures a reference to the surrounding this, also it captures x, so you have two fields, t and the synthetic one for capturing x, for the same thing.
The latter could be circumvented by using a method, e.g.
Map<T, V> transform(List<T> item) {
return item.stream()
.map(x -> new Object(){ T getKey() { return x; } V v = doTransform(x); })
.collect(toMap(x -> x.getKey(), x -> x.v));
}
But it doesn’t add to readability.
The only true anonymous types are the types generated for lambda expressions, which could be used to store information via higher order functions:
Map<T, V> transform(List<T> item) {
return item.stream()
.map(x -> capture(x, doTransform(x)))
.collect(HashMap::new, (m,f) -> f.accept(m::put), HashMap::putAll);
}
public static <A,B> Consumer<BiConsumer<A,B>> capture(A a, B b) {
return f -> f.accept(a, b);
}
but you’d soon hit the limitations of Java’s type system (it still isn’t a functional programming language) if you try this with more complex scenarios.

How to use if-else logic in Java 8 stream forEach

What I want to do is shown below in 2 stream calls. I want to split a collection into 2 new collections based on some condition. Ideally I want to do it in 1. I've seen conditions used for the .map function of streams, but couldn't find anything for the forEach. What is the best way to achieve what I want?
animalMap.entrySet().stream()
.filter(pair-> pair.getValue() != null)
.forEach(pair-> myMap.put(pair.getKey(), pair.getValue()));
animalMap.entrySet().stream()
.filter(pair-> pair.getValue() == null)
.forEach(pair-> myList.add(pair.getKey()));

Just put the condition into the lambda itself, e.g.
animalMap.entrySet().stream()
.forEach(
pair -> {
if (pair.getValue() != null) {
myMap.put(pair.getKey(), pair.getValue());
} else {
myList.add(pair.getKey());
}
}
);
Of course, this assumes that both collections (myMap and myList) are declared and initialized prior to the above piece of code.
Update: using Map.forEach makes the code shorter, plus more efficient and readable, as Jorn Vernee kindly suggested:
animalMap.forEach(
(key, value) -> {
if (value != null) {
myMap.put(key, value);
} else {
myList.add(key);
}
}
);

In most cases, when you find yourself using forEach on a Stream, you should rethink whether you are using the right tool for your job or whether you are using it the right way.
Generally, you should look for an appropriate terminal operation doing what you want to achieve or for an appropriate Collector. Now, there are Collectors for producing Maps and Lists, but no out-of-the-box collector for combining two different collectors based on a predicate.
Now, this answer contains a collector for combining two collectors. Using this collector, you can achieve the task as
Pair<Map<KeyType, Animal>, List<KeyType>> pair = animalMap.entrySet().stream()
.collect(conditional(entry -> entry.getValue() != null,
Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue),
Collectors.mapping(Map.Entry::getKey, Collectors.toList()) ));
Map<KeyType,Animal> myMap = pair.a;
List<KeyType> myList = pair.b;
But maybe you can solve this specific task in a simpler way. One of your results matches the input type; it’s the same map, just stripped of the entries which map to null. If your original map is mutable and you don’t need it afterwards, you can just collect the list and remove these keys from the original map, as they are mutually exclusive:
List<KeyType> myList=animalMap.entrySet().stream()
.filter(pair -> pair.getValue() == null)
.map(Map.Entry::getKey)
.collect(Collectors.toList());
animalMap.keySet().removeAll(myList);
Note that you can remove mappings to null even without having the list of the other keys:
animalMap.values().removeIf(Objects::isNull);
or
animalMap.values().removeAll(Collections.singleton(null));
If you can’t (or don’t want to) modify the original map, there is still a solution without a custom collector. As hinted in Alexis C.’s answer, partitioningBy is going in the right direction, but you may simplify it:
Map<Boolean,Map<KeyType,Animal>> tmp = animalMap.entrySet().stream()
.collect(Collectors.partitioningBy(pair -> pair.getValue() != null,
Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)));
Map<KeyType,Animal> myMap = tmp.get(true);
List<KeyType> myList = new ArrayList<>(tmp.get(false).keySet());
The bottom line is, don’t forget about ordinary Collection operations, you don’t have to do everything with the new Stream API.

The problem with using stream().forEach(..) with a call to add or put inside the forEach (so that you mutate the external myMap or myList instance) is that you can easily run into concurrency issues if someone makes the stream parallel and the collection you are modifying is not thread-safe.
One approach you can take is to first partition the entries in the original map. Once you have that, grab the corresponding list of entries and collect them in the appropriate map and list.
Map<Boolean, List<Map.Entry<K, V>>> partitions =
animalMap.entrySet()
.stream()
.collect(partitioningBy(e -> e.getValue() == null));
Map<K, V> myMap =
partitions.get(false)
.stream()
.collect(toMap(Map.Entry::getKey, Map.Entry::getValue));
List<K> myList =
partitions.get(true)
.stream()
.map(Map.Entry::getKey)
.collect(toList());
... or if you want to do it in one pass, implement a custom collector (assuming a Tuple2<E1, E2> class exists, you can create your own), e.g:
public static <K,V> Collector<Map.Entry<K, V>, ?, Tuple2<Map<K, V>, List<K>>> customCollector() {
return Collector.of(
() -> new Tuple2<>(new HashMap<>(), new ArrayList<>()),
(pair, entry) -> {
if(entry.getValue() == null) {
pair._2.add(entry.getKey());
} else {
pair._1.put(entry.getKey(), entry.getValue());
}
},
(p1, p2) -> {
p1._1.putAll(p2._1);
p1._2.addAll(p2._2);
return p1;
});
}
with its usage:
Tuple2<Map<K, V>, List<K>> pair =
animalMap.entrySet().parallelStream().collect(customCollector());
You can tune it more if you want, for example by providing a predicate as parameter.
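For completeness, a runnable sketch of the two-step partitioning approach from the beginning of this answer. The Split holder class is a hypothetical stand-in for a tuple type. Partitioning into lists of entries first matters here: Collectors.toMap rejects null values, so it must only ever see the entries whose value is non-null.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SplitByNullValue {
    // Hypothetical result holder standing in for a tuple type.
    public static final class Split<K, V> {
        public final Map<K, V> present;   // entries with a non-null value
        public final List<K> missing;     // keys whose value was null

        Split(Map<K, V> present, List<K> missing) {
            this.present = present;
            this.missing = missing;
        }
    }

    public static <K, V> Split<K, V> split(Map<K, V> source) {
        // true -> entries whose value is null, false -> all other entries.
        Map<Boolean, List<Map.Entry<K, V>>> parts = source.entrySet().stream()
            .collect(Collectors.partitioningBy(e -> e.getValue() == null));

        Map<K, V> present = parts.get(false).stream()
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

        List<K> missing = parts.get(true).stream()
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());

        return new Split<>(present, missing);
    }
}
```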

I think it's possible in Java 9:
animalMap.entrySet().stream()
.forEach(
pair -> Optional.ofNullable(pair.getValue())
.ifPresentOrElse(v -> myMap.put(pair.getKey(), v), () -> myList.add(pair.getKey())))
);
Need the ifPresentOrElse for it to work though. (I think a for loop looks better.)

groupingBy and filter in one step

I have a Stream<String>, and I want a Map<Integer, String>. Let's call my classifier function getKey(String) - it can be expensive. Sometimes it returns zero, which means that the String should be discarded and not included in the resulting map.
So, I can use this code:
Stream<String> stringStream;
Map<Integer, String> result =
stringStream.collect(Collectors.groupingBy(this::getKey, Collectors.joining()));
result.remove(0);
This first adds the unwanted Strings to the Map keyed by zero, and then removes them. There may be a lot of them. Is there an elegant way to avoid adding them to the map in the first place?
I don't want to add a filter step before grouping, because that would mean executing the decision/classification code twice.

You said that calling getKey is expensive, but you could still map the elements of the stream up-front before filtering them. The call to getKey will be only done once in this case.
Map<Integer, String> result =
stringStream.map(s -> new SimpleEntry<>(this.getKey(s), s))
.filter(e -> e.getKey() != 0)
.collect(groupingBy(Map.Entry::getKey, mapping(Map.Entry::getValue, joining())));
Note that there are no tuple classes in the standard API. You may roll your own or use AbstractMap.SimpleEntry as a substitute.
Alternatively, if you think the first version creates a lot of entries, you can use the collect method where you provide yourself the supplier, accumulator and combiner.
Map<Integer, String> result = stringStream
.collect(HashMap::new,
(m, e) -> {
Integer key = this.getKey(e);
if(key != 0) {
m.merge(key, e, String::concat);
}
},
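(m1, m2) -> m2.forEach((k, v) -> m1.merge(k, v, String::concat))); // merge instead of putAll, so a parallel run doesn't overwrite values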
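Here is a self-contained sketch of the three-argument collect variant that also counts classifier calls. getKey is a hypothetical classifier (the string length, with 0 meaning "discard"), and the class name and counter exist only for the demo. The combiner merges by concatenation, so partial results of a parallel run are not overwritten.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class GroupAndFilter {
    // Counts how often the (hypothetical) classifier runs.
    static final AtomicInteger CALLS = new AtomicInteger();

    // Hypothetical classifier: the string length, where 0 means "discard".
    static int getKey(String s) {
        CALLS.incrementAndGet();
        return s.length();
    }

    static Map<Integer, String> group(Stream<String> strings) {
        return strings.collect(
            HashMap::new,
            (map, s) -> {
                int key = getKey(s); // classify exactly once per element
                if (key != 0) {
                    map.merge(key, s, String::concat);
                }
            },
            // Combine partial maps by concatenating values that share a key.
            (left, right) -> right.forEach((k, v) -> left.merge(k, v, String::concat)));
    }
}
```

Discarded strings never enter the map, and the classifier runs once per element.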

You may use a stream of pairs like this:
stringStream.map(x -> new Pair(getKey(x), x))
.filter(pair -> pair.left != 0) // or whatever predicate
.collect(Collectors.groupingBy(pair -> pair.left,
Collectors.mapping(pair -> pair.right, Collectors.joining())));
This code assumes simple Pair class with two fields left and right.
Some third-party libraries like my StreamEx provide additional methods to remove the boilerplate:
StreamEx.of(stringStream)
.mapToEntry(this::getKey, x -> x)
.filterKeys(key -> key != 0) // or whatever
.grouping(Collectors.joining());
