Java 8 streams: intermediate map/collect to a stream with 2 values

Imagine that I have the following working lambda expression:
Map<Field, String> fields = Arrays.stream(resultClass.getDeclaredFields())
        .filter(f -> f.isAnnotationPresent(Column.class))
        .collect(toMap(f -> {
            f.setAccessible(true);
            return f;
        }, f -> f.getAnnotation(Column.class).name()));
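(For context: Column here is an annotation with runtime retention and a name() attribute, as in JPA's javax.persistence.Column. A minimal stand-in could look like this hypothetical sketch, covering only what the snippets need:)
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for the Column annotation used in this question.
// Retention must be RUNTIME, or getAnnotation/isAnnotationPresent won't see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column {
    String name();
}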
I would like to create a stream with 2 values before the filter statement. So I want to do a mapping but still keep the original value alongside the mapped one. I want to achieve something like this:
this.fields = Arrays.stream(resultClass.getDeclaredFields())
        // map to a <Field, Annotation> stream
        .filter((f, a) -> a != null)
        .collect(toMap(f -> {
            f.setAccessible(true);
            return f;
        }, f -> a.name()));
Is this possible with Java 8 streams? I have looked at collect(groupingBy()) but still without success.

You need something like a Pair that holds two values. You can write your own, but here is some code that repurposes AbstractMap.SimpleEntry:
Map<Field, String> fields = Arrays.stream(resultClass.getDeclaredFields())
        .map(f -> new AbstractMap.SimpleEntry<>(f, f.getAnnotation(Column.class)))
        .filter(entry -> entry.getValue() != null)
        .peek(entry -> entry.getKey().setAccessible(true))
        .collect(toMap(Map.Entry::getKey, entry -> entry.getValue().name()));
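If you would rather write your own holder, it is tiny; a minimal sketch (the Pair name and field names are my own):
// Minimal generic pair (hypothetical; any two-field holder will do).
final class Pair<A, B> {
    final A first;
    final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}
The pipeline then reads .map(f -> new Pair<>(f, f.getAnnotation(Column.class))).filter(p -> p.second != null) and so on.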

You can do the entire operation in one go during the collect operation, without needing a pair type:
Map<Field, String> fields = Arrays.stream(resultClass.getDeclaredFields())
        .collect(HashMap::new, (m, f) -> {
            Column c = f.getAnnotation(Column.class);
            if (c != null) {
                f.setAccessible(true);
                m.put(f, c.name());
            }
        }, Map::putAll);
Still, to me it looks cleaner to separate the two operations, which do not belong together:
Map<Field, String> fields = Arrays.stream(resultClass.getDeclaredFields())
        .collect(HashMap::new, (m, f) -> {
            Column c = f.getAnnotation(Column.class);
            if (c != null) m.put(f, c.name());
        }, Map::putAll);
AccessibleObject.setAccessible(
        fields.keySet().stream().toArray(AccessibleObject[]::new), true);
This solution does iterate twice over the fields having the annotation, but since it performs only one security check rather than one check per field, it might still outperform all other solutions.
Generally, you shouldn't try to optimize unless there really is a performance problem, and when you do, you should measure rather than guess about the costs of the operations. The results can be surprising, and iterating multiple times over a data set is not necessarily bad.

@Peter Lawrey: I tried your suggestion with an intermediate map. It works now, but it is not really pretty.
this.fields = Arrays.stream(resultClass.getDeclaredFields())
        .collect(HashMap<Field, Column>::new,
                (map, f) -> map.put(f, f.getAnnotation(Column.class)),
                HashMap::putAll)
        .entrySet().stream()
        .filter(entry -> entry.getValue() != null)
        .peek(entry -> entry.getKey().setAccessible(true))
        .collect(toMap(Map.Entry::getKey, entry -> entry.getValue().name()));

Related

After flattening a Java 8 multilevel map my expected result is empty

I have the below multilevel map:
Map<String, List<Map<String, Map<String, Map<String, Map<String, String>>>>>> input =
        ImmutableMap.of("A",
                ImmutableList.of(ImmutableMap.of("2",
                        ImmutableMap.of("3",
                                ImmutableMap.of("4",
                                        ImmutableMap.of("5", "a"))))));
In short it'll be like
{
"A":[{"2":{"3":{"4":{"5":"a"}}}}],
"B":[{"2":{"3":{"4":{"5":"b"}}}}]
}
My requirement is to construct a map of the form
{
"A":"a",
"B":"b"
}
I tried the below code but for some reason myMap is always empty even though I'm populating it. What am I missing?
Map<String, String> myMap = new HashMap<>();
input.entrySet()
        .stream()
        .map(l -> l.getValue().stream().map(m -> m.get(m.keySet().toArray()[0]))
                .map(n -> n.get(n.keySet().toArray()[0]))
                .map(o -> o.get(o.keySet().toArray()[0]))
                .map(p -> myMap.put(l.getKey(), p.get(p.keySet().toArray()[0]))))
        .collect(Collectors.toList());
System.out.println(myMap);
Here's what I get when I add two peek calls to your pipeline:
input.entrySet().stream()
        .peek(System.out::println) // <- this
        .map(l -> ...)
        .peek(System.out::println) // <- and this
        .collect(Collectors.toList());
output:
A=[{2={3={4={5=a}}}}]
java.util.stream.ReferencePipeline$3@d041cf
Notice the problem: you're collecting streams, and those inner streams never get executed, because no terminal operation is called on them. When I try adding something like .count() to the inner stream, your expected output is produced:
...
.map(l -> l.getValue().stream().map(m -> m.get(m.keySet().toArray()[0]))
        .map(n -> n.get(n.keySet().toArray()[0]))
        .map(o -> o.get(o.keySet().toArray()[0]))
        .map(p -> myMap.put(l.getKey(), p.get(p.keySet().toArray()[0])))
        .count()) // just an example
...
Now, I suppose you know that a terminal operation needs to be called for the intermediate ones to run.
In a rather desperate attempt to simplify this code, since the stream version is simply hard to read, I thought you might be interested in the following. It assumes that no collection in the tree is empty, and at least it addresses each record as one object rather than as a collection of records (though I'm sure no code will look clean for a map of maps nested that deeply):
String key = input.keySet().iterator().next();
String value = input.entrySet().iterator().next()
        .getValue().get(0)
        .values().iterator().next()
        .values().iterator().next()
        .values().iterator().next()
        .values().iterator().next();
myMap.put(key, value);
myMap.put(key, value);
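A stream-based sketch of the same idea (again assuming no level of the tree is empty) flattens each level with flatMap and uses toMap as the terminal operation, so there is no external myMap to mutate:
Map<String, String> myMap = input.entrySet().stream()
        .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().stream()                 // List<Map<...>>
                        .flatMap(m -> m.values().stream()) // unwrap level "2"
                        .flatMap(m -> m.values().stream()) // unwrap level "3"
                        .flatMap(m -> m.values().stream()) // unwrap level "4"
                        .flatMap(m -> m.values().stream()) // unwrap level "5"
                        .findFirst()
                        .get()));                          // assumes a value exists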

Elegant way to flatMap Set of Sets inside groupingBy

So I have a piece of code where I'm iterating over a list of data. Each one is a ReportData that contains a case with a Long caseId and one Ruling. Each Ruling has one or more Payments. I want to have a Map with the caseId as keys and sets of payments as values (i.e. a Map<Long, Set<Payment>>).
Cases are not unique across rows, but rulings are.
In other words, I can have several rows with the same case, but they will have unique rulings.
The following code gets me a Map<Long, Set<Set<Payment>>>, which is almost what I want, but I've been struggling to find the correct way to flatMap the final set in the given context. I've been doing workarounds to make the logic work correctly using this map as is, but I'd very much like to fix the algorithm to combine the sets of payments into one single set instead of creating a set of sets.
I've searched around and couldn't find a problem with the same kind of iteration, although flatMapping with Java streams seems like a somewhat popular topic.
rowData.stream()
        .collect(Collectors.groupingBy(
                r -> r.getCase().getCaseId(),
                Collectors.mapping(
                        r -> r.getRuling(),
                        Collectors.mapping(
                                ruling -> ruling.getPayments(),
                                Collectors.toSet()))));
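(Note that case is a reserved word in Java, so the snippets here use a getCase() accessor. For concreteness, the thread assumes domain classes roughly like this hypothetical minimal sketch, inferred from the accessors used:)
import java.util.Set;

// Hypothetical minimal domain classes; the real ones surely carry more state.
class Payment {}

class Ruling {
    private final Set<Payment> payments;
    Ruling(Set<Payment> payments) { this.payments = payments; }
    Set<Payment> getPayments() { return payments; }
}

class Case {
    private final Long caseId;
    Case(Long caseId) { this.caseId = caseId; }
    Long getCaseId() { return caseId; }
}

class ReportData {
    private final Case aCase; // "case" is a reserved word
    private final Ruling ruling;
    ReportData(Case aCase, Ruling ruling) { this.aCase = aCase; this.ruling = ruling; }
    Case getCase() { return aCase; }
    Long getCaseId() { return aCase.getCaseId(); }
    Ruling getRuling() { return ruling; }
}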
Another JDK8 solution:
Map<Long, Set<Payment>> resultSet =
        rowData.stream()
                .collect(Collectors.toMap(p -> p.getCase().getCaseId(),
                        p -> new HashSet<>(p.getRuling().getPayments()),
                        (l, r) -> { l.addAll(r); return l; }));
or as of JDK9 you can use the flatMapping collector:
rowData.stream()
        .collect(Collectors.groupingBy(r -> r.getCase().getCaseId(),
                Collectors.flatMapping(e -> e.getRuling().getPayments().stream(),
                        Collectors.toSet())));
The cleanest solution is to define your own collector:
Map<Long, Set<Payment>> result = rowData.stream()
        .collect(Collectors.groupingBy(
                ReportData::getCaseId,
                Collector.of(HashSet::new,
                        (s, r) -> s.addAll(r.getRuling().getPayments()),
                        (s1, s2) -> { s1.addAll(s2); return s1; })));
Two other solutions I thought of first, which are actually less efficient and less readable but still avoid constructing an intermediate Map:
Merging the inner sets using Collectors.reducing():
Map<Long, Set<Payment>> result = rowData.stream()
        .collect(Collectors.groupingBy(
                ReportData::getCaseId,
                Collectors.reducing(Collections.emptySet(),
                        r -> r.getRuling().getPayments(),
                        (s1, s2) -> {
                            Set<Payment> merged = new HashSet<>(s1);
                            merged.addAll(s2);
                            return merged;
                        })));
where the reducing operation merges the Set<Payment> of entries with the same caseId. This can, however, create a lot of copies of the sets if many merges are needed.
Another solution is with a downstream collector that flatmaps the nested collections:
Map<Long, Set<Payment>> result = rowData.stream()
        .collect(Collectors.groupingBy(
                ReportData::getCaseId,
                Collectors.collectingAndThen(
                        Collectors.mapping(r -> r.getRuling().getPayments(), Collectors.toList()),
                        s -> s.stream().flatMap(Set::stream).collect(Collectors.toSet()))));
Basically it puts all sets of matching caseId together in a List, then flatmaps that list into a single Set.
There are probably better ways to do this, but this is the best I found:
Map<Long, Set<Payment>> result =
        rowData.stream()
                // First group by caseId.
                .collect(Collectors.groupingBy(r -> r.getCase().getCaseId()))
                .entrySet().stream()
                // By streaming over the entrySet, map the values to the set of payments.
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        entry -> entry.getValue().stream()
                                .flatMap(r -> r.getRuling().getPayments().stream())
                                .collect(Collectors.toSet())));

Java Streams -- How to filter based on function output correctly

Edit: basically, I want to filter the entries based on whether entry.getObject() is a String that contains a given value.
So I have a block of code that looks something like this:
list.stream()
        .filter(entry -> entry.getObject() != null)
        .filter(entry -> entry.getObject() instanceof String)
        .filter(entry -> ((String) entry.getObject()).toLowerCase().contains(value))
        .collect(Collectors.toList());
The main problem is that I can't figure out how to structure this to persist the value of entry.getObject(), and I can't figure out how to manipulate the output of entry.getObject() without losing sight of the entry that yielded the value. Earlier attempts looked more like this:
list.stream()
        .map(entry -> entry.getObject())
        .filter(object -> object instanceof String)
        .map(object -> (String) object)
        .filter(str -> str.toLowerCase().contains(value))
        /* ... */
But then I couldn't figure out any way to relate it to the entry in the list that I started out with.
A possible solution might look like this:
list.stream()
        .filter(entry -> Arrays.stream(new Entry[] {entry})
                // Map from entry to entry.getObject()
                .map(e -> e.getObject())
                // Remove any objects that aren't strings
                .filter(object -> object instanceof String)
                // Map the object to a string
                .map(object -> (String) object)
                // Remove any strings that don't contain the value
                .filter(str -> str.toLowerCase().contains(value))
                // If anything remains, then entry is what I want
                .count() > 0)
        .collect(Collectors.toList());
This way, we can split off and analyze entry.getObject() without multiple calls to get the value back at each step.
You could do something like
list.stream()
        .map(e -> new AbstractMap.SimpleEntry<>(e, e.getObject()))
        .filter(p -> p.getValue() instanceof String)
        //...
        .map(p -> p.getKey())
        .collect(Collectors.toList());
Using Map.Entry (e.g. AbstractMap.SimpleEntry, as above) or javafx.util.Pair (or roll your own tuple-like structure).
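Arguably simpler still: a single filter with a block lambda calls getObject() once per element and never loses the original entry. A sketch, with Entry and getObject() as in the question:
List<Entry> matches = list.stream()
        .filter(entry -> {
            Object o = entry.getObject(); // call once, keep the value
            // instanceof is false for null, so no separate null check is needed
            return o instanceof String
                    && ((String) o).toLowerCase().contains(value);
        })
        .collect(Collectors.toList());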

How to use if-else logic in Java 8 stream forEach

What I want to do is shown below in 2 stream calls. I want to split a collection into 2 new collections based on some condition. Ideally I want to do it in 1. I've seen conditions used for the .map function of streams, but couldn't find anything for the forEach. What is the best way to achieve what I want?
animalMap.entrySet().stream()
        .filter(pair -> pair.getValue() != null)
        .forEach(pair -> myMap.put(pair.getKey(), pair.getValue()));
animalMap.entrySet().stream()
        .filter(pair -> pair.getValue() == null)
        .forEach(pair -> myList.add(pair.getKey()));
Just put the condition into the lambda itself, e.g.
animalMap.entrySet().stream()
        .forEach(pair -> {
            if (pair.getValue() != null) {
                myMap.put(pair.getKey(), pair.getValue());
            } else {
                myList.add(pair.getKey());
            }
        });
Of course, this assumes that both collections (myMap and myList) are declared and initialized prior to the above piece of code.
Update: using Map.forEach makes the code shorter, plus more efficient and readable, as Jorn Vernee kindly suggested:
animalMap.forEach((key, value) -> {
    if (value != null) {
        myMap.put(key, value);
    } else {
        myList.add(key);
    }
});
In most cases, when you find yourself using forEach on a Stream, you should rethink whether you are using the right tool for your job or whether you are using it the right way.
Generally, you should look for an appropriate terminal operation that does what you want to achieve, or for an appropriate Collector. Now, there are Collectors for producing Maps and Lists, but no out-of-the-box collector for combining two different collectors based on a predicate.
Now, this answer contains a collector for combining two collectors. Using this collector, you can achieve the task as:
Pair<Map<KeyType, Animal>, List<KeyType>> pair = animalMap.entrySet().stream()
        .collect(conditional(entry -> entry.getValue() != null,
                Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue),
                Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
Map<KeyType, Animal> myMap = pair.a;
List<KeyType> myList = pair.b;
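Since the linked answer isn't reproduced here, a sketch of such a combining collector; both conditional and the Pair holder with fields a and b are hypothetical reconstructions:
import java.util.function.Predicate;
import java.util.stream.Collector;

// Minimal pair holder matching the pair.a / pair.b usage above.
final class Pair<A, B> {
    final A a;
    final B b;
    Pair(A a, B b) { this.a = a; this.b = b; }
}

// Routes each element to one of two downstream collectors based on a predicate.
static <T, A1, A2, R1, R2> Collector<T, ?, Pair<R1, R2>> conditional(
        Predicate<? super T> test,
        Collector<T, A1, R1> ifTrue,
        Collector<T, A2, R2> ifFalse) {
    return Collector.<T, Pair<A1, A2>, Pair<R1, R2>>of(
            () -> new Pair<>(ifTrue.supplier().get(), ifFalse.supplier().get()),
            (acc, t) -> {
                if (test.test(t)) ifTrue.accumulator().accept(acc.a, t);
                else ifFalse.accumulator().accept(acc.b, t);
            },
            (p1, p2) -> new Pair<>(ifTrue.combiner().apply(p1.a, p2.a),
                    ifFalse.combiner().apply(p1.b, p2.b)),
            acc -> new Pair<>(ifTrue.finisher().apply(acc.a),
                    ifFalse.finisher().apply(acc.b)));
}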
But maybe you can solve this specific task in a simpler way. One of your results matches the input type; it's the same map, just stripped of the entries which map to null. If your original map is mutable and you don't need it afterwards, you can just collect the list and remove those keys from the original map, as they are mutually exclusive:
List<KeyType> myList = animalMap.entrySet().stream()
        .filter(pair -> pair.getValue() == null)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
animalMap.keySet().removeAll(myList);
Note that you can remove mappings to null even without having the list of the other keys:
animalMap.values().removeIf(Objects::isNull);
or
animalMap.values().removeAll(Collections.singleton(null));
If you can’t (or don’t want to) modify the original map, there is still a solution without a custom collector. As hinted in Alexis C.’s answer, partitioningBy is going into the right direction, but you may simplify it:
Map<Boolean, Map<KeyType, Animal>> tmp = animalMap.entrySet().stream()
        .collect(Collectors.partitioningBy(pair -> pair.getValue() != null,
                Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)));
Map<KeyType, Animal> myMap = tmp.get(true);
List<KeyType> myList = new ArrayList<>(tmp.get(false).keySet());
The bottom line is: don't forget about ordinary Collection operations; you don't have to do everything with the new Stream API.
The problem with using stream().forEach(...) with a call to add or put inside the forEach (i.e. mutating the external myMap or myList instance) is that you can easily run into concurrency issues if someone turns the stream parallel and the collection you are modifying is not thread-safe.
One approach you can take is to first partition the entries in the original map. Once you have that, grab the corresponding list of entries and collect them in the appropriate map and list.
Map<Boolean, List<Map.Entry<K, V>>> partitions =
        animalMap.entrySet()
                .stream()
                .collect(partitioningBy(e -> e.getValue() == null));
Map<K, V> myMap =
        partitions.get(false)
                .stream()
                .collect(toMap(Map.Entry::getKey, Map.Entry::getValue));
List<K> myList =
        partitions.get(true)
                .stream()
                .map(Map.Entry::getKey)
                .collect(toList());
... or if you want to do it in one pass, implement a custom collector (assuming a Tuple2<E1, E2> class exists, you can create your own), e.g:
public static <K, V> Collector<Map.Entry<K, V>, ?, Tuple2<Map<K, V>, List<K>>> customCollector() {
    return Collector.of(
            () -> new Tuple2<>(new HashMap<>(), new ArrayList<>()),
            (pair, entry) -> {
                if (entry.getValue() == null) {
                    pair._2.add(entry.getKey());
                } else {
                    pair._1.put(entry.getKey(), entry.getValue());
                }
            },
            (p1, p2) -> {
                p1._1.putAll(p2._1);
                p1._2.addAll(p2._2);
                return p1;
            });
}
with its usage:
Tuple2<Map<K, V>, List<K>> pair =
animalMap.entrySet().parallelStream().collect(customCollector());
You can tune it more if you want, for example by providing a predicate as parameter.
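For completeness, a minimal sketch of the Tuple2 class the collector above assumes (hypothetical; libraries such as Vavr ship an equivalent):
// Minimal two-element tuple with the _1/_2 fields used above.
public class Tuple2<E1, E2> {
    public final E1 _1;
    public final E2 _2;

    public Tuple2(E1 _1, E2 _2) {
        this._1 = _1;
        this._2 = _2;
    }
}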
I think it's possible in Java 9:
animalMap.entrySet().stream()
        .forEach(pair -> Optional.ofNullable(pair.getValue())
                .ifPresentOrElse(
                        v -> myMap.put(pair.getKey(), v),
                        () -> myList.add(pair.getKey())));
You need ifPresentOrElse (added in Java 9) for it to work, though. (I think a plain loop looks better.)

groupingBy and filter in one step

I have a Stream<String>, and I want a Map<Integer, String>. Let's call my classifier function getKey(String) - it can be expensive. Sometimes it returns zero, which means that the String should be discarded and not included in the resulting map.
So, I can use this code:
Stream<String> stringStream;
Map<Integer, String> result =
        stringStream.collect(Collectors.groupingBy(this::getKey, Collectors.joining()));
result.remove(0);
This first adds the unwanted Strings to the Map keyed by zero, and then removes them. There may be a lot of them. Is there an elegant way to avoid adding them to the map in the first place?
I don't want to add a filter step before grouping, because that would mean executing the decision/classification code twice.
You said that calling getKey is expensive, but you could still map the elements of the stream up front before filtering them. The call to getKey will then be made only once per element.
Map<Integer, String> result =
        stringStream.map(s -> new SimpleEntry<>(this.getKey(s), s))
                .filter(e -> e.getKey() != 0)
                .collect(groupingBy(Map.Entry::getKey, mapping(Map.Entry::getValue, joining())));
Note that there are no tuple classes in the standard API. You may roll your own or use AbstractMap.SimpleEntry as a substitute.
Alternatively, if you think the first version creates a lot of entries, you can use the collect method where you provide the supplier, accumulator and combiner yourself:
Map<Integer, String> result = stringStream
        .collect(HashMap::new,
                (m, e) -> {
                    Integer key = this.getKey(e);
                    if (key != 0) {
                        m.merge(key, e, String::concat);
                    }
                },
                // merge-aware combiner, so a parallel stream doesn't drop values
                (m1, m2) -> m2.forEach((k, v) -> m1.merge(k, v, String::concat)));
You may use a stream of pairs like this:
stringStream.map(x -> new Pair(getKey(x), x))
        .filter(pair -> pair.left != 0) // or whatever predicate
        .collect(Collectors.groupingBy(pair -> pair.left,
                Collectors.mapping(pair -> pair.right, Collectors.joining())));
This code assumes a simple Pair class with two fields, left and right.
Some third-party libraries like my StreamEx provide additional methods to remove the boilerplate:
StreamEx.of(stringStream)
        .mapToEntry(this::getKey, x -> x)
        .filterKeys(key -> key != 0) // or whatever
        .grouping(Collectors.joining());
