I have a Map<Long, List<Member>> and I want to produce a Map<Member, Long> that is calculated by iterating over the List<Member> of each Map.Entry<Long, List<Member>> and summing the keys of the entries for every member in those lists. It's easy the imperative (non-functional) way, but I couldn't find a way to do it with the Java 8 Stream API without writing a custom collector. I think I need something like Stream.collect(Collectors.toFlatMap); however, there is no such method in Collectors.
The best way that I could find is this:
longListOfMemberMap = new HashMap<Long, List<Member>>();
longListOfMemberMap.put(10L, asList(member1, member2));
Map<Member, Long> collect = longListOfMemberMap.entrySet().stream()
    .collect(new Collector<Map.Entry<Long, List<Member>>, Map<Member, Long>, Map<Member, Long>>() {
        @Override
        public Supplier<Map<Member, Long>> supplier() {
            return HashMap::new;
        }

        @Override
        public BiConsumer<Map<Member, Long>, Map.Entry<Long, List<Member>>> accumulator() {
            return (memberLongMap, tokenRangeListEntry) -> tokenRangeListEntry.getValue().forEach(member -> {
                memberLongMap.compute(member, new BiFunction<Member, Long, Long>() {
                    @Override
                    public Long apply(Member member, Long aLong) {
                        return (aLong == null ? 0 : aLong) + tokenRangeListEntry.getKey();
                    }
                });
            });
        }

        @Override
        public BinaryOperator<Map<Member, Long>> combiner() {
            return (memberLongMap, memberLongMap2) -> {
                memberLongMap.forEach((member, value) -> memberLongMap2.compute(member, new BiFunction<Member, Long, Long>() {
                    @Override
                    public Long apply(Member member, Long aLong) {
                        // guard against null: the member may not be present in the second map yet
                        return (aLong == null ? 0 : aLong) + value;
                    }
                }));
                return memberLongMap2;
            };
        }

        @Override
        public Function<Map<Member, Long>, Map<Member, Long>> finisher() {
            return memberLongMap -> memberLongMap;
        }

        @Override
        public Set<Characteristics> characteristics() {
            return EnumSet.of(Characteristics.UNORDERED);
        }
    });
// collect is equal to
// 1. member1 -> 10
// 2. member2 -> 10
The code in the example takes a Map<Long, List<Member>> as a parameter and produces a Map<Member, Long>:
parameter Map<Long, List<Member>>:
// 1. 10 -> list(member1, member2)
collected value Map<Member, Long>:
// 1. member1 -> 10
// 2. member2 -> 10
However, as you can see, it's much uglier than the imperative way. I tried Collectors.toMap and Stream's reduce method, but I couldn't find a way to do it in a few lines of code.
Which way would be the simplest and functional for this problem?
longListOfMemberMap.entrySet().stream()
.flatMap(entry -> entry.getValue().stream().map(
member ->
new AbstractMap.SimpleImmutableEntry<>(member, entry.getKey())))
.collect(Collectors.groupingBy(
Entry::getKey,
Collectors.summingLong(Entry::getValue)));
...though an even simpler but more imperative alternative might look like
Map<Member, Long> result = new HashMap<>();
longListOfMemberMap.forEach((val, members) ->
members.forEach(member -> result.merge(member, val, Long::sum)));
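Both variants can be exercised with a minimal, self-contained sketch; the `Member` record below is a hypothetical stand-in for the real class:

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Collectors;

public class Demo {
    // Hypothetical stand-in for the real Member type.
    record Member(String name) {}

    public static void main(String[] args) {
        Member member1 = new Member("member1");
        Member member2 = new Member("member2");
        Map<Long, List<Member>> longListOfMemberMap = new HashMap<>();
        longListOfMemberMap.put(10L, List.of(member1, member2));

        // flatMap each (key, list) entry into (member, key) pairs, then sum per member
        Map<Member, Long> streamed = longListOfMemberMap.entrySet().stream()
                .flatMap(entry -> entry.getValue().stream()
                        .map(member -> new AbstractMap.SimpleImmutableEntry<>(member, entry.getKey())))
                .collect(Collectors.groupingBy(
                        Entry::getKey,
                        Collectors.summingLong(Entry::getValue)));

        // imperative alternative using Map.merge
        Map<Member, Long> merged = new HashMap<>();
        longListOfMemberMap.forEach((val, members) ->
                members.forEach(member -> merged.merge(member, val, Long::sum)));

        if (!streamed.equals(merged)) throw new AssertionError();
        if (streamed.get(member1) != 10L) throw new AssertionError();
        System.out.println(streamed);
    }
}
```

Both approaches produce the same map, so the choice between them is purely a matter of style.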
I will just point out that the code you have posted can be written down much more concisely when relying on Collector.of and turning your anonymous classes into lambdas:
Map<Member, Long> result = longListOfMemberMap.entrySet().stream()
.collect(Collector.of(
HashMap::new,
(acc, item) -> item.getValue().forEach(member -> acc.compute(member,
(x, val) -> Optional.ofNullable(val).orElse(0L) + item.getKey())),
(m1, m2) -> {
// merge avoids the NullPointerException that compute would throw
// for members not yet present in m2
m1.forEach((member, val1) -> m2.merge(member, val1, Long::sum));
return m2;
}
));
This is still cumbersome, but at least not overwhelmingly so.
Related
I wrote a stream pipeline:
private void calcMin(Clazz clazz) {
    OptionalInt min = listOfObjects.stream()
        .filter(y -> y.getName().matches(clazz.getFilter()))
        .map(y -> y.getUserNumber())
        .mapToInt(Integer::intValue)
        .min();
    list.add(min.getAsInt());
}
This pipeline gives me the lowest UserNumber.
So far, so good.
But I also need the greatest UserNumber.
And I also need the lowest GroupNumber.
And also the greatest GroupNumber.
I could write:
private void calcMax(Clazz clazz) {
    OptionalInt max = listOfObjects.stream()
        .filter(y -> y.getName().matches(clazz.getFilter()))
        .map(y -> y.getUserNumber())
        .mapToInt(Integer::intValue)
        .max();
    list.add(max.getAsInt());
}
And I could also write the same for .map(y -> (y.getGroupNumber())).
This will work, but it is very redundant.
Is there a more generic way to do it?
There are two differences in the examples: the map() operation, and the terminal operation (min() and max()). So, to reuse the rest of the pipeline, you'll want to parameterize these.
I will warn you up front, however, that if you call this parameterized method directly from many places, your code will be harder to read. Comprehension of the caller's code will be easier if you keep a helper function—with a meaningful name—that delegates to the generic method. Obviously, there is a balance here. If you wanted to add additional functional parameters, the number of helper methods would grow rapidly and become cumbersome. And if you only call each helper from one place, maybe using the underlying function directly won't add too much clutter.
You don't show the type of elements in the stream. I'm using the name MyClass in this example as a placeholder.
private static OptionalInt extremum(
Collection<? extends MyClass> input,
Clazz clazz,
ToIntFunction<? super MyClass> valExtractor,
Function<IntStream, OptionalInt> terminalOp) {
IntStream matches = input.stream()
.filter(y -> y.getName().matches(clazz.getFilter()))
.mapToInt(valExtractor);
return terminalOp.apply(matches);
}
private OptionalInt calcMinUserNumber(Clazz clazz) {
return extremum(listOfObjects, clazz, MyClass::getUserNumber, IntStream::min);
}
private OptionalInt calcMaxUserNumber(Clazz clazz) {
return extremum(listOfObjects, clazz, MyClass::getUserNumber, IntStream::max);
}
private OptionalInt calcMinGroupNumber(Clazz clazz) {
return extremum(listOfObjects, clazz, MyClass::getGroupNumber, IntStream::min);
}
private OptionalInt calcMaxGroupNumber(Clazz clazz) {
return extremum(listOfObjects, clazz, MyClass::getGroupNumber, IntStream::max);
}
...
And here's a usage example:
calcMaxGroupNumber(clazz).ifPresent(list::add);
This solution may reduce redundancy, but it takes away some readability:
IntStream maxi = listOfObjects.stream()
    .filter(y -> y.getName().matches(clazz.getFilter()))
    .map(y -> y.getUserNumber())
    .mapToInt(Integer::intValue);
System.out.println(applier(() -> maxi, IntStream::max));
//System.out.println(applier(() -> maxi, IntStream::min));
...
public static OptionalInt applier(Supplier<IntStream> supplier, Function<IntStream, OptionalInt> terminalOp) {
    return terminalOp.apply(supplier.get());
}
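A caveat with the snippet above: a stream can only be consumed once, so uncommenting the second applier call would throw IllegalStateException, because the supplier hands back the same already-consumed stream. The sketch below (using a hypothetical list of integers in place of listOfObjects) shows a supplier that builds a fresh stream on every call:

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.IntStream;

public class StreamReuseDemo {
    public static OptionalInt applier(Supplier<IntStream> supplier,
                                      Function<IntStream, OptionalInt> terminalOp) {
        return terminalOp.apply(supplier.get());
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(3, 1, 4, 1, 5); // hypothetical data
        // The supplier builds a NEW IntStream on every call, so each
        // terminal operation gets its own stream to consume.
        Supplier<IntStream> fresh = () -> numbers.stream().mapToInt(Integer::intValue);
        OptionalInt min = applier(fresh, IntStream::min);
        OptionalInt max = applier(fresh, IntStream::max);
        if (min.getAsInt() != 1 || max.getAsInt() != 5) throw new AssertionError();
        System.out.println(min + " " + max);
    }
}
```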
For the sake of variety, I want to add the following approach, which uses a nested Collectors.teeing (Java 12 or higher) and makes it possible to get all the values while streaming over the collection only once.
For the setup, I am using the simple class below (with Lombok annotations):
@AllArgsConstructor
@ToString
@Getter
static class MyObject {
    int userNumber;
    int groupNumber;
}
and a list of MyObjects:
List<MyObject> myObjectList = List.of(
new MyObject(1, 2),
new MyObject(2, 3),
new MyObject(3, 4),
new MyObject(5, 3),
new MyObject(6, 2),
new MyObject(7, 6),
new MyObject(1, 12));
If the task were only to get the max and min userNumber, one could do a simple teeing like the one below and, for example, put the values into a map:
Map<String , Integer> maxMinUserNum =
myObjectList.stream()
.collect(
Collectors.teeing(
Collectors.reducing(Integer.MAX_VALUE, MyObject::getUserNumber, Integer::min),
Collectors.reducing(Integer.MIN_VALUE, MyObject::getUserNumber, Integer::max),
(min,max) -> {
Map<String,Integer> map = new HashMap<>();
map.put("minUser",min);
map.put("maxUser",max);
return map;
}));
System.out.println(maxMinUserNum);
//output: {minUser=1, maxUser=7}
Since the task also includes getting the max and min group numbers, we can use the same approach as above and only need to nest the teeing collectors:
Map<String , Integer> result =
myObjectList.stream()
.collect(
Collectors.teeing(
Collectors.teeing(
Collectors.reducing(Integer.MAX_VALUE, MyObject::getUserNumber, Integer::min),
Collectors.reducing(Integer.MIN_VALUE, MyObject::getUserNumber, Integer::max),
(min,max) -> {
Map<String,Integer> map = new LinkedHashMap<>();
map.put("minUser",min);
map.put("maxUser",max);
return map;
}),
Collectors.teeing(
Collectors.reducing(Integer.MAX_VALUE, MyObject::getGroupNumber, Integer::min),
Collectors.reducing(Integer.MIN_VALUE, MyObject::getGroupNumber, Integer::max),
(min,max) -> {
Map<String,Integer> map = new LinkedHashMap<>();
map.put("minGroup",min);
map.put("maxGroup",max);
return map;
}),
(map1,map2) -> {
map1.putAll(map2);
return map1;
}));
System.out.println(result);
output
{minUser=1, maxUser=7, minGroup=2, maxGroup=12}
I have the following class:
public static class GenerateMetaAlert implements WindowFunction<Tuple2<String, Boolean>, Tuple2<String, Boolean>, Tuple, TimeWindow> {
@Override
public void apply(Tuple key, TimeWindow timeWindow, Iterable<Tuple2<String, Boolean>> iterable, Collector<Tuple2<String, Boolean>> collector) throws Exception {
//code
}
}
What I'm trying to do is check, for each element of the collection, whether there is any other element with the opposite value in a certain field.
An example:
Iterable: [<val1,val2>,<val3,val4>,<val5,val6>,...,<valx,valy>]
|| || || ||
elem1 elem2 elem3 elemn
What I would like to test:
foreach(element)
if elem(i).f0 = elem(i+1).f0 then ...
if elem(i).f0 = elem(i+2).f0 then ...
<...>
if elem(i+1).f0 = elem(i+2).f0 then ...
<...>
if elem(n-1).f0 = elem(n).f0 then ...
I think this would be possible using something like this:
Tuple2<String, Boolean> tupla = iterable.iterator().next();
iterable.iterator().forEachRemaining((e)->{
if ((e.f0 == tupla.f0) && (e.f1 != tupla.f1)) collector.collect(e);});
But as I'm new to Java, I don't know how to do it in an optimal way.
This is part of a Java program which uses Apache Flink:
.keyBy(0, 1)
.timeWindow(Time.seconds(60))
.apply(new GenerateMetaAlert())
Testing:
Using the following code:
public static class GenerateMetaAlert implements WindowFunction<Tuple2<String, Boolean>, Tuple2<String, Boolean>, Tuple, TimeWindow> {
@Override
public void apply(Tuple key, TimeWindow timeWindow, Iterable<Tuple2<String, Boolean>> iterable, Collector<Tuple2<String, Boolean>> collector) throws Exception {
System.out.println("key: " +key);
StreamSupport.stream(iterable.spliterator(), false)
.collect(Collectors.groupingBy(t -> t.f0)) // yields a Map<String, List<Tuple2<String, Boolean>>>
.values() // yields a Collection<List<Tuple2<String, Boolean>>>
.stream()
.forEach(l -> {
System.out.println("l.size: " +l.size());
// l is the list of tuples for some common f0
while (l.size() > 1) {
Tuple2<String, Boolean> t0 = l.get(0);
System.out.println("t0: " +t0);
l = l.subList(1, l.size());
l.stream()
.filter(t -> t.f1 != t0.f1)
.forEach(t -> System.out.println("t: "+ t));
}
});
}
}
The result is:
key: (868789022645948,true)
key: (868789022645948,false)
l.size: 2
l.size: 2
t0: (868789022645948,true)
t0: (868789022645948,false)
Conclusion of this test: it is as if the condition .filter(t -> t.f1 != t0.f1) is never met.
If I change .filter(t -> t.f1 != t0.f1) to .filter(t -> t.f1 != true) (or false), the filter works.
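Two things may be at play here, neither confirmed against the Flink internals. First, the pipeline keys by both fields (.keyBy(0, 1)), so every tuple inside a window group already shares the same f1, and a filter looking for a differing f1 can never match within such a group (the printed keys above show true and false arriving in separate groups). Second, f1 is a boxed Boolean, and != between two Boolean objects compares references, not values, so it can misbehave when deserialization produces distinct instances. A small sketch of the boxing pitfall:

```java
public class BooleanCompareDemo {
    public static void main(String[] args) {
        Boolean a = Boolean.valueOf(true);   // cached instance
        @SuppressWarnings("deprecation")
        Boolean b = new Boolean(true);       // a distinct instance, as a deserializer might produce
        // != on two Boolean objects compares references, not values:
        if (!(a != b)) throw new AssertionError();  // different objects despite equal values
        // value-based comparisons behave as expected:
        if (!a.equals(b)) throw new AssertionError();
        if (a.booleanValue() != b.booleanValue()) throw new AssertionError();
        // comparing against a primitive literal unboxes, which is why
        // `t.f1 != true` "works" while `t.f1 != t0.f1` may not
        if (b != true) throw new AssertionError();
        System.out.println("reference and value comparison differ");
    }
}
```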
I also tried the following:
final Boolean[] aux = new Boolean[1];
<...>
Tuple2<String, Boolean> t0 = l.get(0);
aux[0] = t0.f1;
<...>
.filter(t -> !t.f1.equals(aux[0]))
But even with that, I don't have any output (I only have output when I use t.f1.equals(aux[0])).
An Iterable allows you to obtain as many Iterators over its elements as you like, but each of them iterates over all the elements, and only once. Thus, your idea for using forEachRemaining() will not work as you hope. Because you're generating a new Iterator to invoke that method on, it will start at the beginning instead of after the element most recently provided by the other iterator.
What you can do instead is create a Stream by use of the Iterable's Spliterator, and use a grouping-by Collector to group the iterable's tuples by their first value. You can then process the tuple lists as you like.
For example, although I have some doubts as to whether it's what you actually want, this implements the logic described in the question:
StreamSupport.stream(iterable.spliterator(), false)
.collect(Collectors.groupingBy(t -> t.f0)) // yields a Map<String, List<Tuple2<String, Boolean>>>
.values() // yields a Collection<List<Tuple2<String, Boolean>>>
.stream()
.forEach(l -> {
// l is the list of tuples for some common f0
while (l.size() > 1) {
Tuple2<String, Boolean> t0 = l.get(0);
l = l.subList(1, l.size());
l.stream()
.filter(t -> t.f1 != t0.f1)
.forEach(t -> collect(t));
}
});
Note well that that can collect the same tuple multiple times, as follows from your pseudocode. If you wanted something different, such as collecting only tuples representing a flip of f1 value for a given f0, once each, then you would want a different implementation of the lambda in the outer forEach() operation.
Following class:
public class Foo {
private Date date;
private String name;
private Long number;
}
I now have a List<Foo> which I want to convert to Map<Date, Map<String,Long>> (Long should be a sum of numbers). What makes this hard is that I want exactly 26 entries in the inner map, where the 26th is called "Others" which sums up everything that has a number lower than the other 25.
I came up with following code:
data.stream().collect(Collectors.groupingBy(e -> e.getDate(), Collectors.groupingBy(e -> {
if (/*get current size of inner map*/>= 25) {
return e.getName();
} else {
return "Other";
}
}, Collectors.summingLong(e -> e.getNumber()))));
As you can see, I have no idea how to check the number of elements which are already in the inner map. How can I get the current size of the inner map or is there another way to achieve what I want?
My Java 7 code:
Map<Date, Map<String, Long>> result = new LinkedHashMap<Date, Map<String, Long>>();
for (Foo fr : data) {
    if (result.get(fr.getDate()) == null) {
        result.put(fr.getDate(), new LinkedHashMap<String, Long>());
    }
    if (result.get(fr.getDate()) != null) {
        if (result.get(fr.getDate()).size() >= 25) {
            if (result.get(fr.getDate()).get("Other") == null) {
                result.get(fr.getDate()).put("Other", 0L);
            }
            if (result.get(fr.getDate()).get("Other") != null) {
                long numbers = result.get(fr.getDate()).get("Other");
                result.get(fr.getDate()).replace("Other", numbers + fr.getNumber());
            }
        } else {
            result.get(fr.getDate()).put(fr.getName(), fr.getNumber());
        }
    }
}
Edit:
The map should help me to realize a table like this:
But I need to sum the "Others" first.
If you need any more infos feel free to ask
I don’t think that this operation will benefit from using the Stream API. Still, you can improve the operation with Java 8 features:
Map<Date, Map<String, Long>> result = new LinkedHashMap<>();
for (Foo fr : data) {
    Map<String, Long> inner
        = result.computeIfAbsent(fr.getDate(), date -> new LinkedHashMap<>());
    inner.merge(inner.size() >= 25 ? "Other" : fr.getName(), fr.getNumber(), Long::sum);
}
This code assumes that the names are already unique for each date. Otherwise, you would have to extend the code to
Map<Date, Map<String, Long>> result = new LinkedHashMap<>();
for (Foo fr : data) {
    Map<String, Long> inner
        = result.computeIfAbsent(fr.getDate(), date -> new LinkedHashMap<>());
    inner.merge(inner.size() >= 25 && !inner.containsKey(fr.getName()) ?
        "Other" : fr.getName(), fr.getNumber(), Long::sum);
}
to accumulate the values correctly.
For completeness, here is how to implement it as a stream operation.
Since the custom collector has some complexity, it’s worth writing it as reusable code:
public static <T,K,V> Collector<T,?,Map<K,V>> toMapWithLimit(
        Function<? super T, ? extends K> key, Function<? super T, ? extends V> value,
        int limit, K fallBack, BinaryOperator<V> merger) {
    return Collector.of(LinkedHashMap::new, (map, t) ->
            mergeWithLimit(map, key.apply(t), value.apply(t), limit, fallBack, merger),
        (map1, map2) -> {
            if (map1.isEmpty()) return map2;
            if (map1.size() + map2.size() < limit)
                map2.forEach((k, v) -> map1.merge(k, v, merger));
            else
                map2.forEach((k, v) ->
                    mergeWithLimit(map1, k, v, limit, fallBack, merger));
            return map1;
        });
}

private static <T,K,V> void mergeWithLimit(Map<K,V> map, K key, V value,
        int limit, K fallBack, BinaryOperator<V> merger) {
    map.merge(map.size() >= limit && !map.containsKey(key) ? fallBack : key, value, merger);
}
This is like Collectors.toMap, but supporting a limit and a fallback key for additional entries. You may recognize the Map.merge call, similar to the one in the loop solution, as the crucial element.
Then, you may use the collector as
Map<Date, Map<String, Long>> result = data.stream().collect(
    Collectors.groupingBy(Foo::getDate, LinkedHashMap::new,
        toMapWithLimit(Foo::getName, Foo::getNumber, 25, "Other", Long::sum)));
A bit too late :) But I came up with this Java 8 solution without using a for loop or a custom collector. It is based on collectingAndThen, which allows you to transform the result of a collecting operation.
It allows me to divide the stream in the finisher operation based on a threshold.
However, I am not sure about the performance.
int threshold = 25;
Map<Date, Map<String, Long>> result = data.stream().collect(groupingBy(Foo::getDate,
    collectingAndThen(Collectors.toList(), x -> {
        if (x.size() >= threshold) {
            Map<String, Long> resultMap = new HashMap<>();
            resultMap.putAll(x.subList(0, threshold).stream()
                .collect(groupingBy(Foo::getName, Collectors.summingLong(Foo::getNumber))));
            resultMap.putAll(x.subList(threshold, x.size()).stream()
                .collect(groupingBy(y -> "Other", Collectors.summingLong(Foo::getNumber))));
            return resultMap;
        } else {
            return x.stream().collect(groupingBy(Foo::getName, Collectors.summingLong(Foo::getNumber)));
        }
    })));
First of all, let's simplify the original problem by adapting it to Java 8 without using streams.
Map<Date, Map<String, Long>> result = new LinkedHashMap<>();
for (Foo fr : data) {
    Map<String, Long> map = result.getOrDefault(fr.getDate(), new LinkedHashMap<>());
    if (map.size() >= 25) {
        Long value = map.getOrDefault("Other", 0L); // getOrDefault from 1.8
        map.put("Other", value + fr.getNumber());
    } else {
        map.put(fr.getName(), fr.getNumber());
    }
    result.put(fr.getDate(), map);
}
And now using Stream
int limit = 25;
Map<Date, Map<String, Long>> collect = data.stream()
.collect(Collectors.groupingBy(Foo::getDate))
.entrySet().stream()
.collect(Collectors.toMap(Map.Entry::getKey, v -> {
Map<String, Long> c = v.getValue().stream()
.limit(limit)
.collect(Collectors.toMap(Foo::getName, Foo::getNumber));
long remaining = v.getValue().size() - limit;
if (remaining > 0) {
c.put("Other", remaining);
}
return c;
}));
I'm trying to collect into a Map the results of processing a list of objects, where the processing itself returns a map. I think I should do it with Collectors.toMap, but I haven't found the way.
This is the code:
public class Car {
List<VersionCar> versions;
public List<VersionCar> getVersions() {
return versions;
}
}
public class VersionCar {
private String wheelsKey;
private String engineKey;
public String getWheelsKey() {
return wheelsKey;
}
public String getEngineKey() {
return engineKey;
}
}
process method:
private static Map<String,Set<String>> processObjects(VersionCar version) {
Map<String,Set<String>> mapItems = new HashMap<>();
mapItems.put("engine", new HashSet<>(Arrays.asList(version.getEngineKey())));
mapItems.put("wheels", new HashSet<>(Arrays.asList(version.getWheelsKey())));
return mapItems;
}
My final code is:
Map<String,Set<String>> mapAllItems =
car.getVersions().stream()
.map(versionCar -> processObjects(versionCar))
.collect(Collectors.toMap()); // here I don't know how to collect the map
My idea is to process the list of versions and end up with a Map with two entries, wheels and engine, each holding a Set of all the distinct items across all versions. Any ideas on how I can do that with Collectors.toMap or another option?
The operator you want to use in this case is probably "reduce"
car.getVersions().stream()
.map(versionCar -> processObjects(versionCar))
.reduce((map1, map2) -> {
map2.forEach((key, subset) -> map1.get(key).addAll(subset));
return map1;
})
.orElse(new HashMap<>());
The lambda used in reduce is a BinaryOperator that merges two maps and returns the merged map.
The orElse is just there to return something in case your initial collection (versions) is empty.
From a type point of view, it gets rid of the Optional.
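Under the assumption that processObjects always emits both keys (so get(key) never returns null), the reduce-based merge can be sketched end to end; the VersionCar record here is a stub for the question's class:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReduceMergeDemo {
    // Stub for the question's VersionCar class.
    record VersionCar(String wheelsKey, String engineKey) {}

    // same shape as the question's processObjects
    static Map<String, Set<String>> processObjects(VersionCar v) {
        Map<String, Set<String>> m = new HashMap<>();
        m.put("engine", new HashSet<>(Arrays.asList(v.engineKey())));
        m.put("wheels", new HashSet<>(Arrays.asList(v.wheelsKey())));
        return m;
    }

    public static void main(String[] args) {
        List<VersionCar> versions = List.of(
                new VersionCar("w1", "e1"),
                new VersionCar("w2", "e1"));
        Map<String, Set<String>> all = versions.stream()
                .map(ReduceMergeDemo::processObjects)
                .reduce((map1, map2) -> {
                    // both maps always contain the same keys, so get(key) is never null here
                    map2.forEach((key, subset) -> map1.get(key).addAll(subset));
                    return map1;
                })
                .orElse(new HashMap<>());
        if (!all.get("wheels").equals(Set.of("w1", "w2"))) throw new AssertionError();
        if (!all.get("engine").equals(Set.of("e1"))) throw new AssertionError();
        System.out.println(all);
    }
}
```

Note that mutating map1 inside reduce technically violates reduce's requirement for non-interfering functions; it works on a sequential stream, but a Collector-based solution is the safer choice if the stream may run in parallel.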
You can use Collectors.toMap(keyMapper, valueMapper, mergeFunction). Last argument is used to resolve collisions between values associated with the same key.
For example:
Map<String, Set<String>> mapAllItems =
car.getVersions().stream()
.map(versionCar -> processObjects(versionCar))
.flatMap(m -> m.entrySet().stream())
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
(firstSet, secondSet) -> {
Set<String> result = new HashSet<>();
result.addAll(firstSet);
result.addAll(secondSet);
return result;
}
));
To get mapAllItems, we don't need (and should not) define the processObjects method:
Map<String, Set<String>> mapAllItems = new HashMap<>();
mapAllItems.put("engine", car.getVersions().stream().map(v -> v.getEngineKey()).collect(Collectors.toSet()));
mapAllItems.put("wheels", car.getVersions().stream().map(v -> v.getWheelsKey()).collect(Collectors.toSet()));
Or by using AbstractMap.SimpleEntry, which is lighter than the Map created by processObjects:
mapAllItems = car.getVersions().stream()
.flatMap(v -> Stream.of(new SimpleEntry<>("engine", v.getEngineKey()), new SimpleEntry<>("wheels", v.getWheelsKey())))
.collect(Collectors.groupingBy(e -> e.getKey(), Collectors.mapping(e -> e.getValue(), Collectors.toSet())));
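The flatMap/groupingBy variant can be verified with a small stand-alone sketch (again with a stub VersionCar):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EntryGroupingDemo {
    // Stub for the question's VersionCar class.
    record VersionCar(String wheelsKey, String engineKey) {}

    public static void main(String[] args) {
        List<VersionCar> versions = List.of(
                new VersionCar("w1", "e1"),
                new VersionCar("w2", "e1"));
        // one (category, key) pair per attribute, grouped into sets
        Map<String, Set<String>> all = versions.stream()
                .flatMap(v -> Stream.of(
                        new SimpleEntry<>("engine", v.engineKey()),
                        new SimpleEntry<>("wheels", v.wheelsKey())))
                .collect(Collectors.groupingBy(
                        SimpleEntry::getKey,
                        Collectors.mapping(SimpleEntry::getValue, Collectors.toSet())));
        if (!all.get("engine").equals(Set.of("e1"))) throw new AssertionError();
        if (!all.get("wheels").equals(Set.of("w1", "w2"))) throw new AssertionError();
        System.out.println(all);
    }
}
```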
My first attempt with Java 8 streams...
I have an object Bid, which represents a user's bid for an item in an auction. I have a list of bids, and I want to build a map that counts how many (distinct) auctions each user made a bid in.
This is my take on it:
bids.stream()
    .collect(
        Collectors.groupingBy(
            Bid::getBidderUserId,
            mapping(Bid::getAuctionId, Collectors.toSet())
        )
    ).entrySet().stream().collect(Collectors.toMap(
        e -> e.getKey(), e -> e.getValue().size())
    );
It works, but I feel like I'm cheating, because I stream the entry set of the map instead of doing the manipulation on the initial stream... There must be a more correct way of doing this, but I couldn't figure it out...
Thanks
You can perform groupingBy twice:
Map<Integer, Map<Integer, Long>> map = bids.stream().collect(
groupingBy(Bid::getBidderUserId,
groupingBy(Bid::getAuctionId, counting())));
This way you have how many bids each user made in each auction, so the size of the internal map is the number of auctions the user participated in. If you don't need the additional information, you can do this:
Map<Integer, Integer> map = bids.stream().collect(
groupingBy(
Bid::getBidderUserId,
collectingAndThen(
groupingBy(Bid::getAuctionId, counting()),
Map::size)));
This is exactly what you need: a mapping of users to the number of auctions each user participated in.
Update: there's also similar solution which is closer to your example:
Map<Integer, Integer> map = bids.stream().collect(
groupingBy(
Bid::getBidderUserId,
collectingAndThen(
mapping(Bid::getAuctionId, toSet()),
Set::size)));
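A minimal runnable sketch of that last variant, with a stub Bid class (the field values are made up):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import static java.util.stream.Collectors.*;

public class BidDemo {
    // Stub for the question's Bid class.
    record Bid(int bidderUserId, int auctionId) {
        int getBidderUserId() { return bidderUserId; }
        int getAuctionId() { return auctionId; }
    }

    public static void main(String[] args) {
        List<Bid> bids = List.of(
                new Bid(1, 100), new Bid(1, 100), new Bid(1, 200),
                new Bid(2, 100));
        // distinct auctions per user: collect auction ids into a set, then take its size
        Map<Integer, Integer> auctionsPerUser = bids.stream().collect(
                groupingBy(Bid::getBidderUserId,
                        collectingAndThen(
                                mapping(Bid::getAuctionId, toSet()),
                                Set::size)));
        if (auctionsPerUser.get(1) != 2) throw new AssertionError();
        if (auctionsPerUser.get(2) != 1) throw new AssertionError();
        System.out.println(auctionsPerUser);
    }
}
```

User 1 bid twice in auction 100 and once in auction 200, so the duplicate bid collapses in the set and the count is 2.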
Tagir Valeev's answer is the right one (+1). Here is an additional one that does exactly the same thing using your own downstream Collector for the groupingBy:
Map<Integer, Long> map = bids.stream().collect(
Collectors.groupingBy(Bid::getBidderUserId,
new Collector<Bid, Set<Integer>, Long>() {
    @Override
    public Supplier<Set<Integer>> supplier() {
        return HashSet::new;
    }

    @Override
    public BiConsumer<Set<Integer>, Bid> accumulator() {
        return (s, b) -> s.add(b.getAuctionId());
    }

    @Override
    public BinaryOperator<Set<Integer>> combiner() {
        return (s1, s2) -> {
            s1.addAll(s2);
            return s1;
        };
    }

    @Override
    public Function<Set<Integer>, Long> finisher() {
        return s -> Long.valueOf(s.size());
    }

    @Override
    public Set<java.util.stream.Collector.Characteristics> characteristics() {
        // no IDENTITY_FINISH here: the finisher is not the identity function,
        // so declaring that characteristic would let the finisher be skipped
        return Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.UNORDERED));
    }
}));