Java 8 stream groupBy pojo

I have a collection of pojos:
public class Foo {
String name;
String date;
int count;
}
I need to iterate over the collection, group the Foos by name, sum their counts, and then create a new collection of pojos holding the summed counts.
Here is how I do it now:
List<Foo> foosToSum = ...
Map<String, List<Foo>> foosGroupedByName = foosToSum.stream()
.collect(Collectors.groupingBy(Foo::getName));
List<Foo> groupedFoos = foosGroupedByName.keySet().stream().map(name -> {
int totalCount = 0;
String date = "";
for(Foo foo: foosGroupedByName.get(name)) {
totalCount += foo.getCount();
date = foo.getDate(); // the last one is used
}
return new Foo(name, date, totalCount);
}).collect(Collectors.toList());
Is there a more elegant way to do this with streams?
UPDATE: Thanks, everyone, for the help. All the answers were great.
I decided to create a merge function in the pojo.
The final solution looks like this:
Collection<Foo> groupedFoos = foosToSum.stream()
.collect(Collectors.toMap(Foo::getName, Function.identity(), Foo::merge))
.values();

You can do it using either the groupingBy or the toMap collector; which one to use is debatable, so I'll let you decide on the one you prefer.
For better readability, I'd create a merge function in Foo and hide all the merging logic inside it.
This also improves maintainability: as the merging gets more complex, you only have to change one place, the merge method, and not the stream query.
For example:
public Foo merge(Foo another){
this.count += another.getCount();
/* further merging if needed...*/
return this;
}
Now you can do:
Collection<Foo> resultSet = foosToSum.stream()
.collect(Collectors.toMap(Foo::getName,
Function.identity(), Foo::merge)).values();
Note that the above merge function mutates the objects in the source collection. If you want to leave them unchanged instead, you can construct new Foos like this:
public Foo merge(Foo another){
return new Foo(this.getName(), null, this.getCount() + another.getCount());
}
Further, if for some reason you explicitly require a List<Foo> instead of a Collection<Foo>, you can use the ArrayList copy constructor:
List<Foo> resultList = new ArrayList<>(resultSet);
Update
As @Federico has mentioned in the comments, the last merge function above is expensive because it creates objects that could be avoided. So, as he has suggested, a friendlier alternative is to keep the first merge function shown above and change your stream query to this:
Collection<Foo> resultSet = foosToSum.stream()
.collect(Collectors.toMap(Foo::getName,
f -> new Foo(f.getName(), null, f.getCount()), Foo::merge))
.values();
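A self-contained sketch of this last variant, for anyone who wants to run it. The Foo class here is only a stand-in: its constructor, its getters, and the choice to keep the later date inside merge are assumptions, and the LinkedHashMap supplier is there only to make the output order predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.stream.Collectors;

class FooMergeDemo {

    // Stand-in for the question's Foo; constructor and getters are assumed.
    static class Foo {
        private final String name;
        private String date;
        private int count;

        Foo(String name, String date, int count) {
            this.name = name;
            this.date = date;
            this.count = count;
        }

        String getName() { return name; }
        String getDate() { return date; }
        int getCount() { return count; }

        // Mutating merge: adds the other count and keeps the later date.
        Foo merge(Foo another) {
            this.count += another.getCount();
            this.date = another.getDate();
            return this;
        }
    }

    static List<Foo> sumByName(List<Foo> foos) {
        return new ArrayList<>(foos.stream()
                .collect(Collectors.toMap(
                        Foo::getName,
                        // Copy each element first, so merge() only ever mutates the copy.
                        f -> new Foo(f.getName(), f.getDate(), f.getCount()),
                        Foo::merge,
                        LinkedHashMap::new)) // keeps keys in first-seen order
                .values());
    }

    public static void main(String[] args) {
        List<Foo> input = Arrays.asList(
                new Foo("a", "2020-01-01", 1),
                new Foo("b", "2020-01-02", 5),
                new Foo("a", "2020-01-03", 2));
        for (Foo f : sumByName(input)) {
            System.out.println(f.getName() + " " + f.getDate() + " " + f.getCount());
        }
    }
}
```

Because the value mapper copies each element, the Foos in the input list keep their original counts after the call.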

Yes, you could use a downstream collector in your groupingBy to sum the counts immediately. Afterwards, stream the map's entries and map each one to a Foo.
foosToSum.stream()
.collect(Collectors.groupingBy(Foo::getName,
Collectors.summingInt(Foo::getCount)))
.entrySet()
.stream()
.map(entry -> new Foo(entry.getKey(), null, entry.getValue()))
.collect(Collectors.toList());
A more efficient solution avoids grouping into a map only to stream it again immediately, at the cost of some readability (in my opinion):
foosToSum.stream()
.collect(Collectors.groupingBy(Foo::getName,
Collectors.reducing(new Foo(), // identity element; requires a no-arg constructor
// take the name from foo2, because foo1 is the nameless identity on the first step
(foo1, foo2) -> new Foo(foo2.getName(), null, foo1.getCount() + foo2.getCount()))))
.values();
By reducing Foos instead of ints, we carry the name along and can sum directly into a Foo.

Related

Which is the more efficient way to bifurcate a list of nested object lists: Java 8 flatMap vs. forEach?

I've a Terminal object:
class Terminal{
List<TerminalPeriodApplicability> periods= new ArrayList<>();
//few other attributes
//getters & setters
}
TerminalPeriodApplicability object:
class TerminalPeriodApplicability{
String name;
boolean isRequired;
//getters & setters
}
I want to bifurcate names of TerminalPeriodApplicability into optional & mandatory Sets based on isRequired's value.
I've tried two approaches: one with two nested forEach calls and the other with flatMap.
List<Terminal> terminals= getTerminals();
Set<String> mandatoryPeriods = new HashSet<>();
Set<String> optionalPeriods = new HashSet<>();
Approach 1:
terminals.forEach(terminal -> terminal.getApplicablePeriods().forEach(period->{
if(period.getIsRequired())
mandatoryPeriods.add(period.name());
else
optionalPeriods.add(period.name());
}));
Approach 2:
List<TerminalPeriodApplicability> applicablePeriods = terminals
.stream()
.flatMap(terminal -> terminal.getApplicablePeriods().stream())
.collect(Collectors.toList());
applicablePeriods.forEach(period->{
if(period.getIsRequired())
mandatoryPeriods.add(period.name());
else
optionalPeriods.add(period.name());
});
I would like to know which approach is more efficient in terms of time & space complexity. Or is there any better solution to solve this problem?

You can use a different collector in your flatMap version (partitioningBy instead of toList) and avoid the second forEach:
Map<Boolean,List<TerminalPeriodApplicability>> periods = terminals
.stream()
.flatMap(terminal -> terminal.getApplicablePeriods().stream())
.collect(Collectors.partitioningBy(TerminalPeriodApplicability::getIsRequired));
or
Map<Boolean,Set<TerminalPeriodApplicability>> periods = terminals
.stream()
.flatMap(terminal -> terminal.getApplicablePeriods().stream())
.collect(Collectors.partitioningBy(TerminalPeriodApplicability::getIsRequired,
Collectors.toSet()));
Correction: Since you want the two Sets to contain Strings instead of TerminalPeriodApplicability instances, it should be:
Map<Boolean,Set<String>> periods = terminals
.stream()
.flatMap(terminal -> terminal.getApplicablePeriods().stream())
.collect(Collectors.partitioningBy(TerminalPeriodApplicability::getIsRequired,
Collectors.mapping(TerminalPeriodApplicability::name,
Collectors.toSet())));
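A runnable sketch of the partitioningBy plus mapping combination, in case it helps to see it end to end. The Period and Terminal classes are simplified stand-ins for the question's classes, and the accessor names getName, isRequired and getPeriods are assumptions, not the OP's real API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class PartitionDemo {

    // Simplified stand-in for TerminalPeriodApplicability (accessor names assumed).
    static class Period {
        private final String name;
        private final boolean required;

        Period(String name, boolean required) {
            this.name = name;
            this.required = required;
        }

        String getName() { return name; }
        boolean isRequired() { return required; }
    }

    // Simplified stand-in for Terminal.
    static class Terminal {
        private final List<Period> periods;

        Terminal(Period... periods) {
            this.periods = Arrays.asList(periods);
        }

        List<Period> getPeriods() { return periods; }
    }

    // Key true holds the mandatory names, key false the optional ones.
    static Map<Boolean, Set<String>> splitNames(List<Terminal> terminals) {
        return terminals.stream()
                .flatMap(t -> t.getPeriods().stream())
                .collect(Collectors.partitioningBy(Period::isRequired,
                        Collectors.mapping(Period::getName, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<Terminal> terminals = Arrays.asList(
                new Terminal(new Period("Q1", true), new Period("Q2", false)),
                new Terminal(new Period("Q1", true), new Period("Q3", false)));
        Map<Boolean, Set<String>> split = splitNames(terminals);
        System.out.println("mandatory = " + split.get(true));
        System.out.println("optional  = " + split.get(false));
    }
}
```

The map returned by partitioningBy always contains both the true and the false key, so neither get call can return null even when one side is empty.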

Use java stream to group by 2 keys on the same type

Using Java streams, how can I create a Map from a List that is indexed by 2 keys of the same class?
Here is a code example. I would like the map "personByName" to contain every person under both their firstName and their lastName, so that I get all 3 "Steves", whether Steve is the first name or the last name. I don't know how to combine the two Collectors.groupingBy calls.
public static class Person {
final String firstName;
final String lastName;
protected Person(String firstName, String lastName) {
super();
this.firstName = firstName;
this.lastName = lastName;
}
public String getFirstName() {
return firstName;
}
public String getLastName() {
return lastName;
}
}
@Test
public void testStream() {
List<Person> persons = Arrays.asList(
new Person("Bill", "Gates"),
new Person("Bill", "Steve"),
new Person("Steve", "Jobs"),
new Person("Steve", "Wozniac"));
Map<String, Set<Person>> personByFirstName = persons.stream().collect(Collectors.groupingBy(Person::getFirstName, Collectors.toSet()));
Map<String, Set<Person>> personByLastName = persons.stream().collect(Collectors.groupingBy(Person::getLastName, Collectors.toSet()));
Map<String, Set<Person>> personByName = persons.stream().collect(Collectors.groupingBy(Person::getLastName, Collectors.toSet())); // This is wrong, I want both first and last name
Assert.assertEquals("we should search by firstName AND lastName", 3, personByName.get("Steve").size()); // This fails
}
I found a workaround by looping on the 2 maps, but it is not stream-oriented.

You can do it like this:
Map<String, Set<Person>> personByName = persons.stream()
.flatMap(p -> Stream.of(new SimpleEntry<>(p.getFirstName(), p),
new SimpleEntry<>(p.getLastName(), p)))
.collect(Collectors.groupingBy(SimpleEntry::getKey,
Collectors.mapping(SimpleEntry::getValue, Collectors.toSet())));
Assuming you add a toString() method to the Person class, you can then see the result using:
List<Person> persons = Arrays.asList(
new Person("Bill", "Gates"),
new Person("Bill", "Steve"),
new Person("Steve", "Jobs"),
new Person("Steve", "Wozniac"));
// code above here
personByName.entrySet().forEach(System.out::println);
Output
Steve=[Steve Wozniac, Bill Steve, Steve Jobs]
Jobs=[Steve Jobs]
Bill=[Bill Steve, Bill Gates]
Wozniac=[Steve Wozniac]
Gates=[Bill Gates]

You could merge the two Map<String, Set<Person>> instances, for example:
Map<String, Set<Person>> personByFirstName =
persons.stream()
.collect(Collectors.groupingBy(
Person::getFirstName,
Collectors.toCollection(HashSet::new))
);
persons.stream()
.collect(Collectors.groupingBy(Person::getLastName, Collectors.toSet()))
.forEach((str, set) -> personByFirstName.merge(str, set, (s1, s2) -> {
s1.addAll(s2);
return s1;
}));
// personByFirstName contains now all personByName

One way would be to use Collectors.teeing, which was added in JDK 12:
Map<String, List<Person>> result = persons.stream()
.collect(Collectors.teeing(
Collectors.groupingBy(Person::getFirstName,
Collectors.toCollection(ArrayList::new)),
Collectors.groupingBy(Person::getLastName),
(byFirst, byLast) -> {
byLast.forEach((last, peopleList) ->
byFirst.computeIfAbsent(last, k -> new ArrayList<>())
.addAll(peopleList));
return byFirst;
}));
Collectors.teeing collects to two separate collectors and then merges the results into a final value. From the docs:
Returns a Collector that is a composite of two downstream collectors. Every element passed to the resulting collector is processed by both downstream collectors, then their results are merged using the specified merge function into the final result.
So, the above code collects to a map by first name and also to a map by last name and then merges both maps into a final map by iterating the byLast map and merging each one of its entries into the byFirst map by means of the Map.computeIfAbsent method. Finally, the byFirst map is returned.
Note that I've collected to a Map<String, List<Person>> instead of to a Map<String, Set<Person>> to keep the example simple. If you actually need a map of sets, you could do it as follows:
Map<String, Set<Person>> result = persons.stream()
.collect(Collectors.teeing(
Collectors.groupingBy(Person::getFirstName,
Collectors.toCollection(LinkedHashSet::new)),
Collectors.groupingBy(Person::getLastName, Collectors.toSet()),
(byFirst, byLast) -> {
byLast.forEach((last, peopleSet) ->
byFirst.computeIfAbsent(last, k -> new LinkedHashSet<>())
.addAll(peopleSet));
return byFirst;
}));
Keep in mind that if you need to have Set<Person> as the values of the maps, the Person class must implement the hashCode and equals methods consistently.

If you want a real stream-oriented solution, make sure you don't produce any large intermediate collections, else most of the sense of streams is lost.
If you just want to filter all Steves, filter first and collect later:
persons.stream()
.filter(p -> p.getFirstName().equals("Steve") || p.getLastName().equals("Steve"))
.collect(Collectors.toList());
If you want to do complex things with a stream element, e.g. put an element into multiple collections, or in a map under several keys, just consume a stream using forEach, and write inside it whatever handling logic you want.
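As a rough sketch of that forEach idea, the following indexes each person under both names. The Person class mirrors the one in the question; the method name indexByEitherName and the use of Map.computeIfAbsent are my own choices for illustration. It calls forEach on the list directly, which for this purpose behaves like forEach on a sequential stream.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NameIndexDemo {

    static class Person {
        private final String firstName;
        private final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        String getFirstName() { return firstName; }
        String getLastName() { return lastName; }

        @Override
        public String toString() { return firstName + " " + lastName; }
    }

    // Puts every person into the map twice: once per name.
    static Map<String, Set<Person>> indexByEitherName(List<Person> persons) {
        Map<String, Set<Person>> index = new HashMap<>();
        persons.forEach(p -> {
            index.computeIfAbsent(p.getFirstName(), k -> new HashSet<>()).add(p);
            index.computeIfAbsent(p.getLastName(), k -> new HashSet<>()).add(p);
        });
        return index;
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Bill", "Gates"),
                new Person("Bill", "Steve"),
                new Person("Steve", "Jobs"),
                new Person("Steve", "Wozniac"));
        System.out.println(indexByEitherName(persons).get("Steve"));
    }
}
```

No intermediate collection is built here; the only allocation besides the result map is one HashSet per distinct name.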

You cannot key your maps by multiple values. For what you want to achieve, you have three options:
1. Combine your "personByFirstName" and "personByLastName" maps. You will have duplicate values (e.g. Bill Gates will be in the map under the key Bill and also under the key Gates). @Andreas's answer gives a good stream-based way to do this.
2. Use an indexing library like Lucene and index all your Person objects by first name and last name.
3. The stream approach: it will not be performant on large data sets, but you can stream your collection and use filter to get your matches:
persons
.stream()
.filter(p -> p.getFirstName().equals("Steve")
|| p.getLastName().equals("Steve"))
.collect(Collectors.toList());
(I've written the syntax from memory so you might have to tweak it).

If I got it right you want to map each Person twice, once for the first name and once for the last.
To do this you have to double your stream somehow. Assuming Couple is some existing 2-tuple (Guava or Vavr have some nice implementations), you could:
persons.stream()
.map(p -> new Couple(new Couple(p.firstName, p), new Couple(p.lastName, p)))
.flatMap(c -> Stream.of(c.left, c.right)) // Stream of Couple(String, Person)
.map(c -> new Couple(c.left, new ArrayList<>(Arrays.asList(c.right)))) // mutable list, so it can grow
.collect(Collectors.toMap(Couple::getLeft, Couple::getRight, (a, b) -> { a.addAll(b); return a; }));
I didn't test it, but the concept is: make a stream of (name, person), (surname, person)... for every person, then simply key the map on the left value of each couple. The list is there so that the value is a collection. If you need a Set, change the last line to .collect(Collectors.toMap(Couple::getLeft, c -> new HashSet<>(c.getRight()), (a, b) -> { a.addAll(b); return a; }))

Try SetMultimap, either from Google Guava or from my library abacus-common:
SetMultimap<String, Person> result = Multimaps.newSetMultimap(new HashMap<>(), () -> new HashSet<>()); // by Google Guava.
// Or result = N.newSetMultimap(); // By Abacus-Util
persons.forEach(p -> {
result.put(p.getFirstName(), p);
result.put(p.getLastName(), p);
});

Java8 lambda approach

I have this piece of code that filters a list of objects based on a set of String identifiers passed in and returns a map of string id to object. Something similar to the following:
class Foo {
String id;
String getId() { return id; }
}
// Get map of id --> Foo objects whose string are in fooStr
Map<String,Foo> filterMethod (Set<String> fooStr) {
List<Foo> fDefs; // list of Foo objects
Map<String,Foo> fObjMap = new HashMap<String, Foo>(); // map of String to Foo objects
for (Foo f : fDefs) {
if (fooStr.contains(f.getId()))
fObjMap.put(f.getId(),f);
}
return (fObjMap);
}
Is there a better Java8 way of doing this using filter or map?
I could not figure it out and tried searching on stackoverflow but could not find any hints, so am posting as a question.
Any help is much appreciated.
~Ash

Just use the filter operator with the same predicate as above and then the toMap collector to build the map. Also notice that your iterative solution precludes any possibility of key conflict, hence, I have omitted that, too.
Map<String, Foo> idToFooMap = fDefs.stream()
.filter(f -> fooStr.contains(f.getId()))
.collect(Collectors.toMap(Foo::getId, f -> f));

When including items conditionally in the final output use filter and when going from stream to a map use Collectors.toMap. Here's what you end up with:
Map<String,Foo> filterMethod (final Set<String> fooStr) {
List<Foo> fDefs; // list of Foo objects
return fDefs.stream()
.filter(foo -> fooStr.contains(foo.getId()))
.collect(Collectors.toMap(Foo::getId, Function.identity()));
}

Though ggreiner has already provided a working solution, when there may be duplicates you had better handle them by including a mergeFunction.
If you use Collectors.toMap(keyMapper, valueMapper) directly, sooner or later you will run into the following issue, as described in its documentation:
If the mapped keys contains duplicates (according to Object.equals(Object)), an IllegalStateException is thrown when the collection operation is performed. If the mapped keys may have duplicates, use toMap(Function, Function, BinaryOperator) instead.
Based on the OP's solution, I think it would be better to use:
import static java.util.stream.Collectors.*; // save some typing and make it cleaner;
fDefs.stream()
.filter(foo -> fooStr.contains(foo.getId()))
.collect(toMap(Foo::getId, foo -> foo, (oldFoo, newFoo) -> newFoo));
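To make the difference concrete, here is a small sketch that runs both toMap variants on a list with a duplicate id. The Foo class with a label field and the method names indexStrict and indexLastWins are hypothetical, introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapDuplicateDemo {

    // Hypothetical Foo with an extra label, so the two duplicates can be told apart.
    static class Foo {
        private final String id;
        private final String label;

        Foo(String id, String label) {
            this.id = id;
            this.label = label;
        }

        String getId() { return id; }
        String getLabel() { return label; }
    }

    // Two-argument toMap: throws IllegalStateException on a duplicate key.
    static Map<String, Foo> indexStrict(List<Foo> foos) {
        return foos.stream()
                .collect(Collectors.toMap(Foo::getId, f -> f));
    }

    // Three-argument toMap: the merge function decides which value survives.
    static Map<String, Foo> indexLastWins(List<Foo> foos) {
        return foos.stream()
                .collect(Collectors.toMap(Foo::getId, f -> f, (oldFoo, newFoo) -> newFoo));
    }

    public static void main(String[] args) {
        List<Foo> foos = Arrays.asList(new Foo("x", "first"), new Foo("x", "second"));
        try {
            indexStrict(foos);
        } catch (IllegalStateException e) {
            System.out.println("strict variant failed on the duplicate key");
        }
        System.out.println(indexLastWins(foos).get("x").getLabel());
    }
}
```

Keeping the new value matches what the original loop with Map.put does; returning oldFoo instead would keep the first occurrence.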

Maybe something like this?
Map<String,Foo> filterMethod (Set<String> fooStr) {
List<Foo> fDefs; // get this list from somewhere
Map<String, Foo> fObjMap = new HashMap<> ();
fDefs.stream()
.filter(foo -> fooStr.contains(foo.getId()))
.forEach(foo -> fObjMap.put(foo.getId(), foo));
return fObjMap;
}

Java Stream multi list iteration

I have 2 lists. 1 list is of Ids and the other list is full of Foo objects, call it list A. The Foo class looks like this:
public class Foo {
private String id;
/* other member variables */
Foo(String id) {
this.id = id;
}
public String getId() {
return id;
}
}
I have a plain list of ids like List<Integer>, call it list B. What I want to do is iterate over list B one element at a time, grab the id, compare it to list A and grab Foo object with the equivalent id and then add the Foo object to a new list, list C.
I'm trying to concatenate streams but I'm new to streams and I'm getting bogged down with all the methods like map, filter, forEach. I'm not sure what to use when.

The straightforward way would be what you have in your post: loop over the ids, select the first Foo having that id and, if one is found, collect it into a List. Put into code, it looks like the following: each id is mapped to the corresponding Foo, which is found by calling findFirst() on the foos having that id. This returns an Optional, and the empty ones are filtered out when no such Foo exists.
List<Integer> ids = Arrays.asList(1, 2, 3);
List<Foo> foos = Arrays.asList(new Foo("2"), new Foo("1"), new Foo("4"));
List<Foo> result =
ids.stream()
.map(id -> foos.stream().filter(foo -> foo.getId().equals(id.toString())).findFirst())
.filter(Optional::isPresent)
.map(Optional::get)
.collect(Collectors.toList());
The big problem with this approach is that you need to traverse the foos list as many times as there are ids to look up. A better solution would be to first create a look-up Map where each id maps to its Foo:
Map<Integer, Foo> map = foos.stream().collect(Collectors.toMap(f -> Integer.valueOf(f.getId()), f -> f));
List<Foo> result = ids.stream().map(map::get).filter(Objects::nonNull).collect(Collectors.toList());
In this case, we look up the Foo and filter out null elements, which mean that no Foo was found.
A completely different approach is not to traverse the ids and search for the Foo, but to filter the Foos whose id is contained in the wanted list of ids. The problem with this approach is that you then have to sort the output list so that its order matches the order of the ids.
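A possible sketch of that filter-then-sort approach, under the assumption that Foo ids are the decimal string form of the Integer ids, as in the example above. The position map and the method name select are my own additions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FilterThenSortDemo {

    static class Foo {
        private final String id;

        Foo(String id) {
            this.id = id;
        }

        String getId() { return id; }
    }

    static List<Foo> select(List<Integer> ids, List<Foo> foos) {
        // Remember where each wanted id first appears in the ids list.
        Map<String, Integer> position = new HashMap<>();
        for (int i = 0; i < ids.size(); i++) {
            position.putIfAbsent(ids.get(i).toString(), i);
        }
        return foos.stream()
                // keep only the Foos whose id is wanted
                .filter(foo -> position.containsKey(foo.getId()))
                // restore the order of the ids list
                .sorted(Comparator.comparingInt((Foo foo) -> position.get(foo.getId())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3);
        List<Foo> foos = Arrays.asList(new Foo("2"), new Foo("1"), new Foo("4"));
        for (Foo foo : select(ids, foos)) {
            System.out.println(foo.getId());
        }
    }
}
```

Unlike the look-up Map version, this keeps every Foo that shares a wanted id, and an id repeated in the ids list does not repeat the Foo in the result.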

I would implement it like this:
List<Foo> list = Arrays.asList(
new Foo("abc"),
new Foo("def"),
new Foo("ghi")
);
List<String> ids = Arrays.asList("abc", "def", "xyz");
//Index Foo by ids
Map<String, Foo> map = list.stream()
.collect(Collectors.toMap(Foo::getId, Function.identity()));
//Iterate on ids, find the corresponding elements in the map
List<Foo> result = ids.stream().map(map::get)
.filter(Objects::nonNull) //Optional...
.collect(Collectors.toList());

Java 8: Convert a map with string values to a list containing a different type

I have this Map:
Map<Integer, Set<String>> map = ...
And I have this class Foo:
class Foo {
int id;
String name;
}
I want to convert the map to List<Foo>. Is there a convenient manner in Java 8 to do this?
Currently, my way is:
List<Foo> list = new ArrayList<>((int) map.values().stream().flatMap(e -> e.stream()).count());
for(Integer id : map.keySet()){
for(String name : map.get(id)){
Foo foo = new Foo(id,name);
list.add(foo);
}
}
I feel it's too cumbersome.

You can have the following:
List<Foo> list = map.entrySet()
.stream()
.flatMap(e -> e.getValue().stream().map(name -> new Foo(e.getKey(), name)))
.collect(toList());
For each entry of the map, we create a Stream of the value and map it to the corresponding Foo and then flatten it using flatMap.

The main reason for your version being cumbersome is that you have decided to calculate the capacity of the ArrayList first. Since this calculation requires iterating over the entire map, there is unlikely to be any benefit in this. You definitely should not do such a thing unless you have proved using a proper benchmark that it is needed.
I can't see anything wrong with just using your original version but with the parameterless constructor for ArrayList instead. If you also get rid of the redundant local variable foo, it's actually fewer characters (and clearer to read) than the stream version.
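For reference, the simplified loop might look like the sketch below. The Foo constructor and getters are assumed, since the question only shows the fields; the class and method names are mine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class MapToListDemo {

    // Foo as in the question, with an assumed constructor and getters.
    static class Foo {
        private final int id;
        private final String name;

        Foo(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static List<Foo> toFoos(Map<Integer, Set<String>> map) {
        // No pre-computed capacity: the list grows as needed.
        List<Foo> list = new ArrayList<>();
        for (Integer id : map.keySet()) {
            for (String name : map.get(id)) {
                list.add(new Foo(id, name));
            }
        }
        return list;
    }

    public static void main(String[] args) {
        // Sorted collections are used here only to make the printed order predictable.
        Map<Integer, Set<String>> map = new TreeMap<>();
        map.put(1, new TreeSet<>(Arrays.asList("a", "b")));
        map.put(2, new TreeSet<>(Arrays.asList("c")));
        for (Foo foo : toFoos(map)) {
            System.out.println(foo.getId() + " " + foo.getName());
        }
    }
}
```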

final Map<Integer, Set<String>> map = new HashMap<>();
map
.entrySet()
.stream()
.flatMap(e -> e.getValue().stream().map(s -> new Foo(e.getKey(), s)))
.collect(Collectors.toList());
