Java API Streams collecting stream in Map where value is a TreeSet - java

There is a Student class which has name, surname, age fields and getters for them.
Given a stream of Student objects.
How can I invoke collect so that it returns a Map whose keys are the students' ages and whose values are TreeSets containing the surnames of the students of that age?
I wanted to use Collectors.toMap(), but got stuck.
I thought I could pass a third parameter to the toMap method, like this:
stream().collect(Collectors.toMap(Student::getAge, Student::getSurname, new TreeSet<String>()))

students.stream()
.collect(Collectors.groupingBy(
Student::getAge,
Collectors.mapping(
Student::getSurname,
Collectors.toCollection(TreeSet::new))
))

Eugene has provided the best solution to what you want as it's the perfect job for the groupingBy collector.
Another solution using the toMap collector would be:
Map<Integer, TreeSet<String>> collect =
students.stream()
.collect(Collectors.toMap(Student::getAge,
s -> new TreeSet<>(Arrays.asList(s.getSurname())),
(l, l1) -> {
l.addAll(l1);
return l;
}));
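For reference, here is a minimal self-contained sketch of the groupingBy approach. The Student class below is a hypothetical stand-in with only the two fields the collector needs, and the sample data is made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.stream.Collectors;

class SurnamesByAge {
    // Hypothetical minimal Student; the class in the question also has a name field.
    static final class Student {
        private final String surname;
        private final int age;

        Student(String surname, int age) {
            this.surname = surname;
            this.age = age;
        }

        String getSurname() { return surname; }
        int getAge() { return age; }
    }

    // Groups students by age and collects the surnames of each group into a TreeSet.
    static Map<Integer, TreeSet<String>> surnamesByAge(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(
                        Student::getAge,
                        Collectors.mapping(
                                Student::getSurname,
                                Collectors.toCollection(TreeSet::new))));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Smith", 20),
                new Student("Adams", 20),
                new Student("Brown", 21));
        // Surnames within each age group are sorted because they live in a TreeSet.
        System.out.println(surnamesByAge(students));
    }
}
```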

Related

Building immutable map using a list of objects

I have a list of students and a delegate that has a function to get a list of servers for a given student (getServersForStudent(student)). I would like to create a map for a list of students indexed for each server. A student can be in many servers.
private Map<Server, List<Student>> getStudentsByServer(List<Student> students) {
    Map<Server, List<Student>> map = new HashMap<>();
    students.forEach(student -> {
        List<Server> servers = delegate.getServersForStudent(student);
        if (!servers.isEmpty()) {
            // computeIfAbsent returns the list itself; putIfAbsent returns null for a new key
            servers.forEach(server -> map.computeIfAbsent(server, k -> new ArrayList<>()).add(student));
        }
    });
    return map;
}
This works perfectly, but I would like to refactor this to use streams in order to make an immutable collection instead. I tried doing this with groupingBy, but I wasn't able to get the right result:
students
.stream()
.collect(
Collectors.groupingBy(
student -> delegate.getServersForStudent(student),
Collectors.mapping(Function.identity(), Collectors.toList())
)
);
This grouping doesn't have the same result as above since it is grouping by lists. Does anyone have any suggestions on how to best do this with Java streams?
Streams are not required to return an immutable collection; simply copy your collection into an immutable one in the end or wrap it in an unmodifiable wrapper:
private Map<Server, List<Student>> getStudentsByServer(final List<Student> students) {
    final Map<Server, List<Student>> map = new HashMap<>();
    for (final Student student : students) {
        for (final Server server : delegate.getServersForStudent(student)) {
            // computeIfAbsent takes a mapping function, not a ready-made list
            map.computeIfAbsent(server, k -> new ArrayList<>())
                    .add(student);
        }
    }
    // wrap:
    // return Collections.unmodifiableMap(map);
    // or copy:
    return Map.copyOf(map);
}
If you really want to do it stream-based, you have to first create a stream of tuples (student, server), which you can then group. Java does not have a specific tuple type, but short of creating a custom type, you can misuse Map.Entry<K, V> for that:
students
.stream()
.flatMap(student -> delegate.getServersForStudent(student)
.stream()
.map(server -> Map.entry(student, server)))
.collect(
Collectors.groupingBy(
tuple -> tuple.getValue(),
Collectors.mapping(
tuple -> tuple.getKey(),
Collectors.toList())));
Note that the collections returned by Collectors make no promises about (im)mutability. If you require immutability, you have to add a finishing step using Collectors.collectingAndThen:
.collect(
Collectors.collectingAndThen(
Collectors.groupingBy(
tuple -> tuple.getValue(),
Collectors.mapping(
tuple -> tuple.getKey(),
Collectors.toList())),
Map::copyOf));
// or wrap with: Collections::unmodifiableMap
It is also worth mentioning that an unmodifiable/immutable map as in the example above still allows the lists of students it holds to be modified, because Collectors.toList() currently returns an ArrayList. If you require the values of the map to be immutable too, you have to take care of that yourself, e.g. by using Collectors.toUnmodifiableList() (Java 10+) or by copying/wrapping each list again.
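To make the pieces above concrete, here is a self-contained sketch that builds an immutable map with unmodifiable lists as values. It is only an illustration: students and servers are plain strings, and the static SERVERS_BY_STUDENT map is a hypothetical stand-in for delegate.getServersForStudent(student).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StudentsByServer {
    // Hypothetical replacement for the delegate: student name -> server names.
    static final Map<String, List<String>> SERVERS_BY_STUDENT = new HashMap<>();
    static {
        SERVERS_BY_STUDENT.put("alice", Arrays.asList("s1", "s2"));
        SERVERS_BY_STUDENT.put("bob", Arrays.asList("s2"));
        SERVERS_BY_STUDENT.put("carol", Collections.emptyList());
    }

    static List<String> getServersForStudent(String student) {
        return SERVERS_BY_STUDENT.getOrDefault(student, Collections.emptyList());
    }

    static Map<String, List<String>> studentsByServer(List<String> students) {
        return students.stream()
                // one (student, server) pair per server the student is on
                .flatMap(student -> getServersForStudent(student).stream()
                        .map(server -> Map.entry(student, server)))
                .collect(Collectors.collectingAndThen(
                        Collectors.groupingBy(
                                tuple -> tuple.getValue(),
                                Collectors.mapping(
                                        tuple -> tuple.getKey(),
                                        // unmodifiable value lists (Java 10+)
                                        Collectors.toUnmodifiableList())),
                        // immutable copy of the outer map (Java 10+)
                        Map::copyOf));
    }

    public static void main(String[] args) {
        Map<String, List<String>> result =
                studentsByServer(Arrays.asList("alice", "bob", "carol"));
        System.out.println(result.get("s2")); // prints [alice, bob]
    }
}
```

Students without any server (carol here) simply contribute no pairs, so they never appear in the result.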

How to preserve all Subgroups while applying nested groupingBy collector

I am trying to group a list of employees by the gender and department.
How do I ensure all departments are included in a sorted order for each gender, even when the relevant gender count is zero?
Currently, I have the following code and output
employeeRepository.findAll().stream()
.collect(Collectors.groupingBy(Employee::getGender,
Collectors.groupingBy(Employee::getDepartment,
Collectors.counting())));
//output
//{MALE={HR=1, IT=1}, FEMALE={MGMT=1}}
Preferred output is:
{MALE={HR=1, IT=1, MGMT=0}, FEMALE={HR=0, IT=0, MGMT=1}}
To achieve that, first you have to group by department, and only then by gender, not the opposite.
The first collector, groupingBy(Employee::getDepartment, downstream), splits the data set into groups by department. Its downstream collector, partitioningBy(employee -> employee.getGender() == Employee.Gender.MALE, downstream), divides the employees of each department into two parts by gender. Unlike groupingBy, partitioningBy always produces both the true and the false key, which is where the zero counts come from. Finally, Collectors.counting() as the innermost downstream provides the number of employees of each gender for every department.
So the intermediate map produced by the collect() operation will be of type Map<String, Map<Boolean, Long>>: employee count by gender (Boolean) for each department (for simplicity, department is a plain string).
The next step is to transform this map into Map<Employee.Gender, Map<String, Long>>: employee count by department for each gender.
My approach is to create a stream over the entry set and replace each entry with a new one whose key is a gender. To preserve the department information, its value is in turn an entry with the department as its key and the count for that department as its value.
Then collect the stream of entries with groupingBy on the entry key, apply mapping as a downstream collector to extract the nested entry, and finally apply Collectors.toMap() to collect the entries of type Map.Entry<String, Long> into a map.
As for "all departments are included in a sorted order": to ensure the order in the nested map (count by department), a NavigableMap should be used.
To do that, use the flavor of toMap() that expects a mapFactory (it also expects a mergeFunction, which isn't really useful for this task since there will be no duplicates, but it has to be provided as well).
public static void main(String[] args) {
List<Employee> employeeRepository =
List.of(new Employee("IT", Employee.Gender.MALE),
new Employee("HR", Employee.Gender.MALE),
new Employee("MGMT", Employee.Gender.FEMALE));
Map<Employee.Gender, NavigableMap<String, Long>> departmentCountByGender = employeeRepository
.stream()
.collect(Collectors.groupingBy(Employee::getDepartment, // Map<String, Map<Boolean, Long>> - department to *employee count* by gender
Collectors.partitioningBy(employee -> employee.getGender() == Employee.Gender.MALE,
Collectors.counting())))
.entrySet().stream()
.flatMap(entryDep -> entryDep.getValue().entrySet().stream()
.map(entryGen -> Map.entry(entryGen.getKey() ? Employee.Gender.MALE : Employee.Gender.FEMALE,
Map.entry(entryDep.getKey(), entryGen.getValue()))))
.collect(Collectors.groupingBy(Map.Entry::getKey,
Collectors.mapping(Map.Entry::getValue,
Collectors.toMap(Map.Entry::getKey,
Map.Entry::getValue,
(v1, v2) -> v1,
TreeMap::new))));
System.out.println(departmentCountByGender);
}
Dummy Employee class used for demo-purposes:
class Employee {
enum Gender {FEMALE, MALE};
private String department;
private Gender gender;
// etc.
// constructor, getters
}
Output
{FEMALE={HR=0, IT=0, MGMT=1}, MALE={HR=1, IT=1, MGMT=0}}
You can continue to work on the result of your code:
List<String> deptList = employees.stream().map(Employee::getDepartment).sorted().toList();
Map<Gender, Map<String, Long>> tmpResult = employees.stream()
.collect(Collectors.groupingBy(Employee::getGender, Collectors.groupingBy(Employee::getDepartment, Collectors.counting())));
Map<Gender, Map<String, Long>> finalResult = new HashMap<>();
for (Map.Entry<Gender, Map<String, Long>> entry : tmpResult.entrySet()) {
Map<String, Long> val = new LinkedHashMap<>();
for (String dept : deptList) {
val.put(dept, entry.getValue().getOrDefault(dept, 0L));
}
finalResult.put(entry.getKey(), val);
}
System.out.print(finalResult);
The readability and maintainability of the code will probably suffer if you try to achieve the result in a single statement.
However, there is an alternative if you don't mind using a third-party library: abacus-common
Map<Gender, Map<String, Integer>> result = Stream.of(employees)
.groupByToEntry(Employee::getGender, MoreCollectors.countingIntBy(Employee::getDepartment)) // step 1) group by gender
.mapValue(it -> Maps.newMap(deptList, Fn.identity(), dept -> it.getOrDefault(dept, 0), IntFunctions.ofLinkedHashMap())) // step 2) process the value.
.toMap();
Disclosure: I'm the developer of abacus-common

Use java stream to group by 2 keys on the same type

Using Java streams, how can I create a Map from a List that is indexed by 2 keys of the same class?
Here is a code example. I would like the map "personByName" to contain every person under their firstName OR lastName, so I would like to get the 3 "Steves", whether "Steve" is their firstName or their lastName. I don't know how to combine the 2 Collectors.groupingBy calls.
public static class Person {
final String firstName;
final String lastName;
protected Person(String firstName, String lastName) {
super();
this.firstName = firstName;
this.lastName = lastName;
}
public String getFirstName() {
return firstName;
}
public String getLastName() {
return lastName;
}
}
@Test
public void testStream() {
List<Person> persons = Arrays.asList(
new Person("Bill", "Gates"),
new Person("Bill", "Steve"),
new Person("Steve", "Jobs"),
new Person("Steve", "Wozniac"));
Map<String, Set<Person>> personByFirstName = persons.stream().collect(Collectors.groupingBy(Person::getFirstName, Collectors.toSet()));
Map<String, Set<Person>> personByLastName = persons.stream().collect(Collectors.groupingBy(Person::getLastName, Collectors.toSet()));
Map<String, Set<Person>> personByName = persons.stream().collect(Collectors.groupingBy(Person::getLastName, Collectors.toSet())); // This is wrong, I want both first and last name
Assert.assertEquals("we should search by firstName AND lastName", 3, personByName.get("Steve").size()); // This fails
}
I found a workaround by looping on the 2 maps, but it is not stream-oriented.
You can do it like this:
Map<String, Set<Person>> personByName = persons.stream()
.flatMap(p -> Stream.of(new SimpleEntry<>(p.getFirstName(), p),
new SimpleEntry<>(p.getLastName(), p)))
.collect(Collectors.groupingBy(SimpleEntry::getKey,
Collectors.mapping(SimpleEntry::getValue, Collectors.toSet())));
Assuming you add a toString() method to the Person class, you can then see the result using:
List<Person> persons = Arrays.asList(
new Person("Bill", "Gates"),
new Person("Bill", "Steve"),
new Person("Steve", "Jobs"),
new Person("Steve", "Wozniac"));
// code above here
personByName.entrySet().forEach(System.out::println);
Output
Steve=[Steve Wozniac, Bill Steve, Steve Jobs]
Jobs=[Steve Jobs]
Bill=[Bill Steve, Bill Gates]
Wozniac=[Steve Wozniac]
Gates=[Bill Gates]
You could merge the two Map<String, Set<Person>> for example
Map<String, Set<Person>> personByFirstName =
persons.stream()
.collect(Collectors.groupingBy(
Person::getFirstName,
Collectors.toCollection(HashSet::new))
);
persons.stream()
.collect(Collectors.groupingBy(Person::getLastName, Collectors.toSet()))
.forEach((str, set) -> personByFirstName.merge(str, set, (s1, s2) -> {
s1.addAll(s2);
return s1;
}));
// personByFirstName contains now all personByName
One way would be to use Collectors.teeing, which was added in JDK 12:
Map<String, List<Person>> result = persons.stream()
.collect(Collectors.teeing(
Collectors.groupingBy(Person::getFirstName,
Collectors.toCollection(ArrayList::new)),
Collectors.groupingBy(Person::getLastName),
(byFirst, byLast) -> {
byLast.forEach((last, peopleList) ->
byFirst.computeIfAbsent(last, k -> new ArrayList<>())
.addAll(peopleList));
return byFirst;
}));
Collectors.teeing collects to two separate collectors and then merges the results into a final value. From the docs:
Returns a Collector that is a composite of two downstream collectors. Every element passed to the resulting collector is processed by both downstream collectors, then their results are merged using the specified merge function into the final result.
So, the above code collects to a map by first name and also to a map by last name and then merges both maps into a final map by iterating the byLast map and merging each one of its entries into the byFirst map by means of the Map.computeIfAbsent method. Finally, the byFirst map is returned.
Note that I've collected to a Map<String, List<Person>> instead of to a Map<String, Set<Person>> to keep the example simple. If you actually need a map of sets, you could do it as follows:
Map<String, Set<Person>> result = persons.stream()
.collect(Collectors.teeing(
Collectors.groupingBy(Person::getFirstName,
Collectors.toCollection(LinkedHashSet::new)),
Collectors.groupingBy(Person::getLastName, Collectors.toSet()),
(byFirst, byLast) -> {
byLast.forEach((last, peopleSet) ->
byFirst.computeIfAbsent(last, k -> new LinkedHashSet<>())
.addAll(peopleSet));
return byFirst;
}));
Keep in mind that if you need to have Set<Person> as the values of the maps, the Person class must implement the hashCode and equals methods consistently.
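As a sketch of what "consistently" means here: equals and hashCode should be based on the same fields, so that two equal persons land in the same hash bucket and are treated as one element. The Person class below is a simplified stand-in for the one in the question, and comparing by firstName and lastName is an assumption about what identifies a person.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class PersonEquality {
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        // Two persons are equal when both names match.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(firstName, other.firstName)
                    && Objects.equals(lastName, other.lastName);
        }

        // Uses exactly the fields that equals compares.
        @Override
        public int hashCode() {
            return Objects.hash(firstName, lastName);
        }
    }

    public static void main(String[] args) {
        Set<Person> set = new HashSet<>();
        set.add(new Person("Steve", "Jobs"));
        set.add(new Person("Steve", "Jobs")); // equal to the first one, not added again
        System.out.println(set.size()); // prints 1
    }
}
```

Without these overrides, the two "Steve Jobs" instances would be distinct set elements, because Object's default equals compares identity.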
If you want a real stream-oriented solution, make sure you don't produce any large intermediate collections, else most of the sense of streams is lost.
If you just want to filter all Steves, filter first and collect later:
persons.stream()
    .filter(p -> p.getFirstName().equals("Steve") || p.getLastName().equals("Steve"))
    .collect(Collectors.toList());
If you want to do complex things with a stream element, e.g. put an element into multiple collections, or in a map under several keys, just consume a stream using forEach, and write inside it whatever handling logic you want.
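A minimal sketch of that forEach idea, assuming a simplified Person class with public fields and a plain HashMap as the target. Because the lambda mutates shared state, this is only appropriate for a sequential stream.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class IndexByBothNames {
    // Simplified stand-in for the Person class from the question.
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        @Override
        public String toString() {
            return firstName + " " + lastName;
        }
    }

    // Registers every person under both the first name and the last name.
    static Map<String, Set<Person>> indexByName(List<Person> persons) {
        Map<String, Set<Person>> byName = new HashMap<>();
        // Sequential stream only: HashMap is not safe for concurrent mutation.
        persons.stream().forEach(p -> {
            byName.computeIfAbsent(p.firstName, k -> new HashSet<>()).add(p);
            byName.computeIfAbsent(p.lastName, k -> new HashSet<>()).add(p);
        });
        return byName;
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Bill", "Gates"),
                new Person("Bill", "Steve"),
                new Person("Steve", "Jobs"),
                new Person("Steve", "Wozniac"));
        System.out.println(indexByName(persons).get("Steve").size()); // prints 3
    }
}
```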
You cannot key your maps by multiple values. For what you want to achieve, you have three options:
Combine your "personByFirstName" and "personByLastName" maps; you will have duplicate values (e.g. Bill Gates will be in the map under the key Bill and also under the key Gates). @Andreas's answer gives a good stream-based way to do this.
Use an indexing library like lucene and index all your Person objects by first name and last name.
The stream approach - it will not be performant on large data sets but you can stream your collection and use filter to get your matches:
persons
.stream()
.filter(p -> p.getFirstName().equals("Steve")
|| p.getLastName().equals("Steve"))
.collect(Collectors.toList());
(I've written the syntax from memory so you might have to tweak it).
If I got it right you want to map each Person twice, once for the first name and once for the last.
To do this you have to double your stream somehow. Assuming Couple is some existing 2-tuple (Guava or Vavr have some nice implementation) you could:
persons.stream()
.map(p -> new Couple(new Couple(p.firstName, p), new Couple(p.lastName, p)))
.flatMap(c -> Stream.of(c.left, c.right)) // Stream of Couple(String, Person)
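.map(c -> new Couple(c.left, new ArrayList<>(Arrays.asList(c.right)))) // mutable list, so other lists can be merged into it
.collect(Collectors.toMap(Couple::getLeft, Couple::getRight, (a, b) -> { a.addAll(b); return a; }));
I didn't test it, but the concept is: make a stream of (name, person), (surname, person)... for every person, then build the map keyed by the left value of each couple. The asList call is there to have a collection as the value. The merge function has to return the merged collection, so Collection::addAll (which returns a boolean) cannot be used directly. If you need a Set, change the last line to .collect(Collectors.toMap(Couple::getLeft, c -> new HashSet<>(c.getRight()), (a, b) -> { a.addAll(b); return a; }))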
Try SetMultimap, either from Google Guava or my library abacus-common
SetMultimap<String, Person> result = Multimaps.newSetMultimap(new HashMap<>(), () -> new HashSet<>()); // by Google Guava.
// Or result = N.newSetMultimap(); // By Abacus-Util
persons.forEach(p -> {
result.put(p.getFirstName(), p);
result.put(p.getLastName(), p);
});

How to iterate and work on the values of a map whose values are a list of elements using java 8 streams and lambdas

Gist: We are trying to rewrite our old java code with java 8 streams and lambdas wherever possible.
Question:
I have a map whose keys are strings and whose values are lists of user-defined objects.
Map<String, List<Person>> personMap = new HashMap<String, List<Person>>();
personMap.put("A", Arrays.asList(person1, person2, person3, person4));
personMap.put("B", Arrays.asList(person5, person6, person7, person8));
Here the persons are grouped based on their name's starting character.
I want to find the average age of the persons on the list for every key in the map.
I have tried many of the approaches mentioned on Stack Overflow, but none of them matches my case, and a few do not even compile.
Thanks In Advance!
You didn't specify how (or if) you would like the ages to be stored, so I have it as just a print statement at the moment. Nevertheless, it's as simple as iterating over each entry in the Map and printing the average of each List (after mapping the Person objects to their age):
personMap.forEach((k, v) -> {
System.out.println("Average age for " + k + ": " + v.stream().mapToInt(Person::getAge).average().orElse(-1));
});
If you would like to store it within another Map, then you can collect it to one pretty easily:
personMap.entrySet()
.stream()
.collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue()
.stream()
.mapToInt(Person::getAge)
.average()
.orElse(-1)));
Note: -1 is returned if no average is present, and this assumes a getter method exists within Person called getAge, returning an int.
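Here is a self-contained sketch of the collecting variant, including an empty list to show the -1 fallback. The Person class is a hypothetical minimal version that only carries an age.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AverageAgeByKey {
    // Hypothetical minimal Person with just the getter the example needs.
    static final class Person {
        private final int age;

        Person(int age) {
            this.age = age;
        }

        int getAge() {
            return age;
        }
    }

    // Maps each key to the average age of its list, or -1 if the list is empty.
    static Map<String, Double> averageAges(Map<String, List<Person>> personMap) {
        return personMap.entrySet()
                .stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> e.getValue()
                                .stream()
                                .mapToInt(Person::getAge)
                                .average()
                                .orElse(-1)));
    }

    public static void main(String[] args) {
        Map<String, List<Person>> personMap = new HashMap<>();
        personMap.put("A", Arrays.asList(new Person(30), new Person(40)));
        personMap.put("B", Collections.emptyList());
        Map<String, Double> averages = averageAges(personMap);
        System.out.println(averages.get("A")); // prints 35.0
        System.out.println(averages.get("B")); // prints -1.0
    }
}
```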
You can do it like so:
// this will get the average over a list
Function<List<Person>, Double> avgFun = (personList) ->
personList.stream()
.mapToInt(Person::getAge)
.average()
.orElse(Double.NaN); // NaN for an empty list; orElseGet(null) would throw a NullPointerException there
// this will get the averages over your map
Map<String, Double> result = personMap.entrySet().stream()
.collect(Collectors.toMap(
Map.Entry::getKey,
entry -> avgFun.apply(entry.getValue())
));

Is it possible to use foreach in groupingBy in java stream?

I have a list of books. Each book has:
a list of authors (List<String>)
a title (String)
a rating (double)
I would like to calculate, for each author, the average rating of their books.
The problem is the list of authors. If each book had a single author, it would be no problem; I would solve it like this:
Map<String, Double> result = books.stream()
        .collect(Collectors
                .groupingBy(Book::getAuthor, TreeMap::new, Collectors
                        .averagingDouble(Book::getRating)));
Does anyone have a solution?
I would map to a SimpleEntry containing each author along with their book to make it easier to group.
This keeps the author and the book object together, so the downstream collector can still extract any information it needs from the book.
Example:
Map<String, Double> resultSet =
books.stream()
.flatMap(book -> book.getAuthors()
.stream()
.map(author -> new AbstractMap.SimpleEntry<>(author, book)))
.collect(Collectors.groupingBy(AbstractMap.SimpleEntry::getKey,
TreeMap::new,
Collectors.averagingDouble(e -> e.getValue().getRating())));
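A runnable sketch of the same pipeline, assuming a hypothetical Book class with only getAuthors() and getRating(). The sample books and ratings are made up.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AverageRatingByAuthor {
    // Hypothetical minimal Book; the title is omitted because the pipeline does not use it.
    static final class Book {
        private final List<String> authors;
        private final double rating;

        Book(List<String> authors, double rating) {
            this.authors = authors;
            this.rating = rating;
        }

        List<String> getAuthors() { return authors; }
        double getRating() { return rating; }
    }

    static Map<String, Double> averageRatingByAuthor(List<Book> books) {
        return books.stream()
                // one (author, book) pair per author of each book
                .flatMap(book -> book.getAuthors()
                        .stream()
                        .map(author -> new AbstractMap.SimpleEntry<>(author, book)))
                // group the pairs by author and average the ratings, sorted by author name
                .collect(Collectors.groupingBy(AbstractMap.SimpleEntry::getKey,
                        TreeMap::new,
                        Collectors.averagingDouble(e -> e.getValue().getRating())));
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                new Book(Arrays.asList("Ann", "Bob"), 4.0),
                new Book(Arrays.asList("Ann"), 2.0));
        System.out.println(averageRatingByAuthor(books)); // prints {Ann=3.0, Bob=4.0}
    }
}
```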
Map<String, Long> result = books.stream()
.flatMap(i -> i.getAuthors().stream())
.collect(Collectors.groupingBy(Function.identity(),
Collectors.counting()));
Flattening the authors and grouping by identity with counting gives the number of books per author; swap the downstream collector if you need the average rating instead.
