Is there an aggregateBy method in the stream Java 8 api? - java

I ran across this very interesting, though year-old, presentation by Brian Goetz. In the slide linked, he presents an aggregateBy() method, supposedly part of the Stream API, which aggregates the elements of a list (?) into a map, given a default initial value and a function that updates the value (including for duplicate keys); see the next slide in the presentation.
Apparently there is no such method in the Stream API. Is there another method that does something analogous in Java 8?

The aggregate operation can be done using the Collectors class. So in the video, the example would be equivalent to :
Map<String, Integer> map =
documents.stream().collect(Collectors.groupingBy(Document::getAuthor, Collectors.summingInt(Document::getPageCount)));
On its own, the groupingBy method would give you a Map<String, List<Document>>. What you want instead is the sum of the page counts of the documents in the List associated with each key.
This is done by passing a downstream collector, here summingInt, to groupingBy, which results in a Map<String, Integer>.
They give basically the same example in the documentation where they compute the sum of the employees' salary by department.
I think that they removed this operation and created the Collectors class instead to have a useful class that contains a lot of reductions that you will use commonly.
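For anyone who wants to run the snippet above, here is a minimal self-contained sketch. The Document class is a hypothetical stand-in (only getAuthor and getPageCount are taken from the example), and the class and method names PageCountByAuthor and pagesByAuthor are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PageCountByAuthor {
    // Hypothetical stand-in for the Document type used in the presentation.
    static class Document {
        private final String author;
        private final int pageCount;

        Document(String author, int pageCount) {
            this.author = author;
            this.pageCount = pageCount;
        }

        String getAuthor() { return author; }
        int getPageCount() { return pageCount; }
    }

    // Groups the documents by author and sums the page counts within each group.
    static Map<String, Integer> pagesByAuthor(List<Document> documents) {
        return documents.stream()
                .collect(Collectors.groupingBy(Document::getAuthor,
                        Collectors.summingInt(Document::getPageCount)));
    }

    public static void main(String[] args) {
        List<Document> documents = Arrays.asList(
                new Document("Alice", 100),
                new Document("Bob", 40),
                new Document("Alice", 25));
        // Alice maps to 125 and Bob to 40; the iteration order of the map is not specified.
        System.out.println(pagesByAuthor(documents));
    }
}
```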

Let's say we have a list of employees with their department and salary and we want the total salary paid by each department.
There are several ways to do it and you could for example use a toMap collector to aggregate the data per department:
the first argument is the key mapper (your aggregation axis = the department),
the second is the value mapper (the data you want to aggregate = salaries), and
the third is the merging function (how you want to aggregate data = sum the values).
Example:
import static java.util.stream.Collectors.*;
public static void main(String[] args) {
List<Person> persons = Arrays.asList(new Person("John", "Sales", 10000),
new Person("Helena", "Sales", 10000),
new Person("Somebody", "Marketing", 15000));
Map<String, Double> salaryByDepartment = persons.stream()
.collect(toMap(Person::department, Person::salary, (s1, s2) -> s1 + s2));
System.out.println("salary by department = " + salaryByDepartment);
}
As often with streams, there are several ways to get the desired result, for example:
import static java.util.stream.Collectors.*;
Map<String, Double> salaryByDepartment = persons.stream()
.collect(groupingBy(Person::department, summingDouble(Person::salary)));
For reference, the Person class:
static class Person {
private final String name, department;
private final double salary;
public Person(String name, String department, double salary) {
this.name = name;
this.department = department;
this.salary = salary;
}
public String name() { return name; }
public String department() { return department; }
public double salary() { return salary; }
}

This particular Javadoc entry is about the closest thing I could find on this piece of aggregation in Java 8. Even though it's a third party API, the signatures seem to line up pretty well - you provide some function to get values from, some terminal function for values (zero, in this case), and some function to combine the function and the values together.
It feels a lot like a Collector, which would offer us the ability to do this.
Map<String, Integer> strIntMap =
documents.stream()
.collect(Collectors
.groupingBy(Document::getAuthor,
Collectors.summingInt(Document::getPageCount)));
The idea then is that we group on the author's name for each entry in our list, and add up the total page count for each author into a Map<String, Integer>.

Related

Appending to a list within a stream to a map

I'm attempting to consolidate multiple unnecessary web requests into a map, with the key connected to a location's ID, and the value being a list of products at that location.
The idea is to reduce the amount of requests to my flask server by creating a single request for each location, with a list of required products mapped to it.
I have tried to find others who have faced a similar problem using Java 8's streaming functionality, but I cannot find anyone who is trying to append to a list within a map.
Example:
public class Product {
public Integer productNumber();
public Integer locationNumber();
}
List<Product> products = ... (imagine many products in this list)
Map<Integer, List<Integer>> results = products.stream()
.collect(Collectors.toMap(p -> p.locationNumber, p -> Arrays.asList(p.productNumber));
Also, the second p parameter cannot access the current product in stream.
Because of this, I have been unable to test if I can append to a List when the location number matches a pre-existing list. I don't believe I can use Arrays.asList(), as I believe it's immutable.
At the end, the map should have many product numbers in a list per location. Is it possible to append Integers to a pre-existing list within a map?
You may do it like so,
Map<Integer, List<Integer>> res = products.stream()
.collect(Collectors.groupingBy(Product::locationNumber,
Collectors.mapping(Product::productNumber, Collectors.toList())));
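A runnable sketch of the same idea, for reference. The Product class here is a hypothetical completion of the one in the question (the accessor names are kept, the constructor is assumed), and ProductsByLocation and group are illustrative names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ProductsByLocation {
    // Hypothetical completion of the Product class from the question.
    static class Product {
        private final Integer productNumber;
        private final Integer locationNumber;

        Product(Integer productNumber, Integer locationNumber) {
            this.productNumber = productNumber;
            this.locationNumber = locationNumber;
        }

        Integer productNumber() { return productNumber; }
        Integer locationNumber() { return locationNumber; }
    }

    // Groups by location; mapping() extracts the product number before toList() collects it.
    static Map<Integer, List<Integer>> group(List<Product> products) {
        return products.stream()
                .collect(Collectors.groupingBy(Product::locationNumber,
                        Collectors.mapping(Product::productNumber, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Product> products = Arrays.asList(
                new Product(1, 1), new Product(2, 1), new Product(7, 2));
        System.out.println(group(products));
    }
}
```

No explicit "append" step is needed: the collector creates one list per location and adds each product number to it as the stream is consumed.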
The Java Collectors API is pretty powerful and has lots of nice utility methods for solving this.
public class Learn {
static class Product {
final Integer productNumber;
final Integer locationNumber;
Product(Integer productNumber, Integer locationNumber) {
this.productNumber = productNumber;
this.locationNumber = locationNumber;
}
Integer getProductNumber() {
return productNumber;
}
Integer getLocationNumber() {
return locationNumber;
}
}
public static Product of(int i, int j){
return new Product(i,j);
}
public static void main(String[] args) {
List<Product> productList = Arrays.asList(of(1,1),of(2,1),of(3,1),
of(7,2),of(8,2),of(9,2));
Map<Integer, List<Integer>> results = productList.stream().collect(Collectors.groupingBy(Product::getLocationNumber,
Collectors.collectingAndThen(Collectors.toList(), pl->pl.stream().map(Product::getProductNumber).collect(Collectors.toList()))));
System.out.println(results);
}
}
So, what we are doing here is streaming the product list and grouping it by the location attribute, with the twist that we want to transform each collected list of products into a list of product numbers.
Collectors.collectingAndThen is precisely the method for this: it lets you specify a main collector, toList(), and a transformer function, which here is just another stream that maps each product to its product number. In the Java API docs the main collector and the transformer are called the downstream collector and the finisher.
Please go through the Collectors source code for a complete understanding of how all these different collectors are defined.

How to initialise a Map<K, Map<K,V>> on a single line

Is it possible to combine these two lines of code into one?
allPeople.put("Me", new HashMap<String, String>());
allPeople.get("Me").put("Name", "Surname");
The literal replacement of these two lines would be (in Java 8+):
allPeople.compute("Me", (k, v) -> new HashMap<>()).put("Name", "Surname");
or, in the style of Bax's answer, for pre-Java 9, you could use:
allPeople.put("Me", new HashMap<>(Collections.singletonMap("Name", "Surname")));
Starting with Java 9 there is a JDK-provided Map factory (note that the map it returns is immutable):
allPeople.put("Me", Map.of("Name", "Surname"));
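One caveat with the one-liners above: both put and compute, as written, replace any inner map already stored under the key. If the inner map should be created only when it is missing, Map.computeIfAbsent (Java 8+) is a possible alternative. The sketch below assumes allPeople is a Map<String, Map<String, String>>; NestedMapInit and putNested are illustrative names.

```java
import java.util.HashMap;
import java.util.Map;

class NestedMapInit {
    // Adds key/value to the inner map for outerKey, creating the inner map only if it is missing.
    static void putNested(Map<String, Map<String, String>> outer,
                          String outerKey, String key, String value) {
        outer.computeIfAbsent(outerKey, k -> new HashMap<>()).put(key, value);
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> allPeople = new HashMap<>();
        putNested(allPeople, "Me", "Name", "Surname");
        // A second call reuses the existing inner map instead of replacing it.
        putNested(allPeople, "Me", "City", "Somewhere");
        System.out.println(allPeople.get("Me").size()); // prints 2
    }
}
```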
You should probably represent a person as an object. That way you cannot call get("someKey") with a key that does not exist and have your code blow up. That is the idea of object-oriented programming: to encapsulate related data and functionality. Nested maps do a similar thing, but they are more error-prone. For a language that does not support objects, that makes sense. But representing a person as an object lets you better control which fields the mapping has, making your code less error-prone.
class Person {
private String name;
private String surname;
public Person(String name, String surname) {
this.name = name;
this.surname = surname;
}
}
Then you create a map that maps names to people:
Map<String, Person> allPeople = new HashMap<>();
// Create an object that represents a person
Person me = new Person("name", "surname");
// Map the string "me" to the object me that represents me
allPeople.put("ME", me);

Group objects in list by multiple fields

I have a simple object like this
public class Person {
private int id;
private int age;
private String hobby;
//getters, setters
}
I want to group a list of Person by attributes
Output should be like this
Person count/Age/Hobby
2/18/Basket
5/20/football
With a chart for more understanding
X axis : hobby repartition
Y axis : count of person distribution
Colors represents age
I managed to group by one attribute using map, but I can't figure how to group by multiples attributes
//group only by age . I want to group by hobby too
personMapGroupped = new LinkedHashMap<String, List<Person>>();
for (Person person : listPerson) {
String key = String.valueOf(person.getAge());
if (personMapGroupped.get(key) == null) {
personMapGroupped.put(key, new ArrayList<Person>());
}
personMapGroupped.get(key).add(person);
}
Then I retrieve the groupable object like this
for (Map.Entry<String, List<Person>> entry : personMapGroupped .entrySet()) {
String key = entry.getKey();// group by age
int count = entry.getValue().size(); // person count
// I want to retrieve the group by hobby here too...
}
Any advice would be appreciated.
Thank you very much
Implement methods for comparing people according to the different fields. For instance, if you want to group by age, add this method to Person:
public static Comparator<Person> getAgeComparator(){
return new Comparator<Person>() {
@Override
public int compare(Person o1, Person o2) {
// Integer.compare avoids the overflow that subtraction can cause
return Integer.compare(o1.age, o2.age);
}
};
}
Then you can simply call: Arrays.sort(people,Person.getAgeComparator()) or use the following code to sort a Collection:
List<Person> people = new ArrayList<>();
people.sort(Person.getAgeComparator());
To sort using more than one Comparator simultaneously, you first define a Comparator for each field (e.g. one for age and one for name). Then you can combine them using a ComparatorChain (from Apache Commons Collections). You would use the ComparatorChain as follows:
ComparatorChain chain = new ComparatorChain();
chain.addComparator(Person.getNameComparator());
chain.addComparator(Person.getAgeComparator());
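If you are on Java 8 and would rather not add a library, the same chaining can be sketched with the JDK's own Comparator.comparing and thenComparingInt. The Person class below is a hypothetical minimal version with a name and an age; SortByNameThenAge and sorted are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortByNameThenAge {
    // Hypothetical minimal Person with just the two fields being compared.
    static class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Compares by name first; ties are broken by age.
    static final Comparator<Person> BY_NAME_THEN_AGE =
            Comparator.comparing(Person::getName).thenComparingInt(Person::getAge);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Person> sorted(List<Person> people) {
        List<Person> copy = new ArrayList<>(people);
        copy.sort(BY_NAME_THEN_AGE);
        return copy;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Bob", 30), new Person("Alice", 25), new Person("Alice", 20));
        for (Person p : sorted(people)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
    }
}
```

Note that this, like the ComparatorChain, sorts rather than groups; people with equal fields simply end up next to each other.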
You could simply combine the attributes to a key.
for (Person person : listPerson) {
String key = person.getAge() + ";" + person.getHobby();
if (!personMapGrouped.containsKey(key)) {
personMapGrouped.put(key, new ArrayList<Person>());
}
personMapGrouped.get(key).add(person);
}
The count of entries is easy to determine, e.g. with personMapGrouped.get("18;Football").size().
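With Java 8 streams, the combined-key idea can be sketched so that the counts come out directly. This assumes a Person with getAge() and getHobby() as in the question; the class below is a hypothetical minimal version, and CountByAgeAndHobby is an illustrative name. A LinkedHashMap is used so that groups appear in first-encounter order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CountByAgeAndHobby {
    // Hypothetical minimal version of the Person class from the question.
    static class Person {
        private final int id;
        private final int age;
        private final String hobby;

        Person(int id, int age, String hobby) {
            this.id = id;
            this.age = age;
            this.hobby = hobby;
        }

        int getAge() { return age; }
        String getHobby() { return hobby; }
    }

    // Groups on the combined "age;hobby" key and counts the persons in each group.
    static Map<String, Long> countByAgeAndHobby(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(
                        p -> p.getAge() + ";" + p.getHobby(),
                        LinkedHashMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person(1, 18, "Basket"),
                new Person(2, 18, "Basket"),
                new Person(3, 20, "Football"));
        // Prints one "count/key" line per group.
        countByAgeAndHobby(persons)
                .forEach((key, count) -> System.out.println(count + "/" + key));
    }
}
```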
I'm not sure about your requirements, but I'd probably use multiple maps (Google Guava's Multimap would make that easier btw) and sets, e.g. something like this:
//I'm using a HashMultimap since order of persons doesn't seem to be relevant and I want to prevent duplicates
Multimap<Integer, Person> personsByAge = HashMultimap.create();
//I'm using the hobby name here for simplicity, it's probably better to use some enum or Hobby object
Multimap<String, Person> personsByHobby = HashMultimap.create();
//fill the maps here by looping over the persons and adding them (no need to create the value sets manually)
Since I use value sets Person needs a reasonable implementation of equals() and hashCode() which might make use of the id field. This also will help in querying.
Building subsets would be quite easy:
Set<Person> age18 = personsByAge.get(18);
Set<Person> basketballers = personsByHobby.get( "basketball" );
//making use of Guava again
Set<Person> basketballersAged18 = Sets.intersection( age18, basketballers );
Note that I made use of Google Guava here but you can achieve the same with some additional manual code (e.g. using Map<String, Set<Person>> and manually creating the value sets as well as using the Set.retainAll() method).
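A rough sketch of that manual, Guava-free variant, assuming a Person whose equals() and hashCode() are based on the id field. PersonIndex, add and withAgeAndHobby are illustrative names, not part of any library.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PersonIndex {
    // Hypothetical Person; equality is based on id, as suggested above.
    static class Person {
        final int id;
        final int age;
        final String hobby;

        Person(int id, int age, String hobby) {
            this.id = id;
            this.age = age;
            this.hobby = hobby;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Person && ((Person) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    private final Map<Integer, Set<Person>> byAge = new HashMap<>();
    private final Map<String, Set<Person>> byHobby = new HashMap<>();

    // Registers the person in both indexes, creating the value sets on demand.
    void add(Person p) {
        byAge.computeIfAbsent(p.age, k -> new HashSet<>()).add(p);
        byHobby.computeIfAbsent(p.hobby, k -> new HashSet<>()).add(p);
    }

    // Intersects the two value sets; works on a copy so the indexes are not modified.
    Set<Person> withAgeAndHobby(int age, String hobby) {
        Set<Person> result =
                new HashSet<>(byAge.getOrDefault(age, Collections.<Person>emptySet()));
        result.retainAll(byHobby.getOrDefault(hobby, Collections.<Person>emptySet()));
        return result;
    }
}
```

Copying before calling retainAll() matters here, because retainAll() modifies the set it is called on.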

Which collections to use?

Suppose I want to store phone numbers of persons. Which kind of collection should I use for key value pairs? And it should be helpful for searching. The name may get repeated, so there may be the same name having different phone numbers.
If you want key-value pairs, a Map is a better choice than a Collection.
So what should that map store?
Start with the key. The first thing you want to ensure is that your key is unique, to avoid collisions.
class Person {
long uniqueID;
String name;
String lastname;
}
So we will use the uniqueID of Person as the key.
What about the value?
This case is harder, as a single Person can have many phone numbers. But for a simple task let's assume that a person can have only one phone number. Then what you are looking for is
class PhoneNumberRegistry {
Map<Long,String> phoneRegistry = new HashMap<>();
}
Here the Long key is the uniqueID taken from the Person. If you use Person itself as the key instead, you should implement its hashCode and equals methods.
Then your registry could look like
class PhoneNumberRegistry {
Map<Person,String> phoneRegistry = new HashMap<>();
}
If you want to store more than one number per person, you will need to change the value type of the map.
You can use a Set<String> to store multiple numbers without duplicates. But to have full control you should introduce a new type that stores not only the number but also what kind of number it is.
class PhoneNumberRegistry {
Map<Person,HashSet<String>> phoneRegistry = new HashMap<>();
}
But then you will have to solve various problems, such as which phone number to return.
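To make the multi-number registry concrete, here is a minimal sketch. It assumes that two Person objects are equal exactly when their uniqueID matches; the methods addNumber and numbersOf and the outer class PhoneRegistryDemo are made-up names for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class PhoneRegistryDemo {
    static class Person {
        final long uniqueID;
        final String name;

        Person(long uniqueID, String name) {
            this.uniqueID = uniqueID;
            this.name = name;
        }

        // Assumption: identity is defined by uniqueID only, so names may repeat.
        @Override
        public boolean equals(Object o) {
            return o instanceof Person && ((Person) o).uniqueID == uniqueID;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(uniqueID);
        }
    }

    static class PhoneNumberRegistry {
        private final Map<Person, Set<String>> phoneRegistry = new HashMap<>();

        // Adds a number for the person; the Set silently ignores duplicates.
        void addNumber(Person person, String number) {
            phoneRegistry.computeIfAbsent(person, k -> new LinkedHashSet<>()).add(number);
        }

        // Returns the numbers for the person, or an empty set if none are stored.
        Set<String> numbersOf(Person person) {
            return phoneRegistry.getOrDefault(person, Collections.<String>emptySet());
        }
    }

    public static void main(String[] args) {
        PhoneNumberRegistry registry = new PhoneNumberRegistry();
        Person first = new Person(1L, "Tom");
        Person second = new Person(2L, "Tom"); // same name, different person
        registry.addNumber(first, "555-0100");
        registry.addNumber(first, "555-0101");
        registry.addNumber(second, "555-0199");
        System.out.println(registry.numbersOf(first));
        System.out.println(registry.numbersOf(second));
    }
}
```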
Your problem has different solutions. For example, I'll go with a LIST: List<Person>, where Person is a class like this:
public class Person{
private String name;
private List<String> phoneNumbers;
// ...
}
For collections searching/filtering I suggest Guava Collections2.filter method.
You should use this:
Hashtable<String, ArrayList<String>> addressbook = new Hashtable<>();
ArrayList<String> persons = new ArrayList<String>();
persons.add("Tom Butterfly");
persons.add("Maria Wanderlust");
addressbook.put("+0490301234567", persons);
addressbook.put("+0490301234560", persons);
Hashtable does not allow null keys or values, and ArrayList is fast for collecting a small number of elements. Note that multiple persons with different names may share the same number.
Note that two persons can have the same number and the same name!
String name = "Tom Butterfly";
// The keys are not sorted by name, so binary search does not apply; scan the entries instead.
for (Map.Entry<String, ArrayList<String>> entry : addressbook.entrySet()) {
if (entry.getValue().contains(name)) {
System.out.println("Number is " + entry.getKey());
}
}
Maybe
List<Pair<String, String>> (for one number per person)
or
List<Pair<String, String[]>> (for multiple numbers per person)
will fit your needs.

Which collection to use?

What kind of collection should I use if I need to create a collection that will allow me to store books and how many copies there are in circulation (for a library)?
I would use an ArrayList, but I also want to be able to sort the books by order of issue year.
You can create a Book class with all the attributes you have for a book, have it implement Comparable, and write the sorting logic in compareTo.
Maintain a List<Book> and use the Collections.sort method to sort the list according to that logic.
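A minimal sketch of that approach, ordering by issue year as the question asks. The class is called LibraryBook here and the fields title, issueYear and copies are assumptions, since the question does not spell out the attributes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LibraryBook implements Comparable<LibraryBook> {
    private final String title;
    private final int issueYear;
    private final int copies; // number of copies in circulation

    LibraryBook(String title, int issueYear, int copies) {
        this.title = title;
        this.issueYear = issueYear;
        this.copies = copies;
    }

    String getTitle() { return title; }
    int getIssueYear() { return issueYear; }
    int getCopies() { return copies; }

    // Natural ordering: oldest issue year first.
    @Override
    public int compareTo(LibraryBook other) {
        return Integer.compare(issueYear, other.issueYear);
    }

    // Returns a copy of the list sorted by the natural ordering.
    static List<LibraryBook> sortedByYear(List<LibraryBook> books) {
        List<LibraryBook> copy = new ArrayList<>(books);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<LibraryBook> books = Arrays.asList(
                new LibraryBook("Newer", 2010, 3),
                new LibraryBook("Older", 1999, 5));
        for (LibraryBook b : sortedByYear(books)) {
            System.out.println(b.getIssueYear() + " " + b.getTitle() + " x" + b.getCopies());
        }
    }
}
```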
UPDATE: -
As far as fast look-up is concerned, a Map is usually the best bet, and it is appropriate for implementing a dictionary-style look-up structure. For that, you need some attribute that uniquely identifies each book. Then store your books in a Map<String, Book>, where the key might be an id of type String.
Also, in this case, your sorting logic will change a little. Now you have to sort on the basis of your Map's values, i.e. on the attributes of Book.
Here's some sample code you can make use of. I have only considered sorting on the basis of id. You can change the sorting logic as needed:
class Book {
private int id;
private String title;
public Book() {
}
public Book(int id, String title) {
this.id = id;
this.title = title;
}
@Override
public String toString() {
return "Book[Title:" + this.getTitle() + ", Id:" + this.getId() + "]";
}
// Getters and Setters
}
public class Demo {
public static void main(String[] args) {
final Map<String, Book> map = new HashMap<String, Book>() {
{
put("b1", new Book(3, "abc"));
put("b2", new Book(2, "c"));
}
};
List<Map.Entry<String, Book>> keyList = new LinkedList<Map.Entry<String, Book>>(map.entrySet());
Collections.sort(keyList, new Comparator<Map.Entry<String, Book>>() {
@Override
public int compare(Map.Entry<String, Book> o1, Map.Entry<String, Book> o2) {
return o1.getValue().getId() - o2.getValue().getId();
}
});
Map<String, Book> result = new LinkedHashMap<String, Book>();
for (Iterator<Map.Entry<String, Book>> it = keyList.iterator(); it.hasNext();) {
Map.Entry<String, Book> entry = it.next();
result.put(entry.getKey(), entry.getValue());
}
System.out.println(result);
}
}
OUTPUT: -
"{b2=Book[Title:c, Id:2], b1=Book[Title:abc, Id:3]}"
Well, if the entire purpose of your collection is to store the counts of the books, then use a dictionary/map (in Java, a Map).
It would probably have the title as the key and the count as the value.
Now I suspect that your collection might be a little more complicated than that, so you might want to make a Book class which has a count field, and then I'd probably have a String -> Book map anyway, with the string being its Dewey Decimal number or some other unique identifier.
Beyond a simple educational or toy project, you'd want to use a database rather than an in-memory collection. (Not really an answer, but I think worth stating.)
java.util.TreeMap can be used to index and sort for this kind of requirement.
Check http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html for more details.
You can use your Book object as the key, mapped to the number of copies as the value. For that, Book must implement Comparable or the TreeMap must be given a Comparator.
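A possible sketch of the TreeMap idea, written with Java 8's Comparator helpers. The Book fields (title, issueYear) and the names CopiesByBook and newCatalog are assumptions for illustration. The comparator breaks ties on title, because a TreeMap treats keys that compare as equal as the same key, so ordering by year alone would merge all books of one year into a single entry.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

class CopiesByBook {
    // Hypothetical Book with just the attributes needed for ordering.
    static class Book {
        private final String title;
        private final int issueYear;

        Book(String title, int issueYear) {
            this.title = title;
            this.issueYear = issueYear;
        }

        String getTitle() { return title; }
        int getIssueYear() { return issueYear; }
    }

    // Orders by issue year; the title tie-break keeps same-year books as distinct keys.
    static final Comparator<Book> BY_YEAR_THEN_TITLE =
            Comparator.comparingInt(Book::getIssueYear).thenComparing(Book::getTitle);

    // Creates an empty catalog mapping each book to its number of copies.
    static TreeMap<Book, Integer> newCatalog() {
        return new TreeMap<>(BY_YEAR_THEN_TITLE);
    }

    public static void main(String[] args) {
        TreeMap<Book, Integer> catalog = newCatalog();
        catalog.put(new Book("B", 2001), 3);
        catalog.put(new Book("A", 1995), 1);
        catalog.put(new Book("C", 2001), 2);
        // Iteration follows the comparator: by year, then by title.
        for (Map.Entry<Book, Integer> e : catalog.entrySet()) {
            System.out.println(e.getKey().getIssueYear() + " " + e.getKey().getTitle()
                    + " -> " + e.getValue());
        }
    }
}
```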
