Java Stream GroupBy and Reduce - java

I have an Item class which contains code, quantity and amount fields, and a list of items which may contain many items with the same code. I want to group the items by code and sum up their quantities and amounts.
I was able to achieve half of it using stream's groupingBy and reduce. The grouping by worked, but the reduce is reducing all of the grouped items into one single item repeated over the different codes (groupingBy key).
Shouldn't reduce here reduce the list of items for each code from the map? Why is it returning the same combined item for all of them?
Below is sample code.
import java.util.List;
import java.util.Arrays;
import java.util.stream.Collectors;
import java.util.Map;
class HelloWorld {
public static void main(String[] args) {
List<Item> itemList = Arrays.asList(
createItem("CODE1", 1, 12),
createItem("CODE2", 4, 22),
createItem("CODE3", 5, 50),
createItem("CODE4", 2, 11),
createItem("CODE4", 8, 20),
createItem("CODE2", 1, 42)
);
Map<String, Item> aggregatedItems = itemList
.stream()
.collect(Collectors.groupingBy(
Item::getCode,
Collectors.reducing(new Item(), (aggregatedItem, item) -> {
int aggregatedQuantity = aggregatedItem.getQuantity();
double aggregatedAmount = aggregatedItem.getAmount();
aggregatedItem.setQuantity(aggregatedQuantity + item.getQuantity());
aggregatedItem.setAmount(aggregatedAmount + item.getAmount());
return aggregatedItem;
})
));
System.out.println("Map total size: " + aggregatedItems.size()); // expected 4
System.out.println();
aggregatedItems.forEach((key, value) -> {
System.out.println("key: " + key);
System.out.println("value - quantity: " + value.getQuantity() + " - amount: " + value.getAmount());
System.out.println();
});
}
private static Item createItem(String code, int quantity, double amount) {
Item item = new Item();
item.setCode(code);
item.setQuantity(quantity);
item.setAmount(amount);
return item;
}
}
class Item {
private String code;
private int quantity;
private double amount;
public Item() {
quantity = 0;
amount = 0.0;
}
public String getCode() { return code; }
public int getQuantity() { return quantity; }
public double getAmount() { return amount; }
public void setCode(String code) { this.code = code; }
public void setQuantity(int quantity) { this.quantity = quantity; }
public void setAmount(double amount) { this.amount = amount; }
}
and below is the output.
Map total size: 4
key: CODE2
value - quantity: 21 - amount: 157.0
key: CODE1
value - quantity: 21 - amount: 157.0
key: CODE4
value - quantity: 21 - amount: 157.0
key: CODE3
value - quantity: 21 - amount: 157.0

You must not modify the input arguments to Collectors.reducing. new Item() is only executed once and all your reduction operations will share the same "aggregation instance". In other words: the map will contain the same value instance 4 times (you can easily check yourself with System.identityHashCode() or by comparing for reference-equality: aggregatedItems.get("CODE1") == aggregatedItems.get("CODE2")).
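The shared-instance effect can be checked with a small sketch. The class and method names here (SharedIdentityDemo, aggregateByMutatingIdentity) and the trimmed-down Item are hypothetical stand-ins, not the question's code; the reduction deliberately repeats the faulty pattern.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SharedIdentityDemo {
    // Trimmed-down stand-in for the question's Item class (assumption of this sketch).
    static class Item {
        final String code;
        int quantity;

        Item(String code, int quantity) {
            this.code = code;
            this.quantity = quantity;
        }

        String getCode() {
            return code;
        }
    }

    // Repeats the faulty pattern: the identity object is mutated and returned.
    static Map<String, Item> aggregateByMutatingIdentity(List<Item> items) {
        return items.stream().collect(Collectors.groupingBy(
                Item::getCode,
                Collectors.reducing(new Item(null, 0), (acc, item) -> {
                    acc.quantity += item.quantity; // mutates the single shared identity instance
                    return acc;
                })));
    }

    public static void main(String[] args) {
        Map<String, Item> result = aggregateByMutatingIdentity(Arrays.asList(
                new Item("CODE1", 1), new Item("CODE2", 4), new Item("CODE2", 1)));
        // Every key maps to the very same object.
        System.out.println(result.get("CODE1") == result.get("CODE2")); // prints true
        // That object has absorbed the quantities of all groups.
        System.out.println(result.get("CODE1").quantity); // prints 6
    }
}
```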
Instead, return a new result instance:
final Map<String, Item> aggregatedItems = itemList
.stream()
.collect(Collectors.groupingBy(
Item::getCode,
Collectors.reducing(new Item(), (item1, item2) -> {
final Item reduced = new Item();
reduced.setQuantity(item1.getQuantity() + item2.getQuantity());
reduced.setAmount(item1.getAmount() + item2.getAmount());
return reduced;
})
));
Output:
Map total size: 4
key: CODE2
value - quantity: 5 - amount: 64.0
key: CODE1
value - quantity: 1 - amount: 12.0
key: CODE4
value - quantity: 10 - amount: 31.0
key: CODE3
value - quantity: 5 - amount: 50.0

You are using reducing, which assumes that you won't mutate the accumulator passed in. reducing won't create new Items for you for each new group, and expects you to create new Items and return them in the lambda, like this:
// this works as expected
.collect(Collectors.groupingBy(
Item::getCode,
Collectors.reducing(new Item(), (item1, item2) -> createItem(
item2.getCode(), // item1 starts out as the identity, whose code is null
item1.getQuantity() + item2.getQuantity(),
item1.getAmount() + item2.getAmount()
))
));
so it is very suitable if you are using immutable objects like numbers or strings.
Since you are not creating new Items in your code, reducing keeps on reusing that same instance, resulting in the behaviour you see.
If you want to mutate the objects, you can do mutable reduction in a thread safe way with Collector.of:
.collect(Collectors.groupingBy(
Item::getCode,
Collector.of(Item::new, (aggregatedItem, item) -> {
int aggregatedQuantity = aggregatedItem.getQuantity();
double aggregatedAmount = aggregatedItem.getAmount();
aggregatedItem.setQuantity(aggregatedQuantity + item.getQuantity());
aggregatedItem.setAmount(aggregatedAmount + item.getAmount());
}, (item1, item2) -> createItem(
item1.getCode(),
item1.getQuantity() + item2.getQuantity(),
item1.getAmount() + item2.getAmount()
))
));
Notice that you now pass the reference to Item's constructor, i.e. a way to create new Items when necessary, as opposed to just a single new Item(). In addition, you also provide a third argument, the combiner, that tells the collector how to create a new item from two existing ones, which will be used if this collector is used in a concurrent situation. (See here for more info about the combiner)
This contrast between Collector.of and Collectors.reducing is the same contrast between Stream.reduce and Stream.collect. Learn more here.
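A minimal sketch of that contrast, using plain numbers and strings rather than Item. The class and method names (ReduceVsCollect, sum, concat) are made up for illustration; the collect call follows the StringBuilder pattern shown in the Stream.collect Javadoc.

```java
import java.util.Arrays;
import java.util.List;

class ReduceVsCollect {
    // Immutable reduction: every step returns a new value; the identity (0) is never modified.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Mutable reduction: the supplier creates a fresh container, which is then filled in place.
    static String concat(List<String> words) {
        return words.stream()
                .collect(StringBuilder::new,     // supplier: new container when needed
                         StringBuilder::append,  // accumulator: add one element
                         StringBuilder::append)  // combiner: merge two containers
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));      // prints 6
        System.out.println(concat(Arrays.asList("a", "b")));  // prints ab
    }
}
```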

Mutable reduction vs Immutable reduction
In this case, Collectors.reducing() isn't the right tool because it is meant for immutable reduction, i.e. for performing a fold in which every reduction step creates a new immutable object.
But instead of generating a new object at each reduction step, you're changing the state of the object provided as an identity.
As a consequence, you're getting an incorrect result because the identity object is created only once. This single Item instance is used for all accumulation, and a reference to it ends up in every value of the map.
You can find more detail in the Stream API documentation, specifically in these parts: Reduction and Mutable Reduction.
And here's a short quote explaining how Stream.reduce() works (the mechanism behind Collectors.reducing() is the same):
The accumulator function takes a partial result and the next element, and produces a new partial result.
Use mutable reduction
The problem can be fixed by generating a new instance of Item at each accumulation step, but a more performant approach would be to use mutable reduction instead.
For that, you can implement a custom collector created via static method Collector.of():
Map<String, Item> aggregatedItems = itemList.stream()
.collect(Collectors.groupingBy(
Item::getCode,
Collector.of(
Item::new, // mutable container of the collector
Item::merge, // accumulator - defines how stream data should be accumulated
Item::merge // combiner - merging the two containers while executing stream in parallel
)
));
For convenience, you can introduce a merge() method responsible for accumulating the properties of two items. It lets you avoid repeating the same logic in the accumulator and the combiner, and keeps the collector implementation lean and readable.
public class Item {
private String code;
private int quantity;
private double amount;
// getters, constructor, etc.
public Item merge(Item other) {
this.quantity += other.quantity;
this.amount += other.amount;
return this;
}
}
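Put together, the mutable-reduction approach might look like the sketch below. The wrapper class MutableReductionDemo, the static factory Item.of and the line that copies the code into an empty container are assumptions added for illustration; without that last line the aggregated items would keep a null code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class MutableReductionDemo {
    static class Item {
        private String code;
        private int quantity;
        private double amount;

        // Hypothetical factory, used only to build the sample data.
        static Item of(String code, int quantity, double amount) {
            Item item = new Item();
            item.code = code;
            item.quantity = quantity;
            item.amount = amount;
            return item;
        }

        String getCode() { return code; }
        int getQuantity() { return quantity; }
        double getAmount() { return amount; }

        // Adds the other item's totals to this one and returns this container.
        Item merge(Item other) {
            if (this.code == null) {
                this.code = other.code; // assumption of this sketch: carry the group's code over
            }
            this.quantity += other.quantity;
            this.amount += other.amount;
            return this;
        }
    }

    static Map<String, Item> aggregate(List<Item> items) {
        return items.stream().collect(Collectors.groupingBy(
                Item::getCode,
                // Item::new supplies a fresh container per group, so nothing is shared.
                Collector.of(Item::new, Item::merge, Item::merge)));
    }

    public static void main(String[] args) {
        Map<String, Item> result = aggregate(Arrays.asList(
                Item.of("CODE2", 4, 22), Item.of("CODE4", 2, 11),
                Item.of("CODE4", 8, 20), Item.of("CODE2", 1, 42)));
        result.forEach((code, item) -> System.out.println(
                code + " - quantity: " + item.getQuantity() + " - amount: " + item.getAmount()));
    }
}
```

Because the supplier is called once per group, the source items in the list are never modified; only the freshly created containers are.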

Related

Accumulate values of Duplicated Element in a List

My list consists of elements with fields Type (String), Amount (Double) and Quantity (Integer), and it looks like this:
Type: Type A, Amount : 55.0, Quantity : 0
Type: Type A, Amount : 55.0, Quantity : 5
Type: Type A, Amount : 44.35, Quantity : 6
Type: Type A, Amount : 55.0, Quantity : 0
Type: Type B, Amount : 7.0, Quantity : 1
Type: Type B, Amount : 7.0, Quantity : 1
Type: Type C, Amount : 1613.57, Quantity : 0
Type: Type C, Amount : 1613.57, Quantity : 1
So I am trying to loop over my list to find duplicates, and add up the Amount and Quantity when the Type is a duplicate. The outcome would be like this:
Type: Type A, Amount : 209.35, Quantity : 11
Type: Type B, Amount : 14.0, Quantity : 2
Type: Type C, Amount : 3227.14, Quantity : 1
What I have tried is creating another List, adding the elements to the new List and then comparing them, but it didn't work:
List<Type> newList = new ArrayList();
for(int k = 0; k < typeList.size(); k++) {
Type type= new Type();
Double totalAmount = Double.parseDouble("0");
type.setTypeName(typeList.get(k).getTypeName());
type.setAmount(chargeTypeList.get(k).getAmount());
newList.add(k, type);
if(typeList.get(k).getChargeTypeName().equalsIgnoreCase(newList.get(k).getiTypeName())) {
totalAmount += typeList.get(k).getAmount();
}
}
I don't want to hardcode the value to check for duplicate Type
You should probably be putting these values into a Map, which guarantees there is only one element for each key. Using a map is very common for representing amounts of some thing where we store the thing as the key and keep track of how many of those things we have in the value.
You can then use compute to add elements to the map.
What you currently have:
record Data(String type, Double amount, Integer quantity) {}
What may represent your data better:
record Datav2(Double amount, Integer quantity) {}
Storing Datav2 in a map and adding an element.
var map = new HashMap<>(Map.of("A", new Datav2( 2.0, 3)));
// add element to map equivalent to Data("A", 3.0, 3)
map.compute("A", (k, v) -> {
if (v == null) {
v = new Datav2(0.0, 0);
}
return new Datav2(v.amount + 3.0, v.quantity + 3);
});
If you need to start with a list for whatever reason you can use the Stream API to turn the list into a map. Specifically toMap.
var list = List.of(new Data("A", 2.0, 3),
new Data("A", 3.0, 3),
new Data("C", 2.0, 1),
new Data("B", 10.0, 3),
new Data("B", 2.0, 5)
);
var collected = list
.stream()
.collect(Collectors.toMap(
// what will the key be
Data::type,
// what will the value be
data -> new Datav2(data.amount, data.quantity),
// how do we combine two values if they have the same key
(d1, d2) -> new Datav2(d1.amount + d2.amount, d1.quantity + d2.quantity)
));
System.out.println(collected);
{A=Datav2[amount=5.0, quantity=6], B=Datav2[amount=12.0, quantity=8], C=Datav2[amount=2.0, quantity=1]}
Another approach would be to sort the list by type, then iterate over it and add each item to a running sum item. When the type changes, add your sum item to a result list and keep going.
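A minimal sketch of that sort-then-iterate idea, assuming a simple immutable Row class with type, amount and quantity (the names SortThenSum, Row and sumByType are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortThenSum {
    // Hypothetical element type standing in for the question's list elements.
    static class Row {
        final String type;
        final double amount;
        final int quantity;

        Row(String type, double amount, int quantity) {
            this.type = type;
            this.amount = amount;
            this.quantity = quantity;
        }
    }

    static List<Row> sumByType(List<Row> input) {
        // Sort a copy so that rows of the same type become neighbours.
        List<Row> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing((Row r) -> r.type));

        List<Row> result = new ArrayList<>();
        Row running = null; // sum of the rows seen so far for the current type
        for (Row row : sorted) {
            if (running != null && running.type.equals(row.type)) {
                running = new Row(running.type, running.amount + row.amount,
                        running.quantity + row.quantity);
            } else {
                if (running != null) {
                    result.add(running); // the type changed: emit the finished sum
                }
                running = row;
            }
        }
        if (running != null) {
            result.add(running); // emit the last group
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> summed = sumByType(Arrays.asList(
                new Row("A", 1.5, 2), new Row("B", 7.0, 1), new Row("A", 2.5, 3)));
        for (Row r : summed) {
            System.out.println(r.type + " " + r.amount + " " + r.quantity);
        }
    }
}
```

Sorting costs O(n log n), so this is mainly of interest when the data arrives sorted already or a map is not wanted.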
Another way of achieving this is to use collect together with HashMap's merge operation:
List<TypeClass> ls = List.of(new TypeClass("A", 12.3, 2), new TypeClass("A", 3.4, 4),
new TypeClass("B", 12.4, 6), new TypeClass("B", 12.8, 8));
System.out.println(
ls.stream().collect(HashMap<String, TypeClass>::new, (x, y) -> x.merge(y.getTypeName(), y, (o, p) -> {
return new TypeClass(y.getTypeName(), o.getAmount() + p.getAmount(),
o.getQuantity() + p.getQuantity());
}), (a, b) -> a.putAll(b)));
this will print following output:
{A=TypeClass [typeName=A, amount=15.700000000000001, quantity=6],
B=TypeClass [typeName=B, amount=25.200000000000003, quantity=14]}
Here, we are accumulating into a HashMap whose entries are merged by key, i.e. your string value. The merge function is a simple addition of the amount and quantity of your Type class.
You can use built-in collector groupingBy() to group the objects having the same type in conjunction with a custom collector created via Collector.of() as downstream of grouping.
Assuming that your custom object looks like this (for the purpose of conciseness, I've used a Java 16 record):
public record MyType(String type, double amount, int quantity) {}
Note:
Don't use wrapper types without a good reason; use primitives instead. That avoids unnecessary boxing/unboxing and eliminates the possibility of getting a NullPointerException while performing arithmetic operations or comparing numeric values.
If the number of values that the type attribute can take is limited, then it would be better to use an enum instead of String because it's more reliable (it guards you against typos) and offers some extra possibilities, since enums have extensive language support.
That's how the accumulation logic can be implemented:
List<MyType> typeList = new ArrayList<>();
List<MyType> newList = typeList.stream()
.collect(Collectors.groupingBy(
MyType::type,
Collector.of(
MyAccumulator::new,
MyAccumulator::accept,
MyAccumulator::merge
)
))
.entrySet().stream()
.map(entry -> new MyType(entry.getKey(),entry.getValue().getAmount(), entry.getValue().getQuantity()))
.toList();
And that's how the custom accumulation type used internally by the collector might look:
public static class MyAccumulator implements Consumer<MyType> {
private double amount;
private int quantity;
@Override
public void accept(MyType myType) {
add(myType.amount(), myType.quantity());
}
public MyAccumulator merge(MyAccumulator other) {
add(other.amount, other.quantity);
return this;
}
private void add(double amount, int quantity) {
this.amount += amount;
this.quantity += quantity;
}
// getters
}

How to multiply key * value in Map in Java?

I have this class:
class Product {
public double price;
public Product(double price) {
this.price = price;
}
}
And a Map:
Map<Product, Integer> products = new HashMap<>();
That contains several products added like so:
products.put(new Product(2.99), 2);
products.put(new Product(1.99), 4);
And I want to calculate the sum of all product prices multiplied by their values using streams. I tried:
double total = products.entrySet().stream().mapToDouble((k, v) -> k.getKey().price * v.getValue()).sum();
But it doesn't compile, I get “Cannot resolve method getValue()”.
I expect:
(2.99 * 2) + (1.99 * 4) = 5.98 + 7.96 = 13.94
The stream of entries needs single parameter lambda for each entry, not (k,v):
double total = products.entrySet().stream().mapToDouble(e -> e.getKey().price * e.getValue()).sum();
You can avoid the explicit creation of a DoubleStream with something like:
double total = products.entrySet()
.stream()
.collect(Collectors.summingDouble(e -> e.getKey().price * e.getValue()));
Not directly related to your question, but I wouldn't use a map for what you are doing. Instead create a new class
public class ProductAmount {
private Product product;
private int amount;
public ProductAmount(Product product, int amount) {
this.product = product;
this.amount = amount;
}
public double getCombinedPrice() {
return product.price * amount;
}
}
Then you can use a List instead of a Map.
List<ProductAmount> products = Arrays.asList(
new ProductAmount(new Product(2.99), 2),
new ProductAmount (new Product(1.99), 4));
products.stream().mapToDouble(ProductAmount::getCombinedPrice).sum();
You can also do it like so.
double sum = 0;
for(Entry<Product, Integer> e : products.entrySet()) {
sum += e.getKey().price * e.getValue();
}
System.out.println(sum);
prints
13.940000000000001
But you have a fundamental flaw in your class. You don't override equals or hashCode, so you are using the object reference as the key. Try doing the following:
System.out.println(products.get(new Product(1.99)));
It will print null since there is no entry for that reference (it's a different object than the one used to store the value 4).
And finally you should make certain your keys are immutable. Otherwise, circumstances could result in the same error.
Check out why do I need to override hashcode and equals.
And since it was mentioned in the comments, also check out what data type to use for money in java.
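To illustrate the equals/hashCode point, here is a sketch of a Product that can be used as a map key. The wrapper class ProductKeyDemo and the lookup helper are hypothetical; price stays a double only to match the question.

```java
import java.util.HashMap;
import java.util.Map;

class ProductKeyDemo {
    // Immutable key: the price is final and takes part in equals and hashCode.
    static final class Product {
        private final double price;

        Product(double price) {
            this.price = price;
        }

        double getPrice() {
            return price;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Product)) {
                return false;
            }
            return Double.compare(price, ((Product) o).price) == 0;
        }

        @Override
        public int hashCode() {
            return Double.hashCode(price);
        }
    }

    // Looks up the count using a new, but equal, Product instance.
    static Integer lookup() {
        Map<Product, Integer> products = new HashMap<>();
        products.put(new Product(2.99), 2);
        products.put(new Product(1.99), 4);
        return products.get(new Product(1.99));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // prints 4 instead of null
    }
}
```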

Java functional programming for multiple functionality with single stream data

There is a List of object like:-
ID Employee IN_COUNT OUT_COUNT Date
1 ABC 5 7 2020-06-11
2 ABC 12 5 2020-06-12
3 ABC 9 6 2020-06-13
This is employee data for three dates, which I get from a query as a List of objects.
Now I want the total of IN_COUNT and OUT_COUNT across the three dates. This can be achieved by iterating the stream once for IN_COUNT only and calling sum(), and then summing only the OUT_COUNT data in a second iteration. But I don't want to iterate the list twice.
How is this possible in functional programming, using streams or any other option?
What you are trying to do is called a 'fold' operation in functional programming. Java streams call this 'reduce' and 'sum', 'count', etc. are just specialized reduces/folds. You just have to provide a binary accumulation function. I'm assuming Java Bean style getters and setters and an all args constructor. We just ignore the other fields of the object in our accumulation:
List<MyObj> data = fetchData();
Date d = new Date();
MyObj res = data.stream()
.reduce((a, b) -> {
return new MyObj(0, a.getEmployee(),
a.getInCount() + b.getInCount(), // Accumulate IN_COUNT
a.getOutCount() + b.getOutCount(), // Accumulate OUT_COUNT
d);
})
.orElseThrow();
This is simplified and assumes that you only have one employee in the list, but you can use standard stream operations to partition and group your stream (groupingBy).
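For several employees, one possible single-pass sketch uses Collectors.toMap with a merge function. The types Row and Totals and the method totalsPerEmployee are hypothetical stand-ins for the query result, not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class InOutTotals {
    // Hypothetical row of the query result; only the fields needed here.
    static class Row {
        final String employee;
        final int inCount;
        final int outCount;

        Row(String employee, int inCount, int outCount) {
            this.employee = employee;
            this.inCount = inCount;
            this.outCount = outCount;
        }
    }

    // Immutable pair of running totals.
    static class Totals {
        final int in;
        final int out;

        Totals(int in, int out) {
            this.in = in;
            this.out = out;
        }

        // Returns a new Totals instead of modifying either operand.
        Totals plus(Totals other) {
            return new Totals(in + other.in, out + other.out);
        }
    }

    // One pass over the list: both counts are summed per employee.
    static Map<String, Totals> totalsPerEmployee(List<Row> rows) {
        return rows.stream().collect(Collectors.toMap(
                row -> row.employee,
                row -> new Totals(row.inCount, row.outCount),
                Totals::plus));
    }

    public static void main(String[] args) {
        Map<String, Totals> totals = totalsPerEmployee(Arrays.asList(
                new Row("ABC", 5, 7), new Row("ABC", 12, 5), new Row("ABC", 9, 6)));
        Totals abc = totals.get("ABC");
        System.out.println(abc.in + " - " + abc.out); // prints 26 - 18
    }
}
```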
If you don't want to or can't create a MyObj, you can use a different type as accumulator. I'll use Map.entry, because Java lacks a Pair/Tuple type:
Map.Entry<Integer, Integer> res = l.stream().reduce(
Map.entry(0, 0), // Identity
(sum, x) -> Map.entry(sum.getKey() + x.getInCount(), sum.getValue() + x.getOutCount()), // accumulate
(s1, s2) -> Map.entry(s1.getKey() + s2.getKey(), s1.getValue() + s2.getValue()) // combine
);
What's happening here? We now have a reduce function of Pair accum, MyObj next -> Pair. The 'identity' is our start value, the accumulator function adds the next MyObj to the current result and the last function is only used to combine intermediate results (e.g., if done in parallel).
Too complicated? We can split the steps of extracting interesting properties and accumulating them:
Map.Entry<Integer, Integer> res = l.stream()
.map(x -> Map.entry(x.getInCount(), x.getOutCount()))
.reduce((x, y) -> Map.entry(x.getKey() + y.getKey(), x.getValue() + y.getValue()))
.orElseGet(() -> Map.entry(0, 0));
You can use reduce to do this:
public class Counts{
private int inCount;
private int outCount;
//constructor, getters, setters
}
public static void main(String[] args){
List<Counts> list = new ArrayList<>();
list.add(new Counts(5, 7));
list.add(new Counts(12, 5));
list.add(new Counts(9, 6));
Counts total = list.stream().reduce(
//the start point, like sum = 0
new Counts(0,0),
//return a new Counts on each step; reduce must not modify the identity or the list elements
(sum, e) -> new Counts(
sum.getInCount() + e.getInCount(),
sum.getOutCount() + e.getOutCount()
)
);
System.out.println(total.getInCount() + " - " + total.getOutCount());
}

looking for a recommendation for Data structure for a sortable pair

I'm looking for a data structure s.t I could store a pair of Integer and String
and I'll be able to sort it twice: once by Integer descending order and once by lexical order.
I also want to be able to add a new pair dynamically.
for example: {(13,a)(12,d) (9,a)}
sort by numbers: {(13, a) (12, d) (9,a)}
sort by lexical order: {(9, a) (13, a) (12, d)}
What would you suggest?
Create a class Pair which holds an integer and a string:
public class Pair {
private Integer num;
private String text;
public Pair(Integer num, String text) {
this.num = num;
this.text = text;
}
public Integer getNum() { return num; }
public String getText() { return text; }
}
List<Pair> list = new ArrayList<>();
list.add(new Pair(13, "a"));
list.add(new Pair(12, "d"));
list.add(new Pair(9, "a"));
Java 8 does support custom inline comparators when sorting, but in your case it appears that you want a two level sort, first by number, then by text (or vice-versa for the other comparator). In this case, we define two custom comparators. The second sorting condition is added via the Comparator#thenComparing() method in a chaining sort of fashion. Then, we convert a stream into an actual sorted list.
// use the getters (the fields are private); reverseOrder() avoids negating the number
Comparator<Pair> c1 = Comparator.comparing(Pair::getNum, Comparator.reverseOrder());
c1 = c1.thenComparing(Pair::getText);
Stream<Pair> pairStream = list.stream().sorted(c1);
List<Pair> sortedPairs = pairStream.collect(Collectors.toList());
System.out.println("Sorting descending by number:");
for (Pair p : sortedPairs) {
System.out.println("(" + p.getNum() + ", " + p.getText() + ")");
}
Comparator<Pair> c2 = Comparator.comparing(Pair::getText);
c2 = c2.thenComparing(Pair::getNum);
pairStream = list.stream().sorted(c2);
sortedPairs = pairStream.collect(Collectors.toList());
System.out.println("Sorting ascending by text:");
for (Pair p : sortedPairs) {
System.out.println("(" + p.getNum() + ", " + p.getText() + ")");
}
Output:
Sorting descending by number:
(13, a)
(12, d)
(9, a)
Sorting ascending by text:
(9, a)
(13, a)
(12, d)
Demo here:
Rextester
Since you want to store a list of things with order, I suggest using a List.
The type that this List is going to store is IntegerStringPair, which is defined like this:
class IntegerStringPair {
private int integer;
private String string;
public int getInteger() {
return integer;
}
public String getString() {
return string;
}
public IntegerStringPair(int integer, String string) {
this.integer = integer;
this.string = string;
}
}
Your list will be declared like this:
List<IntegerStringPair> list = new ArrayList<>();
To sort the list, you can do these:
// by integer, descending
list.sort((x, y) -> Integer.compare(y.getInteger(), x.getInteger()));
// by string lexically
list.sort((x, y) -> x.getString().compareTo(y.getString()));
This really calls for a data class that isn't itself Comparable but is rather sorted using two different Comparators.
The rest depends on your other requirements - is modifying the collection more common than reading it? Are you going to use List semantics (have duplicate items in the collection) or is it going to be a Set?. Depending on those, the structure holding the items might be anything from a List with two methods that copy and sort the list according to one of the two comparators, to two TreeSets to hold pre-sorted values.
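As a sketch of the "two TreeSets" end of that range, assuming Set semantics are acceptable (a pair that repeats both the number and the text is stored only once). The names DualOrderPairs, Pair and the accessor methods are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class DualOrderPairs {
    static final class Pair {
        private final int num;
        private final String text;

        Pair(int num, String text) {
            this.num = num;
            this.text = text;
        }

        int getNum() { return num; }
        String getText() { return text; }

        @Override
        public String toString() {
            return "(" + num + ", " + text + ")";
        }
    }

    // Number descending, text as tie-breaker.
    private static final Comparator<Pair> BY_NUM_DESC =
            Comparator.comparingInt(Pair::getNum).reversed().thenComparing(Pair::getText);
    // Text ascending, number as tie-breaker.
    private static final Comparator<Pair> BY_TEXT =
            Comparator.comparing(Pair::getText).thenComparingInt(Pair::getNum);

    private final TreeSet<Pair> byNum = new TreeSet<>(BY_NUM_DESC);
    private final TreeSet<Pair> byText = new TreeSet<>(BY_TEXT);

    // Each insertion keeps both orderings up to date in O(log n).
    void add(Pair pair) {
        byNum.add(pair);
        byText.add(pair);
    }

    List<Pair> sortedByNumberDescending() {
        return new ArrayList<>(byNum);
    }

    List<Pair> sortedByText() {
        return new ArrayList<>(byText);
    }

    public static void main(String[] args) {
        DualOrderPairs pairs = new DualOrderPairs();
        pairs.add(new Pair(13, "a"));
        pairs.add(new Pair(12, "d"));
        pairs.add(new Pair(9, "a"));
        System.out.println(pairs.sortedByNumberDescending()); // prints [(13, a), (12, d), (9, a)]
        System.out.println(pairs.sortedByText());             // prints [(9, a), (13, a), (12, d)]
    }
}
```

This trades memory and slower inserts for reads that never need to sort; if reads are rare, a single List sorted on demand with either comparator is simpler.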

JAVA - Storing data from result set to hashmap and aggregating them

as the title, I'd like to store data from result set to hash map and then use them for further processing (max, min, avg, grouping).
So far, I achieved this by using a proper hash map and implementing each operation from scratch - iterating over the hash map (key, value) pairs.
My question is: is there a library that performs such operations?
For example, a method that computes the maximum value over a List or a method that, given two same-size arrays, performs a "index-to-index" difference.
Thanks in advance.
Well, there is the Collections class, for instance. It has a number of useful static methods, but you'll have to read through them and choose the one you need. Here is the documentation:
https://docs.oracle.com/javase/8/docs/api/java/util/Collections.html
This class consists exclusively of static methods that operate on or
return collections.
Example:
List<Integer> list = new ArrayList<>();
List<String> stringList = new ArrayList<>();
// Populate the lists
for(int i=0; i<=10; ++i){
list.add(i);
String newString = "String " + i;
stringList.add(newString);
}
// add another negative value to the integer list
list.add(-1939);
// Print the min value from the integer list and the max value from the string list.
System.out.println("Min value: " + Collections.min(list));
System.out.println("Max value: " + Collections.max(stringList));
The output will be:
run:
Min value: -1939
Max value: String 9
Max value: String 9
BUILD SUCCESSFUL (total time: 0 seconds)
A similar question has been answered before, for example here:
how to get maximum value from the List/ArrayList
There are some useful functions in the Collections API already.
For example max or min
Collections.max(arrayList);
Please check the Collections documentation to see whether there is a function for what you need. There probably is.
You can use java 8 streams for this.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class Testing {
public static void main(String[] args) {
//List of integers
List<Integer> list = new ArrayList<>();
list.add(7);
list.add(5);
list.add(4);
list.add(6);
list.add(9);
list.add(11);
list.add(12);
//get sorted list using streams
System.out.println(list.stream().sorted().collect(Collectors.toList()));
//find min value in list
System.out.println(list.stream().min(Integer::compareTo).get());
//find max value in list
System.out.println(list.stream().max(Integer::compareTo).get());
//find average of list
System.out.println(list.stream().mapToInt(val->val).average().getAsDouble());
//Map of integers
Map<Integer,Integer> map = new HashMap<>();
map.put(1, 10);
map.put(2, 12);
map.put(3, 15);
//find max value in map
System.out.println(map.entrySet().stream().max(Map.Entry.comparingByValue()).get().getValue());
//find key of max value in map
System.out.println(map.entrySet().stream().max(Map.Entry.comparingByValue()).get().getKey());
//find min value in map
System.out.println(map.entrySet().stream().min(Map.Entry.comparingByValue()).get().getValue());
//find key of min value in map
System.out.println(map.entrySet().stream().min(Map.Entry.comparingByValue()).get().getKey());
//find average of values in map
System.out.println(map.entrySet().stream().map(Map.Entry::getValue).mapToInt(val ->val).average().getAsDouble());
}
}
Keep in mind that this only works with JDK 1.8 or later; streams are not supported in earlier JDK versions.
In Java8 there are IntSummaryStatistics, LongSummaryStatistics, DoubleSummaryStatistics to calculate max,min,count,average and sum
public static void main(String[] args) {
List<Employee> resultSet = ...
Map<String, DoubleSummaryStatistics> stats = resultSet.stream().collect(Collectors.groupingBy(Employee::getName, Collectors.summarizingDouble(Employee::getSalary)));
stats.forEach((n, stat) -> System.out.println("Name " + n + " Average " + stat.getAverage() + " Max " + stat.getMax())); // min, sum, count can also be taken from stat
}
static class Employee {
String name;
Double salary;
public String getName() {
return name;
}
public Double getSalary() {
return salary;
}
}
For max, min and avg you can use Java 8 and its stream processing.
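As a minimal sketch for plain arrays, covering both max/min/avg and the "index-to-index" difference from the question (the class and method names ArrayStats, stats and difference are made up for illustration):

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class ArrayStats {
    // Count, sum, min, max and average in a single pass.
    static IntSummaryStatistics stats(int[] values) {
        return Arrays.stream(values).summaryStatistics();
    }

    // Element-wise difference a[i] - b[i] of two arrays of the same length.
    static int[] difference(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        return IntStream.range(0, a.length).map(i -> a[i] - b[i]).toArray();
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats(new int[] {7, 5, 4, 6});
        System.out.println(s.getMin() + " " + s.getMax() + " " + s.getAverage()); // prints 4 7 5.5
        System.out.println(Arrays.toString(
                difference(new int[] {5, 7, 9}, new int[] {1, 2, 3}))); // prints [4, 5, 6]
    }
}
```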
