Parallel looping with Java Streams?

I have two lists. One shows number of successful attempts for each individual in a group of people for some game.
public class SuccessfulAttempts{
String name;
int successCount;
}
List<SuccessfulAttempts> success;
And total number of attempts for each individual.
public class TotalAttempts{
String name;
int totalCount;
}
List<TotalAttempts> total;
And I want to show the percentage success for each person in the group.
public class PercentageSuccess{
String name;
float percentage;
}
List<PercentageSuccess> percentage;
Assume I have populated the first two lists like this:
success.add(new SuccessfulAttempts("Alice", 4));
success.add(new SuccessfulAttempts("Bob", 7));
total.add(new TotalAttempts("Alice", 5));
total.add(new TotalAttempts("Bob", 10));
Now I want to calculate the success percentage for each person using Java Streams, so the list List<PercentageSuccess> percentage should end up containing:
new PercentageSuccess("Alice", 80);
new PercentageSuccess("Bob", 70);
I want to calculate them (Alice's percentage and Bob's percentage) in parallel (I know how to do it sequentially with a loop). How can I achieve this with Java Streams (or any other simple way)?

I would suggest converting one of your lists to a Map for easier access to the counts. Otherwise, for each element of one list you have to loop over the other list, which has O(n^2) complexity.
List<SuccessfulAttempts> success = new ArrayList<>();
List<TotalAttempts> total = new ArrayList<>();
success.add(new SuccessfulAttempts("Alice", 4));
success.add(new SuccessfulAttempts("Bob", 7));
total.add(new TotalAttempts("Alice", 5));
total.add(new TotalAttempts("Bob", 10));
// First create a Map
Map<String, Integer> attemptsMap = success.parallelStream()
.collect(Collectors.toMap(SuccessfulAttempts::getName, SuccessfulAttempts::getSuccessCount));
// Loop through the list of players and calculate percentage.
List<PercentageSuccess> percentage =
total.parallelStream()
// Remove players who have not participated from List 'total'. ('attempt' refers to single element in List 'total').
.filter(attempt -> attemptsMap.containsKey(attempt.getName()))
// Calculate percentage and create the required object
.map(attempt -> new PercentageSuccess(attempt.getName(),
// multiply by 100f so the division is done in floating point instead of being truncated
(attemptsMap.get(attempt.getName()) * 100f) / attempt.getTotalCount()))
// Collect it back to list
.collect(Collectors.toList());
percentage.forEach(System.out::println);

If the lists are the same size and in the same order, you can use integer indexes to access the original list elements.
List<PercentageSuccess> result = IntStream.range(0, size).parallel().mapToObj(index -> /*get the elements and construct the percentage for the person with the given index*/).collect(Collectors.toList());
This means you have to create a method or constructor for PercentageSuccess which constructs a percentage from the given SuccessfulAttempts and TotalAttempts.
PercentageSuccess(SuccessfulAttempts success, TotalAttempts total) {
this.name = success.name;
this.percentage = 100f * success.successCount / total.totalCount; // scaled to a percentage, e.g. 4 of 5 gives 80
}
Then you construct a parallel stream of integers from 0 to size:
IntStream.range(0, size).parallel()
This is effectively a parallel for loop. Then turn each integer into the PercentageSuccess of the index'th person (note that you must ensure that the lists are the same size and not shuffled, otherwise my code is not correct).
.mapToObj(index -> new PercentageSuccess(success.get(index), total.get(index)))
and finally turn the Stream into a List with
.collect(Collectors.toList())
Also, this approach is not optimal if success or total is a LinkedList or another list implementation with O(n) cost of accessing an element by index.
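Putting the pieces of this answer together, a minimal runnable sketch might look like the following. The nested classes are simplified stand-ins for the question's classes, and the names IndexedPercentages and percentages are assumptions made for the example; the size check is an added safeguard, not part of the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class IndexedPercentages {

    // Simplified stand-ins for the question's classes (assumed shapes).
    static final class SuccessfulAttempts {
        final String name;
        final int successCount;
        SuccessfulAttempts(String name, int successCount) {
            this.name = name;
            this.successCount = successCount;
        }
    }

    static final class TotalAttempts {
        final String name;
        final int totalCount;
        TotalAttempts(String name, int totalCount) {
            this.name = name;
            this.totalCount = totalCount;
        }
    }

    static final class PercentageSuccess {
        final String name;
        final float percentage;
        PercentageSuccess(SuccessfulAttempts success, TotalAttempts total) {
            this.name = success.name;
            this.percentage = 100f * success.successCount / total.totalCount;
        }
    }

    // Pairs up the two lists by index; both lists must be in the same order.
    static List<PercentageSuccess> percentages(List<SuccessfulAttempts> success,
                                               List<TotalAttempts> total) {
        if (success.size() != total.size()) {
            throw new IllegalArgumentException("lists must have the same size");
        }
        return IntStream.range(0, success.size())
                .parallel() // each index may be processed on a different thread
                .mapToObj(i -> new PercentageSuccess(success.get(i), total.get(i)))
                .collect(Collectors.toList()); // encounter order is preserved
    }

    public static void main(String[] args) {
        List<PercentageSuccess> result = percentages(
                Arrays.asList(new SuccessfulAttempts("Alice", 4), new SuccessfulAttempts("Bob", 7)),
                Arrays.asList(new TotalAttempts("Alice", 5), new TotalAttempts("Bob", 10)));
        // prints "Alice 80.0" then "Bob 70.0"
        result.forEach(p -> System.out.println(p.name + " " + p.percentage));
    }
}
```

For a list of two people the parallel version will almost certainly be slower than a plain loop; parallel streams tend to pay off only for large inputs or expensive per-element work.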

private static List<PercentageSuccess> percentage(List<SuccessfulAttempts> success, List<TotalAttempts> total) {
// Successes per name; duplicate names are added together
Map<String, Integer> successMap = success.parallelStream()
.collect(Collectors.toMap(SuccessfulAttempts::getName, SuccessfulAttempts::getSuccessCount, (a, b) -> a + b));
// Total attempts per name; toMap throws if a name occurs twice
Map<String, Integer> totalMap = total.parallelStream()
.collect(Collectors.toMap(TotalAttempts::getName, TotalAttempts::getTotalCount));
// One PercentageSuccess per person in the success list
return successMap.entrySet().parallelStream().map(entry -> new PercentageSuccess(entry.getKey(),
entry.getValue() * 1.0f / totalMap.get(entry.getKey()) * 100))
.collect(Collectors.toList());
}

Related

Java functional programming for multiple functionality with single stream data

There is a List of objects like this:
ID Employee IN_COUNT OUT_COUNT Date
1 ABC 5 7 2020-06-11
2 ABC 12 5 2020-06-12
3 ABC 9 6 2020-06-13
This is employee data for three dates, which I get from a query as a List of objects.
Now I want the total IN_COUNT and OUT_COUNT across the three dates. This can be achieved by iterating the stream once for IN_COUNT and calling sum(), and then a second time to sum OUT_COUNT. But I don't want to iterate the list twice.
How is this possible in functional programming, using streams or any other option?

What you are trying to do is called a 'fold' operation in functional programming. Java streams call this 'reduce' and 'sum', 'count', etc. are just specialized reduces/folds. You just have to provide a binary accumulation function. I'm assuming Java Bean style getters and setters and an all args constructor. We just ignore the other fields of the object in our accumulation:
List<MyObj> data = fetchData();
Date d = new Date();
MyObj res = data.stream()
.reduce((a, b) -> {
return new MyObj(0, a.getEmployee(),
a.getInCount() + b.getInCount(), // Accumulate IN_COUNT
a.getOutCount() + b.getOutCount(), // Accumulate OUT_COUNT
d);
})
.orElseThrow();
This is simplified and assumes that you only have one employee in the list, but you can use standard stream operations to partition and group your stream (Collectors.partitioningBy, Collectors.groupingBy).
If you don't want to or can't create a MyObj, you can use a different type as the accumulator. I'll use Map.entry (available since Java 9), because Java lacks a Pair/Tuple type:
Map.Entry<Integer, Integer> res = data.stream().reduce(
Map.entry(0, 0), // Identity
(sum, x) -> Map.entry(sum.getKey() + x.getInCount(), sum.getValue() + x.getOutCount()), // accumulate
(s1, s2) -> Map.entry(s1.getKey() + s2.getKey(), s1.getValue() + s2.getValue()) // combine
);
What's happening here? We now have a reduce function of Pair accum, MyObj next -> Pair. The 'identity' is our start value, the accumulator function adds the next MyObj to the current result and the last function is only used to combine intermediate results (e.g., if done in parallel).
Too complicated? We can split the steps of extracting interesting properties and accumulating them:
Map.Entry<Integer, Integer> res = data.stream()
.map(x -> Map.entry(x.getInCount(), x.getOutCount()))
.reduce((x, y) -> Map.entry(x.getKey() + y.getKey(), x.getValue() + y.getValue()))
.orElseGet(() -> Map.entry(0, 0));
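If the list holds more than one employee, the grouping mentioned earlier in this answer could be sketched roughly as follows. Row, PerEmployeeTotals and totals are hypothetical names standing in for the question's list element type, and the int[2] accumulator is just one possible choice of container.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

final class PerEmployeeTotals {

    // Hypothetical row type; only the fields needed for the sums are modelled.
    static final class Row {
        final String employee;
        final int inCount;
        final int outCount;
        Row(String employee, int inCount, int outCount) {
            this.employee = employee;
            this.inCount = inCount;
            this.outCount = outCount;
        }
    }

    // Returns employee -> {sum of IN_COUNT, sum of OUT_COUNT}, visiting each row once.
    static Map<String, int[]> totals(List<Row> rows) {
        Collector<Row, int[], int[]> sumBoth = Collector.of(
                () -> new int[2],                                         // fresh accumulator per group
                (acc, r) -> { acc[0] += r.inCount; acc[1] += r.outCount; }, // add one row
                (a, b) -> { a[0] += b[0]; a[1] += b[1]; return a; });     // merge partial results
        return rows.stream().collect(Collectors.groupingBy(r -> r.employee, sumBoth));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("ABC", 5, 7),
                new Row("ABC", 12, 5),
                new Row("ABC", 9, 6));
        int[] abc = totals(rows).get("ABC");
        System.out.println("ABC in=" + abc[0] + " out=" + abc[1]); // prints "ABC in=26 out=18"
    }
}
```

Unlike reduce, a collector is allowed to mutate its accumulator, because the supplier creates a fresh container for every group (and for every thread in a parallel run).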

You can use reduce to do this:
public class Counts{
private int inCount;
private int outCount;
//constructor, getters, setters
}
public static void main(String[] args){
List<Counts> list = new ArrayList<>();
list.add(new Counts(5, 7));
list.add(new Counts(12, 5));
list.add(new Counts(9, 6));
Counts total = list.stream().reduce(
//the starting point, like sum = 0
//each step returns a new Counts, so neither the identity nor the list elements are modified
new Counts(0,0),
(sum, e) -> new Counts(
sum.getInCount() + e.getInCount(),
sum.getOutCount() + e.getOutCount())
);
System.out.println(total.getInCount() + " - " + total.getOutCount());
}

How can I manage items with overlapping ranges, where based on a value I get the matching items

Say I have the following items (unsorted):
A, with A.amount = 10
B, with B.amount = 100
C, with C.amount = 50
D, with D.amount = 50
Now, for every unique amount boundary AB in items, find the items whose range includes the value and calculate the cumulative bracket. So:
AB=10 results in { A, B, C, D } -> cumulative bracket 210
AB=50 results in { B, C, D } -> cumulative bracket 200
AB=100 results in { B } -> cumulative bracket 100
It would be used like so:
for (int AB : collectAmountBoundaries(items)) {
Collection<Item> itemsInBracket = findItemsForAB(items, AB);
// execute logic, calculations etc with cumulative bracket value for AB
}
Now I can code all this using vanilla Java, by first manually transforming the collection of items into a map of AB→cumulativeBracketValue or something. However, since I'm working with ranges and overlap-logic I feel somehow a clean solution involving NavigableMap, Range logic or something clever should be possible (it feels like a common pattern). Or perhaps using streams to do a collect groupingBy?
I'm not seeing it right now. Any ideas on how to tackle this cleanly?

I think doing a simple filter, then adding each filtered item to a List and its amount to a total, is sufficient.
static ListAndCumalativeAmount getCR(List<Item> items, double amount) {
ListAndCumalativeAmount result = new ListAndCumalativeAmount();
items.stream().filter(item -> item.amount >= amount).forEach((i) -> {
result.getItems().add(i.name);
result.add(i.amount);
});
return result;
}
static class ListAndCumalativeAmount {
private List<String> items = new ArrayList<>();
private Double amount = 0.0; // the Double(double) constructor is deprecated since Java 9
public List<String> getItems() {
return items;
}
public void add(double value) {
amount = amount + value;
}
public Double getAmount() {
return amount;
}
}

This is a way to do it with streams and groupingBy:
Map<Integer, SimpleEntry<List<Item>, Double>> groupedByBracketBoundary = items.stream()
.collect(groupingBy(o -> o.getAmount())).entrySet().stream()
// map map-values to map-entries of original value and sum, keeping key the same
.collect(toMap(Entry::getKey, e -> new SimpleEntry<>(e.getValue(),
e.getValue().stream()
.map(o -> o.getAmount())
.reduce(0d, (amount1, amount2) -> amount1 + amount2))));
LinkedHashSet<Integer> sortedUniqueAmountBoundaries = items.stream()
.map(o -> o.getAmount())
.sorted()
.collect(Collectors.toCollection(LinkedHashSet::new));
for (int ab : sortedUniqueAmountBoundaries) {
List<Item> itemsInBracket = groupedByBracketBoundary.get(ab).getKey();
double cumulativeAmountForBracket = groupedByBracketBoundary.get(ab).getValue();
// execute logic, calculations etc with cumulative bracket value for AB
}
Somehow this feels succinct and verbose at the same time; it's rather dense. Isn't there a JDK API or third-party library that does this kind of thing?
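One JDK-only possibility for the cumulative values from the question, offered as a sketch rather than a definitive answer: sum the amounts per distinct boundary in a descending TreeMap, then walk it once while keeping a running total. The names CumulativeBrackets and cumulativeByBoundary are made up for this example, and it only works on the amounts, not on the Item objects themselves.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class CumulativeBrackets {

    // Maps each distinct amount boundary to the sum of all amounts >= that boundary.
    static NavigableMap<Integer, Integer> cumulativeByBoundary(Collection<Integer> amounts) {
        // Step 1: total per distinct amount, largest boundary first.
        TreeMap<Integer, Integer> perAmount = new TreeMap<Integer, Integer>(Comparator.reverseOrder());
        for (int amount : amounts) {
            perAmount.merge(amount, amount, Integer::sum);
        }
        // Step 2: running sum from the largest boundary down to the smallest.
        TreeMap<Integer, Integer> cumulative = new TreeMap<>();
        int running = 0;
        for (Map.Entry<Integer, Integer> e : perAmount.entrySet()) {
            running += e.getValue();
            cumulative.put(e.getKey(), running);
        }
        return cumulative;
    }

    public static void main(String[] args) {
        // Amounts of A, B, C and D from the question.
        System.out.println(cumulativeByBoundary(Arrays.asList(10, 100, 50, 50)));
        // prints {10=210, 50=200, 100=100}
    }
}
```

Because the result is a NavigableMap, the bracket for an arbitrary value can then be looked up with ceilingEntry(value), which returns the entry for the smallest boundary at or above that value.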

How to get 10 Objects based on a value from the Object?

I have to create a method that gives the 10 Taxpayers that spent the most in the entire system.
There's a lot of classes already created and code that would have to be in between but what I need is something like:
public TreeSet<Taxpayer> getTenTaxpayers(){
TreeSet<Taxpayer> taxp = new TreeSet<Taxpayer>();
...
for(Taxpayer t: this.taxpayers.values()){ //going through the Map<String, Taxpayer>
for(Invoice i: this.invoices.values()){ //going through the Map<String, Invoice>
if(taxp.size()<=10){
if(t.getTIN().equals(i.getTIN())){ //if the TIN on the taxpayer is the same as in the Invoice
...
}
}
}
}
return taxp;
}
To sum it up, I have to go through a Map<String, Taxpayer> which has for example 100 Taxpayers, then go through a Map<String, Invoice> for each respective invoice and return a new Collection holding the 10 Taxpayers that spent the most on the entire system based on 1 attribute on the Invoice Class. My problem is how do I get those 10, and how do I keep it sorted. My first look at it was to use a TreeSet with a Comparator but the problem is the TreeSet would be with the class Taxpayer while what we need to compare is an attribute on the class Invoice.

Isn't this a classic Top K problem? Maybe you can use java.util.PriorityQueue to build a min-heap that keeps the top 10 Taxpayers.
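A rough sketch of that min-heap idea, assuming the invoice amounts have already been summed per TIN into a Map<String, Double> (that map, and the names TopSpenders and topK, are assumptions for this example and not part of the question's code):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class TopSpenders {

    // Returns the keys of the k largest totals, largest first.
    static List<String> topK(Map<String, Double> totalByTin, int k) {
        Comparator<Map.Entry<String, Double>> byTotal = Map.Entry.comparingByValue();
        // Min-heap: the smallest of the current candidates sits at the head.
        PriorityQueue<Map.Entry<String, Double>> heap = new PriorityQueue<>(byTotal);
        for (Map.Entry<String, Double> entry : totalByTin.entrySet()) {
            heap.offer(entry);
            if (heap.size() > k) {
                heap.poll(); // drop the smallest so at most k entries remain
            }
        }
        // The heap yields smallest first, so prepend to get descending order.
        Deque<String> ordered = new ArrayDeque<>();
        while (!heap.isEmpty()) {
            ordered.addFirst(heap.poll().getKey());
        }
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        Map<String, Double> totals = new HashMap<>();
        totals.put("A", 120.0);
        totals.put("B", 300.0);
        totals.put("C", 50.0);
        totals.put("D", 210.0);
        System.out.println(topK(totals, 2)); // prints [B, D]
    }
}
```

The heap never holds more than k + 1 entries, so this runs in O(n log k) instead of the O(n log n) of a full sort; for 100 taxpayers the difference is negligible, but the approach scales.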

This can be broken down into 3 steps:
Extract distinct TaxPayers
Extract Invoices for each payer and then sum amount
Sort by the paid amount and limit to the first 10
If you are using Java 8 you can do something like:
final Map<TaxPayer, Double> topTenMap = payersMap.values() // get values from map
.stream() // create java.util.Stream
.distinct() // do not process duplicates (TaxPayer must provide a standard-compliant equals method)
.map(taxPayer -> {
final double totalAmount = invoicesMap
.values() // get values from the invoices map
.stream() // create Stream
.filter(invoice -> invoice.getTIN().equals(taxPayer.getTIN())) // get only those for the current TaxPayer
.mapToDouble(Invoice::getAmount) // get amount
.sum(); // sum amount
return new AbstractMap.SimpleEntry<>(taxPayer, totalAmount); // create Map.Entry
})
.sorted((entry1, entry2) -> Double.compare(entry2.getValue(), entry1.getValue())) // sort by total amount, largest first
.limit(10) // get only top ten payers
.collect(Collectors.toMap( // save to map
AbstractMap.SimpleEntry::getKey,
AbstractMap.SimpleEntry::getValue,
(a, b) -> a, // never called, because the payers are distinct
LinkedHashMap::new // keeps the sorted order
));
Surely there is a more elegant solution. Also, I haven't tested it because I don't have much time now.

Java 8 lambda sum, count and group by

Select sum(paidAmount), count(paidAmount), classificationName
From tableA
Group by classificationName;
How can I do this in Java 8 using streams and collectors?
Java8:
lineItemList.stream()
.collect(Collectors.groupingBy(Bucket::getBucketName,
Collectors.reducing(BigDecimal.ZERO,
Bucket::getPaidAmount,
BigDecimal::add)))
This gives me the sum per group. But how can I also get the count per group name?
The expected result is:
100, 2, classname1
50, 1, classname2
150, 3, classname3

Using an extended version of the Statistics class of this answer,
class Statistics {
int count;
BigDecimal sum;
Statistics(Bucket bucket) {
count = 1;
sum = bucket.getPaidAmount();
}
Statistics() {
count = 0;
sum = BigDecimal.ZERO;
}
void add(Bucket b) {
count++;
sum = sum.add(b.getPaidAmount());
}
Statistics merge(Statistics another) {
count += another.count;
sum = sum.add(another.sum);
return this;
}
}
you can use it in a Stream operation like
Map<String, Statistics> map = lineItemList.stream()
.collect(Collectors.groupingBy(Bucket::getBucketName,
Collector.of(Statistics::new, Statistics::add, Statistics::merge)));
This may have a small performance advantage, as it only creates one Statistics instance per group for a sequential evaluation. It even supports parallel evaluation, but you’d need a very large list with sufficiently large groups to get a benefit from parallel evaluation.
For a sequential evaluation, the operation is equivalent to
lineItemList.forEach(b ->
map.computeIfAbsent(b.getBucketName(), x -> new Statistics()).add(b));
whereas merging partial results after a parallel evaluation works closer to the example already given in the linked answer, i.e.
secondMap.forEach((key, value) -> firstMap.merge(key, value, Statistics::merge));

As you're using BigDecimal for the amounts (which is the correct approach, IMO), you can't make use of Collectors.summarizingDouble, which summarizes count, sum, average, min and max in one pass.
Alexis C. has already shown in his answer one way to do it with streams. Another way would be to write your own collector, as shown in Holger's answer.
Here I'll show another way. First let's create a container class with a helper method. Then, instead of using streams, I'll use common Map operations.
class Statistics {
int count;
BigDecimal sum;
Statistics(Bucket bucket) {
count = 1;
sum = bucket.getPaidAmount();
}
Statistics merge(Statistics another) {
count += another.count;
sum = sum.add(another.sum);
return this;
}
}
Now, you can make the grouping as follows:
Map<String, Statistics> result = new HashMap<>();
lineItemList.forEach(b ->
result.merge(b.getBucketName(), new Statistics(b), Statistics::merge));
This works by using the Map.merge method, whose docs say:
If the specified key is not already associated with a value or is associated with null, associates it with the given non-null value. Otherwise, replaces the associated value with the results of the given remapping function
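To make the two cases in the quoted documentation concrete, here is a small self-contained sketch that keeps the sum and the count in two separate maps. MergeDemo, sumByName and countByName are illustrative names, and the sample amounts are made up so that they add up to the totals expected in the question.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

final class MergeDemo {

    // Sums amounts[i] under names[i]; the two arrays are assumed to have the same length.
    static Map<String, BigDecimal> sumByName(String[] names, BigDecimal[] amounts) {
        Map<String, BigDecimal> sums = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            // Key absent: stores amounts[i]. Key present: stores oldValue.add(amounts[i]).
            sums.merge(names[i], amounts[i], BigDecimal::add);
        }
        return sums;
    }

    // Counts how often each name occurs.
    static Map<String, Long> countByName(String[] names) {
        Map<String, Long> counts = new HashMap<>();
        for (String name : names) {
            counts.merge(name, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] names = {"classname1", "classname2", "classname1"};
        BigDecimal[] amounts = {new BigDecimal("60"), new BigDecimal("50"), new BigDecimal("40")};
        System.out.println(sumByName(names, amounts).get("classname1")); // prints 100
        System.out.println(countByName(names).get("classname1"));        // prints 2
    }
}
```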

You could reduce pairs where the keys would hold the sum and the values would hold the count:
Map<String, SimpleEntry<BigDecimal, Long>> map =
lineItemList.stream()
.collect(groupingBy(Bucket::getBucketName,
reducing(new SimpleEntry<>(BigDecimal.ZERO, 0L),
b -> new SimpleEntry<>(b.getPaidAmount(), 1L),
(v1, v2) -> new SimpleEntry<>(v1.getKey().add(v2.getKey()), v1.getValue() + v2.getValue()))));
although Collectors.toMap looks cleaner:
Map<String, SimpleEntry<BigDecimal, Long>> map =
lineItemList.stream()
.collect(toMap(Bucket::getBucketName,
b -> new SimpleEntry<>(b.getPaidAmount(), 1L),
(v1, v2) -> new SimpleEntry<>(v1.getKey().add(v2.getKey()), v1.getValue() + v2.getValue())));

JAVA - Storing data from result set to hashmap and aggregating them

As the title says, I'd like to store data from a result set in a hash map and then use it for further processing (max, min, avg, grouping).
So far, I have achieved this by using a suitable hash map and implementing each operation from scratch, iterating over the hash map's (key, value) pairs.
My question is: is there a library that performs such operations?
For example, a method that computes the maximum value of a List, or a method that, given two same-size arrays, computes an "index-to-index" difference.
Thanks in advance.

Well, there is the Collections class, for instance. It has a bunch of useful static methods, but you'll have to read through them and choose the one you need. Here is the documentation:
https://docs.oracle.com/javase/8/docs/api/java/util/Collections.html
This class consists exclusively of static methods that operate on or
return collections.
Example:
List<Integer> list = new ArrayList<>();
List<String> stringList = new ArrayList<>();
// Populate the lists
for(int i=0; i<=10; ++i){
list.add(i);
String newString = "String " + i;
stringList.add(newString);
}
// add another negative value to the integer list
list.add(-1939);
// Print the min value from the integer list and the max value from the string list.
System.out.println("Min value: " + Collections.min(list));
System.out.println("Max value: " + Collections.max(stringList));
The output will be:
run:
Min value: -1939
Max value: String 9
BUILD SUCCESSFUL (total time: 0 seconds)
A similar question has been answered before, for example here:
how to get maximum value from the List/ArrayList

There are some useful functions in the Collections API already.
For example max or min:
Collections.max(arrayList);
Please look through the Collections documentation to see if there is a function that does what you need. There probably is.

You can use java 8 streams for this.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class Testing {
public static void main(String[] args) {
//List of integers
List<Integer> list = new ArrayList<>();
list.add(7);
list.add(5);
list.add(4);
list.add(6);
list.add(9);
list.add(11);
list.add(12);
//get sorted list using streams
System.out.println(list.stream().sorted().collect(Collectors.toList()));
//find min value in list
System.out.println(list.stream().min(Integer::compareTo).get());
//find max value in list
System.out.println(list.stream().max(Integer::compareTo).get());
//find average of list
System.out.println(list.stream().mapToInt(val->val).average().getAsDouble());
//Map of integers
Map<Integer,Integer> map = new HashMap<>();
map.put(1, 10);
map.put(2, 12);
map.put(3, 15);
//find max value in map
System.out.println(map.entrySet().stream().max(Map.Entry.comparingByValue()).get().getValue());
//find key of max value in map
System.out.println(map.entrySet().stream().max(Map.Entry.comparingByValue()).get().getKey());
//find min value in map
System.out.println(map.entrySet().stream().min(Map.Entry.comparingByValue()).get().getValue());
//find key of min value in map
System.out.println(map.entrySet().stream().min(Map.Entry.comparingByValue()).get().getKey());
//find average of values in map
System.out.println(map.entrySet().stream().map(Map.Entry::getValue).mapToInt(val ->val).average().getAsDouble());
}
}
Keep in mind that this only works with JDK 1.8 or later. Streams are not supported in earlier JDK versions.

In Java 8 there are IntSummaryStatistics, LongSummaryStatistics and DoubleSummaryStatistics to calculate max, min, count, average and sum:
public static void main(String[] args) {
List<Employee> resultSet = ...
Map<String, DoubleSummaryStatistics> stats = resultSet.stream().collect(Collectors.groupingBy(Employee::getName, Collectors.summarizingDouble(Employee::getSalary)));
stats.forEach((n, stat) -> System.out.println("Name " + n + " Average " + stat.getAverage() + " Max " + stat.getMax())); // min, sum, count can also be taken from stat
}
static class Employee {
String name;
Double salary;
public String getName() {
return name;
}
public Double getSalary() {
return salary;
}
}

For max, min and avg you can use Java 8 and its stream processing.
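The "index-to-index" difference mentioned in the question can also be written with a stream over the indexes. This is only a sketch, with ArrayDifference and difference as made-up names; it uses nothing beyond the JDK.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class ArrayDifference {

    // Returns a new array whose i-th element is a[i] - b[i].
    static int[] difference(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        return IntStream.range(0, a.length)
                .map(i -> a[i] - b[i])
                .toArray();
    }

    public static void main(String[] args) {
        int[] result = difference(new int[] {7, 5, 4}, new int[] {1, 2, 3});
        System.out.println(Arrays.toString(result)); // prints [6, 3, 1]
    }
}
```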
