Java stream API - avoid using same predicate twice to calculate average

class A {
double value;
String key;
}
class B {
List<A> aList;
}
Given List<B> bList, I want to calculate the average of the values for a specific key k. Assume each B holds at most one instance of A with key.equals(k).
I figured I could first filter aList, and later extract the value using mapToDouble:
double average = bList.stream().filter(
b -> b.aList.stream().anyMatch(
a -> a.key.equals(k)
)
).mapToDouble(
b -> b.aList.stream().filter(
a -> a.key.equals(k)
).findFirst().get().value
).average().orElse(0);
But there is clearly a redundancy here, since I am filtering the same list by the same predicate twice (a -> a.key.equals(k)).
Is there a way to mapToDouble while omitting elements with missing matching keys at the same time?
Edit:
Here is a more concrete example, hopefully it will make it easier to understand:
String courseName;
...
double average = students.stream().filter(
student -> student.getGrades().stream().anyMatch(
grade -> grade.getCourseName().equals(courseName)
)
).mapToDouble(
student -> student.getGrades().stream().filter(
grade -> grade.getCourseName().equals(courseName)
).findFirst().get().getValue()
).average().orElse(0);
System.out.println(courseName + " average: " + average);

Try this:
double average = bList.stream()
.flatMap(b -> b.aList.stream())
.filter(a -> a.key.equals(k))
.mapToDouble(a -> a.value)
.average()
.orElse(Double.NaN);
If your objects have private fields and getters, which they really should, it would look like this:
double average = bList.stream()
.map(B::getaList)
.flatMap(List::stream)
.filter(a -> a.getKey().equals(k))
.mapToDouble(A::getValue)
.average()
.orElse(Double.NaN);
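A self-contained sketch of the flatMap approach follows. The constructors, the class name FlatMapAverageDemo, and the sample data are illustrative assumptions, not part of the question. Note that this pipeline averages every matching A, so it agrees with the original code only under the stated assumption that each B holds at most one A with the key.

```java
import java.util.Arrays;
import java.util.List;

class FlatMapAverageDemo {
    // Minimal versions of the question's classes; the constructors are assumed.
    static class A {
        final String key;
        final double value;
        A(String key, double value) { this.key = key; this.value = value; }
    }

    static class B {
        final List<A> aList;
        B(A... as) { this.aList = Arrays.asList(as); }
    }

    // Flattens all A instances into one stream, keeps those with key k,
    // and averages their values. Returns NaN when nothing matches.
    static double averageFor(List<B> bList, String k) {
        return bList.stream()
                .flatMap(b -> b.aList.stream())
                .filter(a -> a.key.equals(k))
                .mapToDouble(a -> a.value)
                .average()
                .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<B> bList = Arrays.asList(
                new B(new A("math", 4.0), new A("art", 1.0)),
                new B(new A("math", 2.0)),
                new B()); // a B without any A is simply skipped
        System.out.println(averageFor(bList, "math"));  // 3.0
        System.out.println(averageFor(bList, "music")); // NaN
    }
}
```

Returning NaN for "no match" keeps the empty case distinguishable from a real average of 0.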

Try this.
double average = bList.stream()
.map(b -> b.aList.stream()
.filter(a -> a.key.equals(k))
.findFirst())
.filter(a -> a.isPresent())
.mapToDouble(a -> a.get().value)
.average().orElse(0);
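On Java 9 or later, the isPresent/get pair can be replaced by flatMap(Optional::stream). The sketch below assumes a hypothetical Grade class and sample data modelled on the question's edit. It keeps the "first match per student" semantics, so a duplicate grade for the same course is ignored.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FirstMatchAverageDemo {
    // Hypothetical stand-in for the question's grade objects.
    static class Grade {
        final String courseName;
        final double value;
        Grade(String courseName, double value) {
            this.courseName = courseName;
            this.value = value;
        }
    }

    // Averages the first matching grade of each student;
    // students without a matching grade are skipped.
    static double average(List<List<Grade>> gradesPerStudent, String courseName) {
        return gradesPerStudent.stream()
                .map(grades -> grades.stream()
                        .filter(g -> g.courseName.equals(courseName))
                        .findFirst())
                .flatMap(Optional::stream) // Java 9+: empty Optionals contribute nothing
                .mapToDouble(g -> g.value)
                .average()
                .orElse(0);
    }

    public static void main(String[] args) {
        List<List<Grade>> gradesPerStudent = Arrays.asList(
                Arrays.asList(new Grade("math", 4.0), new Grade("math", 10.0)),
                Arrays.asList(new Grade("art", 1.0)),
                Arrays.asList(new Grade("math", 2.0)));
        // Only the first math grade of each student counts: (4.0 + 2.0) / 2
        System.out.println(average(gradesPerStudent, "math")); // 3.0
    }
}
```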

Related

Java streams average

I need to create two methods using streams. A method that returns an average score of each task.
public Map<String, Double> averageScoresPerTask(Stream<CourseResult> results) {}
and a method that returns a task with the highest average score.
public String easiestTask(Stream<CourseResult> results) {}
I can only modify those 2 methods.
Here is CourseResult class
public class CourseResult {
private final Person person;
private final Map<String, Integer> taskResults;
public CourseResult(final Person person, final Map<String, Integer> taskResults) {
this.person = person;
this.taskResults = taskResults;
}
public Person getPerson() {
return person;
}
public Map<String, Integer> getTaskResults() {
return taskResults;
}
}
And methods that create CourseResult objects.
private final String[] programTasks = {"Lab 1. Figures", "Lab 2. War and Peace", "Lab 3. File Tree"};
private final String[] practicalHistoryTasks = {"Shieldwalling", "Phalanxing", "Wedging", "Tercioing"};
private Stream<CourseResult> programmingResults(final Random random) {
int n = random.nextInt(names.length);
int l = random.nextInt(lastNames.length);
return IntStream.iterate(0, i -> i + 1)
.limit(3)
.mapToObj(i -> new Person(
names[(n + i) % names.length],
lastNames[(l + i) % lastNames.length],
18 + random.nextInt(20)))
.map(p -> new CourseResult(p, Arrays.stream(programTasks).collect(toMap(
task -> task,
task -> random.nextInt(51) + 50))));
}
private Stream<CourseResult> historyResults(final Random random) {
int n = random.nextInt(names.length);
int l = random.nextInt(lastNames.length);
AtomicInteger t = new AtomicInteger(practicalHistoryTasks.length);
return IntStream.iterate(0, i -> i + 1)
.limit(3)
.mapToObj(i -> new Person(
names[(n + i) % names.length],
lastNames[(l + i) % lastNames.length],
18 + random.nextInt(20)))
.map(p -> new CourseResult(p,
IntStream.iterate(t.getAndIncrement(), i -> t.getAndIncrement())
.map(i -> i % practicalHistoryTasks.length)
.mapToObj(i -> practicalHistoryTasks[i])
.limit(3)
.collect(toMap(
task -> task,
task -> random.nextInt(51) + 50))));
}
Based on these methods, I can calculate the average for each task by dividing the sum of its scores by 3, because there are only 3 Persons. I could also divide by the number of CourseResult objects in the stream, in case the .limit(3) in these methods is changed.
I don't know how to access the keys of the taskResults map. I think I need them to return a map of unique keys, where the value for each key is the average of the values assigned to that key across the taskResults maps.

For your first question: map each CourseResult to its taskResults, flatMap to get all entries of every taskResults map from all CourseResults, group by map key (task name), and collect by averaging the values for each key:
public Map<String, Double> averageScoresPerTask(Stream<CourseResult> results) {
return results.map(CourseResult::getTaskResults)
.flatMap(m -> m.entrySet().stream())
.collect(Collectors.groupingBy(Map.Entry::getKey, Collectors.averagingInt(Map.Entry::getValue)));
}
You can use the same approach for your second question: calculate the average for each task, then stream over the entries of the resulting map to find the task with the highest average.
public String easiestTask(Stream<CourseResult> results) {
return results.map(CourseResult::getTaskResults)
.flatMap(m -> m.entrySet().stream())
.collect(Collectors.groupingBy(Map.Entry::getKey, Collectors.averagingInt(Map.Entry::getValue)))
.entrySet().stream()
.max(Map.Entry.comparingByValue())
.map(Map.Entry::getKey)
.orElse("No easy task found");
}
To avoid code duplication you can call the first method within the second:
public String easiestTask(Stream<CourseResult> results) {
return averageScoresPerTask(results).entrySet()
.stream()
.max(Map.Entry.comparingByValue())
.map(Map.Entry::getKey)
.orElse("No easy task found");
}
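The two methods can be tried in isolation with plain maps standing in for CourseResult.getTaskResults(). This is a minimal sketch: the class name TaskAverageDemo, the List parameter, and the scores are assumptions for illustration, and Map.of requires Java 9 or later. It also shows that averagingInt divides by the number of results that actually contain a task, not by the number of results.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TaskAverageDemo {
    // Groups all (task, score) entries by task name and averages the scores per task.
    static Map<String, Double> averageScoresPerTask(List<Map<String, Integer>> taskResults) {
        return taskResults.stream()
                .flatMap(m -> m.entrySet().stream())
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.averagingInt(Map.Entry::getValue)));
    }

    // Picks the task with the highest average, reusing the method above.
    static String easiestTask(List<Map<String, Integer>> taskResults) {
        return averageScoresPerTask(taskResults).entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse("No easy task found");
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> taskResults = Arrays.asList(
                Map.of("Lab 1", 60, "Lab 2", 90),
                Map.of("Lab 1", 80, "Lab 2", 70),
                Map.of("Lab 1", 70)); // this result has no score for "Lab 2"
        Map<String, Double> averages = averageScoresPerTask(taskResults);
        System.out.println(averages.get("Lab 1")); // 70.0 = (60 + 80 + 70) / 3
        System.out.println(averages.get("Lab 2")); // 80.0 = (90 + 70) / 2
        System.out.println(easiestTask(taskResults)); // Lab 2
    }
}
```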
EDIT
To customize the calculation of the average regardless of how many items your maps contain, don't use built-in collectors like Collectors.averagingInt or Collectors.averagingDouble. Instead, wrap your collector in collectingAndThen, sum the scores using Collectors.summingInt, and after collecting divide by a divisor chosen according to whether the task name starts with Lab or not:
public Map<String, Double> averageScoresPerTask(Stream<CourseResult> results) {
return results.map(CourseResult::getTaskResults)
.flatMap(m -> m.entrySet().stream())
.collect(Collectors.collectingAndThen(
Collectors.groupingBy(Map.Entry::getKey, Collectors.summingInt(Map.Entry::getValue)),
map -> map.entrySet()
.stream()
.collect(Collectors.toMap(
Map.Entry::getKey,
e -> e.getKey().startsWith("Lab") ? e.getValue() / 3. : e.getValue() / 4.))
));
}

To create a map containing the average score for each task, you need to flatten the taskResults map of every CourseResult object in the stream and group the data by key (i.e. by task name).
For that you can use the collector groupingBy() with averagingDouble() as its downstream collector, which calculates the average of the score values mapped to the same task.
Here is how it might look:
public Map<String, Double> averageScoresPerTask(Stream<CourseResult> results) {
return results
.map(CourseResult::getTaskResults) // Stream<Map<String, Integer>> - stream of maps
.flatMap(map -> map.entrySet().stream()) // Stream<Map.Entry<String, Integer>> - stream of entries
.collect(Collectors.groupingBy(
Map.Entry::getKey,
Collectors.averagingDouble(Map.Entry::getValue)
));
}
To find the easiest task, you can pass this map instead of the stream, because the method would otherwise have to repeat the same operations. That makes sense in a real-life scenario where the data is retrieved from some storage (it is better to avoid processing it twice). Moreover, in your case you can't generate a stream from the source twice and pass it to both methods, because the stream data is random. Passing the same stream to both methods is not an option either: a stream pipeline can be executed only once, and after its terminal operation has run the stream can't be used anymore.
public String easiestTask(Map<String, Double> averageByTask) {
return averageByTask.entrySet().stream()
.max(Map.Entry.comparingByValue()) // produces a result of type Optional<Map.Entry<String, Double>>
.map(Map.Entry::getKey) // transforms it into Optional<String>
.orElse("no data"); // or orElseThrow() if data is always expected to be present, depending on your needs
}
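The single-use rule is easy to confirm. The sketch below (class and method names are made up for illustration) runs a terminal operation twice on the same stream and reports whether the second call is rejected.

```java
import java.util.stream.Stream;

class StreamReuseDemo {
    // Returns true if a second terminal operation on the same stream
    // is rejected with an IllegalStateException.
    static boolean secondUseFails() {
        Stream<String> tasks = Stream.of("Lab 1", "Lab 2", "Lab 3");
        tasks.count(); // first terminal operation: the stream is now consumed
        try {
            tasks.count(); // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails()); // true
    }
}
```

This is why collecting the averages into a map once, and passing that map around, is the safer design when the source cannot be replayed.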

Java stream create object and compare

I'm pretty new to Java streams. I've to split a string returned by filter in stream, create a new object with the strings in the split and compare each object with a predefined object. Stream looks like this (I know this is incorrect, just a representation of what I am trying to do):
xmlstream.stream()
.filter(xml->xml.getName()) //returns a string
.map(returnedString -> split("__"))
.map(eachStringInList -> new TestObj(returnedStr[0], returnedStr[1]))
.map(eachTestObj -> eachTestObj.compareTo(givenObj))
.max(Comparing.compare(returnedObj :: aProperty))
How do I achieve this? Basically map each string in list to create an object, compare that to a fix object and return max based on one of the properties.
Thanks.

You could use reduce like so:
TestObj predefined = ...
TestObj max =
xmlstream.stream()
.map(xml -> xml.getName()) //returns a string
.map(s -> s.split("__"))
.map(a -> new TestObj(a[0], a[1]))
.reduce(predefined, (e, a) ->
e.aProperty().compareTo(a.aProperty()) >= 0 ? e : a);
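A short-circuiting variant follows. Note that it is not equivalent to the reduce above: it returns the first element whose property is greater than that of predefined, not the overall maximum: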
TestObj predefined = ...
TestObj max =
xmlstream.stream()
.map(xml -> xml.getName()) //returns a string
.map(s -> s.split("__"))
.map(a -> new TestObj(a[0], a[1]))
.filter(e -> e.aProperty().compareTo(predefined.aProperty()) > 0)
.findFirst()
.orElse(predefined);
Update:
if you want to retrieve the max object by a given property from all the TestObj objects less than the predefined TestObj, then you can proceed as follows:
TestObj predefined = ...
Optional<TestObj> max =
xmlstream.stream()
.map(xml -> xml.getName())
.map(s -> s.split("__"))
.map(a -> new TestObj(a[0], a[1]))
.filter(e -> e.aProperty().compareTo(predefined.aProperty()) < 0)
.max(Comparator.comparing(TestObj::aProperty));
max returns an Optional<T>; if you're unfamiliar with it, consult the Optional documentation to familiarise yourself with the different ways to unwrap an Optional<T> object.
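A runnable sketch of the filter-then-max pipeline, with a few common ways to unwrap the result, might look as follows. TestObj here is a hypothetical stand-in with two String fields, and the input names are invented. The real class and its aProperty type are not shown in the question.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class MaxBelowDemo {
    // Hypothetical stand-in for the question's TestObj.
    static class TestObj {
        final String name;
        final String property;
        TestObj(String name, String property) {
            this.name = name;
            this.property = property;
        }
        String aProperty() { return property; }
    }

    // Builds a TestObj from each "name__property" string, keeps those whose
    // property sorts below predefined's, and returns the largest of them.
    static Optional<TestObj> maxBelow(List<String> names, TestObj predefined) {
        return names.stream()
                .map(s -> s.split("__"))
                .map(a -> new TestObj(a[0], a[1]))
                .filter(e -> e.aProperty().compareTo(predefined.aProperty()) < 0)
                .max(Comparator.comparing(TestObj::aProperty));
    }

    public static void main(String[] args) {
        TestObj predefined = new TestObj("limit", "c");
        Optional<TestObj> max = maxBelow(Arrays.asList("x__b", "y__d", "z__a"), predefined);

        System.out.println(max.map(t -> t.name).orElse("none")); // x
        max.ifPresent(t -> System.out.println(t.aProperty()));   // b
        TestObj orFallback = max.orElse(predefined);              // fall back to predefined when empty
        System.out.println(orFallback.name);                      // x
    }
}
```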

Processing HashMap using Java 8 Stream API

I have a hash table in the form
Map<String, Map<String,Double>>
I need to process it and create another one having the same structure.
The following sample explains the goal:
INPUT HASH TABLE
----------------------------
| | 12/7/2000 5.0 |
| id 1 | 13/7/2000 4.5 |
| | 14/7/2000 3.4 |
...
| id N | .... |
OUTPUT HASH TABLE
| id 1 | 1/1/1800 max(5,4.5,3.4) |
... ...
In particular, the output must have the same keys (id1, ..., id n)
The inner hash table must have a fixed key (1/1/1800) and a processed value.
My current (not working) code:
output = input.entrySet()
.stream()
.collect(
Collectors.toMap(entry -> entry.getKey(),
entry -> Collectors.toMap(
e -> "1/1/2000",
e -> {
// Get input array
List<Object> list = entry.getValue().values().stream()
.collect(Collectors.toList());
DescriptiveStatistics stats = new DescriptiveStatistics();
// Remove the NaN values from the input array
list.forEach(v -> {
if(!new Double((double)v).isNaN())
stats.addValue((double)v);
});
double value = stats.max();
return value;
}));
Where is the issue?
Thanks

The issue is that you are trying to call Collectors.toMap a second time inside the first Collectors.toMap. Collectors.toMap should only be passed to a method that accepts a Collector.
Here's one way to achieve what you want:
Map<String, Map<String,Double>>
output = input.entrySet()
.stream()
.collect(Collectors.toMap(e -> e.getKey(),
e -> Collections.singletonMap (
"1/1/1800",
e.getValue()
.values()
.stream()
.filter (d->!Double.isNaN (d))
.mapToDouble (Double::doubleValue)
.max()
.orElse(0.0))));
Note that there's no need for a second Collectors.toMap. The inner Maps of your output have a single entry each, so you can use Collections.singletonMap to create them.
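The same idea as a self-contained sketch, with the inner computation pulled into a helper method for readability. The class and method names and the sample data are assumptions for illustration. The NaN filter matters here because DoubleStream.max() returns NaN as soon as any element is NaN.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class MaxPerIdDemo {
    // Keeps the outer keys and replaces each inner map with a single entry
    // "1/1/1800" -> maximum of the non-NaN values.
    static Map<String, Map<String, Double>> maxPerId(Map<String, Map<String, Double>> input) {
        return input.entrySet().stream()
                .collect(Collectors.toMap(Map.Entry::getKey, e -> summarize(e.getValue())));
    }

    // Builds the one-entry inner map; 0.0 is used when no non-NaN value exists.
    static Map<String, Double> summarize(Map<String, Double> series) {
        double max = series.values().stream()
                .filter(d -> !Double.isNaN(d))
                .mapToDouble(Double::doubleValue)
                .max()
                .orElse(0.0);
        return Collections.singletonMap("1/1/1800", max);
    }

    public static void main(String[] args) {
        Map<String, Double> series = new HashMap<>();
        series.put("12/7/2000", 5.0);
        series.put("13/7/2000", Double.NaN);
        series.put("14/7/2000", 3.4);
        Map<String, Map<String, Double>> input = Collections.singletonMap("id 1", series);
        System.out.println(maxPerId(input)); // {id 1={1/1/1800=5.0}}
    }
}
```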

Your original code can be fixed by using Collections.singletonMap instead of Collectors.toMap:
Map<String, Map<String,Double>> output = input.entrySet()
.stream()
.collect(
Collectors.toMap(entry -> entry.getKey(),
entry -> {
// Get input array
List<Object> list = entry.getValue().values().stream()
.collect(Collectors.toList());
DescriptiveStatistics stats = new DescriptiveStatistics();
// Remove the NaN values from the input array
list.forEach(v -> {
if(!new Double((double)v).isNaN())
stats.addValue((double)v);
});
double value = stats.max();
return Collections.singletonMap("1/1/2000", value);
}));
Or make the nested Collectors.toMap a part of an actual stream operation
Map<String, Map<String,Double>> output = input.entrySet()
.stream()
.collect(Collectors.toMap(entry -> entry.getKey(),
entry -> Stream.of(entry.getValue()).collect(Collectors.toMap(
e -> "1/1/2000",
e -> {
// Get input array
List<Object> list = e.values().stream()
.collect(Collectors.toList());
DescriptiveStatistics stats = new DescriptiveStatistics();
// Remove the NaN values from the input array
list.forEach(v -> {
if(!new Double((double)v).isNaN())
stats.addValue((double)v);
});
double value = stats.max();
return value;
}))));
though that’s quite a baroque solution.
That said, you should be aware that there’s the standard DoubleSummaryStatistics, which makes DescriptiveStatistics unnecessary; both are unnecessary, though, if you only want to get the max value.
Further, List<Object> list = e.values().stream().collect(Collectors.toList()); could be simplified to List<Object> list = new ArrayList<>(e.values()); if a List is truly required, but here, Collection<Double> list = e.values(); would be sufficient, and typing the collection with Double instead of Object makes the subsequent type casts unnecessary.
Using these improvements for the first variant, you’ll get
Map<String, Map<String,Double>> output = input.entrySet()
.stream()
.collect(
Collectors.toMap(entry -> entry.getKey(),
entry -> {
Collection<Double> list = entry.getValue().values();
DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
list.forEach(v -> {
if(!Double.isNaN(v)) stats.accept(v);
});
double value = stats.getMax();
return Collections.singletonMap("1/1/2000", value);
}));
But, as said, DoubleSummaryStatistics still is more than needed to get the maximum:
Map<String, Map<String,Double>> output = input.entrySet()
.stream()
.collect(Collectors.toMap(entry -> entry.getKey(),
entry -> {
double max = Double.NEGATIVE_INFINITY;
for(double d: entry.getValue().values())
if(d > max) max = d;
return Collections.singletonMap("1/1/2000", max);
}));
Note that double comparisons always evaluate to false if at least one value is NaN, so using the right operator, i.e. “value possibly NaN” > “current max never NaN”, we don’t need an extra conditional.
Now, you might replace the loop with a stream operation and you’ll end up at Eran’s solution. The choice is yours.
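The NaN behaviour described above can be checked with a small sketch (class and method names are illustrative). The comparison-based loop silently skips NaN, whereas Math.max propagates it.

```java
class NanComparisonDemo {
    // "d > max" is false whenever d is NaN, so NaN values never replace max.
    static double maxByComparison(double... values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double d : values) {
            if (d > max) max = d;
        }
        return max;
    }

    // Math.max returns NaN if either argument is NaN, so one NaN poisons the result.
    static double maxByMathMax(double... values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double d : values) {
            max = Math.max(max, d);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maxByComparison(3.4, Double.NaN, 5.0)); // 5.0
        System.out.println(maxByMathMax(3.4, Double.NaN, 5.0));    // NaN
    }
}
```

Note that for an input with no usable values the comparison loop yields Double.NEGATIVE_INFINITY, which the caller may want to handle explicitly.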

Stream groupingBy by one field then merge all others

I have trouble with stream groupingBy.
List<FAR> listFar = farList.stream().filter(f -> !f.getStatus().equals(ENUM.STATUS.DELETED))
.collect(Collectors.toList());
List<HAUL> haulList = listFar.stream().map(f -> f.getHaul()).flatMap(f -> f.stream())
.collect(Collectors.toList());
The following groups by specie, which works fine, but HAUL has other attributes.
Map<Specie, List<HAUL>> collect = haulList.stream().collect(Collectors.groupingBy(HAUL::getSpecie));
Attributes:
haul.getFishCount(); (Integer)
haul.getFishWeight(); (BigDecimal)
Is it possible to group by HAUL::getSpecie (by Specie) while also merging those two extra fields, so that I get the totals?
For example: I have 3 HAUL elements where fish specie A has 50/30/10 kg in weight.
Can I group it by specie and have total weight?

If I understood correctly:
haulList
.stream()
.collect(Collectors.groupingBy(HAUL::getSpecie,
Collectors.collectingAndThen(Collectors.toList(),
list -> {
int left = list.stream().mapToInt(HAUL::getFishCount).sum();
BigDecimal right = list.stream().map(HAUL::getFishWeight).reduce(BigDecimal.ZERO, (x, y) -> x.add(y));
return new AbstractMap.SimpleEntry<>(left, right);
})));
There are also simpler forms that total a single field:
.stream()
.collect(Collectors.groupingBy(HAUL::getSpecie,
Collectors.summingInt(HAUL::getFishCount)));
or
.stream()
.collect(Collectors.groupingBy(HAUL::getSpecie,
Collectors.mapping(HAUL::getFishWeight, Collectors.reducing((x, y) -> x.add(y)))));
But you can't really make these two act at the same time with the built-in Java 8 collectors.
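On Java 12 or later, Collectors.teeing can feed both downstream collectors in one pass. The following is a sketch under assumptions: Haul and Totals are simplified, hypothetical classes (the question's HAUL and Specie types are not shown), and the species are plain strings.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class HaulTotalsDemo {
    // Simplified stand-in for the question's HAUL class.
    static class Haul {
        final String specie;
        final int fishCount;
        final BigDecimal fishWeight;
        Haul(String specie, int fishCount, String fishWeight) {
            this.specie = specie;
            this.fishCount = fishCount;
            this.fishWeight = new BigDecimal(fishWeight);
        }
        String getSpecie() { return specie; }
        int getFishCount() { return fishCount; }
        BigDecimal getFishWeight() { return fishWeight; }
    }

    // Hypothetical holder for the two totals of one specie.
    static class Totals {
        final int count;
        final BigDecimal weight;
        Totals(int count, BigDecimal weight) {
            this.count = count;
            this.weight = weight;
        }
    }

    // Groups by specie; teeing (Java 12+) sums count and weight in the same pass.
    static Map<String, Totals> totalsBySpecie(List<Haul> hauls) {
        return hauls.stream().collect(Collectors.groupingBy(
                Haul::getSpecie,
                Collectors.teeing(
                        Collectors.summingInt(Haul::getFishCount),
                        Collectors.mapping(Haul::getFishWeight,
                                Collectors.reducing(BigDecimal.ZERO, BigDecimal::add)),
                        Totals::new)));
    }

    public static void main(String[] args) {
        List<Haul> hauls = Arrays.asList(
                new Haul("A", 10, "50"), new Haul("A", 5, "30"),
                new Haul("A", 2, "10"), new Haul("B", 1, "7"));
        Totals a = totalsBySpecie(hauls).get("A");
        System.out.println(a.count + " fish, " + a.weight + " kg"); // 17 fish, 90 kg
    }
}
```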

You might use mapping and reduce for example:
class Foo { int count; BigDecimal weight; String spice; /* constructor and getters omitted */ }
List<Foo> fooList = Arrays.asList(
new Foo(1,new BigDecimal(10), "a"),
new Foo(2,new BigDecimal(38), "a"),
new Foo(5,new BigDecimal(2), "b"),
new Foo(4,new BigDecimal(8), "b"));
Map<String,Optional<BigDecimal>> spieceWithTotalWeight = fooList.stream().
collect(
groupingBy(
Foo::getSpice,
mapping(
Foo::getWeight,
Collectors.reducing(BigDecimal::add)
)
)
);
System.out.println(spieceWithTotalWeight); // {a=Optional[48], b=Optional[10]}
I hope this helps.

If I'm getting your question correctly, you want the total sum of count * weight for each specie.
You can do this by using Collectors.groupingBy with a downstream collector that reduces the list of HAUL of each specie to the sum of haul.getFishCount() * haul.getFishWeight():
Map<Specie, BigDecimal> result = haulList.stream()
    .collect(Collectors.groupingBy(haul -> haul.getSpecie(),
        Collectors.mapping(haul ->
            new BigDecimal(haul.getFishCount()).multiply(haul.getFishWeight()),
            Collectors.reducing(BigDecimal.ZERO, BigDecimal::add))));
This will get the total sum of count * weight for each specie. If you could add the following method to your HAUL class:
public BigDecimal getTotalWeight() {
return new BigDecimal(getFishCount()).multiply(getFishWeight());
}
Then, collecting the stream would be easier and more readable:
Map<Specie, BigDecimal> result = haulList.stream()
    .collect(Collectors.groupingBy(haul -> haul.getSpecie(),
        Collectors.mapping(haul -> haul.getTotalWeight(),
            Collectors.reducing(BigDecimal.ZERO, BigDecimal::add))));
EDIT: After all, it seems that you want separate sums for each field...
I would use Collectors.toMap with a merge function for this. Here's the code:
Map<Specie, List<BigDecimal>> result = haulList.stream()
.collect(Collectors.toMap(
haul -> haul.getSpecie(),
haul -> Arrays.asList(
new BigDecimal(haul.getFishCount()),
haul.getFishWeight()),
(list1, list2) -> {
list1.set(0, list1.get(0).add(list2.get(0)));
list1.set(1, list1.get(1).add(list2.get(1)));
return list1;
}));
This uses a list of 2 elements to store the fish count at index 0 and the fish weight at index 1, for every specie.

Return in one string the min, max, average, sum, count of salaries with Stream and java8

I have a List of employees which are characterized by a salary.
Why does this code not work?
String joined = employees.stream().collect(
Collectors.summingInt(Employee::getSalary),
Collectors.maxBy(Comparator.comparing(Employee::getSalary)),
Collectors.minBy(Comparator.comparing(Employee::getSalary)),
Collectors.averagingLong((Employee e) ->e.getSalary() * 2),
Collectors.counting(),
Collectors.joining(", "));
I'm using a suite of collectors.

Note that currently you're trying to get not the max/min salary, but the Employee having such salary. If you actually want to have the max/min salary itself (number), then these characteristics could be calculated at once using Collectors.summarizingInt():
IntSummaryStatistics stats = employees.stream()
.collect(Collectors.summarizingInt(Employee::getSalary));
If you want to join them to string, you may use:
String statsString = Stream.of(stats.getSum(), stats.getMax(), stats.getMin(),
stats.getAverage()*2, stats.getCount())
.map(Object::toString)
.collect(Collectors.joining(", "));
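A runnable sketch of the summarizingInt approach follows. To stay short it uses a plain list of Integer salaries instead of the question's Employee class, so Integer::intValue takes the place of Employee::getSalary. The class name and the figures are made up, and the average is printed without the doubling used above.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SalaryStatsDemo {
    // Computes sum, max, min, average and count in one pass and joins them into a string.
    static String describe(List<Integer> salaries) {
        IntSummaryStatistics stats = salaries.stream()
                .collect(Collectors.summarizingInt(Integer::intValue));
        return Stream.of(stats.getSum(), stats.getMax(), stats.getMin(),
                        stats.getAverage(), stats.getCount())
                .map(Object::toString)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList(100, 20, 1, 1000)));
        // 1121, 1000, 1, 280.25, 4
    }
}
```

For an empty list, IntSummaryStatistics reports Integer.MIN_VALUE as max and Integer.MAX_VALUE as min, so the empty case may need separate handling.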
If you actually want to get an Employee with max/min salary, here IntSummaryStatistics will not help you. However you may create the stream of collectors instead:
String result = Stream.<Collector<Employee,?,?>>of(
Collectors.summingInt(Employee::getSalary),
Collectors.maxBy(Comparator.comparing(Employee::getSalary)),
Collectors.minBy(Comparator.comparing(Employee::getSalary)),
Collectors.averagingLong((Employee e) ->e.getSalary() * 2),
Collectors.counting())
.map(collector -> employees.stream().collect(collector))
.map(Object::toString)
.collect(Collectors.joining(", "));
Note that this way you will get output like the following (depending on the Employee.toString() implementation):
1121, Optional[Employee [salary=1000]], Optional[Employee [salary=1]], 560.5, 4
Don't forget that maxBy/minBy return Optional.
If you are unsatisfied with the first solution and for some reason don't want to iterate the input several times, you can create a combined collector using a method like this:
/**
* Returns a collector which joins the results of supplied collectors
* into the single string using the supplied delimiter.
*/
@SafeVarargs
public static <T> Collector<T, ?, String> joining(CharSequence delimiter,
Collector<T, ?, ?>... collectors) {
@SuppressWarnings("unchecked")
Collector<T, Object, Object>[] cs = (Collector<T, Object, Object>[]) collectors;
return Collector.<T, Object[], String>of(
() -> Stream.of(cs).map(c -> c.supplier().get()).toArray(),
(acc, t) -> IntStream.range(0, acc.length)
.forEach(idx -> cs[idx].accumulator().accept(acc[idx], t)),
(acc1, acc2) -> IntStream.range(0, acc1.length)
.mapToObj(idx -> cs[idx].combiner().apply(acc1[idx], acc2[idx]))
.toArray(),
acc -> IntStream.range(0, acc.length)
.mapToObj(idx -> cs[idx].finisher().apply(acc[idx]).toString())
.collect(Collectors.joining(delimiter)));
}
Having such method you can write
String stats = employees.stream().collect(joining(", ",
Collectors.summingInt(Employee::getSalary),
Collectors.maxBy(Comparator.comparing(Employee::getSalary)),
Collectors.minBy(Comparator.comparing(Employee::getSalary)),
Collectors.averagingLong((Employee e) ->e.getSalary() * 2),
Collectors.counting()));

I finally found the solution. Thanks for trying, guys.
String s = employees.stream().mapToDouble(a -> a.getSalary()).summaryStatistics().toString();
and this is the output:
DoubleSummaryStatistics{count=21, sum=17200,000000, min=100,000000,
average=819,047619, max=2100,000000}
