Processing HashMap using Java 8 Stream API

I have a hash table of the form
Map<String, Map<String, Double>>
I need to process it and create another one having the same structure.
Here is a sample to explain the goal:
INPUT HASH TABLE
----------------------------
| | 12/7/2000 5.0 |
| id 1 | 13/7/2000 4.5 |
| | 14/7/2000 3.4 |
...
| id N | .... |
OUTPUT HASH TABLE
| id 1 | 1/1/1800 max(5,4.5,3.4) |
... ...
In particular, the output must have the same keys (id 1, ..., id N).
The inner hash table must have a fixed key (1/1/1800) and a processed value.
My current (not working) code:
output = input.entrySet()
        .stream()
        .collect(Collectors.toMap(
                entry -> entry.getKey(),
                entry -> Collectors.toMap(
                        e -> "1/1/2000",
                        e -> {
                            // Get input array
                            List<Object> list = entry.getValue().values().stream()
                                    .collect(Collectors.toList());
                            DescriptiveStatistics stats = new DescriptiveStatistics();
                            // Remove the NaN values from the input array
                            list.forEach(v -> {
                                if (!new Double((double) v).isNaN())
                                    stats.addValue((double) v);
                            });
                            double value = stats.max();
                            return value;
                        })));
Where is the issue?
Thanks

The issue is calling Collectors.toMap a second time inside the first Collectors.toMap. Collectors.toMap returns a Collector, which only produces a Map when passed to a method that accepts a Collector, such as Stream.collect; calling it inline does not evaluate to a Map.
Here's one way to achieve what you want:
Map<String, Map<String, Double>> output = input.entrySet()
        .stream()
        .collect(Collectors.toMap(
                e -> e.getKey(),
                e -> Collections.singletonMap(
                        "1/1/1800",
                        e.getValue()
                         .values()
                         .stream()
                         .filter(d -> !Double.isNaN(d))
                         .mapToDouble(Double::doubleValue)
                         .max()
                         .orElse(0.0))));
Note that there's no need for a second Collectors.toMap. The inner Maps of your output have a single entry each, so you can use Collections.singletonMap to create them.

Your original code can be fixed by using Collections.singletonMap instead of the nested Collectors.toMap:
Map<String, Map<String, Double>> output = input.entrySet()
        .stream()
        .collect(Collectors.toMap(
                entry -> entry.getKey(),
                entry -> {
                    // Get input array
                    List<Object> list = entry.getValue().values().stream()
                            .collect(Collectors.toList());
                    DescriptiveStatistics stats = new DescriptiveStatistics();
                    // Remove the NaN values from the input array
                    list.forEach(v -> {
                        if (!new Double((double) v).isNaN())
                            stats.addValue((double) v);
                    });
                    double value = stats.max();
                    return Collections.singletonMap("1/1/2000", value);
                }));
Or make the nested Collectors.toMap part of an actual stream operation:
Map<String, Map<String, Double>> output = input.entrySet()
        .stream()
        .collect(Collectors.toMap(
                entry -> entry.getKey(),
                entry -> Stream.of(entry.getValue()).collect(Collectors.toMap(
                        e -> "1/1/2000",
                        e -> {
                            // Get input array
                            List<Object> list = e.values().stream()
                                    .collect(Collectors.toList());
                            DescriptiveStatistics stats = new DescriptiveStatistics();
                            // Remove the NaN values from the input array
                            list.forEach(v -> {
                                if (!new Double((double) v).isNaN())
                                    stats.addValue((double) v);
                            });
                            double value = stats.max();
                            return value;
                        }))));
though that's quite a baroque solution.
That said, you should be aware that there's the standard DoubleSummaryStatistics, making DescriptiveStatistics unnecessary; though both are unnecessary if you only want to get the max value.
Further, List<Object> list = e.values().stream().collect(Collectors.toList()); could be simplified to List<Object> list = new ArrayList<>(e.values()); if a List were truly required, but here, Collection<Double> list = e.values(); is sufficient, and typing the collection with Double instead of Object makes the subsequent type casts unnecessary.
Using these improvements for the first variant, you’ll get
Map<String, Map<String, Double>> output = input.entrySet()
        .stream()
        .collect(Collectors.toMap(
                entry -> entry.getKey(),
                entry -> {
                    Collection<Double> list = entry.getValue().values();
                    DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
                    list.forEach(v -> {
                        if (!Double.isNaN(v)) stats.accept(v);
                    });
                    double value = stats.getMax();
                    return Collections.singletonMap("1/1/2000", value);
                }));
But, as said, DoubleSummaryStatistics still is more than needed to get the maximum:
Map<String, Map<String, Double>> output = input.entrySet()
        .stream()
        .collect(Collectors.toMap(
                entry -> entry.getKey(),
                entry -> {
                    double max = Double.NEGATIVE_INFINITY;
                    for (double d : entry.getValue().values())
                        if (d > max) max = d;
                    return Collections.singletonMap("1/1/2000", max);
                }));
Note that double comparisons always evaluate to false if at least one operand is NaN, so with the right operand order, i.e. “value, possibly NaN” > “current max, never NaN”, we don't need an extra NaN check.
Now, you might replace the loop with a stream operation and you’ll end up at Eran’s solution. The choice is yours.
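The NaN ordering trick above can be verified in isolation. This small sketch (class and method names are mine, not from the answer) keeps the running maximum while silently skipping NaN values, because NaN > max is always false:

```java
import java.util.Arrays;
import java.util.Collection;

public class NanMaxDemo {
    // Returns the max ignoring NaN values; NEGATIVE_INFINITY if all values are NaN
    static double maxIgnoringNaN(Collection<Double> values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double d : values)
            if (d > max) max = d; // false when d is NaN, so NaN never becomes the max
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maxIgnoringNaN(Arrays.asList(5.0, Double.NaN, 4.5, 3.4))); // 5.0
    }
}
```

Had the comparison been written the other way around, i.e. if (!(max >= d)) max = d;, a NaN value would overwrite the maximum.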

Related

Java streams average

I need to create two methods using streams. A method that returns an average score of each task.
public Map<String, Double> averageScoresPerTask(Stream<CourseResult> results) {}
and a method that returns a task with the highest average score.
public String easiestTask(Stream<CourseResult> results) {}
I can only modify those 2 methods.
Here is CourseResult class
public class CourseResult {
    private final Person person;
    private final Map<String, Integer> taskResults;

    public CourseResult(final Person person, final Map<String, Integer> taskResults) {
        this.person = person;
        this.taskResults = taskResults;
    }

    public Person getPerson() {
        return person;
    }

    public Map<String, Integer> getTaskResults() {
        return taskResults;
    }
}
And methods that create CourseResult objects.
private final String[] programTasks = {"Lab 1. Figures", "Lab 2. War and Peace", "Lab 3. File Tree"};
private final String[] practicalHistoryTasks = {"Shieldwalling", "Phalanxing", "Wedging", "Tercioing"};
private Stream<CourseResult> programmingResults(final Random random) {
    int n = random.nextInt(names.length);
    int l = random.nextInt(lastNames.length);
    return IntStream.iterate(0, i -> i + 1)
            .limit(3)
            .mapToObj(i -> new Person(
                    names[(n + i) % names.length],
                    lastNames[(l + i) % lastNames.length],
                    18 + random.nextInt(20)))
            .map(p -> new CourseResult(p, Arrays.stream(programTasks).collect(toMap(
                    task -> task,
                    task -> random.nextInt(51) + 50))));
}

private Stream<CourseResult> historyResults(final Random random) {
    int n = random.nextInt(names.length);
    int l = random.nextInt(lastNames.length);
    AtomicInteger t = new AtomicInteger(practicalHistoryTasks.length);
    return IntStream.iterate(0, i -> i + 1)
            .limit(3)
            .mapToObj(i -> new Person(
                    names[(n + i) % names.length],
                    lastNames[(l + i) % lastNames.length],
                    18 + random.nextInt(20)))
            .map(p -> new CourseResult(p,
                    IntStream.iterate(t.getAndIncrement(), i -> t.getAndIncrement())
                            .map(i -> i % practicalHistoryTasks.length)
                            .mapToObj(i -> practicalHistoryTasks[i])
                            .limit(3)
                            .collect(toMap(
                                    task -> task,
                                    task -> random.nextInt(51) + 50))));
}
Based on these methods I can calculate an average for each task by dividing the sum of that task's scores by 3, because there are only 3 Persons; though I could make it divide by the actual number of CourseResult objects in the stream, in case the .limit(3) in these methods gets changed.
I don't know how to access the keys of the taskResults Map. I think I need them to then return a map of unique keys. The value for each unique key should be the average of the values from the taskResults maps assigned to that key.
For your first question: map each CourseResult to its taskResults, flatMap to get all entries of each taskResults map from all CourseResults, group by map key (task name) and collect by averaging the values for the same key:
public Map<String, Double> averageScoresPerTask(Stream<CourseResult> results) {
    return results.map(CourseResult::getTaskResults)
            .flatMap(m -> m.entrySet().stream())
            .collect(Collectors.groupingBy(Map.Entry::getKey, Collectors.averagingInt(Map.Entry::getValue)));
}
You can use the same approach for your second question to calculate the average for each task, and finally stream over the entries of the resulting map to find the task with the highest average.
public String easiestTask(Stream<CourseResult> results) {
    return results.map(CourseResult::getTaskResults)
            .flatMap(m -> m.entrySet().stream())
            .collect(Collectors.groupingBy(Map.Entry::getKey, Collectors.averagingInt(Map.Entry::getValue)))
            .entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElse("No easy task found");
}
To avoid code duplication you can call the first method within the second:
public String easiestTask(Stream<CourseResult> results) {
    return averageScoresPerTask(results).entrySet()
            .stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElse("No easy task found");
}
EDIT
To customize the calculation of the average regardless of how many items your maps contain, don't use the built-in operations like Collectors.averagingInt or Collectors.averagingDouble. Instead, wrap your collector in collectingAndThen, sum the scores using Collectors.summingInt, and finally, after collecting, divide by a divisor chosen according to whether the task name starts with "Lab" or not:
public Map<String, Double> averageScoresPerTask(Stream<CourseResult> results) {
    return results.map(CourseResult::getTaskResults)
            .flatMap(m -> m.entrySet().stream())
            .collect(Collectors.collectingAndThen(
                    Collectors.groupingBy(Map.Entry::getKey, Collectors.summingInt(Map.Entry::getValue)),
                    map -> map.entrySet()
                            .stream()
                            .collect(Collectors.toMap(
                                    Map.Entry::getKey,
                                    e -> e.getKey().startsWith("Lab") ? e.getValue() / 3. : e.getValue() / 4.))
            ));
}
To create a map containing an average score for each task, you need to flatten the map taskResults of every CourseResult result object in the stream and group the data by key (i.e. by task name).
For that you can use the collector groupingBy(); as its downstream collector, responsible for calculating the average of the score values mapped to the same task, you can use averagingDouble().
That's how it might look:
public Map<String, Double> averageScoresPerTask(Stream<CourseResult> results) {
    return results
            .map(CourseResult::getTaskResults)       // Stream<Map<String, Integer>> - stream of maps
            .flatMap(map -> map.entrySet().stream()) // Stream<Map.Entry<String, Integer>> - stream of entries
            .collect(Collectors.groupingBy(
                    Map.Entry::getKey,
                    Collectors.averagingDouble(Map.Entry::getValue)
            ));
}
To find the easiest task, you can use this map instead of passing the stream as an argument, because the logic of this method requires applying the same operations. That would make sense in a real-life scenario where you retrieve data that is stored somewhere (it's better to avoid processing it twice), and moreover, in your case you can't generate a stream from the source twice and pass it into these two methods, because the stream data is random. Passing the same stream into both methods is not an option either: a stream pipeline can be executed only once; when it hits the terminal operation it's done, and you can't use it anymore.
public String easiestTask(Map<String, Double> averageByTask) {
    return averageByTask.entrySet().stream()
            .max(Map.Entry.comparingByValue()) // produces a result of type Optional<Map.Entry<String, Double>>
            .map(Map.Entry::getKey)            // transforming into Optional<String>
            .orElse("no data"); // or orElseThrow() if data is always expected to be present, depending on your needs
}
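The point about single-use streams can be demonstrated directly: a second terminal operation on the same stream throws IllegalStateException. A minimal sketch (class and method names are mine):

```java
import java.util.stream.Stream;

public class StreamReuseDemo {
    // Returns true if reusing a consumed stream fails, as described above
    static boolean reuseFails() {
        Stream<String> s = Stream.of("a", "b", "c");
        s.count(); // first terminal operation: the pipeline is now consumed
        try {
            s.count(); // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            // "stream has already been operated upon or closed"
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(reuseFails()); // true
    }
}
```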

java 8 map.get multiple key values

I have following code where I want to get value for multiple keys which starts with same name:
for example contents_of_a1, contents_of_ab2, contents_of_abc3
Optional.ofNullable((Map<?, ?>) fieldValue)
        .filter(Objects::nonNull)
        .map(coverages -> coverages.get("contents_of_%"))
        .filter(Objects::nonNull)
        .filter(LinkedHashMap.class::isInstance)
        .map(LinkedHashMap.class::cast)
        .map(contents -> contents.get("limit"))
        .map(limit -> new BigDecimal(String.valueOf(limit)))
        .orElse(new BigDecimal(number));
How can I pass contents_of_% as the key?
I don't know the reasons behind the data structure and what you want to achieve.
However, it is not important as this can be easily reproduced.
Using Optional is a good start; however, for iterating over and processing multiple inputs, you need to use a java-stream instead, with the Optional inside the collecting (I assume you want a Map<String, BigDecimal> output, but it can be adjusted easily).
Also, note that .filter(Objects::nonNull) is meaningless, as Optional handles null internally and null is never passed to the next method.
final Map<String, Map<?, ?>> fieldValue = Map.of(
        "contents_of_a", new LinkedHashMap<>(Map.of("limit", "10")),
        "contents_of_b", new HashMap<>(Map.of("limit", "11")),        // Different Map implementation
        "contents_of_c", new LinkedHashMap<>(Map.of("amount", "12")), // No limit
        "contents_of_d", new LinkedHashMap<>(Map.of("limit", "13")));

final List<String> contents = List.of(
        "contents_of_a",
        "contents_of_b",
        "contents_of_c",
        // d is missing, e is requested instead
        "contents_of_e");

final int number = -1;

final Map<String, BigDecimal> resultMap = contents.stream()
        .collect(Collectors.toMap(
                Function.identity(),               // key
                content -> Optional.of(fieldValue) // value
                        .map(coverages -> fieldValue.get(content))
                        .filter(LinkedHashMap.class::isInstance)
                        // casting to LinkedHashMap is not required here
                        // unless its specific methods are to be used,
                        // but we only get a value using Map#get
                        .map(map -> map.get("limit"))
                        .map(String::valueOf)
                        .map(BigDecimal::new)
                        // prefer this over orElse, as Optional#orElseGet
                        // does not create the object unless required
                        .orElseGet(() -> new BigDecimal(number))));

// check out the output below the code
resultMap.forEach((k, v) -> System.out.println(k + " -> " + v));
Only the content for a is used, as the remaining ones were either not an instance of LinkedHashMap, didn't contain the key limit, or were not among the requested contents.
contents_of_a -> 10
contents_of_b -> -1
contents_of_e -> -1
contents_of_c -> -1
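The note above about .filter(Objects::nonNull) being redundant can be checked in isolation: Optional.map already yields an empty Optional when the mapper returns null. A small sketch (names are mine):

```java
import java.util.Map;
import java.util.Optional;

public class OptionalNullDemo {
    // Optional.map returns Optional.empty() when the mapper yields null,
    // so a following .filter(Objects::nonNull) never has any effect
    static Optional<Integer> limitOf(Map<String, Integer> m, String key) {
        return Optional.of(m).map(map -> map.get(key));
    }

    public static void main(String[] args) {
        Map<String, Integer> m = Map.of("limit", 10);
        System.out.println(limitOf(m, "limit"));   // Optional[10]
        System.out.println(limitOf(m, "missing")); // Optional.empty
    }
}
```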
If you want to filter a map for keys starting with "contents_of_", you can do this with Java 8:
Map<String, Object> filteredFieldValue = fieldValue.entrySet().stream()
        .filter(e -> {
            String k = e.getKey();
            return Stream.of("contents_of_").anyMatch(k::startsWith);
        })
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

Java streams adding multiple values conditionally

I have a List of objects like this, where amount can be negative or positive:
class Sale {
String country;
BigDecimal amount;
}
And I would like to end up with a pair of sums of all negative values, and all positive values, by country.
With these values:
country | amount
nl | 9
nl | -3
be | 7.9
be | -7
Is there a way to end up with Map<String, Pair<BigDecimal, BigDecimal>> using a single stream?
It's easy to do this with two separate streams, but I can't figure it out with just one.
You can use Collectors.toMap with a merge function that sums the pairs.
Assuming that a Pair is immutable and has only getters for the first and second elements, the code may look like this:
static Map<String, Pair<BigDecimal, BigDecimal>> sumUp(List<Sale> list) {
    return list.stream()
            .collect(Collectors.toMap(
                    Sale::getCountry,
                    sale -> sale.getAmount().signum() >= 0
                            ? new Pair<>(sale.getAmount(), BigDecimal.ZERO)
                            : new Pair<>(BigDecimal.ZERO, sale.getAmount()),
                    (pair1, pair2) -> new Pair<>(
                            pair1.getFirst().add(pair2.getFirst()),
                            pair1.getSecond().add(pair2.getSecond())
                    )
                    // , LinkedHashMap::new // optional parameter to keep insertion order
            ));
}
Test
List<Sale> list = Arrays.asList(
        new Sale("us", new BigDecimal(100)),
        new Sale("uk", new BigDecimal(-10)),
        new Sale("us", new BigDecimal(-50)),
        new Sale("us", new BigDecimal(200)),
        new Sale("uk", new BigDecimal(333)),
        new Sale("uk", new BigDecimal(-70))
);
Map<String, Pair<BigDecimal, BigDecimal>> map = sumUp(list);
map.forEach((country, pair) ->
        System.out.printf("%-4s|%s%n%-4s|%s%n",
                country, pair.getFirst(), country, pair.getSecond()
        ));
Output
uk |333
uk |-80
us |300
us |-50
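The answer above assumes a Pair class, which the JDK does not provide. A minimal immutable version matching the getFirst/getSecond accessors used above could look like this (javafx.util.Pair, which uses getKey/getValue instead, or Apache Commons Lang's Pair are alternatives):

```java
// Minimal immutable Pair, sketched here because the JDK has no Pair class
public class Pair<A, B> {
    private final A first;
    private final B second;

    public Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    public A getFirst()  { return first; }
    public B getSecond() { return second; }

    @Override
    public String toString() { return "(" + first + ", " + second + ")"; }
}
```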
A solution close to Alex Rudenko's, but using groupingBy and a downstream collector:
Map<String, Pair<BigDecimal, BigDecimal>> map = list.stream()
        .collect(Collectors.groupingBy(Sale::getCountry,
                Collectors.mapping(s -> s.getAmount().signum() >= 0
                                ? new Pair<>(s.getAmount(), BigDecimal.ZERO)
                                : new Pair<>(BigDecimal.ZERO, s.getAmount()),
                        Collectors.reducing(new Pair<>(BigDecimal.ZERO, BigDecimal.ZERO),
                                (p1, p2) -> new Pair<>(p1.getKey().add(p2.getKey()),
                                        p1.getValue().add(p2.getValue()))))));

Use of stream, filter and average on list and jdk8

I have this list of data that look like this;
{id, datastring}
{1,"a:1|b:2|d:3"}
{2,"a:2|c:2|c:4"}
{3,"a:2|bb:2|a:3"}
{4,"a:3|e:2|ff:3"}
What I need to do here is to perform operations like averaging, or finding all ids for which an element in the string is less than a certain value.
Here are some example;
Averages
{a,2}{b,2}{bb,2}{c,3}{d,3}{e,2}{ff,3}
Find all id's where c<4
{2}
Find all id's where a<3
{1,2,3}
Would this be a good use of stream() and filter() ??
Yes, you can use stream operations to achieve that, but I would suggest creating a class for this data, so that each row corresponds to one specific instance. That will make your life easier IMO.
class Data {
    private int id;
    private Map<String, List<Integer>> map;
    ....
}
That said let's take a look at how you could implement this. First, the find all's implementation:
public static Set<Integer> ids(List<Data> list, String value, Predicate<Integer> boundPredicate) {
    return list.stream()
            .filter(d -> d.getMap().containsKey(value))
            .filter(d -> d.getMap().get(value).stream().anyMatch(boundPredicate))
            .map(d -> d.getId())
            .collect(toSet());
}
This one is simple to read. You get a Stream<Data> from the list. Then you apply a filter such that you only get instances that have the value given in the map, and that there is a value which satisfies the predicate you give. Then you map each instance to its corresponding id and you collect the resulting stream in a Set.
Example of call:
Set<Integer> set = ids(list, "a", value -> value < 3);
which outputs:
[1, 2, 3]
The average request was a bit more tricky. I ended up with another implementation, you finally get a Map<String, IntSummaryStatistics> at the end (which does contain the average) but also other informations.
Map<String, IntSummaryStatistics> stats = list.stream()
        .flatMap(d -> d.getMap().entrySet().stream())
        .collect(toMap(Map.Entry::getKey,
                e -> e.getValue().stream().mapToInt(i -> i).summaryStatistics(),
                (i1, i2) -> { i1.combine(i2); return i1; }));
You first get a Stream<Data>, then you flatMap each entry set of each map to get a Stream<Entry<String, List<Integer>>>. Now you collect this stream into a map where each key is the entry's key and each List<Integer> is mapped to its corresponding IntSummaryStatistics value. If two keys are identical, you combine their respective IntSummaryStatistics values.
Given you data set, you get a Map<String, IntSummaryStatistics>
ff => IntSummaryStatistics{count=1, sum=3, min=3, average=3.000000, max=3}
bb => IntSummaryStatistics{count=1, sum=2, min=2, average=2.000000, max=2}
a => IntSummaryStatistics{count=5, sum=11, min=1, average=2.200000, max=3}
b => IntSummaryStatistics{count=1, sum=2, min=2, average=2.000000, max=2}
c => IntSummaryStatistics{count=2, sum=6, min=2, average=3.000000, max=4}
d => IntSummaryStatistics{count=1, sum=3, min=3, average=3.000000, max=3}
e => IntSummaryStatistics{count=1, sum=2, min=2, average=2.000000, max=2}
from which you can easily grab the average.
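If only the averages are needed, the IntSummaryStatistics map can be collapsed into a plain Map<String, Double> with one more toMap pass (a sketch; the class and method names are mine):

```java
import java.util.IntSummaryStatistics;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class AverageExtract {
    // Collapses each IntSummaryStatistics to just its average
    static Map<String, Double> averages(Map<String, IntSummaryStatistics> stats) {
        return stats.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> e.getValue().getAverage()));
    }

    public static void main(String[] args) {
        IntSummaryStatistics a = IntStream.of(1, 2, 3).summaryStatistics();
        System.out.println(averages(Map.of("a", a))); // {a=2.0}
    }
}
```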
Here's a full working example, the implementation can certainly be improved though.
I know that you have your answer, but here are my versions too :
Map<String, Double> result = list.stream()
        .map(Data::getElements)
        .flatMap((Multimap<String, Integer> map) -> map.entries().stream())
        .collect(Collectors.groupingBy(Map.Entry::getKey,
                Collectors.averagingInt((Entry<String, Integer> token) -> token.getValue())));
System.out.println(result);

List<Integer> result2 = list.stream()
        .filter((Data data) -> data.getElements().get("c").stream().anyMatch(i -> i < 4))
        .map(Data::getId)
        .collect(Collectors.toList());
System.out.println(result2);

How to use double colon in Stream.collect for HashMap?

I want to read a csv file into a hashmap by using the first column as a key, the second column as a value, and ignoring the third column.
I wrote the following code and it works. I would like to know how to rewrite the syntax with double colon "::".
I check the API docs, but most of examples are using List instead of Map.
I used a string to mock a csv file: "A,1,!","B,2,#","C,3,#","D,4,$","E,5,%"
Map<String, String> maps = Stream.of("A,1,!", "B,2,#", "C,3,#", "D,4,$", "E,5,%")
        .collect(() -> new HashMap<String, String>(),
                (map, line) -> { String[] x = line.split(","); map.put(x[0], x[1]); },
                (map1, map2) -> map1.putAll(map2));
System.out.println(maps);
Thanks,
Ian
Personally I would do this:
Map<String, String> maps = Stream.of("A,1,!", "B,2,#", "C,3,#", "D,4,$", "E,5,%")
        .map(line -> line.split(","))
        .collect(HashMap::new, (map, line) -> map.put(line[0], line[1]), HashMap::putAll);
i.e. separate out the logic into distinct stream transformation operations. Doing the map in the collect clouds the intent of the code.
I do not think you want to be using the concrete collect() with the supplier, accumulator and combiner.
You should rely more on higher level methods, this becomes then:
Map<String, String> map = Stream.of("A,1,!", "B,2,#", "C,3,#", "D,4,$", "E,5,%")
        .map(line -> line.split(","))
        .collect(Collectors.toMap(
                array -> array[0],
                array -> array[1]
        ));
Which does the following:
Create a Stream<String>.
Map it to a Stream<String[]>.
Collect the results via a Collectors.toMap which takes a key mapper and a value mapper as arguments.
Here I map the array to array[0] for the key.
Here I map the array to array[1] for the value.
Then to confirm it works I print:
map.forEach((k, v) -> System.out.println("Key = " + k + " / Value = " + v));
Which gives:
Key = A / Value = 1
Key = B / Value = 2
Key = C / Value = 3
Key = D / Value = 4
Key = E / Value = 5
