Java-Stream - Sort and Transform elements using groupingBy()

Suppose I have the following domain class:
public class OrderRow {
private Long orderId;
private String action;
private LocalDateTime timestamp;
// getters, constructor, etc.
}
I have the following data set of OrderRows:
OrderId Action Timestamp
3 Pay money 2015-05-27 12:48:47.000
3 Select Item 2015-05-27 12:44:47.000
1 Generate Payment 2015-05-27 12:55:47.000
2 Pay money 2015-05-27 12:48:47.000
2 Select Item 2015-05-27 12:44:47.000
2 Deliver 2015-05-27 12:55:47.000
1 Generate Invoice 2015-05-27 12:48:47.000
1 Create PO 2015-05-27 12:44:47.000
3 Deliver 2015-05-27 12:55:47.000
I want to obtain the following Map from the sample data shown above:
[3] -> ["Select Item", "Pay money", "Deliver"]
[1] -> ["Create PO", "Generate Invoice", "Generate Payment"]
[2] -> ["Select Item", "Pay money", "Deliver"]
By performing the operations below:
Group by orderId.
Sort actions by timestamp.
Collect the actions into a Set (as there can be duplicates).
I am trying to do this in a single groupingBy operation, because performing separate sorting and mapping operations takes a lot of time if the data set is huge.
I've tried to do the following:
orderRows.stream()
.collect(Collectors.groupingBy(OrderRow::getOrderId,
Collectors.mapping(Function.identity(),
Collectors.toCollection(
() -> new TreeSet<>(Comparator.comparing(e -> e.timestamp))
))));
But then I get the output as Map<Long, Set<OrderRow>>,
whereas I need a result of type Map<Long, Set<String>>.
I would be really grateful if someone could show me at least a direction to go in.
Note that this is a critical operation and should be done in a few milliseconds, hence performance is important.

TL;DR
LinkedHashSet is the general-purpose implementation of the Set interface that retains insertion order, which is what you need to keep plain Strings (the action property) in an order that differs from the alphabetical one.
Timsort performs slightly better than sorting via a red-black tree (i.e. by storing elements in a TreeSet). Since the OP has said that this "is a critical operation", that is worth taking into consideration.
You can sort the stream elements by timestamp before applying collect. That guarantees that the actions are consumed in the required order.
And to retain the order of these Strings, you can use a LinkedHashSet (a TreeSet would not help in this case, since the property you need to order by, the timestamp, is not present in the collection of Strings).
var actionsById = orderRows.stream()
.sorted(Comparator.comparing(OrderRow::getTimestamp))
.collect(Collectors.groupingBy(
OrderRow::getOrderId,
Collectors.mapping(
OrderRow::getAction,
Collectors.toCollection(LinkedHashSet::new))
));
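As a runnable illustration of the sort-then-group approach, here is a minimal sketch that applies it to the sample data from the question. The OrderRow record (Java 16+) is only a stand-in for the question's class, and the names SortThenGroupDemo, sampleRows and actionsById are made up for this example.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class SortThenGroupDemo {

    // Stand-in for the OrderRow class from the question (assumption: record syntax, Java 16+).
    record OrderRow(Long orderId, String action, LocalDateTime timestamp) {}

    // The nine rows of the sample data set, in the order given in the question.
    static List<OrderRow> sampleRows() {
        LocalDateTime t1 = LocalDateTime.of(2015, 5, 27, 12, 44, 47);
        LocalDateTime t2 = LocalDateTime.of(2015, 5, 27, 12, 48, 47);
        LocalDateTime t3 = LocalDateTime.of(2015, 5, 27, 12, 55, 47);
        return List.of(
                new OrderRow(3L, "Pay money", t2),
                new OrderRow(3L, "Select Item", t1),
                new OrderRow(1L, "Generate Payment", t3),
                new OrderRow(2L, "Pay money", t2),
                new OrderRow(2L, "Select Item", t1),
                new OrderRow(2L, "Deliver", t3),
                new OrderRow(1L, "Generate Invoice", t2),
                new OrderRow(1L, "Create PO", t1),
                new OrderRow(3L, "Deliver", t3));
    }

    // Sort once by timestamp, then group; LinkedHashSet keeps the encounter order per order id.
    static Map<Long, Set<String>> actionsById(List<OrderRow> rows) {
        return rows.stream()
                .sorted(Comparator.comparing(OrderRow::timestamp))
                .collect(Collectors.groupingBy(
                        OrderRow::orderId,
                        Collectors.mapping(
                                OrderRow::action,
                                Collectors.toCollection(LinkedHashSet::new))));
    }

    public static void main(String[] args) {
        actionsById(sampleRows())
                .forEach((id, actions) -> System.out.println("[" + id + "] -> " + actions));
    }
}
```

For order 1 the resulting set iterates as Create PO, Generate Invoice, Generate Payment; for orders 2 and 3 as Select Item, Pay money, Deliver. The outer map is a plain HashMap here, so the order of the keys themselves is not specified.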
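Alternatively, you can sort the sets while grouping (as you were trying to do). Note that a TreeSet ordered only by timestamp treats two rows of the same order with equal timestamps as duplicates and keeps just the first of them.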
var actionsById = orderRows.stream()
.collect(Collectors.groupingBy(
OrderRow::getOrderId,
Collectors.collectingAndThen(
Collectors.mapping(
Function.identity(),
Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(OrderRow::getTimestamp)))
),
set -> set.stream().map(OrderRow::getAction).collect(Collectors.toCollection(LinkedHashSet::new))
)
));
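One thing to watch with the TreeSet variant: the comparator also decides what counts as a duplicate. The sketch below (TreeSetTieDemo and its helper are hypothetical names, and OrderRow is again a stand-in record) shows a row being dropped when two rows share a timestamp, and one possible tie-breaker that avoids it.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class TreeSetTieDemo {

    // Stand-in for the question's OrderRow (assumption: record syntax, Java 16+).
    record OrderRow(Long orderId, String action, LocalDateTime timestamp) {}

    // Sorts the rows through a TreeSet using the given comparator, then extracts the actions.
    static Set<String> actions(List<OrderRow> rows, Comparator<OrderRow> order) {
        Set<OrderRow> sorted = new TreeSet<>(order);
        sorted.addAll(rows);
        return sorted.stream()
                .map(OrderRow::action)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        LocalDateTime sameTime = LocalDateTime.of(2015, 5, 27, 12, 44, 47);
        List<OrderRow> rows = List.of(
                new OrderRow(1L, "Select Item", sameTime),
                new OrderRow(1L, "Pay money", sameTime));

        // Timestamp only: the second row compares as equal to the first and is discarded.
        Comparator<OrderRow> byTimestamp = Comparator.comparing(OrderRow::timestamp);
        System.out.println(actions(rows, byTimestamp).size()); // 1

        // Adding a tie-breaker keeps both rows (ties are then ordered by action name).
        Comparator<OrderRow> withTieBreak =
                Comparator.comparing(OrderRow::timestamp).thenComparing(OrderRow::action);
        System.out.println(actions(rows, withTieBreak).size()); // 2
    }
}
```

Whether ordering ties by action name is acceptable depends on the data; the list-based variants do not have this issue because sorting a list never removes elements.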
Or, instead of sorting via a red-black tree (TreeSet is backed by an implementation of this data structure), each group can be sorted via Timsort (the algorithm used for sorting arrays of objects; when the sorted() operation is invoked on a stream, its elements are dumped into an array and sorted there).
Maintaining a red-black tree is costly, and theoretically Timsort should perform a bit better.
var actionsById = orderRows.stream()
.collect(Collectors.groupingBy(
OrderRow::getOrderId,
Collectors.collectingAndThen(
Collectors.mapping(
Function.identity(), Collectors.toList()
),
list -> list.stream().sorted(Comparator.comparing(OrderRow::getTimestamp))
.map(OrderRow::getAction)
.collect(Collectors.toCollection(LinkedHashSet::new))
)
));
To avoid allocating a new array in memory (which happens when the sorted() operation is called), we can sort the lists produced by mapping() directly and then create a stream out of each sorted list in the finisher function of collectingAndThen().
That requires writing a multiline lambda, but in this case it is justified by the performance gain.
var actionsById = orderRows.stream()
.collect(Collectors.groupingBy(
OrderRow::getOrderId,
Collectors.collectingAndThen(
Collectors.mapping(
Function.identity(), Collectors.toList()
),
list -> {
list.sort(Comparator.comparing(OrderRow::getTimestamp));
return list.stream().map(OrderRow::getAction)
.collect(Collectors.toCollection(LinkedHashSet::new));
}
)
));
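A side note on sorting in place: the documentation of Collectors.toList() makes no guarantee about the mutability of the returned list, so a variant that wants to rely on list.sort() may prefer Collectors.toCollection(ArrayList::new). The following sketch does that and splits the collectors into typed local variables for readability; the class and method names are hypothetical and OrderRow is a stand-in record.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class SortInFinisherDemo {

    // Stand-in for the question's OrderRow (assumption: record syntax, Java 16+).
    record OrderRow(Long orderId, String action, LocalDateTime timestamp) {}

    // Finisher: sort one group in place, then copy the actions into an insertion-ordered set.
    static Set<String> sortedActions(List<OrderRow> group) {
        group.sort(Comparator.comparing(OrderRow::timestamp));
        Set<String> actions = new LinkedHashSet<>();
        for (OrderRow row : group) {
            actions.add(row.action());
        }
        return actions;
    }

    static Map<Long, Set<String>> actionsById(List<OrderRow> rows) {
        // An explicit ArrayList guarantees that the per-group list can be sorted in place.
        Collector<OrderRow, ?, List<OrderRow>> toMutableList =
                Collectors.toCollection(ArrayList::new);
        Collector<OrderRow, ?, Set<String>> toSortedActions =
                Collectors.collectingAndThen(toMutableList, SortInFinisherDemo::sortedActions);
        return rows.stream()
                .collect(Collectors.groupingBy(OrderRow::orderId, toSortedActions));
    }

    public static void main(String[] args) {
        LocalDateTime base = LocalDateTime.of(2015, 5, 27, 12, 0, 0);
        List<OrderRow> rows = List.of(
                new OrderRow(1L, "Generate Payment", base.plusMinutes(55)),
                new OrderRow(1L, "Generate Invoice", base.plusMinutes(48)),
                new OrderRow(1L, "Create PO", base.plusMinutes(44)));
        System.out.println(actionsById(rows)); // {1=[Create PO, Generate Invoice, Generate Payment]}
    }
}
```

Whether any of these variants is measurably faster for a given data set is something only a benchmark can tell.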

Related

Efficient way to select a range of key-value pairs from a Java HashMap

I am trying to come up with an efficient way to select a discrete but arbitrary range of key-value pairs from a HashMap. This is easy in Python, but seems difficult in Java. I was hoping to avoid using Iterators, since they seem slow for this application (correct me if I'm wrong).
For example, I'd like to be able to make the following call:
ArrayList<Pair<K, V>> values = pairsFromRange(hashMap, 0, 5);

You can't do anything that performs meaningfully better than an Iterator to do this with a HashMap.
If you use a TreeMap, however, this becomes easy: use subMap(0, 5) or the like.
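To make the TreeMap suggestion concrete, here is a small sketch with Integer keys. SubMapDemo, sample and keysInRange are hypothetical names used only for illustration; subMap(from, to) is the standard SortedMap method, with an inclusive lower and exclusive upper bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

class SubMapDemo {

    // Builds a sorted map with keys 0..9 and values "v0".."v9".
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> map = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            map.put(i, "v" + i);
        }
        return map;
    }

    // Returns the keys in [from, toExclusive); subMap is a view, so nothing is copied until here.
    static List<Integer> keysInRange(TreeMap<Integer, String> map, int from, int toExclusive) {
        SortedMap<Integer, String> view = map.subMap(from, toExclusive);
        return new ArrayList<>(view.keySet());
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = sample();
        System.out.println(map.subMap(0, 5));             // keys 0 to 4
        System.out.println(map.subMap(2, true, 5, true)); // keys 2 to 5, both bounds inclusive
    }
}
```

Note that the range is expressed in terms of keys, not positions, which is a different notion of "range" than an index-based slice.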

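This looks straightforward with lambdas (which implies iteration, of course). skip(n) and limit(n) let you address any slice of the map. Keep in mind that a HashMap does not guarantee any iteration order, so which entries land in a slice is only predictable for an ordered map such as LinkedHashMap or TreeMap.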
Map<String, String> m = new HashMap<>();
m.put("k1","v");
m.put("k2","v");
m.put("k3","v");
m.put("k4","v");
m.put("k5","v");
Map<String,String> slice = m.entrySet().stream()
.limit(3)
.collect(Collectors.toMap(x -> x.getKey(), x -> x.getValue()));
System.out.println(slice);
slice ==> {k1=v, k2=v, k3=v}
slice = m.entrySet().stream()
.skip(2)
.limit(3)
.collect(Collectors.toMap(x -> x.getKey(), x -> x.getValue()));
System.out.println(slice);
slice ==> {k3=v, k4=v, k5=v}
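Since a HashMap does not promise any particular iteration order, a positional slice is only well defined on an ordered map. The sketch below wraps the skip/limit idea in a hypothetical helper named slice and collects into a LinkedHashMap so that the result keeps the source order as well.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class MapSliceDemo {

    // Copies `count` entries starting at position `offset`, in the source map's iteration order.
    static <K, V> Map<K, V> slice(Map<K, V> source, long offset, long count) {
        return source.entrySet().stream()
                .skip(offset)
                .limit(count)
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (first, second) -> first, // keys are unique, so this merge is never used
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 1; i <= 5; i++) {
            m.put("k" + i, "v");
        }
        System.out.println(slice(m, 0, 3)); // {k1=v, k2=v, k3=v}
        System.out.println(slice(m, 2, 3)); // {k3=v, k4=v, k5=v}
    }
}
```

The stream still walks the entries one by one, so skipping n entries costs time proportional to n.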

How to use Java streams to create a list out of a map site

Starting with a map like:
Map<Integer, String> mapList = new HashMap<>();
mapList.put(2,"b");
mapList.put(4,"d");
mapList.put(3,"c");
mapList.put(5,"e");
mapList.put(1,"a");
mapList.put(6,"f");
I can sort the map using Streams like:
mapList.entrySet()
.stream()
.sorted(Map.Entry.<Integer, String>comparingByKey())
.forEach(System.out::println);
But I need to get a list (and a String) of the corresponding sorted values (that would be: a b c d e f), which correspond to the keys 1 2 3 4 5 6.
I cannot find a way to do it in that Stream command.
Thanks
As @MA says in his comment, I need a mapping, and that is not explained in this question: How to convert a Map to List in Java?
So thank you very much, @MA.
Sometimes people are too quick to close questions!

You can use a mapping collector:
var sortedValues = mapList.entrySet()
.stream()
.sorted(Map.Entry.comparingByKey())
.collect(Collectors.mapping(Map.Entry::getValue, Collectors.toList()));

You could also use some of the different collection classes instead of streams:
List<String> list = new ArrayList<>(new TreeMap<>(mapList).values());
The downside is that doing all of that in a single line can get messy quickly. Additionally, you throw away the intermediate TreeMap, which is created only for the sorting.

If you want to sort on the keys and collect only the values, you need to use a mapping function to only preserve the values after your sorting. Afterwards you can just collect or do a foreach loop.
mapList.entrySet()
.stream()
.sorted(Map.Entry.comparingByKey())
.map(Map.Entry::getValue)
.collect(Collectors.toList());
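The question also asks for a String. One way to get it, sketched here with the hypothetical names SortedValuesDemo, valuesByKey and joinedByKey, is to replace the final collector with Collectors.joining:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SortedValuesDemo {

    // The map from the question, filled in the same (unsorted) order.
    static Map<Integer, String> sample() {
        Map<Integer, String> mapList = new HashMap<>();
        mapList.put(2, "b");
        mapList.put(4, "d");
        mapList.put(3, "c");
        mapList.put(5, "e");
        mapList.put(1, "a");
        mapList.put(6, "f");
        return mapList;
    }

    // Values ordered by their keys, as a List.
    static List<String> valuesByKey(Map<Integer, String> map) {
        return map.entrySet().stream()
                .sorted(Map.Entry.comparingByKey())
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }

    // Same pipeline, but joined into a single space-separated String.
    static String joinedByKey(Map<Integer, String> map) {
        return map.entrySet().stream()
                .sorted(Map.Entry.comparingByKey())
                .map(Map.Entry::getValue)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(valuesByKey(sample())); // [a, b, c, d, e, f]
        System.out.println(joinedByKey(sample())); // a b c d e f
    }
}
```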

Handling nested Collections with Java 8 streams

Lately I came across a problem while working with nested collections (values of Maps inside a List):
List<Map<String, Object>> items
This list in my case contains 10-20 Maps.
At some point I had to replace the value Calculation of the key description with Rating. So I came up with this solution:
items.forEach(e -> e.replace("description","Calculation","Rating"));
It would be a fine and efficient solution if all maps in this list contained the key-value pair ["description", "Calculation"]. Unfortunately, I know that there will be only one such pair in the whole List<Map<String, Object>>.
The question is:
Is there a better (more efficient) way of finding and replacing this one value using Java 8 streams, instead of iterating through all List elements?
Ideally it would be done in one stream, without any complex or obfuscating operations.

items.stream()
.filter(map -> map.containsKey("description"))
.findFirst()
.ifPresent(map -> map.replace("description", "Calculation", "Rating"));
You will have to iterate over the list until a map with the key "description" is found. Pick the first such map and try to replace the value.
As pointed out by @Holger, if the key "description" may occur in several maps and only the pair ("description", "Calculation") is unique:
items.stream()
.anyMatch(m -> m.replace("description", "Calculation", "Rating"));
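The anyMatch variant works because Map.replace(key, oldValue, newValue) returns true only when it actually replaced something, and anyMatch stops at the first true. Here is a small sketch with made-up data (AnyMatchReplaceDemo, entry, sample and replaceOnce are hypothetical names):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AnyMatchReplaceDemo {

    // Helper that builds a mutable single-entry map.
    static Map<String, Object> entry(String key, Object value) {
        Map<String, Object> map = new HashMap<>();
        map.put(key, value);
        return map;
    }

    // Three maps; only the second holds the pair ("description", "Calculation").
    static List<Map<String, Object>> sample() {
        List<Map<String, Object>> items = new ArrayList<>();
        items.add(entry("description", "Other"));
        items.add(entry("description", "Calculation"));
        items.add(entry("name", "x"));
        return items;
    }

    // Returns true if a replacement happened; maps after the first hit are not visited.
    static boolean replaceOnce(List<Map<String, Object>> items) {
        return items.stream()
                .anyMatch(m -> m.replace("description", "Calculation", "Rating"));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> items = sample();
        System.out.println(replaceOnce(items)); // true
        System.out.println(items.get(1));       // {description=Rating}
        System.out.println(replaceOnce(items)); // false, nothing left to replace
    }
}
```

A predicate with a side effect is unusual stream style; a plain for loop with a break does the same work and may read more clearly.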

Java Streams: Organize a collection into a map and select smallest key

I'm pretty sure this is not possible in one line, but I just wanted to check:
List<WidgetItem> selectedItems = null;
Map<Integer, List<WidgetItem>> itemsByStockAvailable = widgetItems.stream()
.collect(Collectors.groupingBy(WidgetItem::getAvailableStock));
selectedItems = itemsByStockAvailable.get(
itemsByStockAvailable.keySet().stream().sorted().findFirst().get());
Basically I'm collecting all widget items into a map where the key is the availableStock quantity and the value is a list of all widgets that have that quantity (since multiple widgets might have the same value). Once I have that map, I would want to select the map's value that corresponds to the smallest key. The intermediate step of creating a Map isn't necessary, it's just the only way I could think of to do this.

It appears that what you want is to keep all the widget items that were grouped under the lowest available stock. In that case, you can collect the grouped data into a TreeMap, which keeps the keys in increasing order of stock, and retrieve the first entry with firstEntry().
List<WidgetItem> selectedItems =
widgetItems.stream()
.collect(Collectors.groupingBy(
WidgetItem::getAvailableStock,
TreeMap::new,
Collectors.toList()
))
.firstEntry()
.getValue();
The advantage is that it is done in one pass over the initial list.
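A runnable sketch of the TreeMap approach follows. WidgetItem is modelled as a hypothetical record with a name and an int stock, since the question does not show the class, and the method lowestStock is made up for this example. It also guards against an empty input, because firstEntry() returns null for an empty map.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class LowestStockDemo {

    // Hypothetical stand-in for the question's WidgetItem (record syntax, Java 16+).
    record WidgetItem(String name, int availableStock) {}

    // Groups by stock into a sorted map and returns the group with the smallest key.
    static List<WidgetItem> lowestStock(List<WidgetItem> items) {
        TreeMap<Integer, List<WidgetItem>> byStock = items.stream()
                .collect(Collectors.groupingBy(
                        WidgetItem::availableStock,
                        TreeMap::new,
                        Collectors.toList()));
        Map.Entry<Integer, List<WidgetItem>> first = byStock.firstEntry();
        return first == null ? List.of() : first.getValue(); // null means the input was empty
    }

    public static void main(String[] args) {
        List<WidgetItem> items = List.of(
                new WidgetItem("a", 5),
                new WidgetItem("b", 2),
                new WidgetItem("c", 2),
                new WidgetItem("d", 7));
        System.out.println(lowestStock(items)); // the two items with stock 2: b and c
    }
}
```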

Essentially, you want to get all the input elements that are minimal according to the custom comparator Comparator.comparingInt(WidgetItem::getAvailableStock). In general, this problem can be solved without storing everything in an intermediate map, which creates unnecessary garbage, and it can be solved in a single pass. Some interesting solutions are already present in this question. For example, you may use the collector implemented by Stuart Marks:
List<WidgetItem> selectedItems = widgetItems.stream()
.collect(maxList(
Comparator.comparingInt(WidgetItem::getAvailableStock).reversed()));
Such collectors are readily available in my StreamEx library. The most suitable one in your case is MoreCollectors.minAll(Comparator):
List<WidgetItem> selectedItems = widgetItems.stream()
.collect(MoreCollectors.minAll(
Comparator.comparingInt(WidgetItem::getAvailableStock)));

If you want to avoid creating the intermediate map, you can first determine the smallest stock value, filter by that value and collect to list.
int minStock = widgetItems.stream()
.mapToInt(WidgetItem::getAvailableStock)
.min()
.getAsInt(); // or throw if list is empty
List<WidgetItem> selectItems = widgetItems.stream()
.filter(w -> minStock == w.getAvailableStock())
.collect(toList());
Also, do not use sorted().findFirst() to find the min value of a stream. Use min instead.

You can find the smallest key in a first pass and then get all the items having that smallest key:
widgetItems.stream()
.map(WidgetItem::getAvailableStock)
.min(Comparator.naturalOrder())
.map(min ->
widgetItems.stream()
.filter(item -> item.getAvailableStock().equals(min))
.collect(toList()))
.orElse(Collections.emptyList());

I would collect the data into a NavigableMap, which involves only a small change to your original code:
List<WidgetItem> selectedItems = null;
NavigableMap<Integer, List<WidgetItem>> itemsByStockAvailable =
widgetItems.stream()
.collect(Collectors.groupingBy(WidgetItem::getAvailableStock,
TreeMap::new, Collectors.toList()));
selectedItems = itemsByStockAvailable.firstEntry().getValue();

How to convert map with List as key to collection of merged list values?

I have a structure that looks like this:
Map<Long, List<Double>>
I want to convert it to
List<Double>
where each item in this resulting list represents sum of the values for one key. With example Map:
{1: [2.0, 3.0, 4.0],
2: [1.5, 10.0]}
I want to achieve as a result:
[9.0, 11.5]
or
[11.5, 9.0]
(order doesn't matter).
Is it possible with Java 8 merge() method?
Actually, the case above is a little simplified: in fact my List is parameterized with a complex class and I want to create a merged object of that class, but I just want to grasp the general idea here.

Try:
Map<Long, List<Double>> map = // ...
List<Double> sums = map.values()
.stream()
.map(l -> l.stream().mapToDouble(d -> d).sum())
.collect(Collectors.toList());
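For the "complex class" case mentioned in the question, the same shape works with reduce in place of sum. The sketch below uses a hypothetical Stats record with a merge method as a placeholder for the real class; only the pattern (one reduce per map value) is the point.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MergePerKeyDemo {

    // Hypothetical value class standing in for the "complex class" from the question.
    record Stats(double total, int count) {
        static final Stats EMPTY = new Stats(0.0, 0);

        // Combines two objects into a new merged one.
        Stats merge(Stats other) {
            return new Stats(total + other.total, count + other.count);
        }
    }

    // One merged object per key; the result follows the map's iteration order.
    static List<Stats> mergePerKey(Map<Long, List<Stats>> map) {
        return map.values().stream()
                .map(list -> list.stream().reduce(Stats.EMPTY, Stats::merge))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<Long, List<Stats>> map = Map.of(
                1L, List.of(new Stats(2.0, 1), new Stats(3.0, 1), new Stats(4.0, 1)),
                2L, List.of(new Stats(1.5, 1), new Stats(10.0, 1)));
        // Totals 9.0 and 11.5, in unspecified order because Map.of is unordered.
        System.out.println(mergePerKey(map));
    }
}
```

The identity passed to reduce has to be a neutral element of merge, which is why EMPTY is all zeros here.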
