Java Streams: Organize a collection into a map and select smallest key

I'm pretty sure this is not possible in one line, but I just wanted to check:
List<WidgetItem> selectedItems = null;
Map<Integer, List<WidgetItem>> itemsByStockAvailable = widgetItems.stream()
.collect(Collectors.groupingBy(WidgetItem::getAvailableStock));
selectedItems = itemsByStockAvailable.get(
itemsByStockAvailable.keySet().stream().sorted().findFirst().get());
Basically I'm collecting all widget items into a map where the key is the availableStock quantity and the value is a list of all widgets that have that quantity (since multiple widgets might have the same value). Once I have that map, I want to select the map's value that corresponds to the smallest key. The intermediate step of creating a Map isn't necessary; it's just the only way I could think of to do this.

It appears you want to keep all the widget items that were grouped under the lowest available stock. In that case, you can collect the grouped data into a TreeMap, which keeps its keys in ascending order of stock, and retrieve the first entry with firstEntry():
List<WidgetItem> selectedItems =
widgetItems.stream()
.collect(Collectors.groupingBy(
WidgetItem::getAvailableStock,
TreeMap::new,
Collectors.toList()
))
.firstEntry()
.getValue();
The advantage is that it is done in one pass over the initial list.
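For anyone who wants to run it, here is a self-contained sketch of the TreeMap approach. The WidgetItem class is a hypothetical stand-in (the question does not show the real one), and the empty-list guard is my addition: firstEntry() returns null for an empty map, so chaining getValue() directly would throw a NullPointerException on empty input.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class LowestStockDemo {

    // Hypothetical stand-in for the WidgetItem class from the question.
    static final class WidgetItem {
        private final String name;
        private final int availableStock;

        WidgetItem(String name, int availableStock) {
            this.name = name;
            this.availableStock = availableStock;
        }

        String getName() { return name; }
        int getAvailableStock() { return availableStock; }
    }

    // Groups by stock into a TreeMap (sorted by key) and returns the group with the smallest key.
    static List<WidgetItem> lowestStock(List<WidgetItem> widgetItems) {
        TreeMap<Integer, List<WidgetItem>> byStock = widgetItems.stream()
                .collect(Collectors.groupingBy(
                        WidgetItem::getAvailableStock,
                        TreeMap::new,
                        Collectors.toList()));
        // firstEntry() returns null when the map is empty, so guard before getValue().
        Map.Entry<Integer, List<WidgetItem>> first = byStock.firstEntry();
        return first == null ? Collections.emptyList() : first.getValue();
    }

    public static void main(String[] args) {
        List<WidgetItem> items = Arrays.asList(
                new WidgetItem("bolt", 5),
                new WidgetItem("nut", 2),
                new WidgetItem("washer", 2));
        lowestStock(items).forEach(w -> System.out.println(w.getName()));
    }
}
```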

Essentially, you want all the input elements that are minimal according to the custom comparator Comparator.comparingInt(WidgetItem::getAvailableStock). In general, this problem can be solved in a single pass, without storing everything in an intermediate map and creating unnecessary garbage. Some interesting solutions are already present in this question. For example, you may use the maxList collector implemented by Stuart Marks:
List<WidgetItem> selectedItems = widgetItems.stream()
.collect(maxList(
Comparator.comparingInt(WidgetItem::getAvailableStock).reversed()));
Such collectors are readily available in my StreamEx library. The most suitable one in your case is MoreCollectors.minAll(Comparator):
List<WidgetItem> selectedItems = widgetItems.stream()
.collect(MoreCollectors.minAll(
Comparator.comparingInt(WidgetItem::getAvailableStock)));

If you want to avoid creating the intermediate map, you can first determine the smallest stock value, filter by that value and collect to list.
int minStock = widgetItems.stream()
.mapToInt(WidgetItem::getAvailableStock)
.min()
.getAsInt(); // or throw if list is empty
List<WidgetItem> selectedItems = widgetItems.stream()
.filter(w -> minStock == w.getAvailableStock())
.collect(toList());
Also, do not use sorted().findFirst() to find the min value of a stream. Use min instead.

You can find the smallest key in a first pass and then get all the items having that smallest key:
widgetItems.stream()
.map(WidgetItem::getAvailableStock)
.min(Comparator.naturalOrder())
.map(min ->
widgetItems.stream()
.filter(item -> item.getAvailableStock().equals(min))
.collect(toList()))
.orElse(Collections.emptyList());

I would collect the data into a NavigableMap, which involves only a small change to your original code:
List<WidgetItem> selectedItems = null;
NavigableMap<Integer, List<WidgetItem>> itemsByStockAvailable =
widgetItems.stream()
.collect(Collectors.groupingBy(WidgetItem::getAvailableStock,
TreeMap::new, Collectors.toList()));
selectedItems = itemsByStockAvailable.firstEntry().getValue();

Related

Java-Stream - Sort and Transform elements using groupingBy()

Suppose I have the following domain class:
public class OrderRow {
private Long orderId;
private String action;
private LocalDateTime timestamp;
// getters, constructor, etc.
}
I have the following data set of OrderRows:
OrderId  Action            Timestamp
3        Pay money         2015-05-27 12:48:47.000
3        Select Item       2015-05-27 12:44:47.000
1        Generate Payment  2015-05-27 12:55:47.000
2        Pay money         2015-05-27 12:48:47.000
2        Select Item       2015-05-27 12:44:47.000
2        Deliver           2015-05-27 12:55:47.000
1        Generate Invoice  2015-05-27 12:48:47.000
1        Create PO         2015-05-27 12:44:47.000
3        Deliver           2015-05-27 12:55:47.000
I want to obtain the following Map from the sample data shown above:
[3] -> ["Select Item", "Pay money", "Deliver"]
[1] -> ["Create PO", "Generate Invoice", "Generate Payment"]
[2] -> ["Select Item", "Pay money", "Deliver"]
By performing the operations below:
Group by orderId.
Sort actions by timestamp.
Create a Set (as there can be duplicates) of actions.
I am trying to do this in a single groupingBy operation, as performing separate sorting and mapping operations takes a lot of time if the data set is huge.
I've tried to do the following:
orderRows.stream()
.collect(Collectors.groupingBy(OrderRow::getOrderId,
Collectors.mapping(Function.identity(),
Collectors.toCollection(
() -> new TreeSet<>(Comparator.comparing(e -> e.timestamp))
))));
But then I get output of type Map<Long, Set<OrderRow>>, whereas I need a result of type Map<Long, Set<String>>.
I would be really grateful if someone could show me at least a direction to go.
Note that this is a critical operation and should be done in a few milliseconds, hence performance is important.
TL;DR
LinkedHashSet is the general-purpose Set implementation that retains insertion order, so it can keep the plain Strings (the action property) in an order that differs from the alphabetical one.
Timsort should perform slightly better than sorting via a red-black tree (i.e. by storing elements in a TreeSet). Since the OP said this "is a critical operation", that is worth taking into consideration.
You can sort the stream elements by timestamp before applying collect. That guarantees that the actions are consumed in the required order.
To retain the order of these Strings, you can use a LinkedHashSet (a TreeSet would not help here, because the ordering depends on a property that is not present in the collected Strings).
var actionsById = orderRows.stream()
.sorted(Comparator.comparing(OrderRow::getTimestamp))
.collect(Collectors.groupingBy(
OrderRow::getOrderId,
Collectors.mapping(
OrderRow::getAction,
Collectors.toCollection(LinkedHashSet::new))
));
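Alternatively, you can sort the sets while grouping (as you were trying to do). Note that a TreeSet ordered only by timestamp treats rows with equal timestamps as duplicates, so within one order it keeps only the first such row.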
var actionsById = orderRows.stream()
.collect(Collectors.groupingBy(
OrderRow::getOrderId,
Collectors.collectingAndThen(
Collectors.mapping(
Function.identity(),
Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(OrderRow::getTimestamp)))
),
set -> set.stream().map(OrderRow::getAction).collect(Collectors.toCollection(LinkedHashSet::new))
)
));
Or, instead of sorting via a red-black tree (TreeSet is backed by an implementation of this data structure), each group can be sorted via Timsort (the algorithm used for sorting arrays of objects; when sorted() is invoked on a stream, its elements are dumped into an array and sorted there).
Maintaining a red-black tree is costly, and theoretically Timsort should perform a bit better.
var actionsById = orderRows.stream()
.collect(Collectors.groupingBy(
OrderRow::getOrderId,
Collectors.collectingAndThen(
Collectors.mapping(
Function.identity(), Collectors.toList()
),
list -> list.stream().sorted(Comparator.comparing(OrderRow::getTimestamp))
.map(OrderRow::getAction)
.collect(Collectors.toCollection(LinkedHashSet::new))
)
));
To avoid allocating a new array in memory (which happens when the sorted() operation is called), we can sort the lists produced by mapping() directly, and then create a stream out of the sorted lists in the finisher function of collectingAndThen().
That requires writing a multiline lambda, but in this case it is justified by the performance gain.
var actionsById = orderRows.stream()
.collect(Collectors.groupingBy(
OrderRow::getOrderId,
Collectors.collectingAndThen(
Collectors.mapping(
Function.identity(), Collectors.toList()
),
list -> {
list.sort(Comparator.comparing(OrderRow::getTimestamp));
return list.stream().map(OrderRow::getAction)
.collect(Collectors.toCollection(LinkedHashSet::new));
}
)
));
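To try the first variant (sort, then group into a LinkedHashSet) against data like the question's, here is a minimal runnable sketch. The OrderRow class is filled in with a constructor and getters, which the question only hints at, and the row() and sampleRows() helpers are hypothetical conveniences that cover orders 3 and 1 from the sample table.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class OrderActionsDemo {

    // Minimal version of the OrderRow class from the question.
    static final class OrderRow {
        private final Long orderId;
        private final String action;
        private final LocalDateTime timestamp;

        OrderRow(Long orderId, String action, LocalDateTime timestamp) {
            this.orderId = orderId;
            this.action = action;
            this.timestamp = timestamp;
        }

        Long getOrderId() { return orderId; }
        String getAction() { return action; }
        LocalDateTime getTimestamp() { return timestamp; }
    }

    // Hypothetical helper: builds a row on 2015-05-27 at 12:<minute>:47, like the sample data.
    static OrderRow row(long orderId, String action, int minute) {
        return new OrderRow(orderId, action, LocalDateTime.of(2015, 5, 27, 12, minute, 47));
    }

    // Rows for orders 3 and 1, in the same shuffled order as the question's table.
    static List<OrderRow> sampleRows() {
        return Arrays.asList(
                row(3, "Pay money", 48),
                row(3, "Select Item", 44),
                row(1, "Generate Payment", 55),
                row(1, "Generate Invoice", 48),
                row(1, "Create PO", 44),
                row(3, "Deliver", 55));
    }

    // Sorts all rows by timestamp once, then groups the actions per order in encounter order.
    static Map<Long, Set<String>> actionsById(List<OrderRow> rows) {
        return rows.stream()
                .sorted(Comparator.comparing(OrderRow::getTimestamp))
                .collect(Collectors.groupingBy(
                        OrderRow::getOrderId,
                        Collectors.mapping(
                                OrderRow::getAction,
                                Collectors.toCollection(LinkedHashSet::new))));
    }

    public static void main(String[] args) {
        actionsById(sampleRows()).forEach((id, actions) -> System.out.println(id + " -> " + actions));
    }
}
```

The outer map here is a plain HashMap, so the order of the keys is not guaranteed; only the order of the actions inside each set is.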

How to use Java streams to create a list out of a map

Starting with a map like:
Map<Integer, String> mapList = new HashMap<>();
mapList.put(2,"b");
mapList.put(4,"d");
mapList.put(3,"c");
mapList.put(5,"e");
mapList.put(1,"a");
mapList.put(6,"f");
I can sort the map using Streams like:
mapList.entrySet()
.stream()
.sorted(Map.Entry.<Integer, String>comparingByKey())
.forEach(System.out::println);
But I need to get a list (and a String) of the corresponding values in sorted order (that would be: a b c d e f), matching the keys 1 2 3 4 5 6.
I cannot find a way to do it in that Stream command.
Thanks
As @MA says in his comment I need a mapping and that is not explained in this question: How to convert a Map to List in Java?
So thank you very much @MA
Sometimes people are too fast into closing questions!
You can use a mapping collector:
var sortedValues = mapList.entrySet()
.stream()
.sorted(Map.Entry.comparingByKey())
.collect(Collectors.mapping(Map.Entry::getValue, Collectors.toList()));
You could also use some of the different collection classes instead of streams:
List<String> list = new ArrayList<>(new TreeMap<>(mapList).values());
The downside being that if you do all that in a single line it can get quite messy, quite fast. Additionally you're throwing away the intermediate TreeMap just for the sorting.
If you want to sort on the keys and collect only the values, you need to use a mapping function to only preserve the values after your sorting. Afterwards you can just collect or do a foreach loop.
mapList.entrySet()
.stream()
.sorted(Map.Entry.comparingByKey())
.map(Map.Entry::getValue)
.collect(Collectors.toList());
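Since the question asks for both a list and a String, here is a small runnable sketch that produces each. The method names valuesByKey and joinedByKey are made up for the example; the String variant simply swaps the final collector for Collectors.joining.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SortedValuesDemo {

    // Values of the map as a list, ordered by ascending key.
    static List<String> valuesByKey(Map<Integer, String> map) {
        return map.entrySet().stream()
                .sorted(Map.Entry.comparingByKey())
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }

    // Same ordering, but joined into a single space-separated String.
    static String joinedByKey(Map<Integer, String> map) {
        return map.entrySet().stream()
                .sorted(Map.Entry.comparingByKey())
                .map(Map.Entry::getValue)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        Map<Integer, String> mapList = new HashMap<>();
        mapList.put(2, "b");
        mapList.put(4, "d");
        mapList.put(3, "c");
        mapList.put(5, "e");
        mapList.put(1, "a");
        mapList.put(6, "f");

        System.out.println(valuesByKey(mapList)); // prints [a, b, c, d, e, f]
        System.out.println(joinedByKey(mapList)); // prints a b c d e f
    }
}
```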

Find Max of Multiple Lists

I'm pretty new to java streams and am trying to determine how to find the max from each list, in a list of lists, and end with a single list that contains the max from each sublist.
I can accomplish this by using a for loop and stream like so:
// databaseRecordsLists is a List<List<DatabaseRecord>>
List<DatabaseRecord> mostRecentRecords = new ArrayList<>();
for (List<DatabaseRecord> databaseRecords : databaseRecordsLists) {
mostRecentRecords.add(databaseRecords.stream()
.max(Comparator.comparing(DatabaseRecord::getTimestamp))
.orElseThrow(NoSuchElementException::new));
}
I've looked into the flatMap API, but then I'll only end up with a single stream of all DatabaseRecord objects, where I need a max from each individual list.
Any ideas on a cleaner way to accomplish this?
You don't need flatMap. Create a Stream<List<DatabaseRecord>>, and map each List<DatabaseRecord> of the Stream to the max element. Then collect all the max elements into the output List.
List<DatabaseRecord> mostRecentRecords =
databaseRecordsLists.stream()
.map(list -> list.stream()
.max(Comparator.comparing(DatabaseRecord::getTimestamp))
.orElseThrow(NoSuchElementException::new))
.collect(Collectors.toList());
Based on the comments, I would suggest ignoring empty collections instead. Otherwise no result is returned and a NoSuchElementException is thrown, even though an empty collection might (?) be a valid state. If so, you can improve the current solution:
databaseRecordsLists.stream()
.filter(list -> !list.isEmpty()) // Only non-empty ones
.map(list -> list.stream()
.max(Comparator.comparing(DatabaseRecord::getTimestamp)) // Get these with max
.orElseThrow(NoSuchElementException::new)) // Never happens
.collect(Collectors.toList()); // To List
If you use a version higher than Java 8:
As of Java 10, orElseThrow(NoSuchElementException::new) can be substituted with orElseThrow().
As of Java 11, you can use Predicate.not(..), therefore the filter part would look like: .filter(Predicate.not(List::isEmpty)).
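Putting both of those together, a Java 11+ version could look like the sketch below. The DatabaseRecord class is a hypothetical stand-in; in particular, using Instant for the timestamp is an assumption, since the question only shows that getTimestamp() exists.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class MostRecentDemo {

    // Hypothetical stand-in for DatabaseRecord; the Instant timestamp type is an assumption.
    static final class DatabaseRecord {
        private final String id;
        private final Instant timestamp;

        DatabaseRecord(String id, Instant timestamp) {
            this.id = id;
            this.timestamp = timestamp;
        }

        String getId() { return id; }
        Instant getTimestamp() { return timestamp; }
    }

    // Skips empty sublists, then takes the record with the latest timestamp from each remaining one.
    static List<DatabaseRecord> mostRecent(List<List<DatabaseRecord>> databaseRecordsLists) {
        return databaseRecordsLists.stream()
                .filter(Predicate.not(List::isEmpty))   // Java 11+
                .map(list -> list.stream()
                        .max(Comparator.comparing(DatabaseRecord::getTimestamp))
                        .orElseThrow())                 // Java 10+; cannot fire after the filter
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        DatabaseRecord a = new DatabaseRecord("a", Instant.ofEpochSecond(1));
        DatabaseRecord b = new DatabaseRecord("b", Instant.ofEpochSecond(5));
        DatabaseRecord c = new DatabaseRecord("c", Instant.ofEpochSecond(3));
        List<List<DatabaseRecord>> lists =
                List.of(List.of(a, b), List.<DatabaseRecord>of(), List.of(c));
        mostRecent(lists).forEach(r -> System.out.println(r.getId()));
    }
}
```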

Handling nested Collections with Java 8 streams

Lately I came across a problem during working with nested collections (values of Maps inside a List):
List<Map<String, Object>> items
This list in my case contains 10-20 Maps.
At some point I had to replace the value Calculation of the key description with Rating. So I came up with this solution:
items.forEach(e -> e.replace("description","Calculation","Rating"));
It would be a perfectly fine and efficient solution if all maps in this list contained the key-value pair ["description", "Calculation"]. Unfortunately, I know that there will be only one such pair in the whole List<Map<String, Object>>.
The question is:
Is there a better (more efficient) solution of finding and replacing this one value, instead of iterating through all List elements using Java-8 streams?
Perfection would be to have it done in one stream without any complex/obfuscating operations on it.
items.stream()
.filter(map -> map.containsKey("description"))
.findFirst()
.ifPresent(map -> map.replace("description", "Calculation", "Rating"));
You will have to iterate over the list until a map with the key "description" is found. Pick the first such map and try the replacement.
As pointed out by @Holger, if the key "description" isn't unique across the maps, but rather the pair ("description", "Calculation") is unique:
items.stream()
.anyMatch(m -> m.replace("description", "Calculation", "Rating"));
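A small runnable sketch of that anyMatch variant, in case it helps to see it work. The replaceFirst method name is made up for the example. It relies on Map.replace(key, oldValue, newValue) returning true only when the pair was present, and on a side effect inside the predicate, which is fine for a sequential stream like this one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReplaceOnceDemo {

    // Replaces ("description" -> "Calculation") with "Rating" in the first map that holds
    // exactly that pair; returns true if a replacement happened.
    static boolean replaceFirst(List<Map<String, Object>> items) {
        return items.stream()
                .anyMatch(m -> m.replace("description", "Calculation", "Rating"));
    }

    public static void main(String[] args) {
        Map<String, Object> first = new HashMap<>();
        first.put("name", "widget");
        Map<String, Object> second = new HashMap<>();
        second.put("description", "Calculation");
        List<Map<String, Object>> items = Arrays.asList(first, second);

        System.out.println(replaceFirst(items));        // prints true
        System.out.println(second.get("description"));  // prints Rating
    }
}
```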

Find most common/frequent element in an ArrayList in Java

I have an array list with 5 elements each of which is an Enum. I want to build a method which returns another array list with the most common element(s) in the list.
Example 1:
[Activities.WALKING, Activities.WALKING, Activities.WALKING, Activities.JOGGING, Activities.STANDING]
Method would return: [Activities.WALKING]
Example 2:
[Activities.WALKING, Activities.WALKING, Activities.JOGGING, Activities.JOGGING, Activities.STANDING]
Method would return: [Activities.WALKING, Activities.JOGGING]
WHAT HAVE I TRIED:
My idea was to declare a count for every activity but that means that if I want to add another activity, I have to go and modify the code to add another count for that activity.
Another idea was to declare a HashMap<Activities, Integer> and iterate the array to insert each activity and its occurrence count in it. But then how will I extract the Activities with the most occurrences?
Can you help me out guys?
The most common way of implementing something like this is counting with a Map: define a Map<MyEnum,Integer> which stores zeros for each element of your enumeration. Then walk through your list, and increment the counter for each element that you find in the list. At the same time, maintain the current max count. Finally, walk through the counter map entries, and add to the output list the keys of all entries whose count matches the value of max.
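The steps above can be sketched roughly as follows. The Activities enum is a stand-in with just the three values from the question, and instead of pre-filling the map with zeros this version uses an EnumMap with merge(), which starts each counter on first sight; that is a small deviation from the description, not a different algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class ModeDemo {

    // Stand-in enum with the three values named in the question.
    enum Activities { WALKING, JOGGING, STANDING }

    static List<Activities> mostCommon(List<Activities> input) {
        Map<Activities, Integer> counts = new EnumMap<>(Activities.class);
        int max = 0;
        // Count each element and track the highest count seen so far.
        for (Activities a : input) {
            int count = counts.merge(a, 1, Integer::sum);
            max = Math.max(max, count);
        }
        // Collect every key whose count equals the maximum.
        List<Activities> result = new ArrayList<>();
        for (Map.Entry<Activities, Integer> e : counts.entrySet()) {
            if (e.getValue() == max) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(mostCommon(Arrays.asList(
                Activities.WALKING, Activities.WALKING, Activities.JOGGING,
                Activities.JOGGING, Activities.STANDING))); // prints [WALKING, JOGGING]
    }
}
```

Adding a new activity only requires a new enum constant; the counting code stays the same.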
In statistics, this is called the "mode" (in your specific case, "multi mode" is also used, as you want all values that appear most often, not just one). A vanilla Java 8 solution looks like this:
Map<Activities, Long> counts =
Stream.of(WALKING, WALKING, JOGGING, JOGGING, STANDING)
.collect(Collectors.groupingBy(s -> s, Collectors.counting()));
long max = Collections.max(counts.values());
List<Activities> result = counts
.entrySet()
.stream()
.filter(e -> e.getValue().longValue() == max)
.map(Entry::getKey)
.collect(Collectors.toList());
Which yields:
[WALKING, JOGGING]
jOOλ is a library that supports modeAll() on streams. The following program:
System.out.println(
Seq.of(WALKING, WALKING, JOGGING, JOGGING, STANDING)
.modeAll()
.toList()
);
Yields:
[WALKING, JOGGING]
(disclaimer: I work for the company behind jOOλ)
