Using Java 8 stream methods to get the last max value

Given a list of items with properties, I am trying to get the last item to appear with a maximum value of said property.
For example, for the following list of objects:
t    i
A:   3
D:   7 *
F:   4
C:   5
X:   7 *
M:   6
I can get one of the Things with the highest i:
Thing t = items.stream()
        .max(Comparator.comparingLong(Thing::getI))
        .orElse(null);
However, this will get me Thing t = D. Is there a clean and elegant way of getting the last item, i.e. X in this case?
One possible solution is using the reduce function. However, the property is calculated on the fly and it would look more like:
Thing t = items.stream()
        .reduce((left, right) -> {
            long leftValue = valueFunction.apply(left);
            long rightValue = valueFunction.apply(right);
            return leftValue > rightValue ? left : right;
        })
        .orElse(null);
The valueFunction now needs to be called nearly twice as often.
Other obvious roundabout solutions are:
Store the object in a Tuple with its index
Store the object in a Tuple with its computed value
Reverse the list beforehand
Don't use Streams

Remove the equality case from the comparator, i.e. write your own comparator that never returns 0: when the compared numbers are equal, return -1 instead. Because equal elements then compare as "less than", max() keeps the later of any two equal elements:
Thing t = items.stream()
        .max((a, b) -> a.getI() > b.getI() ? 1 : -1)
        .orElse(null);

Conceptually, you seem to be looking for something like thenComparing on the index of the element in the list:
Thing t = items.stream()
        .max(Comparator.comparingLong(Thing::getI).thenComparing(items::indexOf))
        .orElse(null);

To avoid the multiple applications of valueFunction in your reduce solution, simply explicitly calculate the result and put it in a tuple:
Item lastMax = items.stream()
        .map(item -> new AbstractMap.SimpleEntry<Item, Long>(item, valueFunction.apply(item)))
        .reduce((l, r) -> l.getValue() > r.getValue() ? l : r)
        .map(Map.Entry::getKey)
        .orElse(null);

Streams are not necessarily bad if you do things in two steps:
1) Find the maximum i value (as you did)
2) Search for the last element with this i value, starting from the end of items:
Thing t = items.stream()
        .max(Comparator.comparingLong(Thing::getI))
        .map(firstMaxThing ->
                IntStream.rangeClosed(1, items.size())
                        .mapToObj(i -> items.get(items.size() - i))
                        .filter(item -> item.getI() == firstMaxThing.getI())
                        .findFirst()
                        .get()) // get() cannot fail here, as max() returned something
        .orElse(null);

"The valueFunction now needs to be called nearly twice as often."
Note that even when using max, the getI method is called again for every comparison, not just once per element. In your example it's called 11 times, including 6 times for D, and for longer lists, too, it works out to roughly two calls per element on average.
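To make that visible, here is a minimal, hypothetical sketch (a stand-in Thing class with an instrumented getter, not the actual class from the question):
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class MaxCallCount {
    static final AtomicInteger CALLS = new AtomicInteger(); // counts getI() invocations

    static final class Thing {
        final String name;
        final long i;
        Thing(String name, long i) { this.name = name; this.i = i; }
        long getI() {
            CALLS.incrementAndGet();
            return i;
        }
    }

    public static void main(String[] args) {
        List<Thing> items = Arrays.asList(
                new Thing("A", 3), new Thing("D", 7), new Thing("F", 4),
                new Thing("C", 5), new Thing("X", 7), new Thing("M", 6));
        items.stream()
                .max(Comparator.comparingLong(Thing::getI))
                .ifPresent(t -> System.out.println("max = " + t.name));
        System.out.println("getI() calls: " + CALLS.get()); // roughly 2 * (n - 1), not n
    }
}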
How about you just cache the calculated value directly in the Thing instance? If this is not possible, you could use an external Map and computeIfAbsent to calculate the value only once for each Thing, and then use your approach using reduce.
Map<Thing, Long> cache = new HashMap<>();
Thing x = items.stream()
        .reduce((left, right) -> {
            long leftValue = cache.computeIfAbsent(left, Thing::getI);
            long rightValue = cache.computeIfAbsent(right, Thing::getI);
            return leftValue > rightValue ? left : right;
        })
        .orElse(null);
Or a bit cleaner, calculating all the values beforehand:
Map<Thing, Long> cache = items.stream()
        .collect(Collectors.toMap(x -> x, Thing::getI));
Thing x = items.stream()
        .reduce((left, right) -> cache.get(left) > cache.get(right) ? left : right)
        .orElse(null);

You can still use reduction to get this done. Only if t1 is larger will it keep t1; in all other cases it keeps t2. So whenever t2 is larger, or t1 and t2 are equal, it eventually returns t2, adhering to your requirement.
Thing t = items.stream()
        .reduce((t1, t2) -> t1.getI() > t2.getI() ? t1 : t2)
        .orElse(null);

Your current implementation using reduce looks good, unless your value-extractor function is costly.
Considering that you may later want to reuse the logic for different object types and fields, I would extract it into separate generic methods:
// assumes static imports of Function.identity and Comparator.comparing
public static <T, K, V> Function<T, Map.Entry<K, V>> toEntry(Function<T, K> keyFunc, Function<T, V> valueFunc) {
    return t -> new AbstractMap.SimpleEntry<>(keyFunc.apply(t), valueFunc.apply(t));
}

public static <ITEM, FIELD extends Comparable<FIELD>> Optional<ITEM> maxBy(Function<ITEM, FIELD> extractor, Collection<ITEM> items) {
    return items.stream()
            .map(toEntry(identity(), extractor))
            .max(comparing(Map.Entry::getValue))
            .map(Map.Entry::getKey);
}
The code snippets above can be used like this:
Thing maxThing = maxBy(Thing::getField, things).orElse(null);
AnotherThing maxAnotherThing = maxBy(AnotherThing::getAnotherField, anotherThings).orElse(null);

Collector toConcurrentMap() - What should the MergeFunction be for a stream of unique elements

Create map using parallel stream
I have this working code to create a map and populate using parallel stream
SortedMap<Double, Double> map = new TreeMap<>();
for (double i = 10d; i < 50d; i += 2d)
    map.put(i, null);
map.entrySet().parallelStream().forEach(e -> {
    double res = bigfunction(e.getKey());
    e.setValue(100d - res);
});
I'm working on reimplementing the above code in a proper way.
Incomplete code:
SortedMap<Double, Double> map = DoubleStream.iterate(10d, i -> i + 2d).limit(20)
        .parallel()
        .collect(Collectors.toConcurrentMap(
                k -> k, k -> {
                    double res = bigfunction(k);
                    return 100d - res;
                },
                null,
                TreeMap::new
        ));
What should the mergeFunction be in this case?
What changes are required to make this work?
In this case, you can throw from the mergeFunction of the collector, since you don't expect it to be used.
(left, right) -> {
    throw new AssertionError("Duplicate has been encountered: " + left);
}
If, at a later point in time, someone altered the stream source in such a way that the stream data was no longer unique, it would immediately become clear that this change goes against the initial logic. Meanwhile, a dummy merger like (a, b) -> b would hide the fact that data is being processed and then thrown away.
It's also worth being aware that while performing calculations with double, your result might turn out to be inaccurate, because this type is inherently incapable of precisely representing fractions whose denominators are not powers of 2. For that reason, some data might be lost. If your bigfunction() does a lot of calculations and losing precision is not acceptable, you might consider using BigDecimal, but keep in mind that the computations would become much heavier.
Another important thing to consider is that the collector toConcurrentMap() expects a concurrent implementation of Map (credits to @DuncG for spotting this). Unlike non-concurrent collectors such as toMap(), which create a new instance of the container for each thread and then merge the containers, concurrent collectors like toConcurrentMap() create only one container: every thread operates on the same map, so it must be thread-safe. Therefore TreeMap is not suitable for this scenario. Instead, we can use ConcurrentSkipListMap, which also keeps its data ordered by key.
While dealing with a ConcurrentSkipListMap (or TreeMap), the NavigableMap interface is more useful because it gives access to methods like higherEntry(), higherKey(), etc., which are not available on SortedMap.
This is how the code might look:
NavigableMap<Double, Double> map = DoubleStream.iterate(10d, i -> i + 2d)
        .limit(20)
        .parallel()
        .boxed()
        .collect(Collectors.toConcurrentMap(
                Function.identity(),
                k -> 100d - bigfunction(k),
                (left, right) -> {
                    throw new AssertionError("Duplicate has been encountered: " + left);
                },
                ConcurrentSkipListMap::new
        ));
A few things. First, you need to box the primitive DoubleStream by calling boxed() in order to use it with a Collector. Second, you don't need toConcurrentMap(). It's safe to call toMap() on a parallel stream. Finally, the merge function will never be invoked in this scenario, because you're iterating over distinct keys, so I would go with what's simplest:
SortedMap<Double, Double> map = DoubleStream.iterate(10d, i -> i + 2d)
        .limit(20)
        .boxed()
        .parallel()
        .collect(Collectors.toMap(
                i -> i, i -> 100d - bigfunction(i), (a, b) -> b, TreeMap::new));

Best solution to collapse a list of objects if necessary

Today I am trying to collapse a list of objects if they have certain characteristics.
My idea is to definitely use stream().
Imagine an object made like this:
public class ObjectA {
    private Integer priority;
    private LocalDate date;
    private String string;
}
and I have a:
List<ObjectA> objects
I would like to collapse objects in this list that have the same date. If two objects have dates that fall within a specific time window, only the one with the highest priority remains; if the priorities are equal, the one with a specific constant in the attribute "string" remains, and the other is eliminated from the list.
Example, this:
window = 1; // days
constantString = "aab";
[{"priority":1, "date":"2021-09-22", "aaa"},
{"priority":1, "date":"2021-09-23", "aab"},
{"priority":1, "date":"2027-10-09", "bbb"}]
becomes this:
[{"priority":2, "date":"2021-09-23", "aab"},
{"priority":1, "date":"2027-10-09", "bbb"}]
What do you think is the best solution in terms of efficiency, considering that this list could have many elements?
Thanks a lot for your help!
I am not sure if it is an optimal solution, however, if you insist on using the Stream API, you can achieve it with a bunch of collectors:
groupingBy to group by date into Map<LocalDate, List<ObjectA>>.
maxBy to reduce each List<ObjectA> to the single object with the highest priority (hence the Comparator).
Since the collectors above result in an ugly Map<LocalDate, Optional<ObjectA>>, use collectingAndThen to extract what you need back into a List<ObjectA> using another Stream.
final String specificConstant = "ccc";
List<ObjectA> filtered = list.stream().collect(Collectors.collectingAndThen(
        Collectors.groupingBy(
                ObjectA::getDate,
                Collectors.maxBy(Comparator
                        .comparing(ObjectA::getPriority)
                        .thenComparing(objA -> specificConstant.equals(objA.getString())))),
        map -> map.values().stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList())));
A full example with a result printed out into the console:
List<ObjectA> list = List.of(
        new ObjectA(1, LocalDate.parse("2021-09-22"), "aaa"),
        new ObjectA(2, LocalDate.parse("2021-09-22"), "aaa"),
        new ObjectA(2, LocalDate.parse("2021-09-22"), "ccc"),
        new ObjectA(1, LocalDate.parse("2021-09-09"), "bbb")
);
final String specificConstant = "ccc";
List<ObjectA> filtered = list.stream().collect(Collectors.collectingAndThen(
        Collectors.groupingBy(
                ObjectA::getDate,
                Collectors.maxBy(Comparator
                        .comparing(ObjectA::getPriority)
                        .thenComparing(c -> specificConstant.equals(c.getConstant())))),
        map -> map.values().stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList())));
[ObjectA(priority=1, date=2021-09-09, constant=bbb), ObjectA(priority=2, date=2021-09-22, constant=ccc)]
Here is how you could do this using streams:
Collection<ObjectA> collapsed = values.stream()
        .collect(Collectors.toMap(
                ObjectA::getDate,
                Function.identity(),
                (a, b) -> a.priority < b.priority ? b : a))
        .values();
The Collectors.toMap method takes the following arguments:
Function<? super T, ? extends K> keyMapper,
Function<? super T, ? extends U> valueMapper,
BinaryOperator<U> mergeFunction
The keyMapper determines the map key, we use the date from ObjectA.
The valueMapper determines the map value, this is the ObjectA instance itself (identity function).
The mergeFunction determines what happens when two values have the same key. We provide here a BinaryOperator which chooses the element with the higher priority.
EDIT: Whilst I was answering this, you completely changed the specs of the question!
You can do this in O(n) time, with n the number of elements in the list.
Convert the list to a HashMap with the dates as keys and the objects as values. Then, when you hit an element with the same date as one you've already recorded, you can replace the value for that date key if the new element has a higher priority.
Then, you can convert the HashMap back into a list if you want.
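A minimal sketch of that idea (assuming the ObjectA getters used in the answers above; the tie-break on the constant string is left out for brevity):
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// assumes ObjectA from the question, with getDate() and getPriority() getters
static List<ObjectA> collapseByDate(List<ObjectA> objects) {
    Map<LocalDate, ObjectA> byDate = new HashMap<>();
    for (ObjectA incoming : objects) {
        // keep whichever of the stored and incoming element has the higher priority
        byDate.merge(incoming.getDate(), incoming,
                (current, candidate) ->
                        candidate.getPriority() > current.getPriority() ? candidate : current);
    }
    return new ArrayList<>(byDate.values());
}
Each element is visited once and each map operation is O(1) on average, giving the O(n) bound mentioned above.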

Java: is there more elegant way to extract a new list from an existing list using reduce?

I'm trying to take a list of elements, manipulate some of them, and put the output of these manipulations in a new list.
I want to do it with only one iteration over the list.
The only way I found to do it is:
List<Integer> newList = numList.stream().reduce(new ArrayList<Integer>(),
        (acc, value) -> {
            if (value % 2 == 0) {
                acc.add(value * 10);
            }
            return acc;
        },
        (l1, l2) -> {
            l1.addAll(l2);
            return l1;
        }
);
As you can see, it's very cumbersome.
I can of course use filter and then map, but in this case I iterate the list twice.
In other languages (Javascript for example) this kind of reduce operation is very straightforward, for example:
arr.reduce((acc, value) => {
    if (value % 2 == 0) {
        acc.push(value * 10);
    }
    return acc;
}, new Array())
Amazing!
I was wondering whether Java has some nicer version of this reduce, or whether the Java code I wrote is the shortest way to do such an operation.
"I can of course use filter and then map, but in this case I iterate the list twice."
That's not true. The elements of the Stream are iterated once, no matter how many intermediate operations you have.
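You can observe this with peek (a hypothetical sketch; peek is used here only to watch the traversal):
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

List<Integer> numList = Arrays.asList(1, 2, 3, 4);
List<Integer> newList = numList.stream()
        .peek(v -> System.out.println("visiting " + v)) // printed once per element
        .filter(v -> v % 2 == 0)
        .map(v -> v * 10)
        .collect(Collectors.toList());
// prints "visiting 1" .. "visiting 4" exactly once each: one pass over the data,
// no matter how many intermediate operations follow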
Besides, the correct way to perform mutable reduction is to use collect:
List<Integer> newList =
        numList.stream()
               .filter(v -> v % 2 == 0)
               .map(v -> v * 10)
               .collect(Collectors.toList());
What you're doing is the opposite of functional programming, because you're using functions only to get a side effect.
You can simply not use reduce at all, which makes the intent even more explicit:
List<Integer> newList = numList.stream()
        .filter(value -> value % 2 == 0)
        .map(value -> value * 10)
        .collect(Collectors.toList());
Edit: Streams iterate over the collection once by definition. According to the javadoc:
A stream should be operated on (invoking an intermediate or terminal stream operation) only once.

Java 8 Streams: Count the occurrence of elements (List<String> list1) from a list of text data (List<String> list2)

Input :
List<String> elements= new ArrayList<>();
elements.add("Oranges");
elements.add("Figs");
elements.add("Mangoes");
elements.add("Apple");
List<String> listofComments = new ArrayList<>();
listofComments.add("Apples are better than Oranges");
listofComments.add("I love Mangoes and Oranges");
listofComments.add("I don't know like Figs. Mangoes are my favorites");
listofComments.add("I love Mangoes and Apples");
Output: [Mangoes, Apples, Oranges, Figs] -> The output must be in descending order of the number of occurrences of the elements. If elements appear an equal number of times, they must be arranged alphabetically.
I am new to Java 8 and came across this problem. I tried solving it partially; I couldn't sort it. Can anyone help me with better code?
My piece of code:
Function<String, Map<String, Long>> function = f -> {
    Long count = listofComments.stream()
            .filter(e -> e.toLowerCase().contains(f.toLowerCase())).count();
    Map<String, Long> map = new HashMap<>(); // creates a map for every element. Is it right?
    map.put(f, count);
    return map;
};
elements.stream().sorted().map(function).forEach(e -> System.out.print(e));
Output: {Apple=2}{Figs=1}{Mangoes=3}{Oranges=2}
In real life scenarios you would have to consider that applying an arbitrary number of match operations to an arbitrary number of comments can become quite expensive when the numbers grow, so it's worth doing some preparation:
Map<String, Predicate<String>> filters = elements.stream()
        .sorted(String.CASE_INSENSITIVE_ORDER)
        .map(s -> Pattern.compile(s, Pattern.LITERAL | Pattern.CASE_INSENSITIVE))
        .collect(Collectors.toMap(Pattern::pattern, Pattern::asPredicate,
                (a, b) -> { throw new AssertionError("duplicates"); }, LinkedHashMap::new));
The Predicate class is quite valuable even when not doing regex matching. The combination of the LITERAL and CASE_INSENSITIVE flags enables searches with the intended semantics without the need to convert entire strings to lower case (which, by the way, is not sufficient for all possible scenarios). For this kind of matching, the preparation will internally include building the necessary data structure for the Boyer-Moore algorithm, for more efficient search.
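For example, a small hypothetical check of what the flag combination buys you (asPredicate() matches on find(), i.e. anywhere in the string):
Predicate<String> figs = Pattern.compile("Figs", Pattern.LITERAL | Pattern.CASE_INSENSITIVE)
        .asPredicate();
System.out.println(figs.test("I don't like figs."));  // true: case-insensitive literal substring match
System.out.println(figs.test("No fruit here."));      // false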
This map can be reused.
For your specific task, one way to use it would be
filters.entrySet().stream()
        .map(e -> Map.entry(e.getKey(), listofComments.stream().filter(e.getValue()).count()))
        .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
        .forEachOrdered(e -> System.out.printf("%-7s%3d%n", e.getKey(), e.getValue()));
which will print for your example data:
Mangoes 3
Apple 2
Oranges 2
Figs 1
Note that the filters map is already sorted alphabetically, and the sorted step of the second stream operation is stable for streams with a defined encounter order, so it only needs to sort by occurrences; entries with equal counts keep their relative order, which is the alphabetical order from the source map.
Map.entry(…) requires Java 9 or newer. For Java 8, you'd have to use something like new AbstractMap.SimpleEntry<>(…) instead.
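For example, the map step in the snippet above would then read (a sketch of the Java 8 variant; everything else stays the same):
.map(e -> new AbstractMap.SimpleEntry<>(e.getKey(),
        listofComments.stream().filter(e.getValue()).count()))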
You can still modify your function to store a Map.Entry instead of a complete Map:
Function<String, Map.Entry<String, Long>> function = f -> Map.entry(f, listOfComments.stream()
        .filter(e -> e.toLowerCase().contains(f.toLowerCase())).count());
and then sort these entries before performing a terminal operation, forEach in your case, to print:
elements.stream()
        .map(function)
        .sorted(Comparator.comparing(Map.Entry<String, Long>::getValue)
                .reversed().thenComparing(Map.Entry::getKey))
        .forEach(System.out::println);
This will then give you as output the following:
Mangoes=3
Apples=2
Oranges=2
Figs=1
The first thing is to declare an additional class that will hold an element together with its count:
class ElementWithCount {
    private final String element;
    private final long count;

    ElementWithCount(String element, long count) {
        this.element = element;
        this.count = count;
    }

    String element() {
        return element;
    }

    long count() {
        return count;
    }
}
To compute the count, let's declare an additional function:
static long getElementCount(List<String> listOfComments, String element) {
    return listOfComments.stream()
            .filter(comment -> comment.contains(element))
            .count();
}
Now, to find the result, we need to transform the stream of elements into a stream of ElementWithCount objects, sort that stream by count, transform it back into a stream of elements, and collect it into the result list.
To make this task easier, let's define comparator as a separate variable:
Comparator<ElementWithCount> comparator = Comparator
        .comparing(ElementWithCount::count).reversed()
        .thenComparing(ElementWithCount::element);
and now as all parts are ready, final computation is easy:
List<String> result = elements.stream()
        .map(element -> new ElementWithCount(element, getElementCount(listOfComments, element)))
        .sorted(comparator)
        .map(ElementWithCount::element)
        .collect(Collectors.toList());
You can use Map.Entry instead of a separate class and inline getElementCount, so it becomes a "one-line" solution:
List<String> result = elements.stream()
        .map(element -> new AbstractMap.SimpleImmutableEntry<>(element,
                listOfComments.stream()
                        .filter(comment -> comment.contains(element))
                        .count()))
        .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                .thenComparing(Map.Entry.comparingByKey()))
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
But it's much harder to understand in this form, so I recommend splitting it into logical parts.

Collect to map skipping null key/values

Let's say I have some stream and want to collect it to a map like this:
stream.collect(Collectors.toMap(this::func1, this::func2));
But I want to skip null keys/values. Of course, I can do it like this:
stream.filter(t -> func1(t) != null)
      .filter(t -> func2(t) != null)
      .collect(Collectors.toMap(this::func1, this::func2));
But is there a more beautiful/efficient solution?
If you want to avoid evaluating the functions func1 and func2 twice, you have to store the results. E.g.
stream.map(t -> new AbstractMap.SimpleImmutableEntry<>(func1(t), func2(t)))
      .filter(e -> e.getKey() != null && e.getValue() != null)
      .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
This doesn't make the code shorter, and even the efficiency depends on the circumstances. This change pays off if the costs of evaluating func1 and func2 are high enough to compensate for the creation of temporary objects. In principle, the temporary objects could get optimized away, but this isn't guaranteed.
Starting with Java 9, you can replace new AbstractMap.SimpleImmutableEntry<>(…) with Map.entry(…). Since this entry type disallows null right from the start, it would need filtering before constructing the entry:
stream.flatMap(t -> {
          Type1 value1 = func1(t);
          Type2 value2 = func2(t);
          return value1 != null && value2 != null ? Stream.of(Map.entry(value1, value2)) : null;
      })
      .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
Alternatively, you may use a pair type of one of the libraries you’re already using (the Java API itself doesn’t offer such a type).
Another way to avoid evaluating the functions twice is to use a pair class of your choice. It's not as concise as Holger's, but it's a little less dense, which can be easier to read.
stream.map(A::doFuncs)
      .flatMap(Optional::stream)
      .collect(Collectors.toMap(Pair::getKey, Pair::getValue));

private static Optional<Pair<Bar, Baz>> doFuncs(Foo foo)
{
    final Bar bar = func1(foo);
    final Baz baz = func2(foo);
    if (bar == null || baz == null) return Optional.empty();
    return Optional.of(new Pair<>(bar, baz));
}
(Choose proper names - I didn't know what types you were using)
One option is to do as in the other answers, i.e. use a Pair type, or an implementation of Map.Entry. Another approach used in functional programming would be to memoize the functions. According to Wikipedia:
memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
So you could do it by caching the results of the functions in maps:
public static <K, V> Function<K, V> memoize(Function<K, V> f) {
    Map<K, V> map = new HashMap<>();
    return k -> map.computeIfAbsent(k, f);
}
Then, use the memoized functions in the stream:
Function<E, K> memoizedFunc1 = memoize(this::func1);
Function<E, V> memoizedFunc2 = memoize(this::func2);
stream.filter(t -> memoizedFunc1.apply(t) != null)
      .filter(t -> memoizedFunc2.apply(t) != null)
      .collect(Collectors.toMap(memoizedFunc1, memoizedFunc2));
Here E stands for the type of the elements of the stream, K stands for the type returned by func1 (which is the type of the keys of the map) and V stands for the type returned by func2 (which is the type of the values of the map).
This is a naive solution, but it does not call the functions twice and does not create extra objects (note that the no-op combiner makes this safe only for sequential streams):
List<Integer> ints = Arrays.asList(1, null, 2, null, 3);
Map<Integer, Integer> res = ints.stream().collect(LinkedHashMap::new, (lhm, i) -> {
    final Integer integer1 = func1(i);
    final Integer integer2 = func2(i);
    if (integer1 != null && integer2 != null) {
        lhm.put(integer1, integer2);
    }
}, (lhm1, lhm2) -> {});
You could create an isFunc1AndFunc2NotNull() method in the current class:
boolean isFunc1AndFunc2NotNull(Foo foo) {
    return func1(foo) != null && func2(foo) != null;
}
And change your stream like this:
stream.filter(this::isFunc1AndFunc2NotNull)
      .collect(Collectors.toMap(this::func1, this::func2));
