Java 8 stream groupingBy sum of composite variable

I have a class Something which contains an instance variable Anything.
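class Anything {
private final int id;
private final int noThings;
public Anything(int id, int noThings) {
this.id = id;
this.noThings = noThings;
}
// Getter used by the answers below (Anything::getNoThings)
public int getNoThings() {
return noThings;
}
}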
class Something {
private final int parentId;
private final List<Anything> anythings;
// Getters must not be private if code outside the class is to call them
public int getParentId() {
return parentId;
}
public List<Anything> getAnythings() {
return anythings;
}
public Something(int parentId, List<Anything> anythings) {
this.parentId = parentId;
this.anythings = anythings;
}
}
Given a list of Somethings
List<Something> mySomethings = Arrays.asList(
new Something(123, Arrays.asList(new Anything(45, 65),
new Anything(568, 15),
new Anything(145, 27))),
new Something(547, Arrays.asList(new Anything(12, 123),
new Anything(678, 76),
new Anything(98, 81))),
new Something(685, Arrays.asList(new Anything(23, 57),
new Anything(324, 67),
new Anything(457, 87))));
I want to sort the Something objects by the total of their Anything objects' noThings, in descending order, and then sort the Anything objects inside each Something by noThings, also in descending order:
123 = 65+15+27 = 107 (3rd)
547 = 123+76+81 = 280 (1st)
685 = 57+67+87 = 211 (2nd)
So that I end up with
List<Something> orderedSomethings = Arrays.asList(
new Something(547, Arrays.asList(new Anything(12, 123),
new Anything(98, 81),
new Anything(678, 76))),
new Something(685, Arrays.asList(new Anything(457, 87),
new Anything(324, 67),
new Anything(23, 57))),
new Something(123, Arrays.asList(new Anything(45, 65),
new Anything(145, 27),
new Anything(568, 15))));
I know that I can get the list of Anythings per parent Id
Map<Integer, List<Anything>> anythings
= mySomethings.stream()
.collect(Collectors.toMap(Something::getParentId,
Something::getAnythings));
But after that I'm a bit stuck.

Unless I'm mistaken, you cannot do both sorts in one go. But since they are independent of each other (the sum of the noThings of the Anythings in a Something does not depend on their order), this does not matter much. Just sort one after the other.
To sort the Anythings inside the Somethings by their noThings:
mySomethings.stream().map(Something::getAnythings)
.forEach(as -> as.sort(Comparator.comparing(Anything::getNoThings)
.reversed()));
To sort the Somethings by the sum of the noThings of their Anythings:
mySomethings.sort(Comparator.comparing((Something s) -> s.getAnythings().stream()
.mapToInt(Anything::getNoThings).sum())
.reversed());
Note that both those sorts will modify the respective lists in-place.
As pointed out by @Tagir, the second sort recalculates the sum of the Anythings for each pair of Somethings compared during the sort. If the lists are long, this can be very wasteful. Instead, you could first calculate the sums once, store them in a map, and then just look up the values.
Map<Something, Integer> sumsOfThings = mySomethings.stream()
.collect(Collectors.toMap(s -> s, s -> s.getAnythings().stream()
.mapToInt(Anything::getNoThings).sum()));
mySomethings.sort(Comparator.comparing(sumsOfThings::get).reversed());
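Putting the two sorts together, a minimal self-contained sketch could look like the following. It assumes that the getters are accessible and that all lists are modifiable. The names CachedSumSort, sortAll and sample are made up for illustration, and the IdentityHashMap is a deliberate variation so that the cached sums do not depend on how Something implements equals/hashCode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CachedSumSort {
    static class Anything {
        final int id;
        final int noThings;
        Anything(int id, int noThings) { this.id = id; this.noThings = noThings; }
        int getNoThings() { return noThings; }
    }

    static class Something {
        final int parentId;
        final List<Anything> anythings;
        Something(int parentId, List<Anything> anythings) {
            this.parentId = parentId;
            this.anythings = anythings;
        }
        int getParentId() { return parentId; }
        List<Anything> getAnythings() { return anythings; }
    }

    /** Sorts both levels in place: outer by descending total, inner by descending noThings. */
    static void sortAll(List<Something> somethings) {
        // Compute every sum exactly once, before sorting.
        Map<Something, Integer> sums = new IdentityHashMap<>();
        for (Something s : somethings) {
            sums.put(s, s.getAnythings().stream().mapToInt(Anything::getNoThings).sum());
        }
        // The comparator only performs map lookups.
        somethings.sort(Comparator.comparingInt((Something s) -> sums.get(s)).reversed());
        for (Something s : somethings) {
            s.getAnythings().sort(Comparator.comparingInt(Anything::getNoThings).reversed());
        }
    }

    /** The sample data from the question. */
    static List<Something> sample() {
        return new ArrayList<>(Arrays.asList(
            new Something(123, Arrays.asList(new Anything(45, 65), new Anything(568, 15), new Anything(145, 27))),
            new Something(547, Arrays.asList(new Anything(12, 123), new Anything(678, 76), new Anything(98, 81))),
            new Something(685, Arrays.asList(new Anything(23, 57), new Anything(324, 67), new Anything(457, 87)))));
    }

    public static void main(String[] args) {
        List<Something> list = sample();
        sortAll(list);
        for (Something s : list) {
            System.out.println(s.getParentId() + " " + s.getAnythings().stream()
                .map(Anything::getNoThings).collect(Collectors.toList()));
        }
        // prints:
        // 547 [123, 81, 76]
        // 685 [87, 67, 57]
        // 123 [65, 27, 15]
    }
}
```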

The problem with the other solutions is that the sums are not stored anywhere during sorting, so for large inputs the sum for each element is recalculated several times, which hurts performance. An alternative is to create intermediate pairs of (something, sum), sort by sum, then extract the something and discard the sum. Here's how it can be done with the Stream API, using SimpleImmutableEntry as the pair class:
List<Something> orderedSomethings = mySomethings.stream()
.map(smth -> new AbstractMap.SimpleImmutableEntry<>(smth, smth
.getAnythings().stream()
.mapToInt(Anything::getNoThings).sum()))
.sorted(Entry.<Something, Integer>comparingByValue().reversed())
.map(Entry::getKey)
.collect(Collectors.toList());
There's some syntactic sugar available in my free StreamEx library which makes the code a little bit cleaner:
List<Something> orderedSomethings = StreamEx.of(mySomethings)
.mapToEntry(smth -> smth
.getAnythings().stream()
.mapToInt(Anything::getNoThings).sum())
.reverseSorted(Entry.comparingByValue())
.keys().toList();
As for sorting the Anythings inside each Something, the other solutions are fine.

In the end I added an extra method to the Something class.
public int getTotalNoThings() {
return anythings.stream().collect(Collectors.summingInt(Anything::getNoThings));
}
then I used this method to sort by total noThings (desc)
somethings = somethings.stream()
.sorted(Comparator.comparing(Something::getTotalNoThings).reversed())
.collect(Collectors.toList());
and then I used the code suggested above (thanks!) to sort by the Anything instance noThings
somethings.stream().map(Something::getAnythings)
.forEach(as -> as.sort(Comparator.comparing(Anything::getNoThings).reversed()));
Thanks again for help.

Related

How do I get a list of items from a subset of a list, based on the items in another list in Java?

I've got a List<String> which represents the ID's (can be duplicate), of items from another list, which is a List<Cheat>, where each Cheat has a String ID and a List<Integer> RNG. Both have accessor methods in Cheat.
I need to convert this list of ID's, into a list of RNG's for each Cheat that I have been supplied with the ID for.
For example, I could have 3 Cheats:
1:{ID:1, RNG:{1,2,3}}
2:{ID:2, RNG{1,2}}
3:{ID:3, RNG:{1}}
And a List of ID's of:
{3,1,1,2}.
I would need to end up with a final list of {1,1,2,3,1,2,3,1,2}, which is the RNG's of Cheat 3, then the RNG's of cheat 1, then the RNG's of cheat 1 again, then finally the RNG's of cheat 2.
If anyone could help me out it would be appreciated. Thank you.
I've tried and failed with:
ImmutableList<Integer> sequenceRngs = cheatIds.stream()
.map(s -> cheats.stream()
.filter(cheat -> cheat.getId().equals(s))
.findFirst()
.map(cheat -> cheat.getRng()))
.flatMap(cheat -> cheat.getRng())
.collect(ListUtils.toImmutableList());
One possible solution:
import java.util.List;
import java.util.stream.Collectors;
class Scratch {
static class Cheat {
int id;
List<Integer> rng;
public Cheat(int id, List<Integer> rng) {
this.id = id;
this.rng = rng;
}
}
public static void main(String[] args) {
List<Cheat> allCheats = List.of(
new Cheat(1, List.of(1,2,3)),
new Cheat(2, List.of(1,2)),
new Cheat(3, List.of(1))
);
List<Integer> result = List.of(3, 1, 1, 2).stream()
.flatMap(id -> allCheats.stream()
.filter(cheat -> cheat.id == id)
.findFirst().orElseThrow().rng.stream())
.collect(Collectors.toList());
System.out.println(result);
}
}
The key is to use flatMap to get the result in a single - not nested - Collection in the end.
The lambda that you pass to flatMap should return a Stream, not a List. And you should handle the case where there's no such element in the stream - even if you are sure there is. Something like this should do:
final ImmutableList<Integer> sequenceRngs = cheatIds.stream().flatMap(id ->
cheats.stream().filter(cheat -> id.equals(cheat.getId()))
.findAny().orElseThrow(IllegalStateException::new)
.getRng().stream())
.collect(ListUtils.toImmutableList());
Also, I would propose converting the list of cheats to a map. That would simplify the code and reduce the cost of each lookup from O(n) to O(1).
You can attain that with the following steps:
Create a map from each cheat ID to its RNG list:
Map<String, List<Integer>> map = cheats.stream()
.collect(Collectors.toMap(Cheat::getId, Cheat::getRng));
Iterate over the cheatIds provided as input and look up the corresponding RNG list in the map to build the output:
List<Integer> output = cheatIds.stream()
.flatMap(ch -> map.get(ch).stream())
.collect(Collectors.toList());
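As a runnable illustration of the map-based lookup, here is a small sketch using the types described in the question (a String ID and a List<Integer> RNG). The class name CheatLookup and the method rngsFor are hypothetical, and throwing IllegalArgumentException for an unknown ID is just one possible policy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CheatLookup {
    static class Cheat {
        private final String id;
        private final List<Integer> rng;
        Cheat(String id, List<Integer> rng) { this.id = id; this.rng = rng; }
        String getId() { return id; }
        List<Integer> getRng() { return rng; }
    }

    /** Returns the RNG values of the requested cheats, in request order, duplicates included. */
    static List<Integer> rngsFor(List<String> cheatIds, List<Cheat> cheats) {
        // Build the index once; toMap throws if two cheats share an ID.
        Map<String, List<Integer>> rngById = cheats.stream()
            .collect(Collectors.toMap(Cheat::getId, Cheat::getRng));
        return cheatIds.stream()
            .flatMap(id -> {
                List<Integer> rng = rngById.get(id);
                if (rng == null) {
                    throw new IllegalArgumentException("Unknown cheat id: " + id);
                }
                return rng.stream(); // flatMap needs a Stream, not a List
            })
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Cheat> cheats = Arrays.asList(
            new Cheat("1", Arrays.asList(1, 2, 3)),
            new Cheat("2", Arrays.asList(1, 2)),
            new Cheat("3", Arrays.asList(1)));
        System.out.println(rngsFor(Arrays.asList("3", "1", "1", "2"), cheats));
        // prints [1, 1, 2, 3, 1, 2, 3, 1, 2]
    }
}
```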

Filter objects from a list that have the same member

I have a list of objects. The object looks like this:
public class Slots {
String slotType;
Visits visit;
}
public class Visits {
private long visitCode;
private String agendaCode;
private String scheduledTime;
private String resourceType;
private String resourceDescription;
private String visitTypeCode;
...
}
I need to find the elements that have the same agendaCode, visitTypeCode and scheduledTime and for the life of me I can't get it done.
I tried this:
Set<String> agendas = slotsResponse.getContent().stream()
.map(Slots::getVisit)
.map(Visits::getAgendaCode)
.collect(Collectors.toUnmodifiableSet());
Set<String> visitTypeCode = slotsResponse.getContent().stream()
.map(Slots::getVisit)
.map(Visits::getVisitTypeCode)
.collect(Collectors.toUnmodifiableSet());
Set<String> scheduledTime = slotsResponse.getContent().stream()
.map(Slots::getVisit)
.map(Visits::getScheduledTime)
.collect(Collectors.toUnmodifiableSet());
List<Slots> collect = slotsResponse.getContent().stream()
.filter(c -> agendas.contains(c.getVisit().getAgendaCode()))
.filter(c -> visitTypeCode.contains(c.getVisit().getVisitTypeCode()))
.filter(c -> scheduledTime.contains(c.getVisit().getScheduledTime()))
.collect(Collectors.toList());
But it's not doing what I thought it would. Ideally I would have a list of lists, where each sublist is a list of Slots objects that share the same agendaCode, visitTypeCode and scheduledTime. I struggle with functional programming so any help or pointers would be great!
This is Java 11 and I'm also using vavr.
Since you mentioned you're using vavr, here is the vavr way to solve this question.
Suppose you have your io.vavr.collection.List (or Array or Vector or Stream or similar vavr collection) of visits:
List<Visits> visits = ...;
final Map<Tuple3<String, String, String>, List<Visits>> grouped =
visits.groupBy(visit ->
Tuple.of(
visit.getAgendaCode(),
visit.getVisitTypeCode(),
visit.getScheduledTime()
)
);
Or with a java.util.List of visits:
List<Visits> visits = ...;
Map<Tuple3<String, String, String>, List<Visits>> grouped = visits.stream().collect(
Collectors.groupingBy(
visit ->
Tuple.of(
visit.getAgendaCode(),
visit.getVisitTypeCode(),
visit.getScheduledTime()
)
)
);
The easiest way is to define a new class with the necessary fields (agendaCode, visitTypeCode and scheduledTime). Don't forget about equals/hashCode.
public class Visits {
private long visitCode;
private String resourceType;
private String resourceDescription;
private Code code;
...
}
class Code {
private String agendaCode;
private String scheduledTime;
private String visitTypeCode;
...
@Override
public boolean equals(Object o) {...}
@Override
public int hashCode() {...}
}
Then you can use groupingBy like:
Map<Code, List<Slots>> map = slotsResponse.getContent().stream()
.collect(Collectors.groupingBy(s -> s.getVisit().getCode()));
Alternatively, you can implement equals and hashCode inside Visits using only agendaCode, visitTypeCode and scheduledTime. In that case, group by s.getVisit().
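For illustration, a key class with equals and hashCode written out might look like the sketch below. VisitKey, Visit and GroupByKeyClass are hypothetical stand-ins (Visit is a stripped-down version of the Visits class from the question), and Objects.equals/Objects.hash are just one common way to implement the two methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class GroupByKeyClass {
    /** Hypothetical value class holding only the three grouping fields. */
    static final class VisitKey {
        private final String agendaCode;
        private final String visitTypeCode;
        private final String scheduledTime;

        VisitKey(String agendaCode, String visitTypeCode, String scheduledTime) {
            this.agendaCode = agendaCode;
            this.visitTypeCode = visitTypeCode;
            this.scheduledTime = scheduledTime;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof VisitKey)) return false;
            VisitKey other = (VisitKey) o;
            return Objects.equals(agendaCode, other.agendaCode)
                && Objects.equals(visitTypeCode, other.visitTypeCode)
                && Objects.equals(scheduledTime, other.scheduledTime);
        }

        @Override
        public int hashCode() {
            // Must use the same fields as equals.
            return Objects.hash(agendaCode, visitTypeCode, scheduledTime);
        }
    }

    /** Stripped-down stand-in for the Visits class from the question. */
    static class Visit {
        final long visitCode;
        final VisitKey key;
        Visit(long visitCode, String agendaCode, String visitTypeCode, String scheduledTime) {
            this.visitCode = visitCode;
            this.key = new VisitKey(agendaCode, visitTypeCode, scheduledTime);
        }
        VisitKey getKey() { return key; }
    }

    static Map<VisitKey, List<Visit>> group(List<Visit> visits) {
        return visits.stream().collect(Collectors.groupingBy(Visit::getKey));
    }

    public static void main(String[] args) {
        List<Visit> visits = Arrays.asList(
            new Visit(1, "A1", "CHECKUP", "09:00"),
            new Visit(2, "A1", "CHECKUP", "09:00"),
            new Visit(3, "A2", "CHECKUP", "09:00"));
        System.out.println(group(visits).size()); // prints 2
    }
}
```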
I love Ruslan's idea of using Collectors::groupingBy. Nevertheless, I don't like creating a new class or defining a new equals method. Both tie you to a single grouping key. What if you want to group by other fields in other methods?
Here is a piece of code that should let you overcome this problem:
slotsResponse.getContent()
.stream()
.collect(Collectors.groupingBy(s -> Arrays.asList(s.getVisit().getAgendaCode(), s.getVisit().getVisitTypeCode(), s.getVisit().getScheduledTime())))
.values();
My idea was to create a new container for the needed fields (agendaCode, visitTypeCode, scheduledTime) of each slot and to compare slots via these newly created containers. I would have liked to do so with a simple Object array, but that doesn't work: arrays have to be compared with Arrays.equals, which is not the comparison used by Collectors::groupingBy (it relies on the key's equals and hashCode).
Please note that you should store somewhere or use a method to define which fields you want to group by.
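The difference between a List key and an array key can be seen in a small sketch. Here each String[] row stands in for one slot and holds {agendaCode, visitTypeCode, scheduledTime}; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ListKeyVsArrayKey {
    /** Groups by a List key: lists compare element by element, so equal fields share a group. */
    static int groupsWithListKey(List<String[]> rows) {
        Map<List<String>, List<String[]>> groups = rows.stream()
            .collect(Collectors.groupingBy(r -> Arrays.asList(r[0], r[1], r[2])));
        return groups.size();
    }

    /** Groups by an array key: arrays use identity equals/hashCode, so every row gets its own group. */
    static int groupsWithArrayKey(List<String[]> rows) {
        Map<Object[], List<String[]>> groups = rows.stream()
            .collect(Collectors.groupingBy(r -> new Object[] {r[0], r[1], r[2]}));
        return groups.size();
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"A1", "CHECKUP", "09:00"},
            new String[] {"A1", "CHECKUP", "09:00"},
            new String[] {"A2", "CHECKUP", "09:00"});
        System.out.println(groupsWithListKey(rows));  // prints 2
        System.out.println(groupsWithArrayKey(rows)); // prints 3
    }
}
```

Arrays.asList is used for the key rather than List.of because it tolerates null fields.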
The fields you want to group by are all strings. You can define a function that concatenates those field values and use the result as the key for your groups. Example:
Function<Slots, String> myFunc = s -> s.getVisit().getAgendaCode() + s.getVisit().getVisitTypeCode() + s.getVisit().getScheduledTime();
// or s.getVisit().getAgendaCode() + "-" + s.getVisit().getVisitTypeCode() + "-" + s.getVisit().getScheduledTime();
And then group as below:
Map<String,List<Slots>> result = slotsResponse.getContent().stream()
.collect(Collectors.groupingBy(myFunc));
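One caveat with concatenated keys is that different field combinations can produce the same string when no delimiter is used. The sketch below demonstrates this with made-up data; each String[] row stands in for {agendaCode, visitTypeCode, scheduledTime}, and the names ConcatKey, PLAIN and DELIMITED are hypothetical. A delimiter only helps if it cannot occur inside the field values themselves.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class ConcatKey {
    // Plain concatenation: "AB" + "C" and "A" + "BC" both yield "ABC".
    static final Function<String[], String> PLAIN = f -> f[0] + f[1] + f[2];

    // Delimited concatenation keeps the field boundaries visible in the key.
    static final Function<String[], String> DELIMITED = f -> String.join("|", f[0], f[1], f[2]);

    static int countGroups(List<String[]> rows, Function<String[], String> keyFn) {
        return rows.stream().collect(Collectors.groupingBy(keyFn)).size();
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"AB", "C", "09:00"},
            new String[] {"A", "BC", "09:00"});
        System.out.println(countGroups(rows, PLAIN));     // prints 1 (the two rows collide)
        System.out.println(countGroups(rows, DELIMITED)); // prints 2
    }
}
```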

Java 8 Comma Separated String to Object property

I have three comma-separated lists (list of bus, car, cycle) and I am trying to write them into Java object properties using Java 8 streams.
Please find below what I have tried :
public class Traffic {
public int car;
public int bus;
public int cycle;
public Traffic(int car, int bus,int cycle){
this.car = car;
this.bus = bus;
this.cycle = cycle;
}
}
public class Test {
public static void main(String[] args) {
String bus = "5,9,15,86";
String car = "6,12,18,51";
String cycle = "81,200,576,894";
String[] busArray = bus.split(",");
String[] carArray = car.split(",");
String[] cycleArray = cycle.split(",");
List<Traffic> trafficList =
Arrays.stream(values)
.mapToInt(Integer::parseInt)
.mapToObj((int i,j) -> new Traffic(i,j))
.collect(Collectors.toList());
}
}
I was struggling with getting all streams up and injected into object properties. (I want to create 4 objects in this case populating all 3 properties.)
Basically, I am looking for something like below:
List<Traffic> trafficList =
Arrays.stream(carArray,busArray,cycleArray)
.mapToInt(Integer::parseInt)
.mapToObj((int i,j,k) -> new Traffic(i,j,k))
.collect(Collectors.toList());
If you want to create 4 objects of Traffic then you can use the following :
List<Traffic> collect = IntStream.range(0, busArray.length)
// argument order must match the constructor: Traffic(car, bus, cycle)
.mapToObj(i -> new Traffic(Integer.parseInt(carArray[i]),
Integer.parseInt(busArray[i]),
Integer.parseInt(cycleArray[i])))
.collect(Collectors.toList());
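A self-contained version of this index-based approach is sketched below. The helper names parse and zip, the length check and the trimming of whitespace are additions for illustration and are not part of the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class TrafficZip {
    static class Traffic {
        final int car;
        final int bus;
        final int cycle;
        Traffic(int car, int bus, int cycle) {
            this.car = car;
            this.bus = bus;
            this.cycle = cycle;
        }
    }

    /** Parses a comma-separated list of integers. */
    static int[] parse(String csv) {
        return Arrays.stream(csv.split(","))
            .map(String::trim)
            .mapToInt(Integer::parseInt)
            .toArray();
    }

    /** Combines the i-th entries of the three lists into the i-th Traffic object. */
    static List<Traffic> zip(String cars, String buses, String cycles) {
        int[] car = parse(cars);
        int[] bus = parse(buses);
        int[] cycle = parse(cycles);
        if (car.length != bus.length || car.length != cycle.length) {
            throw new IllegalArgumentException("All three lists must have the same length");
        }
        return IntStream.range(0, car.length)
            .mapToObj(i -> new Traffic(car[i], bus[i], cycle[i]))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Traffic> list = zip("6,12,18,51", "5,9,15,86", "81,200,576,894");
        for (Traffic t : list) {
            System.out.println(t.car + " " + t.bus + " " + t.cycle);
        }
        // first line printed: 6 5 81
    }
}
```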
You just have to split your string and then map each value to your object.
Here I assume the value can be passed through the constructor of your Traffic object. If not, you can create it and set its value in 2 separate lines. The mapToInt is necessary if the value is expected to be an integer.
String original = "5,9,15,86";
String[] values = original.split(",");
List<Traffic> trafficList =
Arrays.stream(values)
.mapToInt(Integer::parseInt)
.mapToObj(Traffic::new) // IntStream.map would have to return an int; mapToObj creates objects
.collect(Collectors.toList());
Define a constructor in the class Traffic that takes an integer as its argument and assigns it to the value attribute of the class.
static class Traffic {
private int value;
public Traffic(int value) {
this.value = value;
}
}
Now assume the comma-delimited string is in a variable commaList, something like below.
String commaList = "1,3,5,6,7,8,9,100";
The following stream pipeline will return a list of Traffic objects with the value assigned.
List<Traffic> listOfIntegers =
Arrays.asList(commaList.split(","))
.stream()
.map(e -> new Traffic(Integer.valueOf(e)))
.collect(Collectors.toList());
If you really want an array, you can try the following
Arrays.stream("5,9,15,86".split(","))
.map(Traffic::new)
.toArray(Traffic[]::new);
If a List<Traffic> is also okay for you, I recommend this one
Arrays.stream("5,9,15,86".split(","))
.map(Traffic::new)
.collect(Collectors.toList());
And lastly, if you only have a constructor taking an int, for example, you can map the stream to int first, like this
Arrays.stream("5,9,15,86".split(","))
.mapToInt(Integer::valueOf)
.mapToObj(Traffic::new)
.collect(Collectors.toList());
EDIT
I answered this question before the question was edited; that's why it is only a partial answer.
EDIT2
Okay, I got it: I used map instead of mapToObj, what a huge mistake... But I found it thanks to @JavaMan's helpful answers (note that if you are using IntelliJ, it offers to replace map with mapToObj).

Merging objects from two list with unique id using Java 8

class Human {
Long humanId;
String name;
Long age;
}
class SuperHuman {
Long superHumanId;
Long humanId;
String name;
Long age;
}
I've two lists. List of humans and List of superHumans. I want to create a single list out of the two making sure that if a human is superhuman, it only appears once in the list using java 8. Is there a neat way to do it?
UPDATE: These are different classes i.e. neither extends the other. I want the final list to be of superhumans. If a human already is superhuman, we ignore that human object. If a human is not a superhuman, we convert the human object into the super human object. Ideally I would want to sort them by their age at the end so that I get a list of superhuman sorted by date in descending order.
Based on your updated question:
List<Human> humans = ... // init
List<SuperHuman> superHumans = ... // init
Set<Long> superHumanIds = superHumans.stream()
.map(SuperHuman::getHumanId)
.collect(toSet());
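humans.stream()
.filter(human -> !superHumanIds.contains(human.getHumanId())) // keep only humans that are not yet superhumans
.map(this::convert)
.forEach(superHumans::add); // superHumans must be a modifiable list
superHumans.sort(Comparator.comparing(SuperHuman::getAge).reversed()); // descending, as requested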
Assuming this class has another method with the following signature:
private SuperHuman convert(Human human) {
// mapping logic
}
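A self-contained sketch of the merge described in the question's update follows. The constructors, the getters and the convert logic (which leaves superHumanId as null) are assumptions made for the sake of the example; a new list is returned instead of mutating the input.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MergeHumans {
    static class Human {
        final Long humanId;
        final String name;
        final Long age;
        Human(Long humanId, String name, Long age) {
            this.humanId = humanId;
            this.name = name;
            this.age = age;
        }
        Long getHumanId() { return humanId; }
    }

    static class SuperHuman {
        final Long superHumanId;
        final Long humanId;
        final String name;
        final Long age;
        SuperHuman(Long superHumanId, Long humanId, String name, Long age) {
            this.superHumanId = superHumanId;
            this.humanId = humanId;
            this.name = name;
            this.age = age;
        }
        Long getHumanId() { return humanId; }
        String getName() { return name; }
        Long getAge() { return age; }
    }

    // Placeholder mapping: a converted human has no superHumanId yet.
    static SuperHuman convert(Human h) {
        return new SuperHuman(null, h.humanId, h.name, h.age);
    }

    /** Existing superhumans plus converted humans that are not superhumans yet, oldest first. */
    static List<SuperHuman> merge(List<Human> humans, List<SuperHuman> superHumans) {
        Set<Long> knownIds = superHumans.stream()
            .map(SuperHuman::getHumanId)
            .collect(Collectors.toSet());
        return Stream.concat(
                superHumans.stream(),
                humans.stream()
                    .filter(h -> !knownIds.contains(h.getHumanId()))
                    .map(MergeHumans::convert))
            .sorted(Comparator.comparing(SuperHuman::getAge).reversed())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Human> humans = Arrays.asList(new Human(1L, "Alan", 30L), new Human(3L, "Carl", 40L));
        List<SuperHuman> supers = Arrays.asList(
            new SuperHuman(10L, 3L, "Carl", 40L), new SuperHuman(11L, 5L, "Elise", 25L));
        merge(humans, supers).forEach(s -> System.out.println(s.getName()));
        // prints Carl, Alan, Elise (one per line)
    }
}
```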
You do have other suggestions about how your code should be re-factored to make it better, but in case you can't do that, there is a way - via a custom Collector that is not that complicated.
A custom collector also gives you the advantage of actually deciding which entry you want to keep - the one that is already collected or the one that is coming or latest in encounter order wins. It would require some code changes - but it's doable in case you might need it.
private static Collector<Human, ?, List<Human>> noDupCollector(List<SuperHuman> superHumans) {
class Acc {
ArrayList<Long> superIds = superHumans.stream()
.map(SuperHuman::getHumanId)
.collect(Collectors.toCollection(ArrayList::new));
ArrayList<Long> seen = new ArrayList<>();
ArrayList<Human> noDup = new ArrayList<>();
void add(Human elem) {
if (superIds.contains(elem.getHumanId())) {
if (!seen.contains(elem.getHumanId())) {
noDup.add(elem);
seen.add(elem.getHumanId());
}
} else {
noDup.add(elem);
}
}
Acc merge(Acc right) {
noDup.addAll(right.noDup);
return this;
}
public List<Human> finisher() {
return noDup;
}
}
return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finisher);
}
Supposing you have these entries:
List<SuperHuman> superHumans = Arrays.asList(
new SuperHuman(1L, 1L, "Superman"));
//
List<Human> humans = Arrays.asList(
new Human(1L, "Bob"),
new Human(1L, "Tylor"),
new Human(2L, "John"));
Doing this:
List<Human> noDup = humans.stream()
.collect(noDupCollector(superHumans));
System.out.println(noDup); // [Bob, John] (Tylor is dropped as a duplicate of id 1; John is not a superhuman, so he is kept)
Try this.
List<Object> result = Stream.concat(
humans.stream()
.filter(h -> superHumans.stream()
// humanId is a Long, so compare with equals rather than ==
.noneMatch(s -> Objects.equals(h.humanId, s.humanId))),
superHumans.stream())
.collect(Collectors.toList());
Suppose neither of the classes inherit the other one. I can imagine you have two lists:
Human alan = new Human(1, "Alan");
Human bertha = new Human(2, "Bertha");
Human carl = new Human(3, "Carl");
Human david = new Human(4, "David");
SuperHuman carlS = new SuperHuman(1, 3, "Carl");
SuperHuman eliseS = new SuperHuman(2, 0, "Elise");
List<Human> humans = new ArrayList<>(Arrays.asList(alan, bertha, carl, david));
List<SuperHuman> superHumans = new ArrayList<>(Arrays.asList(carlS, eliseS));
We see that Carl is listed as a human, and also as a superhuman. Two instances of the very same Carl exist.
List<Object> newList = humans.stream()
.filter((Human h) -> superHumans.stream()
// compare the boxed Long ids with equals, not ==
.noneMatch(s -> Objects.equals(s.getHumanId(), h.getHumanId())))
.collect(Collectors.toList());
newList.addAll(superHumans);
This code filters the list of humans, excluding all entries whose humanId exists in the list of superhumans. Finally, all superhumans are added.
This design has several problems:
Intuitively, the classes are related, but your code says otherwise. The fact that you are trying to merge, suggests the types are related.
The classes have overlapping properties (humanId and name) which as well suggest that the classes are related.
The assumption that the classes are related, is definitely not unfounded.
As other commenters suggested, you should redesign the classes:
class Human {
long id; // The name 'humanId' is redundant, just 'id' is fine
String name;
}
class SuperHuman extends Human {
long superHumanId; // I'm not sure why you want this...
}
Human alan = new Human(1, "Alan");
Human bertha = new Human(2, "Bertha");
Human carl = new SuperHuman(3, "Carl");
Human david = new Human(4, "David");
Human elise = new SuperHuman(5, "Elise");
List<Human> people = Arrays.asList(alan, bertha, carl, david, elise);
Then you have one single instance for each person. Should you ever need all superhumans from the list, just use this:
List<Human> superHumans = people.stream()
.filter((Human t) -> t instanceof SuperHuman)
.collect(Collectors.toList());
superHumans.forEach(System.out::println); // Carl, Elise

Group and Reduce list of objects

I have a list of objects with many duplicates and some fields that need to be merged. I want to reduce this down to a list of unique objects using only Java 8 Streams (I know how to do this via old-skool means but this is an experiment.)
This is what I have right now. I don't really like this because the map-building seems extraneous and the values() collection is a view of the backing map, and you need to wrap it in a new ArrayList<>(...) to get a more specific collection. Is there a better approach, perhaps using the more general reduction operations?
@Test
public void reduce() {
Collection<Foo> foos = Stream.of("foo", "bar", "baz")
.flatMap(this::getfoos)
.collect(Collectors.toMap(f -> f.name, f -> f, (l, r) -> {
l.ids.addAll(r.ids);
return l;
})).values();
assertEquals(3, foos.size());
foos.forEach(f -> assertEquals(10, f.ids.size()));
}
private Stream<Foo> getfoos(String n) {
return IntStream.range(0,10).mapToObj(i -> new Foo(n, i));
}
public static class Foo {
private String name;
private List<Integer> ids = new ArrayList<>();
public Foo(String n, int i) {
name = n;
ids.add(i);
}
}
If you break the grouping and reducing steps up, you can get something cleaner:
Stream<Foo> input = Stream.of("foo", "bar", "baz").flatMap(this::getfoos);
Map<String, Optional<Foo>> collect = input.collect(Collectors.groupingBy(f -> f.name, Collectors.reducing(Foo::merge)));
Collection<Optional<Foo>> collected = collect.values();
This assumes a few convenience methods in your Foo class:
public Foo(String n, List<Integer> ids) {
this.name = n;
this.ids.addAll(ids);
}
public static Foo merge(Foo src, Foo dest) {
List<Integer> merged = new ArrayList<>();
merged.addAll(src.ids);
merged.addAll(dest.ids);
return new Foo(src.name, merged);
}
As already pointed out in the comments, a map is a very natural thing to use when you want to identify unique objects. If all you needed to do was find the unique objects, you could use the Stream::distinct method. This method hides the fact that there is a map involved, but apparently it does use a map internally, as hinted by this question that shows you should implement a hashCode method or distinct may not behave correctly.
In the case of the distinct method, where no merging is necessary, it is possible to return some of the results before all of the input has been processed. In your case, unless you can make additional assumptions about the input that haven't been mentioned in the question, you do need to finish processing all of the input before you return any results. Thus this answer does use a map.
It is easy enough to use streams to process the values of the map and turn it back into an ArrayList, though. I show that in this answer, as well as providing a way to avoid the appearance of an Optional<Foo>, which shows up in one of the other answers.
public void reduce() {
ArrayList<Foo> foos = Stream.of("foo", "bar", "baz").flatMap(this::getfoos)
.collect(Collectors.collectingAndThen(Collectors.groupingBy(f -> f.name,
Collectors.reducing(Foo.identity(), Foo::merge)),
map -> map.values().stream().
collect(Collectors.toCollection(ArrayList::new))));
assertEquals(3, foos.size());
foos.forEach(f -> assertEquals(10, f.ids.size()));
}
private Stream<Foo> getfoos(String n) {
return IntStream.range(0, 10).mapToObj(i -> new Foo(n, i));
}
public static class Foo {
private String name;
private List<Integer> ids = new ArrayList<>();
private static final Foo BASE_FOO = new Foo("", 0);
public static Foo identity() {
return BASE_FOO;
}
// use only if side effects to the argument objects are okay
public static Foo merge(Foo fooOne, Foo fooTwo) {
if (fooOne == BASE_FOO) {
return fooTwo;
} else if (fooTwo == BASE_FOO) {
return fooOne;
}
fooOne.ids.addAll(fooTwo.ids);
return fooOne;
}
public Foo(String n, int i) {
name = n;
ids.add(i);
}
}
If the input elements are supplied in the random order, then having intermediate map is probably the best solution. However if you know in advance that all the foos with the same name are adjacent (this condition is actually met in your test), the algorithm can be greatly simplified: you just need to compare the current element with the previous one and merge them if the name is the same.
Unfortunately, there's no Stream API method that would allow you to do such a thing easily and efficiently. One possible solution is to write a custom collector like this:
public static List<Foo> withCollector(Stream<Foo> stream) {
return stream.collect(Collector.<Foo, List<Foo>>of(ArrayList::new,
(list, t) -> {
Foo f;
if(list.isEmpty() || !(f = list.get(list.size()-1)).name.equals(t.name))
list.add(t);
else
f.ids.addAll(t.ids);
},
(l1, l2) -> {
if(l1.isEmpty())
return l2;
if(l2.isEmpty())
return l1;
if(l1.get(l1.size()-1).name.equals(l2.get(0).name)) {
l1.get(l1.size()-1).ids.addAll(l2.get(0).ids);
l1.addAll(l2.subList(1, l2.size()));
} else {
l1.addAll(l2);
}
return l1;
}));
}
My tests show that this collector is always faster than collecting to map (up to 2x depending on average number of duplicate names), both in sequential and parallel mode.
Another approach is to use my StreamEx library which provides a bunch of "partial reduction" methods including collapse:
public static List<Foo> withStreamEx(Stream<Foo> stream) {
return StreamEx.of(stream)
.collapse((l, r) -> l.name.equals(r.name), (l, r) -> {
l.ids.addAll(r.ids);
return l;
}).toList();
}
This method accepts two arguments: a BiPredicate which is applied for two adjacent elements and should return true if elements should be merged and the BinaryOperator which performs merging. This solution is a little bit slower in sequential mode than the custom collector (in parallel the results are very similar), but it's still significantly faster than toMap solution and it's simpler and somewhat more flexible as collapse is an intermediate operation, so you can collect in another way.
Again, both of these solutions work only if foos with the same name are known to be adjacent. It's a bad idea to sort the input stream by foo name and then use these solutions, because the sorting will drastically reduce the performance, making it slower than the toMap solution.
As already pointed out by others, an intermediate Map is unavoidable, as that’s the way of finding the objects to merge. Further, you should not modify source data during reduction.
Nevertheless, you can achieve both without creating multiple Foo instances:
List<Foo> foos = Stream.of("foo", "bar", "baz")
.flatMap(n->IntStream.range(0,10).mapToObj(i -> new Foo(n, i)))
.collect(collectingAndThen(groupingBy(f -> f.name),
m->m.entrySet().stream().map(e->new Foo(e.getKey(),
e.getValue().stream().flatMap(f->f.ids.stream()).collect(toList())))
.collect(toList())));
This assumes that you add a constructor
public Foo(String n, List<Integer> l) {
name = n;
ids=l;
}
to your Foo class, as it should have if Foo is really supposed to be capable of holding a list of IDs. As a side note, having a type which serves as a single item as well as a container for merged results seems unnatural to me. This is exactly why the code turns out to be so complicated.
If the source items had a single id, using something like groupingBy(f -> f.name, mapping(f -> f.id, toList())), followed by mapping the entries of (String, List<Integer>) to the merged items, would be sufficient.
Since this is not the case and Java 8 lacks the flatMapping collector, the flatmapping step is moved to the second step, making it look much more complicated.
But in both cases, the second step is not obsolete as it is where the result items are actually created and converting the map to the desired list type comes for free.
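To illustrate the single-id case mentioned above, here is a minimal sketch. Item is a hypothetical source type with exactly one id per element, and GroupIds and idsByName are made-up names; the sketch stops at the intermediate map and does not convert the entries into merged result objects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupIds {
    /** Hypothetical source item that carries exactly one id. */
    static class Item {
        private final String name;
        private final int id;
        Item(String name, int id) { this.name = name; this.id = id; }
        String getName() { return name; }
        int getId() { return id; }
    }

    /** Collects the ids of all items sharing a name, in encounter order. */
    static Map<String, List<Integer>> idsByName(List<Item> items) {
        return items.stream()
            .collect(Collectors.groupingBy(Item::getName,
                Collectors.mapping(Item::getId, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
            new Item("foo", 0), new Item("foo", 1), new Item("bar", 0));
        Map<String, List<Integer>> ids = idsByName(items);
        System.out.println(ids.get("foo")); // prints [0, 1]
        System.out.println(ids.get("bar")); // prints [0]
    }
}
```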
