Efficient way to use fork join pool with multiple parallel streams


I have three streams, each of which makes HTTP requests. All of the calls are independent, so I use parallel streams and collect the results from the HTTP responses.
Currently I am using three separate parallel streams for these operations:
Map<String, ClassA> map1 = listOfClassX.stream().parallel()
.map(item -> {
ClassA instanceA = httpGetCall(item.id);
return instanceA;
})
.collect(Collectors.toConcurrentMap(item -> item.id, item -> item));
Map<String, ClassB> map2 = listOfClassY.stream().parallel()
.map(item -> {
ClassB instanceB = httpGetCall(item.id);
return instanceB;
})
.collect(Collectors.toConcurrentMap(item -> item.id, item -> item));
Map<String, ClassC> map3 = listOfClassZ.stream().parallel()
.map(item -> {
ClassC instanceC = httpGetCall(item.id);
return instanceC;
})
.collect(Collectors.toConcurrentMap(item -> item.id, item -> item));
This runs the three parallel streams one after another, even though every call is independent.
Would the common fork-join pool help make better use of the thread pool in this case?
Is there any other way to further improve the performance of this code?

Related

Reactive Java List to Map

I have a methodA which takes an argument and returns a result. I am writing a reactive method to invoke that function in bulk, but I am not able to get my head around the reactive syntax.
My code looks like this:
List<GetResult> successfulResults =
Collections.synchronizedList(new ArrayList<>());
Map<String, Throwable> erroredResults = new ConcurrentHashMap<>();
Flux.fromIterable(docsToFetch).flatMap(key -> reactiveCollection.getAndTouch(key, Duration.ofMinutes(extendExpiryInMin))
.onErrorResume(e -> {
erroredResults.put(key, e);
return Mono.empty();
})
).doOnNext(successfulResults::add).last().block();
The current implementation calls the method but collects the results in a list, which does not make sense for my use case. I want to collect the results in a hash map of key to result.

The solution is
List<String> docsToFetch = Arrays.asList("airline_112", "airline_1191", "airline_1203");
Map<String, GetResult> successfulResults = new ConcurrentHashMap<>();
Map<String, Throwable> erroredResults = new ConcurrentHashMap<>();
Flux.fromIterable(docsToFetch).flatMap(key -> reactiveCollection.get(key).onErrorResume(e -> {
erroredResults.put(key, e);
return Mono.empty();
}).doOnNext(getResult -> successfulResults.put(key, getResult))).last().block();

Java 8 Lambdas flatmapping, groupingBy and mapping to get a Map of T and List<K>

Here's what I have so far:
Map<Care, List<Correlative>> mapOf = quickSearchList
.stream()
.map(QuickSearch::getFacility)
.collect(Collectors.flatMapping(facility -> facility.getFacilityCares().stream(),
Collectors.groupingBy(FacilityCare::getCare,
Collectors.mapping(c -> {
final Facility facility = new Facility();
facility.setId(c.getFacilityId());
return Correlative.createFromFacility(facility);
}, Collectors.toList()))));
I have a list of Quick Searches to begin with. Each item in the quick search has a single facility as in:
public class QuickSearch {
Facility facility;
}
In every Facility, there's a List of FacilityCare as in:
public class Facility {
List<FacilityCare> facilityCares;
}
And finally, FacilityCare has Care property as in:
public class FacilityCare {
Care care;
}
Now, the idea is to convert a List of QuickSearch to a Map of <Care, List<Correlative>>.
The code within the mapping() function is bogus, in the example above. FacilityCare only has facilityID and not Facility entity. I want the facility object that went as param in flatMapping to be my param again in mapping() function as in:
Collectors.mapping(c -> Correlative.createFromFacility(facility))
where "facility" is the same object as the one in flatMapping.
Is there any way to achieve this? Please let me know if things need to be explained further.
Edit:
Here's a solution that doesn't fully utilize Collectors.
final Map<Care, List<Correlative>> mapToHydrate = new HashMap<>();
quickSearchList
.stream()
.map(QuickSearch::getFacility)
.forEach(facility -> {
facility.getFacilityCares()
.stream()
.map(FacilityCare::getCare)
.distinct()
.forEach(care -> {
mapToHydrate.computeIfAbsent(care, k -> new ArrayList<>());
mapToHydrate.computeIfPresent(care, (c, list) -> {
list.add(Correlative.createFromFacility(facility));
return list;
});
});
});

Sometimes, streams are not the best solution. This seems to be the case, because you are losing each facility instance when going down the pipeline.
Instead, you could do it as follows:
Map<Care, List<Correlative>> mapToHydrate = new LinkedHashMap<>();
quickSearchList.forEach(q -> {
Facility facility = q.getFacility();
facility.getFacilityCares().forEach(fCare ->
mapToHydrate.computeIfAbsent(fCare.getCare(), k -> new ArrayList<>())
.add(Correlative.createFromFacility(facility)));
});
This uses the return value of Map.computeIfAbsent (which is either the newly created list of correlatives or the already present one).
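As a minimal, self-contained sketch of the computeIfAbsent idiom: plain strings stand in for Care and Facility (the real classes are not shown in the question), and the names ComputeIfAbsentSketch, sampleInput and groupByCare are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ComputeIfAbsentSketch {

    // Hypothetical sample data: facility name -> cares it offers.
    static Map<String, List<String>> sampleInput() {
        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("clinicA", Arrays.asList("dental", "cardio"));
        input.put("clinicB", Arrays.asList("dental"));
        return input;
    }

    // Inverts the mapping: care -> facilities that offer it.
    static Map<String, List<String>> groupByCare(Map<String, List<String>> caresByFacility) {
        Map<String, List<String>> facilitiesByCare = new LinkedHashMap<>();
        caresByFacility.forEach((facility, cares) ->
            cares.forEach(care ->
                // computeIfAbsent returns the list mapped to the key,
                // creating it on first use, so we can add to it directly
                facilitiesByCare.computeIfAbsent(care, k -> new ArrayList<>()).add(facility)));
        return facilitiesByCare;
    }

    public static void main(String[] args) {
        System.out.println(groupByCare(sampleInput()));
        // prints {dental=[clinicA, clinicB], cardio=[clinicA]}
    }
}
```

Because the facility is still in scope inside the inner loop, there is no need to carry it through a pipeline.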
It is not clear from your question why you need distinct cares before adding them to the map.
EDIT: Starting from Java 16, you might want to use Stream.mapMulti:
Map<Care, List<Correlative>> mapToHydrate = quickSearchList.stream()
.map(QuickSearch::getFacility)
// the explicit type argument is needed; otherwise the element type is inferred as Object
.<Map.Entry<Care, Facility>>mapMulti((facility, consumer) -> facility.getFacilityCares()
.forEach(fCare -> consumer.accept(Map.entry(fCare.getCare(), facility))))
.collect(Collectors.groupingBy(
Map.Entry::getKey,
Collectors.mapping(
e -> Correlative.createFromFacility(e.getValue()),
Collectors.toList())));

This is what I came up with based on the information provided. The Facility and Care are stored in a temp array to be processed later in the desired map.
Map<Care, List<Correlative>> mapOf = quickSearchList.stream()
.map(QuickSearch::getFacility)
.flatMap(facility -> facility
.getFacilityCares().stream()
.map(facCare->new Object[]{facility, facCare.getCare()}))
.collect(Collectors.groupingBy(obj->(Care)obj[1], Collectors
.mapping(obj -> Correlative.createFromFacility(
(Facility)obj[0]),
Collectors.toList())));
I prepared some simple test data and this seems to work assuming I understand the ultimate goal. For each type of care offered, it puts all the facilities that offer that care in an associated list of facilities.
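A typed pair can replace the Object[] and its casts. The following is only a sketch under assumptions: strings stand in for Care and Facility, AbstractMap.SimpleEntry is used as the pair so it also works on Java 8, and the names PairInFlatMapSketch, sampleInput and facilitiesByCare are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PairInFlatMapSketch {

    // Hypothetical sample data: facility name -> cares it offers.
    static Map<String, List<String>> sampleInput() {
        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("clinicA", Arrays.asList("dental", "cardio"));
        input.put("clinicB", Arrays.asList("dental"));
        return input;
    }

    static Map<String, List<String>> facilitiesByCare(Map<String, List<String>> caresByFacility) {
        return caresByFacility.entrySet().stream()
            // pair each care with the facility it came from,
            // so the facility is still available after flatMap
            .flatMap(f -> f.getValue().stream()
                .map(care -> new SimpleEntry<String, String>(care, f.getKey())))
            .collect(Collectors.groupingBy(
                pair -> pair.getKey(),
                Collectors.mapping(pair -> pair.getValue(), Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(facilitiesByCare(sampleInput()));
    }
}
```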

Inspired by @fps's answer, I was able to come up with a solution that will work for the time being (pre-Java 16).
Map<Care, List<Correlative>> mapOf = quickSearchList
.stream()
.map(QuickSearch::getFacility)
.map(expandIterable())
.collect(
Collectors.flatMapping(map -> map.entrySet().stream(),
Collectors.groupingBy(Map.Entry::getKey,
Collectors.mapping(entry -> Correlative.createFromFacility(entry.getValue()),
Collectors.toList()
)
)
));
}
public Function<Facility, Map<Care, Facility>> expandIterable() {
return facility -> facility.getFacilityCares()
.stream()
.map(FacilityCare::getCare)
.distinct()
.collect(Collectors.toMap(c -> c, c -> facility));
}
Basically, I added a method call that returns a Function that takes in Facility as argument and returns a Map of Care as key with Facility as value. That map is used in the collection of the previous stream.

Wrapping and turning a single CompletableFuture<OlderCat> into a bulk operation with a result of CompletableFuture<Map<Cat.name, OlderCat>>

We have an async method:
public CompletableFuture<OlderCat> asyncGetOlderCat(String catName)
Given a list of Cats:
List<Cat> cats;
We would like to create a bulk operation that results in a map from each cat name to its async result:
public CompletableFuture<Map<String, OlderCat>>
We would also like a cat to be left out of the map if asyncGetOlderCat throws an exception for it.
We were following this post and also this one and we came up with this code:
List<Cat> cats = ...
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(Collectors.toMap(Cat::getName,
c -> asyncGetOlderCat(c.getName())
.exceptionally( ex -> /* null?? */ ))
));
CompletableFuture<Void> allFutures = CompletableFuture
.allOf(completableFutures.values().toArray(new CompletableFuture[completableFutures.size()]));
return allFutures.thenApply(future -> completableFutures.keySet().stream()
.map(CompletableFuture::join) ???
.collect(Collectors.toMap(????)));
But it is not clear how, in the allFutures callback, we can get access to the cat name and how to match each OlderCat to its catName.
Can it be achieved?

You are almost there. You don't need to put an exceptionally() on the initial futures, but you should use handle() instead of thenApply() after the allOf(), because if any future fails, the allOf() will fail as well.
When processing the futures, you can then just filter out the failing ones from the result, and rebuild the expected map:
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(toMap(Cat::getName, c -> asyncGetOlderCat(c.getName())));
CompletableFuture<Void> allFutures = CompletableFuture
.allOf(completableFutures.values().toArray(new CompletableFuture[0]));
return allFutures.handle((dummy, ex) ->
completableFutures.entrySet().stream()
.filter(entry -> !entry.getValue().isCompletedExceptionally())
.collect(toMap(Map.Entry::getKey, e -> e.getValue().join())));
Note that the calls to join() are guaranteed to be non-blocking, since the handle() callback will only be executed after all futures are completed.
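For a version that can be run on its own, here is a sketch of the same allOf() plus handle() pattern. The asyncLookup method is a hypothetical stand-in for asyncGetOlderCat that returns the name length and fails for the name "grumpy"; BulkFutureSketch and lookupAll are likewise made-up names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class BulkFutureSketch {

    // Hypothetical async call: completes exceptionally for "grumpy".
    static CompletableFuture<Integer> asyncLookup(String name) {
        return CompletableFuture.supplyAsync(() -> {
            if (name.equals("grumpy")) {
                throw new IllegalStateException("lookup failed for " + name);
            }
            return name.length();
        });
    }

    static CompletableFuture<Map<String, Integer>> lookupAll(List<String> names) {
        // assumes the names are distinct; toMap throws on duplicate keys
        Map<String, CompletableFuture<Integer>> futures = names.stream()
            .collect(Collectors.toMap(n -> n, BulkFutureSketch::asyncLookup));
        return CompletableFuture
            .allOf(futures.values().toArray(new CompletableFuture[0]))
            // handle runs whether allOf completed normally or exceptionally
            .handle((ignored, ex) -> futures.entrySet().stream()
                .filter(e -> !e.getValue().isCompletedExceptionally())
                // every future is complete at this point, so join() does not block
                .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().join())));
    }

    public static void main(String[] args) {
        Map<String, Integer> result = lookupAll(Arrays.asList("tom", "grumpy", "felix")).join();
        System.out.println(result.size() + " successful lookups: " + result);
    }
}
```

The failed lookup simply does not appear in the resulting map, which matches the requirement in the question.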

As I understand it, what you need is a single CompletableFuture holding all the results. The code below does that:
public CompletableFuture<Map<String, OlderCat>> getOlderCats(List<Cat> cats) {
return CompletableFuture.supplyAsync(
() -> {
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(Collectors.toMap(Cat::getName,
c -> asyncGetOlderCat(c.getName())
.exceptionally(ex -> {
ex.printStackTrace();
// on failure, complete with null so the entry can be filtered out below
// (alternatively, save the failed ones to a separate list and process them separately)
return null;
}))
);
return completableFutures
.entrySet()
.stream()
// Collectors.toMap rejects null values, so drop the failed lookups first
.filter(e -> e.getValue().join() != null)
.collect(Collectors.toMap(
Map.Entry::getKey,
e -> e.getValue().join()
));
}
);
}
It returns a future that creates the individual completable futures inside and waits for all of them at the end.

For loop including if to parallelStream() expression

Is there a way to parallelize this piece of code:
HashMap<String, Car> cars;
List<Car> snapshotCars = new ArrayList<>();
...
for (final Car car : cars.values()) {
if (car.isTimeInTimeline(curTime)) {
car.updateCalculatedPosition(curTime);
snapshotCars.add(car);
}
}
Update: This is what I tried before asking for assistance:
snapshotCars.addAll(cars.values().parallelStream()
.filter(c -> c.isTimeInTimeline(curTime))
.collect(Collectors.toList()));
How could I integrate this line? ->
car.updateCalculatedPosition(curTime);

Well, assuming that updateCalculatedPosition does not affect state outside of the Car object on which it runs, it may be safe enough to use peek for this:
List<Car> snapshotCars = cars.values()
.parallelStream()
.filter(c -> c.isTimeInTimeline(curTime))
.peek(c -> c.updateCalculatedPosition(curTime))
.collect(Collectors.toCollection(ArrayList::new));
I say this is "safe enough" because the collect dictates which elements will be peeked by peek, and these will necessarily be all the items that passed the filter. However, read this answer for the reason why peek should generally be avoided for "significant" operations.
Your peek-free alternative is to first filter and collect, and then update using the finished collection:
List<Car> snapshotCars = cars.values()
.parallelStream()
.filter(c -> c.isTimeInTimeline(curTime))
.collect(Collectors.toCollection(ArrayList::new));
snapshotCars.parallelStream()
.forEach(c -> c.updateCalculatedPosition(curTime));
This is safer from an API point of view, but less parallel - you only start updating the positions after you have finished filtering and collecting.

If you want parallelized access to a List, you might want to use Collections.synchronizedList to get a thread-safe list:
List<Car> snapshotCars = Collections.synchronizedList(new ArrayList<>());
Then you can use the stream API like so:
cars.values()
.parallelStream()
.filter(car -> car.isTimeInTimeline(curTime))
.forEach(car -> {
car.updateCalculatedPosition(curTime);
snapshotCars.add(car);
});

In addition to RealSkeptic’s answer, you can alternatively use your own collector:
List<Car> snapshotCars = cars.values().parallelStream()
.filter(c -> c.isTimeInTimeline(curTime))
.collect(ArrayList::new,
(l,c) -> { c.updateCalculatedPosition(curTime); l.add(c); },
List::addAll);
Note that .collect(Collectors.toList()) is equivalent (though not necessarily identical) to .collect(Collectors.toCollection(ArrayList::new)) which is equivalent to .collect(ArrayList::new, List::add, List::addAll).
So our custom collector does a similar operation, but replaces the accumulator with a function, which also performs the desired additional operation.
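To try the three-argument collect on its own, here is a small sketch. The Car class is a hypothetical minimal version (the real one is not shown in the question), with integer times and a position that is simply the offset from the start time; CollectWithUpdateSketch, sampleCars and snapshot are made-up names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CollectWithUpdateSketch {

    // Hypothetical minimal Car with only the two methods used above.
    static class Car {
        final int start;
        final int end;
        int position = -1;

        Car(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean isTimeInTimeline(int time) {
            return start <= time && time <= end;
        }

        void updateCalculatedPosition(int time) {
            position = time - start;
        }
    }

    static List<Car> sampleCars() {
        return Arrays.asList(new Car(0, 10), new Car(5, 6), new Car(2, 20));
    }

    static List<Car> snapshot(List<Car> cars, int curTime) {
        return cars.parallelStream()
            .filter(c -> c.isTimeInTimeline(curTime))
            .collect(ArrayList::new,                 // supplier: one list per worker
                     (list, c) -> {                  // accumulator: update, then add
                         c.updateCalculatedPosition(curTime);
                         list.add(c);
                     },
                     List::addAll);                  // combiner: merge partial lists
    }

    public static void main(String[] args) {
        for (Car c : snapshot(sampleCars(), 3)) {
            System.out.println(c.start + ".." + c.end + " -> position " + c.position);
        }
    }
}
```

Each car is updated by the same thread that adds it to its partial list, and no shared list is mutated concurrently, so no synchronized wrapper is needed.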

Concurrent processing of project reactor's flux

I’m very new to project reactor or reactive programming at large so I'm probably doing something wrong. I’m struggling to build a flow that does the following:
Given a class Entity:
Entity {
private Map<String, String> items;
public Map<String, String> getItems() {
return items;
}
}
read Entity from DB (ListenableFuture<Entity> readEntity())
perform some parallel async processing on every item (boolean processItem(Map.Entry<String, String> item))
when all finished call doneProcessing (void doneProcessing(boolean b))
Currently my code is:
handler = this;
Mono
.fromFuture(readEntity())
.doOnError(t -> {
notifyError("some err-msg", t);
return;
})
.doOnSuccess(e -> log.info("Got the Entity: " + e))
.flatMap( e -> Flux.fromIterable(e.getItems().entrySet()))
.all(handler::processItem)
.consume(handler::doneProcessing);
The thing works, but the handler::processItem calls don’t run concurrently on all items. I tried using dispatchOn and publishOn with both io and async SchedulerGroup and with various parameters, but still the calls run serially on one thread.
What am I doing wrong?
Apart from that, I’m sure the above can be improved in general, so any suggestion will be appreciated.
Thanks

You need another flatMap that forks and joins computation for each individual map element:
Mono.fromFuture(readEntity())
.flatMap(v -> Flux.fromIterable(v.getItems().entrySet()))
.flatMap(v -> Flux.just(v)
.publishOn(SchedulerGroup.io())
.doOnNext(handler::processItem))
.consume(handler::doneProcessing);
