Execute multiple queries in parallel via Streams

I have the following method:
public String getResult() {
List<String> serversList = getServerListFromDB();
List<String> appList = getAppListFromDB();
List<String> userList = getUserFromDB();
return getResult(serversList, appList, userList);
}
Here I call three methods sequentially; each one hits the DB and fetches results, and I then post-process those results. I know how to call these three methods concurrently using threads, but I would like to use Java 8 parallel streams instead. Can someone please guide me on how to achieve this with parallel streams?
EDIT: I just want to call the methods in parallel via a Stream.
private void getInformation() {
method1();
method2();
method3();
method4();
method5();
}

You may utilize CompletableFuture this way:
public String getResult() {
// Create Stream of tasks:
Stream<Supplier<List<String>>> tasks = Stream.of(
() -> getServerListFromDB(),
() -> getAppListFromDB(),
() -> getUserFromDB());
List<List<String>> lists = tasks
// Supply all the tasks for execution and collect CompletableFutures
.map(CompletableFuture::supplyAsync).collect(Collectors.toList())
// Join all the CompletableFutures to gather the results
.stream()
.map(CompletableFuture::join).collect(Collectors.toList());
// Use the results. They are guaranteed to be ordered in the same way as the tasks
return getResult(lists.get(0), lists.get(1), lists.get(2));
}

As already mentioned, a standard parallel stream is probably not the best fit for your use case. I would complete each task asynchronously using an ExecutorService and "join" them when calling the getResult method:
ExecutorService es = Executors.newFixedThreadPool(3);
Future<List<String>> serversList = es.submit(() -> getServerListFromDB());
Future<List<String>> appList = es.submit(() -> getAppListFromDB());
Future<List<String>> userList = es.submit(() -> getUserFromDB());
return getResult(serversList.get(), appList.get(), userList.get());
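A fuller sketch of the same idea, with the pool shut down afterwards and the checked exceptions from Future.get() handled. The three loader methods below are stand-ins that return fixed lists, and combine is a hypothetical replacement for the asker's getResult(serversList, appList, userList); none of them come from the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorJoinSketch {

    // Hypothetical stand-ins for the three DB calls in the question.
    static List<String> getServerListFromDB() { return Arrays.asList("srv1", "srv2"); }
    static List<String> getAppListFromDB() { return Arrays.asList("app1"); }
    static List<String> getUserFromDB() { return Arrays.asList("alice", "bob"); }

    // Hypothetical post-processing step: just reports the three list sizes.
    static String combine(List<String> servers, List<String> apps, List<String> users) {
        return servers.size() + "/" + apps.size() + "/" + users.size();
    }

    static String getResult() {
        ExecutorService es = Executors.newFixedThreadPool(3);
        try {
            // All three tasks are submitted before any get() is called, so they overlap.
            Future<List<String>> servers = es.submit(() -> getServerListFromDB());
            Future<List<String>> apps = es.submit(() -> getAppListFromDB());
            Future<List<String>> users = es.submit(() -> getUserFromDB());
            // Each get() blocks until the corresponding task has finished.
            return combine(servers.get(), apps.get(), users.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            es.shutdown(); // let the worker threads exit once the tasks are done
        }
    }

    public static void main(String[] args) {
        System.out.println(getResult()); // prints 2/1/2
    }
}
```

In a real application the pool would more likely be created once and reused rather than created per call.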

forEach is what is used for side effects, and you can call forEach on a parallel stream. For example:
listOfTasks.parallelStream().forEach(list -> {
submitToDb(list);
});
However, parallelStream uses the common ForkJoinPool which is arguably not good for IO-bound tasks.
Consider using a CompletableFuture and supplying an appropriate ExecutorService. It gives more flexibility (continuation, configuration). For example:
ExecutorService executorService = Executors.newCachedThreadPool();
List<CompletableFuture<String>> allFutures = new ArrayList<>();
for (Query query : queries) {
CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
// submit query to db
return result;
}, executorService);
allFutures.add(future);
}
CompletableFuture<Void> all = CompletableFuture.allOf(allFutures.toArray(new CompletableFuture[0]));

It is not quite clear what you mean, but if you just want to run some processing on these lists in parallel, you can do something like this:
List<String> list1 = Arrays.asList("1", "234", "33");
List<String> list2 = Arrays.asList("a", "b", "cddd");
List<String> list3 = Arrays.asList("1331", "22", "33");
List<List<String>> listOfList = Arrays.asList(list1, list2, list3);
listOfList.parallelStream().forEach(list -> System.out.println(list.stream().max((o1, o2) -> Integer.compare(o1.length(), o2.length()))));
(This prints the longest element of each list, wrapped in an Optional.)

Related

Refactoring blocking to async code using CompletableFuture

Situation
I refactored some code to use CompletableFuture for better performance.
The code is shown below (each result is independent).
Code before refactoring
public Map<String, Object> retrieve() {
Object result1 = testProxy.findSomething(param1); // blocking
Object result2 = testProxy.findSomething(param2); // blocking
Object result3 = testProxy.findSomething(param3); // blocking
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", result1);
toClient.put("result2", result2);
toClient.put("result3", result3);
return toClient;
}
Code after refactoring
public Map<String, Object> retrieve() {
CompletableFuture<Object> future1 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param1));
CompletableFuture<Object> future2 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param2));
CompletableFuture<Object> future3 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param3));
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", future1.get());
toClient.put("result2", future2.get());
toClient.put("result3", future3.get());
return toClient;
}
After refactoring, performance improved. However, I found code that uses a while loop to check that the task is done before getting the result.
ExecutorService executorService = Executors.newSingleThreadExecutor();
CompletableFuture<String> future = new CompletableFuture<>(); // creating an incomplete future
executorService.submit(() -> {
Thread.sleep(500);
future.complete("value"); // completing the incomplete future
return null;
});
while (!future.isDone()) { // checking the future for completion
Thread.sleep(1000);
}
String result = future.get(); // reading value of the completed future
logger.info("result: {}", result);
executorService.shutdown();
Questions
So, my questions are:
Did I refactor the code the right way using CompletableFuture?
As far as I know, get() blocks until the task returns a result, so why is the while loop needed?
If I need to check whether all tasks are done, should I write code like this?
CompletableFuture<Void> allFutures = CompletableFuture.allOf(future1, future2, future3);
while(!allFutures.isDone()){}
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", future1.get());
toClient.put("result2", future2.get());
toClient.put("result3", future3.get());

Is CompletableFuture used correctly?
Did I refactor the code the right way using CompletableFuture?
Your refactoring is okay. At least the 3 tasks are now executed in parallel in the background and no longer sequentially.
However, instead of blocking on all 3 by calling get(), I would suggest returning the futures from the method. This lets the caller decide how to handle the situation: whether to continue an async chain or to work with the results in a blocking fashion by calling get(). So if applicable, change the method to:
// changed return type
public Map<String, CompletableFuture<Object>> retrieve() {
CompletableFuture<Object> future1 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param1));
CompletableFuture<Object> future2 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param2));
CompletableFuture<Object> future3 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param3));
Map<String, CompletableFuture<Object>> toClient = new HashMap<>();
toClient.put("result1", future1); // no get() anymore
toClient.put("result2", future2); // no get() anymore
toClient.put("result3", future3); // no get() anymore
return toClient;
}
A minor note: you can write return Map.of("result1", future1, "result2", future2, "result3", future3) to simplify the end a bit.
On that note, I get that this is a heavily edited and simplified example but please check if the Map is actually meaningful in your case or whether you could just use a List instead:
return List.of(future1, future2, future3);
That said, in this particular case you could also utilize loops or streams to simplify the method further:
public List<CompletableFuture<Object>> retrieve() {
return Stream.of(param1, param2, param3)
.map(param -> CompletableFuture.supplyAsync(
() -> testProxy.findSomething(param)))
.toList();
}
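With the futures returned to the caller, the two styles mentioned above might look as follows. This is a sketch only: findSomething is a hypothetical stand-in for testProxy.findSomething, the parameters "a", "b", "c" are made up, and Collectors.toList() is used instead of Stream.toList() so that it also compiles on Java 8.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CallerChoiceSketch {

    // Hypothetical stand-in for testProxy.findSomething(param).
    static Object findSomething(String param) {
        return "found:" + param;
    }

    // Returns the futures without blocking, as suggested in the answer.
    static List<CompletableFuture<Object>> retrieve() {
        return Stream.of("a", "b", "c")
                .map(param -> CompletableFuture.supplyAsync(() -> findSomething(param)))
                .collect(Collectors.toList());
    }

    // Caller style 1: block until every result is available.
    static List<Object> blocking() {
        return retrieve().stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    // Caller style 2: stay asynchronous and chain a continuation onto the futures.
    static CompletableFuture<Integer> asyncTotalLength() {
        CompletableFuture<Integer> total = CompletableFuture.completedFuture(0);
        for (CompletableFuture<Object> future : retrieve()) {
            // thenCombine runs once both 'total' and 'future' are complete.
            total = total.thenCombine(future, (sum, value) -> sum + value.toString().length());
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(blocking());                // prints [found:a, found:b, found:c]
        System.out.println(asyncTotalLength().join()); // prints 21
    }
}
```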
I do have a concern about the missing type safety in your method, though. Object is not really helpful to a user, but maybe this is not the case in your real code.
Does get() wait until its done?
As far as I know, get() blocks until the task returns a result, so why is the while loop needed?
You are correct. get() blocks until the task is done (or is cancelled, interrupted or ends abnormally due to an exception). A loop to wait on the result is not needed.
The example code you found is not the best. The loop is bad, it should be removed.
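For illustration, here is the found example with the polling loop removed; get() alone does the waiting. This is a minimal sketch: the 200 ms sleep is an arbitrary stand-in for real work, and the checked exceptions are wrapped only to keep the method signature simple.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingGetSketch {

    static String awaitValue() {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        CompletableFuture<String> future = new CompletableFuture<>(); // incomplete future
        try {
            executorService.submit(() -> {
                Thread.sleep(200);        // simulate slow work
                future.complete("value"); // complete the future from the worker thread
                return null;
            });
            // No isDone() loop needed: get() blocks until the future is completed.
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("result: " + awaitValue()); // prints result: value
    }
}
```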
Is the callsite okay?
If I need to check whether all tasks are done, should I write code like this?
The busy-waiting loop while(!allFutures.isDone()){} is not okay: it keeps a CPU core spinning at full load until the futures complete. If you want to wait until all futures are done, just call allFutures.join() or allFutures.get(). That will be much better.
Yet again, if possible, give the user the possibility to decide and return the futures to him.
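If blocking at this point is what you want, the call site from the question could be sketched like this, with allOf(...).join() replacing the spin loop. findSomething and its arguments are hypothetical stand-ins for testProxy.findSomething(param1) and friends.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

class AllOfJoinSketch {

    // Hypothetical stand-in for testProxy.findSomething(param).
    static Object findSomething(String param) {
        return "found:" + param;
    }

    static Map<String, Object> retrieve() {
        CompletableFuture<Object> future1 = CompletableFuture.supplyAsync(() -> findSomething("a"));
        CompletableFuture<Object> future2 = CompletableFuture.supplyAsync(() -> findSomething("b"));
        CompletableFuture<Object> future3 = CompletableFuture.supplyAsync(() -> findSomething("c"));

        // Blocks (without spinning) until all three futures are complete.
        CompletableFuture.allOf(future1, future2, future3).join();

        Map<String, Object> toClient = new HashMap<>();
        toClient.put("result1", future1.join()); // already complete, returns immediately
        toClient.put("result2", future2.join());
        toClient.put("result3", future3.join());
        return toClient;
    }

    public static void main(String[] args) {
        System.out.println(retrieve().get("result1")); // prints found:a
    }
}
```

Strictly speaking the allOf call is optional here, since the three join() calls would wait for their own futures anyway; it mainly makes the "wait for everything" step explicit.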

Efficient way to use fork join pool with multiple parallel streams

I am using three streams, each of which needs to make HTTP requests. All of the calls are independent, so I use parallel streams and collect the results from the HTTP responses.
Currently I am using three separate parallel streams for these operations.
Map<String, ClassA> map1 = listOfClassX.stream().parallel()
.map(item -> {
ClassA instanceA = httpGetCall(item.id);
return instanceA;
})
.collect(Collectors.toConcurrentMap(item -> item.id, item -> item));
Map<String, ClassB> map2 = listOfClassY.stream().parallel()
.map(item -> {
ClassB instanceB = httpGetCall(item.id);
return instanceB;
})
.collect(Collectors.toConcurrentMap(item -> item.id, item -> item));
Map<String, ClassC> map3 = listOfClassZ.stream().parallel()
.map(item -> {
ClassC instanceC = httpGetCall(item.id);
return instanceC;
})
.collect(Collectors.toConcurrentMap(item -> item.id, item -> item));
It runs the three parallel streams one after another, even though each call is independent.
Will the common fork-join pool help to optimize the use of the thread pool in this case?
Is there any other way to optimize the performance of this code further?
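One possible direction, sketched under assumptions rather than offered as a definitive answer: queue every request from all three groups on a dedicated executor first, and only then wait, so that the groups overlap instead of running one after another. Here httpGetCall is a stand-in returning a fixed string, the ids are made up, and the pool size of 8 is arbitrary.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ConcurrentFetchSketch {

    // Hypothetical stand-in for the blocking HTTP call in the question.
    static String httpGetCall(String id) {
        return "response-" + id;
    }

    // Queues one request per id on the given pool and returns immediately.
    static Map<String, CompletableFuture<String>> startAll(List<String> ids, ExecutorService pool) {
        Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        for (String id : ids) {
            pending.put(id, CompletableFuture.supplyAsync(() -> httpGetCall(id), pool));
        }
        return pending;
    }

    // Waits for every queued request and unwraps the results.
    static Map<String, String> awaitAll(Map<String, CompletableFuture<String>> pending) {
        Map<String, String> done = new LinkedHashMap<>();
        pending.forEach((id, future) -> done.put(id, future.join()));
        return done;
    }

    static List<Map<String, String>> demo() {
        ExecutorService pool = Executors.newFixedThreadPool(8); // arbitrary size for I/O-bound work
        try {
            // All three groups are queued before anything is awaited, so they overlap.
            Map<String, CompletableFuture<String>> a = startAll(Arrays.asList("x1", "x2"), pool);
            Map<String, CompletableFuture<String>> b = startAll(Arrays.asList("y1"), pool);
            Map<String, CompletableFuture<String>> c = startAll(Arrays.asList("z1"), pool);
            return Arrays.asList(awaitAll(a), awaitAll(b), awaitAll(c));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Using a separate executor also keeps the blocking HTTP calls off the common ForkJoinPool, which parallel streams share with the rest of the JVM.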

How can I immediately pipe tasks from one ThreadPool to another?

I've got a list of input elements which I want to queue into several ThreadPools. Let's say this is my input:
final List<Integer> ints = Stream.iterate(1, i -> i + 1).limit(100).collect(Collectors.toList());
These are the three functions I want the elements to run through one after another:
final Function<Integer, Integer> step1 =
value -> { // input from the ints list
return value * 2;
};
final Function<Integer, Double> step2 =
value -> { // input from the previous step1
return (double) (value * 2); //
};
final Function<Double, String> step3 =
value -> { // input from the previous step2
return "Result: " + value * 2;
};
And these would be the pools for each step:
final ExecutorService step1Pool = Executors.newFixedThreadPool(4);
final ExecutorService step2Pool = Executors.newFixedThreadPool(3);
final ExecutorService step3Pool = Executors.newFixedThreadPool(1);
I want each element to run through step1Pool and have step1 applied. As soon as one element is done, its result should
end up in step2Pool so that step2 can be applied there. As soon as something in step2Pool is done, it should be
queued in step3Pool and step3 should be applied.
On my main thread I want to wait until I have all the results from step3. The order in which each element is processed
doesn't matter. Only that they all run through step1 -> step2 -> step3 on the correct thread pool.
Basically I want to parallelize the Stream.map, push each result immediately to the next queue and wait until I've
got ints.size() results from my last thread pool back.
Is there a simple way to achieve this in Java?

I believe that CompletableFuture will help you here!
List<CompletableFuture<String>> futures = ints.stream()
.map(i -> CompletableFuture.supplyAsync(() -> step1.apply(i), step1Pool)
.thenApplyAsync(step2, step2Pool)
.thenApplyAsync(step3, step3Pool))
.collect(Collectors.toList());
List<String> result = futures.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());
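For reference, a self-contained version of this pipeline that also shuts the pools down at the end. It reuses the step functions and pool sizes from the question; the class name and the run method are only scaffolding for the sketch.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineSketch {

    static List<String> run(int n) {
        Function<Integer, Integer> step1 = value -> value * 2;
        Function<Integer, Double> step2 = value -> (double) (value * 2);
        Function<Double, String> step3 = value -> "Result: " + value * 2;

        ExecutorService step1Pool = Executors.newFixedThreadPool(4);
        ExecutorService step2Pool = Executors.newFixedThreadPool(3);
        ExecutorService step3Pool = Executors.newFixedThreadPool(1);
        try {
            List<Integer> ints = Stream.iterate(1, i -> i + 1).limit(n).collect(Collectors.toList());

            // Each element moves on to the next pool as soon as its own previous step is done.
            List<CompletableFuture<String>> futures = ints.stream()
                    .map(i -> CompletableFuture.supplyAsync(() -> step1.apply(i), step1Pool)
                            .thenApplyAsync(step2, step2Pool)
                            .thenApplyAsync(step3, step3Pool))
                    .collect(Collectors.toList());

            // The main thread waits here until every element has passed step3.
            return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            step1Pool.shutdown();
            step2Pool.shutdown();
            step3Pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(3)); // prints [Result: 8.0, Result: 16.0, Result: 24.0]
    }
}
```

The elements may finish in any order, but the result list follows the order of the futures list, because join() is called on the futures in that order.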

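Better to use streams for that (note that all three steps then run on the common ForkJoinPool rather than on the three dedicated pools):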
List<String> stringList = Stream.iterate(1, i -> i + 1)
.limit(100)
.parallel()
.map(step1)
.map(step2)
.map(step3)
.collect(Collectors.toList());

Java Multi-Thread Executor InvokeAll Problems

The code I'm having problems with is:
Executor executor = (Executor) callList;
List<ProgState> newProgList = executor.invokeAll(callList).stream()
.map(future -> {try {return future.get();} catch(Exception e){e.printStackTrace();}})
.filter(p -> p!=null).collect(Collectors.toList());
The method invokeAll(List<Callable<ProgState>>) is undefined for the type Executor
I am told I should use an executor like the one in the code snippet.
The Callables are defined within the following code:
List<Callable<ProgState>> callList = (List<Callable<ProgState>>) lst.stream()
.map(p -> ((Callable<ProgState>)(() -> {return p.oneStep();})))
.collect(Collectors.toList());
Here is the teacher's code:
//prepare the list of callables
List<Callable<PrgState>> callList = prgList.stream().map(p -> (() -> {return p.oneStep();})).collect(Collectors.toList());
//start the execution of the callables
//it returns the list of new created threads
List<PrgState> newPrgList = executor.invokeAll(callList).stream()
.map(future -> { try {
return future.get();
}
catch(Exception e) {
//here you can treat the possible
// exceptions thrown by statements
// execution
}
})
.filter(p -> p!=null).collect(Collectors.toList());
//add the new created threads to the list of existing threads
prgList.addAll(newPrgList);

If you can use stream(), why not parallelStream()? It would be much simpler.
List<PrgState> prgStates = prgList.parallelStream()
.map(p -> p.oneStep())
.collect(Collectors.toList());
This way you have no thread pool to configure, start or stop when finished.
Some might suggest that parallelStream() was the main reason for adding Stream and lambdas to Java 8 in the first place. ;)

You can't cast a list of Callables to an Executor. You need to create an ExecutorService, which will in turn pick up the callables and execute them on one or more threads in parallel. (invokeAll is declared on ExecutorService, not on Executor.)
This is what I think you are after:
ExecutorService executor = Executors.newCachedThreadPool();//change executor type as per your need.
List<ProgState> newProgList = executor.invokeAll(callList).stream().map(future -> {...
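A complete, runnable sketch of the invokeAll pattern follows. It uses Integer results instead of ProgState so that it is self-contained, and a fixed pool of two threads chosen arbitrarily. Note that the lambda passed to map must return a value on every path, including the catch branch, otherwise it does not compile.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

class InvokeAllSketch {

    static List<Integer> runAll(List<Callable<Integer>> callList) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            // invokeAll blocks until all callables have finished (or thrown).
            List<Future<Integer>> futures = executor.invokeAll(callList);
            return futures.stream()
                    .map(future -> {
                        try {
                            return future.get();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            return null;
                        } catch (ExecutionException e) {
                            return null; // the callable threw; dropped by the filter below
                        }
                    })
                    .filter(Objects::nonNull)
                    .collect(Collectors.toList());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new ArrayList<>();
        } finally {
            executor.shutdown();
        }
    }

    // Small demonstration: the second callable fails and is filtered out.
    static List<Integer> demo() {
        List<Callable<Integer>> calls = new ArrayList<>();
        calls.add(() -> 1);
        calls.add(() -> { throw new IllegalStateException("boom"); });
        calls.add(() -> 3);
        return runAll(calls);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [1, 3]
    }
}
```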

How to add returned values from a Supplier to a FutureList

I want to run the three methods posted below using CompletableFuture's asynchronous supplier so that, when the Executor finishes, the future list contains the three values returned from the three methods respectively.
I know how to assign to the future list, for example:
futureList = CompletableFuture.supplyAsync()
but in my case, I want something like:
futureList.add(CompletableFuture.supplyAsync())
Please let me know how I can do that.
methods:
this.compStabilityMeasure(this.frameIjList, this.frameIkList, SysConsts.STABILITY_MEASURE_TOKEN);
this.setTrackingRepValue(this.compTrackingRep(this.frameIjList, this.frameIkList, SysConsts.TRACKING_REPEATABILITY_TOKEN));
this.setViewPntRepValue(this.compViewPntRep(this.frameIjList, this.frameIkList, SysConsts.VIEWPOINT_REPEATABILITY_TOKEN));
compStabilityMeasure method implementation:
private void compStabilityMeasure(ArrayList<Mat> frameIjList, ArrayList<Mat>
frameIkList, String token) throws InterruptedException, ExecutionException {
// TODO Auto-generated method stub
synchronized (frameIjList) {
synchronized (frameIjList) {
this.setRepValue(this.compRep(frameIjList, frameIkList, token));
this.setSymRepValue(this.compSymRep(this.getRepValue(), frameIkList, frameIjList, token));
}
}
}

You want to look at using thenCombineAsync, e.g.:
CompletableFuture<String> firstFuture = firstMethod();
CompletableFuture<String> secondFuture = secondMethod();
CompletableFuture<String> thirdFuture = thirdMethod();
CompletableFuture<List<String>> allCompleted = firstFuture
.thenCombineAsync(secondFuture, (first, second) -> new ArrayList<String>(Arrays.asList(first, second)))
.thenCombineAsync(thirdFuture, (list, third) -> {
list.add(third);
return list;
});

You can use allOf, and then create a CompletableFuture that gets completed with a Stream containing the results of your individual CompletableFutures:
CompletableFuture<String> cf1 = CompletableFuture.supplyAsync(() -> "hi1");
CompletableFuture<String> cf2 = CompletableFuture.supplyAsync(() -> "hi2");
List<CompletableFuture<String>> cfsList = Arrays.asList(cf1, cf2);
CompletableFuture<Void> allCfs = CompletableFuture.allOf(cfsList.toArray(new CompletableFuture[0]));
CompletableFuture<Stream<String>> cfWithFinishedStream = allCfs.thenApply((allCf) ->
cfsList.stream().map(cf -> cf.getNow("")));
Example to get the values from the stream when the CF completes:
cfWithFinishedStream.thenAccept(stream ->
stream.forEach(string -> System.out.println(string)));
If you don't like streams, you can convert them to a List using collect:
CompletableFuture<List<String>> cfWithFinishedList = allCfs
.thenApply((allCf) ->
cfsList.stream().map(cf ->
cf.getNow("")).collect(Collectors.toList()));
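The futureList.add(...) style asked about in the question can be combined with allOf in the same way. In this sketch the three suppliers simply return fixed strings; they are placeholders for the asker's three computations, not the real methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class FutureListSketch {

    static List<String> collectResults() {
        List<CompletableFuture<String>> futureList = new ArrayList<>();
        // Placeholder suppliers standing in for the three real computations.
        futureList.add(CompletableFuture.supplyAsync(() -> "stability"));
        futureList.add(CompletableFuture.supplyAsync(() -> "tracking"));
        futureList.add(CompletableFuture.supplyAsync(() -> "viewpoint"));

        // Wait for all futures, then read the values in insertion order.
        CompletableFuture.allOf(futureList.toArray(new CompletableFuture[0])).join();
        return futureList.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collectResults()); // prints [stability, tracking, viewpoint]
    }
}
```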
