The code I'm having problems with is:
Executor executor = (Executor) callList;
List<ProgState> newProgList = executor.invokeAll(callList).stream()
.map(future -> {try {return future.get();} catch(Exception e){e.printStackTrace();}})
.filter(p -> p!=null).collect(Collectors.toList());
The method invokeAll(List<Callable<ProgState>>) is undefined for the type Executor
I am told I should use an executor like the one in the code snippet.
The Callables are defined within the following code:
List<Callable<ProgState>> callList = (List<Callable<ProgState>>) lst.stream()
.map(p -> ((Callable<ProgState>)(() -> {return p.oneStep();})))
.collect(Collectors.toList());
Here is the teacher's code:
//prepare the list of callables
List<Callable<PrgState>> callList = prgList.stream().map(p -> (() -> {return p.oneStep();})).collect(Collectors.toList());
//start the execution of the callables
//it returns the list of new created threads
List<PrgState> newPrgList = executor.invokeAll(callList).stream()
        .map(future -> {
            try {
                return future.get();
            } catch (Exception e) {
                // here you can treat the possible exceptions
                // thrown by the statements' execution
                return null; // dropped by the filter below
            }
        })
        .filter(p -> p != null).collect(Collectors.toList());
//add the new created threads to the list of existing threads
prgList.addAll(newPrgList);
If you can use stream(), why not parallelStream(), as it would be much simpler?
List<PrgState> prgStates = prgList.parallelStream()
.map(p -> p.oneStep())
.collect(Collectors.toList());
This way you have no thread pool to configure, start or stop when finished.
Some might suggest that parallelStream() was the main reason for adding Stream and lambdas to Java 8 in the first place. ;)
You can't cast a list of Callables to an Executor. You need to define an ExecutorService, which will in turn pick up the callables and execute them in one or more threads in parallel.
This is what I think you are after:
ExecutorService executor = Executors.newCachedThreadPool();//change executor type as per your need.
List<ProgState> newProgList = executor.invokeAll(callList).stream().map(future -> {...
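For completeness, here is a sketch of how the whole pipeline could look, reusing the callList and ProgState types from your question (newCachedThreadPool is just one choice of pool):
ExecutorService executor = Executors.newCachedThreadPool();
// note: invokeAll(...) throws InterruptedException, so the enclosing method must catch or declare it
List<ProgState> newProgList = executor.invokeAll(callList).stream()
        .map(future -> {
            try {
                return future.get(); // blocks until this callable has finished
            } catch (Exception e) {
                e.printStackTrace(); // handle exceptions thrown by oneStep()
                return null;         // dropped by the filter below
            }
        })
        .filter(p -> p != null)
        .collect(Collectors.toList());
executor.shutdown(); // stop the pool when you are done with it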
Java version: 11
I have a List which contains many sublists, and for each sublist I want to perform certain transformations/operations.
I want to perform this operation in non-blocking asynchronous fashion, so I am using CompletableFuture.
This is my operation:
public static List<String> convertBusinessObjectJson(List<BusinessObject> businessObjList) {
List<Either> eitherValueOrException = ...; // omitted: logic to convert to JSON
return eitherValueOrException;
}
It returns a List of Either objects, where each Either holds either the runtime exception thrown by the conversion logic or the String result when the conversion succeeds.
This is my caller code:
mainList.forEach(sublist -> {
CompletableFuture<List<Either>> listCompletableFuture = CompletableFuture.supplyAsync(() -> FutureImpl.convertBusinessObjectJson(sublist));
});
Once the CompletableFuture<List<Either>> listCompletableFuture is received, I want to chain operations onto it.
As in:
take CompletableFuture<List<Either>> listCompletableFuture, take only the exceptions from the list, and perform a certain operation
take CompletableFuture<List<Either>> listCompletableFuture, take only the results from the list, and perform a certain operation
Something like this (pseudo code):
mainList.forEach(sublist -> {
CompletableFuture<List<Either>> listCompletableFuture = CompletableFuture.supplyAsync(() -> FutureImpl.convertBusinessObjectJson(sublist));
listCompletableFuture.thenApply(//function which pushes exception to say kafka)
listCompletableFuture.thenApply(//function which pushes result to say database)
});
Can it be done?
Any help is much appreciated :)
You could try something like this:
var futureList = mainList.stream()
.map(sublist -> CompletableFuture.supplyAsync(() -> FutureImpl.convertBusinessObjectJson(sublist)))
.collect(Collectors.toList());
The above collects a list of CompletableFutures. Now we need to wait for all of those futures to complete. We do this by:
var joinedFutureList = futureList.stream()
        .map(objectCompletableFuture -> {
            try {
                return objectCompletableFuture.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }).collect(Collectors.toList());
After that, the separation would look something like this:
var exceptionList = joinedFutureList.stream()
.filter(obj -> obj instanceof Exception)
.peek(System.out::println)
.collect(Collectors.toList());
var successList = joinedFutureList.stream()
.filter(obj -> obj instanceof String)
.peek(System.out::println)
.collect(Collectors.toList());
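If you would rather stay non-blocking and chain the work onto each future, as in your pseudo code, a rough sketch could look like the one below. It assumes a hypothetical Either API (isException(), getException(), getResult()) and hypothetical pushToKafka/pushToDatabase helpers, since those details are not shown in the question:
mainList.forEach(sublist -> {
    CompletableFuture<List<Either>> listCompletableFuture =
            CompletableFuture.supplyAsync(() -> FutureImpl.convertBusinessObjectJson(sublist));
    // push the failures to e.g. Kafka (assumed Either accessors and helper)
    listCompletableFuture.thenAccept(eithers -> eithers.stream()
            .filter(Either::isException)
            .forEach(e -> pushToKafka(e.getException())));
    // push the successful results to e.g. a database (assumed helper)
    listCompletableFuture.thenAccept(eithers -> eithers.stream()
            .filter(e -> !e.isException())
            .forEach(e -> pushToDatabase(e.getResult())));
});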
Situation
I did code refactoring using CompletableFuture for better performance.
My code looks like the one below (each result is independent).
Code before refactoring
public Map<String, Object> retrieve() {
Object result1 = testProxy.findSomething(param1); // blocking
Object result2 = testProxy.findSomething(param2); // blocking
Object result3 = testProxy.findSomething(param3); // blocking
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", result1);
toClient.put("result2", result2);
toClient.put("result3", result3);
return toClient;
}
Code after refactoring
public Map<String, Object> retrieve() {
CompletableFuture<Object> future1 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param1));
CompletableFuture<Object> future2 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param2));
CompletableFuture<Object> future3 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param3));
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", future1.get());
toClient.put("result2", future2.get());
toClient.put("result3", future3.get());
return toClient;
}
After refactoring, I got better performance. However, I found code that uses a while loop to check whether the task is done before getting the result.
ExecutorService executorService = Executors.newSingleThreadExecutor();
CompletableFuture<String> future = new CompletableFuture<>(); // creating an incomplete future
executorService.submit(() -> {
Thread.sleep(500);
future.complete("value"); // completing the incomplete future
return null;
});
while (!future.isDone()) { // checking the future for completion
Thread.sleep(1000);
}
String result = future.get(); // reading value of the completed future
logger.info("result: {}", result);
executorService.shutdown();
Questions
So, my questions are:
Did I do the refactoring the right way using CompletableFuture?
As far as I know, the get() method blocks until the task returns a result, so why is the while loop needed?
If I need to check whether all tasks are done, should I write code like this?
CompletableFuture<Void> allFutures = CompletableFuture.allOf(future1, future2, future3);
while(!allFutures.isDone()){}
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", future1.get());
toClient.put("result2", future2.get());
toClient.put("result3", future3.get());
Is CompletableFuture used correctly?
Did I do the refactoring the right way using CompletableFuture?
Your refactoring is okay. At least the 3 tasks are now executed in parallel in the background and not sequentially anymore.
However, instead of then blocking on all 3 by calling get(), I would suggest returning the futures out of the method. This lets the caller decide how to handle the situation, whether they want to continue an async chain or work with it in a blocking fashion by calling get(). So, if applicable, change the method to:
// changed return type
public Map<String, CompletableFuture<Object>> retrieve() {
CompletableFuture<Object> future1 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param1));
CompletableFuture<Object> future2 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param2));
CompletableFuture<Object> future3 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param3));
Map<String, CompletableFuture<Object>> toClient = new HashMap<>();
toClient.put("result1", future1); // no get() anymore
toClient.put("result2", future2); // no get() anymore
toClient.put("result3", future3); // no get() anymore
return toClient;
}
Minor note: you can do return Map.of("result1", future1, "result2", future2, "result3", future3) to simplify the end a bit.
On that note, I get that this is a heavily edited and simplified example but please check if the Map is actually meaningful in your case or whether you could just use a List instead:
return List.of(future1, future2, future3);
That said, in this particular case you could also utilize loops or streams to simplify the method further:
public List<CompletableFuture<Object>> retrieve() {
return Stream.of(param1, param2, param3)
.map(param -> CompletableFuture.supplyAsync(
() -> testProxy.findSomething(param)))
.toList();
}
I do have a concern about the missing type safety of your method, though. Object is not really helpful to a user, but maybe this is not the case in your real code.
Does get() wait until it's done?
As far as I know, the get() method blocks until the task returns a result, so why is the while loop needed?
You are correct. get() blocks until the task is done (or is cancelled, interrupted or ends abnormally due to an exception). A loop to wait on the result is not needed.
The example code you found is not the best; the loop is unnecessary and should be removed.
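Roughly, the snippet you found could be reduced to the following (same setup, just without the polling loop):
ExecutorService executorService = Executors.newSingleThreadExecutor();
CompletableFuture<String> future = new CompletableFuture<>();
executorService.submit(() -> {
    Thread.sleep(500);
    future.complete("value");
    return null;
});
String result = future.get(); // blocks until complete("value") is called; no loop needed
logger.info("result: {}", result);
executorService.shutdown();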
Is the call site okay?
If I need to check whether all tasks are done, should I write code like this?
The actively blocking loop while(!allFutures.isDone()){} is not okay and will melt your CPU (100% CPU usage). If you want to wait until all futures are done, just do allFutures.join() or allFutures.get(). That will be much better.
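Applied to your three futures, a sketch without the busy loop could look like this:
CompletableFuture<Void> allFutures = CompletableFuture.allOf(future1, future2, future3);
allFutures.join(); // waits for all three without burning CPU
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", future1.join()); // join() throws no checked exceptions
toClient.put("result2", future2.join());
toClient.put("result3", future3.join());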
Yet again, if possible, give the user the possibility to decide and return the futures to him.
We have an async method:
public CompletableFuture<OlderCat> asyncGetOlderCat(String catName)
Given a list of Cats:
List<Cat> cats;
We'd like to create a bulk operation that results in a map between the cat name and its async result:
public CompletableFuture<Map<String, OlderCat>>
We'd also like that, if an exception is thrown from asyncGetOlderCat, the cat will not be added to the map.
We were following this post and also this one and we came up with this code:
List<Cat> cats = ...
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(Collectors.toMap(Cat::getName,
c -> asyncGetOlderCat(c.getName())
.exceptionally( ex -> /* null?? */ ))
));
CompletableFuture<Void> allFutures = CompletableFuture
.allOf(completableFutures.values().toArray(new CompletableFuture[completableFutures.size()]));
return allFutures.thenApply(future -> completableFutures.keySet().stream()
.map(CompletableFuture::join) ???
.collect(Collectors.toMap(????)));
But it is not clear how, in allFutures, we can get access to the cat name, and how to match the OlderCat with the catName.
Can it be achieved?
You are almost there. You don't need to put an exceptionally() on the initial futures, but you should use handle() instead of thenApply() after the allOf(), because if any future fails, the allOf() will fail as well.
When processing the futures, you can then just filter out the failing ones from the result, and rebuild the expected map:
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(toMap(Cat::getName, c -> asyncGetOlderCat(c.getName())));
CompletableFuture<Void> allFutures = CompletableFuture
.allOf(completableFutures.values().toArray(new CompletableFuture[0]));
return allFutures.handle((dummy, ex) ->
completableFutures.entrySet().stream()
.filter(entry -> !entry.getValue().isCompletedExceptionally())
.collect(toMap(Map.Entry::getKey, e -> e.getValue().join())));
Note that the calls to join() are guaranteed to be non-blocking since the handle() callback will only be executed after all futures have completed.
As I understand it, what you need is a CompletableFuture with all the results; the code below does exactly that:
public CompletableFuture<Map<String, OlderCat>> getOlderCats(List<Cat> cats) {
return CompletableFuture.supplyAsync(
() -> {
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(Collectors.toMap(Cat::getName,
c -> asyncGetOlderCat(c.getName())
.exceptionally(ex -> {
ex.printStackTrace();
// if exception happens - return null
// if you don't want null - save failed ones to separate list and process them separately
return null;
}))
);
return completableFutures
.entrySet()
.stream()
.collect(Collectors.toMap(
Map.Entry::getKey,
e -> e.getValue().join()
));
}
);
}
What it does: it returns a future that creates more completable futures inside and waits for them at the end.
I would like to replicate and parallelize the following behavior with Java 8 streams:
for (Animal animal : animalList) {
// find all other animals with the same breed
Collection<Animal> queryResult = queryDatabase(animal.getBreed());
if (animal.getSpecie() == cat) {
catList.addAll(queryResult);
} else {
dogList.addAll(queryResult);
}
}
This is what I have so far
final Executor queryExecutor =
Executors.newFixedThreadPool(Math.min(animalList.size(), 10),
new ThreadFactory(){
public Thread newThread(Runnable r){
Thread t = new Thread(r);
t.setDaemon(true);
return t;
}
});
List<CompletableFuture<Collection<Animal>>> listFutureResult = animalList.stream()
.map(animal -> CompletableFuture.supplyAsync(
() -> queryDatabase(animal.getBreed()), queryExecutor))
.collect(Collectors.toList());
List<Animal> result = listFutureResult.stream()
.map(CompletableFuture::join)
.flatMap(subList -> subList.stream())
.collect(Collectors.toList());
1 - I'm not sure how to split the stream so that I can get 2 different animal lists, one for cats and one for dogs.
2 - does this solution look reasonable?
First, consider just using
List<Animal> result = animalList.parallelStream()
.flatMap(animal -> queryDatabase(animal.getBreed()).stream())
.collect(Collectors.toList());
even if it won’t give you the desired concurrency of up to ten. The simplicity might compensate for it. Regarding the other part, it’s as easy as
Map<Boolean,List<Animal>> result = animalList.parallelStream()
.flatMap(animal -> queryDatabase(animal.getBreed()).stream())
.collect(Collectors.partitioningBy(animal -> animal.getSpecie() == cat));
List<Animal> catList = result.get(true), dogList = result.get(false);
In case you have more species than just cats and dogs, you may use Collectors.groupingBy(Animal::getSpecie) to get a map from species to list of animals.
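For instance, a sketch of the groupingBy variant, assuming getSpecie() returns some Specie type with cat and dog constants as in your comparison above:
Map<Specie, List<Animal>> bySpecie = animalList.parallelStream()
        .flatMap(animal -> queryDatabase(animal.getBreed()).stream())
        .collect(Collectors.groupingBy(Animal::getSpecie));
List<Animal> catList = bySpecie.get(cat), dogList = bySpecie.get(dog); // may be null if a species is absent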
If you insist on using your own thread pool, a few things can be improved:
Executor queryExecutor = Executors.newFixedThreadPool(Math.min(animalList.size(), 10),
r -> {
Thread t = new Thread(r);
t.setDaemon(true);
return t;
});
List<Animal> result = animalList.stream()
.map(animal -> CompletableFuture.completedFuture(animal.getBreed())
.thenApplyAsync(breed -> queryDatabase(breed), queryExecutor))
.collect(Collectors.toList()).stream()
.flatMap(cf -> cf.join().stream())
.collect(Collectors.toList());
Your supplyAsync variant required capturing the actual Animal instance, creating a new Supplier for each animal. In contrast, the function passed to thenApplyAsync is invariant, performing the same operation for each parameter value. The code above assumes that getBreed is a cheap operation, otherwise, it wouldn’t be hard to pass the Animal instance to completedFuture and perform getBreed() with the async function instead.
The .map(CompletableFuture::join) can be replaced by a simple chained .join() within the flatMap function. Otherwise, if you prefer method references, you should use them consistently, i.e. .map(CompletableFuture::join).flatMap(Collection::stream).
Of course, this variant also allows using partitioningBy instead of toList.
As a final note, if you invoke shutdown on the executor service after use, there is no need to mark the threads as daemon:
ExecutorService queryExecutor=Executors.newFixedThreadPool(Math.min(animalList.size(),10));
Map<Boolean,List<Animal>> result = animalList.stream()
.map(animal -> CompletableFuture.completedFuture(animal.getBreed())
.thenApplyAsync(breed -> queryDatabase(breed), queryExecutor))
.collect(Collectors.toList()).stream()
.flatMap(cf -> cf.join().stream())
.collect(Collectors.partitioningBy(animal -> animal.getSpecie() == cat));
List<Animal> catList = result.get(true), dogList = result.get(false);
queryExecutor.shutdown();
I am having the following method:
public String getResult() {
List<String> serversList = getServerListFromDB();
List<String> appList = getAppListFromDB();
List<String> userList = getUserFromDB();
return getResult(serversList, appList, userList);
}
Here I am calling three methods sequentially, each of which hits the DB and fetches results; then I do post-processing on the results of those DB hits. I know how to call these three methods concurrently using threads, but I would like to use Java 8 parallel streams to achieve this. Can someone please guide me on how to achieve the same via parallel streams?
EDIT I just want to call the methods in parallel via Stream.
private void getInformation() {
method1();
method2();
method3();
method4();
method5();
}
You may utilize CompletableFuture this way:
public String getResult() {
// Create Stream of tasks:
Stream<Supplier<List<String>>> tasks = Stream.of(
() -> getServerListFromDB(),
() -> getAppListFromDB(),
() -> getUserFromDB());
List<List<String>> lists = tasks
// Supply all the tasks for execution and collect CompletableFutures
.map(CompletableFuture::supplyAsync).collect(Collectors.toList())
// Join all the CompletableFutures to gather the results
.stream()
.map(CompletableFuture::join).collect(Collectors.toList());
// Use the results. They are guaranteed to be ordered in the same way as the tasks
return getResult(lists.get(0), lists.get(1), lists.get(2));
}
As already mentioned, a standard parallel stream is probably not the best fit for your use case. I would complete each task asynchronously using an ExecutorService and "join" them when calling the getResult method:
ExecutorService es = Executors.newFixedThreadPool(3);
Future<List<String>> serversList = es.submit(() -> getServerListFromDB());
Future<List<String>> appList = es.submit(() -> getAppListFromDB());
Future<List<String>> userList = es.submit(() -> getUserFromDB());
return getResult(serversList.get(), appList.get(), userList.get());
forEach is what is used for side effects; you can call forEach on a parallel stream. For example:
listOfTasks.parallelStream().forEach(list -> {
    submitToDb(list);
});
However, parallelStream uses the common ForkJoinPool which is arguably not good for IO-bound tasks.
Consider using a CompletableFuture and supplying an appropriate ExecutorService. It gives more flexibility (continuation, configuration). For example:
ExecutorService executorService = Executors.newCachedThreadPool();
List<CompletableFuture<String>> allFutures = new ArrayList<>();
for (Query query : queries) {
    CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
        // submit query to db
        return result;
    }, executorService);
    allFutures.add(future);
}
CompletableFuture<Void> all = CompletableFuture.allOf(allFutures.toArray(new CompletableFuture[allFutures.size()]));
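To actually gather the results once everything has finished, a sketch using the allFutures list from above:
all.join(); // wait for every query to finish
List<String> results = allFutures.stream()
        .map(CompletableFuture::join) // non-blocking here, all futures are already complete
        .collect(Collectors.toList());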
Not quite clear what you mean, but if you just want to run some process on these lists in parallel, you can do something like this:
List<String> list1 = Arrays.asList("1", "234", "33");
List<String> list2 = Arrays.asList("a", "b", "cddd");
List<String> list3 = Arrays.asList("1331", "22", "33");
List<List<String>> listOfList = Arrays.asList(list1, list2, list3);
listOfList.parallelStream().forEach(list -> System.out.println(list.stream().max((o1, o2) -> Integer.compare(o1.length(), o2.length()))));
(it will print most lengthy elements from each list).