I'm using Spring Boot and Spring Data JPA, and I have logic that consists of 3 database requests which I want to run in parallel. I want to use CompletableFuture for this purpose.
In the end I need to build a response object from the results of 5 database queries. 3 of them I'm currently running in a loop.
So I've created the CompletableFutures:
CompletableFuture<Long> totalFuture = CompletableFuture.supplyAsync(() -> myRepository.getTotal());
CompletableFuture<Long> countFuture = CompletableFuture.supplyAsync(() -> myRepository.getCount());
Then I'm planning to use .allOf with these futures. But I have a problem with the calls in the loop. How can I rewrite it to use a callable, given that for every query I need to pass a value from the request and then store the result in a map by key?
Map<String, Integer> groupcount = new HashMap<>();
request.ids().forEach((key, value) -> groupcount.put(key, myRepository
.getGroupCountId(value)));
To explain a little more thoroughly, I'm posting a code snippet which I want to chain; for now it works like this.
List<CompletableFuture<Void>> completableFutures = new ArrayList<>();
Map<String, Integer> groupcount = new ConcurrentHashMap<>();
for (var id : request.ids().entrySet()) {
completableFutures.add(
CompletableFuture.runAsync(someOperation, EXECUTOR_SERVICE)
.thenApply(v -> runQuery(id.getValue()))
.thenAcceptAsync(res -> groupcount.put(id.getKey(), res)));
}
CompletableFuture.allOf(completableFutures.toArray(new CompletableFuture[0])).get();
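One way to approach the loop is to start one future per map entry, wait for all of them, and only then read the results into the final map. The following is a minimal sketch under assumptions, not a definitive implementation: GroupCountSketch and queryAll are hypothetical names, and the lambda passed in main is a stand-in for myRepository.getGroupCountId(value).

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original question.
final class GroupCountSketch {

    // Runs `query` once per entry on `executor` and returns key -> query result.
    // Note: Collectors.toMap rejects null values, so `query` must not return null.
    static <K, V, R> Map<K, R> queryAll(Map<K, V> ids, Function<V, R> query, Executor executor) {
        // Start every query first so that they can run concurrently.
        Map<K, CompletableFuture<R>> pending = ids.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> CompletableFuture.supplyAsync(() -> query.apply(e.getValue()), executor)));

        // Block once until all queries have finished (rethrows the first failure).
        CompletableFuture.allOf(pending.values().toArray(new CompletableFuture[0])).join();

        // All futures are complete here, so join() returns immediately.
        return pending.entrySet().stream()
                .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().join()));
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Map<String, Long> ids = Map.of("a", 1L, "b", 2L, "c", 3L);
            // Stand-in for myRepository.getGroupCountId(value) (assumption).
            Map<String, Integer> counts = queryAll(ids, id -> (int) (id * 10), pool);
            System.out.println(counts);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each result is read from its own future, no shared mutable map is written from several threads, so a ConcurrentHashMap is not needed in this variant.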
I would like to leverage the .promise(final Function<Traversal<S, E>, T> traversalFunction) method of a Gremlin GraphTraversal. It's not clear to me what function I would use within the promise.
Using the Tinkerpop Client object, I do something like this:
GraphTraversal myTraversal = g.V().hasLabel("myLabel");
client.submitAsync(myTraversal)
.thenAccept(result -> {
List<Map<Object, Object>> resultList = new ArrayList<>();
result.iterator().forEachRemaining(item ->{
DefaultRemoteTraverser drt = (DefaultRemoteTraverser) item.getObject();
Map<Object, Object> itemMap = (HashMap) drt.get();
resultList.add(itemMap);
});
outputSuccess(resultList);
})
.exceptionally(throwable -> {
// handle;
return null;
});
What would the equivalent look like using .promise()? I looked for a test in the source repo that might provide a clue, but did not see one.
First note that promise() can only be used if the Traversal uses a remote connection. It will throw an exception in embedded mode, as explained in the javadoc.
The promise() takes a function that processes the Traversal after it has been submitted asynchronously to the server. You're really just providing a terminator to the promise() to get a result into the returned CompletableFuture:
g.V().out().promise(t -> t.toList());
I guess you could chain it to be more exactly like what you had in your example:
g.V().out().promise(t -> t.toList()).
thenAccept(r -> outputSuccess(r)).
exceptionally(...);
Situation
I refactored some code to use CompletableFuture for better performance.
The code is shown below (each result is independent).
Code before refactoring
public Map<String, Object> retrieve() {
Object result1 = testProxy.findSomething(param1); // blocking
Object result2 = testProxy.findSomething(param2); // blocking
Object result3 = testProxy.findSomething(param3); // blocking
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", result1);
toClient.put("result2", result2);
toClient.put("result3", result3);
return toClient;
}
Code after refactoring
public Map<String, Object> retrieve() {
CompletableFuture<Object> future1 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param1));
CompletableFuture<Object> future2 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param2));
CompletableFuture<Object> future3 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param3));
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", future1.get());
toClient.put("result2", future2.get());
toClient.put("result3", future3.get());
return toClient;
}
After refactoring, I got better performance. However, I found code that uses a while loop to check that the task is done before getting the result.
ExecutorService executorService = Executors.newSingleThreadExecutor();
CompletableFuture<String> future = new CompletableFuture<>(); // creating an incomplete future
executorService.submit(() -> {
Thread.sleep(500);
future.complete("value"); // completing the incomplete future
return null;
});
while (!future.isDone()) { // checking the future for completion
Thread.sleep(1000);
}
String result = future.get(); // reading value of the completed future
logger.info("result: {}", result);
executorService.shutdown();
Questions
So, my questions are:
Did I refactor the code correctly using CompletableFuture?
As far as I know, get() blocks until the task returns a result, so why is the while loop needed?
If I need to check whether all tasks are done, should I write code like this?
CompletableFuture<Void> allFutures = CompletableFuture.allOf(future1, future2, future3);
while(!allFutures.isDone()){}
Map<String, Object> toClient = new HashMap<>();
toClient.put("result1", future1.get());
toClient.put("result2", future2.get());
toClient.put("result3", future3.get());
Is CompletableFuture used correctly?
Did I refactor the code correctly using CompletableFuture?
Your refactoring is okay. At least the 3 tasks are now executed in parallel in the background and not sequentially anymore.
However, I would suggest that instead of blocking on all 3 by calling get(), you return the futures from the method. This lets the caller decide how to handle the situation: whether to continue an async chain or to work with it in a blocking fashion by calling get(). So if applicable, change the method to:
// changed return type
public Map<String, CompletableFuture<Object>> retrieve() {
CompletableFuture<Object> future1 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param1));
CompletableFuture<Object> future2 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param2));
CompletableFuture<Object> future3 =
CompletableFuture.supplyAsync(() -> testProxy.findSomething(param3));
Map<String, CompletableFuture<Object>> toClient = new HashMap<>();
toClient.put("result1", future1); // no get() anymore
toClient.put("result2", future2); // no get() anymore
toClient.put("result3", future3); // no get() anymore
return toClient;
}
Minor note: you can do return Map.of("result1", future1, "result2", future2, "result3", future3) to simplify the end a bit.
On that note, I get that this is a heavily edited and simplified example but please check if the Map is actually meaningful in your case or whether you could just use a List instead:
return List.of(future1, future2, future3);
That said, in this particular case you could also utilize loops or streams to simplify the method further:
public List<CompletableFuture<Object>> retrieve() {
return Stream.of(param1, param2, param3)
.map(param -> CompletableFuture.supplyAsync(
() -> testProxy.findSomething(param)))
.toList();
}
I do have a concern about the missing type safety of your method though. Object is not really helpful to a user, but maybe this is not the case in your real code.
Does get() wait until its done?
As far as I know, get() blocks until the task returns a result, so why is the while loop needed?
You are correct. get() blocks until the task is done (or is cancelled, interrupted or ends abnormally due to an exception). A loop to wait on the result is not needed.
The example code you found is not the best. The loop is bad, it should be removed.
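For illustration, here is the found example with the polling loop removed. This is a minimal sketch: the class name BlockingGetSketch and the method awaitValue are made up for this example, and the checked exceptions thrown by get() are wrapped in an IllegalStateException purely to keep the sketch short.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical class name, used only for this sketch.
final class BlockingGetSketch {

    static String awaitValue() {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        CompletableFuture<String> future = new CompletableFuture<>(); // incomplete future
        try {
            executorService.submit(() -> {
                Thread.sleep(500);
                future.complete("value"); // completes the future from the worker thread
                return null;
            });
            // get() parks the calling thread until the future completes; no isDone() loop needed.
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("result: " + awaitValue());
    }
}
```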
Is the callsite okay?
If I need to check whether all tasks are done, should I write code like this?
The actively blocking loop while(!allFutures.isDone()){} is not okay: it busy-waits and will keep a CPU core at 100% usage for as long as the tasks run. If you want to wait until all futures are done, just do allFutures.join() or allFutures.get(). That will be much better.
Yet again, if possible, let the caller decide and return the futures to them.
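If the caller still wants a single map rather than three separate futures, one option is to return a future of the map, built once allOf completes. The sketch below is an illustration under assumptions: RetrieveSketch and retrieveAsync are hypothetical names, and findSomething here is a local stand-in for testProxy.findSomething(param).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical class, not the original service.
final class RetrieveSketch {

    // Stand-in for testProxy.findSomething(param) (assumption).
    static Object findSomething(String param) {
        return "value-for-" + param;
    }

    static CompletableFuture<Map<String, Object>> retrieveAsync(String p1, String p2, String p3) {
        CompletableFuture<Object> future1 = CompletableFuture.supplyAsync(() -> findSomething(p1));
        CompletableFuture<Object> future2 = CompletableFuture.supplyAsync(() -> findSomething(p2));
        CompletableFuture<Object> future3 = CompletableFuture.supplyAsync(() -> findSomething(p3));

        // The callback runs only after all three futures are complete,
        // so the join() calls inside it do not block.
        return CompletableFuture.allOf(future1, future2, future3).thenApply(ignored -> {
            Map<String, Object> toClient = new HashMap<>();
            toClient.put("result1", future1.join());
            toClient.put("result2", future2.join());
            toClient.put("result3", future3.join());
            return toClient;
        });
    }

    public static void main(String[] args) {
        // A blocking caller can simply join(); an async caller can keep chaining.
        System.out.println(retrieveAsync("a", "b", "c").join());
    }
}
```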
I have a methodA which takes an argument and returns a result. I am writing a reactive method to invoke the function in bulk, but I am not able to get my head around the reactive syntax.
My code looks like this:
List<GetResult> successfulResults =
Collections.synchronizedList(new ArrayList<>());
Map<String, Throwable> erroredResults = new ConcurrentHashMap<>();
Flux.fromIterable(docsToFetch).flatMap(key -> reactiveCollection.getAndTouch(key, Duration.ofMinutes(extendExpiryInMin))
.onErrorResume(e -> {
erroredResults.put(key, e);
return Mono.empty();
})
).doOnNext(successfulResults::add).last().block();
The current implementation calls the method but collects the results in a list, which does not make sense for my use case. I want to collect the results in a hash map of key to result.
The solution is
List<String> docsToFetch = Arrays.asList("airline_112", "airline_1191", "airline_1203");
Map<String, GetResult> successfulResults = new ConcurrentHashMap<>();
Map<String, Throwable> erroredResults = new ConcurrentHashMap<>();
Flux.fromIterable(docsToFetch).flatMap(key -> reactiveCollection.get(key).onErrorResume(e -> {
erroredResults.put(key, e);
return Mono.empty();
}).doOnNext(getResult -> successfulResults.put(key, getResult))).last().block();
I have the following method start that returns an Observable.
If I subscribe to it, the operations inside handle (they are DB insert operations) do not seem to happen. But if I subscribe to the handle method directly, it works and performs the DB insertions.
Stepping through in the debugger, the data is all captured correctly. It just seems like the subscription didn't work as intended.
Can I get some advice on why it doesn't work if I subscribe to the start method, and how to fix this? Thanks.
Note: I'm not sure if it matters, but I'm using RxJava 1.
It doesn't work if I subscribe to the following.
public Observable start(KafkaConsumerRecords<String, String> records) {
return Observable.from(records.getDelegate().records().records("TOPIC_NAME"))
.buffer(2)
.map(this::convert)
.map(o -> handle(o));
}
It works if I subscribe to the handle method directly. (By passing in a hardcoded eventObject for testing)
public Observable<ResultSet> handle(Object eventObject) {
Map<String, String> map = (Map<String, String>) eventObject;
String s1 = map.get("item1");
String s2 = map.get("item2");
Observable<ResultSet> rs1 = someDBInsertMethod1(s1);
Observable<ResultSet> rs2 = someDBInsertMethod2(s2);
return Observable.merge(rs1, rs2);
}
For reference, the convert method takes in an Object and returns a map.
public Map<String, String> convert(Object records) {
Map<String, String> map = new HashMap<String, String>();
// add some data to map from the object records.
return map;
}
This is how I subscribe via another class in a main method.
// doesn't work, db insert does not happen.
Observable result1 = myClass.start(getHardCodedRecords());
result1.subscribe(resultSet -> System.out.println("printing this thus no errors.... but no DB insert happened"));
//directly subscribing to the handle method and this time it works, able to insert into db.
Observable result2 = myClass.handle(getHardCodedMap());
result2.subscribe(resultSet -> System.out.println("printing this thus no errors.... works"));
I have the following method:
public String getResult() {
List<String> serversList = getServerListFromDB();
List<String> appList = getAppListFromDB();
List<String> userList = getUserFromDB();
return getResult(serversList, appList, userList);
}
Here I am calling three methods sequentially, each of which hits the DB and fetches results; then I do post-processing on the results I got from the DB. I know how to call these three methods concurrently using threads, but I would like to use Java 8 parallel streams to achieve this. Can someone please guide me on how to achieve the same with parallel streams?
EDIT: I just want to call the methods in parallel via a Stream.
private void getInformation() {
method1();
method2();
method3();
method4();
method5();
}
You may utilize CompletableFuture this way:
public String getResult() {
// Create Stream of tasks:
Stream<Supplier<List<String>>> tasks = Stream.of(
() -> getServerListFromDB(),
() -> getAppListFromDB(),
() -> getUserFromDB());
List<List<String>> lists = tasks
// Supply all the tasks for execution and collect CompletableFutures
.map(CompletableFuture::supplyAsync).collect(Collectors.toList())
// Join all the CompletableFutures to gather the results
.stream()
.map(CompletableFuture::join).collect(Collectors.toList());
// Use the results. They are guaranteed to be ordered in the same way as the tasks
return getResult(lists.get(0), lists.get(1), lists.get(2));
}
As already mentioned, a standard parallel stream is probably not the best fit for your use case. I would complete each task asynchronously using an ExecutorService and "join" them when calling the getResult method:
ExecutorService es = Executors.newFixedThreadPool(3);
Future<List<String>> serversList = es.submit(() -> getServerListFromDB());
Future<List<String>> appList = es.submit(() -> getAppListFromDB());
Future<List<String>> userList = es.submit(() -> getUserFromDB());
return getResult(serversList.get(), appList.get(), userList.get());
forEach is what is used for side effects; you can call forEach on a parallel stream. For example:
listOfTasks.parallelStream().forEach(list -> {
submitToDb(list);
});
However, parallelStream uses the common ForkJoinPool which is arguably not good for IO-bound tasks.
Consider using CompletableFuture and supplying an appropriate ExecutorService. It gives more flexibility (continuation, configuration). For example:
ExecutorService executorService = Executors.newCachedThreadPool();
List<CompletableFuture<String>> allFutures = new ArrayList<>();
for (Query query : queries) {
// named "future" so it does not clash with the loop variable "query"
CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
// submit query to db
return result;
}, executorService);
allFutures.add(future);
}
CompletableFuture<Void> all = CompletableFuture.allOf(allFutures.toArray(new CompletableFuture[allFutures.size()]));
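The snippet above stops at allOf, which only signals completion and carries no results. A possible continuation, sketched under assumptions, is to join each future after allOf has completed. QueryBatchSketch and runAll are hypothetical names, queries are modelled as plain strings, and the runQuery function is a stand-in for the real database call.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class for this sketch.
final class QueryBatchSketch {

    // Runs every query on the given executor and returns the results in input order.
    static List<String> runAll(List<String> queries, Function<String, String> runQuery, Executor executor) {
        List<CompletableFuture<String>> futures = queries.stream()
                .map(q -> CompletableFuture.supplyAsync(() -> runQuery.apply(q), executor))
                .collect(Collectors.toList());

        // allOf yields no values itself; it only completes when every future has.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

        // Every future is complete now, so join() just unwraps the value.
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A dedicated pool keeps blocking IO off the common ForkJoinPool.
        ExecutorService executorService = Executors.newCachedThreadPool();
        try {
            // The lambda is a stand-in for submitting the query to the DB (assumption).
            System.out.println(runAll(List.of("q1", "q2"), q -> "rows-of-" + q, executorService));
        } finally {
            executorService.shutdown();
        }
    }
}
```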
It's not quite clear what you mean, but if you just want to run some processing on these lists in parallel, you can do something like this:
List<String> list1 = Arrays.asList("1", "234", "33");
List<String> list2 = Arrays.asList("a", "b", "cddd");
List<String> list3 = Arrays.asList("1331", "22", "33");
List<List<String>> listOfList = Arrays.asList(list1, list2, list3);
listOfList.parallelStream().forEach(list -> System.out.println(list.stream().max((o1, o2) -> Integer.compare(o1.length(), o2.length()))));
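(It will print the longest element of each list, wrapped in an Optional. Because forEach on a parallel stream is unordered, the lines may appear in any order.)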
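If the per-list results are needed as values rather than printed, a variant is to map and collect instead of using forEach; collecting from an ordered source keeps the input order even on a parallel stream. This is only a sketch: LongestPerListSketch and longestOfEach are hypothetical names, and an empty list is mapped to an empty string as an arbitrary choice.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class name for this sketch.
final class LongestPerListSketch {

    // Returns the longest string of each inner list, in the order of the outer list.
    static List<String> longestOfEach(List<List<String>> lists) {
        return lists.parallelStream()
                .map(list -> list.stream()
                        .max(Comparator.comparingInt(String::length))
                        .orElse("")) // empty inner list -> empty string (arbitrary choice)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("1", "234", "33");
        List<String> list2 = Arrays.asList("a", "b", "cddd");
        List<String> list3 = Arrays.asList("1331", "22", "33");
        // prints [234, cddd, 1331]
        System.out.println(longestOfEach(Arrays.asList(list1, list2, list3)));
    }
}
```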