Process List of entities using completable futures - java

I have a list of entities of type T. I also have a functional interface that acts as a supplier: it has a method that performs a task on an entity and returns a result R. It looks like this:
R performTask(T entity) throws Exception;
I want to sort the outcomes into two separate maps: one for successful results and one for errors and exceptions. The code I wrote is slow; please suggest what can be done.
I am looping over the list of entities and processing their completable futures one by one, which I don't think is the right way to do it. What should I change?
private void updateResultAndExceptionMaps(List < T > entities, final TaskProcessor < T, R > taskProcessor) {
ExecutorService executor = createExecutorService();
Map < T, R > outputMap = Collections.synchronizedMap(new HashMap < T, R > ());
Map < T, Exception > errorMap = new ConcurrentHashMap < T, Exception > ();
try {
entities.stream()
.forEach(entity -> CompletableFuture.supplyAsync(() -> {
try {
return taskProcessor.performTask(entity);
} catch (Exception e) {
errorMap.put(entity, (Exception) e.getCause());
LOG.error("Error processing entity Exception: " + entity, e);
}
return null;
}, executor)
.exceptionally(throwable -> {
errorMap.put(entity, (Exception) throwable);
LOG.error("Error processing entity Throwable: " + entity, throwable);
return null;
})
.thenAcceptAsync(R -> outputMap.put(entity, R))
.join()
); // end of for-each
LOG.info("outputMap Map -> " + outputMap);
LOG.info("errorMap Map -> " + errorMap);
} catch (Exception ex) {
LOG.warn("Error: " + ex, ex);
} finally {
executor.shutdown();
}
}
outputMap should contain each entity and its result R.
errorMap should contain each failed entity and its Exception.

This is because you iterate over the list of entities one by one, create a CompletableFuture for each, and immediately block the iteration with join, which waits until the given processor finishes its work or throws an exception. To get real parallelism, convert each entity to a CompletableFuture, collect all the CompletableFuture instances, and only then wait for them by invoking join on each.
The code below should do the trick in your case:
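entities.stream()
    .map(entity -> CompletableFuture.supplyAsync(() -> {
            try {
                return taskProcessor.performTask(entity);
            } catch (Exception e) {
                // Store the exception itself: e.getCause() is often null,
                // and ConcurrentHashMap rejects null values.
                errorMap.put(entity, e);
            }
            return null;
        }, executor)
        .exceptionally(throwable -> {
            errorMap.put(entity, (Exception) throwable);
            return null;
        })
        .thenAcceptAsync(result -> {
            // A null result marks a failed task, so keep it out of outputMap.
            if (result != null) {
                outputMap.put(entity, result);
            }
        })
    ).collect(Collectors.toList()) // every task is started before any join
    .forEach(CompletableFuture::join);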
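A self-contained sketch of the same idea, for anyone who wants to run it in isolation. The TaskProcessor interface, the Outcome holder and the pool size of four are assumptions made for illustration and are not taken from the question's code base. It uses handle, which receives either the result or the failure, so each entity lands in exactly one of the two maps:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class PartitionSketch {

    // Hypothetical stand-in for the question's functional interface.
    @FunctionalInterface
    public interface TaskProcessor<T, R> {
        R performTask(T entity) throws Exception;
    }

    // Hypothetical holder for the two maps.
    public static final class Outcome<T, R> {
        public final Map<T, R> results = new ConcurrentHashMap<>();
        public final Map<T, Throwable> errors = new ConcurrentHashMap<>();
    }

    public static <T, R> Outcome<T, R> process(List<T> entities, TaskProcessor<T, R> processor) {
        ExecutorService executor = Executors.newFixedThreadPool(4); // pool size is an assumption
        Outcome<T, R> outcome = new Outcome<>();
        try {
            // Start every task first; nothing blocks inside this stream.
            List<CompletableFuture<Void>> futures = entities.stream()
                .map(entity -> CompletableFuture.supplyAsync(() -> {
                        try {
                            return processor.performTask(entity);
                        } catch (Exception e) {
                            // Rethrow unchecked so the future completes exceptionally.
                            throw new CompletionException(e);
                        }
                    }, executor)
                    .handle((result, failure) -> {
                        if (failure != null) {
                            // Unwrap the CompletionException to recover the original exception.
                            Throwable cause = failure.getCause() != null ? failure.getCause() : failure;
                            outcome.errors.put(entity, cause);
                        } else if (result != null) {
                            outcome.results.put(entity, result);
                        }
                        return (Void) null;
                    }))
                .collect(Collectors.toList());
            // Only now wait, once, for the whole batch.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            executor.shutdown();
        }
        return outcome;
    }

    public static void main(String[] args) {
        Outcome<String, Integer> outcome =
            process(Arrays.asList("12", "oops", "7"), s -> Integer.parseInt(s));
        System.out.println("results = " + outcome.results);
        System.out.println("failed  = " + outcome.errors.keySet());
    }
}
```

In this sketch a task that returns null is recorded in neither map; whether that is the right behaviour depends on what a null result means for your processor.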

Related

Getting the line number of the Mono/Flux that returned Mono.empty()

Let's say I have a long chain of Monos. Some monos in the chain might return Mono.empty().
I can recover with switchIfEmpty, but I'd like to know which mono raised the empty (maybe so I can know where to add smarter empty handling).
Is there a way to programmatically get this information?
Silly example: in cases where I return "how did I get here?", how can I know whether the first flatMap or the second flatMap triggered the empty handler?
Mono.just("data")
.flatMap(t -> {
if (System.currentTimeMillis() % 2 == 0) {
return Mono.empty();
}
return Mono.just("happy1");
})
.flatMap(t -> {
if (System.currentTimeMillis() % 2 == 0) {
return Mono.empty();
}
return Mono.just("happy2");
})
.map(s -> {
return "successful complete: " + s;
})
.switchIfEmpty(Mono.fromCallable(() -> {
return "how did I get here?";
}))
.block();
Due to the dynamic nature of Flux and Mono, and to the fact that the onComplete signal is considered neutral enough that it is usually just passed through, there is no generic solution for this.
In your particular example, you could replace the Mono.empty() with something like Mono.empty().doOnSuccess(v -> /* log something */), since doOnSuccess also fires (with a null value) when a Mono completes empty.
You could even directly perform the logging in the if block, but the decorated empty trick is probably adaptable to more situations.
Another possibility is to turn emptiness into an error, rather than switching on the onComplete signal.
Errors are less neutral, so there are ways to enrich them for debugging purposes. For instance, with a .checkpoint("flatMapX") statement after each flatMap, you'd get additional stack trace parts that point to the flatMap that failed due to emptiness.
A way of turning emptiness to error in Mono is .single(), which will enforce exactly one onNext() or propagate onError(NoSuchElementException).
One thing to keep in mind with this trick is that the placement of checkpoint matters: it MUST be AFTER the single() so that the error raised from the single() gets detected and enriched.
So if I build on your snippet:
static final String PARSEABLE_MARKER = "PARSEABLE MARKER: <";
static final char MARKER_END = '>';
String parseLocation(Exception e) {
StringWriter sw = new StringWriter();
PrintWriter pw = new PrintWriter(sw);
e.printStackTrace(pw);
String trace = sw.toString();
int start = trace.indexOf(PARSEABLE_MARKER);
if (start >= 0) {
trace = trace.substring(start + PARSEABLE_MARKER.length());
trace = trace.substring(0, trace.indexOf(MARKER_END));
return trace;
}
return "I don't know";
}
String testInner() {
Random random = new Random();
final boolean first = random.nextBoolean();
return Mono.just("data")
.flatMap(t -> {
if (System.currentTimeMillis() % 2 == 0 && first) {
return Mono.empty();
}
return Mono.just("happy1");
})
.single()
.checkpoint(PARSEABLE_MARKER + "the first flatMap" + MARKER_END)
.flatMap(t -> {
if (System.currentTimeMillis() % 2 == 0 && !first) {
return Mono.empty();
}
return Mono.just("happy2");
})
.single()
.checkpoint(PARSEABLE_MARKER + "the second flatMap" + MARKER_END)
.map(s -> {
return "successful complete: " + s;
})
.onErrorResume(NoSuchElementException.class, e ->
Mono.just("how did I get here? " + parseLocation(e)))
.block();
}
This can be run in a loop in a test for instance:
@Test
void test() {
int successCount = 0;
int firstCount = 0;
int secondCount = 0;
for (int i = 0; i < 100; i++) {
String message = testInner();
if (message.startsWith("how")) {
if (message.contains("first")) {
firstCount++;
}
else if (message.contains("second")) {
secondCount++;
}
else {
System.out.println(message);
}
}
else {
successCount++;
}
}
System.out.printf("Stats: %d successful, %d detected first, %d detected second", successCount, firstCount, secondCount);
}
Which prints something like:
Stats: 85 successful, 5 detected first, 10 detected second

CompletableFuture in a loop construct in a private Ethereum blockchain

I have a private Ethereum blockchain set up with 5 machines mining on it. The size of the blockchain (number of blocks) is, as of now, 300. The processing is done in a Java back end.
I need to run the following loop construct asynchronously. The bottleneck of the loop is the execution of the following command:
EthBlock eb = web3.ethGetBlockByNumber(new DefaultBlockParameterNumber(BigInteger.valueOf(i)), true).send();
The command can also return a CompletableFuture<EthBlock> object by ending it with sendAsync(), as described at https://github.com/web3j/web3j#start-sending-requests. Just calling sendAsync().get() removes the parallelism and makes it behave synchronously.
public void businessLogic() throws Exception {
recentBlocks = new ArrayList<EthBlock.Block>();
for (long i = 1; i <= 300000; i++) {
EthBlock eb = web3.ethGetBlockByNumber(new DefaultBlockParameterNumber(BigInteger.valueOf(i)), true).send();
if (eb == null || eb.getBlock() == null) {
continue;
}
EthBlock.Block block = eb.getBlock();
recentBlocks.add(block);
}
}
I am not able to grasp the intuition for translating this code into a form CompletableFuture can operate on. The goal is to group multiple calls to web3.ethGetBlockByNumber(...).sendAsync() into a collection and call them all at once to fill a list of EthBlock.Block objects, i.e. recentBlocks.
This is what I came up with:
public void businessLogic() throws Exception {
recentBlocks = new ArrayList<EthBlock.Block>();
List<CompleteableFuture> compFutures = new ArrayList<>();
for (long i = 0, i <= 300000, i++){
CompleteableFuture<EthBlock> compFuture = eb3.ethGetBlockByNumber(new DefaultBlockParameterNumber(BigInteger.valueOf(i)), true).sendAsync();
compFuture.thenAcceptAsync(eb -> // Doesn't look right
EthBlock.Block block = eb.getBlock();
recentBlock.add(block);)
compFutures.add(compFuture);
}
CompleteableFuture.allOf(compFutures).get();
}
Implementing IntStream
long start = System.nanoTime();
recentBlocks = IntStream.rangeClosed(0, 300_000)
.parallel()
.mapToObj(i -> {
try {
System.out.println("Current Thread -> " + Thread.currentThread());
return web3.ethGetBlockByNumber(new DefaultBlockParameterNumber(BigInteger.valueOf(i)), true).send();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return null;
})
.filter(Objects::nonNull)
.map(EthBlock::getBlock)
.filter(Objects::nonNull)
.collect(Collectors.toList());
long stop = System.nanoTime();
System.out.println("Time Elapsed: " + TimeUnit.MICROSECONDS.convert(stop-start, TimeUnit.NANOSECONDS));
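You might benefit from a parallel stream instead of relying on CompletableFuture. The resulting List still follows block-number order, because collect(Collectors.toList()) preserves encounter order even on a parallel stream:
List<EthBlock.Block> recentBlocks = IntStream.rangeClosed(0, 300_000)
    .parallel()
    .mapToObj(i -> {
        try {
            return web3.ethGetBlockByNumber(new DefaultBlockParameterNumber(BigInteger.valueOf(i)), true).send();
        } catch (IOException e) {
            // send() declares IOException, which a lambda passed to mapToObj cannot propagate.
            throw new UncheckedIOException(e);
        }
    })
    .filter(Objects::nonNull)
    .map(EthBlock::getBlock)
    .filter(Objects::nonNull)
    .collect(Collectors.toList());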
Because you stated that didn't help, let's try an ExecutorService that utilizes a cached thread pool instead:
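List<EthBlock.Block> blocks = Collections.synchronizedList(new ArrayList<>(300_000));
ExecutorService service = Executors.newCachedThreadPool();
for (int i = 0; i <= 300_000; i++) {
    BigInteger number = BigInteger.valueOf(i);
    service.execute(() -> {
        try {
            EthBlock eb = web3.ethGetBlockByNumber(new DefaultBlockParameterNumber(number), true).send();
            if (eb == null) {
                return;
            }
            EthBlock.Block block = eb.getBlock();
            if (block != null) {
                blocks.add(block);
            }
        } catch (IOException e) {
            // send() declares IOException; a Runnable cannot rethrow it, so report it and skip this block.
            e.printStackTrace();
        }
    });
}
// Stop accepting new tasks and wait for the submitted ones before reading blocks.
// The one-hour limit is an arbitrary choice.
service.shutdown();
service.awaitTermination(1, TimeUnit.HOURS);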
CompletableFuture has an overload of get:
get(long timeout, TimeUnit unit). You can use this to poll, by letting the get time out if the future does not complete within a specific time.
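A minimal sketch of that timed get, using only java.util.concurrent. The getOrFallback helper, the simulated slow supplier and the millisecond figures are made up for illustration; in the question's setting the future would come from a call like the sendAsync() shown earlier.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedGetSketch {

    // Returns the future's value, or the fallback if it is not ready within timeoutMillis.
    public static <T> T getOrFallback(CompletableFuture<T> future, long timeoutMillis, T fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The future keeps running; we simply stop waiting for it this time.
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return fallback;
        } catch (ExecutionException e) {
            // The task itself failed; surface its cause as an unchecked exception.
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> slow = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(500); // simulated slow remote call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "block";
        });
        // First poll gives up after 50 ms; the second waits long enough for the value.
        System.out.println(getOrFallback(slow, 50, "not ready yet"));
        System.out.println(getOrFallback(slow, 2000, "not ready yet"));
    }
}
```

A timed-out get does not cancel the underlying work, so the same future can be polled again later.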

cassandra driver: return set of failures in list of futures

I've got a list of futures that perform data deletion for given list of studentIds from cassandra:
val studentIds: List<String> = getStudentIds(...)
val boundStatements: List<BoundStatement> = studentIds.map { bindStudentDelete(it) }
val deleteFutures = boundStatements.map { session.executeAsync(it) }
deleteFutures.forEach {
// callback that will send metrics for monitoring
Futures.addCallback(it, MyCallback(...))
}
Above I have registered a callback MyCallback(...) for each future for sending metrics. Then I do:
Futures.inCompletionOrder(deleteFutures).forEach { it.get() }
to wait for the completion of all the deletes. If some of the futures end up failing for any reason (cancelled, something else goes wrong, etc.), I want to return the list of the affected studentIds so that I can deal with them later.
What is the best way to achieve that?
EDIT
The callback could be a way to mutate a state to track success/failure of all the deletions.
class MyCallback(
private val statsDClient: StatsdClient,
private val tags: Array<String>,
val failures: MutableList<String>
) : FutureCallback<Any> {
override fun onSuccess(result: Any?) {
//send success metrics
...
}
override fun onFailure(t: Throwable) {
// send failure metrics
...
// do something here to get the associated studentId
val currId = ...
failures.add(currId)
}
}
Similarly, I could mutate a state in Futures.inCompletionOrder(deleteFutures).forEach block with a try/catch:
val failedDeletes = mutableListOf<String>()
Futures.inCompletionOrder(deleteFutures).forEach {
try {
it.get()
} catch (e: Exception) {
// do something to get the studentId for this future
val currId = ...
failedDeletes.add(currId)
}
}
However, there are 2 things I don't like/know about it. One is that it's mutating a state that we have to define outside. The other is that I still don't know how to get the studentId from the point of failure (in onFailure or catch block).
Below is a code snippet in Java. Note that it is a blocking procedure. It relies on getUninterruptibly(), which the driver's Javadoc (Interface ResultSetFuture) describes as follows:
ResultSet getUninterruptibly()
Waits for the query to return and return its result. This method is usually more convenient than Future.get() because it:
Waits for the result uninterruptibly, and so doesn't throw InterruptedException.
Returns meaningful exceptions, instead of having to deal with ExecutionException.
As such, it is the preferred way to get the future result.
List<ResultSetFuture> futures = new ArrayList<>();
List<Long> futureStudentIds = new ArrayList<>();
// List<Long> successfullIds = new ArrayList<>();
List<Long> unsuccessfullIds = new ArrayList<>();
for (long studentid : studentids) {
futures.add(session.executeAsync(statement.deleteStudent(studentid)));
futureStudentIds.add(studentid);
}
for (int index = 0; index < futures.size(); index++) {
try {
futures.get(index).getUninterruptibly();
// successfullIds.add(futureStudentIds.get(index));
} catch (Exception e) {
unsuccessfullIds.add(futureStudentIds.get(index));
LOGGER.error("", e);
}
}
return unsuccessfullIds;
For a non-blocking approach you can use ListenableFuture; see:
Asynchronous queries with the Java driver
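A rough sketch of the non-blocking variant. It uses the JDK's CompletableFuture as a stand-in for the driver's ListenableFuture, and a hypothetical deleteAsync function in place of session.executeAsync(...); neither is the driver's API. What it illustrates is that the handler closes over the id, which is how the failure branch can know which delete failed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FailedIdsSketch {

    // deleteAsync is a hypothetical stand-in for session.executeAsync(...).
    public static CompletableFuture<List<Long>> collectFailedIds(
            List<Long> studentIds, Function<Long, CompletableFuture<Void>> deleteAsync) {
        Queue<Long> failed = new ConcurrentLinkedQueue<>();
        List<CompletableFuture<Void>> tracked = studentIds.stream()
            .map(id -> deleteAsync.apply(id)
                // The lambda captures id, so the failure branch knows which delete failed.
                .handle((ignored, error) -> {
                    if (error != null) {
                        failed.add(id);
                    }
                    return (Void) null;
                }))
            .collect(Collectors.toList());
        // Completes once every delete has finished, successfully or not.
        return CompletableFuture.allOf(tracked.toArray(new CompletableFuture[0]))
            .thenApply(done -> new ArrayList<>(failed));
    }

    public static void main(String[] args) {
        // Simulated deletes: odd ids fail.
        Function<Long, CompletableFuture<Void>> fakeDelete = id -> CompletableFuture.runAsync(() -> {
            if (id % 2 != 0) {
                throw new IllegalStateException("delete failed for " + id);
            }
        });
        List<Long> failedIds = collectFailedIds(Arrays.asList(1L, 2L, 3L, 4L), fakeDelete).join();
        System.out.println("failed ids: " + failedIds);
    }
}
```

The returned future can be chained further instead of joined, which keeps the calling thread free; the order of the failed ids follows completion order, not input order.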

Rx-Java replace Observable in case of error

I have 2 URLs to fetch the data from, for example \location_1\{userid} and \location_2\{userid}. First I get the list of users, and then I need to fetch the details of each user with the requests above. The issue is that I need to call \location_1\{userid} and, in case of an error (exception), fetch the data from \location_2\{userid} instead. Is it possible to do this in a single Rx chain? I've tried try/catch as described here, but it looks like the catch block is never called; only onErrorResumeNext is.
Observable<List<TestModel2>> observable = apiTest
.performTest()
.flatMapIterable(items -> items)
.flatMap(testModel -> {
try
{
return apiTest.performTest2(testModel.userId);
} catch (Exception e)
{
return apiTest.performTest3(testModel.userId);
}
}).doOnNext(testModel2 -> {Log.d("TestItemData", "doOnNext --- " + testModel2.title);})
.onErrorResumeNext(throwable ->{
Log.d("TestItemData", "onErrorResumeNext -------- ");
return Observable.empty();
})
.toList()
.subscribeOn(Schedulers.io())
.observeOn(AndroidSchedulers.mainThread());
Use onErrorResumeNext (as you already did a bit later in the flow):
Observable<List<TestModel2>> observable = apiTest
.performTest()
.flatMapIterable(items -> items)
.flatMap(testModel ->
apiTest.performTest2(testModel.userId)
.onErrorResumeNext(e -> apiTest.performTest3(testModel.userId)) // <---------------- fall back to the second URL
)
.doOnNext(testModel2 -> {
Log.d("TestItemData", "doOnNext --- " + testModel2.title);
})
.onErrorResumeNext(throwable ->{
Log.d("TestItemData", "onErrorResumeNext -------- ");
return Observable.empty();
})
.toList()
.subscribeOn(Schedulers.io())
.observeOn(AndroidSchedulers.mainThread());

RxJava concurrency with multiple subscribers and events

I'm looking for a way to attach multiple subscribers to an RxJava Observable stream, with each subscriber processing emitted events asynchronously.
I first tried using .flatMap() but that didn't seem to work on any subsequent subscribers. All subscribers were processing events on the same thread.
.flatMap(s -> Observable.just(s).subscribeOn(Schedulers.newThread()))
What ended up working was consuming each event in a new thread by creating a new Observable each time:
Observable.from(Arrays.asList(new String[]{"1", "2", "3"}))
.subscribe(j -> {
Observable.just(j)
.subscribeOn(Schedulers.newThread())
.subscribe(i -> {
try {
Thread.sleep(ThreadLocalRandom.current().nextInt(100, 500));
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("s1=>" + Thread.currentThread().getName() + "=>" + i);
});
});
Output:
s1=>RxNewThreadScheduler-1=>1
s1=>RxNewThreadScheduler-2=>2
s1=>RxNewThreadScheduler-3=>3
And the end result with multiple subscribers:
ConnectableObservable<String> e = Observable.from(Arrays.asList(new String[]{"1", "2", "3"}))
.publish();
e.subscribe(j -> {
Observable.just(j)
.subscribeOn(Schedulers.newThread())
.subscribe(i -> {
try {
Thread.sleep(ThreadLocalRandom.current().nextInt(100, 500));
} catch (InterruptedException e1) {
e1.printStackTrace();
}
System.out.println("s1=>" + Thread.currentThread().getName() + "=>" + i);
});
});
e.subscribe(j -> {
Observable.just(j)
.subscribeOn(Schedulers.newThread())
.subscribe(i -> {
try {
Thread.sleep(ThreadLocalRandom.current().nextInt(100, 500));
} catch (InterruptedException e1) {
e1.printStackTrace();
}
System.out.println("s2=>" + Thread.currentThread().getName() + "=>" + i);
});
});
e.connect();
Output:
s2=>RxNewThreadScheduler-4=>2
s1=>RxNewThreadScheduler-1=>1
s1=>RxNewThreadScheduler-3=>2
s2=>RxNewThreadScheduler-6=>3
s2=>RxNewThreadScheduler-2=>1
s1=>RxNewThreadScheduler-5=>3
However, this seems a little clunky. Is there a more elegant solution or is RxJava just not a good use case for this?
Use .flatMap(s -> Observable.just(s).observeOn(Schedulers.newThread())....)
If I understood the Rx contract correctly, you are trying to do something that goes against it.
Let's have a look at the contract:
The contract of an RxJava Observable is that events (onNext(), onCompleted(), onError()) can never be emitted concurrently. In other words, a single Observable stream must always be serialized and thread-safe. Each event can be emitted from a different thread, as long as the emissions are not concurrent. This means no interleaving or simultaneous execution of onNext(). If onNext() is still being executed on one thread, another thread cannot begin invoking it again (interleaving). --Tomasz Nurkiewicz in Reactive Programming with RxJava
In my opinion you are trying to break the contract by using a nested subscription in the outer subscription. The onNext call to the subscriber is not serialized anymore.
Why not move the "async"-workload from the subscriber to a flatMap-operator and subscribe to the new observable:
ConnectableObservable<String> stringObservable = Observable.from(Arrays.asList(new String[]{"1", "2", "3"}))
.flatMap(s -> {
return Observable.just(s).subscribeOn(Schedulers.computation());
})
.publish();
stringObservable
.flatMap(s -> {
// do More asyncStuff depending on subscription
return Observable.just(s).subscribeOn(Schedulers.newThread());
})
.subscribe(s -> {
// use result here
});
stringObservable
.subscribe(s -> {
// use immediate result here.
});
stringObservable.connect();
flatMap along with doOnNext on the Observable inside the flatMap will result in the same output as yours.
onNext() is always called in a sequential manner hence using doOnNext after the flatMap will also not work for you. Due to the same reason writing the action inside the final subscribe didn't work in your case.
The below code is written using RxJava2. In version 1 of RxJava you will have to add the try-catch block around Thread.sleep.
ConnectableObservable<String> e = Observable.just("1", "2", "3").publish();
e.flatMap(
s -> Observable.just(s)
.subscribeOn(Schedulers.newThread())
.doOnNext(i -> { // <<<<<<
Thread.sleep(ThreadLocalRandom.current().nextInt(100, 500));
System.out.println("s1=>" + Thread.currentThread().getName() + "=>" + i);
}))
.subscribe();
e.flatMap(
s -> Observable.just(s)
.subscribeOn(Schedulers.newThread())
.doOnNext(i -> { // <<<<<<
Thread.sleep(ThreadLocalRandom.current().nextInt(100, 500));
System.out.println("s2=>" + Thread.currentThread().getName() + "=>" + i);
}))
.subscribe();
e.connect();
You can achieve it with Flowable and parallel:
Flowable.fromIterable(Arrays.asList("1", "2", "3"))
.parallel(3)
.runOn(Schedulers.newThread())
.map(item -> {
try {
Thread.sleep(ThreadLocalRandom.current().nextInt(100, 500));
} catch (InterruptedException e1) {
e1.printStackTrace();
}
System.out.println("s1=>" + Thread.currentThread().getName() + "=>" + item);
return Completable.complete();
})
.sequential().subscribe();
