How to log pipelines in Java Project Reactor?

I have started a new project that uses Java Reactor and Spring WebFlux. Recently I had to debug a production bug, and it was a nightmare because nothing is logged. So, reading around, I found two ways to start adding logs to pipelines: one using .log(), and another using operators such as onErrorResume, doOnSubscribe, and doOnSuccess. Do you know which one I should use? Are there better ways to log pipelines? Here is an example of one of our pipelines:
return repositoryName.findById(event.eventId())
        .filter(found -> found.completedDate() == null)
        .filterWhen(found -> externalService.getEventSummary(found.getUser().userId())
                .doOnNext(summary -> log.info("Event found {} and should be marked as found", summary.id())));

I highly recommend the Reactor reference documentation to check what each method does: https://projectreactor.io/docs/core/release/reference/
doOnNext, doOnError, and doOnSuccess are commonly used to execute side-effecting operations (such as logging) without altering the current stream. If your doOnNext or doOnError receives a Square object, the operator passes that exact Square object downstream (even if you changed its properties).
Mapped roughly onto a try-catch block:
doOnNext runs after the upstream emits a value, like the body of the try block
doOnError runs if an exception was thrown upstream (and no other error handler on the stream consumed it), like a catch block
doFinally runs on any termination (success, error, or cancellation) and is the closest analogue to a finally block; doOnSuccess, by contrast, fires only when a Mono completes successfully
onErrorResume is mostly used to handle exceptions when needed, similar to catch (Exception e), but it returns a new fallback publisher rather than rethrowing.
As you can see, it's okay to have logs in flatMap, map, doOnNext, doOnError, or onErrorResume. The correct one to use will vary with what you need. If logging is the only thing you need, I recommend doOnNext, doOnError, or doFinally.
As said before, please refer to the docs too to check what each method stands for
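For illustration, here is a minimal sketch (the class and its fetch method are made up for this answer, not taken from the question) combining the .log() operator with targeted doOn* hooks:
import java.util.logging.Level;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import reactor.core.publisher.Mono;
import reactor.core.publisher.SignalType;

public class PipelineLogging {

    private static final Logger log = LoggerFactory.getLogger(PipelineLogging.class);

    Mono<String> fetch(String id) {
        return Mono.just(id)
                // .log() traces the selected signal types under a named category
                .log("fetch", Level.FINE, SignalType.ON_NEXT, SignalType.ON_ERROR)
                // doOn* hooks log as side effects without changing the stream
                .doOnSubscribe(s -> log.debug("fetching {}", id))
                .doOnNext(value -> log.info("fetched {}", value))
                .doOnError(e -> log.error("fetch of {} failed", id, e))
                // onErrorResume replaces the error with a fallback publisher
                .onErrorResume(e -> Mono.empty());
    }
}
As a rule of thumb, .log() is convenient while debugging because it traces every signal it is asked for, while the doOn* hooks are better suited to permanent, targeted production logging.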

Related

Find all places in Java code where a blocking operation happens

In our Netty application we are moving all blocking calls in our code to run in a special backgroundThreadGroup.
I'd like to be able to log, in production, the thread name and the line number of the Java code that is about to execute a blocking operation (i.e., synchronous file and network I/O).
That way I can grep the logs for places where we might have missed moving our blocking code to the backgroundThreadGroup.
Is there a way to instrument the JVM so that it can tell me that?
Depends on what you mean by a "blocking operation".
In a broad sense, any operation that causes a voluntary context switch is blocking. Trying to do something special about them is absolutely impractical.
For example, in Java, any method containing synchronized is potentially blocking. This includes
ConcurrentHashMap.put
SecureRandom.nextInt
System.getProperty
and many more. I don't think you really want to avoid calling all these methods that look normal at first glance.
Even simple methods without any synchronization primitives can be blocking. E.g., ByteBuffer.get may result in a page fault and a blocking read at the OS level. Furthermore, as mentioned in the comments, there are JVM-level blocking operations that are not under your control.
In short, it's impractical if not impossible to find all places in the code where a blocking operation happens.
If, however, you are interested in finding particular method calls that you believe are bad (like Thread.sleep and Socket.read), you can definitely do so. There is a BlockHound project specifically for this purpose. It already has a predefined list of "bad" methods, but can be customized with your own list.
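A minimal sketch of the basic usage (the deliberate Thread.sleep stands in for an accidental blocking call):
import java.time.Duration;
import reactor.blockhound.BlockHound;
import reactor.core.publisher.Mono;

public class BlockHoundDemo {
    public static void main(String[] args) {
        BlockHound.install();                      // instruments the JVM with the predefined list

        Mono.delay(Duration.ofMillis(1))           // continues on a non-blocking scheduler thread
                .doOnNext(it -> {
                    try {
                        Thread.sleep(10);          // detected: blocking call on a non-blocking thread
                    } catch (InterruptedException e) {
                        throw new RuntimeException(e);
                    }
                })
                .block();                          // fails with a BlockingOperationError
    }
}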
There is a library called BlockHound that will throw an exception on any blocking call unless you have configured BlockHound to ignore that specific call.
This is how you configure BlockHound for Netty: https://github.com/violetagg/netty/blob/625f9d5781ed85bfaca6fa4e826d0d46d70fdbd8/common/src/main/java/io/netty/util/internal/Hidden.java
You can improve the linked code by replacing the last line with:
builder.nonBlockingThreadPredicate(
        p -> p.or(thread -> thread instanceof FastThreadLocalThread));
see https://github.com/reactor/BlockHound
see https://blog.frankel.ch/blockhound-how-it-works/
I personally used it to find all blocking calls within our Netty-based service.
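For the logging use case from the question, a minimal sketch (the allow-listed method is purely illustrative) that reports blocking calls, with thread name and stack trace, instead of throwing:
import reactor.blockhound.BlockHound;

public class BlockingCallDetector {
    public static void main(String[] args) {
        BlockHound.install(builder -> builder
                // tolerate a blocking call you have reviewed (illustrative)
                .allowBlockingCallsInside("java.util.UUID", "randomUUID")
                // report instead of throwing, so production keeps running; the
                // Error's stack trace carries class, method, and line number
                .blockingMethodCallback(method ->
                        new Error("Blocking call on " + Thread.currentThread().getName()
                                + ": " + method).printStackTrace()));
        // ... start the Netty service, then grep the logs for "Blocking call on"
    }
}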
Good Luck

Is chaining multiple CompletableFuture calls, each with its own exception logic, a good practice?

I have a flow in my project which uploads a resource asynchronously, then performs an operation on it, and uses job polling to get the new resource with all operations applied. I've wrapped each of these steps inside a CompletableFuture, making a chain of four future objects. Each call has its own response parsing and exception logic, which I've added to the chain as well. This creates something like:
CompletableFuture.supplyAsync(() -> mySupplier1)
        .exceptionally(exceptionLogic)
        .thenApplyAsync(previous -> mySupplier2)
        .exceptionally(exceptionLogic)
        .thenApplyAsync(previous -> mySupplier3)
        .exceptionally(exceptionLogic);
I'd like to know if this way of using a CompletableFuture is good practice in general. It's not particularly hard to read, but I feel it might be tough to debug.

Is it reasonable to throw an exception from an asynchronous method?

When developing an asynchronous Java method with a CompletableFuture return type, we expect the resulting CF to complete normally or exceptionally depending on whether that method succeeds or fails.
Yet consider, for instance, that my method writes to an AsynchronousChannel and gets an exception opening that channel. It has not even started writing. So, in this case, I am tempted to just let the exception flow to the caller. Is that correct?
However, the caller will then have to deal with two failure scenarios: 1) a thrown exception, or 2) a rejected promise.
Or alternatively, should my method catch that exception and return a rejected promise instead?
IMO, option 1) makes the API harder to use because there will be two different paths for communicating errors:
"Synchronous" exceptions, where the method ends the an exception being thrown.
"Asynchronous" exceptions, where the method returns a CF, which completes with an exception. Note that it is impossible to avoid this case, because there will always be situations where the errors are only found after the asynchronous path has started (e.g. timeouts).
The programmer now has to ensure that both of these paths are correctly handled, instead of just one.
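For instance, a minimal sketch of the second option (the method and its parameters are illustrative; CompletableFuture.failedFuture requires Java 9+):
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

public class AsyncWriter {
    public CompletableFuture<Integer> writeAsync(Path file, ByteBuffer data) {
        AsynchronousFileChannel channel;
        try {
            channel = AsynchronousFileChannel.open(file, StandardOpenOption.WRITE);
        } catch (IOException e) {
            // the channel failed to open before any asynchronous work began:
            // report it through the future rather than by throwing
            return CompletableFuture.failedFuture(e);
        }
        CompletableFuture<Integer> result = new CompletableFuture<>();
        channel.write(data, 0, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer written, Void attachment) {
                result.complete(written);
            }
            @Override
            public void failed(Throwable exc, Void attachment) {
                result.completeExceptionally(exc);
            }
        });
        return result;
    }
}
This keeps a single error-reporting path: callers only ever inspect the returned future.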
It is also interesting to observe that the behaviour of both C# and JavaScript is to always report exceptions thrown inside the body of an async function via the returned Task/Promise, even for exceptions thrown before the first await, and never by ending the async function call with an exception.
The same is also true for Kotlin's coroutines, even when using the Unconfined dispatcher:
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.GlobalScope
import kotlinx.coroutines.launch
import org.junit.jupiter.api.Assertions.assertTrue
import org.junit.jupiter.api.Test
import org.slf4j.LoggerFactory

class SynchronousExceptionExamples {
    private val log = LoggerFactory.getLogger(SynchronousExceptionExamples::class.java)

    @Test
    fun example() {
        log.info("before launch")
        val job = GlobalScope.launch(Dispatchers.Unconfined) {
            log.info("before throw")
            throw Exception("an-error")
        }
        log.info("after launch")
        Thread.sleep(1000)
        assertTrue(job.isCancelled)
    }
}
will produce
6 [main] INFO SynchronousExceptionExamples - before launch
73 [main #coroutine#1] INFO SynchronousExceptionExamples - before throw
(...)
90 [main] INFO SynchronousExceptionExamples - after launch
Note that the exception occurs in the main thread; however, launch ends with a proper Job.
I think both are valid designs. Datastax actually started with the first approach, where borrowing a connection was blocking, and switched to a fully async model (https://docs.datastax.com/en/developer/java-driver/3.5/upgrade_guide/#3-0-4).
As a user of the Datastax Java driver I was very happy with the fix, as it changed the API to be truly non-blocking (even opening a channel, as in your example, has a cost).
But I don't think there is a right and a wrong here...
It doesn't make a big difference from the caller's point of view. In either case there will be visibility of the cause of the exception, whether it is thrown from the method or from calling get() on the CompletableFuture.
I would perhaps argue that an exception thrown by the completable future should be an exception from the async computation and not failing to start that computation.

Using ForkJoinPool on a set of documents

I have never used a ForkJoinPool and I came across this code snippet.
I have a Set<Document> docs. Document has a write method. If I do the following, do I need to have a get or join to ensure that all the docs in the set have correctly finished their write method?
ForkJoinPool pool = new ForkJoinPool(concurrencyLevel);
pool.submit(() -> docs.parallelStream().forEach(doc -> doc.write()));
What happens if one of the docs is unable to complete its write? Say it throws an exception. Does the given code wait for all the docs to complete their write operations?
ForkJoinPool.submit(Runnable) returns a ForkJoinTask representing the pending completion of the task. If you want to wait for all documents to be processed, you need some form of synchronization with that task, like calling its get() method (from the Future interface).
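For instance, a sketch of the same snippet with that synchronization added (Document is the class from the question):
ForkJoinPool pool = new ForkJoinPool(concurrencyLevel);
ForkJoinTask<?> task = pool.submit(() -> docs.parallelStream().forEach(Document::write));
try {
    task.get();                        // blocks until the whole pipeline has finished
} catch (ExecutionException e) {
    // a write failed; e.getCause() is the original exception
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}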
Concerning the exception handling: as usual, any exception during the stream processing will stop it. However, you have to refer to the documentation of Stream.forEach(Consumer):
The behavior of this operation is explicitly nondeterministic. For parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism. For any given element, the action may be performed at whatever time and in whatever thread the library chooses. […]
This means that you have no guarantee of which document will be written if an exception occurs. The processing will stop but you cannot control which document will still be processed.
If you want to make sure that the remaining documents are processed, I would suggest 2 solutions:
surround the document.write() with a try/catch to make sure no exception propagates, but this makes it difficult to check which document succeeded or if there was any failure at all; or
use another solution to manage your parallel processing, like the CompletableFuture API. As noted in the comments, your current solution is a hack that works thanks to implementation details, so it would be preferable to do something cleaner.
Using CompletableFuture, you could do it as follows:
List<CompletableFuture<Void>> futures = docs.stream()
        .map(doc -> CompletableFuture.runAsync(doc::write, pool))
        .collect(Collectors.toList());
This will make sure that all documents are processed, and you can inspect each future in the returned list for success or failure.
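To then wait for completion and find out which writes failed, a minimal sketch continuing from the snippet above (the logging is illustrative):
for (CompletableFuture<Void> future : futures) {
    try {
        future.join();                 // blocks until this document's write finishes
    } catch (CompletionException e) {
        // this write failed, but the remaining documents were still processed
        log.warn("document write failed", e.getCause());
    }
}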

Play Framework - Parallel promises with partial acceptable failure

I am currently trying to call a bunch of web services in parallel, and in the end I want to evaluate all the responses. Therefore I use Promise.sequence. Unfortunately the whole method fails if one of the web calls fails. I would be satisfied if I just got the responses of the succeeded calls.
Is there some way to perform the Promise.sequence and just retrieve the succeeded calls? After that it would be nice to handle the failed calls in a separate way.
I found a solution for now. For each Promise I create via ws.url("http://...").get(), I define a recover method, e.g.
ws.url(theUrl).get().recover((t) -> null)
So when these Promises are processed via Promise.sequence, no error is thrown (because it was already caught by the recover of the particular WS call's Promise).
Later on I just have to check if a result is null and then drop it from further processing.
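Putting it together, a sketch assuming Play 2.x's Java F.Promise and WS APIs (the urls list, the injected ws client, and the logging are illustrative):
List<F.Promise<WSResponse>> calls = urls.stream()
        .map(url -> ws.url(url).get().recover(t -> {
            Logger.warn("call to " + url + " failed", t);
            return null;                         // mark the failed call
        }))
        .collect(Collectors.toList());
F.Promise<List<WSResponse>> succeeded = F.Promise.sequence(calls)
        .map(responses -> responses.stream()
                .filter(Objects::nonNull)        // keep only the succeeded calls
                .collect(Collectors.toList()));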
