I am using the Spring Data MongoDB Reactive Streams driver with code that does something like this:
reactiveMongoOperations.changeStream(changeStreamOptions, MyObject.class)
    .parallel()
    .runOn(Schedulers.newParallel("my-scheduler", 4))
    .map(ChangeStreamEvent::getBody)
    .flatMap(o -> reactiveMongoOperations.findAndModify(query, update, options, MyObject.class))
    .subscribe(this::process);
I would have expected everything to execute on my-scheduler. What actually happens is that the flatMap operation does execute on my-scheduler, while the code in my process() method does not.
Can someone please explain why this is so: is it a bug, or am I doing something wrong? How can I get all the operations defined in the Flux to execute on the same scheduler?
runOn() specifies the scheduler used to run each "rail" of the ParallelFlux. It doesn't affect subscribers.
If you want to specify a scheduler for subscribers, you should set it with subscribeOn() on the original Flux (before the parallel() call).
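This isn't specific to Reactor, by the way. Plain CompletableFuture behaves the same way: a downstream callback runs on whatever thread completed the previous stage, unless you explicitly hand that stage an executor. A minimal stdlib sketch (the executor and thread name here are illustrative, not part of the question's code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// A single-thread executor whose worker is named "my-scheduler".
ExecutorService myScheduler = Executors.newSingleThreadExecutor(r -> new Thread(r, "my-scheduler"));

// The upstream work runs on the executor we supplied.
String upstreamThread = CompletableFuture
        .supplyAsync(() -> Thread.currentThread().getName(), myScheduler)
        .join();

// Without the *Async(executor) variant the callback would run wherever the
// previous stage completed; passing the executor pins it explicitly.
String callbackThread = CompletableFuture
        .supplyAsync(() -> "event", myScheduler)
        .thenApplyAsync(e -> Thread.currentThread().getName(), myScheduler)
        .join();

myScheduler.shutdown();
```

The same principle is at work in the question: each stage only runs on a given scheduler if something explicitly put it there.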
Basically, I'm making a queue processor in Spring Boot and want to use Reactor for async work. I've made a function that needs to loop forever, as it's the one that pulls from the queue and then marks each item as processed.
Here's the blocking version that works (Subscribe() returns a Mono):
while (true) {
    manager.Subscribe().block();
}
I'm not sure how to turn this into a Flux. I've looked at interval, generate, create, etc., and I can't get anything to work without calling block().
Here's an example of what I've tried:
Flux.generate(() -> manager,
    (state, sink) -> {
        state.Subscribe().block();
        sink.next("done");
        return state;
    });
Being a newbie to Reactor, I haven't been able to find anything about how to just loop and process the Monos synchronously without blocking.
Here's what the Subscribe method does using the AWS Java SDK v2:
public Mono<Void> Subscribe() {
return Mono.fromFuture(_client.receiveMessage(ReceiveMessageRequest.builder()
.waitTimeSeconds(10)
.queueUrl(_queueUrl)
.build()))
.filter(x -> x.messages() != null)
.flatMap(x -> Mono.when(x.messages()
.stream()
.map(y -> {
_log.warn(y.body());
return Mono.fromFuture(_client.deleteMessage(DeleteMessageRequest.builder()
.queueUrl(_queueUrl)
.receiptHandle(y.receiptHandle())
.build()));
}).collect(Collectors.toList())));
}
Basically, I'm just polling an SQS queue, deleting the messages then I want to do it again. This is all just exploratory for me.
Thanks!
You need two things: a way to subscribe in a loop and a way to ensure that the Subscribe() method is effectively called on each iteration (because the Future needs to be recreated).
repeat() is a baked-in operator that will resubscribe to its source once the source completes. If the source errors, the repeat cycle stops. The simplest variant continues to do so Long.MAX_VALUE times.
The only problem is that in your case the Mono from Subscribe() must be recreated on each iteration.
To do so, you can wrap the Subscribe() call in a defer: it will re-invoke the method each time a new subscription happens, which includes each repeat attempt:
Flux<Stuff> repeated = Mono
.defer(manager::Subscribe)
.repeat();
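The "must be recreated" point has a direct parallel in plain Java: a CompletableFuture represents a single result and completes exactly once, so each polling iteration needs a fresh one. Wrapping the call in a Supplier and invoking it per iteration is, roughly, what Mono.defer does on every (re)subscription. A sketch with a fake poll operation (the "message" payload is a placeholder, not the SQS code above):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// The Supplier plays the role of Mono.defer(manager::Subscribe):
// calling get() builds a brand-new future each time.
Supplier<CompletableFuture<String>> poll =
        () -> CompletableFuture.supplyAsync(() -> "message");

List<String> results = new ArrayList<>();
for (int i = 0; i < 3; i++) {
    results.add(poll.get().join()); // a fresh future per iteration
}
```

If you instead created the future once and joined it three times, you would just read the same cached result back, which is exactly the trap defer avoids.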
I have created a really simple example using RxJava 2 (everything I had developed before used RxJava 1), and I have run into behavior that I don't understand at all. I have the following Observable with a zip operation:
Observable.zip(getGame(gameId), getDetail(gameId), getReviews(gameId),
(game, detail, reviews) -> new GameInfo(game, detail, reviews))
.subscribeOn(Schedulers.newThread())
.subscribe(sendGameInfo(asyncResponse));
Each of the methods returns an instance of Observable. In theory, I would expect each of the methods (getGame, getDetail, ...) to be executed in parallel on a new thread, but with a sysout I noticed it is always the same thread, so they are not executed in parallel. I suppose this is the expected behavior, but if I wanted to run them in parallel, is there a way to do it without having to define a Runnable inside each of the observables?
Thank you very much.
OK, you need to subscribeOn every Observable:
Observable.zip(
        getGame(gameId).subscribeOn(Schedulers.from(executor)),
        getDetail(gameId).subscribeOn(Schedulers.from(executor)),
        getReviews(gameId).subscribeOn(Schedulers.from(executor)),
        (game, detail, reviews) -> new GameInfo(game, detail, reviews))
    .subscribeOn(Schedulers.from(executor))
    .subscribe(sendGameInfo(asyncResponse));
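The same per-source pattern exists with plain CompletableFuture: each source is submitted to the pool independently, and a combining step plays zip's role of waiting for all of them. A self-contained sketch (the string payloads stand in for the real game/detail/reviews objects):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Three threads so all three sources can genuinely run at the same time.
ExecutorService executor = Executors.newFixedThreadPool(3);

// Each supplyAsync call is submitted immediately; none waits for the others.
CompletableFuture<String> game    = CompletableFuture.supplyAsync(() -> "game", executor);
CompletableFuture<String> detail  = CompletableFuture.supplyAsync(() -> "detail", executor);
CompletableFuture<String> reviews = CompletableFuture.supplyAsync(() -> "reviews", executor);

// thenCombine is the zip step: it fires once both inputs are done.
String gameInfo = game
        .thenCombine(detail,  (g, d) -> g + "+" + d)
        .thenCombine(reviews, (gd, r) -> gd + "+" + r)
        .join();

executor.shutdown();
```

The key design point is the same in both APIs: parallelism comes from where each *source* is scheduled, not from the combining operator.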
I'm trying to create a Flowable which wraps an Iterable. I push elements to my Iterable periodically, but it seems that the completion event is implicit, and I don't know how to signal that processing is complete. For example, in my code:
// note that this code is written in Kotlin
val iterable = LinkedBlockingQueue<Int>()
iterable.addAll(listOf(1, 2, 3))
val flowable = Flowable.fromIterable(iterable)
.subscribeOn(Schedulers.computation())
.observeOn(Schedulers.computation())
flowable.subscribe(::println, {it.printStackTrace()}, {println("completed")})
iterable.add(4)
Thread.sleep(1000)
iterable.add(5)
Thread.sleep(1000)
This prints:
1
2
3
4
completed
I checked the source of the Flowable class, but it seems that I can't explicitly signal that a Flowable is complete. How can I do so? In my program I publish events with some delay between them, and I would like to be explicit about when to complete the event flow.
Clarification:
I have a long running process which emits events. I gather them in a queue and I expose a method which returns a Flowable which wraps around my queue. The problem is that there might be already elements in the queue when I create the Flowable. I will process the events only once and I know when the flow of events stops so I know when I need to complete the Flowable.
Using .fromIterable is the wrong way to create a Flowable for your use case.
I'm not actually clear on what that use case is, but you probably want to use Flowable.create() or a PublishSubject:
val flowable = Flowable.create<Int>( {
it.onNext(1)
it.onNext(2)
it.onComplete()
}, BackpressureStrategy.MISSING)
val publishSubject = PublishSubject.create<Int>()
val flowableFromSubject = publishSubject.toFlowable(BackpressureStrategy.MISSING)
// This data will be dropped unless something is subscribed to the flowable.
publishSubject.onNext(1)
publishSubject.onNext(2)
publishSubject.onComplete()
Of course how you deal with back-pressure will depend on the nature of the source of data.
As suggested by akarnokd, ReplayProcessor does exactly what you want. Replace iterable.add(item) with processor.onNext(item), and call processor.onComplete() when you are done.
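Outside Rx, the same "explicit completion" idea is often expressed with a sentinel value (a "poison pill") placed on the queue; conceptually that sentinel is what onComplete() replaces. A rough stdlib sketch in Java (the sentinel value is an arbitrary choice for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
final Integer POISON = Integer.MIN_VALUE; // sentinel marking end-of-stream

queue.addAll(List.of(1, 2, 3));
queue.add(POISON); // the producer signals completion explicitly

// The consumer drains until it sees the sentinel, like onComplete().
// (poll() is safe here only because the queue is pre-filled.)
List<Integer> seen = new ArrayList<>();
Integer item;
while (!(item = queue.poll()).equals(POISON)) {
    seen.add(item);
}
```

A Processor gives you the same producer-driven completion signal without needing a magic value in the data domain.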
I am writing a program to download historical quotes from a source. The source provides files over http for each day which need to be parsed and processed. The program downloads multiple files in parallel using a CompletableFuture using different stages. The first stage is to make a Http call using HttpClient and get the response.
The getHttpResponse() method returns a CloseableHttpResponse object. I also want to return the url for which this http request was made. The simplest way is to have a wrapper object holding these 2 fields, but I feel it is too much to have a class just to contain them. Is there a way with CompletableFuture or Streams that I can achieve this?
filesToDownload.stream()
    .map(url -> CompletableFuture.supplyAsync(() -> this.getHttpResponse(url), this.executor))
    .map(httpResponseFuture -> httpResponseFuture.thenAccept(t -> processHttpResponse(t)))
    .count();
It’s not clear why you want to bring in the Stream API at all costs. Splitting the CompletableFuture use into two map operations causes the problem which wouldn’t exist otherwise. Besides that, using map for side effects is an abuse of the Stream API. This may break completely in Java 9, if filesToDownload is a Stream source with a known size (like almost every Collection). Then, count() will simply return that known size, without processing the functions of the map operations…
If you want to pass the URL and the CloseableHttpResponse to processHttpResponse, you can do it as easy as:
filesToDownload.forEach(url ->
CompletableFuture.supplyAsync(() -> this.getHttpResponse(url), this.executor)
.thenAccept( t -> processHttpResponse(t, url))
);
Even, if you use the Stream API to collect results, there is no reason to split the CompletableFuture into multiple map operations:
List<…> result = filesToDownload.stream()
.map(url -> CompletableFuture.supplyAsync(() -> this.getHttpResponse(url), this.executor)
.thenApply( t -> processHttpResponse(t, url)) )
.collect(Collectors.toList())
.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());
Note that this will collect the CompletableFutures into a List before waiting for any result in a second Stream operation. This is preferable to using a parallel Stream operation as it ensures that all asynchronous operations have been submitted, before starting to wait.
Using a single Stream pipeline would imply waiting for the completion of the first job before even submitting the second and using a parallel Stream would only reduce that problem instead of solving it. It would depend on the execution strategy of the Stream implementation (the default Fork/Join pool), which interferes with actual policy of your specified executor. E.g., if the specified executor is supposed to use more threads than CPU cores, the Stream would still submit only as much jobs at a time as there are cores — or even less if there are other jobs on the default Fork/Join pool.
In contrast, the behavior of the solution above will be entirely controlled by the execution strategy of the specified executor.
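The submit-all-then-join shape described above, reduced to a self-contained sketch (the URL strings and "response:" payloads are fake placeholders for the real HTTP call):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

ExecutorService executor = Executors.newFixedThreadPool(4);
List<String> filesToDownload = List.of("u1", "u2", "u3");

// First pass: submit everything to the executor; nothing blocks here,
// so all downloads are in flight before we wait on any of them.
List<CompletableFuture<String>> futures = filesToDownload.stream()
        .map(url -> CompletableFuture.supplyAsync(() -> "response:" + url, executor))
        .collect(Collectors.toList());

// Second pass: now that every job has been submitted, join them in order.
List<String> results = futures.stream()
        .map(CompletableFuture::join)
        .collect(Collectors.toList());

executor.shutdown();
```

The intermediate collect is the whole point: a single lazy pipeline would interleave submission with joining, reintroducing the serialization problem.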
I have an Observable that goes to the database and queries for some information. I don't want my observable to execute for longer than 5 seconds, so I use:
myObservable.timeout(5, TimeUnit.SECONDS);
Then I want to handle the error notification as well, so I use:
myObservable.timeout(5, TimeUnit.SECONDS).onErrorReturn(e -> emptyResult);
Then I wonder what will happen to the code in myObservable that does the database query. Will it also be terminated, or will it continue to run? (Like what happens with Java's native Future.get(timeLimit).)
Let's take an example :
Observable.interval(1, TimeUnit.SECONDS)
.timeout(10, TimeUnit.MICROSECONDS)
.onErrorReturn(e -> -1L)
.subscribe(System.out::println,
Throwable::printStackTrace,
() -> System.err.println("completed"));
The timeout operator will emit an error, but preceding operators won't be notified of this error.
The onErrorReturn operator will transform your error into an event, then complete your stream (marking it as finished), and then your source observable will be unsubscribed.
This unsubscription runs some code that, depending on how your source observable is written, may stop your request, do nothing, or free some resources.
In your case, it may call the cancel method on your Future (according to the Subscriptions class).
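That Future.cancel behavior can be observed directly with plain java.util.concurrent (the 30-second sleep below stands in for the long-running database call):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

ExecutorService pool = Executors.newSingleThreadExecutor();

// A task that would run far longer than we are willing to wait.
Future<Integer> slow = pool.submit(() -> {
    TimeUnit.SECONDS.sleep(30); // placeholder for the database query
    return 1;
});

// cancel(true) interrupts the worker thread; this is the kind of call an
// unsubscription may trigger on the underlying Future. Whether the work
// actually stops depends on the task honoring interruption, as with sleep().
boolean cancelled = slow.cancel(true);

pool.shutdown();
```

So, as with the Rx case above, cancellation is cooperative: the signal is delivered, but only interruption-aware code actually stops early.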