I've read through the Reactor documentation, but I haven't been able to find a proper pattern for the following problem.
I have a method that is supposed to do something asynchronously. It returns the responses as a Flux that the consumer can subscribe to.
The method has the following definition:
Flux<ResultMessage> sendRequest(RequestMessage message);
The returned Flux is hot: results can arrive asynchronously at any time.
A potential consumer could use it in the following manner:
sendRequest(message).subscribe(response -> doSomethingWithResponse(response));
An implementation could look like this:
Flux<ResultMessage> sendRequest(RequestMessage message) {
    Flux<ResultMessage> result = incomingMessageStream
        .filter( resultMessage -> Objects.equals( resultMessage.getId(), message.getId() ) )
        .take( 2 );
    // The message sending is done here...
    return result;
}
Here incomingMessageStream is a Flux of all messages going through this channel.
The problem with this implementation is that the consumer may subscribe only after the result messages have started arriving, so it can miss some of them.
So what I am looking for is a solution that lets the consumer not depend on the time of subscription. A potential consumer may not even subscribe to the resulting Flux at all. I am looking for a general solution, but if that is not possible, you can assume that the number of resulting messages is no greater than 2.
After some time I created a solution that seems to work:
Flux<ResultMessage> sendRequest(RequestMessage message) {
    final int maxResponsesCount = 2;
    final Duration responseTimeout = Duration.ofSeconds( 10 );
    final Duration subscriptionTimeout = Duration.ofSeconds( 5 );
    // (1)
    ConnectableFlux<ResultMessage> result = incomingMessageStream
        .ofType( ResultMessage.class )
        .filter( resultMessage -> Objects.equals( resultMessage.getId(), message.getId() ) )
        .take( maxResponsesCount )
        .timeout( responseTimeout )
        .replay( maxResponsesCount );
    Disposable connectionDisposable = result.connect();
    // (2)
    AtomicReference<Subscription> subscriptionForCancelSubscription = new AtomicReference<>();
    Mono.delay( subscriptionTimeout )
        .doOnSubscribe( subscriptionForCancelSubscription::set )
        .subscribe( x -> connectionDisposable.dispose() );
    // The message sending is done here...
    // (3)
    return result
        .doOnSubscribe( s -> subscriptionForCancelSubscription.get().cancel() )
        .doFinally( signalType -> connectionDisposable.dispose() );
}
I am using a ConnectableFlux that connects to the stream immediately, without waiting for a subscriber, and that uses the replay() operator to store all messages, so a subscriber arriving at a later point does not miss any response messages (1).
There are a few paths along which this can execute:
1. The method is called and no subscription is made on the Flux.
Solution - a timer disposes of the connected Flux after 5 seconds if no subscription is made. (2)
2. The method is called and the Flux is subscribed to.
2.1. No message has been returned.
Solution - there is a timeout set for getting responses (.timeout( responseTimeout )). After that, .doFinally(..) cleans up the resources (1)(3).
2.2. Some of the response messages have been returned.
Solution - same as 2.1.
2.3. All response messages have been returned.
Solution - doFinally() is executed because the maximum number of elements has been reached ( .take( maxResponsesCount ) ) (1)(3).
I have yet to do any serious testing on this; if something goes wrong, I'll add the correction to this answer.
I am using the Kafka Streams DSL. How would I proceed to suppress the output of an aggregation (similar behavior to this) with a custom condition?
Let's say that for every key I may have a START and a STOP event. I only want to emit the aggregate for a key when its STOP event arrives or after a timeout.
The desired flow would be roughly like this:
time   input-topic                   output-topic
1      key1:{type:start, time: 0}    ...
3      key2:{type:start, time: 2}    ...
4      key1:{type:stop, time:3}      ...
4+e    ...                           key1:{type:closed, duration:3}
61     ...                           ...
61+e   ...                           key2:{type:timeout, duration:60}
where the timeout is 60 units of time and e is an arbitrary amount of time the stream takes to process the event.
The code (pseudocode for now) would be something like:
KStream<String, String> sourceStream = builder.stream("input-topic", Consumed.with(stringSerializer, stringSerializer));
KGroupedStream<String, String> groupedStream = sourceStream
    .groupByKey();
KTable<String, String> aggregatedStream = groupedStream
    .suppress(Suppressed.untilWindowCloses(myCustomCondition()))
    .aggregate(
        () -> null,
        (aggKey, newValue, aggValue) -> aggregateStartStop(aggValue, newValue),
        Materialized
            .<String, String, KeyValueStore<Bytes, byte[]>>as("aggregated-stream-store")
            .withValueSerde(Serdes.String())
    );
aggregatedStream.toStream();
KafkaStreams streams = new KafkaStreams(builder.build(), streamsSettings);
streams.start();
You could use the KTable to store the state (in your case, the type) along with a 60 second window. Whenever you receive an event for that particular key you update the state and time. Then you can use a filter before a .to() method to either send or not send a message to the outgoing topic based on the state (type).
Take a look at Neil Avery's blog post here:
https://www.confluent.io/blog/journey-to-event-driven-part-4-four-pillars-of-event-streaming-microservices
And scroll down to Event Flow Breakdown 1. Payments inflight
It's where I got the idea from.
I ran this:
Mono<Void> mono = Mono.empty();
System.out.println("mono.block: " + mono.block());
and it produces:
mono.block: null
as expected. In other words, calling block() returns immediately if the Mono has already completed.
Another example, resembling the real-world scenario. I have a source flux, e.g.:
Flux<Integer> ints = Flux.range(0, 2);
I make a connectable flux that I will use to allow multiple subscribers:
ConnectableFlux<Integer> publish = ints.publish();
For this example, let's say there's a single real-work subscriber:
publish
    .doOnComplete(() -> System.out.println("publish completed"))
    .subscribe();
and another subscriber that just produces the element count:
Mono<Long> countMono = publish
    .doOnComplete(() -> System.out.println("countMono completed"))
    .count();
countMono.subscribe();
I connect the connectable flux and print the element count:
publish.connect();
System.out.println("block");
long count = countMono.block();
System.out.println("count: " + count);
This prints:
publish completed
countMono completed
block
In other words, both subscribers subscribe successfully and complete, but then countMono.block() blocks indefinitely.
Why is that and how do I make this work? My end goal is to get the count of the elements.
You can get this to work by using autoConnect or refCount instead of manually calling connect().
For example:
Flux<Integer> ints = Flux.range(0, 2);
Flux<Integer> publish = ints.publish()
    .autoConnect(2); // new
publish
    .doOnComplete(() -> System.out.println("publish completed"))
    .subscribe();
Mono<Long> countMono = publish
    .doOnComplete(() -> System.out.println("countMono completed"))
    .count();
// countMono.subscribe();
long count = countMono.block();
System.out.println("count: " + count);
Why does your example not work?
Here is what I think is happening in your example... but this is based on my limited knowledge, and I'm not 100% sure it is correct.
1. .publish() turns the upstream source into a hot stream.
2. You then subscribe twice (but these don't start the flow yet, since the connectable flux is not connected to the upstream yet).
3. .connect() subscribes to the upstream and starts the flow.
4. The upstream and the two subscriptions that were registered before connect() complete (since this is all happening in the main thread).
5. At this point the ConnectableFlux is no longer connected to the upstream, because the upstream has completed. (The Reactor docs are light on details about what happens to a ConnectableFlux when new subscriptions arrive after the upstream source completes, so this is the part I'm not 100% certain about.)
6. block() creates a new subscription.
7. But since the ConnectableFlux is no longer connected, no data is flowing.
8. If you were to call connect() again (from another thread, since the main thread is blocked), data would flow again, and block() would complete. However, this would be a new sequence (not the original sequence that completed in step 4).
Why does my example work?
Only two subscriptions are created (instead of 3 in your example), one from a .subscribe() call, and one from .block(). The ConnectableFlux auto connects after 2 subscriptions, and therefore the block() subscription completes. Both subscriptions share the same upstream sequence.
After spending a day learning about the Java concurrency API, I still don't quite get how to build the following functionality with the CompletableFuture and ExecutorService classes:
When I get a request on my REST endpoint, I need to:
1. Start an asynchronous task (includes a DB query, filtering, etc.), which will give me a list of String URLs at the end.
2. In the meantime, respond to the REST caller with HTTP OK, indicating that the request was received and that I'm working on it.
3. When the asynchronous task is finished, send HTTP requests (with the payload the REST caller gave me) to the URLs I got from the job. There would be at most around 100 URLs, so I need these to happen in parallel.
4. Ideally, keep a synchronized counter of how many of the HTTP requests succeeded or failed, so that I can send this information back to the REST caller (the URL I need to send it to is provided inside the request payload).
I have the building blocks (methods like getMatchingObjectsFromDB(callerPayload), getURLs(resultOfgetMatchingObjects), sendHttpRequest(Url, methodType), etc.) written already; I just can't quite figure out how to tie step 1 and step 3 together. I would use CompletableFuture.supplyAsync() for step 1, and then I would need the CompletableFuture.thenCompose() method to start step 3, but it's not clear to me how parallelism can be achieved with this API. It is rather intuitive with ExecutorService executor = Executors.newWorkStealingPool(); though, which creates a thread pool based on the number of available processors, and the tasks can be submitted via the invokeAll() method.
How can I use CompletableFuture and ExecutorService together? Or how can I guarantee parallel execution of a list of tasks with CompletableFuture? A demonstrating code snippet would be much appreciated. Thanks.
You should use join() to wait for all the tasks to finish.
Create a Map<String, Boolean> result to store the outcome of each request.
In your controller:
public void yourControllerMethod() {
    CompletableFuture.runAsync(() -> yourServiceMethod());
}
In your service:
// Execute your logic to get List<String> urls
List<CompletableFuture<Void>> futures = urls.stream()
    .map(url -> CompletableFuture.supplyAsync(() -> requestUrl(url))
        // assumption: requestUrl returns true on success and false on failure
        .thenAcceptAsync(requestResult -> result.put(url, requestResult)))
    .collect(toList()); // You have a list of CompletableFutures here
Then use .join() to wait for all the tasks (remember that your service is already executing in its own thread):
// allOf() takes an array, so the list is converted first
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
Then you can determine which requests succeeded or failed by reading the result map.
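The question asked how step 1 (the asynchronous lookup) can be tied to step 3 (the parallel requests). The following is a minimal sketch of one way to chain them with thenCompose, not the asker's actual code: EventFanOut, loadUrls and send are hypothetical names standing in for the asker's own building blocks, and the executor is assumed to be supplied by the caller.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class EventFanOut {
    /**
     * loadUrls and send are hypothetical stand-ins: loadUrls produces the
     * target URLs, send returns true when a request to one URL succeeded.
     */
    static CompletableFuture<Map<String, Boolean>> propagate(
            Supplier<List<String>> loadUrls, Predicate<String> send, Executor pool) {
        return CompletableFuture
            .supplyAsync(loadUrls, pool)                    // step 1: look up the URLs
            .thenCompose(urls -> {                          // step 3: one task per URL
                Map<String, Boolean> result = new ConcurrentHashMap<>();
                CompletableFuture<?>[] requests = urls.stream()
                    .map(url -> CompletableFuture
                        .supplyAsync(() -> send.test(url), pool)
                        .exceptionally(error -> false)      // an exception counts as a failure
                        .thenAccept(ok -> result.put(url, ok)))
                    .toArray(CompletableFuture[]::new);
                // Completes when every request has finished; nothing blocks a pool thread.
                return CompletableFuture.allOf(requests).thenApply(done -> result);
            });
    }
}
```

The caller can return HTTP OK right away and attach the callback (step 4) with thenAccept on the returned future; counting the true values in the map gives the number of successful requests.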
Edit
Please post the code you produced so that others can understand you as well.
I've read your code and here are the needed modification:
When this for loop was not commented out, the receiver webserver got
the same request twice,
I don't understand the purpose of this for loop.
Sorry, I did not clean that up in my previous answer. It was just a temporary idea in my head that I forgot to remove at the end :D
Just remove it from your code.
// allOf() only accepts arrays, so the List needed to be converted
/* The code never gets over this part (I know allOf() is a blocking call), even long after when the receiver got the HTTP request
with the correct payload. I'm not sure yet where exactly the code gets stuck */
Your map should be a ConcurrentHashMap because you're modifying it concurrently later.
Map<String, Boolean> result = new ConcurrentHashMap<>();
If your code still does not work as expected, I suggest removing the parallelStream() part.
CompletableFuture and parallelStream() both use the common ForkJoinPool by default. I think the pool is exhausted.
You should also create your own pool for your CompletableFutures:
Executor pool = Executors.newFixedThreadPool(10);
And execute your request using that pool:
CompletableFuture.supplyAsync(YOURTASK, pool).thenAcceptAsync(Yourtask, pool)
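As a hedged illustration of why a dedicated pool helps, the sketch below starts as many blocking tasks as the pool has threads and lets every task wait until all of them are running. DedicatedPoolDemo and countTasksThatRanInParallel are hypothetical names used only for this example; the latch merely stands in for a slow HTTP call.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class DedicatedPoolDemo {
    /** Returns how many tasks observed all other tasks running at the same time. */
    static long countTasksThatRanInParallel(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(taskCount);
        CountDownLatch allStarted = new CountDownLatch(taskCount);
        // Each task signals that it started, then waits (up to 5 s) for the others.
        Supplier<Boolean> task = () -> {
            allStarted.countDown();
            try {
                return allStarted.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        };
        try {
            List<CompletableFuture<Boolean>> futures = IntStream.range(0, taskCount)
                .mapToObj(i -> CompletableFuture.supplyAsync(task, pool))
                .collect(Collectors.toList());
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return futures.stream().filter(CompletableFuture::join).count();
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the pool has one thread per task, every task can reach the latch; with a pool smaller than the number of blocking tasks, the waiting tasks would only time out.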
For the sake of completeness, here are the relevant parts of the code, after clean-up and testing (thanks to Mạnh Quyết Nguyễn):
Rest controller class:
@POST
@Path("publish")
public Response publishEvent(PublishEvent eventPublished) {
    /*
    Payload verification, etc.
    */
    //First send the event to the right subscribers, then send the resulting hashmap<String url, Boolean subscriberGotTheRequest> back to the publisher
    CompletableFuture.supplyAsync(() -> EventHandlerService.propagateEvent(eventPublished)).thenAccept(map -> {
        if (eventPublished.getDeliveryCompleteUri() != null) {
            String callbackUrl = Utility
                .getUri(eventPublished.getSource().getAddress(), eventPublished.getSource().getPort(), eventPublished.getDeliveryCompleteUri(), isSecure,
                    false);
            try {
                Utility.sendRequest(callbackUrl, "POST", map);
            } catch (RuntimeException e) {
                log.error("Callback after event publishing failed at: " + callbackUrl);
                e.printStackTrace();
            }
        }
    });
    //return OK while the event publishing happens in async
    return Response.status(Status.OK).build();
}
Service class:
private static List<EventFilter> getMatchingEventFilters(PublishEvent pe) {
    //query the database, filter the results based on the method argument
}
private static boolean sendRequest(String url, Event event) {
    //send the HTTP request to the given URL, with the given Event payload, return true if the response is positive (status code starts with 2), false otherwise
}
static Map<String, Boolean> propagateEvent(PublishEvent eventPublished) {
    // Get the event relevant filters from the DB
    List<EventFilter> filters = getMatchingEventFilters(eventPublished);
    // Create the URLs from the filters
    List<String> urls = new ArrayList<>();
    for (EventFilter filter : filters) {
        String url;
        try {
            boolean isSecure = filter.getConsumer().getAuthenticationInfo() != null;
            url = Utility.getUri(filter.getConsumer().getAddress(), filter.getPort(), filter.getNotifyUri(), isSecure, false);
        } catch (ArrowheadException | NullPointerException e) {
            e.printStackTrace();
            continue;
        }
        urls.add(url);
    }
    // Written to from several pool threads, hence the concurrent map
    Map<String, Boolean> result = new ConcurrentHashMap<>();
    Stream<CompletableFuture<Void>> stream = urls.stream().map(url -> CompletableFuture.supplyAsync(() -> sendRequest(url, eventPublished.getEvent()))
        .thenAcceptAsync(published -> result.put(url, published)));
    // allOf() takes an array; join() blocks until every request has finished
    CompletableFuture.allOf(stream.toArray(CompletableFuture[]::new)).join();
    log.info("Event published to " + urls.size() + " subscribers.");
    return result;
}
Debugging this was a bit harder than usual; sometimes the code just magically stopped. To fix this, I only put the code parts that were absolutely necessary into the async task, and I made sure the code in the task was thread-safe. Also, I was a dumb-dumb at first: my methods inside EventHandlerService used the synchronized keyword, which resulted in the CompletableFuture inside the service class method not executing, since it runs on a thread pool by default.
A method marked with synchronized behaves like a synchronized block around its whole body: only one thread at a time can execute it on the same object (or on the same class, for static methods).
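One plausible mechanism for the stall described above (an assumption, since the original synchronized methods are not shown) is that the outer method held the class lock while waiting for tasks that needed the same lock on a pool thread. A minimal sketch with hypothetical names:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SynchronizedJoinDemo {
    // Needs the class monitor, like any static synchronized method.
    static synchronized boolean inner() {
        return true;
    }

    // Holds the class monitor while waiting, so inner() cannot start on the pool thread.
    static synchronized boolean synchronizedOuter(long timeoutMillis) {
        return runInnerAsyncAndWait(timeoutMillis);
    }

    // Same logic without the lock: the pool thread is free to run inner().
    static boolean plainOuter(long timeoutMillis) {
        return runInnerAsyncAndWait(timeoutMillis);
    }

    private static boolean runInnerAsyncAndWait(long timeoutMillis) {
        CompletableFuture<Boolean> task =
            CompletableFuture.supplyAsync(SynchronizedJoinDemo::inner);
        try {
            return task.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return false; // the task did not finish in time (or failed)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With join() instead of a timed get(), synchronizedOuter would wait forever, which matches the "code just magically stopped" symptom; the timeout is only there to keep the sketch from hanging.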
I have an observable that:
emits data after a few seconds.
can be triggered several times.
wraps an operation that can't be executed in parallel, so we need a buffer.
I understand that this isn't clear, so let me explain with an example:
Observable<IPing> pingObservable = Observable.defer(() ->
    new PingCommand(account, folders)
        .post()
        .asObservable()
);
This is the main feature. It shouldn't be called again while a previous call is executing, but it should remember that the user requested it again. So I created a closing buffer as a PublishSubject:
closeBuffer = PublishSubject.create();
Now I'm wondering how to merge it.
I have tried this:
Observable.defer(() -> new PingCommand(account, folders)
    .post()
    .asObservable()
    .buffer(() -> closeBuffer)
    .flatMap(Observable::from)
    .first()
);
but it does not work the way I want.
Edit:
I will try to explain that better:
I'm sending a POST to the server. We may wait several MINUTES for a response (because it is an Exchange ActiveSync PUSH). I cannot ping again while a request is in flight, so I have to wait until it is done. I don't need to buffer those observables, just the information that a user has requested a ping, and then send a request after the first one is done. I'm just learning reactive programming, so I don't really know how to use complicated features like backpressure.
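The behavior described here (never two pings at once, but remember that another one was requested) can be sketched without RxJava. The following uses plain CompletableFuture rather than the Observable API from the question; CoalescingRunner and its operation parameter are hypothetical names, with operation standing in for the ping call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class CoalescingRunner {
    private final Supplier<CompletableFuture<?>> operation; // hypothetical stand-in for the ping
    private boolean running;
    private boolean requestedAgain;

    CoalescingRunner(Supplier<CompletableFuture<?>> operation) {
        this.operation = operation;
    }

    /** Starts the operation, or remembers the request if one is already running. */
    void request() {
        synchronized (this) {
            if (running) {
                requestedAgain = true; // many requests collapse into one follow-up run
                return;
            }
            running = true;
        }
        start();
    }

    private void start() {
        operation.get().whenComplete((value, error) -> onFinished());
    }

    private void onFinished() {
        synchronized (this) {
            if (!requestedAgain) {
                running = false;
                return;
            }
            requestedAgain = false;
        }
        start(); // one follow-up run for all requests made in the meantime
    }
}
```

Only a flag is stored, not the requests themselves, which mirrors the statement that just the information that a ping was requested is needed.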
This is how I want this problem to be solved (pseudo code)
??????<Result> request
= ????.???()
.doOnNext( result -> { … })
.doOnSubscribe(() -> { … })
.doOnCompleted(() -> { … })
.…
//__________________________________________________________
Observable<Result> doAsyncWork(Data data) { … } // this is API function
//__________________________________________________________
// api usage example
Subscription s1 = doAsyncWork(someData).subscribe() // start observing async work; executed doOnSubscribe
Subscription s2 = doAsyncWork(someData).subscribe() // wait for async work result …
//__________________________________________________________
// after some time pass, maybe from other thread
Subscription s1 = doAsyncWork(someData).subscribe() // wait for async work result …
//__________________________________________________________
// async work completes, all subscribers obtain the same result; executed doOnCompleted
//__________________________________________________________
// again
Subscription s1 = doAsyncWork(someData).subscribe() // start observing async work; executed doOnSubscribe
// async work completes, subscriber obtains result; executed doOnCompleted
Obviously, I can use an if statement instead, but I want to know how to do it properly.
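The semantics in the pseudocode above (callers that arrive while the work is running share its result, and a call after completion starts a fresh run) can be sketched with CompletableFuture. This deliberately swaps the Observable API of the question for java.util.concurrent; SharedAsyncWork and its work supplier are hypothetical names, and the sketch assumes the supplier itself does not throw.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class SharedAsyncWork<T> {
    private final AtomicReference<CompletableFuture<T>> inFlight = new AtomicReference<>();
    private final Supplier<CompletableFuture<T>> work; // hypothetical async operation

    SharedAsyncWork(Supplier<CompletableFuture<T>> work) {
        this.work = work;
    }

    CompletableFuture<T> doAsyncWork() {
        while (true) {
            CompletableFuture<T> current = inFlight.get();
            if (current != null) {
                return current; // a run is in progress: share its result
            }
            CompletableFuture<T> shared = new CompletableFuture<>();
            if (inFlight.compareAndSet(null, shared)) {
                // This caller won the race and starts the actual work.
                work.get().whenComplete((value, error) -> {
                    inFlight.compareAndSet(shared, null); // the next caller starts a fresh run
                    if (error != null) {
                        shared.completeExceptionally(error);
                    } else {
                        shared.complete(value);
                    }
                });
                return shared;
            }
            // Another caller registered a run first: loop and pick it up.
        }
    }
}
```

The compare-and-set replaces the explicit if check mentioned above and keeps the start of a run atomic when callers arrive from different threads.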
I have an Observable (which obtains data from the network).
The problem is that the observable can be fast or slow depending on network conditions.
I show a progress widget while the observable is executing and hide it when the observable completes. When the network is fast, the progress widget flickers (appears and disappears). I want to set the minimum execution time of the observable to 1 second. How can I do that?
The "Delay" operator is not an option because it would add the delay even on a slow network.
You can use Observable.zip() for that. Given
Observable<Response> network = ...
One can do
Observable<Integer> readyNotification = Observable.just(42).delay(1, TimeUnit.SECONDS);
Observable<Response> delayedNetwork = network.zipWith(readyNotification,
    (response, notUsed) -> response);
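The same "pair the result with a timer" idea can be tried outside RxJava. Below is a minimal sketch that uses CompletableFuture.thenCombine instead of Observable.zip (a swapped-in API, not the one from the answer); MinimumDuration and withMinimumDuration are hypothetical names, and CompletableFuture.delayedExecutor requires Java 9 or later.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

final class MinimumDuration {
    /** Returns a future that yields the work's value, but not before minMillis have passed. */
    static <T> CompletableFuture<T> withMinimumDuration(CompletableFuture<T> work, long minMillis) {
        // A timer future that completes after the minimum duration.
        CompletableFuture<Void> timer = CompletableFuture.runAsync(
            () -> { },
            CompletableFuture.delayedExecutor(minMillis, TimeUnit.MILLISECONDS));
        // The combined future needs both inputs, so slow work adds no extra delay.
        return work.thenCombine(timer, (value, ignored) -> value);
    }
}
```

A fast result is held back until the timer fires, while a result that takes longer than the minimum is delivered as soon as it arrives.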
Use Observable.concatEager()
It allows you to force one stream to complete after another (concat operator), but also kick off the network request immediately without having to wait for the first argument observable to complete (concatEager):
Observable<Response> responseObservable = ...;
Observable<Response> responseWithMinDelay = Observable.concatEager(
Observable.timer(1, TimeUnit.SECONDS).ignoreElements(),
responseObservable
).cast(Response.class);
It looked like Observable.zip would be a reasonable approach, and it seemed to work well until there was an error emitted; then it didn't wait for the expected time.
This seemed to work well for me:
Observable.mergeDelayError(
        useCase.execute(), // can return Unit or throw error
        Observable.timer(1, TimeUnit.SECONDS)
    )
    .reduce { _, _ -> Unit }
    .doOnError { /* will wait at least 1 second */ }
    .subscribe { /* will wait at least 1 second */ }