Async processing of events in EventProcessorClient - java

We are migrating from an old SDK version to version 4 of the Azure EventHub SDK, and we are wondering whether we still need to implement asynchronous handling of events ourselves. We are creating the EventProcessorClient with the EventProcessorClientBuilder. So my question: is the onEvent method called asynchronously in this example or not?
new EventProcessorClientBuilder()
    .connectionString(connectionString)
    .checkpointStore(new BlobCheckpointStore(blobContainer))
    .processEvent(this::onEvent)
    .buildEventProcessorClient();
I dug deep into the library and found that an EventHubConsumerAsyncClient is built in the PartitionPumpManager, but I am not 100% certain.
If it is asynchronous, is there any way to limit the maximum number of tasks and/or to set a timeout for a single task?

The invocation of 'onEvent' is asynchronous, i.e., it won't block the main thread.
As shown below, there are four different (mutually exclusive) ways to register the callback to receive events. Each takes the callback to deliver events as the first parameter:
processEvent(Consumer<EventContext> onEvent);
processEvent(Consumer<EventContext> onEvent, Duration timeout);
processEventBatch(Consumer<EventBatchContext> onBatchOfEvents, int batchSize);
processEventBatch(Consumer<EventBatchContext> onBatchOfEvents, int batchSize, Duration timeout);
Option 2 has the timeout as a second parameter, i.e., how long to wait for an event to be available.
Options 3 and 4 allow the application to register a callback to handle a batch of events; the second parameter indicates the desired number of events in each batch. The callback will be invoked with an EventBatchContext object, and the batch can be accessed through EventBatchContext::getEvents().
Additionally, Option 4 allows specifying how long to wait for a batch of size N (where N is the second parameter) to be delivered to the callback. Upon timeout timer expiration, if there are only M events (where M < N), the batch with M elements will be delivered to the callback. If no events are available within the timeout, then the EventBatchContext object delivered to callback returns an empty list when its getEvents() is called.
I think Option 3 or 4 is what you're looking for. We recommend them because invoking the callback once per event has a slight overhead.

Related

How to batch process events based on time or after specific amount

I have an event system that I can subscribe to for when a specific object is changed. After receiving this event, I want to execute a task for this object.
It is possible that multiple objects are changed at the same time. E.g. if I change 1000 objects I get 1000 events. The problem is that it takes way longer for the task I want to execute to process 1 object 1000 times than 1000 objects 1 time. I cannot change the way the events are generated.
So what I thought about is to batch up these events when I receive them. E.g. collect 1000 items in a queue and then execute the task on all objects from the collected events.
The problem is: what happens when only 999 objects are changed? Then my task is never executed. So I also want to drain the queue e.g. 5 seconds after the first object was inserted.
Is there any library for this specific task? Or do I have to build this myself with a Queue and some logic to do the things I want?
I'm almost sure no library exists specifically for this. When I needed the same strategy for events, I created a queue (or repository) to store the events and started a ScheduledExecutorService with a task running at a fixed rate to consume them; if there were no events to consume, the task simply skipped that execution. You can also add a check to the store's add method: if the store holds 1000 or more events that haven't been processed yet, fire the task right away.
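A minimal sketch of that idea using only the JDK is shown here. EventBatcher and all of its member names are hypothetical, not part of any library. Unlike the fixed-rate task described above, this variant starts a one-shot timer when the first event of a batch arrives, which matches the "5 seconds after the first object was inserted" requirement from the question. The handler runs while the lock is held, so a slow handler delays add(); a real implementation would probably hand the batch to a separate worker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not from a library): hands a batch to the handler once
 * maxSize events are buffered, or maxDelayMillis after the first buffered event.
 */
class EventBatcher<T> implements AutoCloseable {
    private final int maxSize;
    private final long maxDelayMillis;
    private final Consumer<List<T>> handler;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private List<T> buffer = new ArrayList<>();
    private ScheduledFuture<?> pendingFlush;

    EventBatcher(int maxSize, long maxDelayMillis, Consumer<List<T>> handler) {
        this.maxSize = maxSize;
        this.maxDelayMillis = maxDelayMillis;
        this.handler = handler;
    }

    /** Buffers one event; the first event of a batch starts the timer. */
    public synchronized void add(T event) {
        buffer.add(event);
        if (buffer.size() == 1) {
            pendingFlush = scheduler.schedule(() -> flush(), maxDelayMillis, TimeUnit.MILLISECONDS);
        }
        if (buffer.size() >= maxSize) {
            flush();
        }
    }

    /** Delivers whatever is buffered (if anything) and cancels the pending timer. */
    public synchronized void flush() {
        if (pendingFlush != null) {
            pendingFlush.cancel(false);
            pendingFlush = null;
        }
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = buffer;
        buffer = new ArrayList<>();
        handler.accept(batch);
    }

    @Override
    public void close() {
        flush();
        scheduler.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        EventBatcher<Integer> batcher =
                new EventBatcher<>(3, 500, batch -> System.out.println("Processing " + batch));
        for (int i = 1; i <= 5; i++) {
            batcher.add(i); // events 1-3 flush by size, 4-5 flush by timer
        }
        Thread.sleep(1000);
        batcher.close();
    }
}
```

With a batch size of 1000 and a delay of 5000 ms, 999 changed objects would still be processed five seconds after the first event arrived.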

Designing non-real time, non-blocking, result-dependent system

Context:
1) We have a scheduler which picks up jobs and processes them by making a REST call to another system in a blocking manner.
2) The scheduler thread needs to wait for the REST call to complete and in turn do another task based upon the result.
3) There is no constraint for this to be real time.
Problem Statement:
1) What we want is to free scheduler threads as soon as an external call is made, because the external call takes significant time to complete.
2) We should be informed about the result received from the external call as we need to do some processing based on the result.
Idea in my mind:
1) Rather than calling the external system using a synchronous HTTP call, we can push the event to the queue.
2) An API consumer in the other system will read the event from the queue and do the long-running task. After processing, it pushes the result back to the queue on a different topic.
3) Our system now can read the response from the queue(second topic) and do the necessary actions.
This is one of the design approaches that comes to my mind. I need advice on whether we can improve the design somehow.
1) Can this be done without introduction of queue ?
2) Is there any better way to achieve the asynchronous processing ?
If you want to avoid using a queue, I can think of 2 other alternatives. Taking the steps of your idea in turn:
Instead of your step 1 ("Rather than calling the external system using synchronous Http call, we can push the event to the queue"):
alternative a)
you do a synchronous HTTP GET to tell the other system that you want a certain job to be executed (the other system replies quickly with a "200 OK" to confirm that it received the request).
alternative b)
you do a synchronous HTTP GET to tell the other system that you want a certain job to be executed (the other system replies quickly with a "200 OK" and a unique ID to identify the job to be executed).
Instead of your steps 2 and 3 (the API consumer reads the event from the queue, does the long-running task and pushes the result to a second topic, which your system then reads):
alternative a)
upon receiving the request, the other system performs the long-running computation, and when the result is ready it makes a synchronous HTTP call to your original system to inform it that the job is done.
alternative b)
upon receiving the request, the other system performs the long-running computation.
the original system doesn't know whether the job is done, so it polls at certain intervals (doing a synchronous HTTP GET to a different REST API) providing the job ID, to find out whether the job is ready.
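As a rough illustration of alternative b), the sketch below submits a job, gets an ID back, and then polls on a timer instead of blocking a scheduler thread. JobClient, JobPoller, and their methods are hypothetical names standing in for the other system's REST API; the in-memory fake in main replaces real HTTP calls, and error handling and an overall deadline are left out.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Polls a (hypothetical) remote job API until the job reports a result. */
class JobPoller {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final JobClient client;

    JobPoller(JobClient client) {
        this.client = client;
    }

    /** Submits the job and returns immediately; the future completes when a poll finds the result. */
    CompletableFuture<String> submitAndAwait(String payload, long pollIntervalMillis) {
        String jobId = client.submit(payload);
        CompletableFuture<String> result = new CompletableFuture<>();
        ScheduledFuture<?> polling = scheduler.scheduleWithFixedDelay(() -> {
            try {
                client.poll(jobId).ifPresent(result::complete);
            } catch (RuntimeException e) {
                result.completeExceptionally(e);
            }
        }, pollIntervalMillis, pollIntervalMillis, TimeUnit.MILLISECONDS);
        // Stop polling as soon as the result (or a failure) is known.
        result.whenComplete((value, error) -> polling.cancel(false));
        return result;
    }

    void shutdown() {
        scheduler.shutdown();
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger polls = new AtomicInteger();
        // In-memory stand-in for the other system: the job is "ready" on the third poll.
        JobClient fake = new JobClient() {
            public String submit(String payload) {
                return "job-1";
            }
            public Optional<String> poll(String jobId) {
                return polls.incrementAndGet() < 3
                        ? Optional.empty()
                        : Optional.of("done:" + jobId);
            }
        };
        JobPoller poller = new JobPoller(fake);
        poller.submitAndAwait("payload", 50)
              .thenAccept(r -> System.out.println("Result: " + r))
              .get(5, TimeUnit.SECONDS);
        poller.shutdown();
    }
}

/** Hypothetical client for the other system's REST API (names are illustrative only). */
interface JobClient {
    String submit(String payload);        // returns a job ID right away
    Optional<String> poll(String jobId);  // empty while the job is still running
}
```

Alternative a) would replace the polling timer with an endpoint on the original system that completes the same future when the other system calls back.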

Reactive event processing with retrying in Java/Groovy

I would like to implement a microservice which, after receiving a request (via message queue), will try to execute it via REST/SOAP calls to external services. On success the reply should be sent back via MQ, but on failure the request should be rescheduled for later execution (using some custom algorithm like 10 seconds, 1 minute, 10 minutes, timeout - give up). After a specified amount of time the failure message should be sent back to the requester.
It should run on Java 8 and/or Groovy. Event persistence is not required.
First I thought about Executor and Runnable/Future together with ScheduledExecutorService.scheduleWithFixedDelay, but it looks too low-level to me. The second idea was actors with Akka and Scheduler (for rescheduling), but I'm sure there could be some other approaches.
Question. What technique would you use for reactive event processing with an ability to reschedule them on failure?
"Event" is quite a fuzzy term, but most of the definitions I have met describe it as one of the techniques of Inversion of Control. It is characterized by the fact that you don't care WHEN and BY WHOM some piece of code will be called, but ON WHAT CONDITION. That means that you invert (or more precisely "lose") control over the execution flow.
Now, you want event-driven processing (so you don't want to handle WHEN and BY WHOM), yet you want to specify TIMED (so strictly connected to WHEN) behaviour on failure. This is some kind of paradox to me.
I'd say you would do better to use callbacks for reactive programming, and on failure just start a new thread that sleeps for 10 seconds and re-runs the callback.
In the end I found the library async-retry, which was written just for this purpose. It lets you asynchronously retry the execution in a very customizable way. Internally it leverages ScheduledExecutorService and CompletableFuture (or ListenableScheduledFuture from Guava when Java 7 has to be used).
Sample usage (from the project web page):
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
RetryExecutor executor = new AsyncRetryExecutor(scheduler).
    retryOn(SocketException.class).
    withExponentialBackoff(500, 2).     //500ms times 2 after each retry
    withMaxDelay(10_000).               //10 seconds
    withUniformJitter().                //add between +/- 100 ms randomly
    withMaxRetries(20);

final CompletableFuture<Socket> future = executor.getWithRetry(() ->
    new Socket("localhost", 8080)
);

future.thenAccept(socket ->
    System.out.println("Connected! " + socket)
);
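For comparison, the same building blocks (ScheduledExecutorService and CompletableFuture) can express the fixed delay schedule from the question without a library. This is only a sketch under assumptions: DelayedRetry and its methods are hypothetical names, every exception is treated as retryable, and the task runs on the scheduler thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: retries a task after each listed delay, then gives up. */
class DelayedRetry {
    private final ScheduledExecutorService scheduler;
    private final long[] delaysMillis;

    DelayedRetry(ScheduledExecutorService scheduler, long... delaysMillis) {
        this.scheduler = scheduler;
        this.delaysMillis = delaysMillis.clone();
    }

    /** Runs the task asynchronously; the future fails only after all retries are used up. */
    <T> CompletableFuture<T> call(Callable<T> task) {
        CompletableFuture<T> result = new CompletableFuture<>();
        scheduler.execute(() -> attempt(task, 0, result));
        return result;
    }

    private <T> void attempt(Callable<T> task, int failuresSoFar, CompletableFuture<T> result) {
        try {
            result.complete(task.call());
        } catch (Exception e) {
            if (failuresSoFar < delaysMillis.length) {
                // Reschedule instead of sleeping, so no thread is blocked while waiting.
                long delay = delaysMillis[failuresSoFar];
                scheduler.schedule(() -> attempt(task, failuresSoFar + 1, result),
                        delay, TimeUnit.MILLISECONDS);
            } else {
                result.completeExceptionally(e); // out of retries: report the last failure
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Short delays for the demo; the question's schedule would be 10_000, 60_000, 600_000.
        DelayedRetry retry = new DelayedRetry(scheduler, 100, 200, 400);
        AtomicInteger calls = new AtomicInteger();
        CompletableFuture<String> reply = retry.call(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("external service unavailable");
            }
            return "reply after " + calls.get() + " attempts";
        });
        System.out.println(reply.get(5, TimeUnit.SECONDS));
        scheduler.shutdown();
    }
}
```

The success reply or the final failure message can then be sent back over MQ from thenAccept or exceptionally on the returned future.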

What's the effect on a second request of calling Thread.currentThread().sleep(2000) in a Spring MVC request handler?

I need to wait for a condition in a Spring MVC request handler while I call a third party service to update some entities for a user.
The wait averages about 2 seconds.
I'm calling Thread.sleep to allow the remote call to complete and for the entities to be updated in the database:
Thread.currentThread().sleep(2000);
After this, I retrieve the updated models from the database and display the view.
However, what will be the effect on parallel requests that arrive for processing at this controller/request handler?
Will parallel requests also experience a wait?
Or will they be spawned off into separate threads and so not be affected by the delay experienced by the current request?
What you are doing may work sometimes, but it is not a reliable solution.
The Java Future interface, along with a configured ExecutorService allows you to begin some operation and have one or more threads wait until the result is ready (or optionally until a certain amount of time has passed).
You can find documentation for it here:
http://download.oracle.com/javase/6/docs/api/java/util/concurrent/Future.html
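A minimal sketch of that approach follows. FutureTimeoutDemo and awaitResult are hypothetical names, and the Callable stands in for the third-party call. The calling thread waits only until the result is actually ready, or gives up after the timeout, instead of sleeping for a fixed two seconds.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureTimeoutDemo {

    /** Runs the call on the pool and waits for its result, but no longer than timeoutMillis. */
    static <T> T awaitResult(ExecutorService pool, Callable<T> remoteCall, long timeoutMillis)
            throws Exception {
        Future<T> future = pool.submit(remoteCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker if it is still running
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Stand-in for the third-party update, assumed here to take about 200 ms.
            String status = awaitResult(pool, () -> {
                Thread.sleep(200);
                return "entities updated";
            }, 5000);
            System.out.println(status);
        } finally {
            pool.shutdown();
        }
    }
}
```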

Time restricted service

I'm developing an app that makes requests to the Musicbrainz webservice. I read in the Musicbrainz manual that clients must not make more than one request per second to the webservice, or the client IP will be blocked.
What architecture do you suggest in order to make this restriction transparent to the service client?
I would like to call a method (getAlbuns for example) and it should only make the request 1 sec after the last request.
I also want to issue 10 requests at once, and the service should handle the queueing, returning the results when available (non-blocking).
Thanks!
Because of the required delay between invocations, I'd suggest a java.util.Timer or java.util.concurrent.ScheduledThreadPoolExecutor. Timer is very simple, and perfectly adequate for this use case. But if additional scheduling requirements are identified later, a single Executor could handle all of them. In either case, use a fixed-delay method, not a fixed-rate method.
The recurring task polls a concurrent queue for a request object. If there is a pending request, the task executes it, and returns the result via a callback. The query for the service and the callback to invoke are members of the request object.
The application keeps a reference to the shared queue. To schedule a request, simply add it to the queue.
Just to clarify, if the queue is empty when the scheduled task is executed, no request is made. The simple approach would be just to end the task, and the scheduler will invoke the task one second later to check again.
However, this means that it could take up to one second to start a task, even if no requests have been processed lately. If this unnecessary latency is intolerable, writing your own thread is probably preferable to using Timer or ScheduledThreadPoolExecutor. In your own timing loop, you have more control over the scheduling if you choose to block on an empty queue until a request is available. Note also that java.util.Timer isn't guaranteed to wait a full second after the previous execution finished: its fixed-delay executions are scheduled relative to the start time of the previous execution. (ScheduledThreadPoolExecutor.scheduleWithFixedDelay, by contrast, measures the delay from the end of one execution to the start of the next.)
If this second case is what you have in mind, your run() method will contain a loop. Each iteration starts by blocking on the queue until a request is received, then recording the time. After processing the request, the time is checked again. If the time difference is less than one second, sleep for the remainder. This setup assumes that the one-second delay is required between the start of one request and the next. If the delay is required between the end of one request and the next, you don't need to check the time; just sleep for one second.
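The loop described above might look roughly like this. ThrottledWorker, Request, and the service function are hypothetical names; the Function passed to the constructor stands in for the real web service call, and the interval is measured from the start of one request to the start of the next.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical worker: executes queued requests at most once per minIntervalMillis. */
class ThrottledWorker implements Runnable {

    /** A query plus the callback that receives its result. */
    static final class Request {
        final String query;
        final Consumer<String> callback;

        Request(String query, Consumer<String> callback) {
            this.query = query;
            this.callback = callback;
        }
    }

    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<>();
    private final long minIntervalMillis;
    private final Function<String, String> service; // stand-in for the real web service call

    ThrottledWorker(long minIntervalMillis, Function<String, String> service) {
        this.minIntervalMillis = minIntervalMillis;
        this.service = service;
    }

    /** Non-blocking for the caller: the result arrives later through the callback. */
    void submit(String query, Consumer<String> callback) {
        queue.add(new Request(query, callback));
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Request request = queue.take(); // blocks while the queue is empty
                long start = System.nanoTime();
                request.callback.accept(service.apply(request.query));
                long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                if (elapsedMillis < minIntervalMillis) {
                    Thread.sleep(minIntervalMillis - elapsedMillis); // sleep for the remainder
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // treat interruption as a shutdown signal
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ThrottledWorker worker = new ThrottledWorker(1000, query -> "albums for " + query);
        Thread thread = new Thread(worker, "throttled-worker");
        thread.start();
        for (String artist : new String[] {"A", "B", "C"}) {
            worker.submit(artist, result -> System.out.println(result));
        }
        Thread.sleep(3500); // let the three requests go out, one per second
        thread.interrupt();
    }
}
```

A request submitted to an idle worker starts immediately, which avoids the up-to-one-second latency of the polling timer.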
One more thing to note is that the service might be able to accept multiple queries in a single request, which would reduce overhead. If it does, take advantage of this by blocking on take() for the first element, then using poll(), perhaps with a very short blocking time (5 ms or so), to see if the application is making any more requests. If so, these can be bundled up in a single request to the service. If queue is a BlockingQueue<? extends Request>, it might look something like this:
Collection<Request> bundle = new ArrayList<Request>();
bundle.add(queue.take());
while (bundle.size() < BUNDLE_MAX) {
    Request req = queue.poll(EXTRA, TimeUnit.MILLISECONDS);
    if (req == null)
        break;
    bundle.add(req);
}
/* Now make one service request with contents of "bundle". */
You need to define a local "proxy service" which your local clients will call.
The local proxy will receive requests and pass them on to the real service, but only at the rate of one message per second.
How you do this depends very much on the technology available to you.
The simplest would be a multithreaded Java service with a static, synchronised long timestamp variable such as LastRequestTime (although you would need some code acrobatics to keep your requests in sequence).
A more sophisticated service could have worker threads receiving the requests and placing them on a queue, with a single thread picking up the requests and passing them on to the real service.
