Performance of WebFlux with retry exponential backoff

Performance of WebFlux with retry exponential backoff - java

I'm building a webhook service that sends events to client's URLs.
In case of failure or timeout, I need to retry the sending with with exponential backoff. I've two ways to implement the logic:
Using WebClient's internal retry feature:
WebClient.create()
.post()
.uri(URL)
.exchange()
...
.retryWhen(Retry.backoff(4, Duration.ofSeconds(3)).jitter(0.7));
Another way is using rabbitMQ deadLetterExchange to re-queue the message with exponential backoof as mentioned here:
https://www.baeldung.com/spring-amqp-exponential-backoff
Applying WebFlux internal retry feature is easy for development but I've some concerns over using it because messages are stored in application memory and it may affect the performance when the number of messages are high.
I want to know other developers thoughts on these options.
Also, is there any better options?

In my opinion these two approaches can be used together.
The retry strategy is the simplest way to manage very transient error. BUT we cannot retry indefinitely.
An infinite retry may cause memory issue. But not only! When you shutdown a server, you probably wish to stop incoming traffic first and expect there is no computation/transaction during the shutting down ...
So we need another strategy for messages that failed after some retries. And a Dead Letter Queue is a valid option.
In the real life we may have several DQL, then the default one is used for the unexpected cases. Obviously we cannot hope to have a good algorithm in that cases.

Related

Cassandra Retry Policy for NoHostAvailableException [duplicate]

I am using the Datastax Cassandra driver and have a RetryPolicy setup to retry when a host is unavailable. However, I have noticed that it retries as fast as it can. I would like to change it to have an increasing delay between retries rather than hammer the cluster if it is struggling. This is particularly important for OVERLOADED request errors since I do want to retry in these scenarios, but with a substantial delay.
Where is the right place to put a delay and what is the right mechanism? Should I just throw a Thread.sleep(...) in my RetryPolicy?
I don't mind taking up a request on-the-wire slot (towards the maximum number of in-flight requests) but I am not okay with completely blocking other writes if we are not yet at the in-flight request limit.

You can implement your own retry policy by adding a delay. The simplest way is to pick the source code of the default retry and modify it yourself to implement an exponential delay for retry or something similar.
For exponential delay, just look at the source code of http://docs.datastax.com/en/drivers/java/3.0/com/datastax/driver/core/policies/ExponentialReconnectionPolicy.html to see how it works

How to speed up the consume time of ibm jms message

My application being as ibm client to consume the message which sent by ibm MQ server. But sometimes they will sent big number of messages(e.g:50000). But our client application can not "eat" the message so quickly.
what i'v been tried:
Use caching connection factory, But not help too much.
org.springframework.jms.connection.CachingConnectionFactory
I can't open multiple threads for the listener to speed up the consuming speed(Currently is set to 1) because of our business requirement.
Thanks in advanced!
Edit:
For each message processing time is like: (e.g:0:00:00.079) But wait to start process next message will take long time (e.g:0:00:00.534)

Consider the transactional and persistence requirements of the messages.
There are a number of options within MQ that could be enabled here to speed up delivery.
MQ is optimized for either persistent/transactional or non-persistent/non-transactional workloads. Don't mix them, so for example sending persistent messages in a non-transactional session.
If you are using non-persistent/non-transactional messaging then look into the READ_AHEAD options to stream messages down to the client.
In additional ensure that selectors are not in use.

If the client implementation is negotiable, look at sending aggregate messages that combine individual messages, especially if the business logic can adapt to handle them together before (for example) saving something to a database.
The only legitimate "business" reason that you can't have multiple listener threads is because of event/workflow ordering and the chance of processing two related messages concurrently rather than sequentially. However, perhaps it's possible to redesign the client so that messages are segregated by the sender, using JMS properties of some sort, and then have each listener filter by various properties. As long as all related events/messages get the same property, you might be able to have multiple listeners.
Not ideal, but if you made the listener stateful so that you knew when to rollback event B because related event A is currently getting processed, that might work. Difficult to do well with more overhead processing. Better yet, figure out a way to process messages out-of-order and yet get the correct answer in the end.
Ultimately, for large number of messages, you really need to figure out how to get n listeners because otherwise you might never catch up and, worse case, continue to grow your backlog.

Akka system from a QA perspective

I had been testing an Akka based application for more than a month now. But, if I reflect upon it, I have following conclusions:
Akka actors alone can achieve lot of concurrency. I have reached more than 100,000 messages/sec. This is fine and it is just message passing.
Now, if there is netty layer for connections at one end or you end up with akka actors eventually doing DB calls, REST calls, writing to files, the whole system doesn't make sense anymore. The actors' mailbox gets full and their throughput(here, ability to receive msgs/sec) goes slow.
From a QA perspective, this is like having a huge pipe in which you can forcefully pump lot of water and it can handle. But, if the input hose is bad, or the endpoints cannot handle the pressure, this huge pipe is of no use.
I need answers for the following so that I can suggest or verify in the system:
Should the blocking calls like DB calls, REST calls be handled by actors? Or they good only for message passing?
Can it be like, lets say you have the need of connecting persistently millions of android/ios devices to your akka system. Instead of sockets(so unreliable) etc., can remote actor be implemented as a persistent connection?
Is it ok to do any sort of computation in actor's handleMessage()? Like DB calls etc.
I would request this post to get through by the editors. I cannot ask all of these separately.

1) Yes, they can. But this operation should be done in separate (worker) actor, that uses fork-join-pool in combination with scala.concurrent.blocking around the blocking code, it needs it to prevent thread starvation. If target system (DB, REST and so on) supports several concurrent connections, you may use akka's routers for that (creating one actor per connection in pool). Also you can produce several actors for several different tables (resources, queues etc.), depending on your transaction isolation and storage's consistency requirements.
Another way to handle this is using asynchronous requests with acknowledges instead of blocking. You may also put the blocking operation inside some separate future (thread, worker), which will send acknowledge message at the operation's end.
2) Yes, actor may be implemented as a persistence connection. It will be just an actor, which holds connection's state (as actors are stateful). It may be even more reliable using Akka Persistence, which can save connection to some storage.
3) You can do any non-blocking computations inside the actor's receive (there is no handleMessage method in akka). The failures (like no connection to DB) will be managing automatically by Akka Supervising. For the blocking code, see 1.
P.S. about "huge pipe". The backend-application itself is a pipe (which is becoming huge with akka), so nothing can help you to improve performance if environement can't handle it - there is no pumps in this world. But akka is also a "water tank", which means that outer pressure may be stronger than inner. Btw, it means that developer should be careful with mailboxes - as "too much water" may cause OutOfMemory, the way to prevent that is to organize back pressure. It can be done by not acknowledging incoming message (or simply blocking an endpoint's handler) til it proceeded by akka.

I'm not sure I can understand all of your question, but in general actors are good also for slow work:
1) Yes, they are perfectly fine. Just create/assign 1 actor per every request (maybe behind an akka router for load balancing), and once it's done it can either mark itself as "free for new work" or self-terminate. Remember to execute the slow code in a future. Personally, I like avoiding the ask/pipe pattern due to the implicit timeouts and exception swallowing, just use tells with request id's, but if your latencies and error rates are low, go for ask/pipe.
2) You could, but in that case I'd suggest having a pool of connections rather than spawning them per-request, as that takes longer. If you can provide more details, I can maybe improve this answer.
3) Yes, but think about this: actors are cheap. Create millions of them, every time there is a blocking part, it should be a different, specialized actors. Bring single-responsibility to the extreme. If you have few, blocking actors, you lose all the benefits.

Hystrix: Custom circuit breaker and recovery logic

I just read the Hystrix guide and am trying to wrap my head around how the default circuit breaker and recovery period operate, and then how to customize their behavior.
Obviously, if the circuit is tripped, Hystrix will automatically call the command's getFallBack() method; this much I understand. But what criteria go into making the circuit tripped in the first place? Ideally, I'd like to try hitting a backing service several times (say, a max of 3 attempts) before we consider the service to be offline/unhealthy and trip the circuit breaker. How could I implement this, and where?
But I imagine that if I override the default circuit breaker, I must also override whatever mechanism handles the default recovery period. If a backing service goes down, it could be for any one of several reasons:
There is a network outage between the client and server
The service was deployed with a bug that makes it incapable of returning valid responses to the client
The client was deployed with a bug that makes it incapable of sending valid requests to the server
Some weird, momentary service hiccup (perhaps the service is doing a major garbage collection, etc.)
etc.
In most of these cases, it is not sufficient to have a recovery period that merely waits N seconds and then tries again. If the service has a bug in it, or if someone pulled some network cables in the data center, we will always get failures from this service. Only in a small number of cases will the client-service automagically heal itself without any human interaction.
So I guess my next question is partially "How do I customize the default recovery period strategy?", but I guess it is mainly: "How do I use Hystrix to notify devops when a service is down and requires manual intervention?"

there are basically four reasons for Hystrix to call the fallback method: an exception, a timeout, too many parallel requests, or too many exceptions in the previous calls.
You might want to do a retry in your run() method if the return code or the exception you receive from your service indicate that a retry makes sense.
In your fallback method of the command you might retry when there was a timeout - when there where too many parallel requests or too many exceptions it usually makes no sense to call the same service again.
As also asked how to notify devops: You should connect a monitoring system to Hystrix that polls the status of the circuit breaker and the ratio of successful and unsuccessful calls. You can use the metrics publishers provided, JMX, or write your own adapter using Hystrix' API. I've written two adapters for Riemann and Zabbix in a tutorial I prepared; you'll new very few lines of code for that.
The tutorial also has a example application and a load driver to try out some scenarios.
Br,
Alexander.

Is MQ publish/subscribe domain-specific interface generally faster than point-to-point?

I'm working on the existing application that uses transport layer with point-to-point MQ communication.
For each of the given list of accounts we need to retrieve some information.
Currently we have something like this to communicate with MQ:
responseObject getInfo(requestObject){
code to send message to MQ
code to retrieve message from MQ
}
As you can see we wait until it finishes completely before proceeding to the next account.
Due to performance issues we need to rework it.
There are 2 possible scenarios that I can think off at the moment.
1) Within an application to create a bunch of threads that would execute transport adapter for each account. Then get data from each task. I prefer this method, but some of the team members argue that transport layer is a better place for such change and we should place extra load on MQ instead of our application.
2) Rework transport layer to use publish/subscribe model.
Ideally I want something like this:
void send (requestObject){
code to send message to MQ
}
responseObject receive()
{
code to retrieve message from MQ
}
Then I will just send requests in the loop, and later retrieve data in the loop. The idea is that while first request is being processed by the back end system we don't have to wait for the response, but instead send next request.
My question, is it going to be a lot faster than current sequential retrieval?

The question title frames this as a choice between P2P and pub/sub but the question body frames it as a choice between threaded and pipelined processing. These are two completely different things.
Either code snippet provided could just as easily use P2P or pub/sub to put and get messages. The decision should not be based on speed but rather whether the interface in question requires a single message to be delivered to multiple receivers. If the answer is no then you probably want to stick with point-to-point, regardless of your application's threading model.
And, incidentally, the answer to the question posed in the title is "no." When you use the point-to-point model your messages resolve immediately to a destination or transmit queue and WebSphere MQ routes them from there. With pub/sub your message is handed off to an internal broker process that resolves zero to many possible destinations. Only after this step does the published message get put on a queue where, for the remainder of it's journey, it then is handled like any other point-to-point message. Although pub/sub is not normally noticeably slower than point-to-point the code path is longer and therefore, all other things being equal, it will add a bit more latency.
The other part of the question is about parallelism. You proposed either spinning up many threads or breaking the app up so that requests and replies are handled separately. A third option is to have multiple application instances running. You can combine any or all of these in your design. For example, you can spin up multiple request threads and multiple reply threads and then have application instances processing against multiple queue managers.
The key to this question is whether the messages have affinity to each other, to order dependencies or to the application instance or thread which created them. For example, if I am responding to an HTTP request with a request/reply then the thread attached to the HTTP session probably needs to be the one to receive the reply. But if the reply is truly asynchronous and all I need to do is update a database with the response data then having separate request and reply threads is helpful.
In either case, the ability to dynamically spin up or down the number of instances is helpful in managing peak workloads. If this is accomplished with threading alone then your performance scalability is bound to the upper limit of a single server. If this is accomplished by spinning up new application instances on the same or different server/QMgr then you get both scalability and workload balancing.
Please see the following article for more thoughts on these subjects: Mission:Messaging: Migration, failover, and scaling in a WebSphere MQ cluster
Also, go to the WebSphere MQ SupportPacs page and look for the Performance SupportPac for your platform and WMQ version. These are the ones with names beginning with MP**. These will show you the performance characteristics as the number of connected application instances varies.

It doesn't sound like you're thinking about this the right way. Regardless of the model you use (point-to-point or publish/subscribe), if your performance is bounded by a slow back-end system, neither will help speed up the process. If, however, you could theoretically issue more than one request at a time against the back-end system and expect to see a speed up, then you still don't really care if you do point-to-point or publish/subscribe. What you really care about is synchronous vs. asynchronous.
Your current approach for retrieving the data is clearly synchronous: you send the request message, and wait for the corresponding response message. You could do your communication asynchronously if you simply sent all the request messages in a row (perhaps in a loop) in one method, and then had a separate method (preferably on a different thread) monitoring the incoming topic for responses. This would ensure that your code would no longer block on individual requests. (This roughly corresponds to option 2, though without pub/sub.)
I think option 1 could get pretty unwieldly, depending on how many requests you actually have to make, though it, too, could be implemented without switching to a pub/sub channel.

The reworked approach will use fewer threads. Whether that makes the application faster depends on whether the overhead of managing a lot of threads is currently slowing you down. If you have fewer than 1000 threads (this is a very, very rough order-of-magnitude estimate!), i would guess it probably isn't. If you have more than that, it might well be.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.