REST API request timeout for services that are always slow - Java

I am working on a project where I have to call a third-party REST service. The problem with the current setup is that the service takes at least 16 seconds to respond, and the response time can exceed that.
To avoid threads waiting on the server, my service has a timeout value of 16 seconds, but that value is not helping. I searched on this and found that the circuit breaker pattern might be useful. Reference: spring-boot-rest-api-request-timeout. I believe this pattern is useful when the service responds slowly only occasionally; in my case, the service is always slow.
How can I tackle this scenario?

If you want the response from the third-party REST service, you have no choice but to wait. But if your request method has other things to do, you can submit the REST call as a Callable on a separate thread, let the main thread complete its other work first, and then wait for the Callable to come back, as in the sketch below.
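A minimal sketch of that idea using an ExecutorService; callThirdParty() and doOtherWork() are hypothetical placeholders for your REST call and the rest of your handler's work:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ThirdPartyCaller {

    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    public String handleRequest() throws Exception {
        // Submit the slow REST call to a worker thread; it starts running immediately.
        Future<String> future = executor.submit(this::callThirdParty);

        // The main thread is free to do its other work in the meantime.
        doOtherWork();

        // Block only when the result is actually needed, with an upper bound on the wait.
        return future.get(30, TimeUnit.SECONDS);
    }

    private String callThirdParty() { /* the ~16s REST call */ return "response"; }

    private void doOtherWork() { /* other processing */ }
}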
Maybe you can try to use a cache, such as @Cacheable or Redis, for this scenario. It may speed up similar repeated requests.
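If you go the caching route, a rough sketch with Spring's cache abstraction could look like this (the bean, cache name, and URL are illustrative, and it assumes @EnableCaching on a configuration class):

import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class SlowServiceClient {

    private final RestTemplate restTemplate = new RestTemplate();

    // Identical keys are served from the "thirdParty" cache after the first slow call.
    @Cacheable("thirdParty")
    public String fetch(String key) {
        return restTemplate.getForObject("https://slow.example.com/api/{key}", String.class, key);
    }
}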
Or just let your request method send a response back to the client first, and then use AJAX from the client side to access the third-party REST service.

Related

Java web service: return a response once a threshold time is reached and continue execution after that

In a Java web service, I have a requirement to return the response to the user once a configured threshold time is reached, and to continue processing after that.
Let's say the service performs step 1 and step 2, and the configured threshold is 1 second. If step 1 completes at 1 second, I want to return an acknowledgment response to the user, continue processing with step 2, and store the result in a DB or something like that.
Please let me know if anyone has any solutions or thoughts on this problem.
There are multiple ways to achieve this.
HTTP Layer
On the HTTP layer, if the response comes back before the threshold, then I'd be tempted to send back a 200 OK.
However, if it takes more time than the threshold, you could use 202 Accepted.
Looking at the RFC, its use case looks like this:
6.3.3. 202 Accepted
The 202 (Accepted) status code indicates that the request has been
accepted for processing, but the processing has not been completed.
The request might or might not eventually be acted upon, as it might
be disallowed when processing actually takes place. There is no
facility in HTTP for re-sending a status code from an asynchronous
operation.
The 202 response is intentionally noncommittal. Its purpose is to
allow a server to accept a request for some other process (perhaps a
batch-oriented process that is only run once per day) without
requiring that the user agent's connection to the server persist
until the process is completed. The representation sent with this
response ought to describe the request's current status and point to
(or embed) a status monitor that can provide the user with an
estimate of when the request will be fulfilled.
Now, of course, instead of having a mix of 200 and 202, you could just return 202 every time.
Application Layer
In your application layer, you'll typically want to make use of asynchronous processing for this purpose.
There are multiple ways to leverage this way of working; you can:
Post a message on a queue/topic and let a message broker dispatch it to another part of the app, or to another app, and let that part do the processing
Save the request inside of a database, and have another service poll the database for new requests; similar to the queueing explained above, but without JMS
If you're using Java EE, your EJB container lets you use @Asynchronous, which calls a method asynchronously and returns immediately (so you'll be able to return 202)
If you're using Spring, it has an @Async annotation for the same purpose as above
There are definitely other methods you could use to achieve this use case, but I think the ones I presented are the most common ones
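To illustrate the Spring @Async option combined with a 202, here is a minimal sketch (OrderController, OrderService, and the endpoint are hypothetical names; it assumes @EnableAsync on a configuration class):

import org.springframework.http.ResponseEntity;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
class OrderController {

    private final OrderService orderService;

    OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @PostMapping("/orders")
    public ResponseEntity<Void> submit(@RequestBody String order) {
        orderService.process(order);              // returns immediately
        return ResponseEntity.accepted().build(); // 202 Accepted
    }
}

@Service
class OrderService {

    @Async // runs on a separate executor thread after the response is sent
    public void process(String order) {
        // long-running processing, then store the result in the DB
    }
}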

Handle a high number of traditional synchronous/blocking HTTP client requests with few threads in Java?

I am working with Java. Another software developer has provided me his code performing synchronous HTTP calls and is responsible for maintaining it; he is using com.google.api.client.http. Updating his code to use an asynchronous HTTP client with a callback is not an available option, and I can't contact the developer to make changes to it. But I still want the efficient asynchronous behaviour of attaching a callback to an HTTP request.
(I am working in Spring Boot and my system is built using RabbitMQ AMQP if it has any effect.)
The simple HTTP GET (it is actually an API call) is performed as follows:
HttpResponse<String> response = httpClient.send(request, BodyHandlers.ofString());
The server I'm communicating with via HTTP takes some time to reply, say 3-4 seconds. So my thread of execution is blocked for this duration, waiting for a reply. This scales very poorly: my thread isn't doing anything, it's just waiting for a reply to arrive, which is very wasteful.
Sure, I can increase the number of threads performing this call if I want to send more HTTP requests concurrently, i.e. I can scale that way, but this doesn't sound efficient or correct. If possible, I would really like a better ratio than one thread waiting on one HTTP request in this situation.
In other words, I want to send thousands of HTTP requests with 2-3 available threads and handle each response once it arrives; I don't want to incur any significant delay between the execution of each request.
I was wondering: how can I achieve a more scalable solution? How can I handle thousands of these HTTP calls per thread? What should I be looking at, or do I just have no options and am asking for the impossible?
EDIT: I guess this is another way to phrase my problem. Assume I have 1000 requests to be sent right now, each will last 3-4 seconds, but only 4-5 available threads of execution on which to send them. I would like to send them all at the same time, but that's not possible; if I manage to send them ALL within the span of 0.5 s or less and handle their responses via some callback or something like that, I would consider that a great solution. But I can't switch to an asynchronous HTTP client library.
Using an asynchronous HTTP client is not an available option - I can't change my HTTP client library.
In that case, I think you are stuck with non-scalable synchronous behavior on the client side.
The only work-around I can think of is to run your requests as tasks in an ExecutorService with a bounded thread pool. That will limit the number of threads that are used ... but will also limit the number of simultaneous HTTP requests in play. This is replacing one scaling problem with another one: you are effectively rate-limiting your HTTP requests.
But the flip-side is that launching too many simultaneous HTTP requests is liable to overwhelm the target service(s) and / or the client or server-side network links. From that perspective, client-side rate limiting could be a good thing.
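A sketch of that work-around with java.net.http, matching the send() call quoted in the question (the requests list and handle() are placeholders):

import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpResponse.BodyHandlers;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedRequester {

    public static void run(HttpClient httpClient, List<HttpRequest> requests) throws Exception {
        // At most 8 requests are in flight at once; the rest queue up inside the executor.
        ExecutorService pool = Executors.newFixedThreadPool(8);

        List<Future<HttpResponse<String>>> futures = new ArrayList<>();
        for (HttpRequest request : requests) {
            futures.add(pool.submit(() -> httpClient.send(request, BodyHandlers.ofString())));
        }
        for (Future<HttpResponse<String>> future : futures) {
            handle(future.get()); // blocks until that particular request completes
        }
        pool.shutdown();
    }

    private static void handle(HttpResponse<String> response) { /* process the response */ }
}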
Assume I have 1000 requests to be sent right now, each will last 3-4 seconds, but only 4-5 available threads of execution on which to send them. I would like to send them all at the same time, but that's not possible; if I manage to send them ALL within the span of 0.5 s or less and handle their responses via some callback or something like that, I would consider that a great solution. But I can't switch to an asynchronous HTTP client.
The only way you are going to be able to run > N requests at the same time with N threads is to use an asynchronous client. Period.
And "... callback or something like that ...". That's a feature you will only get with an asynchronous client. (Or more precisely, you can only get real asynchronous behavior via callbacks if there is a real asynchronous client library under the hood.)
So the solution is akin to sending the HTTP requests in a staggered manner, i.e. with some delay between one request and another, where the delay is governed by the number of available threads? If the delay between requests is not significant, I can find that acceptable, but I assume there would be a rather large delay, since each thread has to wait for its previous request to finish (3-4 s)? In that case, it's not what I want.
With my proposed work-around, the delay between any two requests is difficult to quantify. However, if you are trying to submit a large number of requests at the same time and wait for all of the responses, then the delay between individual requests is not relevant. For that scenario, the relevant measure is the time taken to complete all of the requests. Assuming that nothing else is submitting to the executor, the time taken to complete the requests will be approximately:
nos_requests * average_request_time / nos_worker_threads
For example, 1000 requests at 3.5 s each on 5 worker threads works out to roughly 700 s in total.
The other thing to note is that if you did manage to submit a huge number of requests simultaneously, the server delay of 3-4s per request is liable to increase. The server will only have the capacity to process a certain number of requests per second. If that capacity is exceeded, requests will either be delayed or dropped.
But if there are no other options, I suppose you could consider changing your server API so that you can submit multiple "requests" in a single HTTP request.
I think that the real problem here is there is a mismatch between what the server API was designed to support, and what you are trying to do with it.
And there is definitely a problem with this:
Another software developer has provided me his code performing synchronous HTTP calls and is responsible for maintaining it; he is using com.google.api.client.http. Updating his code to use an asynchronous HTTP client with a callback is not an available option, and I can't contact the developer to make changes to it.
Perhaps you need to "bite the bullet" and stop using his code. Work out what it is doing and replace it with your own implementation.
There is no magic pixie dust that will give scalable performance from a synchronous HTTP client. Period.

Asynchronous HTTP request in a RESTful web service

I have a Java RESTful web service, and one method in that service makes an HTTP request which takes around 3-4 minutes. I want to know if I can get any benefit from making that call asynchronous.
Could the thread be used by another request, or will it be blocked anyway by the main call?
Edit: I am making a request P to my web service A (a synchronous request only); that request is handled by thread T1. When request P calls the URL that takes 3-4 minutes, would I get benefits from making that call (to the URL that takes 3-4 minutes) asynchronous? Benefits like thread T1 being able to handle new requests?
If the answer is no, are there other benefits to making that call asynchronously?
It's not good to block an HTTP request for such a long time, because HTTP is synchronous.
Instead of blocking, it would be better to make the operation asynchronous and return 202 Accepted. For getting the result you have two choices:
polling (client periodically polls for a result)
callback (notify the client via a callback-url)
For further reading, look at this blog post: https://www.adayinthelifeof.nl/2011/06/02/asynchronous-operations-in-rest/ or Best way to create REST API for long lasting tasks?
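For the callback variant, the server-side worker would, once the long operation finishes, POST the result to the callback-url the client registered. A minimal sketch with java.net.http (callbackUrl and resultJson are placeholders):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CallbackNotifier {

    private final HttpClient http = HttpClient.newHttpClient();

    // Called by the worker when the 3-4 minute operation completes.
    public void notifyClient(String callbackUrl, String resultJson) throws Exception {
        HttpRequest notify = HttpRequest.newBuilder(URI.create(callbackUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(resultJson))
                .build();
        http.send(notify, HttpResponse.BodyHandlers.ofString());
    }
}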

How to handle asynchronous operations in REST

I need to understand what approaches there are for handling asynchronous operations in REST, and what their advantages and disadvantages are. Some approaches I found:
Resource based: the status of an operation is modeled as a resource. The user makes an async REST call (PUT, POST, etc.) and gets an Accepted or In-Progress response (202). A status URI is then polled repeatedly via GET to check the status/progress/messages of the operation execution.
Question: How long should this resource stay active on the server? If the client polls at large intervals and the operation completes in between, how do we return the status? It seems like persisting the execution status would work, but for how long should it be persisted, and when should it be archived/deleted? Is this kind of approach standard?
Callback based: an async request is required to include a callback URI. The request is processed asynchronously and, upon completion, a call is made to the callback URI with the operation status/result.
Question: This seems more elegant, with less overhead on the server side. But how do we handle scenarios where the callback server is intermittently down, not responding, etc.? Implement typical retries, where the callback URI provides retry configuration as well? Is there any other downside to this approach?
Servlet 3.0 asynchronous support: an HTTP client makes a connection to a Java servlet, which remains open until it is explicitly closed; until it is closed, client and server can communicate asynchronously over it.
Question: Since this is the Servlet 3.0 spec, I think Jersey and the Spring REST implementation don't utilize this approach as of now. Is there any specific REST implementation which utilizes a similar approach, or pointers on ways to make it possible?
Any other approaches, maybe commercial ones?
Spring 3.2+ supports the async features of Servlet 3.0. From the Spring blog:
you can make any existing controller method asynchronous by changing it to return a Callable. For example a controller method that returns a view name can return Callable<String> instead. An @ResponseBody method that returns an object of type Person can return Callable<Person> instead. And the same is true for any other controller return value type.
Jersey 2+ also supports asynchronous servers. See the Asynchronous Services and Clients chapter in the reference docs.
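A minimal sketch of the Callable style described in the quote above (the endpoint and body are illustrative):

import java.util.concurrent.Callable;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ReportController {

    @GetMapping("/report")
    public Callable<String> report() {
        // The servlet container thread is released immediately; the Callable
        // runs on a Spring-managed executor and completes the response later.
        return () -> {
            Thread.sleep(3000); // stand-in for the slow downstream call
            return "report-body";
        };
    }
}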
I think the approach depends on the time gap between the initial request and the end of the operation.
For short operations (< 10 s) I would just keep the request open and return the response when the operation finishes;
For long operations (< 30 min) I would use Servlet 3.0 async support or the Comet model;
For extremely long operations (hours, days), plain client-side polling, or Comet with big timeouts, is good enough in my view.
I'm dealing with the same situation now, and found the common approach of using the Location response header to point at a resource that can be monitored to check status (by polling, of course). That seems to be the best approach, but in my case I'm not creating a resource, so I don't have a location at which to check the status (my async process just builds a cache page).
You can always use your own headers to give an estimated time for the operation to complete. Anyway, I'm thinking of using the Retry-After header to give an estimated time. What do you guys think?
I know this is old, but I thought I'd chime in to say that if you want something that can scale out in a stateless environment, you should go with your first option. You can perform the underlying operation anywhere, and if you put the result in something like Redis, it won't matter which web server the client's subsequent polling requests reach. I'll usually put the polling interval in the response I send to the client. When a result is ready, I return the client a 303 See Other that includes the id of the result in the URI.
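Tying the last few answers together, here is a hedged Spring sketch of the polling flow, using Location, Retry-After, and a 303 once the result exists (startJob() and the in-memory map are placeholders for real background processing and shared storage such as Redis):

import java.net.URI;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class JobController {

    private final Map<String, String> results = new ConcurrentHashMap<>();

    @PostMapping("/jobs")
    public ResponseEntity<Void> submit() {
        String jobId = UUID.randomUUID().toString();
        startJob(jobId); // kick off the background work
        return ResponseEntity.accepted()                 // 202 Accepted
                .location(URI.create("/jobs/" + jobId))  // status resource to poll
                .header("Retry-After", "5")              // suggested polling interval (seconds)
                .build();
    }

    @GetMapping("/jobs/{id}")
    public ResponseEntity<Void> status(@PathVariable String id) {
        if (!results.containsKey(id)) {
            // Still running: tell the client to come back later.
            return ResponseEntity.accepted().header("Retry-After", "5").build();
        }
        // Done: 303 See Other points at the result resource.
        return ResponseEntity.status(HttpStatus.SEE_OTHER)
                .location(URI.create("/results/" + id))
                .build();
    }

    private void startJob(String jobId) { /* enqueue/execute, then results.put(jobId, result) */ }
}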

Synchronous, Asynchronous and Command Client Requests with GWT and GAE

In designing my GWT/GAE app, it has become evident to me that my client-side (GWT) will be generating three types of requests:
Synchronous - "answer me right now! I'm important and require a real-time response!!!"
Asynchronous - "answer me when you can; I need to know the answer at some point, but it's really not all that urgent."
Command - "I don't need an answer. This isn't really a request, it's just a command to do something or process something on the server-side."
My game plan is to implement my GWT code so that I can specify, for each specific server-side request (note: I've decided to go with RequestFactory over traditional GWT-RPC for reasons outside the scope of this question), which type of request it is:
SynchronousRequest - Synchronous (from above); sends a command and eagerly awaits a response that it then uses to update the client's state somehow
AsynchronousRequest - Asynchronous (from above); makes an initial request and, either through polling or the GAE Channel API, is notified when the response is finally received
CommandRequest - Command (from above); makes a server-side request and does not wait for a response (even if the server fails to, or refuses to, oblige the command)
I guess my intention with SynchronousRequest is not to produce a totally blocking request; however, it may block the user's ability to interact with a specific widget or portion of the screen.
The added kicker here is this: GAE strongly enforces a timeout on all of its frontend instances (60 seconds). Backend instances have much more relaxed constraints for timeouts, threading, etc. So it is obvious to me that AsynchronousRequests and CommandRequests should be routed to backend instances so that GAE timeouts do not become an issue with them.
However, if GAE is behaving badly, or if we're hitting peak traffic, or if my code just plain sucks, I have to account for the scenario where a SynchronousRequest is made (which would have to go through a timeout-regulated frontend instance) and will time out unless my GAE server code does something fancy. I know there is a method in the GAE API that I can call to see how many milliseconds a request has before it's about to time out; although its name escapes me right now, it's what this "fancy" code would be based on. Let's call it public static long GAE.timeLeftOnRequestInMillis() for the sake of this question.
In this scenario, I'd like to detect that a SynchronousRequest is about to time out, and somehow dynamically convert it into an AsynchronousRequest so that it doesn't. Perhaps this means sending an AboutToTimeoutResponse back to the client and forcing the client to decide whether to resend as an AsynchronousRequest or just fail. Or perhaps we can just transform the SynchronousRequest into an AsynchronousRequest and push it onto a queue where a backend instance will consume it, process it, and return a response. I don't have any preferences when it comes to implementation, so long as the request doesn't fail or time out because the server couldn't handle it fast enough (because of GAE-imposed regulations).
So then, here is what I'm actually asking here:
How can I wrap a RequestFactory call inside SynchronousRequest, AsynchronousRequest and CommandRequest in such a way that the RequestFactory call behaves the way each of them is intended? In other words, so that the call either partially-blocks (synchronous), can be notified/updated at some point down the road (asynchronous), or can just fire-and-forget (command)?
How can I implement my requirement to let a SynchronousRequest bypass GAE's 60-second timeout and still get processed without failing?
Please note: timeout issues are easily circumvented by re-routing things to backend instances, but backends don't/can't scale. I need scalability here as well (that's primarily why I'm on GAE in the first place!) - so I need a solution that deals with scalable frontend instances and their timeouts. Thanks in advance!
If the computation that you want GAE to do is going to take longer than 60 seconds, then don't wait for the results to be computed before sending a response. According to your problem definition, there is no way to get around this. Instead, clients should submit work orders, and wait for a notification from the server when the results are ready. Requests would consist of work orders, which might look something like this:
class ComputeDigitsOfPiWorkOrder {
    // parameters for the computation
    int numberOfDigitsToCompute;

    // Used by the GAE app to contact the requester when results are ready.
    ClientId clientId;
}
This way, your GAE app can respond as soon as the work order is saved (e.g. in Task Queue), and doesn't have to wait until it actually finishes calculating a billion digits of pi before responding. Your GWT client then waits for the result using the Channel API.
In order to give some work orders higher priority, you can use multiple task queues. If you want Task Queue work to scale automatically, you'll want to use push queues. Implementing priority using push queues is a little tricky, but you can configure high-priority queues to have a faster feed rate.
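For instance, enqueueing the work order might look roughly like this with the App Engine Task Queue API (the queue name and worker URL are hypothetical; the queue itself would be configured in queue.xml):

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class WorkOrderSubmitter {

    public void submit(int numberOfDigitsToCompute, String clientId) {
        Queue queue = QueueFactory.getQueue("pi-high-priority");
        // A push queue dispatches the task to the /worker/pi handler, which does the
        // long computation and notifies the client over the Channel API when done.
        queue.add(TaskOptions.Builder
                .withUrl("/worker/pi")
                .param("numberOfDigitsToCompute", Integer.toString(numberOfDigitsToCompute))
                .param("clientId", clientId));
    }
}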
You could replace the Channel API with some other notification solution, but the Channel API is probably the most straightforward.
