How to create a non-blocking @RestController webservice in Spring?

I have a @RestController webservice method that might block the response thread with a long-running service call, as follows:
@RestController
public class MyRestController {
    // could be another webservice API call, a long-running database query, whatever
    @Autowired
    private SomeSlowService service;

    @GetMapping()
    public Response get() {
        return service.slow();
    }

    @PostMapping()
    public Response post() {
        return service.slow();
    }
}
Problem: what if X users are calling my service here? The executing threads will all block until the response is returned, thus eating up "max-connections", max threads, etc.
I remember that some time ago I read an article on how to solve this issue by parking threads somehow until the slow service response is received, so that those threads won't block e.g. the Tomcat max connections/pool.
But I cannot find it anymore. Maybe somebody knows how to solve this?

There are a few solutions, such as working with asynchronous requests. In those cases, the request thread becomes free again as soon as the CompletableFuture, DeferredResult, Callable, ... is returned (and not necessarily completed).
For example, let's say we configure Tomcat like this:
server.tomcat.max-threads=5 # Default = 200
And we have the following controller:
@GetMapping("/bar")
public CompletableFuture<String> getSlowBar() {
    return CompletableFuture.supplyAsync(() -> {
        silentSleep(10000L);
        return "Bar";
    });
}

@GetMapping("/baz")
public String getSlowBaz() {
    logger.info("Baz");
    silentSleep(10000L);
    return "Baz";
}
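If we fire 100 requests at once at /baz, you have to wait at least 200 seconds before all the getSlowBaz() calls are handled, since only 5 can be handled at a given time (100 / 5 = 20 batches of 10 seconds each). With the asynchronous request, on the other hand, you have to wait at least 10 seconds, because all requests will likely be accepted at once, and the request thread is then available for others to use. How close you get to 10 seconds depends on the pool that runs the futures: supplyAsync() without an executor argument uses ForkJoinPool.commonPool(), whose size is tied to the number of CPU cores.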
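The arithmetic above can be reproduced outside Spring with a plain fixed thread pool. The following is only a sketch under assumptions: the class and method names are made up for illustration, the pool stands in for Tomcat's request threads, and the sleep is scaled down from seconds to milliseconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class RequestWaveDemo {

    /** Lower bound on total time when every request pins one thread for its whole duration. */
    static long lowerBoundMillis(int requests, int threads, long millisPerRequest) {
        long waves = (requests + threads - 1) / threads; // ceiling division
        return waves * millisPerRequest;
    }

    /** Runs the requests on a fixed pool and returns the measured wall-clock time in ms. */
    static long measureMillis(int requests, int threads, long millisPerRequest) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                futures.add(pool.submit(() -> {
                    try {
                        Thread.sleep(millisPerRequest); // stands in for the slow service call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait until every "request" has been served
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

For the numbers in this answer, lowerBoundMillis(100, 5, 10_000) gives 200 000 ms, and the same call with 100 threads gives 10 000 ms. measureMillis lets you compare that bound with an actual run on your machine.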
Is there a difference between CompletableFuture, Callable and DeferredResult? There isn't any difference result-wise; they all behave similarly.
The way you have to handle threading is a bit different though:
With Callable, you rely on Spring executing the Callable using a TaskExecutor
With DeferredResult, you have to do the thread handling yourself, for example by executing the logic within the ForkJoinPool.commonPool().
With CompletableFuture, you can either rely on the default thread pool (ForkJoinPool.commonPool()) or you can specify your own thread pool.
Other than that, CompletableFuture and Callable are part of the Java standard library (java.util.concurrent), while DeferredResult is part of the Spring framework.
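To make the CompletableFuture option concrete, here is a minimal JDK-only sketch of both variants. The pool name, its size and the class name are assumptions chosen for illustration. Each future simply returns the name of the thread that ran it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ExecutorChoiceDemo {

    // Hypothetical dedicated pool for slow calls; the size 4 is an arbitrary choice.
    static final ExecutorService SLOW_CALL_POOL = Executors.newFixedThreadPool(4, runnable -> {
        Thread t = new Thread(runnable, "slow-call-worker");
        t.setDaemon(true); // do not keep the JVM alive just for this sketch
        return t;
    });

    /**
     * No executor argument: runs on ForkJoinPool.commonPool(), or on a fresh thread
     * per task on machines where the common pool has a parallelism below two.
     */
    static CompletableFuture<String> onDefaultPool() {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName());
    }

    /** Explicit executor argument: runs on the pool you control. */
    static CompletableFuture<String> onOwnPool() {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName(), SLOW_CALL_POOL);
    }
}
```

A dedicated pool is usually the safer choice for blocking work, because the common pool is shared with everything else in the JVM that uses it, such as parallel streams.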
Be aware though: even though threads are released, connections to the client are still kept open. This means that with both approaches, the maximum number of requests that can be handled at once is limited by the connection limit, which is 10000 by default and can be configured with:
server.tomcat.max-connections=100 # Default = 10000

In my opinion, async may be better for the server in general, but for this particular API it does not work well: the clients still hold their connections, so in the end it will still eat up "max-connections". Instead, you can send the request to a message queue (e.g. Kafka) and return success to the client right away. A consumer then picks up the request and passes it to the slow service.
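The shape of that idea can be sketched with an in-memory queue. This is not Kafka: the BlockingQueue below is only a stand-in for the broker, and all names are hypothetical. A real setup would also need persistence and a way for clients to fetch the result later.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class AcceptThenProcessDemo {

    // In-memory stand-in for the message broker (assumption; not a Kafka client).
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    final List<String> processed = new CopyOnWriteArrayList<>();

    AcceptThenProcessDemo() {
        Thread consumer = new Thread(this::drain, "queue-consumer");
        consumer.setDaemon(true);
        consumer.start();
    }

    /** What the controller would do: enqueue the request and answer immediately. */
    String accept(String request) {
        queue.add(request);
        return "accepted";
    }

    /** Consumer loop: only this thread waits on the slow service. */
    private void drain() {
        try {
            while (true) {
                String request = queue.take();
                Thread.sleep(50); // stands in for the slow service call
                processed.add(request);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Helper for trying the sketch: waits until `count` requests have been processed. */
    boolean awaitProcessed(int count, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (processed.size() < count) {
            if (System.nanoTime() > deadline) {
                return false;
            }
            Thread.yield();
        }
        return true;
    }
}
```

The client connection is released as soon as accept() returns, which is the point of this approach. The price is that the client no longer gets the result in the same HTTP exchange.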

Related

How to turn a Mono into a truly asynchronous (not reactive!) method call?

I have a method
@Service
public class MyService {
    public Mono<Integer> processData() {
        ... // very long reactive operation
    }
}
In the normal program flow, I call this method asynchronously via a Kafka event.
For testing purposes I need to expose the method as a web service, but the method should be exposed as asynchronous: returning only HTTP code 200 OK ("request accepted") and continuing the data processing in the background.
Is it OK (= doesn't it have any unwanted side effects) just to call Mono#subscribe() and return from the controller method?
@RestController
@RequiredArgsConstructor
public class MyController {
    private final MyService service;

    @GetMapping
    public void processData() {
        service.processData()
            .subscribeOn(Schedulers.boundedElastic())
            .subscribe();
    }
}
Or is it better to do it like this (here I am confused by the warning from IntelliJ, maybe the same as https://youtrack.jetbrains.com/issue/IDEA-276018 ?):
public Mono<Void> processData() {
service.processData()
.subscribeOn(Schedulers.boundedElastic())
.subscribe(); // IntelliJ complains "Inappropriate 'subscribe' call" but I think it's a false alarm in my case(?)
return Mono.empty();
}
Or some other solution?

Is it OK (= doesn't it have any unwanted side effects) just to call Mono#subscribe() and return from the controller method?
There are side effects, but you may be ok living with them:
It truly is fire and forget - which means while you'll never be notified about a success (which most people realise), you'll also never be notified about a failure (which far fewer people realise.)
If the process hangs for some reason, that publisher will never complete, and you'll have no way of knowing. Since you're subscribing on the bounded elastic threadpool, it'll also tie up one of those limited threads indefinitely too.
The first point you might be fine with, or you might want to put some error logging further down that reactive chain as a side-effect somehow so you at least have an internal notification if something goes wrong.
For the second point - I'd recommend putting a (generous) timeout on your method call so it at least gets cancelled if it hasn't completed in a set time, and is no longer hanging around consuming resources. If you're running an asynchronous task, then this isn't a massive issue as it'll just consume a bit of memory. If you're wrapping a blocking call on the elastic scheduler then this is worse however, as you're tying up a thread in that threadpool indefinitely.
I'd also question why you need to use the bounded elastic scheduler at all here - it's used for wrapping blocking calls, which doesn't seem to be the foundation of this use case. (To be clear, if your service is blocking then you should absolutely wrap it on the elastic scheduler - but if not then there's no reason to do so.)
Finally, this example:
public Mono<Void> processData() {
service.processData()
.subscribeOn(Schedulers.boundedElastic())
.subscribe();
return Mono.empty();
}
...is a brilliant example of what not to do, as you're creating a kind of "imposter reactive method" - someone may very reasonably subscribe to that returned publisher thinking it will complete when the underlying publisher completes, which obviously isn't what's happening here. Using a void return type and thus not returning anything is the correct thing to do in this scenario.

Your option with the following code is actually ok:
@GetMapping
public void processData() {
    service.processData()
        .subscribeOn(Schedulers.boundedElastic())
        .subscribe();
}
This is actually what you do in a @Scheduled method, which simply returns nothing and where you explicitly subscribe to the Mono or Flux so that elements are emitted.

Spring boot Limit number of concurrent invocations of a specific API in a controller

I have a Spring Boot (v1.5.15) based RESTful application that provides user-based services, particularly login and get user details.
The login activity is slightly heavy, whereas the get user details API is pretty lightweight.
I have a controller akin to this:
@RestController
public class UserController {
    @PostMapping("/login")
    public LoginResponse userLogin(@RequestBody LoginRequest loginRequest) {
        ...
    }

    @GetMapping("/users/{id}")
    public LoginResponse userIdGet(@PathVariable("id") String id) {
        ...
    }
}
Is there any way I could limit the number of concurrent calls to the /login API? Basically I want to limit this to, say, x, as /users/{id} can handle around 10x that number of calls with the same resources.
The application uses the embedded Tomcat server, and I know of server.tomcat.max-connections, server.tomcat.max-threads and server.tomcat.min-spare-threads. However, these restrict the calls at the application level rather than per API.

There are solutions which limit the number of active connections, see e.g. https://dzone.com/articles/how-to-limit-number-of-concurrent-user-session-in .
However, afaik, such solutions just reject further requests.
If you do not want to reject requests, you might limit the concurrent work done by using an application-wide fixed thread pool ExecutorService ( https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool(int) ), submitting your request body to that thread pool and immediately calling get on the returned Future.
So you can replace
@PostMapping("/api/xyzMethod")
public Response xyzMethod(@RequestBody Request request) {
    return handleXyzMethod(request);
}
by
@PostMapping("/api/xyzMethod")
public Response xyzMethod(@RequestBody Request request) throws InterruptedException, ExecutionException {
    return xyzMethodExecutor.submit(() -> { return handleXyzMethod(request); }).get();
}
with some
private static ExecutorService xyzMethodExecutor = Executors.newFixedThreadPool(10);
A drawback is that the user might have to wait for the reply, and/or that multiple requests will fill the thread pool's queue until the service becomes (too) unresponsive. So maybe you have to endow this solution with some kind of timeout on the FutureTasks, or combine the two solutions (that is, also have a larger limit on the number of concurrent sessions).
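The cap that the fixed pool imposes can be observed with a small JDK-only sketch. Everything here is an assumption for illustration: the names, the limit of 2, and the handler, which stands in for the heavy login logic and records how many copies of itself run at the same time.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrencyCapDemo {

    static final int LIMIT = 2; // arbitrary cap chosen for the sketch

    static final ExecutorService LOGIN_EXECUTOR = Executors.newFixedThreadPool(LIMIT, r -> {
        Thread t = new Thread(r, "login-worker");
        t.setDaemon(true); // so the sketch does not keep the JVM alive
        return t;
    });

    static final AtomicInteger running = new AtomicInteger();
    static final AtomicInteger maxObserved = new AtomicInteger();

    /** Hypothetical heavy handler; records how many copies run at once. */
    static String handleLogin(String user) {
        int now = running.incrementAndGet();
        maxObserved.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(20); // stands in for the expensive work
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            running.decrementAndGet();
        }
        return "ok:" + user;
    }

    /** What the controller method would do: submit, then block on the result. */
    static String login(String user) {
        try {
            return LOGIN_EXECUTOR.submit(() -> handleLogin(user)).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Fires `callers` simultaneous calls and returns the highest concurrency seen. */
    static int maxConcurrency(int callers) {
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            String user = "user" + i;
            threads[i] = new Thread(() -> login(user));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return maxObserved.get();
    }
}
```

However many callers arrive, handleLogin never runs more than LIMIT times in parallel; the remaining calls wait in the executor's queue. Note that the calling (request) threads are still blocked in get() while they wait, which is the drawback described above.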

WLPs MicroProfile (FaultTolerance) Timeout Implementation does not interrupt threads?

I'm testing WebSphere Liberty's fault tolerance (MicroProfile) implementation. For that I made a simple REST service with a resource which sleeps for 5 seconds:
@Path("client")
public class Client {
    @GET
    @Path("timeout")
    public Response getClientTimeout() throws InterruptedException {
        Thread.sleep(5000);
        return Response.ok().entity("text").build();
    }
}
I call this client within the same application within another REST-service:
@Path("mpfaulttolerance")
@RequestScoped
public class MpFaultToleranceController {
    @GET
    @Path("timeout")
    @Timeout(4)
    public Response getFailingRequest() {
        System.out.println("start");
        // calls the 5-second resource; should time out
        Response response = ClientBuilder.newClient().target("http://localhost:9080").path("/resilience/api/client/timeout").request().get();
        System.out.println("hello");
        return response;
    }
}
Now I'd expect that the method getFailingRequest() would time out after 4 ms and throw an exception. The actual behaviour is that the application prints "start", waits 5 seconds until the client returns, prints "hello" and then throws an "org.eclipse.microprofile.faulttolerance.exceptions.TimeoutException".
I turned on further debug information:
<logging traceSpecification="com.ibm.ws.microprofile.*=all" />
in server.xml. From this I can see that the timeout is registered even before the client is called! But the thread is not interrupted.
(if someone tells me how to get the stacktrace pretty in here... I can do that.)
Since this a very basic example: Am I doing anything wrong here? What can I do to make this example run properly.
Thanks
Edit: Running this example on WebSphere Application Server 18.0.0.2/wlp-1.0.21.cl180220180619-0403 on Java HotSpot(TM) 64-Bit Server VM, Version 1.8.0_172-b11 (de_DE) with the features webProfile-8.0, mpFaultTolerance-1.0 and localConnector-1.0.
Edit: Solution, thanks to Andy McCright and Azquelt.
Since the call cannot be interrupted, I have to make it asynchronous. That gives you two threads: the first one invokes the second, which makes the call. The first thread will be interrupted, while the second remains until the call finishes. But now you can go on with failure handling, open the circuit and so on, to prevent further calls to the broken service.
@Path("mpfaulttolerance")
@RequestScoped
public class MpFaultToleranceController {
    @Inject
    private TestBase test;

    @GET
    @Path("timeout")
    @Timeout(4)
    public Response getFailingRequest() throws InterruptedException, ExecutionException {
        Future<Response> resp = test.createFailingRequestToClientAsynch();
        return resp.get();
    }
}
And the client call:
@ApplicationScoped
public class TestBase {
    @Asynchronous
    public Future<Response> createFailingRequestToClientAsynch() {
        Response response = ClientBuilder.newClient().target("http://localhost:9080").path("/resilience/api/client/timeout").request().get();
        return CompletableFuture.completedFuture(response);
    }
}
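The two-thread arrangement can be illustrated with the JDK alone. This sketch does not use MicroProfile: a bounded Future.get() plays the role of @Timeout, CompletableFuture.supplyAsync plays the role of @Asynchronous, and a busy-wait stands in for the uninterruptible client call. All names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class CallerTimeoutDemo {

    static final AtomicBoolean workerFinished = new AtomicBoolean(false);

    /** Stand-in for a call that ignores interrupts, such as a blocking socket read. */
    static String uninterruptibleCall(long millis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
        while (System.nanoTime() < deadline) {
            // busy-wait on purpose: the interrupt flag is never checked
        }
        workerFinished.set(true);
        return "text";
    }

    /** Returns the response, or "timed out" if the caller stops waiting first. */
    static String callWithTimeout(long callMillis, long timeoutMillis) {
        CompletableFuture<String> future =
                CompletableFuture.supplyAsync(() -> uninterruptibleCall(callMillis));
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The caller is free again; the second thread keeps running until the call ends.
            return "timed out";
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The caller gets control back after the timeout and can run its failure handling, but the worker thread stays occupied until the underlying call returns, so a client-side read timeout is still worth having.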

It does interrupt threads using Thread.interrupt(), but unfortunately not all Java operations respond to thread interrupts.
Lots of things do respond to interrupts by throwing an InterruptedException (like Thread.sleep(), Object.wait() and Future.get()) or by aborting the operation (implementations of InterruptibleChannel), but InputStreams and Sockets don't.
I suspect that you (or the library you're using to make the request) are using a Socket, which isn't interruptible, so you don't see your method return early.
It's particularly unintuitive because Liberty's JAX-RS client doesn't respond to thread interrupts, as Andy McCright mentioned. We're aware it's not a great situation and we're working on making it better.
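The difference is easy to see in a small JDK-only sketch. The class and method names are made up, and the busy loop is only a stand-in for a blocking socket read: like the socket read, it never looks at the interrupt flag.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class InterruptResponseDemo {

    /** Interrupts a thread in Thread.sleep(); returns true if the sleep was cut short. */
    static boolean sleepReactsToInterrupt() {
        AtomicBoolean interrupted = new AtomicBoolean(false);
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(10_000);
            } catch (InterruptedException e) {
                interrupted.set(true); // sleep() aborts with InterruptedException
            }
        });
        sleeper.start();
        sleeper.interrupt();
        join(sleeper);
        return interrupted.get();
    }

    /** Interrupts a thread that never checks the flag; returns true if it still ran to the end. */
    static boolean busyWorkIgnoresInterrupt() {
        AtomicBoolean ranToEnd = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(200);
            while (System.nanoTime() < deadline) {
                // stand-in for a blocking socket read: the interrupt flag is never consulted
            }
            ranToEnd.set(true);
        });
        worker.start();
        worker.interrupt();
        join(worker);
        return ranToEnd.get();
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Both methods return true: the sleeping thread is stopped early by the interrupt, while the second thread carries on as if nothing had happened. That second case is what you observe with the timeout and the JAX-RS client call.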

I had the same problem. For some URLs I consume, the Fault Tolerance timeout doesn't work.
In my case I use the MicroProfile RestClient. I solved my problem using readTimeout() of the RestClientBuilder:
MyRestClientClass myRestClientClass = RestClientBuilder.newBuilder().baseUri(uri).readTimeout(3L, TimeUnit.SECONDS).build(MyRestClientClass.class);
One advantage of using this Timeout control is that you can pass the timeout as a parameter.

Synchronous behavior when use methods of Java CompletableFuture

I am using Java's CompletableFuture like this in a Spring Boot @Service:
@Service
public class ProcessService {
    private static final ExecutorService EXECUTOR = Executors.newFixedThreadPool(3);

    @Autowired
    ChangeHistoryService changeHistoryService;

    public Attribute process(Attribute attribute) {
        // some code
        CompletableFuture.runAsync(() -> changeHistoryService.logChanges(attribute), EXECUTOR);
        return attribute;
    }
}
The process method is called from a method inside a @RestController:
@RestController
public class ProcessController {
    @Autowired
    ProcessService processService;

    @RequestMapping(value = "/processAttribute",
        method = {RequestMethod.POST},
        produces = {MediaType.APPLICATION_JSON_VALUE},
        consumes = {MediaType.APPLICATION_JSON_VALUE})
    public Attribute applyRules(@RequestBody Attribute attribute) {
        Attribute resultValue = processService.process(attribute);
        return resultValue;
    }
}
ChangeHistoryService::logChanges only saves some data to the database according to its parameter.
I have a microservice that makes a number of requests to this "/processAttribute" endpoint and prints all responses.
When I put a breakpoint in the logChanges method, the microservice waits on some requests but not all, which makes me think that ChangeHistoryService::logChanges does not always run asynchronously. If I don't supply runAsync with an ExecutorService, the microservice blocks on more requests, but still not all.
From what I understood, this is because the method that processes the request and the logChanges method share the same thread pool (ForkJoinPool?).
Anyway, as I have another ExecutorService, shouldn't logChanges run independently? Or is it something about how the IDE treats breakpoints on async tasks? I am using IntelliJ IDEA.

The problem was that the breakpoint suspended all threads, not only the thread running the logChanges method. I fixed this in IntelliJ IDEA by right-clicking the breakpoint and selecting the "Thread" option instead of "All".

You have a rather small threadpool, so it's no wonder that you can saturate it. The threads that process requests are not the same as the ones processing your CompletableFutures. One is an internal component of the server, and the second one is the one you explicitly created, EXECUTOR.
If you want to increase the asynchronousness, try giving EXECUTOR some more threads and see how the behaviour changes accordingly. Currently the EXECUTOR is a bottleneck, since there are far more threads available for requests to run in.
Note that by putting a breakpoint inside logChanges() you'll be blocking one thread in the pool, making it even more saturated.
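To see that the request path does not wait for the logging even when the whole pool is stuck, here is a JDK-only sketch. The names are hypothetical, String replaces Attribute, and a CountDownLatch plays the role of the breakpoint (in "Thread" suspend mode) that parks every pool thread inside logChanges.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FireAndForgetDemo {

    static final ExecutorService EXECUTOR = Executors.newFixedThreadPool(3, r -> {
        Thread t = new Thread(r, "history-worker");
        t.setDaemon(true); // so the sketch does not keep the JVM alive
        return t;
    });

    static final CountDownLatch gate = new CountDownLatch(1); // plays the role of the breakpoint
    static final AtomicInteger logged = new AtomicInteger();

    /** Stand-in for ChangeHistoryService.logChanges: parked until the gate opens. */
    static void logChanges(String attribute) {
        try {
            gate.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        logged.incrementAndGet();
    }

    /** Mirrors process(): schedules the logging and returns without waiting for it. */
    static String process(String attribute) {
        CompletableFuture.runAsync(() -> logChanges(attribute), EXECUTOR);
        return attribute;
    }

    /** Releases the parked tasks and waits until `expected` of them have finished. */
    static boolean openGateAndAwait(int expected, long timeoutMillis) {
        gate.countDown();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (logged.get() < expected) {
            if (System.nanoTime() > deadline) {
                return false;
            }
            Thread.yield();
        }
        return true;
    }
}
```

Ten calls to process() all return immediately, although only three tasks can be parked on pool threads and the other seven sit in the executor's queue. Nothing is logged until the gate opens, and the callers never notice.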

Jersey WS 2.0 @Suspended AsyncResponse, what does it do?

I am analyzing some Jersey 2.0 code and I have a question about how the following method works:
@Stateless
@Path("/mycoolstuff")
public class MyEjbResource {
    …
    @GET
    @Asynchronous //does this mean the method executes on child thread ?
    public void longRunningOperation(@Suspended AsyncResponse ar) {
        final String result = executeLongRunningOperation();
        ar.resume(result);
    }

    private String executeLongRunningOperation() { … }
}
Let's say I'm at a web browser and I type in www.mysite/mycoolstuff.
This will execute the method, but I don't understand what the AsyncResponse is used for, nor the @Asynchronous annotation. From the browser, how would I notice it's asynchronous? What would be the difference if I removed the annotation? Also, after reading the documentation, the purpose of the @Suspended annotation is not clear to me.
Is the @Asynchronous annotation simply telling the program to execute this method on a new thread? Is it a convenience for doing "new Thread(.....)"?
Update: this annotation relieves the server of hanging onto the request processing thread. Throughput can be better. Anyway, from the official docs:
Request processing on the server works by default in a synchronous processing mode, which means that a client connection of a request is processed in a single I/O container thread. Once the thread processing the request returns to the I/O container, the container can safely assume that the request processing is finished and that the client connection can be safely released including all the resources associated with the connection. This model is typically sufficient for processing of requests for which the processing resource method execution takes a relatively short time. However, in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used. In this model, the association between a request processing thread and client connection is broken. I/O container that handles incoming request may no longer assume that a client connection can be safely closed when a request processing thread returns. Instead a facility for explicitly suspending, resuming and closing client connections needs to be exposed. Note that the use of server-side asynchronous processing model will not improve the request processing time perceived by the client. It will however increase the throughput of the server, by releasing the initial request processing thread back to the I/O container while the request may still be waiting in a queue for processing or the processing may still be running on another dedicated thread. The released I/O container thread can be used to accept and process new incoming request connections.

@Suspended only makes a difference if you actually use the AsyncResponse it injects; otherwise it changes nothing.
Let's talk about its benefits:
@Suspended suspends the request until the AsyncResponse is resumed; by default no suspend timeout is set (AsyncResponse#NO_TIMEOUT). By itself, this does not mean that your request/response (I/O) thread becomes free and available for other requests.
Now assume you want your service to respond within a specific time, but the method you are calling from the resource does not guarantee a response time. How would you manage your service's response time then? In that case, you can set a suspend timeout on the AsyncResponse obtained via @Suspended, and even provide a fallback response when the time is exceeded.
Below is some sample code for setting a suspend timeout:
public void longRunningOperation(@Suspended AsyncResponse ar) {
    ar.setTimeoutHandler(customHandler);
    ar.setTimeout(10, TimeUnit.SECONDS);
    final String result = executeLongRunningOperation();
    ar.resume(result);
}
for more details refer this

The @Suspended annotation is added before an AsyncResponse parameter on the resource method to tell the underlying web server not to expect this thread to return a response for the remote caller:
@POST
public void asyncPost(@Suspended final AsyncResponse ar, ... <args>) {
    someAsyncMethodInYourServer(<args>, new AsyncMethodCallback() {
        @Override
        void completed(<results>) {
            // AsyncResponse is completed via resume(Object) ...
            ar.resume(Response.ok(<results>).build());
        }
        @Override
        void failed(Throwable t) {
            // ... or failed via resume(Throwable)
            ar.resume(t);
        }
    });
}
Rather, the AsyncResponse object is used by the thread that calls completed or failed on the callback object to return an 'ok' or throw an error to the client.
Consider using such asynchronous resources in conjunction with an async jersey client. If you're trying to implement a ReST service that exposes a fundamentally async api, these patterns allow you to project the async api through the ReST interface.
We don't create async interfaces because we have a process that takes a long time (minutes or hours) to run, but rather because we don't want our threads to ever sleep - we send the request and register a callback handler to be called later when the result is ready - from milliseconds to seconds later - in a synchronous interface, the calling thread would be sleeping during that time, rather than doing something useful. One of the fastest web servers ever written is single threaded and completely asynchronous. That thread never sleeps, and because there is only one thread, there's no context switching going on under the covers (at least within that process).

The @Suspended annotation makes the caller actually wait until your work is done. Let's say you have a lot of work to do on another thread. When you use Jersey's @Suspended, the caller just sits there and waits (so in a web browser they just see a spinner) until your AsyncResponse object returns data to it.
Imagine you had a really long operation and you want to do it on another thread (or multiple threads). Now we can have the user wait until we are done. Don't forget that in Jersey you'll need to add <async-supported>true</async-supported> to the Jersey servlet definition in web.xml to get it to work.
