We are writing some REST services using Jersey. Our service makes some underlying service calls that happen to be dead slow, which results in each request holding a thread for 3-4 seconds. While investigating, I came across Asynchronous Pages in .NET, which assigns a thread from the thread pool to each request, returns the thread to the pool once the I/O operation starts, and gets a new thread when the I/O operation finishes to do the rest of the processing.
Does anything similar exist in Jersey, so that we can serve more concurrent connections instead of holding one thread per connection until it completes? I don't want to POST a request, return a GUID, and then have the client keep polling for the status of the request, since I don't control the client code.
Thanks,
GG
Take a look at the Atmosphere Framework, specifically the atmosphere-jersey module, which brings asynchronous annotations to Jersey.
Take a look at one of the samples, or read this quick tutorial. atmosphere-jersey does exactly what you are looking for, without requiring you to manipulate threads or anything like that. Come to our mailing list in case you need more help.
(I am the Atmosphere Creator and Lead)
Related
I have an API (GET, REST) developed using Java and Quarkus. I want to understand the default mechanism for how this API handles multiple requests. Is there any queuing mechanism used by default? Is there any multithreading used by default?
Please help me understand this.
Quarkus became popular for its efficient use of resources and its benchmarks on heavily loaded systems. By default it uses two different kinds of threads:
I/O threads, also called event-loop threads
Worker threads
I/O threads, also called event-loop threads. These threads are responsible, among other things, for reading bytes from the HTTP request and writing bytes back to the HTTP response. The important point here is that these threads are usually not blocked at all.
The number of those I/O threads is described in the documentation:
The number of IO threads used to perform IO. This will be
automatically set to a reasonable value based on the number of CPU
cores if it is not provided. If this is set to a higher value than the
number of Vert.x event loops then it will be capped at the number of
event loops. In general this should be controlled by setting
quarkus.vertx.event-loops-pool-size, this setting should only be used
if you want to limit the number of HTTP io threads to a smaller number
than the total number of IO threads.
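For example, these pool sizes can be tuned in application.properties. A minimal sketch; the property names are the ones the quoted documentation refers to, and the values are purely illustrative:
# application.properties -- illustrative values only
# size of the Vert.x event-loop pool (the I/O threads)
quarkus.vertx.event-loops-pool-size=8
# cap on HTTP I/O threads; only set this if you want fewer than the event loops
quarkus.http.io-threads=4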
Worker threads. Here again a pool of threads is maintained by the system, and the system assigns a worker thread to execute some scheduled work for a request. Once that work is done, the thread can be used to execute some other task. These threads normally take over long-running tasks or blocking code.
The default number of threads of this type is 20 if not otherwise configured, as indicated by the documentation.
So, to sum up: a request in Quarkus will be executed either by some I/O thread or by some worker thread, and those threads are shared among other requests too. An I/O thread will normally take over non-blocking tasks that do not take long to execute. A worker thread will normally take over blocking tasks and long-running processes.
Taking the above into consideration, it makes sense that Quarkus configures many more worker threads in the worker thread pool than I/O threads in the I/O thread pool.
What is very important to take from the above information is the following:
A worker thread will serve a specific request (e.g. request1), and if during this it gets blocked on some I/O operation, it will keep waiting for that I/O in order to complete the request it serves. Only when this request is finished is the thread able to move on and serve some other request (e.g. request2).
An I/O thread, or event-loop thread, will serve a specific request (e.g. request1), and if during this it gets blocked on some I/O operation needed by that request, it will pause that request and continue by serving another request (e.g. request2). When the I/O of the first request completes, the scheduling algorithm hands request1 back to an event-loop thread, which continues it from where it left off.
Now someone may ask: since usually every request requires some kind of I/O operation, how can using an I/O thread yield better performance? In that case the programmer has two choices when declaring a Quarkus controller method to use an I/O thread:
Manually spawn, inside the controller method that is declared non-blocking, some other thread to do the blocking work, while the outer thread that serves the request remains an I/O thread (reading the HTTP request data and writing the HTTP response). The manually spawned thread can be a worker thread inside some service layer. This is a somewhat complicated approach.
Use some external library for I/O operations that works with the same approach as Quarkus's I/O threads. For example, for database operations the I/O could be handled by the hibernate-reactive library. This way the full benefits of the I/O approach can be achieved.
Some side notes
Considering that we are in the Java ecosystem, it is also useful to mention that the above architecture and resource efficiency are similar (though not exactly the same) to Spring Reactive (WebFlux).
But Quarkus is based on JAX-RS and by default provides this resource-efficient architecture, independently of whether you write reactive code or not. With Spring Boot, however, in order to get efficiency similar to Quarkus you have to use Spring Reactive (WebFlux).
If you use basic Spring Boot Web, the architecture is one thread per incoming request. A specific thread in that case is not able to switch between different requests; it needs to complete one request before it can handle the next.
Also, in Quarkus, making a controller method execute on an I/O thread is as simple as placing the @NonBlocking annotation on that method. The same goes for an endpoint method that needs to execute on a worker thread, with @Blocking.
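A minimal sketch of the two annotations on a JAX-RS resource, assuming the javax.ws.rs namespace of older Quarkus versions; the paths, sleep, and return values are purely illustrative:
import io.smallrye.common.annotation.Blocking;
import io.smallrye.common.annotation.NonBlocking;
import javax.ws.rs.GET;
import javax.ws.rs.Path;

@Path("/demo")
public class DemoResource {

    @GET
    @Path("/fast")
    @NonBlocking // executed on an I/O (event-loop) thread; must not block
    public String fast() {
        return "computed without blocking";
    }

    @GET
    @Path("/slow")
    @Blocking // executed on a worker thread; free to block
    public String slow() throws InterruptedException {
        Thread.sleep(2000); // simulated blocking work
        return "done";
    }
}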
In Spring Boot, however, switching between those two types of threads may mean switching from spring-boot-web to spring-boot-webflux and vice versa. spring-boot-web does now have some support, via Servlet 3, for optimizing its approach (there is an article with such an optimization), but this requires some programming effort and is not out-of-the-box functionality.
Here are two links which seem to contradict each other. I'd sooner trust the docs:
Link 1
Request processing on the server works by default in a synchronous processing mode
Link 2
It already is multithreaded.
My question:
Which is correct? Can it be both synchronous and multithreaded?
Why do the docs say the following?:
in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used
If the docs are correct, why is the default action synchronous? All requests in client-side JavaScript are asynchronous by default for the sake of user experience; it would make sense, then, for the default server-side behavior to be asynchronous too.
If the client does not need to serve requests in a specific order, then who cares how "EXPENSIVE" the operation is? Shouldn't all operations simply be asynchronous?
Request processing on the server works by default in a synchronous processing mode
Each request is processed on a separate thread. The processing is considered synchronous because the request holds that thread until the processing is finished.
It already is multithreaded.
Yes, the server (container) is multi-threaded. For each request that comes in, a thread is taken from the thread pool, and that thread is tied to the particular request.
in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used
Yes, so that we don't hold up the container thread. There are only so many threads in the container thread pool to handle requests. If we hold them all up with long-running requests, the container may run out of threads, blocking other requests from coming in. In asynchronous processing, Jersey hands the thread back to the container and handles the request processing itself in its own thread pool until the processing is complete, then sends the response up to the container, which can send it back to the client.
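As a rough sketch of what this looks like with Jersey 2's async API (assuming a Jersey 2 container and the javax.ws.rs namespace; callSlowBackend and the pool size are illustrative):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.container.AsyncResponse;
import javax.ws.rs.container.Suspended;

@Path("/slow")
public class SlowResource {
    // application-owned pool; the container thread is released as soon as get() returns
    private static final ExecutorService workers = Executors.newFixedThreadPool(50);

    @GET
    public void get(@Suspended final AsyncResponse asyncResponse) {
        workers.submit(() -> {
            String result = callSlowBackend(); // the slow downstream call
            asyncResponse.resume(result);      // writes the response back to the client
        });
    }

    private String callSlowBackend() {
        return "done"; // stand-in for the slow service call
    }
}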
If the client does not need to serve requests in a specific order, then who cares how "EXPENSIVE" the operation is.
Not really sure what the client has to do with anything here. Or at least in the context of how you're asking the question. Sorry.
Shouldn't all operations simply be asynchronous?
Not necessarily, if all the requests are quick. You could make an argument for it, but that would require performance testing, and numbers you can compare to make a decision from. Every system is different.
I have been doing Java for a few years, but I have not had much experience with asynchronous programming.
I am working on an application that makes SOAP web service calls to some synchronous web services, and currently the implementation of my consuming application is synchronous too, i.e. my application's threads block while waiting for the response.
I am trying to learn how to handle these SOAP calls in an asynchronous way, just for the hell of it, but I have some high-level questions I can't seem to find any answers to.
I am using CXF, but my question is not specifically about CXF or SOAP; it is higher-level, about asynchronous application architecture, I think.
What I want to know (working through a scenario), at a high level, is:
So I have a Thread (A) running in my JVM that makes a call to a remote web service
It registers a callback method and returns a Future
Thread (A) has done its bit and gets returned to its pool once it has returned the Future
The remote web service response returns and Thread (B) gets allocated and calls the callback method (which generally populates the Future with a result I believe)
Q1. I can't get my head out of the blocking-thread model: if Thread (A) is no longer listening on that network socket, how does the response that comes back from the remote service get allocated Thread (B)? Is it simply treated as a new request coming into the server/container, which then allocates a thread to service it?
Q2. Closely related to Q1, I imagine: if no thread has the Future or the handler (with its callback method) on its stack, how does the response from the remote web service get associated with the callback method it needs to call?
Or, to ask it another way, how does Thread (B) (now dealing with the response) get a reference to the Future/callback object?
Very sorry my question is so long - and thanks to anyone who gave their time to read through it! :)
I don't see why you'd add all this complexity using asynchronous threading.
The way to design an asynchronous SOAP service:
You have one service sending out a response to a given client / clients.
Those clients work on the response given asynchronously.
When done, they would call another soap method to return their response.
The response will just be stored in a queue (e.g. a database table), without any extra logic. You'd have a "Worker" service working on the incoming tasks. If a response is needed again, another method on the other remote service would be called. I would store the requests as events in the database, to be handled asynchronously later by an EventHandler. See
Hexagonal Architecture:
https://www.youtube.com/watch?v=fGaJHEgonKg
Your Q1 and Q2 seem to have more to do with multithreading than they have to do with asynchronous calls.
The magic of asynchronous web service calls is that you don't have to worry about multithreading to handle blocking while waiting for a response.
It's a bit unclear from the question what the specific problem statement is (i.e., what you are hoping to have your application do while blocking or rather than blocking), but here are a couple ways that you could use asynchronous web service calls that will allow you to do other work.
For the following cases, assume that the dispatch() method calls Dispatch.invokeAsync(T msg, AsyncHandler handler) and returns a Future:
1) Dispatch multiple web service requests, so that they run in parallel:
If you have multiple services to consume and they can all execute independently, dispatch them all at once and process the responses when you have received them all.
ArrayList<Future<?>> futures = new ArrayList<Future<?>>();
futures.add(serviceToConsume1.dispatch());
futures.add(serviceToConsume2.dispatch());
futures.add(serviceToConsume3.dispatch());
// now wait until all services return
for (Future<?> f : futures) {
f.get();
}
// now use responses to continue processing
2) Polling:
Future<?> f = serviceToConsume.dispatch();
while(!f.isDone()) {
// do other work here
}
// now use response to continue processing
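3) Callbacks: the AsyncHandler parameter in the assumption above is the pure-callback variant, where a thread owned by the JAX-WS client runtime (the "Thread (B)" of the question) invokes your handler when the response arrives. A rough sketch, assuming a Dispatch<Source> and a prepared request payload:
import java.util.concurrent.Future;
import javax.xml.transform.Source;
import javax.xml.ws.AsyncHandler;
import javax.xml.ws.Dispatch;
import javax.xml.ws.Response;

public void callAsync(Dispatch<Source> dispatch, Source request) {
    // returns immediately; the calling thread is free to do other work
    Future<?> f = dispatch.invokeAsync(request, new AsyncHandler<Source>() {
        @Override
        public void handleResponse(Response<Source> res) {
            // runs on a JAX-WS client-runtime thread once the response arrives
            try {
                Source payload = res.get(); // already complete, so this does not block
                // ... continue processing with payload ...
            } catch (Exception e) {
                // the web service call failed; handle or log the error
            }
        }
    });
}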
I have a web-app where when the user submits a request, we send a JMS message to a remote service and then wait for the reply. (There are also async requests, and we have various niceties set up for message replay, etc, so we'd prefer to stick with JMS instead of, say, HTTP)
In How should I implement request response with JMS?, ActiveMQ seems to discourage the idea of either temporary queues per request or temporary consumers with selectors on the JMSCorrelationID, due to the overhead involved in spinning them up.
However, if I use pooled consumers for the replies, how do I dispatch from the reply consumer back to the original requesting thread?
I could certainly write my own thread-safe callback-registration/dispatch mechanism, but I hate writing code I suspect has already been written by someone who knows better than I do.
That ActiveMQ page recommends Lingo, which hasn't been updated since 2006, and Camel Spring Remoting, which has been hellbanned by my team for its many gotcha bugs.
Is there a better solution, in the form of a library implementing this pattern, or in the form of a different pattern for simulating synchronous request-reply over JMS?
Related SO question:
Is it a good practice to use JMS Temporary Queue for synchronous use?, which suggests that spinning up a consumer with a selector on the JMSCorrelationID is actually low-overhead, which contradicts what the ActiveMQ documentation says. Who's right?
In a past project we had a similar situation, where a synchronous WS request was handled with a pair of asynchronous request/response JMS messages. We were using the JBoss JMS implementation at the time, and temporary destinations were a big overhead.
We ended up writing a thread-safe dispatcher, leaving the WS waiting until the JMS response came in. We used the CorrelationID to map the response back to the request.
That solution was all home grown, but I've come across a nice blocking map impl that solves the problem of matching a response to a request.
BlockingMap
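A minimal sketch of such a correlation-ID dispatcher, using a ConcurrentHashMap of CompletableFutures in place of the blocking map; all names here are illustrative, and a real implementation would also need timeout handling and cleanup:
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Future;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;

public class ReplyDispatcher implements MessageListener {
    private final ConcurrentMap<String, CompletableFuture<Message>> pending =
            new ConcurrentHashMap<>();

    // called by the requesting thread before sending; it then blocks on the Future
    public Future<Message> register(String correlationId) {
        CompletableFuture<Message> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        return future;
    }

    // called on the pooled reply consumer's thread for every incoming reply
    @Override
    public void onMessage(Message reply) {
        try {
            CompletableFuture<Message> future = pending.remove(reply.getJMSCorrelationID());
            if (future != null) {
                future.complete(reply); // wakes up the waiting request thread
            }
        } catch (JMSException e) {
            // malformed reply; log and drop, since no requester can be matched
        }
    }
}

The requesting thread would then do something like dispatcher.register(id).get(30, TimeUnit.SECONDS) after sending the request with that correlation ID.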
If your solution is clustered, you need to take care that response messages are dispatched to the right node in the cluster. I don't know ActiveMQ, but I remember JBoss Messaging having some glitches under the hood with its clusterable destinations.
I would still think about using Camel and letting it handle the threading, perhaps without Spring Remoting, just with raw ProducerTemplates.
Camel has some nice documentation about the topic and works very well with ActiveMQ.
http://camel.apache.org/jms#JMS-RequestreplyoverJMS
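A rough sketch of the raw ProducerTemplate approach, assuming an already-started CamelContext with an ActiveMQ component registered under "activemq"; the endpoint URI and payload are illustrative:
import org.apache.camel.CamelContext;
import org.apache.camel.ProducerTemplate;

public String callService(CamelContext camelContext) {
    ProducerTemplate template = camelContext.createProducerTemplate();
    // requestBody uses the InOut exchange pattern: Camel manages the reply queue,
    // the correlation, and the blocking wait internally
    return template.requestBody("activemq:queue:myService", "my request", String.class);
}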
For your question about spinning up a selector-based consumer and the overhead: what the ActiveMQ docs actually state is that it requires a round trip to the ActiveMQ broker, which might be on the other side of the globe or on a high-delay network. The overhead in this case is the TCP/IP round-trip time to the AMQ broker. I would consider this an option; I have used it multiple times with success.
A colleague suggested a potential solution: one response queue/consumer per webapp thread, where we set the return address to the response queue owned by that particular thread. Since these threads are typically long-lived (and are re-used for subsequent web requests), we only have to suffer the overhead when the thread is spawned by the pool.
That said, this whole exercise is making me rethink JMS vs HTTP... :)
I have always used the correlation ID for request/response and never suffered any performance issues. I can't imagine why that would be a performance issue at all; it should be super fast for any messaging system to implement, and it is quite an important feature to implement well.
http://www.eaipatterns.com/RequestReplyJmsExample.html covers the two mainstream solutions, using a reply-to queue or a correlation ID.
It's an old question, but I landed here searching for something else and actually have some insights (hopefully they will be helpful to someone).
We implemented a very similar use case with Hazelcast as the chassis for our cluster's internode communication. The essence is two data sets: one distributed map for responses, and one 'local' list of response awaiters (on each node in the cluster).
Each request (receiving its own thread from Jetty) creates an entry in the map of local awaiters; the entry contains the correlation UID and an object that will serve as a semaphore.
Then the request is dispatched to the remote service (REST/JMS) and the original thread starts waiting on the semaphore; the UID must be part of the request.
The remote service returns the response and writes it into the responses map under the correlated UID.
The responses map is listened to; if the UID of a newly arrived response is found in the map of local awaiters, its semaphore is notified, and the original request's thread is released, picking up the response from the responses map and returning it to the client.
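A sketch of the awaiter side of this pattern, using plain java.util.concurrent stand-ins for the distributed structures (in the setup described above, the responses map would be a Hazelcast distributed map and onResponse would be driven by an entry listener; all names are illustrative):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class ResponseAwaiters {
    private final Map<String, Semaphore> awaiters = new ConcurrentHashMap<>();
    private final Map<String, String> responses = new ConcurrentHashMap<>();

    // request thread: register an awaiter, dispatch, then park on the semaphore
    public String await(String uid, long timeoutSeconds) throws InterruptedException {
        Semaphore semaphore = new Semaphore(0);
        awaiters.put(uid, semaphore);
        // ... dispatch the request carrying uid to the remote system here ...
        if (!semaphore.tryAcquire(timeoutSeconds, TimeUnit.SECONDS)) {
            awaiters.remove(uid);
            return null; // timed out waiting for the response
        }
        return responses.remove(uid);
    }

    // listener thread: invoked when a response for uid appears in the responses map
    public void onResponse(String uid, String response) {
        responses.put(uid, response);
        Semaphore semaphore = awaiters.remove(uid);
        if (semaphore != null) {
            semaphore.release(); // wake the parked request thread
        }
    }
}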
This is a general description; I can update the answer with a few optimizations we made, in case there is any interest.
I am working on a servlet that can take a few hours to complete the request. However, the client calling the servlet is only interested in knowing whether the request has been received by the servlet or not. The client doesn't want to wait hours before it gets any kind of response from the servlet. Also since calling the servlet is a blocking call, the client cannot proceed until it receives the response from the servlet.
To avoid this, I am thinking of launching a new thread in the servlet code. The thread launched by the servlet will do the time-consuming processing, allowing the servlet to return a response to the client very quickly. But I am not sure if this is an acceptable way of working around the blocking nature of servlet calls. I have looked into NIO, but it seems it is not guaranteed to work in every servlet container, as the servlet container has to be NIO-based as well.
What you need is a job scheduler, because job schedulers give assurance that a job will be finished, even if the server is restarted.
Take a look at open-source Java job schedulers, most notably Quartz.
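A minimal Quartz 2.x sketch; MyJob (a class implementing org.quartz.Job) and the identity names are illustrative:
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public void schedule() throws SchedulerException {
    // obtain and start the default scheduler
    Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
    scheduler.start();

    // describe the long-running work as a job and fire it as soon as possible
    JobDetail job = JobBuilder.newJob(MyJob.class)
            .withIdentity("longRequest", "webapp")
            .build();
    Trigger trigger = TriggerBuilder.newTrigger()
            .startNow()
            .build();
    scheduler.scheduleJob(job, trigger);
}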
Your solution is correct, but creating threads directly in enterprise applications is considered bad practice. Better to use a thread pool or a JMS queue.
You have to take into account what should happen if the server goes down during processing, how to react when multiple requests (think hundreds or even thousands) arrive at the same time, and so on. So you have chosen the right direction, but it is a bit more complicated.
A thread isn't bad, but I recommend handing this off to an executor pool as a task, or better yet a long-running work manager. It's not bad practice to return quickly as you plan. I would recommend providing some sort of user feedback indicating where the user can find information about the long-running job. So:
Create a job representing the work task with a unique ID
Send the job to your background handler object (that contains an executor)
Build a URL for the unique job ID.
Return a page describing where they can get the result
The page with the result will have to coordinate with this background job manager. While the job is computing, the page can describe its progress. When it's done, the page can display the results of the long-running job. A rough sketch of the servlet side follows.
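A minimal sketch using the plain Servlet API plus an ExecutorService; runLongJob, the pool size, and the /jobs/ URL scheme are illustrative, and a production version would persist job state as discussed above:
import java.io.IOException;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class LongJobServlet extends HttpServlet {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String jobId = UUID.randomUUID().toString();
        pool.submit(() -> runLongJob(jobId)); // hours of work happen off the request thread
        resp.setStatus(HttpServletResponse.SC_ACCEPTED); // 202: received, not yet finished
        resp.getWriter().write("Track progress at /jobs/" + jobId);
    }

    private void runLongJob(String jobId) {
        // ... the multi-hour processing, recording progress under jobId ...
    }

    @Override
    public void destroy() {
        pool.shutdown(); // stop accepting new work when the servlet is unloaded
    }
}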