Here are two links which seem to be contradicting each other. I'd sooner trust the docs:
Link 1
Request processing on the server works by default in a synchronous processing mode
Link 2
It already is multithreaded.
My question:
Which is correct? Can it be both synchronous and multithreaded?
Why do the docs say the following?:
in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used
If the docs are correct, why is the default action synchronous? All requests are asynchronous in client-side JavaScript by default for the sake of user experience, so it would make sense for the default action on the server side to be asynchronous too.
If the client does not need to serve requests in a specific order, then who cares how "EXPENSIVE" the operation is. Shouldn't all operations simply be asynchronous?
Request processing on the server works by default in a synchronous processing mode
Each request is processed on a separate thread. The request is considered synchronous because that request holds up the thread until the request is finished processing.
It already is multithreaded.
Yes, the server (container) is multi-threaded. For each request that comes in, a thread is taken from the thread pool, and the request is tied to that particular thread.
in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used
Yes, so that we don't hold up the container thread. There are only so many threads in the container thread pool to handle requests. If we hold them all up with long-running requests, the container may run out of threads, blocking other requests from coming in. In asynchronous processing, Jersey hands the thread back to the container and handles the request processing itself in its own thread pool until processing is complete, then sends the response up to the container, which sends it back to the client.
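To make the difference concrete, here is a plain-Java sketch of the hand-off. It is not Jersey code: the two executors, their sizes (1 "container" thread, 4 worker threads) and the slowWork method are assumptions made up for illustration. In JAX-RS 2.0 the equivalent hand-off is done through a suspended AsyncResponse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncHandoffSketch {
    // Daemon threads so the JVM can exit without an explicit shutdown.
    static ExecutorService daemonPool(int size) {
        return Executors.newFixedThreadPool(size, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
    }

    // Stand-in for the container's request threads (deliberately tiny).
    static final ExecutorService containerPool = daemonPool(1);
    // Stand-in for a separate pool that runs the long computations.
    static final ExecutorService workerPool = daemonPool(4);

    // Hypothetical slow resource method body.
    static String slowWork(int id) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-" + id;
    }

    // Synchronous style: the container thread is held for the whole computation.
    static CompletableFuture<String> handleSync(int id) {
        return CompletableFuture.supplyAsync(() -> slowWork(id), containerPool);
    }

    // Asynchronous style: the container thread only schedules the work and returns.
    static CompletableFuture<String> handleAsync(int id) {
        CompletableFuture<String> response = new CompletableFuture<>();
        containerPool.execute(() ->
                workerPool.execute(() -> response.complete(slowWork(id))));
        return response;
    }

    // Sends n requests in the chosen style and returns the elapsed milliseconds.
    static long timeRequests(int n, boolean async) {
        long start = System.nanoTime();
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            pending.add(async ? handleAsync(i) : handleSync(i));
        }
        pending.forEach(CompletableFuture::join);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("sync ms:  " + timeRequests(4, false));
        System.out.println("async ms: " + timeRequests(4, true));
    }
}
```

With a single container thread, the synchronous run processes the four requests one after another, while the asynchronous run frees the container thread immediately and lets the worker pool overlap them.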
If the client does not need to serve requests in a specific order, then who cares how "EXPENSIVE" the operation is.
Not really sure what the client has to do with anything here. Or at least in the context of how you're asking the question. Sorry.
Shouldn't all operations simply be asynchronous?
Not necessarily, if all the requests are quick. You could make an argument for it, but that would require performance testing and numbers you can compare before making a decision. Every system is different.
I am working with Java. Another software developer has provided me his code performing synchronous HTTP calls and is responsible of maintaining it - he is using com.google.api.client.http. Updating his code to use an asynchronous HTTP client with a callback is not an available option, and I can't contact the developer to make changes to it. But I still want the efficient asynchronous behaviour of attaching a callback to an HTTP request.
(I am working in Spring Boot and my system is built using RabbitMQ AMQP if it has any effect.)
The simple HTTP GET (it is actually an API call) is performed as follows:
HttpResponse<String> response = httpClient.send(request, BodyHandlers.ofString());
The server I'm communicating with via HTTP takes some time to reply... say 3-4 seconds. So my thread of execution is blocked for this duration, waiting for a reply. This scales very poorly: my single thread isn't doing anything, it is just waiting for a reply to arrive, which is very heavy.
Sure, I can increase the number of threads performing this call if I want to send more HTTP requests concurrently, i.e. I can scale in that way, but this doesn't sound efficient or correct. If possible, I would really like to get a better ratio than 1 thread waiting for 1 HTTP request in this situation.
In other words, I want to send thousands of HTTP requests with 2-3 available threads and handle the response once it arrives; I don't want to incur any significant delay between the execution of each request.
I was wondering: how can I achieve a more scalable solution? How can I handle thousands of these HTTP calls per thread? What should I be looking at, or do I just have no options and am I asking for the impossible?
EDIT: I guess this is another way to phrase my problem. Assume I have 1000 requests to be sent right now, each will last 3-4 seconds, but only 4-5 available threads of execution on which to send them. I would like to send them all at the same time, but that's not possible; if I manage to send them ALL within the span of 0.5s or less and handle their requests via some callback or something like that, I would consider that a great solution. But I can't switch to an asynchronous HTTP client library.
Using an asynchronous HTTP client is not an available option - I can't change my HTTP client library.
In that case, I think you are stuck with non-scalable synchronous behavior on the client side.
The only work-around I can think of is to run your requests as tasks in an ExecutorService with a bounded thread pool. That will limit the number of threads that are used ... but will also limit the number of simultaneous HTTP requests in play. This is replacing one scaling problem with another one: you are effectively rate-limiting your HTTP requests.
But the flip-side is that launching too many simultaneous HTTP requests is liable to overwhelm the target service(s) and / or the client or server-side network links. From that perspective, client-side rate limiting could be a good thing.
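A minimal sketch of that work-around, assuming a hypothetical blockingCall method that stands in for the synchronous HTTP call (here it just sleeps). The pool size is the cap on simultaneous requests; the maxInFlight counter exists only to show that the cap holds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedClientSketch {
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger maxInFlight = new AtomicInteger();

    // Hypothetical stand-in for the blocking HTTP call.
    static String blockingCall(int id) {
        int now = inFlight.incrementAndGet();
        maxInFlight.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            inFlight.decrementAndGet();
        }
        return "response-" + id;
    }

    // Submits every request as a task; at most `threads` of them run at once.
    static List<String> sendAll(int requests, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                Callable<String> task = () -> blockingCall(id);
                futures.add(pool.submit(task));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> f : futures) {
                responses.add(f.get()); // waits for each task in submission order
            }
            return responses;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAll(20, 4).size() + " responses, peak in flight: " + maxInFlight.get());
    }
}
```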
Assume I have 1000 requests to be sent right now, each will last 3-4 seconds, but only 4-5 available threads of execution on which to send them. I would like to send them all at the same time, but that's not possible; if I manage to send them ALL within the span of 0.5s or less and handle their requests via some callback or something like that, I would consider that a great solution. But I can't switch to an asynchronous HTTP client.
The only way you are going to be able to run > N requests at the same time with N threads is to use an asynchronous client. Period.
And "... callback or something like that ...". That's a feature you will only get with an asynchronous client. (Or more precisely, you can only get real asynchronous behavior via callbacks if there is a real asynchronous client library under the hood.)
So the solution is akin to sending the HTTP requests in a staggered manner, i.e. with some delay between one request and the next, where each delay is limited by the number of available threads? If the delay between requests is not significant, I can find that acceptable, but I am assuming there would be a rather large delay between requests, as each one has to wait for a thread to finish its previous request (3-4s)? In that case, it's not what I want.
With my proposed work-around, the delay between any two requests is difficult to quantify. However, if you are trying to submit a large number of requests at the same time and wait for all of the responses, then the delay between individual requests is not relevant. For that scenario, the relevant measure is the time taken to complete all of the requests. Assuming that nothing else is submitting to the executor, the time taken to complete the requests will be approximately:
nos_requests * average_request_time / nos_worker_threads
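As a worked instance of that estimate, using the numbers from the question (1000 requests, 5 threads) and assuming an average of 3.5 s per request:

```java
class BatchTimeEstimate {
    // Approximate wall-clock seconds to drain a batch through a fixed-size pool.
    static double estimateSeconds(int requests, double avgRequestSeconds, int workerThreads) {
        return requests * avgRequestSeconds / workerThreads;
    }

    public static void main(String[] args) {
        // 1000 requests * 3.5 s / 5 threads
        System.out.println(estimateSeconds(1000, 3.5, 5)); // prints 700.0
    }
}
```

That is roughly 700 seconds, or a little under 12 minutes, for the whole batch.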
The other thing to note is that if you did manage to submit a huge number of requests simultaneously, the server delay of 3-4s per request is liable to increase. The server will only have the capacity to process a certain number of requests per second. If that capacity is exceeded, requests will either be delayed or dropped.
But if there are no other options.
I suppose, you could consider changing your server API so that you can submit multiple "requests" in a single HTTP request.
I think that the real problem here is there is a mismatch between what the server API was designed to support, and what you are trying to do with it.
And there is definitely a problem with this:
Another software developer has provided me his code performing synchronous HTTP calls and is responsible of maintaining it - he is using com.google.api.client.http. Updating his code to use an asynchronous HTTP client with a callback is not an available option, and I can't contact the developer to make changes to it.
Perhaps you need to "bite the bullet" and stop using his code. Work out what it is doing and replace it with your own implementation.
There is no magic pixie dust that will give scalable performance from a synchronous HTTP client. Period.
So if AsyncContext::complete closes the response and I need to write the response within the asynchronous context, how do I implement a multi-step response in which some steps are blocking with non-blocking sections in-between them?
You seem to be operating under a misapprehension about the nature of an AsyncContext and the semantics of ServletRequest::startAsync. This method (re)initializes an AsyncContext for the request and associated response, creating one first if necessary, and associates it with the request / response pair. This puts the request into asynchronous mode, which, at its core, means nothing more than that the container will not consider request processing complete until the provided context's complete() method is invoked.
In particular, creating an async context does not create any threads or assign the associated request to a different thread, and the methods of an AsyncContext run on the thread that invokes them (though that's kinda a technicality for AsyncContext::start). The context is primarily an object for whatever asynchronous code you provide to use for interacting with the container, which it otherwise could not safely do. To actually perform processing on some other thread, you need to arrange for that thread to exist, and for the work to be assigned to it. AsyncContext::start is a convenient way to do that, but not the only way.
With respect specifically to
how do I implement a multi-step response in which some steps are blocking with non-blocking sections in-between them?
, the basic answer is "however you want". The AsyncContext neither hinders nor particularly helps you because it's about communication with the container, not about workflow. In particular, I see no need or special use for nested AsyncContexts.
I think you're describing a processing pipeline with certain, limited parallelization. You might implement that, say, by running the overall workflow -- all the "blocking" steps, I guess -- in a thread launched via AsyncContext::start, and dispatching the other work to a thread pool, in whatever units make sense. Do be aware, however, that the request and response objects are not thread-safe. Ideally, then, the primary thread will extract all the needed data from the request, and perform all needed writes to the response.
Alternatively, maybe you use the regular request processing thread for the main workflow, dispatch pieces of work to a thread pool as appropriate, and skip the AsyncContext bit altogether. It is not necessary in any absolute sense to use an AsyncContext to perform asynchronous computations in a web application -- its purpose and the processing models it is designed to support are rather a lot more specific.
What is the best technology solution (framework/approach) to have a Request Queue in front of a REST service.
so that I can increase the number of instances of the REST service for higher availability, and place a request queue in front to form a service/transaction boundary for the service client.
I need good and lightweight technology/framework choice for Request Queue (java)
Approach to implement a competing consumer with it.
There are a couple of issues here, depending on your goals.
First, it only promotes availability of the resources on the back end. Consider if you have 5 servers handling queue requests on the back end. If one of those servers goes down, then the queued request should fall back in to the queue, and be redelivered to one of the remaining 4 servers.
However, while those back end servers are processing, the front end servers are holding on to the actual, initiating requests. If one of those front end servers fails, then those connections are lost completely, and it will be up to the original client to resubmit the request.
The premise perhaps is that simpler front end systems are at a lower risk for failure, and that's certainly true for software related failure. But network cards, power supplies, hard drives, etc. are pretty agnostic to such false hopes of man and punish all equally. So, consider this when talking about overall availability.
As to design, the back end is a simple process waiting upon a JMS message queue, and processing each message as it comes. There are a multitude of examples of this available, and any JMS server will suit at a high level. All you need is to ensure that the message handling is transactional so that if a message processing fails, the message remains in the queue and can be redelivered to another message handler.
Your JMS queue's primary requirement is being clusterable. The JMS server itself is a single point of failure in the system. Lose the JMS server, and your system is pretty much dead in the water, so you'll need to be able to cluster the server and have the consumers and producers handle failover appropriately. Again, this is JMS server specific, most do it, but it's pretty routine in the JMS world.
The front end is where things get a little trickier, since the front end servers are the bridge from the synchronous world of the REST request to the asynchronous world of the back end processors. A REST request follows a typical RPC pattern of consuming the request payload from the socket, holding the connection open, processing the results, and delivering the results back down the originating socket.
To manifest this hand off, you should take a look at the asynchronous Servlet handling that Servlet 3.0 introduced, which is available in Tomcat 7, the latest Jetty (not sure what version), Glassfish 3.x, and others.
In this case what you would do is, when the request arrives, convert the nominally synchronous Servlet call into an asynchronous call using HttpServletRequest.startAsync() (or the two-argument overload, startAsync(ServletRequest, ServletResponse)).
This returns an AsyncContext and, once started, allows the server to free up the processing thread. You then do several things.
Extract the parameters from the request.
Create a unique ID for the request.
Create a new back end request payload from your parameters.
Associate the ID with the AsyncContext, and retain the context (such as putting it in to a application wide Map).
Submit the back end request to the JMS queue.
At this point, the initial processing is done, and you simply return from doGet (or service, or whatever). Since you have not called AsyncContext.complete(), the server will not close out the connection to the client. Since you have the AsyncContext stored in the map by the ID, it's kept safe for the time being.
Now, when you submitted the request to the JMS queue, it contained: the ID of the request (that you generated), any parameters for the request, and the identification of the actual server making the request. This last bit is important as the results of the processing needs to return to its origin. The origin is identified by the request ID and the server ID.
When your front end server started up, it also started a thread whose job it is to listen to a JMS response queue. When it sets up its JMS connection, it can set up a filter such as "Give me only messages for a ServerID of ABC123". Or, you could create a unique queue for each front end server and the back end server uses the server ID to determine the queue to return the reply to.
When the back end processors consume the message, they take the request ID and parameters, perform the work, and then take the result and put it on to the JMS response queue. When a processor puts the result back, it'll add the originating ServerID and the original Request ID as properties of the message.
So, if you got the request originally for Front End Server ABC123, the back end processor will address the results back to that server. Then, that listener thread will be notified when it gets a message. The listener thread's task is to take that message and put it on to an internal queue within the front end server.
This internal queue is backed by a thread pool whose job is to send the response payloads back to the original connection. It does this by extracting the original request ID from the message, looking up the AsyncContext from that internal map discussed earlier, and then sending results down to the HttpServletResponse associated with the AsyncContext. At the end, it calls AsyncContext.complete() (or a similar method) to tell the server that you're done and to allow it to release the connection.
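The round trip described above can be sketched with JDK classes only. Everything here is a stand-in chosen for illustration: a CompletableFuture plays the role of the stored AsyncContext, two in-memory queues play the role of the JMS request and response queues, and the back end "work" is just upper-casing the payload. In a real deployment each step would run on a different thread or machine.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class FrontEndSketch {
    // A message as it travels over the queue: request ID, origin server ID, payload.
    static final class Message {
        final String requestId;
        final String serverId;
        final String payload;

        Message(String requestId, String serverId, String payload) {
            this.requestId = requestId;
            this.serverId = serverId;
            this.payload = payload;
        }
    }

    static final String SERVER_ID = "ABC123";
    // Request ID -> pending context (stand-in for the AsyncContext map).
    static final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    static final BlockingQueue<Message> requestQueue = new LinkedBlockingQueue<>();
    static final BlockingQueue<Message> responseQueue = new LinkedBlockingQueue<>();

    // Front end: create an ID, retain the context, enqueue the request, return at once.
    static CompletableFuture<String> accept(String params) {
        String id = UUID.randomUUID().toString();
        CompletableFuture<String> context = new CompletableFuture<>();
        pending.put(id, context);
        requestQueue.add(new Message(id, SERVER_ID, params));
        return context;
    }

    // Back end: consume one request and address the result back to its origin.
    static void processOne() {
        Message m = requestQueue.poll();
        if (m != null) {
            responseQueue.add(new Message(m.requestId, m.serverId, m.payload.toUpperCase()));
        }
    }

    // Front end listener: match one reply to its pending request and complete it.
    // Returns false when there is no reply, or the reply matches nothing in the map.
    static boolean deliverOne() {
        Message m = responseQueue.poll();
        if (m == null || !SERVER_ID.equals(m.serverId)) {
            return false;
        }
        CompletableFuture<String> context = pending.remove(m.requestId);
        if (context == null) {
            return false; // request timed out or server restarted: log and discard
        }
        context.complete(m.payload);
        return true;
    }

    public static void main(String[] args) {
        CompletableFuture<String> response = accept("hello");
        processOne();
        deliverOne();
        System.out.println(response.join()); // prints HELLO
    }
}
```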
For housekeeping, you should have another thread on the front end server whose job it is to detect when requests have been waiting in the map for too long. Part of the original message should have been the time the request started. This thread can wake up every second, scan the map for requests, and for any that have been there too long (say 30 seconds), it can put the request on to another internal queue, consumed by a collection of handlers designed to inform the client that the request timed out.
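A sketch of the scan itself, under simplifying assumptions: the map here holds only the request ID and its start time, and the names are hypothetical. A real version would keep the AsyncContext alongside the timestamp and hand the expired entries to the timeout handlers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TimeoutSweepSketch {
    static final long TIMEOUT_MILLIS = 30_000;
    // Request ID -> time the request started, in milliseconds.
    static final Map<String, Long> startedAt = new ConcurrentHashMap<>();

    // Removes every request older than the timeout and returns the removed IDs.
    static List<String> sweep(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : startedAt.entrySet()) {
            boolean tooOld = nowMillis - e.getValue() > TIMEOUT_MILLIS;
            // remove(key, value) only succeeds if the entry is still the one we inspected
            if (tooOld && startedAt.remove(e.getKey(), e.getValue())) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        startedAt.put("old-request", now - 45_000);
        startedAt.put("fresh-request", now - 2_000);
        System.out.println(sweep(now)); // prints [old-request]
    }
}
```

The housekeeping thread would call sweep(System.currentTimeMillis()) once a second, for example from a ScheduledExecutorService.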
You want these internal queues so that the main processing logic isn't stuck waiting on the client to consume the data. It could be a slow connection or something, so you don't want to block all of the other pending requests to handle them one by one.
Finally, you'll need to account for the fact that you may well get a message from the response queue for a request that no longer exists in your internal map. For one, the request may have timed out, so it should not be there any longer. For another, that front end server may have stopped and been restarted, so its internal map of pending requests will simply be empty. At this point, if you detect you have a reply for a request that no longer exists, you should simply discard it (well, log it, then discard it).
You can't reuse these requests; there's no such thing really as a load balancer going back to the client. If the client is allowing you to make callbacks via published end points, then, sure, you can just have another JMS message handler make those requests. But that's not a REST kind of thing; REST at this level of discussion is more client/server/RPC.
As to which frameworks support asynchronous Servlets at a higher level than a raw Servlet (such as Jersey for JAX-RS or something like that), I can't say. I don't know what frameworks are supporting it at that level. Seems like this is a feature of Jersey 2.0, which is not out yet. There may well be others, you'll have to look around. Also, don't fixate on Servlet 3.0. Servlet 3.0 is simply a standardization of techniques used in individual containers for some time (Jetty notably), so you may want to look at container specific options outside of just Servlet 3.0.
But the concepts are the same. The big takeaways are the response queue listener with the filtered JMS connection, the internal map from request ID to AsyncContext, and the internal queues and thread pools to do the actual work within the application.
If you relax your requirement that it must be in Java, you could consider HAProxy. It's very lightweight, very standard, and does a lot of good things (request pooling / keepalives / queueing) well.
Think twice before you implement request queueing, though. Unless your traffic is extremely bursty it will do nothing but hurt your system's performance under load.
Assume that your system can handle 100 requests per second. Your HTTP server has a bounded worker thread pool. The only way a request pool can help is if you are receiving more than 100 requests per second. After your worker thread pool is full, requests start to pile up in your load balancer pool. Since they are arriving faster than you can handle them, the queue gets bigger ... and bigger ... and bigger. Eventually either this pool fills too, or you run out of RAM and the load balancer (and thus the entire system) crashes hard.
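To put rough numbers on that growth, here is a tiny sketch. The 100 requests per second capacity is from the paragraph above; the arrival rate of 120 per second is an assumed figure for illustration.

```java
class BacklogSketch {
    // Queue length after `seconds`, starting empty, when arrivals outpace capacity.
    static long backlog(long arrivalsPerSecond, long capacityPerSecond, long seconds) {
        return Math.max(0L, (arrivalsPerSecond - capacityPerSecond) * seconds);
    }

    public static void main(String[] args) {
        // 20 surplus requests per second for one minute
        System.out.println(backlog(120, 100, 60)); // prints 1200
    }
}
```

At that rate the queue holds 1200 requests after a minute and 72,000 after an hour, and a request joining the back of a 1200-deep queue waits about 12 seconds at 100 requests per second before it is even started.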
If your web server is too busy, start rejecting requests and get some additional capacity online.
Request pooling certainly can help if you can get additional capacity in time to handle the requests. It can also hurt you really badly. Think through the consequences before turning on a secondary request pool in front of your HTTP server's worker thread pool.
The design we use is a REST interface receiving all the requests and dispatching them to a message queue (e.g. RabbitMQ).
Then workers listen for the messages and execute them following certain rules. If everything goes down you would still have the requests in the MQ, and if you have a high number of requests you can just add workers...
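The competing-consumer idea can be shown in miniature with JDK classes. This is only an in-process sketch: a LinkedBlockingQueue stands in for the message broker, and the workers stop once the queue is drained, whereas real consumers would keep listening.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CompetingConsumersSketch {
    // Starts `workers` consumers on one queue and returns how many messages were handled.
    static int runWorkers(int workers, int messages) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < messages; i++) {
            queue.add(i);
        }
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                // poll() hands each message to exactly one of the competing workers
                while (queue.poll() != null) {
                    processed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(3, 100)); // prints 100
    }
}
```

Adding capacity means raising the worker count; the producer side does not change.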
Check this keynote, it kind of shows the power of this concept!
http://www.springsource.org/SpringOne2GX2012
Because of browser compatibility issues, I have decided to use long polling for a real time syncing and notification system. I use Java on the backend and all of the examples I've found thus far have been PHP. They tend to use while loops and a sleep method. How do I replicate this sort of thing in Java? There is a Thread.sleep() method, which leads me to...should I be using a separate thread for each user issuing a poll? If I don't use a separate thread, will the polling requests be blocking up the server?
[Update]
First of all, yes, it is certainly possible to do a straightforward long polling request handler. The request comes in to the server, then in your handler you loop or block until the information you need is available, then you end the loop and provide the information. Just realize that for each long polling client, yes, you will be tying up a thread. This may be fine and perhaps this is the way you should start. However, if your web server is becoming so popular that the sheer number of blocking threads is becoming a performance problem, consider an asynchronous solution where you can keep a large number of client requests pending (each request blocks from the client's point of view, that is, it gets no response until there is useful data) without tying up one or more threads per client.
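For the straightforward (thread-per-poll) version, a blocking queue with a timeout can replace the while/sleep loop seen in the PHP examples. This is a sketch with assumed names: one queue for a single hypothetical client, and an empty string as the "nothing new, poll again" response.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class LongPollSketch {
    // Pending notifications for a single (hypothetical) client.
    static final BlockingQueue<String> notifications = new LinkedBlockingQueue<>();

    // Blocks the calling thread until data arrives or the timeout passes.
    static String poll(long timeoutMillis) {
        try {
            String data = notifications.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            return data != null ? data : ""; // empty body: the client should poll again
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "";
        }
    }

    public static void main(String[] args) {
        // Simulate an event arriving from another thread while the poll is blocked.
        new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            notifications.add("new message");
        }).start();
        System.out.println(poll(5_000)); // prints "new message"
    }
}
```

The request handler would call poll with its long-poll timeout and write the returned string to the response; the thread is parked, not spinning, while it waits.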
[original]
The Servlet 3.0 spec provides a standard for doing this kind of asynchronous processing. Google "servlet 3.0 async". Tomcat 7 supports this. I'm guessing Jetty does also, but I have not used it.
Basically in your servlet request handler, when you realize you need to do some "long" polling, you can call a method to create an asynchronous context. Then you can exit the request handler and your thread is freed up, however the client is still blocking on the request. There is no need for any sleep or wait.
The trick is storing the async context somewhere "convenient". Then, when something happens in your app and you want to push data to the client, you go find that context, get the response object from it, write your content and invoke complete. The response is sent back to the client without you having to tie up a thread for each client.
Not sure this is the best solution for what you want, but usually if you want to do something at periodic intervals in Java you use the ScheduledExecutorService. There is a good example at the top of the API document. TimeUnit is a great enum, as you can specify the period easily and clearly. So you can specify it to run every x minutes, hours, etc.
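A small sketch of scheduleAtFixedRate, wrapped in a hypothetical helper that stops after a fixed number of runs so the example terminates. A long-lived service would keep the scheduler running instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PeriodicTaskSketch {
    // Runs `task` every `periodMillis` and returns true once it has run `times` times.
    static boolean runPeriodically(Runnable task, long periodMillis, int times) {
        CountDownLatch remaining = new CountDownLatch(times);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            task.run();
            remaining.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            // generous upper bound so a slow machine does not cause a false timeout
            return remaining.await(periodMillis * times + 5_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        runPeriodically(() -> System.out.println("tick"), 200, 3);
    }
}
```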
I am working on a servlet that can take a few hours to complete the request. However, the client calling the servlet is only interested in knowing whether the request has been received by the servlet or not. The client doesn't want to wait hours before it gets any kind of response from the servlet. Also since calling the servlet is a blocking call, the client cannot proceed until it receives the response from the servlet.
To avoid this, I am thinking of actually launching a new thread in the servlet code. The thread launched by the servlet will do the time consuming processing, allowing the servlet to return a response to the client very quickly. But I am not sure if this is an acceptable way of working around the blocking nature of servlet calls. I have looked into NIO, but it seems it is not guaranteed to work in any servlet container, as the servlet container has to be NIO based also.
What you need is a job scheduler because they give assurance that a job will be finished, even in case a server is restarted.
Take a look at java OSS job schedulers, most notably Quartz.
Your solution is correct, but creating threads in enterprise applications is considered a bad practice. Better use a thread pool or JMS queue.
You have to take into account what should happen if the server goes down during processing, how to react when multiple requests (think: hundreds or even thousands) occur at the same time, etc. So you have chosen the right direction, but it is a bit more complicated.
A thread isn't bad but I recommend throwing this off to an executor pool as a task. Better yet a long running work manager. It's not a bad practice to return quickly like you plan. I would recommend providing some sort of user feedback indicating where the user can find information about the long running job. So:
Create a job representing the work task with a unique ID
Send the job to your background handler object (that contains an executor)
Build a URL for the unique job ID
Return a page describing where they can get the result
The page with the result will have to coordinate with this background job manager. While it's computing, you can have this page describe the progress. When it's done, the page can display the results of the long running job.
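The steps above can be sketched as follows. The class, the /jobs/ path, and the status strings are all hypothetical, and the job map lives in memory, so unlike a job scheduler such as Quartz nothing here survives a server restart.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class JobManagerSketch {
    // Background handler: a small pool of daemon threads.
    static final ExecutorService executor = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
    // Job ID -> handle on the running (or finished) job.
    static final Map<String, Future<String>> jobs = new ConcurrentHashMap<>();

    // Create a job with a unique ID and hand it to the executor; returns at once.
    static String submit(Callable<String> work) {
        String jobId = UUID.randomUUID().toString();
        jobs.put(jobId, executor.submit(work));
        return jobId;
    }

    // URL the servlet would return to the client (the path is an assumption).
    static String statusUrl(String jobId) {
        return "/jobs/" + jobId;
    }

    // What the result page would display for a given job.
    static String status(String jobId) {
        Future<String> job = jobs.get(jobId);
        if (job == null) {
            return "UNKNOWN";
        }
        if (!job.isDone()) {
            return "RUNNING";
        }
        try {
            return "DONE: " + job.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "FAILED";
        } catch (ExecutionException e) {
            return "FAILED";
        }
    }

    public static void main(String[] args) throws InterruptedException {
        String id = submit(() -> {
            Thread.sleep(200); // stand-in for hours of work
            return "report ready";
        });
        System.out.println(statusUrl(id) + " -> " + status(id));
        Thread.sleep(500);
        System.out.println(statusUrl(id) + " -> " + status(id));
    }
}
```

The servlet only calls submit and returns the status URL; the status page calls status each time the client checks back.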