In gRPC I would like some more information on the way the server handles requests.
Are requests executed in parallel? Or does the server spawn a new thread for each request, and execute them in parallel? Is there a way to modify this behavior? I understand that in client-streaming rpc's that message order is guaranteed.
If I send Request A followed by Request B to the same RPC, is it guaranteed that A will be executed first before B begins processing? Or are they each their own thread and executed in parallel with no guarantee that A finishes before B.
Ideally I would like to send a request to the server, the server acknowledges receipt of the request, and then the request is added to a queue to be processed sequentially, and returns a response once it's been processed. An approach I was exploring is to use an external task queue (like RabbitMQ) to queue the work done by the service but I want to know if there is a better approach.
Also -- on a somewhat related note -- does gRPC have a native retry counter mechanism? I have a particularly error-prone RPC that may have to retry up to 3 times (with an arbitrary delay between retries) before it is successful. This is something that could be implemented with RabbitMQ as well.
grpc-java passes RPCs to the service using the Executor provided by ServerBuilder.executor(Executor), or a cached thread pool if no executor is provided.
There is no ordering between simultaneous RPCs. RPCs can arrive in any order.
You could use a server-streaming RPC to allow the server to respond twice, once for acknowledgement and once for completion. You can use a oneof in the response message to allow sending the two different responses.
grpc-java as experimental retry support. gRFC A6 describes the support. The configuration is delivered to the client via service config. Retries are disabled by default, so overall you would want something like channelBuilder.defaultServiceConfig(serviceConfig).enableRetry(). You can also reference the hedging example which is very similar to retries.
Related
Here are two links which seem to be contradicting each other. I'd sooner trust the docs:
Link 1
Request processing on the server works by default in a synchronous processing mode
Link 2
It already is multithreaded.
My question:
Which is correct. Can it be both synchronous and multithreaded?
Why do the docs say the following?:
in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used
If the docs are correct, why is the default action synchronous? All requests are asynchronous on client-side javascript by default for user experience, it would make sense then that the default action for server-side should also be asynchronous too.
If the client does not need to serve requests in a specific order, then who cares how "EXPENSIVE" the operation is. Shouldn't all operations simply be asynchronous?
Request processing on the server works by default in a synchronous processing mode
Each request is processed on a separate thread. The request is considered synchronous because that request holds up the thread until the request is finished processing.
It already is multithreaded.
Yes, the server (container) is multi-threaded. For each request that comes in, a thread is taken from the thread pool, and the request is tied to the particular request.
in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used
Yes, so that we don't hold up the container thread. There are only so many threads in the container thread pool to handle requests. If we are holding them all up with long processing requests, then the container may run out of threads, blocking other requests from coming in. In asynchronous processing, Jersey hands the thread back to the container, and handle the request processing itself in its own thread pool, until the process is complete, then send the response up to the container, where it can send it back to the client.
If the client does not need to serve requests in a specific order, then who cares how "EXPENSIVE" the operation is.
Not really sure what the client has to do with anything here. Or at least in the context of how you're asking the question. Sorry.
Shouldn't all operations simply be asynchronous?
Not necessarily, if all the requests are quick. Though you could make an argument for it, but that would require performance testing, and numbers you can put up against each other and make a decision from there. Every system is different.
I use Apache Camel 2.10.0 with spring-ws component to route some (20+) WS/SOAP operations.
The sample code looks like:
from("spring-ws:rootqname:{http://my/name/space}myOperation1?endpointMapping=#endpointMapping")
from("spring-ws:rootqname:{http://my/name/space}myOperation2?endpointMapping=#endpointMapping")
from("spring-ws:rootqname:{http://my/name/space}myOperation3?endpointMapping=#endpointMapping")
Operations normally access several DB and could last up to couple of seconds
It works perfectly, but now I have a new requirement: 3 of the operations must be synchronized.
For example: if client1 calls operation1 1ms before client2 calling operation1, the client1’s call must be finished before starting client2’s one.
Same is valid for 1 client calling 2 different operations.
For example: if client1 calls operation1 1ms before calling operation2, the operation1’s call must be finished before starting operation2’s one. Clients call the WS asynchronously and this cannot be changed
The application is running with WebLogic 10.3.5.
Reducing the container threads to 1 only would affect all operations, thus I was thinking about adding of some custom queue (JMS style) only to these 3 operations.
Do you have any better idea?
It looks all the calling some be put into the queue first, then we can decide which one should be invoked then.
ActiveMQ is easy to setup and would work with Camel very well.
You need to route the request to JMS queue first, the queue itself is the transaction break point, then you consume JMS messages sequentially. You have much more control of the threading and message consuming by using the message pattern
What is the best technology solution (framework/approach) to have a Request Queue in front of a REST service.
so that i can increase the no of instances of REST service for higher availability and by placing Request queue in front to form a service/transaction boundary for the service client.
I need good and lightweight technology/framework choice for Request Queue (java)
Approach to implement a competing consumer with it.
There's a couple of issues here, depending on your goals.
First, it only promotes availability of the resources on the back end. Consider if you have 5 servers handling queue requests on the back end. If one of those servers goes down, then the queued request should fall back in to the queue, and be redelivered to one of the remaining 4 servers.
However, while those back end servers are processing, the front end servers are holding on to the actual, initiating requests. If one of those front end servers fails, then those connections are lost completely, and it will be up to the original client to resubmit the request.
The premise perhaps is that simpler front end systems are at a lower risk for failure, and that's certainly true for software related failure. But networks cards, power supplies, hard drives, etc. are pretty agnostic to such false hopes of man and punish all equally. So, consider this when talking about overall availability.
As to design, the back end is a simple process waiting upon a JMS message queue, and processing each message as they come. There are a multitude of examples of this available, and any JMS server will suit at a high level. All you need is to ensure that the message handling is transactional so that if a message processing fails, the message remains in the queue and can be redelivered to another message handler.
Your JMS queue's primary requirement is being clusterable. The JMS server itself is a single point of failure in the system. Lost the JMS server, and your system is pretty much dead in the water, so you'll need to be able to cluster the server and have the consumers and producers handle failover appropriately. Again, this is JMS server specific, most do it, but it's pretty routine in the JMS world.
The front end is where things get a little trickier, since the front end servers are the bridge from the synchronous world of the REST request to the asynchronous world of the back end processors. A REST request follows a typically RPC pattern of consuming the request payload from the socket, holding the connection open, processing the results, and delivering the results back down the originating socket.
To manifest this hand off, you should take a look at the Asynchronous Servlet handling the Servlet 3.0 introduced, and is available in Tomcat 7, the latest Jetty (not sure what version), Glassfish 3.x, and others.
In this case what you would do is when the request arrives, you convert the nominally synchronous Servlet call in to an Asynchronous call using HttpServletRequest.startAsync(HttpServletRequest request, HttpServletResponse response).
This returns an AsynchronousContext, and once started, allows the server to free up the processing thread. You then do several things.
Extract the parameters from the request.
Create a unique ID for the request.
Create a new back end request payload from your parameters.
Associate the ID with the AsyncContext, and retain the context (such as putting it in to a application wide Map).
Submit the back end request to the JMS queue.
At this point, the initial processing is done, and you simply return from doGet (or service, or whatever). Since you have not called AsyncContext.complete(), the server will not close out the connection to the server. Since you have the AsyncContext store in the map by the ID, it's handy for safe keeping for the time being.
Now, when you submitted the request to the JMS queue, it contained: the ID of the request (that you generated), any parameters for the request, and the identification of the actual server making the request. This last bit is important as the results of the processing needs to return to its origin. The origin is identified by the request ID and the server ID.
When your front end server started up, it also started a thread who's job it is to listen to a JMS response queue. When it sets up its JMS connection, it can set up a filter such as "Give me only messages for a ServerID of ABC123". Or, you could create a unique queue for each front end server and the back end server uses the server ID to determine the queue to return the reply to.
When the back end processors consume the message, they're take the request ID, and parameters, perform the work, and then take the result and put them on to the JMS response Queue. When it puts it the result back, it'll add the originating ServerID and the original Request ID as properties of the message.
So, if you got the request originally for Front End Server ABC123, the back end processor will address the results back to that server. Then, that listener thread will be notified when it gets a message. The listener threads task is to take that message and put it on to an internal queue within the front end server.
This internal queue is backed by a thread pool who's job is to send the request payloads back to the original connection. It does this by extracting the original request ID from the message, looking up the AsyncContext from that internal map discussed earlier, and then sending results down to the HttpServletResponse associated with the AsyncContext. At the end, it call AsyncContext.complete() (or a similar method) to tell the server that you're done and to allow it to release the connection.
For housekeeping, you should have another thread on the front end server who's job it is to detect when requests have been waiting in the map for too long. Part of the original message should have been a time the request started. This thread can wake up every second, scan the map for requests, and for any that have been there too long (say 30 seconds), it can put the request on to another internal queue, consumed by a collection of handlers designed to inform the client that the request timed out.
You want these internal queues so that the main processing logic isn't stuck waiting on the client to consume the data. It could be a slow connection or something, so you don't want to block all of the other pending requests to handle them one by one.
Finally, you'll need to account that you may well get a message from the response queue for a request that no longer exists in your internal map. For one, the request may have timed out, so it should not be there any longer. For another, that front end server may have stopped and been restarted, so it internal map of pending request will simply be empty. At this point, if you detect you have a reply for a request that no longer exists, you should simply discard it (well, log it, then discard it).
You can't reuse these requests, there's not such thing really as a load balancer going back to the client. If the client is allowing you to make callbacks via published end points, then, sure you can just have another JMS message handler make those requests. But that's not a REST kind of thing, REST at this level of discussion is more client/server/RPC.
As to which framework support Asynchronous Servlets at a higher level than a raw Servlet, (such as Jersey for JAX-RS or something like that), I can't say. I don't know what frameworks are supporting it at that level. Seems like this is a feature of Jersey 2.0, which is not out yet. There well may be others, you'll have to look around. Also, don't fixate on Servlet 3.0. Servlet 3.0 is simply a standardization of techniques used in individual containers for some time (Jetty notably), so you may want to look at container specific options outside of just Servlet 3.0.
But the concepts are the same. The big takeaway are the response queue listener with the filtered JMS connection, the internal request map to the AsyncContext, and the internal queues and thread pools to do the actual work within the application.
If you relax your requirement that it must be in Java, you could consider HAProxy. It's very lightweight, very standard, and does a lot of good things (request pooling / keepalives / queueing) well.
Think twice before you implement request queueing, though. Unless your traffic is extremely bursty it will do nothing but hurt your system's performance under load.
Assume that your system can handle 100 requests per second. Your HTTP server has a bounded worker thread pool. The only way a request pool can help is if you are receiving more than 100 requests per second. After your worker thread pool is full, requests start to pile up in your load balancer pool. Since they are arriving faster than you can handle them, the queue gets bigger ... and bigger ... and bigger. Eventually either this pool fills too, or you run out of RAM and the load balancer (and thus the entire system) crashes hard.
If your web server is too busy, start rejecting requests and get some additional capacity online.
Request pooling certainly can help if you can get additional capacity in time to handle the requests. It can also hurt you really badly. Think through the consequences before turning on a secondary request pool in front of your HTTP server's worker thread pool.
The design we use is a a REST interface receiving all the request and dispatching them to a message queue (i.e. Rabbitmq)
Then workers listen to the messages and execute them following certain rules. If everything goes down you would still have the request in the MQ and if you have a high number of request you can just add workers...
Check this keynote, it kind of shows the power of this concept!
http://www.springsource.org/SpringOne2GX2012
There is one controlling entity and several 'worker' entities. The controlling entity requests certain data from the worker entities, which they will fetch and return in their own manner.
Since the controlling entity can agnostic about the worker entities (and the working entities can be added/removed at any point), putting a JMS provider in between them sounds like a good idea. That's the assumption at least.
Since it is an one-to-many relation (controller -> workers), a JMS Topic would be the right solution. But, since the controlling entity is depending on the return values of the workers, request/reply functionality would be nice as well (somewhere, I read about the TopicRequester but I cannot seem to find a working example). Request/reply is typical Queue functionality.
As an attempt to use topics in a request/reply sort-of-way, I created two JMS topis: request and response. The controller publishes to the request topic and is subscribed to the response topic. Every worker is subscribed to the request topic and publishes to the response topic. To match requests and responses the controller will subscribe for each request to the response topic with a filter (using a session id as the value). The messages workers publish to the response topic have the session id associated with them.
Now this does not feel like a solution (rather it uses JMS as a hammer and treats the problem (and some more) as a nail). Is JMS in this situation a solution at all? Or are there other solutions I'm overlooking?
Your approach sort of makes sense to me. I think a messaging system could work. I think using topics are wrong. Take a look at the wiki page for Enterprise Service Bus. It's a little more complicated than you need, but the basic idea for your use case, is that you have a worker that is capable of reading from one queue, doing some processing and adding the processed data back to another queue.
The problem with a topic is that all workers will get the message at the same time and they will all work on it independently. It sounds like you only want one worker at a time working on each request. I think you have it as a topic so different types of workers can also listen to the same queue and only respond to certain requests. For that, you are better off just creating a new queue for each type of work. You could potentially have them in pairs, so you have a work_a_request queue and work_a_response queue. Or if your controller is capable of figuring out the type of response from the data, they can all write to a single response queue.
If you haven't chosen an Message Queue vendor yet, I would recommend RabbitMQ as it's easy to set-up, easy to add new queues (especially dynamically) and has really good spring support (although most major messaging systems have spring support and you may not even be using spring).
I'm also not sure what you are accomplishing the filters. If you ensure the messages to the workers contain all the information needed to do the work and the response messages back contain all the information your controller needs to finish the processing, I don't think you need them.
I would simply use two JMS queues.
The first one is the one that all of the requests go on. The workers will listen to the queue, and process them in their own time, in their own way.
Once complete, they will put bundle the request with the response and put that on another queue for the final process to handle. This way there's no need for the the submitting process to retain the requests, they just follow along with the entire procedure. A final process will listen to the second queue, and handle the request/response pairs appropriately.
If there's no need for the message to be reliable, or if there's no need for the actual processes to span JVMs or machines, then this can all be done with a single process and standard java threading (such as BlockingQueues and ExecutorServices).
If there's a need to accumulate related responses, then you'll need to capture whatever grouping data is necessary and have the Queue 2 listening process accumulate results. Or you can persist the results in a database.
For example, if you know your working set has five elements, you can queue up the requests with that information (1 of 5, 2 of 5, etc.). As each one finishes, the final process can update the database, counting elements. When it sees all of the pieces have been completed (in any order), it marks the result as complete. Later you would have some audit process scan for incomplete jobs that have not finished within some time (perhaps one of the messages erred out), so you can handle them better. Or the original processors can write the request to a separate "this one went bad" queue for mitigation and resubmission.
If you use JMS with transaction, if one of the processors fails, the transaction will roll back and the message will be retained on the queue for processing by one of the surviving processors, so that's another advantage of JMS.
The trick with this kind of processing is to try and push the state with message, or externalize it and send references to the state, thus making each component effectively stateless. This aids scaling and reliability since any component can fail (besides catastrophic JMS failure, naturally), and just pick up where you left off when you get the problem resolved an get them restarted.
If you're in a request/response mode (such as a servlet needing to respond), you can use Servlet 3.0 Async servlets to easily put things on hold, or you can put a local object on a internal map, keyed with the something such as the Session ID, then you Object.wait() in that key. Then, your Queue 2 listener will get the response, finalize the processing, and then use the Session ID (sent with message and retained through out the pipeline) to look up
the object that you're waiting on, then it can simply Object.notify() it to tell the servlet to continue.
Yes, this sticks a thread in the servlet container while waiting, that's why the new async stuff is better, but you work with the hand you're dealt. You can also add a timeout to the Object.wait(), if it times out, the processing took to long so you can gracefully alert the client.
This basically frees you from filters and such, and reply queues, etc. It's pretty simple to set it all up.
Well actual answer should depend upon whether your worker entities are external parties, physical located outside network, time expected for worker entity to finish their work etc..but problem you are trying to solve is one-to-many communication...u added jms protocol in your system just because you want all entities to be able to talk in jms protocol or asynchronous is reason...former reason does not make sense...if it is latter reason, you can choose other communication protocol like one-way web service call.
You can use latest java concurrent APIs to create multi-threaded asynchronous one-way web service call to different worker entities...
My system consists of a "proxy" class that receives "request" packets, marshals them and sends them over the network to a server, which unmarshals them, processes, and returns some "response packet".
My "submit" method on the proxy side should block until a reply is received to the request (packets have ids for identification and referencing purposes) or until a timeout is reached.
If I was building this in early versions of Java, I would likely implement in my proxy a collection of "pending messages ids", where I would submit a message, and wait() on the corresponding id (with a timeout). When a reply was received, the handling thread would notify() on the corresponding id.
Is there a better way to achieve this using an existing library class, perhaps in java.util.concurrency?
If I went with the solution described above, what is the correct way to deal with the potential race condition where a reply arrives before wait() is invoked?
The simple way would be to have a Callable that talks to the server and returns the Response.
// does not block
Future<Response> response = executorService.submit(makeCallable(request));
// wait for the result (blocks)
Response r = response.get();
Managing the request queue, assigning threads to the requests, and notifying the client code is all hidden away by the utility classes.
The level of concurrency is controlled by the executor service.
Every network call blocks one thread in there.
For better concurrency, one could look into using java.nio as well (but since you are talking to same server for all requests, a fixed number of concurrent connections, maybe even just one, seems to be sufficient).