Long running webservice architecture - java

We use Axis2 for building our web services and a JBoss server to run the logic of all of our applications. We were asked to build a web service that talks to a bean that could take up to an hour to respond (depending on the size of the request), so we cannot keep the connection to the consumers open during that time.
We could use an asynchronous web service, but that hasn't worked out all that well, so we decided to implement a bean that does the logic behind the web service and have the service invoke that bean asynchronously. The web service will generate a token, pass it to the consumer, and the consumer can use it to query the status of the request.
The questions I have are:
How do I query the status of the bean on the JBoss server once I have returned from the service method that created that bean? Do I need to use stateful beans?
Can I use stateful beans if I want to do asynchronous calls from the web service side?

Another approach you could take is to make use of JMS and a DB.
The process would be
In web service call, put a message on a JMS Queue
Insert a record into a DB table, and return a unique id for that record to the client
In an MDB that listens to the Queue, call the bean
When the bean returns, update the DB record with a "Done" status
When the client calls for status, read the DB record, return "Not Done" or "Done" depending on the record.
When the client calls and the record indicates "Done", return "Done" and delete the record
This process is a bit heavier on resource usage, but has some advantages
A Durable JMS Queue will redeliver if your bean method throws an Exception
A Durable JMS Queue will redeliver if your server restarts
By using a DB table instead of some static data, you can support a clustered or load balanced environment
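To make that flow concrete, here is a rough sketch of the MDB side. The queue name, the REQUEST_STATUS table, the datasource JNDI name and the LongRunningBean are placeholders for whatever you actually have, not anything prescribed by JBoss or Axis2:

import javax.annotation.Resource;
import javax.ejb.ActivationConfigProperty;
import javax.ejb.EJB;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/longRunningRequests")
})
public class LongRunningRequestMDB implements MessageListener {

    @EJB
    private LongRunningBean longRunningBean;   // placeholder for the bean that can take up to an hour

    @Resource(mappedName = "java:/DefaultDS")  // whatever datasource you actually use
    private DataSource dataSource;

    public void onMessage(Message message) {
        try {
            TextMessage text = (TextMessage) message;
            String requestId = text.getStringProperty("requestId");  // the id handed back to the client
            longRunningBean.process(text.getText());                 // the long-running call
            markDone(requestId);                                     // flag the DB record as finished
        } catch (Exception e) {
            // rethrowing makes a durable queue redeliver the message
            throw new RuntimeException(e);
        }
    }

    private void markDone(String requestId) throws Exception {
        Connection connection = dataSource.getConnection();
        try {
            PreparedStatement statement = connection.prepareStatement(
                    "UPDATE REQUEST_STATUS SET STATUS = 'DONE' WHERE ID = ?");
            statement.setString(1, requestId);
            statement.executeUpdate();
            statement.close();
        } finally {
            connection.close();
        }
    }
}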

I don't think stateful session beans are the answer to your problem; they're designed for long-running conversational sessions, which isn't your scenario.
My recommendation would be to use a Java5-style ExecutorService thread pool, created using the Executors factory class:
When the web service server initializes, create an ExecutorService instance.
Web service call comes in, the handler creates an instance of Callable. The Callable.call() method would make the actual invocation on the business logic bean, in whatever form that takes.
This Callable is passed to ExecutorService.submit(), which immediately returns a Future object representing the eventual result of the call. The Executor will start to invoke your Callable in a separate thread.
Generate a random token, store the Future in a Map with the token as the key.
Return the token to the web service client (steps 1 to 4 should happen immediately)
Later, the web service client makes another call asking for the result, passing in the token
The server looks up the Future using the token, and calls get() on the Future, with a timeout value so that it only waits a short time for the answer. The get() call will return the execution result of whatever the Callable invoked.
If the answer is available, return it to the client, and remove the Future from the Map.
Otherwise, tell the client to come back later.
It's a pretty robust approach. You can even configure the ExecutorService to limit the number of calls that can be in execution at the same time, if you so desire.
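Here is a minimal sketch of that design. LongRunningServiceFacade and invokeBusinessLogic are illustrative names, and plain Strings stand in for your real request and result types:

import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class LongRunningServiceFacade {

    private final ExecutorService executor = Executors.newFixedThreadPool(10);
    private final ConcurrentMap<String, Future<String>> pending =
            new ConcurrentHashMap<String, Future<String>>();

    // Called by the web service handler: submit the work and hand a token back to the client.
    public String submit(final String request) {
        Future<String> future = executor.submit(new Callable<String>() {
            public String call() throws Exception {
                return invokeBusinessLogic(request);   // the slow bean invocation
            }
        });
        String token = UUID.randomUUID().toString();
        pending.put(token, future);
        return token;
    }

    // Called when the client polls with its token; null means "come back later".
    public String poll(String token) throws Exception {
        Future<String> future = pending.get(token);
        if (future == null) {
            throw new IllegalArgumentException("Unknown token: " + token);
        }
        try {
            String result = future.get(100, TimeUnit.MILLISECONDS); // wait only briefly
            pending.remove(token);                                  // done, clean up
            return result;
        } catch (TimeoutException notReadyYet) {
            return null;
        }
    }

    // Placeholder for the actual call to the business logic bean.
    private String invokeBusinessLogic(String request) {
        return "result for " + request;
    }
}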

Related

How Spring MVC controller handle multiple long http requests?

As I found, controllers in Spring are singletons: Are Spring MVC Controllers Singletons?
The question is, how does Spring handle multiple long, time-consuming requests to the same mapping? For example, when we want to return a model which requires long calculations or a connection to another server, and there are a lot of users sending requests to the same URL?
Async threads, I assume, are not a solution, because the method needs to end before the next request can be handled? Or not?
Requests are handled using a container-managed thread pool, so each request has an independent context; it does not matter whether the Controller is a singleton or not.
One important thing is that Singleton instances MUST NOT share state between requests to avoid unexpected behaviours or race conditions.
The thread-pool capacity will define the number of requests the server can handle in a sync model.
If you want an async approach you could use several options, such as:
Having an independent thread pool that processes tasks handed off from container threads (a minimal sketch follows this list), or
Using a queue to push tasks and a scheduler to process them, or
Using WebSockets to make requests, using (1) or (2) for processing, and then receiving a notification when done.
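As an illustration of option (1), a minimal Spring MVC sketch using DeferredResult (available since Spring 3.2) and a separate worker pool; the controller name and doLongCalculation are placeholders:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.context.request.async.DeferredResult;

@Controller
public class SlowController {

    // Independent pool: container threads hand the work off here and are released.
    private final ExecutorService workers = Executors.newFixedThreadPool(20);

    @RequestMapping("/slow")
    @ResponseBody
    public DeferredResult<String> slow() {
        final DeferredResult<String> result = new DeferredResult<String>();
        workers.submit(new Runnable() {
            public void run() {
                result.setResult(doLongCalculation());  // completes the HTTP response later
            }
        });
        return result;  // the servlet thread is free to serve other requests
    }

    private String doLongCalculation() {
        // ... long calculation or call to another server (placeholder) ...
        return "done";
    }
}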

How to attach async job to a HTTP session?

I need to run a time-consuming task from a controller. To do it I have implemented an @Async method in my service so that the controller can return immediately (for example with a 202 Accepted status).
The problem is that the task needs access to some session-scoped beans. With this approach I am getting org.springframework.beans.factory.BeanCreationException: Error creating bean with name (...): Scope 'session' is not active for the current thread (...).
I get the same result when I manually create an ExecutorService instead of using @Async.
Is it possible to somehow make a worker thread attached to the current session?
EDIT
The purpose is to implement a bulk operation, providing a way to monitor the status of processing. Something like described in this answer: https://stackoverflow.com/a/28787774/718590
If I run it synchronously, there will be no indication of the status (how many items processed), and a request timeout may occur.
If I understand correctly, you want to be able to start long-running asynchronous processing from a Spring web application and be able to follow the progress of that processing from the session that started it. And the processing could use beans contained in the session.
For a good separation of concerns, I would never have an asynchronous thread know about a session. The session is related to HTTP and can be destroyed at any time before the thread can finish (or, in race conditions, even begin) its processing.
IMHO, a correct design would be to create a class containing all the information shared between the web part and the asynchronous processing: the status (whatever that may be), the user that started the processing if it is relevant, and every other relevant piece of information. In your controller (or preferably in the service method called by the controller) you prepare an object of that class and pass it to the @Async method. Then, before returning, the controller stores the object in the session. That way:
the asynchronous processing has all its required information, even if the session is destroyed later. It does not need to know about the session; it only cares about its processing and updating its status
the session of the web application knows that the asynchronous processing is running, how it was started and what the current status is
It can be adapted to your real problem, but this should meet your requirements.
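A rough sketch of that design follows. ProcessingStatus, BulkService and the URLs are illustrative names, and it assumes @EnableAsync (or the XML equivalent) is configured:

import java.util.List;
import javax.servlet.http.HttpSession;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Controller;
import org.springframework.stereotype.Service;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

// Shared between the web layer and the async worker; the worker never touches the session.
class ProcessingStatus {
    private volatile int processed;
    private volatile int total;
    private volatile boolean done;
    public int getProcessed() { return processed; }
    public int getTotal() { return total; }
    public boolean isDone() { return done; }
    void update(int processed, int total) { this.processed = processed; this.total = total; }
    void markDone() { done = true; }
}

@Service
class BulkService {
    // Runs on Spring's async executor; it only ever touches the status object it was given.
    @Async
    public void processAll(List<String> items, ProcessingStatus status) {
        int count = 0;
        for (String item : items) {
            // ... process one item ...
            status.update(++count, items.size());
        }
        status.markDone();
    }
}

@Controller
class BulkController {

    @Autowired
    private BulkService bulkService;

    @RequestMapping(value = "/bulk", method = RequestMethod.POST)
    @ResponseBody
    public String start(@RequestBody List<String> items, HttpSession session) {
        ProcessingStatus status = new ProcessingStatus();
        session.setAttribute("bulkStatus", status);   // the web side keeps its handle here
        bulkService.processAll(items, status);        // returns immediately
        return "accepted";
    }

    @RequestMapping(value = "/bulk/status", method = RequestMethod.GET)
    @ResponseBody
    public String status(HttpSession session) {
        ProcessingStatus status = (ProcessingStatus) session.getAttribute("bulkStatus");
        if (status == null) {
            return "no job";
        }
        return status.getProcessed() + "/" + status.getTotal() + (status.isDone() ? " (done)" : "");
    }
}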

How Do Applications handle Asynchronous Responses - via Callback

I have been doing Java for a few years but I have not had much experience with Asynchronous programming.
I am working on an application that makes SOAP web service calls to some synchronous web services, and currently the implementation of my consuming application is synchronous also, i.e. my application's threads block while waiting for the response.
I am trying to learn how to handle these SOAP calls in an asynchronous way, just for the hell of it, but I have some high-level questions which I can't seem to find any answers to.
I am using CXF but my question is not specifically about CXF or SOAP, but higher-level, in terms of asynchronous application architecture I think.
What I want to know (working thru a scenario) - at a high level - is:
So I have a Thread (A) running in my JVM that makes a call to a remote web service
It registers a callback method and returns a Future
Thread (A) has done its bit and gets returned to its pool once it has returned the Future
The remote web service response returns and Thread (B) gets allocated and calls the callback method (which generally populates the Future with a result I believe)
Q1. I can't get my head out of the blocking thread model: if Thread (A) is no longer listening on that network socket, then how does the response that comes back from the remote service get allocated to Thread (B)? Is it simply treated as a new request coming into the server/container, which then allocates a thread to service it?
Q2. Closely related to Q1 I imagine: if no Thread has the Future, or handler (with its callback method) on its stack, then how does the response from the remote web service get associated with the callback method it needs to call?
Or, in another way of asking, how does Thread B (now dealing with the response) get given a reference to the Future/Callback object?
Very sorry my question is so long - and thanks to anyone who gave their time to read through it! :)
I don't see why you'd add all this complexity by using asynchronous threading.
The way to design an asynchronous soap service:
You have one service sending out a response to a given client / clients.
Those clients work on the response given asynchronously.
When done, they would call another SOAP method to return their response.
The response would just be stored in a queue (e.g. a database table), without any extra logic. You'd have a "Worker" service working on the incoming tasks. If a response is needed, another method on the other remote service would be called. I would store the requests as events in the database, which would later be handled asynchronously by an EventHandler. See
Hexagonal Architecture:
https://www.youtube.com/watch?v=fGaJHEgonKg
Your Q1 and Q2 seem to have more to do with multithreading than they have to do with asynchronous calls.
The magic of asynchronous web service calls is that you don't have to worry about multithreading to handle blocking while waiting for a response.
It's a bit unclear from the question what the specific problem statement is (i.e., what you are hoping to have your application do while blocking or rather than blocking), but here are a couple ways that you could use asynchronous web service calls that will allow you to do other work.
For the following cases, assume that the dispatch() method calls Dispatch.invokeAsync(T msg, AsyncHandler handler) and returns a Future:
1) Dispatch multiple web service requests, so that they run in parallel:
If you have multiple services to consume and they can all execute independently, dispatch them all at once and process the responses when you have received them all.
ArrayList<Future<?>> futures = new ArrayList<Future<?>>();
futures.add(serviceToConsume1.dispatch());
futures.add(serviceToConsume2.dispatch());
futures.add(serviceToConsume3.dispatch());
// now wait until all services return
for (Future<?> f : futures) {
    f.get();
}
// now use responses to continue processing
2) Polling:
Future<?> f = serviceToConsume.dispatch();
while (!f.isDone()) {
    // do other work here
}
// now use response to continue processing

Complex Spring Framework Service Layer

EDITED SHORT VERSION OF THE POST:
Haven't had enough views, so I'm summarizing the question:
My architecture is completely stateless and async: the front-end makes a petition to a REST API and then long-polls for the response. This REST API queues petitions into a messaging queue, and each petition is dequeued and processed by the back-end.
I want this back-end to follow the "traditional" Spring @Service interface and ServiceImpl approach; however, it is kind of hard because of how I'm doing it.
One thread dequeues the petition (Producer) and spawns a new thread (Consumer), which processes the whole petition and later sends the result back to a "responses pool" where it gets polled. That petition might need to use several @Services and merge the responses from each, maybe even the same @Service twice.
How would you do it? For more information, check the description below!
ORIGINAL LONG POST:
I have a large application with 3 layers like this:
Front-end (Spring-MVC): Views and Controllers; the "Model" consists of async requests to the REST API in the Middleware, first queuing the petition and then long-polling for an answer
Middleware (Spring-MVC): The REST API. Two main functions: it receives a petition from the front-end and queues it, and it receives an answer from the back-end and stores it in a responses cache until retrieved by the front-end
Back-End (Spring Standalone App): Producer/Consumer pattern; ONE Producer dequeues petitions and creates a prototype Consumer for each petition. The Consumer implements InitializingBean, so it goes something like this: it is initialized, many @Autowired fields are initialized, and then afterPropertiesSet is executed and many fields which depend on the petition are set.
I also have a Repository Layer of HibernateDaos, which does all the querying to the database.
I'm missing a properly built Service Layer and that's what this question is all about.
Let me put a little bit more context. What I have right now is one single HUGE service with 221 functions (the Consumer's file is very long), and one petition may need to invoke several of these functions, with the result of each merged into a List of DTOs, which is later received by the front-end.
I want to split this one and only service into several, each logically matching its corresponding Repository; however, I've faced the following problems:
Keep this in mind:
One petition has many Actions; one Action is a call to a function of a Service.
Each Consumer is a single and unique Thread.
Every time a Consumer thread starts, a transaction is started, and right before returning it is committed, unless rolled back.
I need all the services of that petition to be executed in the same thread and transaction.
When the consumer runs afterPropertiesSet, several fields specific to that request are initialized by some parameters which are always sent.
With a good Service Layer I want to accomplish the following:
I don't want to have to initialize all these parameters every time for each service of the petition; I want them to be global to the petition/thread. However, I don't want to have to pass them as parameters to all 221 functions.
I want to lazily initialize the new services, only if needed, and when one is initialized, I want to set all the parameters I mentioned above. Each service needs to be a prototype per petition; however, I feel it is dumb to initialize it twice within the same petition (two actions for the same service in one petition). I.e. I want it to behave like "request" scope; however, it is not a request, since it is not web based: it is a new thread started by the Producer when dequeuing a petition.
I was thinking of having a prototype ServicesFactory per Consumer, initialized with all the parameters in the Consumer's afterPropertiesSet. Inside this ServicesFactory all possible services are declared as class fields, and when a specific service is requested, if its field is null it is initialized and all its fields are set; if not null, the same instance is returned. The problem with this approach is that I'm losing dependency injection on all the services. I've been reading about ServiceFactoryBean, thinking maybe this is the way to go, but I really can't get a hold of it. The fact that it needs all the parameters of the Consumer, and that it needs to be a unique ServiceFactoryBean per Consumer, is really confusing.
Any thoughts?
Thanks
Based on the description, I don't think this is a good case for using the prototype scope; in this case the ideal scope seems to be thread scope.
As a solution, the simplest would be to make all services singletons. Then the consumer reads the petition from the inbound queue and starts processing.
One of the services, also a singleton, gets injected into all the services that need it; let's call it PetitionScopedService.
This service internally uses a ThreadLocal, which is a thread-scoped holder for a variable of type PetitionContext. PetitionContext, in turn, contains all information that is global to that petition.
All the consumer needs to do is to set the initial values of the petition context, and any caller of PetitionScopedService on the same thread will be able to read those values in a transparent way. Here is some sample code:
public class PetitionContext {
    // ... just a POJO: getters and setters for everything global to the petition ...
}

@Service
public class PetitionScopedService {

    private final ThreadLocal<PetitionContext> context = new ThreadLocal<PetitionContext>();

    public void doSomethingPetitionSpecific() {
        // ... reads context.get() and uses the petition context ...
    }
}

@Service
public class SomeOtherService {

    @Autowired
    private PetitionScopedService petitionService;

    // ... uses petitionService, a singleton with thread-scoped internal state, effectively thread scoped ...
}
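To make the consumer side concrete, here is a rough sketch. The Petition type, the prototype consumer bean, and the setContext/clearContext helpers are assumptions, not part of the snippet above; PetitionScopedService would need to expose a way to populate and clear its ThreadLocal, and the consumer would call it on its own thread before invoking any services:

// Assumed additions to PetitionScopedService, wrapping its ThreadLocal:
public void setContext(PetitionContext petitionContext) {
    context.set(petitionContext);
}

public void clearContext() {
    context.remove();
}

// Hypothetical prototype consumer bean, one per dequeued petition:
@Component
@Scope("prototype")
public class PetitionConsumer implements Runnable {

    @Autowired
    private PetitionScopedService petitionService;

    private Petition petition;   // hypothetical type for the dequeued message

    public void setPetition(Petition petition) {
        this.petition = petition;
    }

    public void run() {
        PetitionContext context = new PetitionContext();
        // ... copy the parameters that always arrive with the petition into the context ...
        petitionService.setContext(context);
        try {
            // ... call whichever singleton @Service methods the petition's actions need;
            //     on this thread they all read the same PetitionContext ...
        } finally {
            petitionService.clearContext();   // don't leak state into the next petition on a pooled thread
        }
    }
}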
Points 2 and 3 need more reorganizing. I would prefer "Spring Integration" for both the "Middleware" and the "(Spring Standalone App): Producer/Consumer pattern"; Spring Integration was actually made to solve these two points, and you can use publish/subscribe if you are doing two or more actions at the same time. The other point is why you are using REST in the "Middleware": are these "Middleware" services exposed to another app rather than your front end? If not, you can integrate this part into your Spring-MVC front-end app using "content negotiation"; otherwise, if you are going to use "Spring Integration", you will find multiple ways of communicating.

How to handle asynchronous operations in REST

I need to understand what approaches there are to handle asynchronous operations in REST and what their advantages and disadvantages are. Some approaches I found:
Resource Based: Where the status of an operation is modeled as a resource. The user makes an async REST call (PUT, POST, etc.) and gets an Accepted or In-Progress response (202). A status URI is then polled repeatedly via GET to check the status/progress/messages from the operation execution.
Question: How long should this resource stay active on the server? If the client polls at large intervals and the operation completes in between, how do we return the status? It seems like persisting the execution status would work. But how long to persist, when to archive/delete? Is this a standard approach?
Callback Based: Where an async request is required to have a callback URI. The request gets processed asynchronously and upon completion makes a call to the callback URI with the operation status/result.
Question: This seems more elegant and involves less overhead on the server side. But how do we handle scenarios where the callback server is intermittently down, not responding, etc.? Implement typical retries, where the callback URI provides retry configuration as well? Is there any other downside to this approach?
Servlet 3.0 Asynchronous support: Where an HTTP client makes a connection to a Java Servlet, which remains open until it is explicitly closed, and until it is closed the client and server can communicate asynchronously over it.
Question: Since it's the Servlet 3.0 spec, I think Jersey and the Spring REST implementations don't utilize this approach as of now. Is there any specific REST implementation which utilizes a similar approach, or a pointer on ways to make it possible?
Any other approaches, maybe commercial ones?
Spring 3.2+ supports the async features of Servlet 3.0. From the Spring Blog:
you can make any existing controller method asynchronous by changing it to return a Callable. For example a controller method that returns a view name can return Callable<String> instead. A @ResponseBody method that returns an object called Person can return Callable<Person> instead. And the same is true for any other controller return value type.
Jersey 2+ also supports asynchronous servers. See the Asynchronous Services and Clients chapter in the reference docs.
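For illustration, a minimal Spring 3.2+ controller in that style (buildReport is a placeholder for the slow call):

import java.util.concurrent.Callable;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class ReportController {

    @RequestMapping("/report")
    @ResponseBody
    public Callable<String> report() {
        // The servlet thread returns right away; Spring MVC runs the Callable
        // on its async executor and writes the response when it completes.
        return new Callable<String>() {
            public String call() throws Exception {
                return buildReport();   // the long-running operation (placeholder)
            }
        };
    }

    private String buildReport() {
        return "report";
    }
}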
I think the approach depends on the time gap between the initial request and the end of the operation.
For short operations (< 10 s) I would just keep the request open and return the response when the operation finishes;
For long operations (< 30 min) I would use Servlet 3.0 async or the Comet model;
For extremely long operations (hours, days) a good enough way, in my opinion, is just client-based polling or Comet with big timeouts.
I'm dealing with the same situation now and found the common approach of using a Location response header to point to a resource that can be monitored to check status (by polling, of course). That seems to be the best, but in my case I'm not creating a resource, so I don't have a location to check the status from (my async process just builds a cache page).
You can always use your own headers to give an estimated time to complete the operation. Anyway, I'm thinking of using the Retry-After header to give an estimated time. What do you guys think?
I know this is old, but I thought I'd chime in to say that if what you want is something that can scale out in a stateless environment, then you should go with your first option. You can perform the underlying operation anywhere, and if you put the result in something like Redis it will not matter which web server the client makes subsequent polling requests to. I'll usually put the polling interval in the response I send to the client. When a result is ready I will return the client a 303 See Other that includes the id of the result in the URI.
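To tie the resource-based answers together, here is a rough Spring sketch of that flow. The /jobs URLs, the in-memory map standing in for Redis or a DB, and doWork are all placeholders: POST returns 202 with a Location to poll and a Retry-After hint, and the status resource answers with 303 See Other once the result exists:

import java.net.URI;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class JobController {

    private final ConcurrentMap<String, String> results = new ConcurrentHashMap<String, String>();
    private final ExecutorService workers = Executors.newFixedThreadPool(10);

    @RequestMapping(value = "/jobs", method = RequestMethod.POST)
    public ResponseEntity<Void> start(@RequestBody final String payload) {
        final String id = UUID.randomUUID().toString();
        workers.submit(new Runnable() {
            public void run() {
                results.put(id, doWork(payload));   // in real life: Redis, a DB table, etc.
            }
        });
        HttpHeaders headers = new HttpHeaders();
        headers.setLocation(URI.create("/jobs/" + id));
        headers.set("Retry-After", "5");            // suggested polling interval in seconds
        return new ResponseEntity<Void>(headers, HttpStatus.ACCEPTED);       // 202
    }

    @RequestMapping(value = "/jobs/{id}", method = RequestMethod.GET)
    public ResponseEntity<String> status(@PathVariable String id) {
        if (!results.containsKey(id)) {
            return new ResponseEntity<String>("IN_PROGRESS", HttpStatus.OK);
        }
        HttpHeaders headers = new HttpHeaders();
        headers.setLocation(URI.create("/results/" + id));                   // where the result lives
        return new ResponseEntity<String>(headers, HttpStatus.SEE_OTHER);    // 303
    }

    private String doWork(String payload) {
        // ... the long-running operation (placeholder) ...
        return "result of " + payload;
    }
}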
