Understanding launching of Spring batch Jobs in a web container - java

I was reading the Spring Batch documentation when I came across the fact that we would have to use a different implementation of the TaskExecutor interface (The Asynchronous version) if we would efficiently have to run batch jobs from a web container.
I am assuming that an Http request would trigger the batch job. And as I understand it, when the client launches the job via the run method of the JobLauncher interface, the client has to wait for the JobExecution object to be returned back and since a typical batch job would run for hours at an end, this might not be very feasible if the jobs are executed synchronously. Now, the AsyncTaskExecutor would execute each step in separate threads and would return the JobExecution object immediately with an UNKNOWN status.
Firstly, can someone please explain to me, how this works from a client-server connection perspective? In each case, would the client not wait for the batch to be finished before he terminates the session? Or, would the client not know about the exit status of the batch job? Is the whole problem to do with the connection having to remain till the batch ends?
As an example, say the client has a web page which sends an HTTP get request, which is served by a servlet's doget method. This method calls the run method of the job launcher. This method will return the JobExecution object. And the rest of the story is as above.
Thanks,
Aditya.

It depends a bit on what your servlet does after it has called the run method and received the JobExecution object in return. I will assume that the doget method just returns after run is called.
If you do not use an asynchronous executor the call to the run method on the job launcher will be executed synchronously. That is, the call will wait until the batch job is done and the JobExecution object is returned. From a connection perspective, the client's HTTP connection will remain open during the entire batch job. The HTTP connection will be closed when the servlet's doGet method returns (or before if some kind of timeout is encountered on some level, e.g. firewall or a socket read timeout).
If you do use a asynchronous executor the call to the run method will return immediately. The doGet method will return after that, the HTTP response will be sent to the client and the connection will be closed (assuming there is not HTTP keep-alive).

Running Jobs from within a Web Container
Usually jobs are launched from the command-line. However, there are many cases where launching from an HttpRequest is a better option. Many such use cases include reporting, ad-hoc job running, and web application support. Because a batch job by definition is long running, the most important concern is ensuring to launch the job asynchronously:
Here Spring MVC controller launches a Job using a JobLauncher that has been configured to launch asynchronously, which immediately returns a JobExecution. The Job will likely still be running, however, this nonblocking behaviour allows the controller to return immediately, which is required when handling an HttpRequest.
An example is below:
#Controller
public class JobLauncherController {
#Autowired
JobLauncher jobLauncher;
#Autowired
Job job;
#RequestMapping("/jobLauncher.html")
public void handle() throws Exception{
jobLauncher.run(job, new JobParameters());
}
}
source

Related

Long running job Spring

I want to create a web page which takes a string as an input and start a process. The process will run for long time. I need to email the results after processing.
Which spring API should I use ?
The user will close the browser once he makes the request . I am a newbie to Java EE and spring .
Can anyone say the architecture that is needed to accomplish this ?
You could use some executor service:
#Bean
public ExecutorService executorService() {
return Executors.newCachedThreadPool();
}
Then
executorService.execute(yourLongRunningTaskRunnable);
in your controller, where yourLongRunningTaskRunnable is, of course, a Runnable. This like schedules your task for execution. The task could send an email if needed in the end.
This is not a Spring API, but it seems suitable for such a task.

Small Demo application on DeferredResult In spring

Hi I just Need an application on DeferredResult in Spring MVC which clear it's working.
It is easier when you understand the concept of DeferredResult:
Your controller is eventually a function executed by the servlet container (for that matter, let's assume that the server container is Tomcat) worker thread. Your service flow start with Tomcat and ends with Tomcat. Tomcat gets the request from the client, holds the connection, and eventually returns a response to the client. Your code (controller or servlet) is somewhere in the middle.
Consider this flow:
Tomcat get client request.
Tomcat executes your controller.
Release Tomcat thread but keep the client connection (don't return response) and run heavy processing on different thread.
When your heavy processing complete, update Tomcat with its response and return it to the client (by Tomcat).
Because the servlet (your code) and the servlet container (Tomcat) are different entities, then to allow this flow (releasing tomcat thread but keep the client connection) we need to have this support in their contract, the package javax.servlet, which introduced in Servlet 3.0 . Spring MVC use this new Servlet 3.0 capability when the return value of the controller is DeferredResult (BTW, also Callable). DeferredResult is a class designed by Spring to allow more options (that I will describe) for asynchronous request processing in Spring MVC, and this class just holds the result (as implied by its name) so it means you need some kind of thread that will run you async code. What do you get by using DeferredResult as the return value of the controller? DeferredResult has built-in callbacks like onError, onTimeout, and onCompletion. It makes error handling very easy.
Here you can find a simple working examples I created.
The main part from the github example:
#RestController
public class DeferredResultController {
static ExecutorService threadPool = getThreadPool();
private Request request = new Request();
#RequestMapping(value="/deferredResultHelloWorld/{name}", method = RequestMethod.GET)
public DeferredResult<String> search(#PathVariable("name") String name) {
DeferredResult<String> deferredResult = new DeferredResult<>();
threadPool.submit(() -> deferredResult.setResult(request.runSleepOnOtherService(name)));
return deferredResult;
}
}
https://spring.io/blog/2012/05/14/spring-mvc-3-2-preview-adding-long-polling-to-an-existing-web-application
and the source code is: https://github.com/spring-projects/spring-amqp-samples/tree/spring-mvc-async
for useful articles see blog post series about spring async support:
https://spring.io/blog/2012/05/07/spring-mvc-3-2-preview-introducing-servlet-3-async-support

jersey ws 2.0 #suspended AsyncResponse, what does it do?

I am analyzing some jersey 2.0 code and i have a question on how the following method works:
#Stateless
#Path("/mycoolstuff")
public class MyEjbResource {
…
#GET
#Asynchronous //does this mean the method executes on child thread ?
public void longRunningOperation(#Suspended AsyncResponse ar) {
final String result = executeLongRunningOperation();
ar.resume(result);
}
private String executeLongRunningOperation() { … }
}
Lets say im at a web browser and i type in www.mysite/mycoolstuff
this will execute the method but im not understanding what the asyncResponse is used for neither the #Asynchronous annotation. From the browser how would i notice its asychnronous ? what would be the difference in removing the annotation ? Also the suspended annotation after reading the documentation i'm not clear its purpose.
is the #Asynchronous annotation simply telling the program to execute this method on a new thread ? is it a convenience method for doing "new Thread(.....)" ?
Update: this annotation relieves the server of hanging onto the request processing thread. Throughput can be better. Anyway from the official docs:
Request processing on the server works by default in a synchronous processing mode, which means that a client connection of a request is processed in a single I/O container thread. Once the thread processing the request returns to the I/O container, the container can safely assume that the request processing is finished and that the client connection can be safely released including all the resources associated with the connection. This model is typically sufficient for processing of requests for which the processing resource method execution takes a relatively short time. However, in cases where a resource method execution is known to take a long time to compute the result, server-side asynchronous processing model should be used. In this model, the association between a request processing thread and client connection is broken. I/O container that handles incoming request may no longer assume that a client connection can be safely closed when a request processing thread returns. Instead a facility for explicitly suspending, resuming and closing client connections needs to be exposed. Note that the use of server-side asynchronous processing model will not improve the request processing time perceived by the client. It will however increase the throughput of the server, by releasing the initial request processing thread back to the I/O container while the request may still be waiting in a queue for processing or the processing may still be running on another dedicated thread. The released I/O container thread can be used to accept and process new incoming request connections.
#Suspended have more definite if you used it, else it will not make any difference of using it.
Let's talk about benefits of it:
#Suspended will pause/Suspend the current thread until it gets response,by default #NO_TIMEOUT no suspend timeout set. So it doesn't mean your request response (I/O)thread will get free and be available for other request.
Now Assume you want your service to be a response with some specific time, but the method you are calling from resource not guarantee the response time, then how will you manage your service response time? At that time, you can set suspend timeout for your service using #Suspended, and even provide a fall back response when time get exceed.
Below is some sample of code for setting suspend/pause timeout
public void longRunningOperation(#Suspended AsyncResponse ar) {
ar.setTimeoutHandler(customHandler);
ar.setTimeout(10, TimeUnit.SECONDS);
final String result = executeLongRunningOperation();
ar.resume(result);
}
for more details refer this
The #Suspended annotation is added before an AsyncResponse parameter on the resource method to tell the underlying web server not to expect this thread to return a response for the remote caller:
#POST
public void asyncPost(#Suspended final AsyncResponse ar, ... <args>) {
someAsyncMethodInYourServer(<args>, new AsyncMethodCallback() {
#Override
void completed(<results>) {
ar.complete(Response.ok(<results>).build());
}
#Override
void failed(Throwable t) {
ar.failed(t);
}
}
}
Rather, the AsyncResponse object is used by the thread that calls completed or failed on the callback object to return an 'ok' or throw an error to the client.
Consider using such asynchronous resources in conjunction with an async jersey client. If you're trying to implement a ReST service that exposes a fundamentally async api, these patterns allow you to project the async api through the ReST interface.
We don't create async interfaces because we have a process that takes a long time (minutes or hours) to run, but rather because we don't want our threads to ever sleep - we send the request and register a callback handler to be called later when the result is ready - from milliseconds to seconds later - in a synchronous interface, the calling thread would be sleeping during that time, rather than doing something useful. One of the fastest web servers ever written is single threaded and completely asynchronous. That thread never sleeps, and because there is only one thread, there's no context switching going on under the covers (at least within that process).
The #suspend annotation makes the caller actually wait until your done work. Lets say you have a lot of work to do on another thread. when you use jersey #suspend the caller just sits there and waits (so on a web browser they just see a spinner) until your AsyncResponse object returns data to it.
Imagine you had a really long operation you had to do and you want to do it on another thread (or multiple threads). Now we can have the user wait until we are done. Don't forget in jersey you'll need to add the " true" right in the jersey servlet definition in web.xml to get it to work.

callback function do background jobs after the completing the action in java spring

I am new to Java Spring Framework, I am Rails developer I have requirement in java spring like I need to do background jobs but after the response send to the end User. It should not wait for the jobs to complete. But the jobs should run every time action completes.
Is a webservice app. We have Service, Bo and DAO layers and we are logging any exceptions occurred while processing the user data in database before response send to user, but now we want to move(Exception handling) after response send to user to increase the performance.
I remember in rails we have callbacks/filters after the action executed it calls the methods we want to executed. Same is available in java Spring?
Thanks,
Senthil
I assume the use case is something like a user requests a long-running task, and you want to return a response immediately and then launch the task in the background.
Spring can help with this. See
http://docs.spring.io/spring/docs/3.2.x/spring-framework-reference/html/scheduling.html
In particular see the #Async annotation.
With respect to the client getting a response back following the async processing (exception or otherwise), you can do it, but it's extra work.
Normally the immediate response would include some kind of ID that the client could come back with after some period of time. (For example, when you run a search against the Splunk API, it gives you a job ID, and you come back later with that job ID to check on the result). If this works, do that. The client has to poll but the implementation is the simplest.
If not, then you have to have some way for the client to listen for the response. This could be a "reply-to" web service endpoint on the client (perhaps passed in with the original request as a custom X-Reply-To HTTP header), or it could be a message queue, etc.

Spring's DeferredResult setResult interaction with timeouts

I'm experimenting with Spring's DeferredResult on Tomcat, and I'm getting crazy results. Is what I'm doing wrong, or is there some bug in Spring or Tomcat? My code is simple enough.
#Controller
public class Test {
private DeferredResult<String> deferred;
static class DoSomethingUseful implements Runnable {
public void run() {
try { Thread.sleep(2000); } catch (InterruptedException e) { }
}
}
#RequestMapping(value="/test/start")
#ResponseBody
public synchronized DeferredResult<String> start() {
deferred = new DeferredResult<>(4000L, "timeout\n");
deferred.onTimeout(new DoSomethingUseful());
return deferred;
}
#RequestMapping(value="/test/stop")
#ResponseBody
public synchronized String stop() {
deferred.setResult("stopped\n");
return "ok\n";
}
}
So. The start request creates a DeferredResult with a 4 second timeout. The stop request will set a result on the DeferredResult. If you send stop before or after the deferred result times out, everything works fine.
However if you send stop at the same time as start times out, things go crazy. I've added an onTimeout action to make this easy to reproduce, but that's not necessary for the problem to occur. With an APR connector, it simply deadlocks. With a NIO connector, it sometimes works, but sometimes it incorrectly sends the "timeout" message to the stop client and never answers the start client.
To test this:
curl http://localhost/test/start & sleep 5; curl http://localhost/test/stop
I don't think I'm doing anything wrong. The Spring documentation seems to say it's okay to call setResult at any time, even after the request already expired, and from any thread ("the
application can produce the result from a thread of its choice").
Versions used: Tomcat 7.0.39 on Linux, Spring 3.2.2.
This is an excellent bug find !
Just adding more information about the bug (that got fixed) for a better understanding.
There was a synchronized block inside setResult() that extended up to the part of submitting a dispatch. This can cause a deadlock if a timeout occurs at the same time since the Tomcat timeout thread has its own locking that permits only one thread to do timeout or dispatch processing.
Detailed explanation:
When you call "stop" at the same time as the request "times out", two threads are attempting to lock the DeferredResult object 'deferred'.
The thread that executes the "onTimeout" handler
Here is the excerpt from the Spring doc:
This onTimeout method is called from a container thread when an async request times out before the DeferredResult has been set. It may invoke setResult or setErrorResult to resume processing.
Another thread that executes the "stop" service.
If the dispatch processing called during the stop() service obtains the 'deferred' lock, it will wait for a tomcat lock (say TomcatLock) to finish the dispatch.
And if the other thread doing timeout handling has already acquired the TomcatLock, that thread waits to acquire a lock on 'deferred' to complete the setResult()!
So, we end up in a classic deadlock situation !

Categories

Resources