I'm trying to write a Spring web application on a WebLogic server that makes several independent database SELECTs (i.e. they can safely be run concurrently), one of which takes 15 minutes to execute.
Once all the results are fetched, an email containing the results will be sent to a user list.
What's a good way to get around this problem? Is there a Spring library that can help or do I go ahead and create daemon threads to do the job?
EDIT: This will have to be done at the application layer (business requirement) and the email will be sent out by the web application.
Are you sure you are doing everything optimally? 15 minutes is a really long time unless you have a gabillion rows across dozens of tables and need a heckofalot of joins. This should be your highest priority -- why is it taking so long?
Do you run the email job at set intervals, or is it invoked from your web app? If it runs at set intervals, you should do it in an outside job, possibly on another machine. You can use daemons or the Quartz scheduler.
If you need to fire this process off from the web app, you need to do it asynchronously. You could use JMS, or you could just have a table into which you enter a new job request, with a daemon process that looks for new jobs every X time period. Firing off background threads is possible, but it's error-prone and not worth the complication, especially since you have other valid options that are simpler.
If you are asking about Spring support for long-running, possibly asynchronous tasks, you have a choice between Spring JMS support and Spring Batch.
You can use Spring's Quartz integration to schedule the job. That way the jobs will run in the same container but won't require an HTTP request to trigger them.
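As a rough sketch of what that could look like with Spring's Quartz support (the job class, bean names, and cron expression are all made up, and the query/e-mail logic is only hinted at):

import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Trigger;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.quartz.CronTriggerFactoryBean;
import org.springframework.scheduling.quartz.JobDetailFactoryBean;
import org.springframework.scheduling.quartz.QuartzJobBean;
import org.springframework.scheduling.quartz.SchedulerFactoryBean;

@Configuration
public class ReportJobConfig {

    // the job itself: run the long SELECTs and e-mail the results
    public static class ReportJob extends QuartzJobBean {
        @Override
        protected void executeInternal(JobExecutionContext context) {
            // ... run the queries (possibly concurrently) and send the e-mail ...
        }
    }

    @Bean
    public JobDetailFactoryBean reportJobDetail() {
        JobDetailFactoryBean jobDetail = new JobDetailFactoryBean();
        jobDetail.setJobClass(ReportJob.class);
        jobDetail.setDurability(true);   // keep the job even when no trigger currently points at it
        return jobDetail;
    }

    @Bean
    public CronTriggerFactoryBean reportTrigger(JobDetail reportJobDetail) {
        CronTriggerFactoryBean trigger = new CronTriggerFactoryBean();
        trigger.setJobDetail(reportJobDetail);
        trigger.setCronExpression("0 0 2 * * ?");   // every night at 02:00, as an example
        return trigger;
    }

    @Bean
    public SchedulerFactoryBean scheduler(Trigger reportTrigger) {
        SchedulerFactoryBean scheduler = new SchedulerFactoryBean();
        scheduler.setTriggers(reportTrigger);
        return scheduler;
    }
}

The SchedulerFactoryBean starts Quartz inside the web application, so the job runs in the same container without needing an HTTP request to trigger it.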
I am currently working on a scheduled task that runs behind the scenes of my Spring web application. The task uses a cron schedule to execute at midnight every night and cleans up unused applications for my portal (my site allows users to create an application to fill out, and if they don't access the form within 30 days, my background task deletes it from our DB and emails the user to create a new form if needed). Everything works great in my test environment, and I am ready to move to QA.
However, my next environment uses two load-balanced servers to process requests. This is a problem, because the cron trigger and my polling task run concurrently on both servers. While the reads/writes to the DB won't be an issue, the issue lies with sending the notification email to the application user. Without any polling lock, two emails could be generated and sent, and I would like to avoid this. Normally, we would use a SQL stored procedure and keep a lock field in our DB, then set/release it whenever the polling code is called, so only one instance of the polling is executed. However, for my new polling task we don't have any fields available, so I am looking for a Spring solution. I found this resource online:
http://www.springframework.net/doc-latest/reference/html/threading.html
And I was thinking of using it like this:
Semaphore _pollingLock = new Semaphore(1);
_pollingLock.acquire();
try {
    // run my polling task
}
finally {
    _pollingLock.release();   // release lock
}
However, I'm not sure whether this will just make the second instance execute afterwards, or whether it will skip the second instance entirely so it never executes. Or is this solution not even appropriate, and is there a better one? Again, I am using the Spring Java framework, so any solution that exists there would be my best bet.
Two ways that we've handled this sort of problem in the past both start with designating one of our clustered servers as the one responsible for a specific task (say, sending email, or running a job).
In one solution, we set a JVM parameter on all clustered servers identifying the server name of the one server on which your process should run. For example -DemailSendServer=clusterMember1
In another solution, we simply provided a JVM parameter in the startup of this designated server alone. For example -DsendEmailFromMe=true
In both cases, you can add a tiny bit of code in your process to gate it based on the value or presence of the startup parameter.
I've found the second option simpler to use since the presence of the parameter is enough to allow the process to run. In the first solution, you would have to compare the current server name against the value of the parameter instead.
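For what it's worth, the gate itself is tiny. A sketch assuming the -DsendEmailFromMe=true flag from the example above (the task and method names are made up):

public class NightlyCleanupTask {

    public void run() {
        // Boolean.getBoolean reads the system property and is true only if it equals "true"
        if (!Boolean.getBoolean("sendEmailFromMe")) {
            return;   // not the designated cluster member, so skip the cleanup/e-mail work
        }
        // ... delete stale applications and send the notification e-mails ...
    }
}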
We haven't done much with Spring Batch, but I would assume there is a way to configure Batch to run a job on a single server within a cluster as well.
I need some design and development input on reading messages from a queue. I have the following requirements and constraints:
I need to read messages from a queue and insert them into a DB.
Messages can arrive at any interval (hundreds at the same time, or one by one with a few minutes' gap).
I don't have an MDB container to host them (just a plain Tomcat server).
I need to write a Java application to perform the above.
So I'm not very sure how to structure this simple application.
If I use the Quartz scheduler to trigger a job that reads all the messages in the queue, I'm not sure whether the next instance of the job might start before the previous one completes and cause problems.
Please suggest any input.
This is basically a utility, so I don't want to spend too much time or too many resources on it.
Thanks & regards,
LR
Using an ESB like Mule or Camel would simplify your development a lot. You'd find already-developed components (called endpoints) for reading from a queue and writing into a DB, and also for scheduling jobs with Quartz.
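As a very rough Camel sketch, assuming an ActiveMQ component and a messageDao bean are registered in the Camel registry (both names are made up, not something Camel provides out of the box):

import org.apache.camel.builder.RouteBuilder;

public class QueueToDatabaseRoute extends RouteBuilder {
    @Override
    public void configure() {
        // consume messages as they arrive, whether hundreds at once or one every few minutes
        from("activemq:queue:incomingMessages")
            // hand each message body to the DAO that performs the DB insert
            .to("bean:messageDao?method=save");
    }
}

Because the route consumes messages as they arrive, there is no scheduled polling job that could overlap with a previous run.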
I have a user workflow where, at a specific point, a web service is called and the results are presented to the user.
Based on the search request and the queried results, I want to perform some database updates and statistics logging.
Since the workflow pauses while the web service is being called, I thought about creating some kind of background thread that performs these database actions, so the user can continue the workflow without having to wait for them to complete.
Do you think this is good practice? How could I create such one-off background threads?
If you only want to run the task in the background, then an ExecutorService is a good solution.
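A minimal fire-and-forget sketch, assuming you already have some method that does the statistics logging (the names and the pool size are made up, and in a real app the pool would be created once, e.g. as a Spring bean, and shut down when the application stops):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BackgroundStatisticsWriter {

    private final ExecutorService executor = Executors.newFixedThreadPool(2);

    public void submitStatistics(String searchRequest) {
        // runs on a pool thread; the caller returns immediately and the workflow continues
        executor.submit(() -> logSearchStatistics(searchRequest));
    }

    private void logSearchStatistics(String searchRequest) {
        // ... perform the database updates / statistics logging here ...
    }
}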
If you need to ensure that queued requests survive events like a server restart, then you need a persistent queue like a JMS Queue. There are some nice, free open source JMS implementations that serve this purpose.
If the service call takes little time (say 1 or 2 seconds), then it is a waste of effort to develop such a feature.
If it takes a significant amount of time, you should do this in the background.
I'm working on an application that uses Quartz for scheduling jobs. The jobs to be scheduled are created programmatically by reading a properties file. My question is: if I have a cluster of several nodes, which of them should create the schedules programmatically? Only one of them? Or maybe all of them?
I have used Quartz in a web app where users, among other things, could create Quartz jobs that performed certain tasks.
We have had no problems in that app, provided that at least the job names are different for each job. You can also have different group names, and if I remember correctly the job group + job name combination forms a job key.
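For illustration, this is roughly what that looks like with the Quartz 2.x builder API (MyJob, the names, and the cron expression are all made up):

import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

public class JobRegistrationExample {

    public static class MyJob implements Job {
        @Override
        public void execute(JobExecutionContext context) {
            // ... the scheduled work would go here ...
        }
    }

    public void registerJob() throws SchedulerException {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        JobDetail job = JobBuilder.newJob(MyJob.class)
                .withIdentity("nightlyCleanup", "portalJobs")   // name + group together form the JobKey
                .build();

        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("nightlyCleanupTrigger", "portalJobs")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0 0 * * ?"))
                .build();

        // Throws ObjectAlreadyExistsException if a job with the same JobKey is already registered,
        // which is exactly the conflict you would hit if every node created the same jobs.
        scheduler.scheduleJob(job, trigger);
    }
}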
Anyway, we had no problem with creating and running the jobs from different nodes, but Quartz at the time (some six months ago; I do not believe this has changed, but I am not sure) did not offer the possibility of stopping jobs across the cluster; it could only stop jobs on the node where the stop command was executed.
If instead you just want to create a fixed number of jobs when the application starts, you had better delegate that task to one of the nodes, as the job names/groups will be read from the same properties file on each node and conflicts will arise.
Have you tried creating them on all of them? I think you would get some conflict because of duplicate names.
So I think one of the members should create the schedules during startup.
You should have only one system scheduling jobs for the cluster if they are predefined in properties as you say. If all of the systems did it, you would needlessly recreate the jobs, and you might put them in a weird state if every server created or deleted the same jobs and triggers.
You could simply deploy the job properties to only one server, and then only that server would try to create them.
You could make a separate app that has the purpose of scheduling the jobs and only run it once.
If these are web servers, you could make a simple, secured REST API that triggers the scheduling process. Then you could write an automated script that calls the API and kicks off the scheduling of jobs as part of a deployment, or whenever else you desire. If you have multiple servers behind a load balancer, the request should go to only one server, which would schedule the jobs that Quartz saves to the database-backed job store. The other nodes in the cluster would pick them up the next time they update from the database.
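As a rough illustration only (Spring MVC is assumed here even though you haven't named a web framework, JobBootstrapper is a stand-in for whatever code reads your properties file and creates the jobs, and securing the endpoint is left out):

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class JobSchedulingController {

    // hypothetical interface: whatever code currently reads the properties and registers the jobs
    public interface JobBootstrapper {
        void registerJobsFromProperties();
    }

    private final JobBootstrapper bootstrapper;

    public JobSchedulingController(JobBootstrapper bootstrapper) {
        this.bootstrapper = bootstrapper;
    }

    // Called once by a deployment script; behind a load balancer only one node receives the request,
    // and Quartz persists the jobs to the shared job store for the rest of the cluster.
    @PostMapping("/admin/schedule-jobs")
    public ResponseEntity<Void> scheduleJobs() {
        bootstrapper.registerJobsFromProperties();
        return ResponseEntity.accepted().build();
    }
}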
Asynchronous jobs, such as downloading scores from the website, or sending emails after the completion of some critical tasks.
Right now, when we download some scores, we have to wait on the current page to get the response page or to get the file downloaded.
Is there a possibility that I can click on "download scores" and it happens in the background, so that I can navigate to other parts of the website and check the status of the job in the meantime? Or schedule some job for later and get its execution results via email?
Ours is a Struts 2 web application with Hibernate 3.5 as the ORM. After browsing some Java scheduling libraries, I got some info on Quartz.
But is Quartz the right library for the above requirements, or is there another library that I can try?
Please guide me in the right direction.
You will need some sort of asynchronous processing support. You can use:
quartz-scheduler - this library is very comprehensive and allows you to schedule all sorts of jobs. If you only want to use it to run jobs in the background immediately, it might be overkill.
a thread pool - see the Executors class
a JMS queue - it can receive requests and process them asynchronously in MDBs
Finally, you can take advantage of @Async/@Asynchronous support in Spring or EJB (a short sketch follows).
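A rough sketch of the Spring @Async option, not a drop-in implementation: it needs @EnableAsync (or <task:annotation-driven/>) in your configuration, and the service, method, and file handling below are made up:

import java.io.File;
import java.util.concurrent.Future;

import org.springframework.scheduling.annotation.Async;
import org.springframework.scheduling.annotation.AsyncResult;
import org.springframework.stereotype.Service;

@Service
public class ScoreDownloadService {

    @Async
    public Future<File> downloadScores(long userId) {
        File scores = fetchScoresFromSite(userId);   // hypothetical long-running download
        return new AsyncResult<File>(scores);        // the caller can poll isDone()/get() later
    }

    private File fetchScoresFromSite(long userId) {
        // ... the actual download logic would go here ...
        return new File("/tmp/scores-" + userId + ".csv");
    }
}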
Then you must somehow return the results. Depending on whether you want to deliver them directly in the browser or via e-mail:
Every time you render a page, check whether there are any completed or in-progress jobs. If there are completed jobs, display an extra link somewhere on the page (a sort of notification). If a job is in progress, start an AJAX request and poll every couple of seconds, or use long-polling/Comet to receive the result immediately.
If you want to send the results by e-mail, just send them after the job finishes. Much simpler, but less user-friendly IMHO.
Quartz is certainly one way to do that - and works well if you want to schedule a job to run at a particular time or with a particular frequency.
If you just want to kick something off in the background in response to a user action, and check its status, there are a few other ways to do it which may be better suited to this pattern:
the java.util.concurrent package: you can set up a ThreadPoolExecutor and submit tasks to it that implement Callable. You get back a Future<T> object that you can check for completion (isDone) and get its result when complete (get); see the sketch after this list.
with EJB or Spring, there is also the concept of a (session) bean method being @Async or @Asynchronous, which returns a Future<T> as well and behaves as above. Basically this just abstracts the thread-pool creation and management away from your code and moves it into the container or framework.
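A minimal sketch of the java.util.concurrent approach (the pool size and the downloadScores() stub are made up, and in a real app the pool would live for the lifetime of the application rather than being created per request):

import java.io.File;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScoreDownloadExample {

    private static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    public Future<File> startDownload() {
        return POOL.submit(new Callable<File>() {
            @Override
            public File call() {
                return downloadScores();   // hypothetical long-running task
            }
        });
    }

    public void checkStatus(Future<File> pendingDownload) throws ExecutionException, InterruptedException {
        if (pendingDownload.isDone()) {
            File scores = pendingDownload.get();   // returns immediately once the task has completed
            // ... hand the file to the user, e.g. as a download link ...
        }
    }

    private File downloadScores() {
        // ... the actual long-running download would go here ...
        return new File("/tmp/scores.csv");
    }
}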