I'm working on an application that uses Quartz for scheduling Jobs. The Jobs to be scheduled are created programmatically by reading a properties file. My question is: if I have a cluster of several nodes which of these should create schedules programmatically? Only one of these? Or maybe all?
i have used quartz in a web app, where users, among other things, could create quartz jobs that performed certain tasks.
We have had no problems on that app provided that at least the job names are different for each job. You can also have different group names, and if i remember correctly the jobgroup+jobname combination forms a job key.
Anyway we had no problem with creating an running the jobs from different nodes, but quartz at the time(some 6 months ago, i do not believe this has changed but i am not sure) did not offer the possibility to stop jobs in the cluster, it only could stop jobs on the node the stop command was executed on.
If instead you just want to create a fixed number of jobs when the application starts you better delegate that job to one of the nodes, as the jobs name/group will be read from the same properties file for each node, and conflicts will arise.
Have you tried creating them on all of them? I think you would get some conflict because of duplicate names.
So I think one of the members should create the schedules during startup.
You should only have one system scheduling jobs for the cluster if they are predefined in properties like you say. If all of the systems did it you would needlessly recreate the jobs and maybe put them in a weird state if every server made or deleted the same jobs and triggers.
You could simply only deploy the properties for the jobs to one server and then only one server would try to create them.
You could make a separate app that has the purpose of scheduling the jobs and only run it once.
If these are web servers you could make a simple secured REST API that triggers the scheduling process. Then you could write an automated script to access the API and kick off the scheduling of jobs as part of a deployment or whenever else you desired. If you have multiple servers behind a load balancer it should go to only one server and schedule the jobs which quartz would save to the database backed jobstore. The other nodes in the cluster would receive them the next time they update from the database.
Related
I have multiple jobs and they all share the same resource. This resource is some ad-hoc build script, and so it cannot be ran concurrently.
Is it possible to define in Quartz that some jobs cannot run concurrently?
So, if one of the jobs is already running, the spawned job is queued.
I had encounter a similar scenario in my application, try out the below approach and see if it works for you.
Put the code that runs your ad-hoc build script in a synchronized block.
With this only one thread will run your ad-hoc script at a time, even when multiple threads are trying to run the same resource.
With this, you can increase the thread count to a suitable value as well instead of setting it to 1, like below.
spring.quartz.properties.org.quartz.threadPool.threadCount=5
If you want to run multiple quartz scheduler instances on different machines sharing a single database, then you should consider Configure Clustering with JDBC-JobStore. Please refer this link for more info http://www.quartz-scheduler.org/documentation/quartz-2.2.2/configuration/ConfigJDBCJobStoreClustering.html
I have many jobs with Quartz Schedule and I execute in my java app.
(one event on my webapp active a job to run)
My webapp runs in 4 machines with two instances each, so:
how can i make sure that the same job does not run 2 times on 2 different machines?
and how i make sure tha there are no problems with persistence in the database?
You can configure Quartz clustering with JDBC-JobStore. Basically all your instances should share the same database, and Quartz will ensure that only one instance will run a specific job. More info can be found here
I am working on a scheduled job that will run at certain interval (eg. once a day at 1pm), scheduled through Cron. I am working with Java and Spring.
Writing the scheduled job is easy enough - it does: grab list of people will certain criteria from db, for each person do some calculation and trigger a message.
I am working on a single-node environment locally and in testing, however when we go to production, it will be multi-node environment (with load balancer, etc). My concern is how would multi node environment affect the scheduled job?
My guess is I could (or very likely would) end up with triggering duplicate message.
Machine 1: Grab list of people, do calculation
Machine 2: Grab list of people, do calculation
Machine 1: Trigger message
Machine 2: Trigger message
Is my guess correct?
What would be the recommended solution to avoid the above issue? Do I need to create a master/slave distributed system solution to manage multi node environment?
If you have something like three Tomcat instances, each load balanced behind Apache, for example, and on each your application runs then you will have three different triggers and your job will run three times. I don't think you will have a multi-node environment with distributed job execution unless some kind of mechanism for distributing the parts of the job is in place.
If you haven't looked at this project yet, take a peek at Spring XD. It handles Spring Batch Jobs and can be run in distributed mode.
I've got a Spring Web application that's running on two different instances.
The two instances aren't aware of each other, they run on distinct servers.
That application has a scheduled Quartz job but my problem is that the job shouldn't execute simultaneously on the instances, as its a mail sending job, it could cause duplicate emails being sent.
I'm using RAMJobStore and JDBCJobStore is not an option for me due to the large number of tables it requires.(I cant afford to create many tables due to internal restriction)
The solutions I thought about:
-creating a single control table, that has to be checked everytime a job starts (with repeatable read isolation level to avoid concurrency issues) The problem is that if the server is killed, the table might be left in a invalid state.
-using properties to define a single server to be the job running server. Problem is that if that server goes down, jobs will stop running
Has anyone ever experienced this problem and do you have any thoughts to share?
Start with the second solution (deactivate qartz on all nodes except one). It is very simple to do and it is safe. Count how frequently your server goes down. If it is inacceptable then try the first solution. The problem with the first solution is that you need a good skill in mutithreaded programming to implement it without bugs. It is not so simple if multithreading is not your everyday task. And a cost of some bug in your implementation may be bigger than actual profit.
I'm trying to write a Spring web application on a Weblogic server that makes several independent database SELECTs(i.e. they can safely be called concurrently), one of which takes 15 minutes to execute.
Once all the results are fetched, an email containing the results will be sent to a user list.
What's a good way to get around this problem? Is there a Spring library that can help or do I go ahead and create daemon threads to do the job?
EDIT: This will have to be done at the application layer (business requirement) and the email will be sent out by the web application.
Are you sure you are doing everything optimally? 15 minutes is a really long time unless you have a gabillion rows across dozens of tables and need a heckofalot of joins....this is your highest priority -- why is it taking so long?
Do you do the email job at set intervals, or is it invoked from your web app? If set intervals, you should do it in an outside job, possibly on another machine. You can use daemons or the quartz scheduler.
If you need to fire this process off from the web app, you need to do it asynchronously. You could use JMS, or you could just have a table into which you enter a new job request, with daemon process that looks for new jobs every X time period. Firing off background threads is possible, but its error prone and not worth the complication, especially since you have other valid options that are simpler.
If you are asking about Spring support for long-running, possibly asynchronous tasks, you have a choice between Spring JMS support and Spring Batch.
You can use spring quartz to schedule the job. That way the jobs will run in the same container but will not require an http request to trigger them.