Spring Batch: Single job instance listener - java

I want to make sure that only one job instance is allowed to run, and if another instance is already running, stop it.
So I implemented a listener that checks the number of running jobs, as shown below (I'm not sure this is the correct way to stop the jobs):
public class SingleJobInstanceListener implements JobExecutionListener {
@Override
public void beforeJob(JobExecution jobExecution) {
final String jobName = jobExecution.getJobInstance().getJobName();
LOGGER.info("Listener to check one {} job instance is running", jobName);
if (Constants.JOB_NAME.equals(jobName)) {
final Set<JobExecution> executionSet = jobExplorer.findRunningJobExecutions(jobName);
if (executionSet.size() > 1) {
for (JobExecution execution : executionSet) {
execution.stop();
LOGGER.info("{} job instance {} is stopped", jobName, execution.getJobInstance().getInstanceId());
}
}
}
}
}
to run the job:
@Scheduled(cron = "0/10 * * * * *")
public void runSpringBatchJob() {
LOGGER.info("Job was started");
final JobExecution streamExecution = jobLauncher.run(job, newExecution());
LOGGER.info("Exit data job with status: {}", streamExecution.getStatus());
LOGGER.info("Exit data job ID: {}", streamExecution.getJobId());
LOGGER.info("-----------------------------------------------------");
}
LOG:
SpringBatchJobLauncher : Job was started
SingleJobInstanceListener : Listener to check one testJob job instance is running
SingleJobInstanceListener : testJob job instance 178 is stopped
SingleJobInstanceListener : testJob job instance 160 is stopped
SingleJobInstanceListener : testJob job instance 154 is stopped
StreamWriter : Writing data: [......]
SpringBatchJobLauncher : Exit data job with status: COMPLETED
SpringBatchJobLauncher : Exit data job ID: 178
SpringBatchJobLauncher : -----------------------------------------------------
SpringBatchJobLauncher : Job was started
SingleJobInstanceListener : Listener to check one testJob job instance is running
SingleJobInstanceListener : testJob job instance 160 is stopped
SingleJobInstanceListener : testJob job instance 179 is stopped
SingleJobInstanceListener : testJob job instance 154 is stopped
StreamWriter : Writing data: [......]
SpringBatchJobLauncher : Exit data job with status: COMPLETED
SpringBatchJobLauncher : Exit data job ID: 179
So my question is: what causes the job to be instantiated 3 times (each time the job runs, 3 instances show up), and if job 178 was stopped, why is it still running, as shown by "Exit data job ID: 178"?

The feature you are trying to implement is already provided by Spring Batch, given that you correctly define a centralized transactional job repository and correctly design job instances with distinct identifying job parameters.
With this listener approach, you are running an additional job execution and checking if there is another one currently running. With Spring Batch's built-in feature, the second job execution is not even started: the job launcher prevents that at startup time (this fail-fast approach is more efficient).
With locally scheduled jobs like you seem to have, a good identifying job parameter could be the run timestamp. For example, you will have a job instance every 10 minutes. Now if another scheduled method tries to run the same job instance, say the one for 9:10, at the same time as the first method, then Spring Batch will prevent one of them from running, thanks to the centralized transactional job repository approach.
This also works in a clustered environment where another instance of your application (at the JVM level) tries to run the same job instance of 9:10 at the same time as the first application. Spring Batch will also prevent a duplicate job execution of the same instance. However, in a distributed environment where clock synchronization is not guaranteed, the single timestamp parameter won't be enough to identify job instances. You would need to add another discriminator to the set of identifying job parameters in order to uniquely identify job instances at the cluster level.
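To make the idea of an identifying parameter more concrete, here is a minimal sketch in plain Java. It is not Spring Batch code: JobInstanceRegistry is a hypothetical stand-in for the shared job repository, and slotStart/instanceKey are names invented for illustration. The point it tries to show is that the identifying timestamp should be the scheduled slot (for example 9:10:00), not the exact moment each launcher happens to fire, so that two launchers firing a few seconds apart derive the same job instance.

```java
import java.time.Instant;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: this is NOT Spring Batch code. JobInstanceRegistry is a
// hypothetical stand-in for the shared, transactional job repository.
class JobInstanceRegistry {
    // Keys of job instances that have already been claimed by some launcher.
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    // Rounds a launch time down to the start of its schedule slot, so every
    // launcher firing in the same slot derives the same identifying value.
    static long slotStart(Instant launchTime, long periodSeconds) {
        long epoch = launchTime.getEpochSecond();
        return epoch - Math.floorMod(epoch, periodSeconds);
    }

    // A job instance is identified by the job name plus its identifying parameter.
    static String instanceKey(String jobName, long slotStart) {
        return jobName + "@" + slotStart;
    }

    // Atomic check-and-claim: only the first caller per instance key gets true.
    boolean tryClaim(String key) {
        return claimed.add(key);
    }

    public static void main(String[] args) {
        JobInstanceRegistry registry = new JobInstanceRegistry();
        long period = 600; // 10-minute schedule, as in the 9:10 example above
        String a = instanceKey("testJob", slotStart(Instant.parse("2024-01-01T09:10:03Z"), period));
        String b = instanceKey("testJob", slotStart(Instant.parse("2024-01-01T09:10:07Z"), period));
        System.out.println("same instance: " + a.equals(b));
        System.out.println("first launcher may run: " + registry.tryClaim(a));
        System.out.println("second launcher may run: " + registry.tryClaim(b));
    }
}
```

In a real setup the claim would be made by the job repository inside a database transaction rather than by an in-memory set; the in-memory set only illustrates the "first one wins" behaviour.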

Related

Quartz trigger does not fire immediately

I'd like to execute the job ~immediately with the Quartz scheduler using a JDBC datastore. However, I see a 20-30 second delay between scheduling and the trigger firing, even when I schedule with now() or call triggerJob.
I tried to execute the job with a simple trigger:
JobKey key = //...
JobDetail jobDetail = newJob(jobBean.getClass())
.withIdentity(key)
.usingJobData(new JobDataMap(jobParams))
.storeDurably()
.build();
Trigger trigger = newTrigger()
.withIdentity(key.getName(), key.getGroup())
.startNow()
.withSchedule(SimpleScheduleBuilder.simpleSchedule()
.withMisfireHandlingInstructionFireNow()
.withRepeatCount(0))
.build();
scheduler.scheduleJob(jobDetail, trigger);
And I also tried to trigger with scheduler:
JobKey key = // ...
JobDetail jobDetail = newJob(jobBean.getClass())
.withIdentity(key)
.storeDurably()
.build();
scheduler.addJob(jobDetail, true);
scheduler.triggerJob(key, new JobDataMap(jobParams));
Here are the listener logs that show the delay.
2019-05-15 13:59:52,066Z INFO [nio-8081-exec-2] c.m.f.s.logger.SchedulingListener : Job added: newsJobTemplate:1557928791965
2019-05-15 13:59:52,066Z INFO [nio-8081-exec-2] c.m.f.s.logger.SchedulingListener : Job scheduled: newsJobTemplate:1557928791965
2019-05-15 14:00:18,660Z INFO [eduler_Worker-1] c.m.f.s.logger.TriggerStateListener : Trigger fired: QUARTZ_JOBS.newsJobTemplate:1557928791965 {}
2019-05-15 14:00:18,703Z INFO [eduler_Worker-1] c.m.f.s.logger.JobExecutionListener : Job will be executed: QUARTZ_JOBS.newsJobTemplate:1557928791965
2019-05-15 14:00:19,284Z INFO [eduler_Worker-1] c.m.f.s.logger.JobExecutionListener : Job was executed: QUARTZ_JOBS.newsJobTemplate:1557928791965
I found crumbs here and there that suggested that the problem is transaction related.
So I removed @Transactional from the service method, and voilà, it worked.
It looks like when you call trigger, the scheduler thread asynchronously tries to look up schedules and triggers in the DB, but the transaction is not yet committed at that time. Later the scheduler thread queries the DB again and finally finds them.
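The visibility problem described above can be modelled in a few lines of plain Java. This is only a sketch: TriggerStore and Tx are hypothetical classes made up for illustration, not Quartz or Spring types, and everything runs on a single thread. It shows that a lookup made while the transaction is still open finds nothing, whereas a lookup made from a commit hook finds the trigger.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: TriggerStore and Tx are hypothetical, not Quartz or Spring classes.
class TriggerStore {
    // Rows visible to other threads, i.e. committed rows.
    private final Set<String> committed = new HashSet<>();

    // What the scheduler thread can find when it is woken up.
    boolean schedulerSees(String triggerName) {
        return committed.contains(triggerName);
    }

    Tx begin() {
        return new Tx();
    }

    // Minimal transaction: buffers inserts and runs hooks right after commit.
    class Tx {
        private final List<String> pending = new ArrayList<>();
        private final List<Runnable> commitHooks = new ArrayList<>();

        void insertTrigger(String triggerName) {
            pending.add(triggerName);
        }

        void addCommitHook(Runnable hook) {
            commitHooks.add(hook);
        }

        void commit() {
            committed.addAll(pending);          // rows become visible
            commitHooks.forEach(Runnable::run); // only now signal the scheduler
        }
    }

    public static void main(String[] args) {
        TriggerStore store = new TriggerStore();
        TriggerStore.Tx tx = store.begin();
        tx.insertTrigger("newsJobTemplate");

        // Signalling inside the open transaction: the row is not visible yet.
        System.out.println("before commit: " + store.schedulerSees("newsJobTemplate"));

        // Signalling from a commit hook: the row is visible when the hook runs.
        tx.addCommitHook(() ->
            System.out.println("after commit: " + store.schedulerSees("newsJobTemplate")));
        tx.commit();
    }
}
```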
zolee's answer describes the problem perfectly, but there are also a few things one can do to solve it.
One imperfect solution is to reduce org.quartz.scheduler.idleWaitTime. In fact, the problem itself is described, though somewhat obliquely, in the quartz configuration doc, org.quartz.scheduler.idleWaitTime section.
Normally you should not have to ‘tune’ this parameter, unless you’re
using XA transactions, and are having problems with delayed firings of
triggers that should fire immediately.
That will allow you to reduce the 30-second delay to 5 seconds or even less.
A full solution is to extend QuartzScheduler to add transaction support. Exact implementation will depend on what library/code you're using for transaction support, but it worked for us perfectly.
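class TransactionAwareScheduler extends QuartzScheduler {
    // Constructor omitted; insideTransaction and transaction stand for
    // whatever your transaction library exposes.
    @Override
    protected void notifySchedulerThread(long candidateNewNextFireTime) {
        if (insideTransaction) {
            // Defer the wake-up until the new trigger rows are committed.
            transaction.addCommitHook(() -> {
                super.notifySchedulerThread(candidateNewNextFireTime);
            });
        } else {
            super.notifySchedulerThread(candidateNewNextFireTime);
        }
    }
}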

start the job immediately using quartz framework?

I am running some jobs periodically using the Quartz framework. I have the quartz_config XML file below, which contains all the jobs I am running and their intervals.
<job-scheduling-data
xmlns="http://www.quartz-scheduler.org/xml/JobSchedulingData"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.quartz-scheduler.org/xml/JobSchedulingData http://www.quartz-scheduler.org/xml/job_scheduling_data_2_0.xsd"
version="1.8">
<schedule>
<job>
<name>ParserApp</name>
<job-class>com.process.task.ParserApp</job-class>
</job>
<trigger>
<cron>
<name>ParserApp</name>
<job-name>ParserApp</job-name>
<cron-expression>0 0 0/3 1/1 * ? *</cron-expression>
</cron>
</trigger>
</schedule>
</job-scheduling-data>
I am running the ParserApp job every 3 hours. What I have noticed is that whenever I start my application, it doesn't start the ParserApp job immediately. Instead, it starts the ParserApp job only after 3 hours, which is correct as per the cron expression. Is there any way to start the ParserApp job immediately whenever the application starts up, and have the next run happen 3 hours later, just like Java's ScheduledExecutorService does?
In the code below, as soon as you start executorService, it calls parserApp immediately and then calls parserApp again every 3 hours. Is there any way to do the same thing using quartz-scheduler?
executorService.scheduleAtFixedRate(new Runnable() {
@Override
public void run() {
parserApp();
}
}, 0, 3, TimeUnit.HOURS);
Below is how I am starting all the jobs using quartz scheduler:
StdSchedulerFactory factory = new StdSchedulerFactory();
try {
factory.initialize(App.class.getClassLoader().getResourceAsStream("quartz.properties"));
Scheduler scheduler = factory.getScheduler();
// starts all our jobs using quartz_config.xml file
scheduler.start();
} catch (SchedulerException ex) {
logger.logError("error while starting scheduler= ", ExceptionUtils.getStackTrace(ex));
}

Unable to Execute More than a spark Job "Initial job has not accepted any resources"

Using standalone Spark with Java to execute the code snippet below, the application status is always WAITING with the error below. It doesn't work when I try to add the print statement. Is there any configuration I might have missed to run multiple jobs?
15/09/18 15:02:56 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MapPartitionsRDD[2] at filter at SparkTest.java:143)
15/09/18 15:02:56 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
15/09/18 15:03:11 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/09/18 15:03:26 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/09/18 15:03:41 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
JavaRDD<String> words = input.flatMap(new FlatMapFunction<String, String>() //Ln:143
{
public Iterable<String> call(String x)
{
return Arrays.asList(x.split(" "));
}
});
// Count all the words
System.out.println("Total words is" + words.count());
This error message means that your application is requesting more resources from the cluster than the cluster can currently provide i.e. more cores or more RAM than available in the cluster.
One of the reasons for this could be that you already have a job running which uses up all the available cores.
When this happens, your job is most probably waiting for another job to finish and release resources.
You can check this in the Spark UI.

Quartz scheduler shuts down before job execution

I'm new to the Quartz scheduler. What I am trying to achieve is to fire one trigger in the future and then shut down the scheduler. I am using scheduler.shutdown(true) for this, but it shuts down before executing the job. I have to call shutdown() as I am going to implement the scheduler in a web app.
So how do I shut down the scheduler after the job executes?
JOB:
public class HelloJob implements Job {
public HelloJob(){
}
public void execute(JobExecutionContext context)
throws JobExecutionException {
System.out.println("Hello Quartz on " + new Date());
}
}
Scheduler:
public class QuartzTest {
public void scheduleLoad(String time) {
try {
// Transform user input into a date
SimpleDateFormat dateFormat = new SimpleDateFormat("MM/dd/yyyy:HH:mm:ss");
Date scheduleDate = dateFormat.parse(time);
// Print Current vs. Scheduled time/date
System.out.println("Current time - " + new Date());
System.out.println("Scheduled time - " + scheduleDate);
// Grab the Scheduler instance from the Factory
Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
// and start it off
scheduler.start();
// Define a job and tie it to a class
JobDetail job = newJob(HelloJob.class)
.withIdentity("job1", "group1")
.build();
// Trigger the job to run once at the scheduled date
SimpleTrigger trigger = (SimpleTrigger) newTrigger()
.withIdentity("trigger1", "group1")
.startAt(scheduleDate)
.forJob("job1","group1")
.build();
// Schedule job using trigger
scheduler.scheduleJob(job, trigger);
// Shutdown the scheduler after job is executed
scheduler.shutdown(true);
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
String runTime = "04/10/2013:20:07:00";
QuartzTest quartz = new QuartzTest();
quartz.scheduleLoad(runTime);
}
}
Output:
Current time - Wed Apr 10 20:06:31 IST 2013
Scheduled time - Wed Apr 10 20:07:00 IST 2013
[main] INFO org.quartz.impl.StdSchedulerFactory - Using default implementation for ThreadExecutor
[main] INFO org.quartz.simpl.SimpleThreadPool - Job execution threads will use class loader of thread: main
[main] INFO org.quartz.core.SchedulerSignalerImpl - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
[main] INFO org.quartz.core.QuartzScheduler - Quartz Scheduler v.2.1.7 created.
[main] INFO org.quartz.simpl.RAMJobStore - RAMJobStore initialized.
[main] INFO org.quartz.core.QuartzScheduler - Scheduler meta-data: Quartz Scheduler (v2.1.7) 'DefaultQuartzScheduler' with instanceId 'NON_CLUSTERED'
Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
NOT STARTED.
Currently in standby mode.
Number of jobs executed: 0
Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.
[main] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
[main] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler version: 2.1.7
[main] INFO org.quartz.core.QuartzScheduler - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
[main] INFO org.quartz.core.QuartzScheduler - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED shutting down.
[main] INFO org.quartz.core.QuartzScheduler - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED paused.
[main] INFO org.quartz.core.QuartzScheduler - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED shutdown complete.
quartz.properties:
org.quartz.scheduler.instanceName = MyScheduler
org.quartz.threadPool.threadCount = 3
org.quartz.jobStore.class = org.quartz.simpl.RAMJobStore
org.quartz.scheduler.skipUpdateCheck: true
I think you're misunderstanding the purpose of scheduler.shutdown(true). It waits for executing jobs to finish, but it does NOT wait for scheduled jobs to start and finish. Your job has not started by the time you shut down the scheduler. You could put a Thread.sleep(wait); before you shut it down. To get your code to run as I understand you want it, remove this line (you basically don't ever need to shut down the scheduler):
// Shutdown the scheduler after job is executed
scheduler.shutdown(true);
Also, move this line so that it gets executed only once. Where you put it depends on your application: it could be in the main method for a standalone app, or in the init method of a Servlet or a Listener if running in a web application:
// and start it off
scheduler.start();
Followup:
I don't understand why you need to shut down Quartz. What happens if another user needs to schedule a task? Are you planning on starting a different Quartz instance per each scheduled job? It would make more sense to just have it running and schedule tasks as needed. That's the normal way to run Quartz. You might be able to have it behave in the way you want, but it might be contrived. If you really just want something that starts up and shuts down after the task runs, you might want to look at the Timer and TimerTask provided by the JDK. See example here.
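Since the linked Timer/TimerTask example is not reproduced here, the following is a minimal sketch of what such a one-shot task could look like using only JDK classes. OneShotTimer and runOnceAt are names invented for this illustration. The task cancels its own timer once it has run, and the returned latch lets the caller wait for completion instead of shutting down too early.

```java
import java.util.Date;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative helper, not part of Quartz: runs one task at a given time,
// then stops the timer thread.
class OneShotTimer {
    // Schedules the task for 'when' and returns a latch that is released
    // after the task has finished.
    static CountDownLatch runOnceAt(Date when, Runnable task) {
        Timer timer = new Timer("one-shot");
        CountDownLatch done = new CountDownLatch(1);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                try {
                    task.run();
                } finally {
                    timer.cancel();   // no further tasks; lets the timer thread end
                    done.countDown(); // tell the caller the job has executed
                }
            }
        }, when);
        return done;
    }

    public static void main(String[] args) throws InterruptedException {
        Date when = new Date(System.currentTimeMillis() + 200);
        CountDownLatch done = runOnceAt(when,
            () -> System.out.println("Hello on " + new Date()));
        // Wait for the job to run instead of shutting down right after scheduling.
        done.await(5, TimeUnit.SECONDS);
    }
}
```

The same "wait on a latch, then shut down" idea could presumably be applied to Quartz by counting the latch down at the end of the job, but that variant is not shown or tested here.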

Quartz Enterprise Scheduler: Job that schedules itself

I am using Quartz Enterprise Job Scheduler (1.8.3). The job configuration comes from several xml files and we have a special job that detects changes in these xml files and re-schedules jobs. This works dandy, but the problem is that I also need this "scheduler job" to re-schedule itself. Once this job re-schedules itself, for some reason, I see that it gets executed many times. I don't see any exceptions, though.
I have replicated and isolated the problem. This would be the entry-point:
public class App {
public static void main(final String[] args) throws ParseException, SchedulerException {
// get the scheduler from the factory
final Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
// start the scheduler
scheduler.start();
// schedule the job to run every 20 seconds
final JobDetail jobDetail = new JobDetail("jobname", "groupname", TestJob.class);
final Trigger trigger = new CronTrigger("triggername", "groupname", "*/20 * * * * ?");
// set the scheduler in the job data map, so the job can re-configure itself
jobDetail.getJobDataMap().put("scheduler", scheduler);
// schedule job
scheduler.scheduleJob(jobDetail, trigger);
}
}
And this would be the job class:
public class TestJob implements Job {
private final static Logger LOG = Logger.getLogger(TestJob.class);
private final static AtomicInteger jobExecutionCount = new AtomicInteger(0);
public void execute(final JobExecutionContext context) throws JobExecutionException {
// get the scheduler from the data map
final Scheduler scheduler = (Scheduler) context.getJobDetail().getJobDataMap().get("scheduler");
LOG.info("running job! " + jobExecutionCount.incrementAndGet());
// build the job detail and trigger
final JobDetail jobDetail = new JobDetail("jobname", "groupname", TestJob.class);
// this time, schedule it to run every 50 secs
final Trigger trigger;
try {
trigger = new CronTrigger("triggername", "groupname", "*/50 * * * * ?");
} catch (final ParseException e) {
throw new JobExecutionException(e);
}
trigger.setJobName("jobname");
trigger.setJobGroup("groupname");
// set the scheduler in the job data map, so this job can re-configure itself
jobDetail.getJobDataMap().put("scheduler", scheduler);
try {
scheduler.rescheduleJob(trigger.getName(), jobDetail.getGroup(), trigger);
} catch (final SchedulerException e) {
throw new JobExecutionException(e);
}
}
}
I've tried both with scheduler.rescheduleJob and with scheduler.deleteJob then scheduler.scheduleJob. No matter what I do, this is the output I get (I'm using log4j):
23:22:15,874 INFO SchedulerSignalerImpl:60 - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
23:22:15,878 INFO QuartzScheduler:219 - Quartz Scheduler v.1.8.3 created.
23:22:15,883 INFO RAMJobStore:139 - RAMJobStore initialized.
23:22:15,885 INFO QuartzScheduler:241 - Scheduler meta-data: Quartz Scheduler (v1.8.3)
'MyScheduler' with instanceId '1'
Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
NOT STARTED.
Currently in standby mode.
Number of jobs executed: 0
Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 3 threads.
Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.
23:22:15,885 INFO StdSchedulerFactory:1275 - Quartz scheduler 'MyScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
23:22:15,886 INFO StdSchedulerFactory:1279 - Quartz scheduler version: 1.8.3
23:22:15,886 INFO QuartzScheduler:497 - Scheduler MyScheduler_$_1 started.
23:22:20,018 INFO TestJob:26 - running job! 1
23:22:50,004 INFO TestJob:26 - running job! 2
23:22:50,010 INFO TestJob:26 - running job! 3
23:22:50,014 INFO TestJob:26 - running job! 4
23:22:50,016 INFO TestJob:26 - running job! 5
...
23:22:50,999 INFO TestJob:26 - running job! 672
23:22:51,000 INFO TestJob:26 - running job! 673
Notice how at 23:22:20,018, the job runs fine. At this point, the job re-schedules itself to run every 50 seconds. The next time it runs (at 23:22:50,004), it gets scheduled hundreds of times.
Any ideas on how to configure a job while executing that job? What am I doing wrong?
Thanks!
Easy.
First off, you have a couple of misunderstandings about cron expressions. "*/20 * * * * ?" is every twenty seconds as the comment implies, but only because 60 is evenly divisible by 20. "*/50 ..." is not every fifty seconds; it is seconds 0 and 50 of every minute. As another example, "*/13 ..." is seconds 0, 13, 26, 39, and 52 of every minute, so between second 52 and the next minute's second 0 there are only 8 seconds, not 13. So with */50 you'll get 50 seconds between every other firing, and 10 seconds between the others.
That however is not the cause of your rapid firing of the job. The problem is that the current second is "50" and you are scheduling the new trigger to fire on second "50", so it immediately fires. And then it is still second 50, and the job executes again, and it schedules another trigger to fire on second 50, and so on, as many times as it can during the 50th second.
You need to set the trigger's start time into the future (at least one second) or it will fire on the same second you are scheduling it, if the schedule matches the current second.
Also if you really need every "N" seconds type of schedule, I suggest SimpleTrigger rather than CronTrigger. SimpleTrigger can do "every 35 seconds" or "every 50 seconds" no problem. CronTrigger is meant for expressions like "on seconds 0, 15, 40 and 43 of minutes 15 and 45 of the 10 o'clock hour on every Monday of January".
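The arithmetic behind the "*/N" seconds field can be checked with a small sketch in plain Java. CronStepSeconds, firingSeconds, and gaps are names made up for this illustration; the code does not use Quartz and only models the seconds field of a single minute, wrapping to second 0 of the next minute.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models which seconds a "*/step" seconds field matches.
class CronStepSeconds {
    // Seconds within one minute matched by a step expression starting at 0.
    static List<Integer> firingSeconds(int step) {
        List<Integer> seconds = new ArrayList<>();
        for (int s = 0; s < 60; s += step) {
            seconds.add(s);
        }
        return seconds;
    }

    // Gaps between consecutive firings, including the wrap to the next
    // minute's second 0.
    static List<Integer> gaps(int step) {
        List<Integer> seconds = firingSeconds(step);
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < seconds.size(); i++) {
            int next = (i + 1 < seconds.size()) ? seconds.get(i + 1) : 60 + seconds.get(0);
            result.add(next - seconds.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        for (int step : new int[] {20, 50, 13}) {
            System.out.println("*/" + step + " fires at seconds " + firingSeconds(step)
                + " with gaps " + gaps(step));
        }
    }
}
```

For the start-time advice, Quartz 1.8 triggers expose setStartTime(Date), so something like trigger.setStartTime(new Date(System.currentTimeMillis() + 1000)) before rescheduling would likely be enough to avoid firing within the current second; treat that as an untested suggestion.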
