I'm quite new to Quartz and now I need to schedule some jobs in Spring web application.
I know about Spring + Quartz integration (I'm using Spring v 3.1.1) but I'm wondering if this is the right way to follow.
In particular I need to persist my scheduled tasks in a DB so I can re-initialize them when application is restarted.
Are there some utilities provided by Spring scheduling wrapper to do this?
Can you suggest me some "well known" approach to follow?
Here is one way I handle this scenario.
First in my Spring Configuration I specify a SchedulerFactoryBean from which I can inject the Scheduler into other beans.
<bean name="SchedulerFactory"
class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="applicationContextSchedulerContextKey">
<value>applicationContext</value>
</property>
</bean>
Then when I create a job in my application I store the details of the job in the database. This service is called by one of my controllers and it schedules the job:
#Component
public class FollowJobService {
#Autowired
private FollowJobRepository followJobRepository;
#Autowired
Scheduler scheduler;
#Autowired
ListableBeanFactory beanFactory;
#Autowired
JobSchedulerLocator locator;
public FollowJob findByClient(Client client){
return followJobRepository.findByClient(client);
}
public void saveAndSchedule(FollowJob job) {
job.setJobType(JobType.FOLLOW_JOB);
job.setCreatedDt(new Date());
job.setIsEnabled(true);
job.setIsCompleted(false);
JobContext context = new JobContext(beanFactory, scheduler, locator, job);
job.setQuartzGroup(context.getQuartzGroup());
job.setQuartzName(context.getQuartzName());
followJobRepository.save(job);
JobSchedulerUtil.schedule(new JobContext(beanFactory, scheduler, locator, job));
}
}
The JobContext I build contains detail about the job and is eventually passed to a utility for scheduling jobs. Here is that code for the utility method that actually schedules the job. Notice that in my service I autowire the JobScheduler and pass it to the JobContext. Also notice that I store the job in the database using my repository.
/**
* Schedules a DATA_MINING_JOB for the client. The job will attempt to enter
* followers of the target into the database.
*/
#Override
public void schedule(JobContext context) {
Client client = context.getNetworkSociallyJob().getClient();
this.logScheduleAttempt(context, client);
JobDetail jobDetails = JobBuilder.newJob(this.getJobClass()).withIdentity(context.getQuartzName(), context.getQuartzGroup()).build();
jobDetails.getJobDataMap().put("job", context.getNetworkSociallyJob());
jobDetails.getJobDataMap().put("repositories", context.getRepositories());
Trigger trigger = TriggerBuilder.newTrigger().withIdentity(context.getQuartzName() + "-trigger", context.getQuartzGroup())
.withSchedule(cronSchedule(this.getSchedule())).build();
try {
context.getScheduler().scheduleJob(jobDetails, trigger);
this.logSuccess(context, client);
} catch (SchedulerException e) {
this.logFailure(context, client);
e.printStackTrace();
}
}
So after all of this code executes I two things have happened, my job is store in the database and its been scheduled using the quartz scheduler. Now if the application restarts I want to reschedule my jobs with the scheduler. To do this I register a bean that implements ApplicationListener<ContextRefreshedEvent> which is called by Spring each time the container restarts or is started.
<bean id="jobInitializer" class="com.network.socially.web.jobs.JobInitializer"/>
JobInitializer.class
public class JobInitializer implements ApplicationListener<ContextRefreshedEvent> {
Logger logger = LoggerFactory.getLogger(JobInitializer.class);
#Autowired
DataMiningJobRepository repository;
#Autowired
ApplicationJobRepository jobRepository;
#Autowired
Scheduler scheduler;
#Autowired
JobSchedulerLocator locator;
#Autowired
ListableBeanFactory beanFactory;
#Override
public void onApplicationEvent(ContextRefreshedEvent event) {
logger.info("Job Initilizer started.");
//TODO: Modify this call to only pull completed & enabled jobs
for (ApplicationJob applicationJob : jobRepository.findAll()) {
if (applicationJob.getIsEnabled() && (applicationJob.getIsCompleted() == null || !applicationJob.getIsCompleted())) {
JobSchedulerUtil.schedule(new JobContext(beanFactory, scheduler, locator, applicationJob));
}
}
}
}
This class autowires the scheduler and a repository that grabs instances of each of my jobs that implement the ApplicationJob interface. Using the information from these database records I can use my scheduler utility to reconstruct my jobs.
So basically I manually store the jobs in my database and manually schedule them by injecting a instance of the Scheduler in appropriate beans. To rescheduled them, I query my database and then schedule them using the ApplicationListener to account for restarts and starts of the container.
I suppose there are quite some documentation available for Spring and Quartz JDBC job store integration; for instance:
If you're using annotation configuration: http://java.dzone.com/articles/configuring-quartz
If you're using XML configuration: http://arkuarku.wordpress.com/2011/01/06/spring-quartz-using-jdbcjobstore-simple/
Quartz Job Stores: http://quartz-scheduler.org/documentation/quartz-2.x/tutorials/tutorial-lesson-09
Spring 3.x SchedulerFactoryBean for Quartz
Related
I'm new in software. I'm working to understand async programming in Spring Boot. As seen above, I set thread pool size 2. When I requested same url three times one after another. My two requests are working asynchronously. Third one is waiting. This is ok. But when I don't use the asynchronous feature (neither #async annotation nor threadpool), it still performs transactions asynchronously, as before. So I'm confused. Spring Boot rest controller behaves asynchronously by default? Why we use #async in Spring Boot? Or do I misunderstand that?
#Service
public class TenantService {
#Autowired
private TenantRepository tenantRepository;
#Async("threadPoolTaskExecutor")
public Future<List<Tenant>> getAllTenants() {
System.out.println("Execute method asynchronously - "
+ Thread.currentThread().getName());
try {
List<Tenant> allTenants = tenantRepository.findAll();
Thread.sleep(5000);
return new AsyncResult<>(allTenants);
} catch (InterruptedException e) {
//
}
return null;
}
}
#Configuration
#EnableAsync
public class AsyncConfig {
#Bean(name = "threadPoolTaskExecutor")
public Executor threadPoolTaskExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(2);
executor.setMaxPoolSize(2);
executor.setQueueCapacity(100);
executor.setThreadNamePrefix("AsynchThread-");
executor.initialize();
return executor;
}
#Bean(name = "threadPoolTaskExecutor2")
public Executor threadPoolTaskExecutor2() {
return new ThreadPoolTaskExecutor();
}
}
I'm assuming you are using the default embedded Tomcat from Spring Boot. If that's the case, then you are not misunderstanding. Tomcat will indeed work asynchronously by default, meaning it will start a new thread for every request (see this for on that).
The #Async annotation does not aim to replace the functionality that Tomcat provides in this case. Instead, that annotation allows executing any method of a bean in a separate thread. For your particular use case, it might be enough to let Tomcat start a new thread for every request, but sometimes you might want to parallelize work further.
An example on when you would probably want to use both is when a request must trigger some heavy computation, but the response does not depend on it. By using the #Async annotation, you can start the heavy computation on another thread, and let the request finish sooner (effectively allowing the server to handle other requests while the heavy computation runs independently on another thread).
I've got a job that needs to be run every 1 minute. I've decided to move from Spring's #Scheduled annotation to Quartz jobs to take advantage of its clustered mode during blue/green deployment. For that I used a configuration similar to this:
#Bean
public JobDetailFactoryBean fooJob() {
JobDetailFactoryBean factoryBean = new JobDetailFactoryBean();
factoryBean.setJobClass(FooJob.class);
factoryBean.setName(JOB_IDENTITY);
factoryBean.setGroup(FooJob.class.getName());
factoryBean.setDurability(true);
return factoryBean;
}
#Bean
public SimpleTriggerFactoryBean fooTrigger(#Qualifier("fooJob") JobDetail jobDetail) {
SimpleTriggerFactoryBean factoryBean = new SimpleTriggerFactoryBean();
factoryBean.setName(JOB_IDENTITY);
factoryBean.setGroup(FooJob.class.getName());
factoryBean.setJobDetail(jobDetail);
factoryBean.setStartDelay(0L);
factoryBean.setRepeatInterval(INTERVAL_SECONDS * 1000);
factoryBean.setMisfireInstruction(Trigger.MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY);
return factoryBean;
}
Also I've got a postgres job store configured. Nothing wrong here - it works as expected.
Now my question is: what if in some time in the future I don't need this job run anymore? Then I remove this configuration but the job and the trigger are still persisted in the job store and the job is run even though the configuration is removed.
My expectation is that when I deploy a code with no beans from above, then the job is removed from the store. Is it somehow possible?
I have deployed a Spring Batch project as a task on Spring Cloud Data Flow.
I have launched the task for the first time (It was going to run about 5 minutes which made the status of task is "running"). However, I have relaunched this task for the second time when it's status is running and it worked again.
1) How can I configure that the task could not be relaunched until it's completed status for the last time.
2) Besides, I found that when stopping the job, it will execute until it finishes the current step(My job has 2 steps and the second step was not executed ). However, are there any ways to shut down it (step 1) immediately when I stop the job?
How to avoid relaunching task while it's status is still running on Spring Cloud Data Flow
Spring Cloud Task 2.0.0 added a feature that allows you to prevent tasks from running concurrently. See https://spring.io/blog/2018/05/07/spring-cloud-task-2-0-0-release-is-now-available#restricting-concurrent-task-execution
You need to activate the spring.cloud.task.singleInstanceEnabled=true property.
You can find more details in the Restricting Spring Cloud Task Instances section.
You can use JobExplorer class for checking running jobs, see jobExplorer.findRunningJobExecutions(jobName) method.
In your JobLauncher class you can something like that:
#Component
public class MyJobLauncher {
private static final Logger LOG = LoggerFactory.getLogger(MyJobLauncher.class);
private final Job job;
private final JobLauncher jobLauncher;
private final JobExplorer jobExplorer;
#Autowired
public MyJobLauncher(#Qualifier("myJob") Job job, JobLauncher jobLauncher, JobExplorer jobExplorer) {
this.job = job;
this.jobLauncher = jobLauncher;
this.jobExplorer = jobExplorer;
}
public void launchJob() throws JobParametersInvalidException, JobExecutionAlreadyRunningException,
JobRestartException, JobInstanceAlreadyCompleteException, IOException {
if (jobExplorer.findRunningJobExecutions("myJob").isEmpty()) {
LOG.info("Starting myJob");
jobLauncher.run(job, new JobParameters());
LOG.info("Stopping myJob");
}
}
}
For shutting down the job, check JobExecution.stop() method which can be fetched from jobExplorer.getJobExecution() method.
We have a requirement to carry out data movement from 1 database to other and exploring spring batch for the same. User of our application selects source and target datasource along with the list of tables for which the data needs to be moved.
Need help with following:
The information necessary to build a job comes at runtime from our web application - that includes datasource details and list of table names. We would like to create a new job by sending these details to the job builder module and launch it using JobLauncher. How do we write this job builder module?
We may have multiple users raising data movement requests in parallel, so need a way to create multiple jobs and run them in suitable order.
We have used the Java based configuration to create a job and launch it from a web container. The configuration is as follows
#Bean
public Job loadDataJob(JobCompletionNotificationListener listener) {
RunIdIncrementer inc = new RunIdIncrementer();
inc.setKey(new Date().toString());
JobBuilder builder = jobBuilderFactory.get("loadDataJob")
.incrementer(inc)
.listener(listener);
SimpleJobBuilder simpleBuilder = builder.start(preExecute());
for(String s : getTables()){
simpleBuilder.next(etlTable(s));
}
simpleBuilder.next(postExecute());
return simpleBuilder.build();
}
#Bean
#Scope("prototype")
public Step etlTable(String tableName) {
return stepBuilderFactory.get(tableName)
.<Map<String,Object>, Map<String,Object>> chunk(1000)
.reader(dbDataReader(tableName))
.processor(processor())
.writer(dbDataWriter(tableName))
.build();
}
Currently we have hardcoded the source and target datasource details into respective beans. The getTables() returns a list of tables (hardcoded) for which the data needs to be moved.
RestController that launches the job
#RestController
public class MyController {
#Autowired
JobLauncher jobLauncher;
#Autowired
Job job;
#RequestMapping("/launchjob")
public String handle() throws Exception {
try {
JobParameters jobParameters = new JobParametersBuilder().addLong("time", new Date().getTime()).toJobParameters();
jobLauncher.run(job, jobParameters);
} catch (Exception e) {
}
return "Done";
}
}
Concerning your first question, you definitely have to use JavaConfiguration. Moreover, you shouldn't define your steps as spring beans, if you want to create a job with a dynamic number of steps (for instance a step per table you have to copy).
I've written a couple of answers to questions about how to create jobs dynamically. Have a look at them, they might be helpful
Spring batch execute dynamically generated steps in a tasklet
Spring batch repeat step ending up in never ending loop
Spring Batch - How to generate parallel steps based on params created in a previous step
Spring Batch - Looping a reader/processor/writer step
Edited
Some remarks concerning your second question:
Firstly, you are using a normal JobLauncher and I assume your instantiate the SimpleJobLauncher. This means, you can provide a job with jobparameters, as you have shown in your code above. However, the provided "job" does not have to be a "SpringBean"-instance, so you don't have to Autowire it and therefore, you can use create-methodes as I suggested in the answers to the questions mentioned above.
Secondly, if you create your Job instance for every request dynamically, there is no need to pass the whole configuration as jobparameters, since you can pass the "configuration properties" like datasource and tables to be copied directly as parameters to your "createJob" method. You could even create your DataSource-instances "on the fly", if you don't know all possible datasources in advance.
Thirdly, I would consider every request as a "single run", which cannot be "restarted". Hence, I'd just but some "meta information" into the jobparameters like user, date/time, datasource names (urls) and a list of tables to be copied. I would use this kind of information just as a kind of logging/auditing which requests where issued, but I wouldn't use the jobparameter-instances as controlparameters inside the job itself (again, you can pass the values of these parameters during the construction time of the job and steps by passing them to your create-Methods, so the structure of your job is created according to your parameters and hence, during runtime - when you could access your jobparameters - there is nothing to do based on the jobparameters).
Finally, if a request fails (meaning the jobs exits with an error) simply a new request has to be executed in order to retry, but this request would be a complete new request and not a restart of an already executed job launch (since I would add the request time to my jobparameters, every launch would be a unique launch).
Edited 2:
Not creating the Job as a Bean doesn't mean to not use Autowiring. Here is an example, aus I would structure my Beans.
#Component
#EnableBatchProcessing
#Import() // list with imports as neede
public class JobCreatorComponent {
#Autowire
private StepBuilderFactory stepBuilder;
#Autowire
private JobBuilderFactory jobBuilder;
public Job createJob(all the parameters you need) {
return jobBuilder.get(). ....
}
}
#RestController
#Import(JobCreatorComponent.class)
public class MyController {
#Autowired
JobLauncher jobLauncher;
#Autowired
JobCreatorComponent jobCreator;
#RequestMapping("/launchjob")
public String handle() throws Exception {
try {
Job job = jobCreator.createJob(... params ...);
JobParameters jobParameters = new JobParametersBuilder().addLong("time", new Date().getTime()).toJobParameters();
jobLauncher.run(job, jobParameters);
} catch (Exception e) {
}
return "Done";
}
}
by using #JobScope on itemreader no need to do things manually at run time just have to annoted your respective reader with #Jobscope, on each interaction with controller you will get fresh record processing.
This is type of job on demand where you can execute the job for goals like do the db migration or get the specific reporting like that.
I'm trying to write a multi-tenant Spring Boot application but having trouble to eager initialize beans when the server starts (i.e. not lazily once the tenant requests the bean)
To support multi-tenancy, i created a #CustomerScoped annotation that creates objects based on a ThreadLocal String value.
My configuration provides a bean like this and lazily initializes it:
#Autowired
private AutowireCapableBeanFactory beanFactory;
#Bean
#CustomerScoped
public Scheduler getScheduler() {
CreateDefaults job = factory.createBean(CreateDefaults.class));
Scheduler scheduler = new Scheduler();
scheduler.schedule(job);
return scheduler;
}
#PostConstruct
public void init() {
CustomerScope.setCustomer("tenant1");
getScheduler();
CustomerScope.setCustomer("tenant2");
getScheduler();
CustomerScope.clearCustomer();
}
When starting the server, two Schedulers should be created, each of which would execute their own instance of "Create Defaults".
When tenants access the application themselves, they should be getting their own instance of this Scheduler.
This seems to work but i wonder whether this is the correct way of doing things.
In particular, i am worried about the fact that the beanFactory isn't scoped itself.
Would this approach work and scale for more complex systems?
My code sample was actually correct.
The Beanfactory doesn't need to be scoped itself, it just has to be made aware of the scope, which in my case can be achieved in the configuration:
#Bean
public static CustomScopeConfigurer customScope() {
CustomScopeConfigurer configurer = new CustomScopeConfigurer();
configurer.addScope(CustomerScope.CUSTOMER_SCOPE_NAME, new CustomerScope());
return configurer;
}