I'm working with Spring and Quartz in a batch application, and I'm trying to implement multithreading for it.
I have come across two possible ways to do multithreading:
Use the Quartz thread pool
Use task executors
I used the Quartz thread pool and it works fine, but I was wondering what advantage I would get if I also implemented a task executor.
I'm doing all of this in XML configuration.
Please suggest which one should be used and what the benefit of one over the other is.
Thanks
I would choose task executors if all you need is to keep N workers picking pieces of work from a common queue. The advantage is that you do not need any external libraries for this. The Quartz thread pool was created before Java 5 introduced java.util.concurrent, which is why it exists.
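As a rough illustration of the "N workers picking work from a common queue" idea, here is a minimal sketch using only the JDK. The class name WorkerPoolSketch, the method runTasks, and the worker and task counts are made up for this example; the real tasks would be your batch work rather than a counter increment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, not part of Spring or Quartz.
final class WorkerPoolSketch {

    // Runs taskCount trivial tasks on workerCount threads and returns how many completed.
    static int runTasks(int workerCount, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(workerCount);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            // Tasks wait in the pool's internal queue until one of the workers is free.
            pool.execute(() -> {
                completed.incrementAndGet(); // placeholder for the real piece of work
            });
        }
        pool.shutdown(); // accept no new tasks, let the queued ones finish
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runTasks(4, 20));
    }
}
```

In a Spring XML setup the same role would typically be played by a configured TaskExecutor bean instead of a hand-built pool.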
Executor is good enough for running concurrent tasks within a JVM. But if you want to distribute tasks across multiple JVMs in a clustered environment, then you should explore Quartz using the JDBC Store.
Quartz is more of a scheduling framework, where you can set up jobs to run on a periodic basis. But I have also used it heavily for concurrent programming.
Related
I am writing a spring batch application which should only run one Job Instance at a time. This should also be true if multiple application instances are started. Sadly, the jobs can’t be parallelized and are invoked at random.
So, what I am looking for is a spring boot configuration which allows me to synchronize the job execution within one processor as well as in the distributed case. I have already found some approaches like the JobLauncherSynchronizer (https://docs.spring.io/spring-batch-admin/trunk/apidocs/org/springframework/batch/admin/launch/JobLauncherSynchronizer.html) but all the solutions I have found work either only on one processor or protect just a fraction of the job execution.
Is there any spring boot configuration which prevents multiple concurrent executions of the same job, even across multiple concurrently running application instances (which share the same database)?
Thank you in advance.
Is there any spring boot configuration which prevents multiple concurrent executions of the same job, even across multiple concurrently running application instances (which share the same database)?
Not to my knowledge. If you really want global synchronization at the job level (i.e. a single job instance at a time), you need a global synchronizer like the JobLauncherSynchronizer you linked to.
The Quartz scheduler is used to schedule timed Java jobs at my workplace. The scheduler itself is deployed as an application to WebLogic servers (a cluster of machines). This scheduler can schedule jobs which implement the Job interface and override the execute() method. These jobs are deployed to the WebLogic servers as libraries which are then used by the scheduler. (One library includes multiple jobs.)
I have not managed to find informative sources on how these jobs are run or how they share resources.
I looked at the Quartz documentation but could not find what I was looking for.
I have multiple questions, though I believe a single answer may cover all of them.
Do all the jobs created through the scheduler share a single JVM? If they don't, then based on what is a job allocated to a given JVM?
I assume that a separate thread is allocated to each job - is that correct?
If all scheduled jobs run in the same JVM and each has its own thread, then concurrent execution of the same job with different parameters will create a need to make the job thread safe or to disable concurrent execution of the job, won't it?
Thank you.
You may want to read the Quartz scheduler tutorial to find out how Quartz works. To answer your questions:
This depends on whether you run a Quartz scheduler cluster (i.e. multiple Quartz scheduler instances sharing the same job store) or a standalone Quartz scheduler instance. In clustered deployments, individual Quartz scheduler instances compete to execute jobs by creating DB row locks. The scheduler instance that first manages to create the row lock is the one that executes a particular job. In a standalone Quartz scheduler deployment, there is no competition and it is always the single Quartz scheduler instance that ends up executing all jobs.
Quartz uses a thread pool and when it needs to execute a job, it simply allocates a free thread from the pool and uses it to execute the job. After the job finishes executing, Quartz returns the thread back to the pool.
Instances of Quartz job implementation classes are not shared. That means, when Quartz is about to execute a job, it instantiates the configured org.quartz.Job class and invokes its execute method passing it the job execution context as a parameter. Once the job execution completes, the org.quartz.Job instance is discarded and eventually garbage-collected, i.e. it is not reused by Quartz. If your org.quartz.Job class declares / accesses some static fields, singletons etc., then you may need to synchronize access to these shared resources where necessary.
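To make the last point concrete, here is a minimal sketch in plain Java. CountingJob is a hypothetical stand-in for an org.quartz.Job implementation (it implements Runnable so the example runs without the Quartz library), and the fixed thread pool only imitates the scheduler's pool. Each execution gets a fresh instance, so instance fields need no locking, while the static counter is shared and is therefore an AtomicInteger.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; it only imitates how a scheduler would run job instances.
final class SharedStateSketch {

    // Stand-in for a job class: one new instance per execution.
    static final class CountingJob implements Runnable {
        // Static state is visible to every instance, so it must be thread safe.
        static final AtomicInteger EXECUTIONS = new AtomicInteger();
        // Instance state belongs to a single execution and needs no synchronization.
        private int localSteps;

        @Override
        public void run() {
            localSteps++;
            EXECUTIONS.incrementAndGet();
        }
    }

    // Executes the given number of fresh job instances and returns the shared counter.
    static int runJobs(int threads, int executions) {
        CountingJob.EXECUTIONS.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < executions; i++) {
            pool.execute(new CountingJob()); // a new instance each time, never reused
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        }
        return CountingJob.EXECUTIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("executions counted: " + runJobs(8, 1000));
    }
}
```

With a plain static int in place of the AtomicInteger, concurrent increments could be lost, which is the kind of problem the answer warns about.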
I am investigating whether or not Quartz can be used for a project I am working on. I need to:
Limit the execution of jobs to specific time ranges (which I know Quartz is great at).
Limit jobs based on "resources".
When I say resources, I am referring to both exclusive and quantitative resources. For example, I would like to define a resource such as "LINUX_MACHINE" with a count of 5. At most 5 jobs requiring the LINUX_MACHINE resource can be run at any one time. Is this possible to do using Quartz?
So it looks like you can limit jobs based on resources by creating multiple schedulers and then limiting the thread pool for each scheduler.
Info on multiple schedulers: http://www.quartz-scheduler.org/documentation/quartz-2.2.x/cookbook/MultipleSchedulers
Info on Quartz config to set the thread pool for the scheduler: http://www.quartz-scheduler.org/documentation/quartz-2.2.x/configuration/ConfigThreadPool
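The underlying idea, one bounded thread pool per resource, can be sketched with plain JDK executors. This is only an analogy, not Quartz code: the class ResourcePoolsSketch and the method peakConcurrency are invented for the example, and in Quartz the corresponding knob would be the thread pool size of each scheduler (the org.quartz.threadPool.threadCount property).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class illustrating "one bounded pool per resource".
final class ResourcePoolsSketch {

    // Submits `jobs` tasks to a pool of `limit` threads and returns the highest
    // number of tasks that were observed running at the same time.
    static int peakConcurrency(int limit, int jobs) {
        // The pool size plays the role of the resource count, e.g. LINUX_MACHINE = 5.
        ExecutorService linuxMachinePool = Executors.newFixedThreadPool(limit);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            linuxMachinePool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // placeholder for work that holds the resource
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        linuxMachinePool.shutdown();
        try {
            if (!linuxMachinePool.awaitTermination(30, TimeUnit.SECONDS)) {
                linuxMachinePool.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent jobs: " + peakConcurrency(5, 20));
    }
}
```

However many jobs are queued, the observed peak should never exceed the pool size, which is what makes the pool act as a counted resource.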
I have written code using an executor service in Java. Here I create 10 worker threads to process rows fetched from the database. Each thread is assigned one result row. This approach works fine when the application is deployed and running on a single instance/node.
Can anyone suggest how this will behave when my application is deployed in multiple nodes/cluster?
Do I have to take care of any part of the code before deploying to a cluster?
04/12/15: Any more suggestions?
You should consider the overhead of each task. Unless the task is of moderate size, you might want to batch them.
In a distributed context the overhead is much higher, so you are more likely to need to batch the work.
You will need a framework, so the considerations will depend on the framework you choose.
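A minimal single-JVM sketch of batching, assuming the rows are simple integers and that a batch size of 10 is reasonable (both are placeholders, as are the names BatchingSketch, partition and processInBatches). Instead of one task per row, each task handles a whole batch, so the per-task overhead is paid once per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; real rows would come from the database query.
final class BatchingSketch {

    // Splits the rows into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            batches.add(new ArrayList<>(rows.subList(start, end)));
        }
        return batches;
    }

    // Processes rowCount dummy rows in batches and returns how many rows were handled.
    static int processInBatches(int rowCount, int batchSize, int workers) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(i);
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<Integer> batch : partition(rows, batchSize)) {
                // One task per batch; counting the rows stands in for the real work.
                Callable<Integer> task = () -> batch.size();
                results.add(pool.submit(task));
            }
            int processed = 0;
            for (Future<Integer> result : results) {
                processed += result.get();
            }
            return processed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing batches", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("rows processed: " + processInBatches(95, 10, 10));
    }
}
```

In a cluster the batches would be handed out by whatever framework is chosen rather than by a local pool, but the partitioning step stays the same.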
Does the Java Quartz Scheduler support asynchronous job scheduling? If so, is it the default, or do jobs have to be customized to run asynchronously?
Not only does it support this behaviour, there is basically no other way. Once you schedule a job and a trigger (in any thread), the job will be executed asynchronously in a thread pool. You have some control over that thread pool, such as the number of threads.
Another issue is parallel execution of the same job. By default the same job can run in multiple threads when fired by different triggers, unless the job is stateful.
Yes, and it should be the default. I am using Quartz in my Grails application for my website and it runs each job in a separate thread.