How to do "massive" Job Scheduling (Quartz?) - java

I have a general question related to the quartz scheduling framework:
I need to perform a task a fixed amount of time after a user registers. For the sake of simplicity, let's say exactly 1 hour after a user registers in my system. The job MUST be done: even if the system restarts during that hour, the task must be remembered, and if my system is down at the scheduled time it MUST be performed later.
Is this something where I can or would use Quartz? I looked at persistent jobs, which look quite promising, but I am not sure if this will still work out for 1000 jobs a day. Furthermore, I am not sure about the performance implications. Maybe someone can help me with information here.
If Quartz is not the right choice, which other ways/frameworks do you see for this issue? My application is a Java 6/Spring 3 based Web-App.
Thanks for your help!

We are using quartz persisted job store successfully in our production environment for a SaaS platform application where 100s of jobs are running.
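To make the restart requirement concrete, here is a minimal sketch of the idea behind a persistent job store, not Quartz's API. It assumes Java 8 or later, and the class name, the in-memory map (standing in for a database table), and the method names are all hypothetical. The key point is that the due time is recorded before scheduling, so after a restart the remaining delay can be recomputed and overdue tasks run immediately.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Sketch only: the map below stands in for a persistent table of pending tasks. */
class RegistrationFollowUp {
    static final Duration DELAY = Duration.ofHours(1);

    // userId -> due time; a real system would keep this in a database.
    final Map<String, Instant> pending = new ConcurrentHashMap<>();
    final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

    /** Delay until the task is due; zero if the due time has already passed. */
    static long delayMillis(Instant due, Instant now) {
        return Math.max(0L, Duration.between(now, due).toMillis());
    }

    /** Called at registration: record the due time first, then schedule. */
    void onRegistration(String userId, Instant registeredAt) {
        Instant due = registeredAt.plus(DELAY);
        pending.put(userId, due);
        schedule(userId, due);
    }

    /** Called at startup: reschedule everything still pending. */
    void recoverAfterRestart() {
        pending.forEach(this::schedule);
    }

    private void schedule(final String userId, Instant due) {
        executor.schedule(() -> {
            // Skip if another scheduled run already handled this user.
            if (pending.containsKey(userId)) {
                System.out.println("Follow-up for " + userId);
                pending.remove(userId); // forget the task only after the work is done
            }
        }, delayMillis(due, Instant.now()), TimeUnit.MILLISECONDS);
    }
}
```

With Quartz and a JDBC job store, the bookkeeping shown here is handled by the scheduler's own tables rather than by application code.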

Related

Java Quartz/Cron Mongodb Concurrency Issue

I am planning to have a large collection containing items that need to be processed by a Quartz/cron job. So the cron job will set to run periodically and access mongodb to find the 5 oldest items and process them.
This would be OK for a single server running the cron job. But later on, if I run 2 servers each running the same cron job, I fear that the 2 cron jobs may run at the same time and grab the same 5 items, which may lead to race condition issues.
What is the best practice to avoid this problem?
I'm thinking of putting logic to each job to grab the 5 oldest items that have PENDING status, then immediately change the status to PROCESSED so that other jobs cannot process them. Do you think this will work?
I ran into the same issue using the Quartz scheduler.
You have to involve the database. I am sharing a GitHub link so you can follow the code:
https://github.com/faizakram/Spring_MongoDb_Quartz_Cluster
If anything is unclear, please comment so I can improve the explanation.
Thanks
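The status-flip idea from the question works only if the check and the update happen as one atomic step; reading PENDING items first and updating them afterwards leaves a window in which two servers can pick the same item. The sketch below illustrates the pattern with an in-memory map standing in for the MongoDB collection. The class, the method names, and the intermediate PROCESSING state are assumptions for illustration; against MongoDB the equivalent would be an atomic find-and-modify per item.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Sketch only: the sorted map stands in for the collection, oldest item first. */
class ClaimQueue {
    enum Status { PENDING, PROCESSING, PROCESSED }

    final ConcurrentSkipListMap<Long, Status> items = new ConcurrentSkipListMap<>();

    void add(long id) {
        items.put(id, Status.PENDING);
    }

    /** Claims up to max of the oldest PENDING items; each item goes to one caller only. */
    List<Long> claimOldest(int max) {
        List<Long> claimed = new ArrayList<>();
        for (Map.Entry<Long, Status> e : items.entrySet()) {
            if (claimed.size() == max) {
                break;
            }
            // Atomic compare-and-set: succeeds only if the status is still PENDING.
            if (items.replace(e.getKey(), Status.PENDING, Status.PROCESSING)) {
                claimed.add(e.getKey());
            }
        }
        return claimed;
    }

    /** Marks a claimed item as done once its processing has finished. */
    void markProcessed(long id) {
        items.replace(id, Status.PROCESSING, Status.PROCESSED);
    }
}
```

Using a separate PROCESSING state, rather than jumping straight to PROCESSED, makes it possible to find and retry items whose worker crashed halfway through.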

Batchlet vs EJB Timer

I wrote an application that runs tests on network nodes, such as ping tests, retrieving disk space, and so on.
I use a scheduled batchlet to run the actions, but I wonder if this is the right use of a batchlet.
Would an EJB timer be more appropriate? Also, when I run a batchlet, my GlassFish server keeps a log of the batch job, and I don't necessarily need it (especially given the number of batch jobs generated during a day).
If I need to run several jobs at the same scheduled time, I think a batchlet can do it, but can an EJB timer too?
Could you give me your input on the right way to achieve this?
Thanks,
Ersch
This isn't a question with a clear answer, but there is a bit of a cost in factoring your application as a batch job, and I would look at what I'm getting to see if it's worth doing so.
So you're thinking about a job consisting of a single Batchlet step. Well, there'd be nothing gained from "restart" functions, neither at the failing step within a job nor leveraging checkpoints within a chunk step. The batchlet programming model is quite simple... even if you really like @BatchProperty, you'd have to deal with XML now to do so.
This only starts to get more interesting if you want to start, view, and manage these executions along with the rest of your batch jobs. This might be because you're working with an implementation that offers some kind of implementation-specific add-on function. An example of this could be an integration with external scheduler software, allowing jobs to be scheduled by it. At the other extreme, if you found value in having a persisted record of all your batch job executions in one place (the job repository, usually a persistent DB), then that could also make this worthwhile for you.
But if you don't care for any of that, then an EJB timer could be the way to go instead.
Using an EJB timer is appropriate when your task executes in an eye blink (or thereabouts).
Otherwise use the batching mechanism.
Long-running tasks executed from EJB timers can be problematic because they execute in transactions, which normally time out after a short period. Increasing this transaction timeout also increases the chances of database and perhaps other resource locks, which can impact the normal operation of your application.

Java/Database project automation

I have a Java/Database project in Netbeans that I would like to run once a day at a set time. I am using Derby for the database driver. I am trying to automate a process.
How can I 'schedule' this program to run at specified times?
How can I customize this to keep running until a certain criteria is met?
Say my criteria is that it has to populate 500 rows in the database. (So if at the scheduled time it can only populate 400 rows, then maybe 2 hours later it tries running again to fill the last 100 rows.)
Lastly, what are the best practices of automation and scheduled tasks?
How can I 'schedule' this program to run at specified times?
This can be done in one of two ways. The first depends on your operating system: write a job that kicks off the Java program at the intervals you need. You may also hook the job up so that it is started at system startup.
In Linux you can accomplish this with a cron job. On Windows you may refer to http://support.microsoft.com/kb/308569.
Alternatively, you may program the scheduler into your Java program using http://quartz-scheduler.org or http://www.sauronsoftware.it/projects/cron4j/ .
How can I customize this to keep running until a certain criteria is met?
This is perhaps best established from within your program, although it is hard to give you directions without much info.
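As a rough illustration of handling this inside the program, the sketch below keeps retrying until a target row count is reached or a maximum number of attempts is used up. Everything here is hypothetical: the class and method names, the retry delay, and the `attempt` callback, which stands in for the real database work and receives the number of rows still missing.

```java
import java.util.function.IntUnaryOperator;

/** Sketch only: retries a populate step until the target row count is reached. */
class PopulateUntilDone {
    /**
     * Calls attempt with the number of rows still missing; attempt returns how
     * many rows it actually inserted. Stops at target or after maxAttempts.
     * Returns the total number of rows populated.
     */
    static int runUntil(int target, int maxAttempts, long retryDelayMillis,
                        IntUnaryOperator attempt) {
        int total = 0;
        for (int i = 0; i < maxAttempts && total < target; i++) {
            if (i > 0) {
                try {
                    Thread.sleep(retryDelayMillis); // wait before trying again
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying if the thread is interrupted
                }
            }
            total += attempt.applyAsInt(target - total);
        }
        return total;
    }
}
```

For the example in the question, a call such as `runUntil(500, 5, 2 * 60 * 60 * 1000L, remaining -> insertRows(remaining))` would retry every two hours, where `insertRows` is a placeholder for your own insert logic.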
Lastly, what are the best practices of automation and scheduled tasks?
Depending on your application architecture, scheduling and automation can be handled either from within the app or with support from the operating system. The choice depends on how much control the application needs, which platform makes scheduling easy, etc.
Hope this helps.
Quartz is a scheduling project for Java. I have used it in many projects and find it to be very intuitive.
It may be a little over the top for what you're after, but it is worth a look anyway.
You can use java.util.Timer to schedule the events; each event/task must be implemented as a TimerTask.
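A minimal sketch of the Timer approach for a once-a-day job, assuming Java 8 or later for the java.time classes. The class name, the thread name, and the helper `nextRun` are illustrative choices, not part of any library.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneId;
import java.util.Date;
import java.util.Timer;
import java.util.TimerTask;

/** Sketch only: runs a job once a day at a fixed time using java.util.Timer. */
class DailyTimer {
    /** Next occurrence of the given time of day strictly after now. */
    static LocalDateTime nextRun(LocalDateTime now, LocalTime at) {
        LocalDateTime candidate = now.toLocalDate().atTime(at);
        return candidate.isAfter(now) ? candidate : candidate.plusDays(1);
    }

    /** Schedules job daily at the given time; call cancel() on the result to stop. */
    static Timer start(LocalTime at, final Runnable job) {
        LocalDateTime first = nextRun(LocalDateTime.now(), at);
        Date firstRun = Date.from(first.atZone(ZoneId.systemDefault()).toInstant());

        Timer timer = new Timer("daily-job"); // non-daemon thread: keeps the JVM alive
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                job.run();
            }
        }, firstRun, Duration.ofDays(1).toMillis());
        return timer;
    }
}
```

Calling `DailyTimer.start(LocalTime.of(2, 0), someRunnable)` would run the job at 02:00 each day while the JVM stays up. Note that a Timer keeps nothing across restarts, and a fixed 24-hour period can shift by an hour across daylight saving changes.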

What framework to use for advanced job scheduling in Java?

In my application I need to have periodically run background tasks (which I can easily do with Quartz - i.e. schedule a given job to be run at a specific time periodically).
But I would like to have a little bit more control. In particular I need to:
have the system rerun a task that wasn't run at its scheduled time (i.e. the server was down and because of this the task was not run. In such a situation I want the 'late' task to be run ASAP)
it would be nice to easily control tasks - i.e. run a task on demand or see when a given task was last run or reschedule a given task to be run at a different time
It seems to me that the above points can be achieved with Spring Batch Admin, but I don't have much experience in this area yet. Also, I've seen numerous posts on how Spring Batch is not a scheduling tool, so I'm beginning to have doubts about what the right tool for the job is here.
So my question is: can the above be achieved with Spring Batch Admin? Or perhaps Quartz is enough but needs configuring to do the above? Or maybe I need both? Or something else?
Thanks a lot :)
Peter
have the system rerun a task that wasn't run at its scheduled time
This feature in Quartz is called Misfire Instructions and does exactly what you need - but is a lot more flexible. All you need is to define JDBCJobStore.
it would be nice to easily control tasks - i.e. run a task on demand or see when a given task was last run or reschedule a given task to be run at a different time
You can use Quartz JMX to access various information (like previous and next run time) or query the Quartz database tables directly. There are also free and commercial management tools based on the above. I believe you can also manually run jobs there.
Spring Batch can be integrated with Quartz, but not replace it.
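To show what handling a missed run amounts to, here is a small sketch of one possible policy: run the late task once as soon as possible, then resume the original cadence. This is plain Java arithmetic, not Quartz's API, and the class and method names are hypothetical. It assumes the last run time has been persisted somewhere that survives a restart.

```java
import java.time.Duration;
import java.time.Instant;

/** Sketch only: decides whether a periodic task missed a run and when it fires next. */
class CatchUp {
    /** True if at least one scheduled run was due between lastRun and now. */
    static boolean missedRun(Instant lastRun, Duration interval, Instant now) {
        return !lastRun.plus(interval).isAfter(now);
    }

    /** First regular fire time strictly after now, keeping the original cadence. */
    static Instant nextRegularRun(Instant lastRun, Duration interval, Instant now) {
        long elapsed = Duration.between(lastRun, now).toMillis();
        long periods = Math.max(0L, elapsed / interval.toMillis()) + 1;
        return lastRun.plus(interval.multipliedBy(periods));
    }
}
```

At startup, an application following this policy would run the task immediately when `missedRun` is true and then schedule the following run at `nextRegularRun`. Quartz's misfire instructions offer this kind of choice per trigger, along with other policies.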

Concurrent database access pattern for web applications

I'm trying to write a Spring web application on a Weblogic server that makes several independent database SELECTs(i.e. they can safely be called concurrently), one of which takes 15 minutes to execute.
Once all the results are fetched, an email containing the results will be sent to a user list.
What's a good way to get around this problem? Is there a Spring library that can help or do I go ahead and create daemon threads to do the job?
EDIT: This will have to be done at the application layer (business requirement) and the email will be sent out by the web application.
Are you sure you are doing everything optimally? 15 minutes is a really long time unless you have a gabillion rows across dozens of tables and need a heckofalot of joins....this is your highest priority -- why is it taking so long?
Do you do the email job at set intervals, or is it invoked from your web app? If set intervals, you should do it in an outside job, possibly on another machine. You can use daemons or the quartz scheduler.
If you need to fire this process off from the web app, you need to do it asynchronously. You could use JMS, or you could just have a table into which you enter a new job request, with a daemon process that looks for new jobs every X time period. Firing off background threads is possible, but it's error-prone and not worth the complication, especially since you have other valid options that are simpler.
If you are asking about Spring support for long-running, possibly asynchronous tasks, you have a choice between Spring JMS support and Spring Batch.
You can use Spring's Quartz support to schedule the job. That way the jobs will run in the same container but will not require an HTTP request to trigger them.
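Whichever mechanism triggers the work, the independent SELECTs themselves can be run concurrently inside that background job and the email sent once all of them have returned. The sketch below shows one way to do that with a plain ExecutorService, assuming Java 8 or later. The class and method names are hypothetical, and each `Callable` stands in for one database query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch only: runs independent queries in parallel and collects their results. */
class ConcurrentQueries {
    /** Runs all queries concurrently and returns their results in submission order. */
    static <T> List<T> runAll(List<Callable<T>> queries) {
        // One thread per query; a shared, bounded pool may suit a real server better.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, queries.size()));
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every query has finished or failed.
            for (Future<T> future : pool.invokeAll(queries)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for queries", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The total wait is then governed by the slowest query (the 15-minute one) rather than by the sum of all of them. Because this call blocks until everything is done, it belongs in a scheduled or queued job, not in the thread serving an HTTP request.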
