I need to execute a task in the future, just once.
Requirements:
- The environment is clustered, so I need to handle contention between nodes at the moment the task fires; it must not execute twice;
- The task can be scheduled a month ahead and cannot simply be kept in memory, since a node can be restarted or even destroyed at any moment (it's an Amazon Elastic Beanstalk environment);
Any suggestions will be welcome.
One idea: Instead of trying to get the cron/timer tool to only execute once, you could schedule the task on all of the nodes, but then use some kind of coordination between nodes to decide which one will actually execute the task.
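One way to picture that coordination is an atomic "claim" that only one node can win. The sketch below is a minimal illustration, not a production setup: `ClaimStore` is a hypothetical in-memory stand-in for whatever shared store the cluster has (for example a database table with a unique key on the task id), and the names `OneShotClaim`, `tryClaim` and `fireIfClaimed` are made up for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class OneShotClaim {

    /** In-memory stand-in for a shared store, e.g. a DB table with a unique key on the task id. */
    static final class ClaimStore {
        private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

        /** Atomic "insert if absent": true only for the first node that claims the task. */
        boolean tryClaim(String taskId, String nodeId) {
            return owners.putIfAbsent(taskId, nodeId) == null;
        }
    }

    /** Called on every node when its local timer fires; returns true if this node ran the task. */
    static boolean fireIfClaimed(ClaimStore store, String taskId, String nodeId, Runnable task) {
        if (!store.tryClaim(taskId, nodeId)) {
            return false; // another node already claimed this task
        }
        task.run();
        return true;
    }

    public static void main(String[] args) {
        ClaimStore store = new ClaimStore();
        Runnable task = () -> System.out.println("task executed");
        System.out.println("node-a ran: " + fireIfClaimed(store, "task-42", "node-a", task));
        System.out.println("node-b ran: " + fireIfClaimed(store, "task-42", "node-b", task));
    }
}
```

In a real cluster the claim would have to live outside the nodes, so that it survives a restart and is visible to all of them; the in-memory map here only demonstrates the "first claim wins" rule.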
We are using the Spring scheduler with
@Scheduled(cron = "0 15 10 15 * ?")
The problem is that we sometimes have maintenance, and the system is down when the job is scheduled to run.
Is there another scheduler we can use? Maybe a parameter that checks whether a scheduled job failed to run during maintenance and runs it once the system is back up?
Or can you recommend a different scheduler?
Thanks
M. Deinum mentioned Quartz as a possible solution. It is a very advanced scheduling product that can handle scheduling across multiple nodes, ensuring that a job runs on only one node. It has many other features. I haven't used it in a long while, so look it up to see whether it fits your needs.
However, I have dealt with your particular case in a simpler way. On each run, part of the scheduled job's responsibility was to write three values into a DB table: the last scheduled time (the one in the past that triggered the current run), the next scheduled time, and the actual last execution time. Then, when the server starts up after downtime, it checks whether the next scheduled time is in the past (in which case the last execution time will also be older than the next scheduled time). If so, that is your flag that the job missed its run due to downtime (or any other reason), and you can reschedule it or run it now.
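The startup check described above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, the timestamps are invented sample values, and reading the row from the DB table is left out.

```java
import java.time.Instant;

final class MissedRunCheck {

    /**
     * True when the persisted next scheduled time is already in the past and the
     * last recorded execution happened before it, i.e. that run never took place.
     */
    static boolean missedRun(Instant nextScheduled, Instant lastExecuted, Instant now) {
        return nextScheduled.isBefore(now) && lastExecuted.isBefore(nextScheduled);
    }

    public static void main(String[] args) {
        // Hypothetical values as they might be read from the job's DB row at startup.
        Instant nextScheduled = Instant.parse("2024-02-15T10:15:00Z");
        Instant lastExecuted = Instant.parse("2024-01-15T10:15:03Z");
        Instant startup = Instant.parse("2024-02-15T14:00:00Z");

        if (missedRun(nextScheduled, lastExecuted, startup)) {
            System.out.println("missed run detected, running the job now");
        }
    }
}
```

Passing `now` in as a parameter, rather than calling `Instant.now()` inside the method, keeps the check easy to test.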
P.S. This will not address your actual problem, but I wrote my own scheduler and published it as part of an open-source library. It lets you set time intervals in a more human-readable form, such as "4h" for 4 hours or "30m" for 30 minutes. It can also schedule multiple tasks and lets you specify the number of threads that will handle them. You can read about it here. The library is called MgntUtils and is available as Maven artifacts or from the GitHub repository releases (with source code and Javadoc included). You can read an article about the library that describes some of its features here.
I built an application that runs tests on network nodes, such as ping tests, retrieving disk space, and so on.
I use a scheduled batchlet to run the actions, but I wonder whether that is the right use of a batchlet.
Would an EJB timer be more appropriate? Also, when I run a batchlet, my GlassFish server keeps a log of the batch job, and I don't necessarily need it (especially given the number of batch jobs generated during a day).
If I need to run several jobs at the same scheduled time, I think a batchlet can do it, but can an EJB timer do it too?
Could you give me your input on the right way to achieve this?
Thanks,
Ersch
This isn't a question with a clear answer, but there is a bit of a cost in factoring your application as a batch job, and I would look at what I'm getting to see if it's worth doing so.
So you're thinking about a job consisting of a single Batchlet step. Well, there'd be nothing gained from the "restart" functions, neither restarting at the failing step within a job nor leveraging checkpoints within a chunk step. The batchlet programming model is quite simple... even if you really like @BatchProperty, you'd now have to deal with XML to use it.
This only starts to get more interesting if you want to start, view, and manage these executions along with the rest of your batch jobs. This might be because you're working with an implementation that offers some kind of implementation-specific add-on function. An example of this could be an integration with external scheduler software, allowing jobs to be scheduled by it. At the other extreme, if you found value in having a persisted record of all your batch job executions in one place (the job repository, usually a persistent DB), then that could also make this worthwhile for you.
But if you don't care for any of that, then an EJB timer could be the way to go instead.
Using an EJB timer is appropriate when your task executes in an eye blink (or thereabouts).
Otherwise use the batching mechanism.
Long-running tasks executed from EJB timers can be problematic because they execute in transactions, which normally time out after a short period. Increasing the transaction timeout also increases the chance of holding database and other resource locks for longer, which can impact the normal operation of your application.
I'm working on a project that will record data on real-time events using Java on a Linux system.
I have all of the HTML scraping stuff down, that's fine, what I need to figure out is the scheduling and management of the tasks.
There are potentially up to forty events occurring each week, at varying times and events can last up to three hours.
I can create and update the calendar of these events at will, my problem is how to:
Schedule a process to scrape each event at the right time, and update the schedule if there's a change.
Ensure once the scrape process has begun that it stays running for the entire (indeterminate) duration of the event.
Can anyone advise how best to approach this? I'm not sure where I need to start.
Thanks!
a) Schedule a process to scrape each event at the right time, and update the schedule if there's a change.
If you do not want to use a library, ScheduledExecutorService is a good starting point for scheduling your tasks. Other scheduling frameworks may also suit your problem. Quartz in particular gives you flexibility in scheduling the next task based on the result of the current execution, and it also provides cron support, so if your schedule is fixed you can take advantage of a fixed calendar.
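As a rough sketch of the ScheduledExecutorService route, the class below schedules one task per event and replaces the pending task when the calendar changes. The names `EventScheduler`, `delayMillis`, `schedule` and `close`, the pool size of 4 and the event id are all assumptions made for the example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class EventScheduler {
    private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(4);
    private final Map<String, ScheduledFuture<?>> pending = new ConcurrentHashMap<>();

    /** Delay until the start time, clamped to zero so a start in the past runs right away. */
    static long delayMillis(Instant now, Instant start) {
        return Math.max(0L, Duration.between(now, start).toMillis());
    }

    /** Schedules the task for an event, replacing any earlier schedule for the same event id. */
    ScheduledFuture<?> schedule(String eventId, Instant start, Runnable task) {
        ScheduledFuture<?> next =
                executor.schedule(task, delayMillis(Instant.now(), start), TimeUnit.MILLISECONDS);
        ScheduledFuture<?> previous = pending.put(eventId, next);
        if (previous != null) {
            previous.cancel(false); // drop the outdated entry without interrupting a running scrape
        }
        return next;
    }

    /** Stops the executor: interrupts running tasks and discards those still waiting. */
    void close() {
        executor.shutdownNow();
    }

    public static void main(String[] args) {
        EventScheduler scheduler = new EventScheduler();
        Runnable scrape = () -> System.out.println("scraping event-1");
        scheduler.schedule("event-1", Instant.now().plusSeconds(3600), scrape);
        // The calendar changed: the same event now starts in two hours instead.
        scheduler.schedule("event-1", Instant.now().plusSeconds(7200), scrape);
        scheduler.close();
    }
}
```

Note that these schedules live only in memory, so after a restart the application would need to rebuild them from the stored calendar.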
b) Ensure once the scrape process has begun that it stays running for the entire (indeterminate) duration of the event.
Assuming that you're using a library for HTML scraping, you don't need to do anything special to keep it running, since it will be a Java task object started from your application.
Say my web app has two Spring cron jobs scheduled to run every minute/hour. Before redeploying my app I can't just shut it down; I must wait for the jobs to finish correctly. I can set a flag in the database or somewhere else so that the jobs stop repeating every minute or hour: a check inside the job function returns without doing anything on the next trigger call if the flag is set.
But how can I wait for the current job call to finish? And how can I see that from Ant or another external script, to choose a good time for the server shutdown?
One solution may be to put a flag in the DB or a file and read it. But maybe there is a more accurate way, JMS or something like that?
Instead of using the database you can use JMX. -- But in my humble opinion: If you already use a database but no JMX yet, then use a DB flag, or a simple file, to signal the timer to "stop".
Nicer than checking the flag on every invocation would be calling shutdown() on the ThreadPoolTaskScheduler or ThreadPoolTaskExecutor.
But this would require a second task (run every minute or more often) to check the flag and then shut down the scheduler.
To make the Ant script wait for a while after setting the flag, so that the last task is certainly done, you can use the WaitFor task.
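The shutdown-then-wait idea can be illustrated with the plain java.util.concurrent scheduler, used here as a stand-in for the Spring scheduler. This is a sketch under that assumption; `GracefulStop` and `stopAndWait` are hypothetical names, and the one-minute job is a placeholder.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class GracefulStop {

    /**
     * Stops further periodic runs, then waits for the run in progress.
     * Returns true if the scheduler terminated within the timeout.
     */
    static boolean stopAndWait(ScheduledExecutorService scheduler, long timeout, TimeUnit unit) {
        scheduler.shutdown(); // no new periodic runs start; a run in progress is not interrupted
        try {
            return scheduler.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> System.out.println("job run"), 0, 1, TimeUnit.MINUTES);
        Thread.sleep(100); // give the first run time to start
        boolean clean = stopAndWait(scheduler, 30, TimeUnit.SECONDS);
        System.out.println(clean ? "safe to redeploy" : "job still running after timeout");
    }
}
```

With Spring's ThreadPoolTaskScheduler, a similar effect can probably be had through its `setWaitForTasksToCompleteOnShutdown(true)` and `setAwaitTerminationSeconds(...)` settings; check the documentation for your Spring version.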
If you're using a TaskExecutor to run the "cron" jobs, then you could use the method described by skaffman in this answer to get the results of the task execution. If you need access to this from Ant, you could probably use Spring's JMS support with ant-jms.
In my application I need to have periodically run background tasks (which I can easily do with Quartz - i.e. schedule a given job to be run at a specific time periodically).
But I would like to have a little bit more control. In particular I need to:
have the system rerun a task that wasn't run at its scheduled time (i.e. the server was down and because of this the task was not run. In such a situation I want the 'late' task to be run ASAP)
it would be nice to easily control tasks - i.e. run a task on demand or see when a given task was last run or reschedule a given task to be run at a different time
It seems to me that the above points can be achieved with Spring Batch Admin, but I don't have much experience in this area yet. Also, I've seen numerous posts saying that Spring Batch is not a scheduling tool, so I'm beginning to doubt what the right tool for the job is here.
So my question is: can the above be achieved with Spring Batch Admin? Or perhaps Quartz is enough but needs configuring to do the above? Or maybe I need both? Or something else?
Thanks a lot :)
Peter
have the system rerun a task that wasn't run at its scheduled time
This feature in Quartz is called Misfire Instructions; it does exactly what you need and is a lot more flexible. All you need is to configure a JDBCJobStore.
it would be nice to easily control tasks - i.e. run a task on demand or see when a given task was last run or reschedule a given task to be run at a different time
You can use Quartz JMX to access various information (like the previous and next run times) or query the Quartz database tables directly. There are also free and commercial management tools based on these inputs. I believe you can also run jobs manually there.
Spring Batch can be integrated with Quartz, but it cannot replace it.