There are two ways a user can generate a report:
1) The user clicks a button on the front end and a job runs to generate the report.
2) The user schedules the report to be generated weekly, monthly, etc.
For scenario 1, I decided to first save the request to a table, say "REQUEST_TBL". Right after that, I will run a ThreadPoolTaskExecutor task that picks up that specific request from "REQUEST_TBL". A lot of users could request a report, but each user can have at most 30 reports in total (to generate a new report, the user has to delete an old one).
For scenario 2, the user can schedule a certain report to be generated weekly or monthly. A weekly (or monthly, etc.) job will then run and generate the report that the user scheduled.
Now, I am not sure how to implement the report generator job: whether to use ThreadPoolTaskExecutor or not, and whether to use the same program to handle both on-demand and scheduled report requests.
I am planning to have one job run every minute to read "REQUEST_TBL" and call ThreadPoolTaskExecutor.execute() for each record. But if 1000 users request a report at the same time, how should I handle thread creation? For the scheduled job, I am planning to run it at end of day only. It will read from the same "REQUEST_TBL" and look for requests that are scheduled. For scenario 1, if I run a job every, say, 2 minutes, until what time should it run? It may be that a scheduled report needs to run at the end of that day. I also thought of running a job every 2 minutes because, if the server goes down, there would otherwise be no way to regenerate the report once the server is restarted.
I would appreciate your suggestions.
You are asking many questions at once, so here are a few thoughts:
Definitely don't create threads yourself. Instead, use one of the Executors and limit the number of threads that way.
For the rest, I'd do it in the most consistent way: every record in REQUEST_TBL also gets the time at which it needs to be generated. In scenario 1 you save the current time together with the request. In scenario 2 you create one or more records with a timestamp a week (or a month) ahead.
Then you can run a job every minute or two that queries for requests whose request time is before or equal to now, and submits a task to the executor for each returned record.
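A minimal sketch of that polling step, assuming an in-memory list stands in for REQUEST_TBL. The names DueRequestPoller, Request, and claimDue are made up for illustration, and a real version would query the database and mark rows in a transaction:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class DueRequestPoller {
    /** Hypothetical row of REQUEST_TBL: an id plus the time the report is due. */
    static final class Request {
        final long id;
        final Instant dueAt;
        boolean dispatched; // assumption: a flag so a row is not picked up twice

        Request(long id, Instant dueAt) {
            this.id = id;
            this.dueAt = dueAt;
        }
    }

    /** Returns the not-yet-dispatched requests due at or before 'now' and marks them dispatched. */
    static List<Request> claimDue(List<Request> table, Instant now) {
        List<Request> due = new ArrayList<>();
        for (Request r : table) {
            if (!r.dispatched && !r.dueAt.isAfter(now)) {
                r.dispatched = true;
                due.add(r);
            }
        }
        return due;
    }

    public static void main(String[] args) throws InterruptedException {
        Instant now = Instant.now();
        List<Request> table = new ArrayList<>();
        table.add(new Request(1, now));                         // scenario 1: due immediately
        table.add(new Request(2, now.plusSeconds(7 * 86400)));  // scenario 2: due in a week

        // Bounded pool: extra tasks wait in the executor's queue instead of spawning threads.
        ExecutorService workers = Executors.newFixedThreadPool(4);
        for (Request r : claimDue(table, now)) {
            workers.execute(() -> System.out.println("generating report " + r.id));
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

In a real setup the body of main would be what the every-minute job does. With a fixed-size pool, 1000 simultaneous requests simply queue up behind the 4 worker threads rather than creating 1000 threads.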
I have a server and a table. My current timed jobs are started in a class that has different methods annotated with the @Schedule annotation.
Now I have another kind of timed job, where different requests should be sent to a service at a definable time interval. The user can choose something like every 5 minutes, every hour, or daily. I will make a list of valid intervals so that there won't be values like every 38 minutes.
So this new timer has to look up in the table the interval at which a job has to be done and then call the function to get the data from the service.
Is this possible without adding a new column for something like "Next run"? And what about timers that coincide, such as a 5-minute, an hourly, and a daily timer all firing at once, once a day?
Rather than adding a "Next Run" column, I would add a new table that logs when each job was run. That way, when your service worker runs, it can compare the configured run interval against the last run time.
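The comparison against the last logged run could look roughly like this. It is only a sketch: IntervalCheck and isDue are hypothetical names, and the last run time is assumed to come from the log table (null if the job has never run):

```java
import java.time.Duration;
import java.time.Instant;

final class IntervalCheck {
    /** True when the job has never run, or at least 'interval' has elapsed since 'lastRun'. */
    static boolean isDue(Instant lastRun, Duration interval, Instant now) {
        return lastRun == null || !now.isBefore(lastRun.plus(interval));
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        // A 5-minute job that last ran 6 minutes ago is due; an hourly one is not.
        System.out.println(isDue(now.minus(Duration.ofMinutes(6)), Duration.ofMinutes(5), now));
        System.out.println(isDue(now.minus(Duration.ofMinutes(6)), Duration.ofHours(1), now));
    }
}
```

Because every job is checked independently on each pass, a 5-minute, an hourly, and a daily job that happen to be due in the same pass are simply all picked up together.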
Our project is in the airline domain, and our system manages the flights for an airline. The system gets the flight details from an external source as messages, and we always have flights up to 90 days into the future in our DB.
We have a requirement to send a message to an external system X minutes before the departure time of a flight. For example, 90 minutes before a flight departs, a message needs to be sent to the external system. This needs to happen for every flight of the day.
We are planning to implement it so that, when a flight message comes into our system, we create a Quartz trigger for that flight to send the message 90 minutes before its departure time.
The problem is that there will be more than 300 flights in a day. That means at least 300 triggers are created in the system for a single day, and we think this could lead to performance bottlenecks in the scheduler.
Please suggest if there is a better alternative to this solution. For instance, could we achieve it with just one trigger that queries the database at frequent intervals and sends the flight message for all flights that satisfy the condition?
I found a way to solve the issue and am posting it here so that somebody else can benefit from it.
Creating a large number of Quartz triggers is not at all a good idea, and we resolved the issue with just two jobs.
The first job runs once a day at midnight. It finds all the flights for that day, calculates each flight's message sending time, and writes this information to a table.
The second job runs every five minutes. It reads the table to see whether any message needs to be sent at the current time, along with any that failed in a previous attempt. It then sends the messages and updates their status accordingly.
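The core logic of the two jobs can be sketched as follows. This is only an illustration: the class, the Entry type, the status values, and the in-memory list standing in for the table are all assumptions, not the code we actually run:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class FlightMessageJobs {
    enum Status { PENDING, SENT, FAILED }

    /** Hypothetical row of the table written by the midnight job. */
    static final class Entry {
        final String flight;
        final Instant sendAt;
        Status status = Status.PENDING;

        Entry(String flight, Instant sendAt) {
            this.flight = flight;
            this.sendAt = sendAt;
        }
    }

    /** Midnight job: the sending time is the departure time minus the lead time. */
    static Instant sendTime(Instant departure, Duration lead) {
        return departure.minus(lead);
    }

    /** Five-minute job: everything whose send time has passed and that is not yet SENT (so FAILED is retried). */
    static List<Entry> selectToSend(List<Entry> table, Instant now) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : table) {
            if (e.status != Status.SENT && !e.sendAt.isAfter(now)) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Instant departure = Instant.parse("2024-01-01T12:00:00Z");
        List<Entry> table = new ArrayList<>();
        table.add(new Entry("XY123", sendTime(departure, Duration.ofMinutes(90))));

        for (Entry e : selectToSend(table, Instant.parse("2024-01-01T10:30:00Z"))) {
            System.out.println("sending message for " + e.flight);
            e.status = Status.SENT; // would be FAILED if the send threw
        }
    }
}
```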
In my Spring project, I have an entity Customer.
Once we get a new Customer, we persist it in our system, and exactly one hour later I want to check whether the Customer has made any purchase.
If yes, I take some action. If not, I take some other action.
I contemplated two strategies,
1) Firing up an event when the Customer is persisted. And then having the event listener thread sleep for one hour. I believe this will be a very bad way to handle this.
2) Having a cron check every once in a while for customers for whom one hour has passed since registration. But then, I figure it will be very difficult to be accurate. I would have to run the cron every minute which won't be great.
Any ideas?
You could use 'ScheduledThreadPoolExecutor', which per the Javadoc is:
A ThreadPoolExecutor that can additionally schedule commands to run after a given delay, or to execute periodically
In your case, when a customer is created, you can use the 'schedule' method to run a task after 1 hour and perform the required activities. The same executor also offers 'scheduleAtFixedRate' and 'scheduleWithFixedDelay' if you want those activities to be executed periodically.
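A small sketch of the delayed check, using a short delay in place of the hour so it can be run directly. DelayedCustomerCheck and scheduleCheck are hypothetical names, and the returned string is just a stand-in for the real purchase check:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class DelayedCustomerCheck {
    /** Schedules a one-shot check for the given customer after the given delay. */
    static ScheduledFuture<String> scheduleCheck(ScheduledThreadPoolExecutor executor,
                                                 long customerId, long delay, TimeUnit unit) {
        // Placeholder task: a real one would look up the customer's purchases.
        Callable<String> task = () -> "checked customer " + customerId;
        return executor.schedule(task, delay, unit);
    }

    public static void main(String[] args) throws Exception {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        // In the real case this would be scheduleCheck(executor, id, 1, TimeUnit.HOURS).
        ScheduledFuture<String> result = scheduleCheck(executor, 42L, 100, TimeUnit.MILLISECONDS);
        System.out.println(result.get()); // blocks until the delayed task has run
        executor.shutdown();
    }
}
```

One thing to keep in mind is that tasks scheduled this way live only in memory, so any pending checks are lost if the application restarts before the hour is up.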
I believe running the cron every minute is not that bad. How many customers would you handle in one minute?
Although I'm not sure why you cannot use the purchase event itself, i.e. when a registered customer makes a purchase, you can take the action inline right then.
You described two strategies. Both will work, but I would prefer a cron job, which you can configure explicitly. That way you avoid the overhead of maintaining the threads. If you configure the cron job timing correctly and allow only a single job to run at a time, I do not see any problem with that. Remember that cron jobs are meant for batch processing rather than handling events.
I have around 1000 entries in my datastore and this is likely to increase with time to around 10,000 entries. My task is to update certain properties of each row and save it back, and this task has to be performed every 24 hours.
So, what should I use?
First, you create a cron job that runs every 24 hours.
Second, you need to decide what this cron job will do. The simplest option is to update all 1,000 records. You can retrieve and save all entities in large batches (e.g. 500 per call). If this is a simple update of values, it will take just a few seconds.
Since cron jobs are not retried if they fail, a better option is to create a task and add it to the queue. All updates will happen within that task.
NB: Make sure that if your task is retried, it won't mess up the data. If this is not possible, you will have to use some kind of flag (e.g. a timestamp of the last update) to separate updated entities from those that still need updates.
As your data set grows, your cron job can start multiple tasks to update, for example, 1,000 records in each task.
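Splitting the records into per-task batches can be sketched like this. BatchSplitter and partition are made-up names, and the datastore and task queue calls are deliberately left out since they depend on your platform:

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSplitter {
    /** Splits 'items' into consecutive batches of at most 'batchSize' elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            ids.add(i);
        }
        // The cron handler would enqueue one task per batch instead of printing.
        for (List<Integer> batch : partition(ids, 500)) {
            System.out.println("enqueue task for " + batch.size() + " records");
        }
    }
}
```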
In the task queue, tasks have to be added manually through code. If you want this to happen automatically at a fixed interval, what you need is a cron job.
You need both:
A cron job to start your batch update job every 24 hours.
Task queues to process your records.
I'm about to create a small application which will be responsible for sending out various reports to various users at various intervals. We might be talking about 50 or 100 different reports going to different people. Some reports need to be generated every day, some every week, and some every month.
I've used the Quartz library before to run tasks at regular intervals. However, in order to keep things simple, I like the thought of having a single Quartz thread taking care of all reports. That is, the thread should loop through all reports, say every 15 minutes, and determine whether it is time for one or more to be generated and sent. It does not matter if a report is generated at 12:00 or 12:15.
I'm wondering whether it would be possible, somehow, to set up specific times for each report, such as "mon#12:00,wed#12:00" or "fri#09:30". Then, based on that, the thread would determine whether it was time to send a report or not.
My question is: has anyone else done something like this, and do any libraries exist which can make it easy to implement this task?
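To make the idea concrete, here is a rough sketch of how such a spec string could be checked on each 15-minute pass. Everything here is an assumption for illustration: the three-letter day codes, the ReportSpec and isDue names, and the rule that a report is due if one of its times falls between the previous check and now. Malformed specs are not handled.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.List;
import java.util.Map;

final class ReportSpec {
    // Assumed day codes for the "day#HH:mm" format.
    private static final Map<String, DayOfWeek> DAYS = Map.of(
            "mon", DayOfWeek.MONDAY, "tue", DayOfWeek.TUESDAY, "wed", DayOfWeek.WEDNESDAY,
            "thu", DayOfWeek.THURSDAY, "fri", DayOfWeek.FRIDAY, "sat", DayOfWeek.SATURDAY,
            "sun", DayOfWeek.SUNDAY);

    /** True if any "day#HH:mm" entry of 'spec' falls in the window (previousCheck, now]. */
    static boolean isDue(String spec, LocalDateTime previousCheck, LocalDateTime now) {
        for (String entry : spec.split(",")) {
            String[] parts = entry.trim().split("#");
            DayOfWeek day = DAYS.get(parts[0].toLowerCase());
            LocalTime time = LocalTime.parse(parts[1]);
            // Look at both dates so a window that crosses midnight is still covered
            // (assumes the window is shorter than one day).
            for (LocalDate date : List.of(previousCheck.toLocalDate(), now.toLocalDate())) {
                LocalDateTime candidate = date.atTime(time);
                if (date.getDayOfWeek() == day
                        && candidate.isAfter(previousCheck)
                        && !candidate.isAfter(now)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        System.out.println(isDue("mon#12:00,wed#12:00", now.minusMinutes(15), now));
    }
}
```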
Why not simply register a separate Quartz task instance for each report and let Quartz handle all the scheduling for you? That is, after all, the point of it.
You can create just a single thread that pings a "job schedule data structure" at some time interval to see if it needs to run a report. If so, it runs the report; otherwise, it goes for a short nap and pings again after the specified sleep time.
This will cause problems if one job takes too long to complete and jobs start accumulating.
The job schedule data structure would keep its records sorted by timestamp.
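One possible shape for such a structure is a priority queue ordered by run time. The following is only a sketch with made-up names (ReportSchedule, Job, pollDue), not a finished scheduler:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class ReportSchedule {
    /** Hypothetical schedule record: which report to run and when. */
    static final class Job {
        final String report;
        final Instant runAt;

        Job(String report, Instant runAt) {
            this.report = report;
            this.runAt = runAt;
        }
    }

    // The earliest job is always at the head of the queue.
    private final PriorityQueue<Job> queue =
            new PriorityQueue<>(Comparator.comparing((Job j) -> j.runAt));

    void add(String report, Instant runAt) {
        queue.add(new Job(report, runAt));
    }

    /** Removes and returns, earliest first, the reports whose run time is at or before 'now'. */
    List<String> pollDue(Instant now) {
        List<String> due = new ArrayList<>();
        while (!queue.isEmpty() && !queue.peek().runAt.isAfter(now)) {
            due.add(queue.poll().report);
        }
        return due;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        ReportSchedule schedule = new ReportSchedule();
        schedule.add("weekly-sales", now.minusSeconds(60));
        schedule.add("monthly-summary", now.plusSeconds(3600));
        // The single polling thread would call this after each nap.
        System.out.println(schedule.pollDue(now));
    }
}
```

Since the head of the queue is the next job to run, the polling thread only has to look at the front and can stop as soon as it sees a job that is not yet due.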