I did an application to do some testing on network nodes like ping test, retrieve disk space ans so on.
I use a scheduled batchlet to run the actions but I wonder if it is the rigth use of batchlet?
Does an EJB timer should be more relevant? Also, when I run a batchlet, my glassfish server keeps a log of the batch job and I don't necessary need it (especially with the amount of batch jobs genereted during a day).
If I need to run some job in the same schedule time, I think batchled can do it but EJB timer too?
Could you give me your input on the rigth way to achieve this?
Thanks,
Ersch
This isn't a question with a clear answer, but there is a bit of a cost in factoring your application as a batch job, and I would look at what I'm getting to see if it's worth doing so.
So you're thinking about a job consisting of a single Batchlet step. Well, there'd be nothing gained from "restart" functions, neither at the failing step within a job nor leveraging checkpoints within a chunk step. The batchlet programming model is quite simple... even if you really like #BatchProperty you'd have to deal with an XML now to do so.
This only starts to get more interesting if you want to start, view, and manage these executions along with the rest of your batch jobs. This might be because you're working with an implementation that offers some kind of implementation-specific add-on function. An example of this could be an integration with external scheduler software, allowing jobs to be scheduled by it. At the other extreme, if you found value in having a persisted record of all your batch job executions in one place (the job repository, usually a persistent DB), then that could also make this worthwhile for you.
But if you don't care for any of that, then an EJB timer could be the way to go instead.
Using an EJB timer is appropriate when your task executes in an eye blink (or thereabouts).
Otherwise use the batching mechanism.
Long running tasks executed from EJB timers can be problematical because they execute in transactions which normally time out after a short period of time. Increasing this transaction time out also increases the chances of database and perhaps other resource locks which can impact normal operation of your application.
Related
I have a Java/Database project in Netbeans that I would like to run once a day at a set time. I am using Derby for the database driver. I am trying to automate a process.
How can I 'schedule' this program to run at specified times?
How can I customize this to keep running until a certain criteria is met?
Say my criteria is that It has to populate 500 rows in the database. (So say at the scheduled time it runs it can only populate 400 rows, then maybe 2 hours later it tries running again to fill the last 100 rows)
Lastly, what are the best practices of automation and scheduled tasks?
How can I 'schedule' this program to run at specified times?
This can be done one of two ways, depending on your operating system - write a job that kicks off the java program at the intervals you need. You may then hook up the job to be started off on start up.
In Linux you can accomplish this with a cron job or so. On windows you may refer to this http://support.microsoft.com/kb/308569.
You may also program the scheduler into your java program using http://quartz-scheduler.org or http://www.sauronsoftware.it/projects/cron4j/ .
How can I customize this to keep running until a certain criteria is met?
This is perhaps best established from within your program, although it is hard to give you directions without much info.
Lastly, what are the best practices of automation and scheduled tasks?
Depending on your application architecture, scheduling and automation can be handled either from within the app or get support from the operating system. The criteria depends on how much control the application needs, which platform makes scheduling easy etc.
Hope this helps.
Quartz is a scheduling project for Java. I have used it in many projects and find it to be very intuitive.
It may be a little over the top for what your after but worth a look anyway.
You can make use of Timer for scheduling the events & the events/task must be implemented using TimerTask
I am doing a web application which has Java as a front end and shell script as a back end. The concept is I need to process multiple files in the back end. I will get the date range from the user (for example from July 1st-8th) and for each day process around 100 files. So in total I have 800 files to process.
I will get these details from JSP and delegate a background call to shell script and get back the results and display the same to the user.
Now I did all these in a sequential approach - by which I mean without threads. So there is only one main thread that executes and the user has to wait till 800 files are processed sequentially. However this is really slow. And because of this I am thinking of going for threads. Since I am a beginner of threads, I read a some stuffs regarding this and I have come up with the following idea:
As I read threads work have to be split .. I thought of splitting the
8 day work to 4 threads where each thread would perform 2 day work
I would like to know whether I am following a correct approach and my major concerns are:
Is it recommended to spawn multiple threads from a web application
Whether or not this is a good approach
Some guidance of how to proceed with this. An example instance would be great. Thank you.
Yes, you can run the long processing job in multi-threaded or in any high performance environment. You should also you Servlet 3.0 Asynchronous Request Processing to suspend the request thread and wait till the Long processing task is done.
Yes, there's nothing wrong with spawning multiple threads from a web application. In fact, if you're running a Servlet container (which you most likely are since you're using Java), it's already spawning multiple threads for you. In general a Servlet container will automatically spawn a new thread (or reuse one out of a pool) to handle each request it receives.
Your approach is fine, thought you'll want to fine-tune the number of threads to something that is suitable given the hardware configuration of your system and the amount of concurrent load on your web service. Also note that while spinning up a bunch of threads will reduce the total amount of time needed to process all the data, it will still leave a potentially large chunk of time before any data is ready to go back to the user. So you might get a better result by doing smaller work units sequentially, and posting each batch of results to the user-interface as soon as it is ready. Then it will still take a long while for the user to have all the data, but that can start viewing at least a portion of it almost immediately.
The way to improve user experience is not by parallelizing at Servlet level on 100000 threads but rather to provide incremental rendering of the view. First of all it would be useful to separate your application in multiple layers, according to the MVC pattern for example.
Saying that, you will have to look on how
Create a service that is able to return partial answers and a last answer, meaning that all available data has been returned. Each of this answers can be computed in parallel to improve performance.
Fill a web page incrementally, tipically by calling back this service which returns a JSON string you use to add data to the DOM. Every time you get an answer, if this is a partial answer, you call again the service providing the previous sequence number.
If you look to Liligo to understand, you will see how this is works. The technique I described is known as polling, but there are others technique to obtain similar asynchronous results at UI Level. In general, you don't want to work directly with the Servlet API, which is a very low level API,but rather use a reasonable framework or abstraction for that.
If you want a warm advice, you should have a look to the Play! framework http://www.playframework.org/documentation/2.0.2/JavaStream HTTP streaming.
Create threads in a web application is not a good solution. It is a bad design because normally it would be the container (web server) who is charged with that activity. So I think you have to find another solution.
I suggest you putting the shell scripts in cron, scheduled to run each minute, and to "activate" them you can touch files that act as semaphores. At each run the scripts verify if the web application touched the semaphore file, if so they read the date interval from those files and then start to process.
In my application I need to have periodically run background tasks (which I can easily do with Quartz - i.e. schedule a given job to be run at a specific time periodically).
But I would like to have a little bit more control. In particular I need to:
have the system rerun a task that wasn't run at its scheduled time (i.e. the server was down and because of this the task was not run. In such a situation I want the 'late' task to be run ASAP)
it would be nice to easily control tasks - i.e. run a task on demand or see when a given task was last run or reschedule a given task to be run at a different time
It seems to me that the above points can be achieved with Spring Batch Admin, but I don't have much experience in this area yet. Also, I've seen numerous posts on how Spring Batch is not a scheduling tool so I'm becoming to have doubts what the right tool for the job is here.
So my question is: can the above be achieved with Spring Batch Admin? Or perhaps Quartz is enough but needs configuring to do the above? Or maybe I need both? Or something else?
Thanks a lot :)
Peter
have the system rerun a task that wasn't run at its scheduled time
This feature in Quartz is called Misfire Instructions and does exactly what you need - but is a lot more flexible. All you need is to define JDBCJobStore.
it would be nice to easily control tasks - i.e. run a task on demand or see when a given task was last run or reschedule a given task to be run at a different time
You can use Quartz JMX to access various information (like previous and next run time) or query the Quartz database tables directly. There are also free and commercial management tools basex on the above input. I believe you can also manually run jobs there.
Spring Batch can be integrated with Quartz, but not replace it.
I have a general question related to the quartz scheduling framework:
I need to perform a task after a fixed amount of time after a user registration. For the sake of simplicity let's say exactly 1 hour after registration of a user in my system. The job MUST be done, even if the system is restarting during this one hour the task must be remembered and it MUST be performed later if my system is down at the usual time.
Is this something where I can or where I would use Quartz? I looked at persistent jobs which looks quite promising but I am not sure if this will still work out for 1000 jobs a day. Furthermore, I am not sure about the performance implications. Maybe someone can help me with information here.
If Quartz is not the right choice, which other ways/frameworks do you see for this issue? My application is a Java 6/Spring 3 based Web-App.
Thanks for your help!
We are using quartz persisted job store successfully in our production environment for a SaaS platform application where 100s of jobs are running.
I need to schedule a task that will perform the task at the given time by user. But this scheduling I need to run whether the application is running or not. So how could I specify the scheduling using quartz?
I am writing the code for the situation in servlet and then from where I need to run that servlet I am bit confused about it, because if I would use the load-on-startup it will call servlet every time the application is loaded so it will cause the job detail to be duplicated in data tables. And the scheduling will stop when user will logout the session. But I want the scheduling to be running till the tomcat is running.
Any help is appreciated. Thanks in advance.
To avoid duplication of the job in the database you can write your code to first check for the job before scheduling, or to first delete the job (whether or not it exists) before scheduling.
In 2.x API there is a checkExists() set of methods.