I've been looking for a replacement for my company's current batch processing system (Java SE + crontab). There is a lot of duplication between Java code and shell scripts, most jobs are ETL and follow very similar steps, and I also want platform independence instead of relying on crontab. To be more specific about our workflow, the current job creation steps are these:
Develop a Java program that meets a business requirement.
Test it in a production-like environment until it meets the business requirement's needs.
Deploy it to a production server with a shell script that provides file maintenance, Java program execution, and error-handling routines (prevent two processes of the same name from running, mail the log to support and developers if the program fails, check for the output file's existence after the Java program ends if it's relevant for the interface), and specify the recurrence data (how often the program will run).
Much of this same logic is being redesigned into a system of generic routines that these programs or "interfaces" (that's what they call them there) currently implement independently (usually with copy-pasted code, since most routines are similar). But I am still missing a very important part which I need help with: the scheduler implementation. It needs to meet one of these two needs:
-I want to guarantee that whenever I stop the scheduling server for a system update (because new jobs were added, etc.) or for any other reason, the jobs that could not run while the system was down (for example, 3 jobs that missed their 3:00 P.M. slot) run when the server comes back up, even though their scheduled time has passed.
OR, if the first option is not possible:
-I need a way to add new jobs to the scheduler and update the JARs that provide these jobs without restarting the scheduler (sort of like OSGi).
Either of these would satisfy my requirements and end my search for a replacement. I've looked into Quartz, Oddjob (there's a scheduler in production using it, but it needs restarting each time you add new jobs/libraries, so it does not satisfy my needs), and OSGi on an application server, but better suggestions are much appreciated.
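For context, the first requirement is essentially what Quartz calls misfire handling: a persistent job store records each trigger's fire times, and triggers whose time passed while the scheduler was down can be fired on restart. The core catch-up check can be sketched in plain Java (the persistence of the last-run timestamp is omitted; all names here are illustrative):

```java
import java.time.*;

// Plain-Java sketch of the catch-up idea behind the first requirement
// (Quartz implements this properly via persistent job stores and misfire
// instructions). Persist each job's last successful run; on startup, any
// job whose scheduled time passed while the server was down is run now.
public class MissedRunCheck {

    /** True if a daily run at 'scheduledTime' was missed between lastRun and now. */
    static boolean missedRun(LocalDateTime lastRun, LocalTime scheduledTime,
                             LocalDateTime now) {
        // Next occurrence of the scheduled time strictly after the last run.
        LocalDateTime due = lastRun.toLocalDate().atTime(scheduledTime);
        if (!due.isAfter(lastRun)) {
            due = due.plusDays(1);
        }
        // If that occurrence is already in the past, we missed it while down.
        return !due.isAfter(now);
    }
}
```

On startup you would loop over all jobs, run the ones for which `missedRun` is true, and then hand the rest back to the normal scheduler.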
You might also want to take a look at http://jcrontab.sourceforge.net/
Jcrontab is a scheduler written in Java. The project's objective is to provide a fully functional scheduler for Java projects.
Alright, I found just what I wanted: Quartz does the trick, but I had to develop my own management UI. FORTUNATELY, there's this project http://code.google.com/p/myschedule/ which contains everything I need (adding, removing, and resuming jobs), and the webapp is cheap to run since you can use Tomcat. Now I can focus on designing reusable jobs :), thank god for Quartz!
I have the below requirements.
In the commit message we have to capture some mandatory data like file name, user story, description, etc., which will later be stored in a database. Can we create conditional tags so that a developer can supply either a user story, CR#, or defect# as the reason the fix is required?
We have a Spring Boot Gradle project. We have to automate static code analysis and JUnit analysis through a pre-commit hook. Can we write these hooks in Java?
Please share some examples.
For your commit messages, there is a customary way to include various pieces of data: the trailer system. Trailers are lines of the following format:
Signed-off-by: A U Thor <author@example.com>
Fixes: 1234
You can extract this data either by parsing the commit message yourself or by using git interpret-trailers. This may be possible in JGit if you want to do it in Java, though it may or may not have built-in support for parsing trailers. If you need to implement it yourself, it is strongly recommended that you use the exact same parsing technique that Git does, to avoid creating compatibility problems.
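If you do end up parsing in Java, a rough sketch of trailer extraction might look like the following. Note that Git's real rules (continuation lines, the heuristics for deciding what counts as a trailer block) are more involved, so shelling out to git interpret-trailers is the safer route:

```java
import java.util.*;
import java.util.regex.*;

// Rough sketch of trailer extraction in plain Java. This only handles the
// simple case: single-line "Key: value" trailers in the final paragraph of
// the commit message. Git's own parsing covers more edge cases.
public class TrailerParser {

    private static final Pattern TRAILER =
        Pattern.compile("^([A-Za-z][A-Za-z0-9-]*):\\s*(.*)$");

    /** Extracts key/value trailers from the final paragraph of a commit message. */
    public static Map<String, String> parse(String commitMessage) {
        String[] paragraphs = commitMessage.trim().split("\n\\s*\n");
        String last = paragraphs[paragraphs.length - 1];
        Map<String, String> trailers = new LinkedHashMap<>();
        for (String line : last.split("\n")) {
            Matcher m = TRAILER.matcher(line.trim());
            if (m.matches()) {
                trailers.put(m.group(1), m.group(2));
            }
        }
        return trailers;
    }
}
```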
The general rule with hooks on Unix is that they must be either a binary or a script executable by the operating system. Java JARs usually don't meet that requirement, so you'll probably want to write a shell script wrapper that invokes your Java code. Be aware, however, that this will be very slow, since a full JVM must start every time the hook runs.
In addition, you should be aware that, as the Git FAQ outlines, hooks are not an effective tool for controlling policy:
It’s common to try to use pre-commit hooks (or, for commit messages, commit-msg hooks) to check these things, which is great if you’re working as a solo developer and want the tooling to help you. However, using hooks on a developer machine is not effective as a policy control because a user can bypass these hooks with --no-verify without being noticed (among various other ways). Git assumes that the user is in control of their local repositories and doesn’t try to prevent this or tattle on the user.
Therefore, if you want to have effective controls, you need to perform these actions on your CI server. You can still provide hooks for developers who wish to use them, but you cannot rely on them being run. The FAQ also mentions this as another reason why you don't want to mandate hooks:
In addition, some advanced users find pre-commit hooks to be an impediment to workflows that use temporary commits to stage work in progress or that create fixup commits, so it’s better to push these kinds of checks to the server anyway.
Story: in my Java code I have a few ScheduledFutures that need to run every day at a specific time (15:00, for example). The only infrastructure I have available is a database, my current application, and OpenShift with multiple pods. I can't move this code out of my application and must run it from there.
Problem: the ScheduledFuture runs on every pod, but I need it to run only once a day. I have a few ideas, but I don't know how to implement them.
Idea #1:
Set an environment variable on a specific pod; then I can check whether the variable exists (and its value) and run the scheduled task only where required. I know I risk the task not running at all if that pod goes away, but it's better not to run the scheduled task at all than to run it multiple times.
Idea #2:
Determine a leader pod somehow. This seems like a bad idea in my case, since it always has the "split-brain" problem.
Idea #3 (a bit off-topic):
Create my own synchronization algorithm through the database. To be fair, this is the simplest way for me, since I'm a programmer and not an SRE. I understand it's not the best option, though.
Idea #4 (a bit off-topic):
Just use the Quartz scheduling library. I personally don't really like that and would prefer one of the first two ideas (if I'm able to implement them), but at the moment it seems like my only valid choice.
UPD: Maybe you have some other suggestions, or a warning that I shouldn't do this at all?
I would suggest using a ready-made solution. Getting these things right, especially covering all the corner cases around reliability, is hard. If you don't want to use Quartz, I would at least suggest a database-backed solution. Postgres, for example, has SELECT ... FOR UPDATE SKIP LOCKED (scroll down to the section "The Locking Clause"), which can be used to implement run-once scheduling.
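A sketch of that idea in plain JDBC, assuming a hypothetical scheduled_task table with columns (id, run_date, done): the pod whose transaction wins the row lock runs the task, while SKIP LOCKED makes every other pod's SELECT come back empty instead of blocking.

```java
import java.sql.*;
import java.time.LocalDate;

// Run-once-per-day claiming via Postgres row locking. Table and column
// names are assumptions; adapt them to your schema. Requires a running
// Postgres instance and its JDBC driver on the classpath.
public class DailyJobClaimer {

    // The row for today's run is locked by the first pod to reach it;
    // SKIP LOCKED makes every other pod see no rows instead of waiting.
    static final String CLAIM_SQL =
        "SELECT id FROM scheduled_task " +
        "WHERE run_date = ? AND done = false " +
        "FOR UPDATE SKIP LOCKED";

    /** Returns true if this pod won the claim and ran the task. */
    public static boolean tryClaim(Connection conn, LocalDate day) throws SQLException {
        conn.setAutoCommit(false); // the row lock lives for the transaction
        try (PreparedStatement ps = conn.prepareStatement(CLAIM_SQL)) {
            ps.setObject(1, day);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    conn.rollback(); // another pod holds the lock, or work is done
                    return false;
                }
                long id = rs.getLong("id");
                // ... run the actual task here, then mark it finished ...
                try (PreparedStatement upd = conn.prepareStatement(
                        "UPDATE scheduled_task SET done = true WHERE id = ?")) {
                    upd.setLong(1, id);
                    upd.executeUpdate();
                }
                conn.commit();
                return true;
            }
        }
    }
}
```

Keeping the lock for the duration of the task means that if the winning pod crashes mid-run, the transaction rolls back and another pod can claim the row.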
You can create a cron job using OpenShift:
https://docs.openshift.com/container-platform/4.7/nodes/jobs/nodes-nodes-jobs.html
and have this job trigger an endpoint in your application that invokes your logic.
I have an old application (plain Java files starting from a main() method) that is called by 4-5 Autosys jobs, all at the same time, passing different arguments.
Now I have refactored this app to use Spring Boot's CommandLineRunner. I changed only the entry point to use Spring Boot when upgrading it.
Question: will this create any problem if my Autosys jobs call this app 5 times at the same time with different params? Any issue with conflicts between thread executions, objects, or anything else?
I could not find my answer anywhere, though as far as I know, each of these 5 calls should create a separate JVM and execute the Spring bean from CommandLineRunner. They should all be treated separately...
The call from Autosys is simply
"java -jar javaApp.jar arg1 arg2"
I need your expert suggestions. Quick help is appreciated.
If you have created multiple CommandLineRunner classes in your Spring app, Spring runs them sequentially in the main thread. No additional threads are involved. The runners are invoked as just about the last step in the initialization of the app. I happen to know this because it recently mattered to what I was doing, so I looked at the source code.
To see this all for yourself, all you have to do is put a breakpoint at the start of one of your runners, and then look up the call stack. You'll see the loop over the runners directly above you, and you'll see that the bottom of the stack contains your app's main(). Unlike many adventures into the Spring source code, this one was very simple. I recommend you do this if you can just to see how remarkably simple it is.
The above being the case, it sounds like there's no real difference between what you have with Spring and what you had before...other than, of course, changes you've made yourself to your logic. Spring doesn't add any complexity here.
As @MrR says, the only issue you might have is contention for external resources if you're running multiple copies of your app at the same time. But you would have had that with the old code as well. Spring doesn't introduce anything new here either.
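If it helps to see why, what Spring does with the runners can be reduced to a few lines of plain Java. This is only an illustration of the behavior, not Spring's actual code:

```java
import java.util.*;

// Plain-Java illustration of what Spring Boot does with CommandLineRunner
// beans: after startup it loops over them in order, on the calling (main)
// thread, passing along the program arguments. No extra threads involved.
public class RunnerLoopSketch {

    // Stand-in for Spring's CommandLineRunner interface, with a log
    // parameter added so the execution order can be observed.
    interface Runner { void run(List<String> log, String... args); }

    static List<String> runAll(List<Runner> runners, String... args) {
        List<String> log = new ArrayList<>();
        for (Runner r : runners) {
            r.run(log, args); // strictly sequential, same thread throughout
        }
        return log;
    }
}
```

Two JVMs launched by Autosys never share any of this state, which is why the five parallel invocations stay independent.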
I have created a standalone JAR which does some job. The job has to run regularly at a scheduled time. I can invoke the JAR using the Windows scheduler, but I want to know if I can use Tomcat or an IBM WebSphere server to invoke the JAR at scheduled times.
Of course I could create a WAR file and deploy it on the server, but that is not what I am looking for.
Thanks in advance.
There are many ways, but since WAS was asked about specifically, I suggest using its scheduler's cron-like calendar feature.
Go to WAS Console > Resources > Schedulers. To implement a cron-like calendar, set the UserCalendar on the TaskInfo object. Sample:
taskInfo.setUserCalendar(null, "CRON");
taskInfo.setStartTimeInterval("* 10 * * ?");
Read more in the docs.
I am not aware of Tomcat or WebSphere providing explicit support for this. Since you have mentioned multiple app servers, I suggest going with Java's simple Timer packaged inside your JAR; an example is listed here.
This approach gives you app-server independence as well as OS independence. You can go for Cron4J or Quartz if your requirements grow.
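A JDK-only sketch of that approach, using ScheduledExecutorService (the modern replacement for java.util.Timer). The 10:00 target in the usage comment is just an example:

```java
import java.time.*;
import java.util.concurrent.*;

// JDK-only daily scheduling sketch (no app server required). Computes the
// delay until the next occurrence of a target wall-clock time, then lets
// a ScheduledExecutorService repeat the task every 24 hours.
public class DailyScheduler {

    /** Seconds from 'now' until 'target' next occurs (today or tomorrow). */
    static long secondsUntilNext(LocalDateTime now, LocalTime target) {
        LocalDateTime next = now.toLocalDate().atTime(target);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // already passed today; run tomorrow
        }
        return Duration.between(now, next).getSeconds();
    }

    /** Schedules 'task' to run every day at the given wall-clock time. */
    static ScheduledFuture<?> scheduleDaily(ScheduledExecutorService ses,
                                            Runnable task, LocalTime at) {
        long initialDelay = secondsUntilNext(LocalDateTime.now(), at);
        // Note: a fixed 24h period drifts across DST changes; recompute the
        // delay after each run (or use Cron4J/Quartz) if that matters.
        return ses.scheduleAtFixedRate(task, initialDelay,
                TimeUnit.DAYS.toSeconds(1), TimeUnit.SECONDS);
    }
    // Usage: scheduleDaily(Executors.newSingleThreadScheduledExecutor(),
    //                      () -> runJob(), LocalTime.of(10, 0));
}
```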
I work at a software company where our primary development language is Java. Naturally, we use Hudson for continuous builds, which it works brilliantly for. However, Hudson is not so good at some of the other things we ask it to do. We also use Hudson jobs to deploy binaries, refresh databases, run load testing, run regressions, etc. We really run into trouble when there are build dependencies (e.g. load testing requires a DB refresh).
Here's the one thing that Hudson doesn't do that we really need:
Build dependency: Hudson supports build dependencies for Ant builds, but not between Hudson jobs. We're using the URL invocation feature to have one Hudson job invoke another. The problem is that Hudson always returns a 200 and does not block until the job is done. This means the calling job doesn't know (a) whether the build failed and (b) if it didn't fail, how long it took.
It would be nice to not have to use shell scripting to specify the behavior of a build, but that's not totally necessary.
Any direction would be nice. Perhaps we're not using Hudson the right way (i.e. should all builds be Ant builds?) or perhaps we need another product for our one-click deployment, load testing, migration, DB refresh, etc.
Edit:
To clarify, we have parameters in our builds that can cause different dependencies depending on their values. For example, sometimes we want load testing with a DB refresh, sometimes without one. Unfortunately, creating a Hudson job for each combination of parameters (as the Join plugin requires) won't work, because the different combinations could lead to dozens of jobs.
I don't think I understand your "build dependency" requirements. Any Hudson job can be configured to trigger another (downstream) job, or be triggered by another (upstream) job.
The Downstream-Ext plugin and Join plugin allow for more complex definition of build dependencies.
There is a CLI for Hudson which allows you to issue commands to a Hudson instance. Use "help" to get the precise details. I believe there is a command which invokes a build and waits for it to finish.
http://wiki.hudson-ci.org/display/HUDSON/Hudson+CLI
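If the CLI doesn't fit, another option is to poll Hudson's remote JSON API from the calling job until the build finishes, instead of relying on the fire-and-forget URL trigger. The sketch below assumes the standard /lastBuild/api/json endpoint (which exposes "building" and "result" fields) and does only crude JSON field extraction; the job URL is a placeholder:

```java
import java.io.*;
import java.net.*;
import java.nio.charset.StandardCharsets;

// Blocks on a Hudson job from Java by polling the remote JSON API.
// While a build runs, "result" is JSON null; when it finishes it becomes
// a string such as SUCCESS or FAILURE.
public class HudsonBuildWaiter {

    /** Crude extraction of a quoted string field from a flat JSON response. */
    static String jsonField(String json, String field) {
        java.util.regex.Matcher m = java.util.regex.Pattern
            .compile("\"" + field + "\"\\s*:\\s*\"([^\"]*)\"")
            .matcher(json);
        return m.find() ? m.group(1) : null;
    }

    /** Polls until the last build finishes; returns its result string. */
    public static String waitForResult(String jobUrl) throws Exception {
        while (true) {
            URL api = new URL(jobUrl + "/lastBuild/api/json");
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    api.openStream(), StandardCharsets.UTF_8))) {
                StringBuilder sb = new StringBuilder();
                for (String line; (line = in.readLine()) != null; ) sb.append(line);
                String result = jsonField(sb.toString(), "result");
                if (result != null) return result; // SUCCESS, FAILURE, ...
            }
            Thread.sleep(10_000); // still building; poll again in 10s
        }
    }
    // Usage: waitForResult("http://hudson.example.com/job/load-testing");
}
```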
Do you need an extra job for your 'dependencies'?
Your dependencies sound to me like an extra build step. The script that refreshes the DB can be stored in your SCM, and every build that needs this step checks it out. You can invoke that script when your "db refresh" parameter is true, and this can be done in more than one of your modules. What is the advantage? Your script logic is in your SCM (it's always good to have a history of the changes), and you can still update the script once for all your test jobs (since they all check out the same script). In addition, you don't need to look at several jobs to find out whether your test ran successfully; especially when one job is part of several execution lines, it becomes difficult to find out which job triggered which run. Another advantage is that you have fewer jobs on your Hudson, making it easier to maintain.
I think what you are looking for is http://wiki.jenkins-ci.org/display/JENKINS/Parameterized+Trigger+Plugin. This plugin lets you execute other jobs based on the status of previous jobs. You can even call a shell script from the downstream project to determine any additional conditions, which can in turn call the API for more info.
For example, we have a post-build step to notify us; it calls back to the JSON API to build a nice topic in our IRC channel that says "All builds ok" or "X, Y failed", etc.