We have a Spring Batch application, scheduled to run every 30 minutes, that creates workers in the cloud as separate pods.
In the Configuration class, one of the beans connects to a database and reads some properties. If this DB connection fails for some reason, then the worker does not start and the Master job does not get triggered again after 30 minutes.
This is happening because if the worker fails on startup itself, it does not update the final status in the DB or communicate it to the master as Failed. Hence, the Master assumes it is still running and does not trigger the Batch again.
Does anyone have any suggestions on how to handle this and how to ensure the Master triggers the workers again at the scheduled interval?
The problem here is about high availability.
You could add Redis in front of the DB: read the config from Redis first, and only fall back to connecting to the DB if it cannot be read from Redis.
Secondly, add a retry library such as Resilience4j around the bean's config read so it is attempted multiple times (a sketch follows below).
Thirdly, for alerting, you could use your cloud provider's alerting service to tell you which pod failed to start, so that you can restart that pod manually or automatically.
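For the retry suggestion, a minimal sketch using Resilience4j's Retry module; `readConfigFromDb()` is a hypothetical stand-in for your bean's actual DB read:

```java
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;

import java.time.Duration;
import java.util.Properties;
import java.util.function.Supplier;

public class ConfigLoader {

    public Properties loadWithRetry() {
        RetryConfig retryConfig = RetryConfig.custom()
                .maxAttempts(5)                        // attempt the DB read up to 5 times
                .waitDuration(Duration.ofSeconds(10))  // wait 10s between attempts
                .build();
        Retry retry = Retry.of("configDbRead", retryConfig);

        // Decorate the DB read so a transient connection failure is retried
        // instead of failing the worker at startup.
        Supplier<Properties> decorated = Retry.decorateSupplier(retry, this::readConfigFromDb);
        return decorated.get();
    }

    private Properties readConfigFromDb() {
        // hypothetical placeholder for the real DB lookup
        throw new UnsupportedOperationException("replace with the actual DB read");
    }
}
```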
I have a Java project and I am trying to scale it, so I want to spin up 3 instances of a single microservice.
But I have an issue. To explain:
From the UI, when a user logs in, an API request (for that specific user) goes to the backend every 10 seconds and returns the status of a Spring Batch job (either running or not running for the logged-in user). This works fine with only 1 instance.
But when I have 3 instances (instance 1, 2 & 3) of the same application, say:
The first request at 10 sec goes to instance 1, and the job is running for the logged-in user, so it returns "job is running" - correct.
The second request at 20 sec goes to instance 2; since no job is running in instance 2 (the job is running in instance 1), it returns "no job is running" - incorrect.
The third request at 30 sec goes to instance 3; since no job is running in instance 3 (the job is running in instance 1), it returns "no job is running" - incorrect.
How do I make sure I get "job is running" as the status until the job in instance 1 finishes?
I am using Spring microservices; please help me with this. Thanks in advance.
To restate: I need every API request to return "job is running" until the job in instance 1 finishes for that particular user.
I assume that you're using Spring Batch and its in-memory job repository.
If you want to scale, you should really use a separate database to keep the metadata of those jobs, as per this example.
Configure and deploy your database, add a dataSource, ensure that your jobRepository uses that dataSource. All servers will then return the same values.
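As a minimal sketch, assuming Spring Boot with `@EnableBatchProcessing` (the connection details are placeholders for your shared database):

```java
import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class BatchMetadataConfig {

    // With @EnableBatchProcessing, Spring Batch picks this DataSource up and
    // stores its job metadata there, so every instance sees the same executions.
    @Bean
    public DataSource dataSource() {
        return DataSourceBuilder.create()
                .url("jdbc:postgresql://shared-db:5432/batch") // placeholder URL
                .username("batch")                             // placeholder credentials
                .password("secret")
                .build();
    }
}
```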
If this cannot be achieved, you should at least ensure that your load balancer has sticky sessions enabled.
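With the shared job repository in place, any instance can answer the status poll from the metadata tables. A rough sketch; the endpoint path, the job name `userBatchJob`, and the `user` job parameter are assumptions for illustration:

```java
import java.util.Set;

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class JobStatusController {

    private final JobExplorer jobExplorer; // reads the shared Batch metadata tables

    public JobStatusController(JobExplorer jobExplorer) {
        this.jobExplorer = jobExplorer;
    }

    @GetMapping("/job-status/{user}")
    public String status(@PathVariable String user) {
        // Running executions are looked up in the database, so the answer is the
        // same no matter which instance the load balancer routes the request to.
        Set<JobExecution> running = jobExplorer.findRunningJobExecutions("userBatchJob");
        boolean runningForUser = running.stream()
                .anyMatch(e -> user.equals(e.getJobParameters().getString("user")));
        return runningForUser ? "job is running" : "no job is running";
    }
}
```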
I developed an application with Spring Boot 2.2.4.RELEASE and Quartz (v2.2.3) in a cluster.
I have a master job that finds records in a table and schedules them via the scheduler `org.springframework.scheduling.quartz.SchedulerFactoryBean`.
Every scheduled job has logic that interacts with the DB via HikariCP (connection pool).
The rule must be that, in case of application shutdown, the application has to wait until every running job has finished. I am able to set this rule on `org.springframework.scheduling.quartz.SchedulerFactoryBean` via the property `setWaitForJobsToCompleteOnShutdown(true);`.
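For illustration, a minimal sketch of that configuration, assuming the scheduler is declared as a Spring bean and the HikariCP pool is the injected `DataSource`:

```java
import javax.sql.DataSource;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.quartz.SchedulerFactoryBean;

@Configuration
public class SchedulerConfig {

    @Bean
    public SchedulerFactoryBean schedulerFactoryBean(DataSource dataSource) {
        SchedulerFactoryBean factory = new SchedulerFactoryBean();
        factory.setDataSource(dataSource); // HikariCP-backed pool, shared with the jobs
        // On shutdown, block until every running job has finished.
        factory.setWaitForJobsToCompleteOnShutdown(true);
        return factory;
    }
}
```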
This solution works fine, but I noticed that the connection pool (HikariCP) is closed without waiting for the running jobs to finish, which breaks the jobs' interaction with the DB.
I'd like to avoid this.
During Spring Boot shutdown, is it possible to control the order in which the beans in the context are closed, so that every running job can finish normally?
My context is as follows:
My environment is composed of two machines, with Spring Boot and Quartz configured in a cluster on both.
I have a master job with the @DisallowConcurrentExecution annotation that runs every 30 seconds, reads records with a fixed id from the DB, and schedules slave jobs (also annotated with @DisallowConcurrentExecution) that execute a defined piece of logic. In anomalous cases (for example, a sudden machine shutdown), it seems that some of the jobs are unable to complete their flow and remain in the ERROR state.
How can I resume or unblock those jobs whose triggers are in the ERROR state, using the Quartz API in Java?
Currently, those job ids can no longer be scheduled by Quartz.
Application log says:
org.quartz.ObjectAlreadyExistsException: Unable to store Job : 'sre.153', because one already exists with this identification.
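For reference, triggers stuck in that state can at least be located through the Quartz API. A minimal sketch, assuming an injected `Scheduler` and a hypothetical trigger group name `"slaves"`; what to do with each stuck trigger (reschedule it, or delete and recreate it) is left open:

```java
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.TriggerKey;
import org.quartz.impl.matchers.GroupMatcher;

public class ErrorTriggerReport {

    private final Scheduler scheduler;

    public ErrorTriggerReport(Scheduler scheduler) {
        this.scheduler = scheduler;
    }

    public void listTriggersInError() throws SchedulerException {
        for (TriggerKey key : scheduler.getTriggerKeys(GroupMatcher.triggerGroupEquals("slaves"))) {
            if (scheduler.getTriggerState(key) == Trigger.TriggerState.ERROR) {
                // This trigger will not fire again until it is rescheduled or recreated.
                System.out.println("Trigger in ERROR state: " + key);
            }
        }
    }
}
```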
I have this problem: my app has a Quartz scheduler that runs a task every X minutes. The app is deployed on two server instances, so each instance executes the task at the same time. I want only one instance to execute the task at a time.
We have configured Quartz with Spring and our application server is WAS.
Which options do you suggest?
You could set up a Quartz cluster with a JDBC job store; then every job fire will be executed by only one cluster node. You can find more information on this topic in the Quartz documentation.
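A minimal sketch of such a setup using Spring's `SchedulerFactoryBean`; it assumes the injected `DataSource` points at a database that already contains the Quartz tables, and uses the standard clustering properties (nothing WAS-specific):

```java
import java.util.Properties;

import javax.sql.DataSource;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.quartz.SchedulerFactoryBean;

@Configuration
public class ClusteredQuartzConfig {

    @Bean
    public SchedulerFactoryBean clusteredScheduler(DataSource dataSource) {
        SchedulerFactoryBean factory = new SchedulerFactoryBean();
        // Giving the factory a DataSource switches Quartz to a JDBC-backed job store.
        factory.setDataSource(dataSource);

        Properties props = new Properties();
        props.setProperty("org.quartz.jobStore.isClustered", "true"); // one node fires each trigger
        props.setProperty("org.quartz.scheduler.instanceId", "AUTO"); // unique id per node
        factory.setQuartzProperties(props);

        return factory;
    }
}
```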
Our use of Quartz so far has been to configure the database-backed scheduler and any jobs/triggers in the Spring config, which is then loaded when the app is run on the cluster. Each server in the cluster then shares the triggers, so that each trigger is only run by one of the servers at a time.
I now want to dynamically create new triggers for existing jobDetail beans (which are managed by Spring) on any one of the servers, but I need all of the servers in the cluster to be aware of this new Trigger. I also need them to be aware of the trigger being removed by one of the servers.
Using the current set up, will this just work? Does quartz periodically check the database for new triggers?
If not, what other approaches might solve this problem?
I'm fairly new to Quartz, so apologies if I've missed something fundamental.
Thanks for your help.
Quartz always performs a check against the database when looking for triggers that need to be executed. So, if one server deletes or adds a trigger, the other server(s) will automatically see it.
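So scheduling (or unscheduling) a trigger through the shared `Scheduler` should be enough. A minimal sketch, assuming the `JobDetail` is already stored durably in the scheduler; the job name, group, and cron expression are placeholders:

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.TriggerKey;

public class DynamicTriggerService {

    private final Scheduler scheduler; // backed by the clustered JDBC job store

    public DynamicTriggerService(Scheduler scheduler) {
        this.scheduler = scheduler;
    }

    public void addCronTrigger(String jobName, String jobGroup, String cron) throws SchedulerException {
        Trigger trigger = TriggerBuilder.newTrigger()
                .forJob(jobName, jobGroup)                        // existing JobDetail in the store
                .withIdentity(jobName + "-dynamic", jobGroup)
                .withSchedule(CronScheduleBuilder.cronSchedule(cron))
                .build();
        // The trigger is written to the shared database, so every node picks it up.
        scheduler.scheduleJob(trigger);
    }

    public void removeTrigger(String triggerName, String triggerGroup) throws SchedulerException {
        // Removal is also persisted, so the other nodes stop firing it as well.
        scheduler.unscheduleJob(TriggerKey.triggerKey(triggerName, triggerGroup));
    }
}
```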