I have a Java project and I am trying to scale it, so I want to spin up 3 instances of a single microservice.
But I have an issue.
To explain:
From the UI, when a user logs in, every 10 seconds an API request (for that specific user) goes to the backend, which returns the status of a Spring Batch job (either running or not running for the particular user who has logged in). This works fine with only 1 instance.
But when I have 3 instances (instances 1, 2 & 3) of the same application, say:
The first request at 10 sec goes to instance 1 and the job is running for the logged-in user - it returns "job is running". - correct
The second request at 20 sec goes to instance 2; since no job is running in instance 2 (the job is running in instance 1), it returns "no job is running". - incorrect
The third request at 30 sec goes to instance 3; since no job is running in instance 3 (the job is running in instance 1), it returns "no job is running". - incorrect
How do I make sure to get "job is running" as the status until the job in instance 1 finishes?
I am using Spring microservices.
Please help me with this.
Thanks in advance.
I need to get "job is running" until the job in instance 1 finishes for the particular user, for every API request.
I assume that you're using Spring Batch and its in-memory job repository.
If you want to scale, you should really use a separate database to keep the metadata of those jobs, as per this example.
Configure and deploy your database, add a dataSource, and ensure that your jobRepository uses that dataSource. All servers will then return the same values.
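A minimal sketch of that wiring, assuming Spring Batch 4.x with a PostgreSQL metadata database (the driver, URL, and credentials are placeholders for your environment). With `@EnableBatchProcessing`, Spring Batch builds its `JobRepository` on the `DataSource` found in the context, so every instance reads and writes the same job metadata:

```java
import javax.sql.DataSource;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.DriverManagerDataSource;

@Configuration
@EnableBatchProcessing
public class BatchMetadataConfig {

    // Shared metadata database: all 3 instances point at the same schema,
    // so a job started on instance 1 is visible to instances 2 and 3.
    @Bean
    public DataSource dataSource() {
        DriverManagerDataSource ds = new DriverManagerDataSource();
        ds.setDriverClassName("org.postgresql.Driver");
        ds.setUrl("jdbc:postgresql://db-host:5432/batch_metadata"); // placeholder
        ds.setUsername("batch");  // placeholder
        ds.setPassword("secret"); // placeholder
        return ds;
    }
}
```

Your status endpoint can then query the shared repository (e.g. via a `JobExplorer`) for running executions instead of relying on in-process state.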
If this cannot be achieved, you should at least ensure that your load balancer has sticky sessions enabled.
As stated in the documentation, moving from quarkus-resteasy to reactive is as simple as changing a Maven dependency, and it should work.
In our Quarkus project, we create a session in a @PreMatching filter and save it in a ThreadLocal. In other filters we update the session with other information. The session information is then used in all services and resources.
After some performance tests, even though we have more RAM and CPU, we see only 200 threads used in Prometheus. We have the option of changing max-threads in the properties file or moving to reactive programming.
As a starting point, we added reactive to quarkus-resteasy and experienced the following problems:
The first thread that intercepts a request is an event loop thread. It then delegates execution to a worker thread if the method is annotated with @Blocking; otherwise it continues execution on the event loop thread.
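A minimal illustration of that dispatch rule, assuming RESTEasy Reactive on a recent Quarkus (jakarta namespace; on Quarkus 2.x use javax.ws.rs instead). The resource and paths are hypothetical:

```java
import io.smallrye.common.annotation.Blocking;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;

@Path("/demo")
public class DemoResource {

    @GET
    @Path("/loop")
    public String onEventLoop() {
        // no @Blocking: runs on an event loop thread, e.g. "vert.x-eventloop-thread-0"
        return Thread.currentThread().getName();
    }

    @GET
    @Path("/worker")
    @Blocking
    public String onWorker() {
        // @Blocking: offloaded to a worker thread, e.g. "executor-thread-0"
        return Thread.currentThread().getName();
    }
}
```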
Problem 1) Blocking method:
All information added to the session in the @PreMatching filter is lost. But information added to the session from other filters with a priority such as @Priority(Priorities.AUTHENTICATION - 10) is present in the worker thread.
Problem 2) Non-blocking method:
With a few requests everything works fine, but with 100 parallel requests we randomly lose session information. After googling: the event loop thread uses a different context (the Vert.x thread context), and I haven't found any documentation explaining how to move from ThreadLocal to the Vert.x thread context when moving from Quarkus to Quarkus reactive (see the sketch at the end of this question).
Problem 3) Transforming the method to return Uni: the ThreadLocal information is propagated, but I haven't tested it under load yet. [UPDATE] Same as Problem 2)
Problem 4) Can't find how to run integration tests on a Vert.x thread.
Any help on how to move projects using ThreadLocal to something else when moving to Quarkus reactive would be appreciated.
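For reference, one direction we are considering (a sketch only, assuming Vert.x is on the classpath; the key name "session" and the holder class are placeholders) is to store per-request data on the current Vert.x context instead of a ThreadLocal:

```java
import io.vertx.core.Context;
import io.vertx.core.Vertx;

public final class SessionHolder {

    private static final String KEY = "session"; // placeholder key

    // Replaces threadLocal.set(session): attach the value to the Vert.x
    // context that follows the request across event loop continuations.
    public static void put(Object session) {
        Context ctx = Vertx.currentContext();
        if (ctx != null) {
            ctx.putLocal(KEY, session);
        }
    }

    // Replaces threadLocal.get().
    public static <T> T get() {
        Context ctx = Vertx.currentContext();
        return ctx != null ? ctx.getLocal(KEY) : null;
    }
}
```

Note that Quarkus creates a duplicated Vert.x context per request, which is what makes per-request locals like this safe; on a raw root context the values would be shared between requests on the same event loop.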
We have a Spring Batch application scheduled to run every 30 minutes that creates workers on the cloud as separate pods.
In the configuration class, one of the beans connects to a database and reads some properties. If this DB connection fails for some reason, the worker does not start and the master job does not get triggered again after 30 minutes.
This is happening because, if the worker fails on startup itself, it does not update the final status in the DB or communicate to the master that it failed. Hence, the master assumes it is still running and does not trigger the batch again.
Does anyone have any suggestions on how to handle this and ensure the master triggers the workers again on the scheduled interval?
The problem is about high availability.
First, you could add Redis in front of the DB; if the config cannot be read from Redis, then fall back to connecting to the DB.
Secondly, add a retry library like Resilience4j to your bean so it attempts to read your config multiple times (see the sketch below).
Thirdly, for alerting, you could use your cloud provider's alerting service to tell you which pod failed to start. Then you are able to restart that pod manually or automatically.
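A minimal sketch of the retry idea, assuming resilience4j-retry is on the classpath; loadPropertiesFromDb() is a hypothetical placeholder for the bean's actual DB read:

```java
import java.time.Duration;
import java.util.Properties;
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;

public class ConfigLoader {

    // Retry the DB read up to 5 times, waiting 2 seconds between attempts,
    // so a transient connection failure does not kill the worker on startup.
    public Properties loadWithRetry() {
        RetryConfig config = RetryConfig.custom()
                .maxAttempts(5)
                .waitDuration(Duration.ofSeconds(2))
                .build();
        Retry retry = Retry.of("configRead", config);
        return Retry.decorateSupplier(retry, this::loadPropertiesFromDb).get();
    }

    private Properties loadPropertiesFromDb() {
        // hypothetical placeholder: the bean's real DB read goes here
        throw new UnsupportedOperationException("replace with real DB read");
    }
}
```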
I developed an application with Spring Boot 2.2.4.RELEASE and Quartz (v 2.2.3) in a cluster.
I have a master job that finds records in a table and schedules those records via the scheduler `org.springframework.scheduling.quartz.SchedulerFactoryBean`.
Every single scheduled job has logic that interacts with the DB via HikariCP (a connection pool).
The rule must be that, in case of application shutdown, the application has to wait until every running job ends. I am able to set this rule on `org.springframework.scheduling.quartz.SchedulerFactoryBean` via the property `setWaitForJobsToCompleteOnShutdown(true);`.
This works fine, but I saw that the connection pool (HikariCP) is closed without waiting for the running jobs to end, which causes the loss of their DB interaction logic.
I'd like to avoid this.
During Spring Boot shutdown, is it possible to prioritize the order in which objects in the context are closed, so that every single job process finishes regularly?
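A sketch of one possible approach, assuming the pool is a Spring bean named "dataSource" (adjust the name to your HikariCP bean). Spring destroys beans in reverse dependency order, so declaring that the scheduler depends on the pool makes Spring close the scheduler first; with setWaitForJobsToCompleteOnShutdown(true) the scheduler then waits for the running jobs before the pool goes away:

```java
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.DependsOn;
import org.springframework.scheduling.quartz.SchedulerFactoryBean;

@Configuration
public class QuartzConfig {

    @Bean
    @DependsOn("dataSource") // bean name is an assumption; match your HikariCP bean
    public SchedulerFactoryBean schedulerFactoryBean(DataSource dataSource) {
        SchedulerFactoryBean factory = new SchedulerFactoryBean();
        factory.setDataSource(dataSource);
        // destroyed before the DataSource, and waits for running jobs first
        factory.setWaitForJobsToCompleteOnShutdown(true);
        return factory;
    }
}
```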
My context is as follows:
My environment is composed of two machines with Spring Boot, and Quartz configured in a cluster on both.
I have a master job with the @DisallowConcurrentExecution annotation that runs every 30 seconds, takes records with a fixed id from the DB, and schedules the slave jobs (also with the @DisallowConcurrentExecution annotation) that execute a defined logic. In anomalous cases (for example, a sudden machine shutdown), it seems that some of the jobs are not able to terminate their flow, remaining in the ERROR state.
How can I resume or unblock, via the Quartz objects in Java, those triggers that are in the ERROR state?
Because, currently, jobs with those ids can no longer be scheduled by Quartz.
Application log says:
org.quartz.ObjectAlreadyExistsException: Unable to store Job : 'sre.153', because one already exists with this identification.
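A minimal sketch of one way to do this, assuming your Quartz release exposes Scheduler.resetTriggerFromErrorState (present in recent 2.x versions); it scans every trigger group and resets any trigger stuck in ERROR back to WAITING:

```java
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger.TriggerState;
import org.quartz.TriggerKey;
import org.quartz.impl.matchers.GroupMatcher;

public class TriggerErrorRecovery {

    // Walk all trigger groups and reset any trigger left in ERROR,
    // so Quartz can pick it up again on the next cluster check-in.
    public static void resetErroredTriggers(Scheduler scheduler) throws SchedulerException {
        for (String group : scheduler.getTriggerGroupNames()) {
            for (TriggerKey key : scheduler.getTriggerKeys(GroupMatcher.triggerGroupEquals(group))) {
                if (scheduler.getTriggerState(key) == TriggerState.ERROR) {
                    scheduler.resetTriggerFromErrorState(key);
                }
            }
        }
    }
}
```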
Our use of Quartz so far has been to configure the database-backed scheduler and any jobs/triggers in the Spring config, which is then loaded when the app is run on the cluster. Each server in the cluster then shares the triggers so that each trigger is only run by one of the servers at a time.
I now want to dynamically create new triggers for existing jobDetail beans (which are managed by Spring) on any one of the servers, but I need all of the servers in the cluster to be aware of this new trigger. I also need them to be aware of a trigger being removed by one of the servers.
Using the current setup, will this just work? Does Quartz periodically check the database for new triggers?
If not, what other approaches might solve this problem?
I'm fairly new to Quartz, so apologies if I've missed something fundamental.
Thanks for your help.
Quartz always performs a check against the database when looking for triggers that need to be executed, so if one server deletes or adds a trigger, the other server(s) will automatically see it.
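For illustration, a minimal sketch of adding and removing a trigger at runtime for an existing durable JobDetail (the job, trigger, and group names are hypothetical). Because the scheduler is JDBC-backed and clustered, the new trigger is written to the shared database and any node can fire it:

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.JobKey;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.TriggerKey;

public class DynamicTriggers {

    // Attach a new cron trigger to an existing JobDetail; the JDBC job store
    // persists it, so every clustered node sees it on its next DB check.
    public static void addTrigger(Scheduler scheduler) throws SchedulerException {
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("dynamicTrigger", "dynamicGroup") // hypothetical names
                .forJob(JobKey.jobKey("existingJob"))           // hypothetical job name
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0/5 * * * ?"))
                .build();
        scheduler.scheduleJob(trigger);
    }

    // Removing it on any node is equally visible to the rest of the cluster.
    public static void removeTrigger(Scheduler scheduler) throws SchedulerException {
        scheduler.unscheduleJob(TriggerKey.triggerKey("dynamicTrigger", "dynamicGroup"));
    }
}
```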