I believe this task is not particularly exotic, but due to my lack of clustering experience I am finding it hard to find the answer.
Our web app performs some background operations on a schedule (data querying and transfer).
Now the Tomcat server it runs on is being clustered. We need only one instance in the cluster to perform these background operations, not all of them.
I see the following options:
The ideal solution would be a master/slave model for the cluster, where the slave instances of Tomcat have our application in an inactive state (undeployed). If a slave becomes the master, the application gets deployed and starts working. Is this possible?
If not, then we need some notifications/events we can implement listeners for, so that we know when a node starts up or shuts down. We would then programmatically make the application on the first node that comes up the master and block the unwanted process on the other (slave) nodes. We would keep listening to node startup/shutdown events so that there is always exactly one active master. I have been looking for such an events API in Tomcat, but without luck so far.
Does anyone have experience with such a task? How did you solve it?
Thank you.
I don't know if there is a master/slave behavior setting in a Tomcat cluster, because I think all nodes need to be equal. But what about using Quartz Clustering with the JDBC-JobStore? You define the tasks in a shared database, and when a task is triggered, the first available node executes it. So all nodes in your cluster behave the same way, while only a single node executes a given task at a time:
"Only one node will fire the job for each firing. ... It won't necessarily be the same node each time - it will more or less be random which node runs it. The load balancing mechanism is near-random for busy schedulers (lots of triggers) but favors the same node for non-busy (e.g. few triggers) schedulers."
If a node fails while executing a task the next available node will retry:
"Fail-over occurs when one of the nodes fails while in the midst of executing one or more jobs. When a node fails, the other nodes detect the condition and identify the jobs in the database that were in progress within the failed node."
I am working on a scheduled job that will run at a certain interval (e.g. once a day at 1 pm), scheduled through cron. I am working with Java and Spring.
Writing the scheduled job is easy enough: grab a list of people matching certain criteria from the DB, then for each person do some calculation and trigger a message.
I am working in a single-node environment locally and in testing; however, when we go to production it will be a multi-node environment (with a load balancer, etc.). My concern is: how would a multi-node environment affect the scheduled job?
My guess is that I could (or very likely would) end up triggering duplicate messages:
Machine 1: Grab list of people, do calculation
Machine 2: Grab list of people, do calculation
Machine 1: Trigger message
Machine 2: Trigger message
Is my guess correct?
What would be the recommended solution to avoid the above issue? Do I need to create a master/slave distributed-system solution to manage the multi-node environment?
If you have something like three Tomcat instances load balanced behind Apache, for example, and your application runs on each of them, then you will have three separate triggers and your job will run three times. You won't get a multi-node environment with distributed job execution unless some kind of mechanism for distributing the parts of the job is in place.
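As an illustration of such a mechanism (not part of the answer above), a shared "run claim" in the database is often enough: each node tries to claim the day's run before doing anything, and only the winner proceeds to send the messages. A rough sketch, assuming a hypothetical JOB_RUN table with a unique constraint on run_date:

    // Rough sketch of a DB-based claim so only one node triggers the messages.
    // Table and column names (JOB_RUN, run_date) are hypothetical.
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.time.LocalDate;
    import javax.sql.DataSource;

    public class RunClaim {

        private final DataSource dataSource;

        public RunClaim(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        // Returns true only on the node whose insert succeeds; the unique
        // constraint on run_date makes every other node's insert fail.
        public boolean tryClaim(LocalDate runDate) {
            String sql = "INSERT INTO JOB_RUN (run_date) VALUES (?)";
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setObject(1, runDate);   // requires a JDBC 4.2 driver for LocalDate
                return ps.executeUpdate() == 1;
            } catch (SQLException alreadyClaimedOrError) {
                // Most likely a duplicate-key violation: another node claimed this run first.
                return false;
            }
        }
    }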
If you haven't looked at this project yet, take a peek at Spring XD. It handles Spring Batch Jobs and can be run in distributed mode.
I have a cluster set up and running: JBoss 7.1.1.Final and mod_cluster 1.2.6.Final.
mod_cluster load balancing is happening between two nodes, nodeA and nodeB.
But when I stop one node and start it again, mod_cluster still sends all the load to the other node. It does not distribute load after the node comes back.
What configuration changes are required for this? I can see both nodes enabled in mod_cluster_manager, but it directs load to only one node even after the other node comes back following failover.
Thanks
If you are seeing existing requests being forwarded to the active node, then it's because sticky sessions are enabled. This is the default behavior.
If you are seeing that new requests are not being forwarded to the returning node (even when it's not busy), then it is a different issue. You may want to look at the load-balancing factor/algorithm you are currently using in your mod_cluster subsystem.
It occurred to me that you might actually be seeing the correct behaviour, just over a short time span. Take a look at my small FAQ: "I started mod_cluster and it looks like it's using only one of the workers." TL;DR: if you send only a relatively small number of requests, it might look like load balancing doesn't work, whereas it is actually correct not to flood a fresh newcomer with a barrage of requests all at once.
I've got a Spring Web application that's running on two different instances.
The two instances aren't aware of each other, they run on distinct servers.
The application has a scheduled Quartz job, but the job must not execute simultaneously on both instances: since it is a mail-sending job, that could cause duplicate emails to be sent.
I'm using RAMJobStore, and JDBCJobStore is not an option for me due to the large number of tables it requires (I can't create many tables because of an internal restriction).
The solutions I have thought about:
- Creating a single control table that has to be checked every time a job starts (with repeatable-read isolation level to avoid concurrency issues). The problem is that if the server is killed, the table might be left in an invalid state.
- Using properties to designate a single server as the job-running server. The problem is that if that server goes down, jobs stop running.
Has anyone ever experienced this problem and do you have any thoughts to share?
Start with the second solution (deactivate Quartz on all nodes except one). It is very simple to do and it is safe. Measure how frequently that server actually goes down; if that is unacceptable, then try the first solution. The problem with the first solution is that you need solid multithreaded-programming skills to implement it without bugs. It is not so simple if multithreading is not your everyday task, and the cost of a bug in your implementation may outweigh the actual benefit.
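A minimal sketch of that second approach, assuming a hypothetical -Dscheduler.enabled=true JVM flag that you set only on the node that should run jobs (the flag name is illustrative):

    // Sketch: only the designated node actually starts the Quartz scheduler.
    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.impl.StdSchedulerFactory;

    public class ConditionalSchedulerStarter {

        public static Scheduler startIfDesignated() throws SchedulerException {
            Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
            // Only the node launched with -Dscheduler.enabled=true fires triggers;
            // on every other node the scheduler is simply never started.
            if (Boolean.getBoolean("scheduler.enabled")) {
                scheduler.start();
            }
            return scheduler;
        }
    }

If the designated node goes down, jobs stop until you restart it or move the flag to another node, which is exactly the trade-off described above.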
I have a low-CPU queue-processing task that I need to keep running for a potentially long period of time. To guard against failures, I'd like to run the task in a high-availability clustered environment, and the task should "switch" to another machine if the first machine fails. What is the best way to make sure the task runs on exactly one machine in the cluster at a time, with seamless failover on machine failure?
Right now, I'm planning to use JGroups to implement this feature. I'll keep one channel for each task, and only the channel leader will execute the task while the other members "follow along." Then, if the channel leader ever changes, the new channel leader picks up where the last one left off.
Has anyone used JGroups to solve this problem? What was your experience?
You might get some inspiration and direction from the JBoss 4.2.3+ Clustered Singleton. It lets you define a service that runs on one, and only one, node in a cluster. If that node fails or is ejected from the cluster, a new node is assigned the singleton. The underlying implementation [of JBoss Clustering] is JGroups.
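For the JGroups route you describe, a rough sketch of the coordinator check (written against the JGroups 3.x/4.x API; the cluster name and task hooks are placeholders):

    // Sketch: only the view coordinator runs the task; leadership moves on failover.
    import org.jgroups.Address;
    import org.jgroups.JChannel;
    import org.jgroups.ReceiverAdapter;
    import org.jgroups.View;

    public class SingletonTaskNode extends ReceiverAdapter {

        private final JChannel channel;
        private volatile boolean running;

        public SingletonTaskNode() throws Exception {
            channel = new JChannel();            // default protocol stack
            channel.setReceiver(this);
            channel.connect("queue-task-cluster");
        }

        @Override
        public void viewAccepted(View view) {
            // The first member of the view is the coordinator; only it executes the task.
            Address coordinator = view.getMembers().get(0);
            boolean leader = coordinator.equals(channel.getAddress());
            if (leader && !running) {
                running = true;
                startTask();                     // pick up where the previous leader left off
            } else if (!leader && running) {
                running = false;
                stopTask();
            }
        }

        private void startTask() { /* start the queue-processing loop */ }
        private void stopTask()  { /* stop it if leadership is lost */ }
    }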
I'm working on an application that uses Quartz for scheduling jobs. The jobs to be scheduled are created programmatically by reading a properties file. My question is: if I have a cluster of several nodes, which of them should create the schedules programmatically? Only one of them? Or all of them?
I have used Quartz in a web app where users, among other things, could create Quartz jobs that performed certain tasks.
We had no problems in that app provided that at least the job names were different for each job. You can also have different group names, and if I remember correctly the group + name combination forms the job key.
Anyway, we had no problem creating and running the jobs from different nodes, but Quartz at the time (some six months ago; I do not believe this has changed, but I am not sure) did not offer the ability to stop jobs across the cluster; it could only stop jobs on the node where the stop command was executed.
If instead you just want to create a fixed number of jobs when the application starts, you had better delegate that task to one of the nodes, as the job names/groups will be read from the same properties file on every node and conflicts will arise.
Have you tried creating them on all of them? I think you would get some conflict because of duplicate names.
So I think one of the members should create the schedules during startup.
You should have only one system scheduling jobs for the cluster if they are predefined in properties as you say. If all of the systems did it, you would needlessly recreate the jobs and might put them in a weird state if every server created or deleted the same jobs and triggers.
You could simply only deploy the properties for the jobs to one server and then only one server would try to create them.
You could make a separate app that has the purpose of scheduling the jobs and only run it once.
If these are web servers, you could make a simple secured REST API that triggers the scheduling process. Then you could write an automated script that calls the API and kicks off the scheduling of jobs as part of a deployment, or whenever else you desire. If you have multiple servers behind a load balancer, the request should go to only one server, which would schedule the jobs; Quartz would save them to the database-backed job store, and the other nodes in the cluster would pick them up the next time they sync with the database.
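Whichever node ends up doing the setup, a small guard keeps the creation idempotent, so re-running it (or a second node racing in) does no harm. The job, trigger and cron values below are only examples:

    // Sketch: create a predefined job only if it is not already in the shared job store.
    import static org.quartz.CronScheduleBuilder.cronSchedule;
    import static org.quartz.JobBuilder.newJob;
    import static org.quartz.TriggerBuilder.newTrigger;

    import org.quartz.Job;
    import org.quartz.JobDetail;
    import org.quartz.JobExecutionContext;
    import org.quartz.JobKey;
    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.Trigger;

    public class IdempotentJobSetup {

        public static void scheduleIfAbsent(Scheduler scheduler) throws SchedulerException {
            JobKey key = new JobKey("dailyReport", "batch");
            if (scheduler.checkExists(key)) {
                return;   // another node already created it in the JDBC job store
            }
            JobDetail job = newJob(ReportJob.class).withIdentity(key).build();
            Trigger trigger = newTrigger()
                    .withIdentity("dailyReportTrigger", "batch")
                    .withSchedule(cronSchedule("0 0 13 * * ?"))   // 1 pm every day
                    .build();
            // A narrow race remains if two nodes run this at the same instant;
            // catching ObjectAlreadyExistsException around scheduleJob covers it.
            scheduler.scheduleJob(job, trigger);
        }

        public static class ReportJob implements Job {
            @Override
            public void execute(JobExecutionContext context) {
                // job body goes here
            }
        }
    }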