I am currently working on a scheduled task that runs behind the scenes of my Spring web application. The task uses a cron scheduler to execute at midnight every night and clean up unused applications for my portal (my site allows users to create an application to fill out; if they don't access the form within 30 days, my background task deletes it from our DB and sends the user an email telling them to create a new form if needed). Everything works great in my test environment, and I am ready to move to QA.
However, my next environment uses two load-balanced servers to process requests. This is a problem, as the cron scheduler and my polling task run concurrently on both servers. While the reads/writes to the DB won't be an issue, the issue lies with sending the notification email to the application user. Without any polling locks, two emails could be generated and sent, and I would like to avoid this. Normally, we would use a SQL stored procedure and a lock field in our DB, set and released whenever the polling code is called, so only one instance of the polling would execute. However, with my new polling task we don't have any fields available, so I am trying to find a Spring solution. I found this resource online:
http://www.springframework.net/doc-latest/reference/html/threading.html
And I was thinking of using it like this:
import java.util.concurrent.Semaphore;

private final Semaphore _pollingLock = new Semaphore(1);

public void poll() throws InterruptedException {
    _pollingLock.acquire();
    try {
        // run my polling task
    } finally {
        _pollingLock.release(); // release the lock
    }
}
However, I'm not sure whether this will just make the second instance execute afterward, or whether it skips the second instance so it never executes. Or is this solution not even appropriate, and is there a better one? Again, I am using the Spring Java framework, so any solution that exists there would be my best bet.
Two ways that we've handled this sort of problem in the past both start with designating one of our clustered servers as the one responsible for a specific task (say, sending email, or running a job).
In one solution, we set a JVM parameter on all clustered servers identifying the server name of the one server on which your process should run. For example -DemailSendServer=clusterMember1
In another solution, we simply provided a JVM parameter in the startup of this designated server alone. For example -DsendEmailFromMe=true
In both cases, you can add a tiny bit of code in your process to gate it based on the value or presence of the startup parameter.
I've found the second option simpler to use since the presence of the parameter is enough to allow the process to run. In the first solution, you would have to compare the current server name against the value of the parameter instead.
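For illustration, a minimal sketch of the second approach, assuming the -DsendEmailFromMe parameter suggested above (the method name is made up):

// Gate the job on the presence of the JVM startup parameter; only the
// server started with -DsendEmailFromMe=true actually runs the task.
public void runNightlyCleanup() {
    if (!Boolean.getBoolean("sendEmailFromMe")) {
        return; // not the designated server, skip this run
    }
    // ... delete stale applications and send the notification emails ...
}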
We haven't done much with Spring Batch, but I would assume there is a way to configure Batch to run a job on a single server within a cluster as well.
I've got a Spring Web application that's running on two different instances.
The two instances aren't aware of each other; they run on distinct servers.
That application has a scheduled Quartz job, but my problem is that the job shouldn't execute simultaneously on both instances: as it's a mail-sending job, it could cause duplicate emails to be sent.
I'm using RAMJobStore, and JDBCJobStore is not an option for me due to the large number of tables it requires (I can't afford to create many tables due to an internal restriction).
The solutions I thought about:
- Creating a single control table that has to be checked every time a job starts (with REPEATABLE READ isolation level to avoid concurrency issues). The problem is that if the server is killed, the table might be left in an invalid state. (A sketch of this idea appears after this list.)
- Using properties to define a single server as the job-running server. The problem is that if that server goes down, jobs will stop running.
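A minimal sketch of the control-table idea, assuming a hypothetical job_lock(job_name VARCHAR PRIMARY KEY, locked_until DATETIME) table; an expiring lease is one way to address the killed-server concern:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Assumes one row per job was inserted up front with locked_until in the past.
// A node claims the job only if the previous lease has expired, so a killed
// server can never leave the lock stuck forever.
boolean tryClaimJob(Connection con, String jobName) throws SQLException {
    String sql = "UPDATE job_lock SET locked_until = NOW() + INTERVAL 10 MINUTE "
               + "WHERE job_name = ? AND locked_until < NOW()";
    try (PreparedStatement ps = con.prepareStatement(sql)) {
        ps.setString(1, jobName);
        return ps.executeUpdate() == 1; // exactly one node wins the race
    }
}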
Has anyone ever experienced this problem and do you have any thoughts to share?
Start with the second solution (deactivate Quartz on all nodes except one). It is very simple to do, and it is safe. Count how frequently your server goes down; if that is unacceptable, then try the first solution. The problem with the first solution is that you need good multithreaded-programming skills to implement it without bugs. It is not so simple if multithreading is not your everyday task, and the cost of a bug in your implementation may be bigger than the actual benefit.
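A minimal sketch of that second solution, assuming you gate scheduler startup on a JVM property such as a hypothetical -DschedulerEnabled=true set only on the designated node:

import org.quartz.SchedulerException;
import org.quartz.impl.StdSchedulerFactory;

// Start Quartz only on the node launched with -DschedulerEnabled=true;
// on the other node the scheduler is simply never started.
public void startSchedulerIfDesignated() throws SchedulerException {
    if (Boolean.getBoolean("schedulerEnabled")) {
        StdSchedulerFactory.getDefaultScheduler().start();
    }
}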
I have a number of backend processes (Java applications) which run 24/7. To monitor these backends (i.e. to check if a process is not responding and to notify via SMS/email), I have written another application.
The old backends now log a heartbeat at a regular time interval, and this new application checks that they are doing so regularly and notifies if necessary.
Now, we have two options:
either run it as a scheduled task, which will run every (let's say) 15 minutes and stop after doing its job, or
run it as another backend process with a 15-minute sleep time.
The issue we can foresee right now is: what if this monitoring application goes into a non-responding state? So my question is: is there any difference between the two cases, or are they the same? Which option would suit my case better?
Please note this is a specific case and is not the same as this or this.
Environment: Java, hosted on LINUX server
By scheduled task, do you mean triggered by the system scheduler, or as a scheduled thread in the existing backend processes?
To capture unexpected termination or unresponsive states you would be best running a separate process rather than a thread. However, a scheduled thread would give you closer interaction with the owning process with less IPC overhead.
I would implement both. Maintain a record of the local state in each backend process, with a scheduled task in each process triggering a thread to update the current state of that node. This update could be fairly frequent, since it will be less expensive than communicating with a separate process.
Use your separate "monitoring app" process to routinely gather the information about all the backend processes. This should occur less frequently - whether the process is running all the time, or scheduled by a cron job is immaterial since the state is held in each backend process. If one of the backends become unresponsive, this monitoring app will be able to determine the lack of response and perform some meaningful probes to determine what the problem is. It will be this component that will then notify your SMS/Email utility to send a report.
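A minimal sketch of the in-process half of this, assuming a hypothetical Heartbeat holder that the monitoring probe can read:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Each backend updates a cheap local timestamp on a schedule; the separate
// monitoring app can then ask how long ago the last beat was.
public class Heartbeat {
    private final AtomicLong lastBeat = new AtomicLong(System.currentTimeMillis());
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void start() {
        scheduler.scheduleAtFixedRate(
                () -> lastBeat.set(System.currentTimeMillis()),
                0, 30, TimeUnit.SECONDS); // frequent, since it is cheap
    }

    public long millisSinceLastBeat() {
        return System.currentTimeMillis() - lastBeat.get();
    }
}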
I would go for a backend process, as it can maintain state.
Have a look at the Quartz scheduler from Terracotta:
http://terracotta.org/products/quartz-scheduler
It will be resilient to transient conditions, and you only need to provide a simple wrapper, so the monitoring app should be robust, provided you get the threading settings right in the quartz.properties file.
You can use Nagios Core as the core and Naptor to monitor your application. It's easy to set up and embed with your application development.
You can check at this link:
https://github.com/agunghakase/Naptor/tree/ver1.0.0
I have a Swing desktop application that is installed on many desktops within a LAN, and a MySQL database that all of them talk to. At precisely 5 PM every day, a thread wakes up in each of these applications and tries to back up files to a remote server. I would like to prevent all the desktop applications from doing the same thing.
The way I was thinking to do this was:
After waking up at 5 PM, all the applications will try to write a row to a MySQL table. They will write the same information. Only one will succeed, and the others will get a duplicate-row exception. Whichever one succeeds then goes on to run the backup program.
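A minimal sketch of that idea (the backup_run table and its columns are hypothetical; the PRIMARY KEY on the run date is what makes every INSERT but one fail):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

// Hypothetical table: backup_run(run_date DATE PRIMARY KEY, claimed_by VARCHAR(64)).
// All clients insert the same run_date; InnoDB lets exactly one INSERT succeed.
boolean tryClaimBackup(Connection con, String hostName) {
    String sql = "INSERT INTO backup_run (run_date, claimed_by) VALUES (CURDATE(), ?)";
    try (PreparedStatement ps = con.prepareStatement(sql)) {
        ps.setString(1, hostName);
        ps.executeUpdate();
        return true;  // this client won; run the backup
    } catch (SQLIntegrityConstraintViolationException e) {
        return false; // another client already claimed today's backup
    } catch (SQLException e) {
        return false; // treat other DB failures as "don't back up"
    }
}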
My questions are:
Is this the right way of doing things? Is there any better (easier) way?
I know we can do this using sockets as well, but I don't want to go down that route: too much coding, and I would also need to ensure that all the systems can talk to each other first (ping).
Will MySQL support such a feature? My DB is InnoDB, so I am thinking it does. Typically I will have about 20-30 users in the LAN. Will this cause a huge overhead for the DB to handle?
If you could put an intermediate class in between the applications and the database that would queue up the results and allow them to proceed in an orderly manner, you'd have it knocked.
It sounds like the applications all go directly against the database. You'll have to modify the applications to avoid this issue.
I have a lot of questions about the design:
Why are they all writing "the same row"? Aren't they writing information for their own individual instance?
Why would every one of them have exactly the same primary key? If there were an auto-increment or timestamp, you wouldn't have this problem.
What's the isolation set to on the database connection? If it's set to SERIALIZABLE, you'll force each one to wait until the previous one is done, at the cost of performance.
Could you have them all write files to a common directory and pick them up later in an orderly way?
I'm just brainstorming now.
It seems you want to back up server data, not client data.
I recommend to use a 3-tier architecture using Java EE.
You could use a Timer Service then to trigger the backup.
Though usually a backup program is an independent program, e.g. one started by a cron job on the server. But again: you'll need a server to do this properly, not just a shared folder.
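A minimal sketch of the Timer Service idea, assuming an EJB 3.1 container (the bean name and the 2 a.m. schedule are just for illustration):

import javax.ejb.Schedule;
import javax.ejb.Singleton;

// A singleton EJB whose method the container invokes on a cron-like schedule;
// with one server in the middle there is only one timer, hence one backup.
@Singleton
public class BackupTimer {

    @Schedule(hour = "2", minute = "0", persistent = false)
    public void nightlyBackup() {
        // ... back up the server-side data here ...
    }
}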
Here is what I would suggest. Instead of having all clients wake up at the same time and trying to perform the backup, stagger the time at which they wake up.
So when a client wakes up:
- It will check some table in your DB (MySQL) to see if a backup job has completed or is currently running (see the sketch after this list). If the job has completed, the client will go on with its normal duties. You can decide how to handle the case when the job is running.
- If the client finds that the backup job has not been run for the day, it will start the backup job and at the same time modify the row to indicate that the backup job has started. Once the backup has completed, the client will modify the table to indicate that the backup has completed.
This approach will prevent a spurt in network activity and can also provide a rudimentary form of failover: if one client fails, another client at a later time can attempt the backup. (This is a bit more involved, though; basically it comes down to what a client should do when it sees that a backup job is ongoing.)
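A minimal sketch of that first check, assuming a hypothetical backup_status(run_date, status) table; runBackupAndRecordProgress is a made-up helper that would mark the row RUNNING and later COMPLETED:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// On wake-up, look at today's backup status and act accordingly.
void onWakeUp(Connection con) throws SQLException {
    String sql = "SELECT status FROM backup_status WHERE run_date = CURDATE()";
    try (PreparedStatement ps = con.prepareStatement(sql);
         ResultSet rs = ps.executeQuery()) {
        String status = rs.next() ? rs.getString("status") : "PENDING";
        if ("COMPLETED".equals(status)) {
            return; // normal duties, nothing to do
        }
        if ("RUNNING".equals(status)) {
            return; // someone else is on it (or wait and re-check)
        }
        runBackupAndRecordProgress(con); // hypothetical helper
    }
}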
Like the title says, I have written a program that runs 'in the background', preferably as a Windows service. (It happens to be written in Java, with the service part provided by the Tanuki wrapper, if that matters. Also, I'm running Vista, but am assuming that this happens on all versions of Windows with UAC.) I run the service as 'User X'.
I also have a companion GUI program which is typically run from the start menu (unprivileged - i.e. 'asInvoker') - also as 'User X'.
The background program (aka the service) creates files. My main need is for the unelevated GUI program to be able to read, write, and delete these files that are created by the service.
This works without hassle as long as 'User X' is not a member of the Administrators group. (Of course an admin login is required to create the service, but that's okay.)
It also works if I turn off UAC, or if I run the background program not as a service (eg. from a command prompt).
But I just can't get it to work when 'User X' is a member of Administrators, and the background program is running as a service.
The symptoms of this problem are that Process Explorer shows my service process running privileged (which I glean from the process properties' Security tab showing 'BUILTIN\Administrators - Owner'). Also, all files created by the service are owned by 'Administrators'.
If I run my background program unprivileged from a command prompt, then Process Explorer shows 'BUILTIN\Administrators - Deny' and all files created by the program are owned by 'User X'.
Interesting question. I just looked up some information and cannot seem to find an answer for your question as asked initially, but I have a few alternative suggestions.
First, is it feasible to change your service app so that it creates the files required then it changes the permissions on them to what you want?
Second, does the service itself really have to run as "User X"? If so, why? Is there any way around that restriction? If you can bypass that requirement, then you can just make a normal user for the service to run as.
Third, you said preferably as a service, but not that this is a requirement. Does the environment this is used in allow you to use a scheduled task? The task scheduler itself runs as a system service, and it spawns other processes to do the work of the tasks you set up. And, when setting up a scheduled task, there is an option (a check box if you're using the GUI interface) to run the task with highest privileges or not. If you go this route, you can either have the task run at logon, or you can have it run at system start (in which case, make sure you do NOT have selected "run only if logged on"). This should otherwise be similar to your service setup.
Based on your comment below, I think the third suggestion might still be an option. You could still have status information similar to that of a service by making the program handle this in its own way. Your application could have a socket open for its cross-process communication. The background process could open a ServerSocket on a known port, and it could listen for status requests.
Your client application that your users are using could attempt to connect to this socket. If the socket connects, the process is running, otherwise it is not.
If you wanted only a "running/not running" status, this would be sufficient, and the ServerSocket could accept() a connection and then immediately shutdown and close the resulting Socket; you don't even have to accept or send any information since the initial connection is all you need.
If you want to keep the ability to startup/shutdown the task, you could use this same ServerSocket for that ability. If you aren't using the socket for any other data (only for the running-or-not mentioned above), you could have the background process terminate upon receiving any data at all on the socket, regardless of what it is, and the client (or whatever you use to shut down the background process) need only connect and send a byte instead of connect and immediately disconnect.
For startup, if you want to restrain the background process to one instance, there are a few easy ways to do that. I think you should be able to configure the task via the Task Scheduler to allow only one instance. Even if not, you could have a background process that is starting up connect to the given port it would otherwise listen on, to see if it gets a connection from something already there; if it does, this is a second instance, so abort. Or, even simpler, the creation of the ServerSocket will automatically fail if you are using a static port number, so just let new ServerSocket(myPort) fail on its own, catch the exception, and abort. So there are three different ways to ensure that your process acts like a proper service.
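A minimal sketch of the single-instance-plus-status idea (the port number is arbitrary; a liveness check is just a connect followed by a disconnect):

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Binding the port doubles as the single-instance guard: a second copy's
// bind fails, so it knows another instance is already running and aborts.
public class StatusListener {
    private static final int STATUS_PORT = 53123; // arbitrary fixed port

    public static void main(String[] args) {
        try (ServerSocket server = new ServerSocket(STATUS_PORT)) {
            while (true) {
                try (Socket client = server.accept()) {
                    // A connect-then-close means "are you alive?"; read()
                    // returns -1 and we keep listening. Any actual byte is
                    // treated as a shutdown request.
                    if (client.getInputStream().read() != -1) {
                        return; // a byte arrived: shut down
                    }
                }
            }
        } catch (IOException e) {
            // Port already bound: another instance is running, so abort.
        }
    }
}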
To start it up in the first place, you can tell the Task Scheduler to start it on user logon, or on system boot as mentioned before. You can also configure the task so that users can initiate it themselves (if for whatever reason it's not already running); in fact, you could even have the client the users are interacting with check the status of the process and start it automatically if it's not already started - try making a new process and exec() a command such as "schtasks /run /tn "Your Task Name""
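For that last part, a sketch of kicking the task off from Java (the task name is a placeholder):

import java.io.IOException;

// Ask the Windows Task Scheduler to start the background task on demand.
static void startBackgroundTask() throws IOException, InterruptedException {
    Process p = new ProcessBuilder("schtasks", "/run", "/tn", "Your Task Name")
            .inheritIO()
            .start();
    p.waitFor(); // schtasks returns once it has asked the scheduler
}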
I think that covers all the bases you mentioned, and then some. And all of the above should be pretty simple. If you do decide that this might be the route you'd like to take and if either I've overlooked something or you have other criteria which further restrict you from this, let us know again.
In the end I implemented a work-around using Windows scheduled tasks, similar to what is described above, but instead of implementing my own 'start/stop' interface, I wrote a Windows service that manages my program, which runs as a task. When the service starts, it starts the task, and when the service is asked to stop, it stops the task. So instead of using a socket for the parent to query whether the child is running, I use schtasks /Query and parse the output. To make the task exit if the parent exits, I used an RMI keepalive method in my app that was already there.
Windows scheduled tasks have some undesirable defaults for a service that are modifiable through the Task Scheduler GUI, but not through schtasks' command-line options - namely ExecutionTimeLimit, DisallowStartIfOnBatteries, and StopIfGoingOnBatteries. But these options can be queried and modified using the '/XML' option to schtasks /Query and /Create. So that's what I did.
I also needed to detect whether I'm running on a newer or older version of Windows, because if it's an older version (without UAC) then this is all unnecessary; more importantly, defining the task will not work without supplying a password, because the /NP option to schtasks is not available.
The only weakness (other than being complicated) that I know of with my implementation is due to schtasks' note on the /NP option - "Only local resources are available." This turns out to mean that mapped network drives won't be accessible (and I hope that's all it means.) I have SMB support implemented independently, in Java, in my app where it is needed, so this weakness wasn't the end of the world.
This was a lot of work for what can probably be done with a single Win32 call. Maybe one day I will figure out how to do that.
I'm trying to write a Spring web application on a WebLogic server that makes several independent database SELECTs (i.e. they can safely be called concurrently), one of which takes 15 minutes to execute.
Once all the results are fetched, an email containing the results will be sent to a user list.
What's a good way to get around this problem? Is there a Spring library that can help or do I go ahead and create daemon threads to do the job?
EDIT: This will have to be done at the application layer (business requirement) and the email will be sent out by the web application.
Are you sure you are doing everything optimally? 15 minutes is a really long time unless you have a gabillion rows across dozens of tables and need a heck of a lot of joins. This is your highest priority: why is it taking so long?
Do you do the email job at set intervals, or is it invoked from your web app? If set intervals, you should do it in an outside job, possibly on another machine. You can use daemons or the Quartz scheduler.
If you need to fire this process off from the web app, you need to do it asynchronously. You could use JMS, or you could just have a table into which you enter a new job request, with a daemon process that looks for new jobs every X time period. Firing off background threads is possible, but it's error-prone and not worth the complication, especially since you have other valid options that are simpler.
If you are asking about Spring support for long-running, possibly asynchronous tasks, you have a choice between Spring JMS support and Spring Batch.
You can use Spring Quartz to schedule the job. That way the jobs will run in the same container but will not require an HTTP request to trigger them.
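A minimal sketch of that approach, shown here with Spring's own @Scheduled support rather than the full Quartz integration (the cron expression and class names are just for illustration):

import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Configuration
@EnableScheduling
class SchedulingConfig { }

@Component
class ReportJob {
    // Runs inside the container at 1 a.m. daily; no HTTP request needed.
    @Scheduled(cron = "0 0 1 * * *")
    public void runQueriesAndEmailResults() {
        // ... run the long SELECTs, then email the results to the user list ...
    }
}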