What is both faster and "better practice", using a polling system or a event based timer?
I'm currently having a discussion with a more senior coworker regarding how to implement some mission critical logic. Here is the situation:
A message giving an execution time is received.
When that execution time is reached, some logic must be executed.
Now multiple messages can be received giving different execution times, and the logic must be executed each time.
I think that the best way to implement the logic would be to create a timer that would trigger the logic when the message at the time in the message, but my coworker believes that I would be better off polling a list of the messages to see if the execution time has been reached.
His argument is that the polling system is safer as it is less complicated and thus less likely to be screwed up by the programmer. My argument is that by implementing it my way, we reduce the reduce the computational load and thus are more likely execute the logic when we actually want it to execute. How should I implement it and why?
Requested Information
The only time my logic would ever be utilized would almost certainly be at a time of the highest load.
The requirements do not specify how reliable the connection will be but everyone I've talked to has stated that they have never heard of a message being dropped
The scheduling is based on an absolute system. So, the message will have a execution time specifying when an algorithm should be executed. Since there is time synchronization, I have been instructed to assume that the time will be uniform among all machines.
The algorithm that gets executed uses some inputs which initially are volatile but soon stabilize. By postponing the processing, I hope to use the most stable information available.
The java.util.Timer effectively does what your colleague suggests (truth be told, in the end, there really aren't that many ways to do this).
It maintains a collection of TimerTasks, and it waits for new activity on it, or until the time has come to execute the next task. It doesn't poll the collection, it "knows" that the next task will fire in N seconds, and waits until that happens or anything else (such as a TimerTask added or deleted). This is better overall than polling, since it spends most of its time sleeping.
So, in the end, you're both right -- you should use a Timer for this, because it basically does what your coworker wants to do.
Related
I have a scenario where I have some timing data that I get from a MIDI file. When a MIDI note needs to be played, I need to send a command over UDP. Basically, I have instructions that say "play note A, wait 125ms, play note B, wait 300ms, play note C..." and each time I "play note X" I need to send data over UDP. I have tried using both a TimerTask and a simple thread with a loop that check the system time and calculate how much time has elapsed and decide whether or not to play a note based on that, but both methods seem to have timing issues. The TimerTask doesn't run exactly on the specified interval (which was stated in the documentation) so I get erratic messages. The thread works better, but it still hiccups sometimes which I assume is because other threads are getting priority over it.
Is there a better way to send this data with more accurate timing? Is there something I can use like the Clip interface in Java that is used for playing audio?
Any assistance is very much appreciated.
This is an approach just about doomed to failure. Let me count the issues here:
1)Android is not a real time OS. Neither is Linux (which its built on). Expecting ms level timings to happen exactly correctly is never going to work. Even if the clock is accurate enough to interrupt on a 1ms rate, there's no assurance that Linux will schedule your thread for wakeup.
2)TimerTasks aren't promised to be accurate even to the degree limited by 1.
3)You're then sending it somewhere via UDP? A protocol that has no assurance as to delivery or timing, to a receiver who will then do something with it- and that receiver may have additional timing issues of its own.
Throw out this entire approach and start over would be my advice. Every single step of this says bad idea.
How does things like scheduleAtFixedRate work? How does it work behind the scenes and is there a penalty to using it?
More specifically, I have a task that I want to run periodically, say every 12 hours. The period is not strict at all, so my first instinct was to check in every request (tomcat server) if it's been more than >12 hours since the task last executed and if so, execute it and reset the timer. The downside of this is that I have to do a small time check on every request, make sure the task is run only once (using a semaphore or something similar) and the task might not execute in a long time if there's no requests.
scheduleAtFixedRate makes it easier to schedule a recurring task, but since I don't know how it does it, I don't know what the performance impact is. Is there a thread continually checking if the task is due to run? etc.
edit:
In Timer.java, there's a mainLoop function which, in my understanding, is something like this (overly simplified):
while(true) {
currentTime = System.currentTimeMillis();
if(myTask.nextExecutionTime == currentTime) myTask.run();
}
Won't this loop try to run as fast as possible and use a ton of CPU (I know, obviously not, but why)? There's no Thread.sleep in there to slow things down.
You can read the code if you wish to work out how it works.
There is an overhead using ScheduledExecutorService in terms of CPU and memory, however on the scale of hours, minutes, second even milli-seconds, it probably not work worrying about. If you have a task running in the range of micro-seconds, I would consider something more light weight.
In short, the overhead is probably too small for you to notice. The benefit it gives you is ease of use, and it is likely to be worth it.
Even after reading http://krondo.com/?p=1209 or Does an asynchronous call always create/call a new thread? I am still confused about how to provide asynchronous calls on an inherently single-threaded system. I will explain my understanding so far and point out my doubts.
One of the examples I read was describing a TCP server providing asynch processing of requests - a user would call a method e.g. get(Callback c) and the callback would be invoked some time later. Now, my first issue here - we have already two systems, one server and one client. This is not what I mean, cause in fact we have two threads at least - one in the server and one on the client side.
The other example I read was JavaScript, as this is the most prominent example of single-threaded asynch system with Node.js. What I cannot get through my head, maybe thinking in Java terms, is this:If I execute the code below (apologies for incorrect, probably atrocious syntax):
function foo(){
read_file(FIle location, Callback c) //asynchronous call, does not block
//do many things more here, potentially for hours
}
the call to read file executes (sth) and returns, allowing the rest of my function to execute. Since there is only one thread i.e. the one that is executing my function, how on earth the same thread (the one and only one which is executing my stuff) will ever get to read in the bytes from disk?
Basically, it seems to me I am missing some underlying mechanism that is acting like round-robin scheduler of some sort, which is inherently single-threaded and might split the tasks to smaller ones or call into a multiothraded components that would spawn a thread and read the file in.
Thanks in advance for all comments and pointing out my mistakes on the way.
Update: Thanks for all responses. Further good sources that helped me out with this are here:
http://www.html5rocks.com/en/tutorials/async/deferred/
http://lostechies.com/johnteague/2012/11/30/node-js-must-know-concepts-asynchrounous/
http://www.interact-sw.co.uk/iangblog/2004/09/23/threadless (.NET)
http://ejohn.org/blog/how-javascript-timers-work/ (intrinsics of timers)
http://www.mobl-lang.org/283/reducing-the-pain-synchronous-asynchronous-programming/
The real answer is that it depends on what you mean by "single thread".
There are two approaches to multitasking: cooperative and interrupt-driven. Cooperative, which is what the other StackOverflow item you cited describes, requires that routines explicitly relinquish ownership of the processor so it can do other things. Event-driven systems are often designed this way. The advantage is that it's a lot easier to administer and avoids most of the risks of conflicting access to data since only one chunk of your code is ever executing at any one time. The disadvantage is that, because only one thing is being done at a time, everything has to either be designed to execute fairly quickly or be broken up into chunks that to so (via explicit pauses like a yield() call), or the system will appear to freeze until that event has been fully processed.
The other approach -- threads or processes -- actively takes the processor away from running chunks of code, pausing them while something else is done. This is much more complicated to implement, and requires more care in coding since you now have the risk of simultaneous access to shared data structures, but is much more powerful and -- done right -- much more robust and responsive.
Yes, there is indeed a scheduler involved in either case. In the former version the scheduler is just spinning until an event arrives (delivered from the operating system and/or runtime environment, which is implicitly another thread or process) and dispatches that event before handling the next to arrive.
The way I think of it in JavaScript is that there is a Queue which holds events. In the old Java producer/consumer parlance, there is a single consumer thread pulling stuff off this queue and executing every function registered to receive the current event. Events such as asynchronous calls (AJAX requests completing), timeouts or mouse events get pushed on to the Queue as soon as they happen. The single "consumer" thread pulls them off the queue and locates any interested functions and then executes them, it cannot get to the next Event until it has finished invoking all the functions registered on the current one. Thus if you have a handler that never completes, the Queue just fills up - it is said to be "blocked".
The system has more than one thread (it has at least one producer and a consumer) since something generates the events to go on the queue, but as the author of the event handlers you need to be aware that events are processed in a single thread, if you go into a tight loop, you will lock up the only consumer thread and make the system unresponsive.
So in your example :
function foo(){
read_file(location, function(fileContents) {
// called with the fileContents when file is read
}
//do many things more here, potentially for hours
}
If you do as your comments says and execute potentially for hours - the callback which handles fileContents will not fire for hours even though the file has been read. As soon as you hit the last } of foo() the consumer thread is done with this event and can process the next one where it will execute the registered callback with the file contents.
HTH
I have an requirement where I have to send the alerts when the record in db is not updated/changed for specified intervals. For example, if the received purchase order doesn't processed within one hour, the reminder should be sent to the delivery manager.
The reminder/alert should sent exactly at the interval (including seconds). If the last modified time is 13:55:45 means, the alert should be triggered 14:55:45. There could be million rows needs to be tracked.
The simple approach could be implementing a custom scheduler and all the records will registered with it. But should poll the database to look for the change every second and it will lead to performance problem.
UPDATE:
Another basic approach would be a creating a thread for each record and put it on sleep for 1 hour (or) Use some queuing concept which has timeout. But still it has performance problems
Any thoughts on better approach to implement the same?
probably using internal JMS queue would be better solution - for example you may want to use scheduled message feature http://docs.jboss.org/hornetq/2.2.2.Final/user-manual/en/html/examples.html#examples.scheduled-message with hornetq.
You can ask broker to publish alert message after exactly 1h. From the other hand during processing of some trading activity you can manually delete this message meaning that the trade activity has been processed without errors.
Use Timer for each reminder.i.e. If the last modified time is 17:49:45 means, the alert should be triggered 18:49:45 simply you should create a dynamic timer scheduling for each task it'll call exact after one hour.
It is not possible in Java, if you really insist on the "Real-timeness". In Java you may encouter Garbage collector's stop-the-world phase and you can never guarantee the exact time.
If the approximate time is also permissible, than use some kind of scheduled queue as proposed in other answers, if not, than use real-time Java or some native call.
If we can assume that the orders are entered with increasing time then:
You can use a Queue with elements that have the properties time-of-order and order-id.
Each new entry that is added to the DB is also enqueued to this Queue.
You can check the element at the start of the Queue each minute.
When checking the element at the start of the Queue, if an hour has passed from the time-of-order, then search for the entry with order-id in the DB.
If found and was not updated then send a notification, else dequeue it from the Queue .
Because I'm executing a time critical task every second, I compared several methods to find the best way to ensure that my task is really executed in fixed time steps. After calculating the standard derivation of the error for all methods it seems like using the method scheduledExecutorService.scheduleAtFixedRate() leads to the best results, but I don't have a clue why it is so.
Does anybody know how that method internally works? How does it for example in comparison to a simple sleep() ensure that the referenced task is really executed in fixed time steps?
A 'normal' Java VM cannot make any hard real-time guarantees about execution times (and as a consequence of that also about scheduling times). If you really need hard real time guarantees you should have a look at a real time VM like Java RTS. Of course you need a real time OS in that case too.
Regarding comparison to Thread.sleep(): the advantage of scheduledExecutorService.scheduleAtFixedRate() compared to a (naive) usage of Thread.sleep() is that it isn't affected by the execution time of the scheduled task. See ScheduledFutureTask.runPeriodic() on how this is implemented.
You can have a look of the OpenJDK7 implementation of ScheduledThreadPoolExecutor.java which is the only class implementing the ScheduledExecutorService interface.
However as far as I know there is guarantee in the ScheduledExecutorService about accuracy. So even if your measurments show this to be acurate that may not be the case if you switch to a different platform, vm or jdk .