How to design a Real Time Alerting System?

How to design a Real Time Alerting System? - java

I have an requirement where I have to send the alerts when the record in db is not updated/changed for specified intervals. For example, if the received purchase order doesn't processed within one hour, the reminder should be sent to the delivery manager.
The reminder/alert should sent exactly at the interval (including seconds). If the last modified time is 13:55:45 means, the alert should be triggered 14:55:45. There could be million rows needs to be tracked.
The simple approach could be implementing a custom scheduler and all the records will registered with it. But should poll the database to look for the change every second and it will lead to performance problem.
UPDATE:
Another basic approach would be a creating a thread for each record and put it on sleep for 1 hour (or) Use some queuing concept which has timeout. But still it has performance problems
Any thoughts on better approach to implement the same?

probably using internal JMS queue would be better solution - for example you may want to use scheduled message feature http://docs.jboss.org/hornetq/2.2.2.Final/user-manual/en/html/examples.html#examples.scheduled-message with hornetq.
You can ask broker to publish alert message after exactly 1h. From the other hand during processing of some trading activity you can manually delete this message meaning that the trade activity has been processed without errors.

Use Timer for each reminder.i.e. If the last modified time is 17:49:45 means, the alert should be triggered 18:49:45 simply you should create a dynamic timer scheduling for each task it'll call exact after one hour.

It is not possible in Java, if you really insist on the "Real-timeness". In Java you may encouter Garbage collector's stop-the-world phase and you can never guarantee the exact time.
If the approximate time is also permissible, than use some kind of scheduled queue as proposed in other answers, if not, than use real-time Java or some native call.

If we can assume that the orders are entered with increasing time then:
You can use a Queue with elements that have the properties time-of-order and order-id.
Each new entry that is added to the DB is also enqueued to this Queue.
You can check the element at the start of the Queue each minute.
When checking the element at the start of the Queue, if an hour has passed from the time-of-order, then search for the entry with order-id in the DB.
If found and was not updated then send a notification, else dequeue it from the Queue .

Related

Process Multiple Events based on order of Timestamp

I want to process multiple events in order of their timestamps coming into the system via multiple source systems like MQ, S3 ,KAFKA .
What approach should be taken to solve the problem?

As soon as an event comes in, the program can't know if another source will send events that should be processed before this one but have not arrived yet. Therefore, you need some waiting period, e.g. 5 minutes, in which events won't be processed so that late events have a chance to cut in front.
There is a trade-off here, making the waiting window larger will give late events a higher chance to be processed in the right order, but will also delay event processing.
For implementation, one way is to use a priority-queue that sorts by min-timestamp. All event sources write to this queue and events are consumed only from the top and only if they are at least x seconds old.
One possible optimisation for the processing lag: As long as all data sources provide at least one event that is ready for consumption, you can safely process events until one source is empty again. This only works if sources provide their own events in-order. You could implement this by having a counter for each data source of how many events exist in the priority-queue.
Another aspect is what happens to the priority-queue when a node crashes. Using acknowledgements should help here, so that on crash the queue can be rebuilt from unacknowledged events.

Creating Amazon SNS messages to be processed in the future

For the last few years we have used our own RM Application to process events related to our applications. This works by polling a database table every few minutes, looking for any rows that have a due date before now, and have not been processed yet.
We are currently making the transition to SNS, with SQS Worker tiers processing them. The problem with this approach is that we can't future date our messages. Our applications sometimes have events that we don't want to process until a week later.
Are there any design approaches, alternative services, clever tricks we could employ that would allow us to do achieve this?
One solution would be to keep our existing application running, at a simplified level, so all it does is send the SNS notifications when they are due, but the aim of this project is to try and do away with our existing app.

The database approach would be the wisest, being careful that each row is only processed once.
Amazon Simple Notification Service (SNS) is designed to send notifications immediately. There is no functionality for a delayed send (although some notification types are retried if they fail).
Amazon Simple Queue Service (SQS) does have a delay feature, but only up to 15 minutes -- this is useful if you need to do some work before the message is processed, such as copying related data to Amazon S3.
Given that your requirement is to wait until some future arbitrary time (effectively like a scheduling system), you could either start a process and tell it to sleep for a certain amount of time (a bad idea in case systems are restarted), or continue your approach of polling from a database.
If all jobs are scheduled for a distant future (eg at least one hour away), you theoretically only need to poll the database once an hour to retrieve the earliest scheduled time.

A week might be too long as SQS message retention itself is only 15 days. If you are okay with maximum retention of 15days, one idea is to keep the changing the visibility of a message every time you receive until it is ready for processing. The maximum allowed visibility timeout is 12 hours. More on visibility timeout and APIs for changing them,
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ChangeMessageVisibility.html
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/AboutVT.html

I found this approach: https://github.com/alestic/aws-sns-delayed. Basically, you can use a step function with a wait step in there

Performance considerations of using an event based Timers vs Polling

What is both faster and "better practice", using a polling system or a event based timer?
I'm currently having a discussion with a more senior coworker regarding how to implement some mission critical logic. Here is the situation:
A message giving an execution time is received.
When that execution time is reached, some logic must be executed.
Now multiple messages can be received giving different execution times, and the logic must be executed each time.
I think that the best way to implement the logic would be to create a timer that would trigger the logic when the message at the time in the message, but my coworker believes that I would be better off polling a list of the messages to see if the execution time has been reached.
His argument is that the polling system is safer as it is less complicated and thus less likely to be screwed up by the programmer. My argument is that by implementing it my way, we reduce the reduce the computational load and thus are more likely execute the logic when we actually want it to execute. How should I implement it and why?
Requested Information
The only time my logic would ever be utilized would almost certainly be at a time of the highest load.
The requirements do not specify how reliable the connection will be but everyone I've talked to has stated that they have never heard of a message being dropped
The scheduling is based on an absolute system. So, the message will have a execution time specifying when an algorithm should be executed. Since there is time synchronization, I have been instructed to assume that the time will be uniform among all machines.
The algorithm that gets executed uses some inputs which initially are volatile but soon stabilize. By postponing the processing, I hope to use the most stable information available.

The java.util.Timer effectively does what your colleague suggests (truth be told, in the end, there really aren't that many ways to do this).
It maintains a collection of TimerTasks, and it waits for new activity on it, or until the time has come to execute the next task. It doesn't poll the collection, it "knows" that the next task will fire in N seconds, and waits until that happens or anything else (such as a TimerTask added or deleted). This is better overall than polling, since it spends most of its time sleeping.
So, in the end, you're both right -- you should use a Timer for this, because it basically does what your coworker wants to do.

Pushing update events without flooding the client. Clean way to do this?

I think this is a solved problem but my google-fu wasn't good enough.
I have a table tracking the status of multiple things. Server can push changes to clients at will.
The problem is I don't want to push update if the last update is less than 5 seconds ago.
What is the cleanest way of achieving this without making another event manager thread?
My current stab looks like this:
pushEvent(){
Look up the last update time
if: last update less than 5 sec ago
then do nothing
else
pushToClients
}
It works good enough for most part, but obviously the last update could be left unpushed.
What is a good way of doing this?
Some ways I have thought of:
Add a 5 second delay to all push (eg Thread.sleep), that way I can
check if an update has already been scheduled. Not ideal but no one
would mind.
Do the push, then set doNotPush=true. Use a timer to
set it back to false.
Thanks,

On server, you can use HashMap which will contain pair {current event, last sent event} as values and event's recepients (clients) as keys. Then use scheduled periodic task which will iterate over this map, check if current event != last sent event and, if yes, will send current event to client (and put it intolast sent).
One drawback can be that if sending to client is slow, there can be jams of events in outgoing queues. You can work around this by sending asynchronously (e.g., by offloading events to be sent into separate Executor).

Is there a way to assure FIFO (first in, first out) behavior with Task Queues on GAE?

Is there a way to assure FIFO (first in, first out) behavior with Task Queues on GAE?
GAE Documentation says that FIFO is one of the factors that affect task execution order, but the same documentation says that “the system's scheduling may 'jump' new tasks to the head of the queue” and I have confirmed this behavior with a test. The effect: my events are being processed out of order.
Docs says:
https://developers.google.com/appengine/docs/java/taskqueue/overview-push
The order in which tasks are executed depends on several factors:
The position of the task in the queue. App Engine attempts to process tasks based on FIFO > (first in, first out) order. In general, tasks are inserted into the end of a queue, and
executed from the head of the queue.
The backlog of tasks in the queue. The system attempts to deliver the lowest latency
possible for any given task via specially optimized notifications to the scheduler.
Thus, in the case that a queue has a large backlog of tasks, the
system's scheduling may "jump" new tasks to the head of the queue.
The value of the task's etaMillis property. This property specifies the
earliest time that a task can execute. App Engine always waits until
after the specified ETA to process push tasks.
The value of the task's countdownMillis property. This property specifies the minimum
number of seconds to wait before executing a task. Countdown and eta
are mutually exclusive; if you specify one, do not specify the other.
What I need to do? In my use case, I'll process 1-2 million events/day coming from vehicles. These events can be sent at any interval (1 sec, 1 minute or 1 hour). The order of the event processing has to be assured. I need process by timestamp order, which is generated on a embedded device inside the vehicle.
What I have now?
A Rest servlet that is called by the consumer and creates a Task (Event data is on payload).
After this, a worker servlet get this Task and:
Deserialize Event data;
Put Event on Datastore;
Update Vehicle On Datastore.
So, again, is there any way to assure just FIFO behavior? Or how can I improve this solution to get this?

You need to approach this with three separate steps:
Implement a Sharding Counter to generate a monotonically
increasing ID. As much as I like to use the timestamp from
Google's server to indicate task ordering, it appears that timestamps
between GAE servers might vary more than what your requirement is.
Add your tasks to a Pull Queue instead of a Push Queue. When
constructing your TaskOption, add the ID obtained from Step #1 as a tag.
After adding the task, store the ID somewhere on your datastore.
Have your worker servlet lease Tasks by a certain tag from the Pull Queue.
Query the datastore to get the earliest ID that you need to fetch, and use the ID as
the lease tag. In this way, you can simulate FIFO behavior for your task queue.
After you finished your processing, delete the ID from your datastore, and don't forget to delete the Task from your Pull Queue too. Also, I would recommend you run your task consumptions on the Backend.
UPDATE:
As noted by Nick Johnson and mjaggard, sharding in step #1 doesn't seem to be viable to generate a monotonically increasing IDs, and other sources of IDs would then be needed. I seem to recall you were using timestamps generated by your vehicles, would it be possible to use this in lieu of a monotonically increasing ID?
Regardless of the way to generate the IDs, the basic idea is to use datastore's query mechanism to produce a FIFO ordering of Tasks, and use task Tag to pull specific task from the TaskQueue.
There is a caveat, though. Due to the eventual consistency read policy on high-replication datastores, if you choose HRD as your datastore (and you should, the M/S is deprecated as of April 4th, 2012), there might be some stale data returned by the query on step #2.

I think the simple answer is "no", however partly in order to help improve the situation, I am using a pull queue - pulling 1000 tasks at a time and then sorting them. If timing isn't important, you could sort them and put them into the datastore and then complete a batch at a time. You've still got to work out what to do with the tasks at the beginning and ends of the batch - because they might be out of order with interleaving tasks in other batches.

Ok. This is how I've done it.
1) Rest servlet that is called from the consumer:
If Event sequence doesn't match Vehicle sequence (from datastore)
Creates a task on a "wait" queue to call me again
else
State validation
Creates a task on the "regular" queue (Event data is on payload).
2) A worker servlet gets the task from the "regular" queue, and so on... (same pseudo code)
This way I can pause the "regular" queue in order to do a data maintenance without losing events.
Thank you for your answers. My solution is a mix of them.

You can put the work to be done in a row in the datastore with a create timestamp and then fetch work tasks by that timestamp, but if your tasks are being created too quickly you will run into latency issues.

Don't know the answer myself, but it may be possible that tasks enqueued using a deferred function might execute in order submitted. Likely you will need an engineer from G. to get an answer. Pull queues as suggested seem a good alternative, plus this would allow you to consider batching your put()s.
One note about sharded counters: they increase the probability of monotonically increasing ids, but do not guarantee them.

The best way to handle this, the distributed way or "App Engine way" is probably to modify your algorithm and data collection to work with just a timestamp, allowing arbitrary ordering of tasks.
Assuming this is not possible or too difficult, you could modify your algorithm as follow:
when creating the task don't put the data on payload but in the datastore, in a Kind with an ordering on timestamps and stored as a child entity of whatever entity you're trying to update (Vehicule?). The timestamps should come from the client, not the server, to guarantee the same ordering.
run a generic task that fetch the data for the first timestamp, process it, and then delete it, inside a transaction.

Following this thread, I am unclear as to whether the strict FIFO requirement is for all transactions received, or on a per-vehicle basis. Latter has more options vs. former.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.