My programme is a notification service. It receives HTTP requests (clients send notifications) and forwards them to a device.
I want it to work the following way:
1. Receive the client's notification request.
2. Save it to the database (yes, I need this step; it's mandatory).
3. Async threads watch for new requests in the database.
4. Async threads forward them to the destination (the device).
In this case the programme can send the client a confirmation straight away after step 2, without waiting for the destination to respond (the device's response time can be too long).
If I stored client notifications in memory I would use a BlockingQueue. But I need to persist my notifications in the DB. Also, I cannot use message queues, because clients want REST endpoints to send notifications.
Help me to work out the architecture of such a mechanism.
PS: I'm using Java and PostgreSQL.
Here are some ideas that can lead to the solution:
Step 2 is probably mandatory to make sure the request is persisted so that it can be queried later. So we're talking about some kind of "data model" here.
With this in mind, if you send the confirmation right after step 2, what if you later want to do something with this data (say, send it somewhere) and that action doesn't succeed? Do you store it on disk? What happens if the disk is full?
The most important question is what happens to your data model (in the database) in this case. Should the entry in the database still be there, or has the whole "logical" action failed? This is something you need to figure out; depending on the actual system, the answers can differ.
The most "strict" solution would use transactions in the following (schematic) way:
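    tr = openTransaction();
    try {
        saveRequestIntoDB(data);      // step 2: persist the request
        forwardToDestination(data);   // step 4: send it to the device
        tr.commit();                  // keep the record only if both steps succeeded
    } catch (SomeException ex) {
        tr.rollback();                // undo the insert if anything failed
    }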
With this design, if something goes wrong during the "saveRequest" step - well, nothing will happen. If the data is stored in db, but then forwardToDestination fails - then the transaction will be rolled back and the record won't be stored in DB.
If all the operations succeed - the transaction will be committed.
Now, it looks like you can still use a messaging system in step 4. Sending a message can be fast and won't add significant overhead to the request.
On the other hand, the benefits are obvious:
- Who listens to these "notifications"? If you send something and only one service should receive and process the notification how do you make sure that others won't get it? How would you implement the opposite - what if all the services should get the notification and process it independently?
These facilities are already implemented by any decent messaging system.
I can't really understand this statement:
"I cannot use Message Queues, because clients want rest endpoints to send notifications."
Since the whole flow originates from the client's request, I don't see any contradiction here. The code called from the REST endpoint (which is, after all, a logical entry point that you implement yourself) can call the database, persist the data and then send the notification...
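The persist-then-forward flow discussed above can be sketched roughly as follows. This is a minimal illustration, not a definitive design: an in-memory map stands in for the PostgreSQL table, and `NotificationStore`, `ForwardingWorker` and the `device` predicate are hypothetical names. With a real table, several workers could claim rows safely with something like `SELECT ... FOR UPDATE SKIP LOCKED`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical stand-in for a notifications table; a real version would use JDBC. */
class NotificationStore {
    enum Status { PENDING, SENT }

    private final Map<Long, String> payloads = new LinkedHashMap<>();
    private final Map<Long, Status> statuses = new LinkedHashMap<>();
    private long nextId = 1;

    /** Steps 1-2: persist the request; the REST handler can confirm once this returns. */
    synchronized long save(String payload) {
        long id = nextId++;
        payloads.put(id, payload);
        statuses.put(id, Status.PENDING);
        return id;
    }

    /** Step 3: list the ids that still have to be forwarded, oldest first. */
    synchronized List<Long> pendingIds() {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Status> e : statuses.entrySet()) {
            if (e.getValue() == Status.PENDING) ids.add(e.getKey());
        }
        return ids;
    }

    synchronized String payload(long id) { return payloads.get(id); }
    synchronized void markSent(long id) { statuses.put(id, Status.SENT); }
    synchronized Status status(long id) { return statuses.get(id); }
}

/** Step 4: a worker pass that forwards pending rows and leaves failures for the next pass. */
class ForwardingWorker {
    private final NotificationStore store;
    private final Predicate<String> device; // returns true when the device accepted the payload

    ForwardingWorker(NotificationStore store, Predicate<String> device) {
        this.store = store;
        this.device = device;
    }

    /** Runs one pass and returns the number of notifications delivered. */
    int runOnce() {
        int delivered = 0;
        for (long id : store.pendingIds()) {
            if (device.test(store.payload(id))) {
                store.markSent(id);
                delivered++;
            }
        }
        return delivered;
    }
}
```

In a running service, `runOnce()` could be scheduled with `ScheduledExecutorService.scheduleWithFixedDelay`, so a slow device only delays the worker and never the client's confirmation.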
I have a service (ServiceA) with an endpoint that clients can subscribe to. After subscription, the service produces data continuously using server-sent events.
If this is important, I am using Project Reactor with Java.
It may be important, so I'll explain what this endpoint does. Every 15 seconds it fetches data from another service (ServiceB) and checks whether anything changed compared with the data it fetched 15 seconds earlier. If so, it produces a new event with this data; if not, it sends nothing (so the payload to the client is as small as possible).
Now, this application can have multiple clients connected at once and they all ask for the same data - it is not filtered by the user etc.
Is it sensible that this observable producing the output is shared between multiple clients?
Of course it would save us a lot of unnecessary calls to ServiceB, but I wonder whether there are any drawbacks to this approach. This is the first time I am writing a reactive program on the backend (coming from RxJS), and I don't know whether this would cause concurrency problems or any other sort of problems.
The other benefit I can see is that a new client connecting would immediately be served the last received data from the ServiceB (it usually takes about 4s per call to retrieve this data).
I also wonder whether this observable could call ServiceB only while there are subscribers: while there is at least one subscriber, call the service; when there are none, stop calling it; and when a new subscriber arrives, start calling it again, but first serve that client the last fetched data (no matter how old or stale it may be).
Your SSE source can perfectly well be shared using the following pattern:
source.publish().refCount();
Note that you need to store the return value of that call and return that same instance to subsequent callers in order for the sharing to occur.
Once all subscribers unsubscribe, refCount will also cancel its subscription to the original source. After that the first subscriber to come in will trigger a new subscription to the source, which you should craft so that it fetches the latest data and re-initializes a polling cycle every 15s.
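To make the ref-counting behaviour described above concrete, here is a plain-Java sketch of the idea. It is not Reactor code: `SharedSource` and its callbacks are hypothetical names. It also keeps the last value for late subscribers, which is an assumption taken from the question; `publish().refCount()` on its own does not replay anything, and in Reactor an operator such as `replay(1)` would be the place to look for that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Plain-Java illustration of ref-counted sharing; all names here are hypothetical. */
class SharedSource<T> {
    private final Runnable startPolling; // invoked when the first subscriber arrives
    private final Runnable stopPolling;  // invoked when the last subscriber leaves
    private final List<Consumer<T>> subscribers = new ArrayList<>();
    private T lastValue;                 // kept so late subscribers get data immediately

    SharedSource(Runnable startPolling, Runnable stopPolling) {
        this.startPolling = startPolling;
        this.stopPolling = stopPolling;
    }

    /** Registers a subscriber and returns a handle that unsubscribes it. */
    synchronized Runnable subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
        if (lastValue != null) subscriber.accept(lastValue); // replay stale data first
        if (subscribers.size() == 1) startPolling.run();     // 0 -> 1: connect to the source
        return () -> unsubscribe(subscriber);
    }

    private synchronized void unsubscribe(Consumer<T> subscriber) {
        // 1 -> 0: nobody is listening any more, so stop calling the upstream service
        if (subscribers.remove(subscriber) && subscribers.isEmpty()) stopPolling.run();
    }

    /** Called by the polling loop whenever the upstream service returned changed data. */
    synchronized void emit(T value) {
        lastValue = value;
        for (Consumer<T> s : new ArrayList<>(subscribers)) s.accept(value);
    }
}
```

The single shared instance is what makes this work, which mirrors the note above about storing and reusing the value returned by `refCount()`.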
Consider a user cart and checkout: a customer can perform an addItemToCart action, which is handled by the main DB instance. However, a getUserCartItems action might be performed on a read replica, which might not contain the result of the first action yet due to replica lag. Even if we try to minimize this lag, it's still possible to hit this case, so I'm wondering what solutions you have tried in production.
According to Henrik's answer, we have 3 options:
1. Wait at user till consistent.
This means we need to perform polling (regular or long polling) on the client and wait until the replica receives the update. However, I assume replica lag shouldn't be longer than 1-5 seconds. Also, the harder we push to reduce replica lag, the more performance we give up.
2. Ensure consistency through 2PC.
If I understood correctly, we need to combine both addItemToCart insert and getUserCartItems select into one aggregate operation on backend and return getUserCartItems as addItemToCart response. However, the next request might still not get updated info due to lag… Yes it returns immediate confirmation about successful operation and the application can continue, however proceeding to checkout requires user cart items in order to show price correctly, so we are not fixing the problem anyway.
3. Fool the client.
The application stores/caches all successfully sent data and uses it for display. Yes, this is a solution, but it definitely requires additional business logic to be implemented:
    Perform addItemToCart request;
    if (addItemToCart returned success)
        Store addItemToCart in local storage;
    else
        Show error and retry;
    Perform getUserCartItems request;
    if (getUserCartItems contains addItemToCart ID)
        Update local storage / cache and proceed with it;
    else
        Use existing data from local storage;
How do you deal with eventual inconsistency?
The correct answer is to NOT send SELECT queries to a read slave if the data needs to be immediately available.
You should structure your application such that all real-time requests hit your master, and all other requests hit one of your read slaves.
For things where you don't need real-time results, you can fool the user quite well using something like AJAX requests or websockets (websockets is going to make your application a lot more resource friendly as you won't be hammering your backend servers with multiple AJAX requests).
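One way to express the routing rule above in code is sketched below. `ReadRouter`, `Target` and `pinMillis` are hypothetical names. Pinning a user's reads to the primary for a short window after their own write is an added assumption, not something the answer prescribes; it is a common way to cover cases such as getUserCartItems right after addItemToCart.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical router: reads that must see fresh data go to the primary. */
class ReadRouter {
    enum Target { PRIMARY, REPLICA }

    private final long pinMillis;     // assumed upper bound on replica lag
    private final LongSupplier clock; // injectable for tests, e.g. System::currentTimeMillis
    private final Map<String, Long> lastWrite = new ConcurrentHashMap<>();

    ReadRouter(long pinMillis, LongSupplier clock) {
        this.pinMillis = pinMillis;
        this.clock = clock;
    }

    /** Record that this user just wrote to the primary (e.g. addItemToCart). */
    void recordWrite(String userId) {
        lastWrite.put(userId, clock.getAsLong());
    }

    /** Choose where a read such as getUserCartItems should go. */
    Target routeRead(String userId, boolean needsFreshData) {
        if (needsFreshData) return Target.PRIMARY; // real-time requests always hit the primary
        Long t = lastWrite.get(userId);
        boolean recentlyWrote = t != null && clock.getAsLong() - t < pinMillis;
        return recentlyWrote ? Target.PRIMARY : Target.REPLICA;
    }
}
```

A checkout page would pass `needsFreshData = true`, while a product listing could accept whatever the replica currently holds.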
In designing my GWT/GAE app, it has become evident to me that my client-side (GWT) will be generating three types of requests:
Synchronous - "answer me right now! I'm important and require a real-time response!!!"
Asynchronous - "answer me when you can; I need to know the answer at some point but it's really not all that urgent."
Command - "I don't need an answer. This isn't really a request, it's just a command to do something or process something on the server-side."
My game plan is to implement my GWT code so that I can specify, for each specific server-side request (note: I've decided to go with RequestFactory over traditional GWT-RPC for reasons outside the scope of this question), which type of request it is:
SynchronousRequest - Synchronous (from above); sends a command and eagerly awaits a response that it then uses to update the client's state somehow
AsynchronousRequest - Asynchronous (from above); makes an initial request and somehow - either through polling or the GAE Channel API, is notified when the response is finally received
CommandRequest - Command (from above); makes a server-side request and does not wait for a response (even if the server fails to, or refuses to, oblige the command)
I guess my intention with SynchronousRequest is not to produce a totally blocking request, however it may block the user's ability to interact with a specific Widget or portion of the screen.
The added kicker here is this: GAE strongly enforces a timeout on all of its frontend instances (60 seconds). Backend instances have much more relaxed constraints for timeouts, threading, etc. So it is obvious to me that AsynchronousRequests and CommandRequests should be routed to backend instances so that GAE timeouts do not become an issue with them.
However, if GAE is behaving badly, or if we're hitting peak traffic, or if my code just plain sucks, I have to account for the scenario where a SynchronousRequest is made (which would have to go through a timeout-regulated frontend instance) and will time out unless my GAE server code does something fancy. I know there is a method in the GAE API that I can call to see how many milliseconds a request has left before it times out; the name escapes me right now, but it's what this "fancy" code would be based on. Let's call it public static long GAE.timeLeftOnRequestInMillis() for the sake of this question.
In this scenario, I'd like to detect that a SynchronousRequest is about to time out, and somehow dynamically convert it into an AsynchronousRequest so that it doesn't. Perhaps this means sending an AboutToTimeoutResponse back to the client and forcing the client to decide whether to resend as an AsynchronousRequest or just fail. Or perhaps we can just transform the SynchronousRequest into an AsynchronousRequest and push it to a queue where a backend instance will consume it, process it and return a response. I don't have any preferences when it comes to implementation, so long as the request doesn't fail or time out because the server couldn't handle it fast enough (because of GAE-imposed regulations).
So then, here is what I'm actually asking here:
How can I wrap a RequestFactory call inside SynchronousRequest, AsynchronousRequest and CommandRequest in such a way that the RequestFactory call behaves the way each of them is intended? In other words, so that the call either partially-blocks (synchronous), can be notified/updated at some point down the road (asynchronous), or can just fire-and-forget (command)?
How can I implement my requirement to let a SynchronousRequest bypass GAE's 60-second timeout and still get processed without failing?
Please note: timeout issues are easily circumvented by re-routing things to backend instances, but backends don't/can't scale. I need scalability here as well (that's primarily why I'm on GAE in the first place!) - so I need a solution that deals with scalable frontend instances and their timeouts. Thanks in advance!
If the computation that you want GAE to do is going to take longer than 60 seconds, then don't wait for the results to be computed before sending a response. According to your problem definition, there is no way to get around this. Instead, clients should submit work orders, and wait for a notification from the server when the results are ready. Requests would consist of work orders, which might look something like this:
    class ComputeDigitsOfPiWorkOrder {
        // parameters for the computation
        int numberOfDigitsToCompute;
        // Used by the GAE app to contact the requester when results are ready.
        ClientId clientId;
    }
This way, your GAE app can respond as soon as the work order is saved (e.g. in Task Queue), and doesn't have to wait until it actually finishes calculating a billion digits of pi before responding. Your GWT client then waits for the result using the Channel API.
In order to give some work orders higher priority, you can use multiple task queues. If you want Task Queue work to scale automatically, you'll want to use push queues. Implementing priority using push queues is a little tricky, but you can configure high priority queues to have faster feed rate.
You could replace Channel API with some other notification solution, but that would probably be the most straightforward.
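The submit-now, notify-later flow described above might look roughly like this. It is only a sketch: an in-memory queue stands in for Task Queue, a callback stands in for the Channel API, and `PiWorkOrder`, `WorkOrderService` and the truncated constant used as the "computation" are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.BiConsumer;

/** Hypothetical work order, mirroring the class sketched above. */
class PiWorkOrder {
    final int numberOfDigitsToCompute;
    final String clientId; // used to contact the requester when results are ready

    PiWorkOrder(int numberOfDigitsToCompute, String clientId) {
        this.numberOfDigitsToCompute = numberOfDigitsToCompute;
        this.clientId = clientId;
    }
}

/** In-memory stand-in for a task queue plus a notification channel. */
class WorkOrderService {
    private static final String PI = "3.14159265358979"; // placeholder for a real computation
    private final Queue<PiWorkOrder> queue = new ArrayDeque<>();
    private final BiConsumer<String, String> notifier; // (clientId, result), e.g. a channel push

    WorkOrderService(BiConsumer<String, String> notifier) {
        this.notifier = notifier;
    }

    /** Request path: enqueue and return at once; returns the order's position in the queue. */
    synchronized int submit(PiWorkOrder order) {
        queue.add(order);
        return queue.size();
    }

    /** Worker path: process one order and notify its requester; false if the queue was empty. */
    boolean processNext() {
        PiWorkOrder order;
        synchronized (this) {
            order = queue.poll();
        }
        if (order == null) return false;
        int end = Math.min(2 + order.numberOfDigitsToCompute, PI.length());
        notifier.accept(order.clientId, PI.substring(0, end));
        return true;
    }
}
```

The request handler only ever calls `submit`, so its latency is independent of how long the computation takes.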
I'm building a web service with a RESTful interface (let's call it MY_API). This service relies on another RESTful web service to handle certain aspects (let's call it OTHER_API). I'd like to determine which best practices I should consider for handling failures of OTHER_API.
Scenario
My UI is a single page javascript application. There are some fairly complex actions a user can take, which can easily take the user a minute or two to complete. When they are done, they click the SAVE button and MY_API is called to save the data.
MY_API has everything it needs to persist the information submitted by the user. However, there is an action that must take place that is handled by OTHER_API. For instance, OTHER_API might handle sending out emails. Or perhaps it handles adding line items to my user's billing statement. In both cases, these are critical things that must be completed, but they don't have to happen right now; they just need to happen eventually.
If OTHER_API fails, I don't want to simply tell the user their action has failed, as they spent a lot of time doing it and this will make the experience less than optimal.
Questions
So should I create some sort of Message or Event Queue that can save these failed REST requests to OTHER_API and process them later?
Any advice or suggestions on techniques to go about saving REST requests for delayed processing?
Is there a recommended open source message queue solution that would work for this type of scenario with JSON-based REST web services? Java is preferred as my backend is written in it.
Are there other techniques I should consider?
Rather than approach this by focusing on the failure state, it'd be faster and more robust to recognize that these actions should be performed asynchronously and out-of-band from the request by the UI. You should indeed use a message/event/job queue, and just pop those jobs right onto that queue as quickly as possible, and respond to the original request as quickly as possible. Once you've done that, the asynchronous job can be performed independently of the original request, and at its own pace — including with retries as needed.
If you want your API to indicate that there are aspects of the request which have not completed, you can use the HTTP response Status Code 202 (Accepted).
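A minimal sketch of the "accept now, process later with retries" idea follows. The names `RetryingJobQueue`, `accept` and `drainOnce` are hypothetical, and the queue is in-memory only; a production version would need durable storage and some backoff between attempts.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.BooleanSupplier;

/** Hypothetical in-memory job queue with a bounded number of attempts per job. */
class RetryingJobQueue {
    static final int HTTP_ACCEPTED = 202;

    private static final class Job {
        final BooleanSupplier call; // returns true when the OTHER_API call succeeded
        int attempts;
        Job(BooleanSupplier call) { this.call = call; }
    }

    private final Queue<Job> jobs = new ArrayDeque<>();
    private final int maxAttempts;
    private int abandoned;

    RetryingJobQueue(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Request path: queue the OTHER_API call and answer 202 Accepted immediately. */
    synchronized int accept(BooleanSupplier otherApiCall) {
        jobs.add(new Job(otherApiCall));
        return HTTP_ACCEPTED;
    }

    /** Worker path: try each queued job once; failures are re-queued until maxAttempts. */
    synchronized int drainOnce() {
        int succeeded = 0;
        int n = jobs.size();
        for (int i = 0; i < n; i++) {
            Job job = jobs.poll();
            job.attempts++;
            boolean ok;
            try {
                ok = job.call.getAsBoolean();
            } catch (RuntimeException e) {
                ok = false; // treat an exception like a failed call
            }
            if (ok) succeeded++;
            else if (job.attempts < maxAttempts) jobs.add(job);
            else abandoned++; // give up; a real system would alert or dead-letter here
        }
        return succeeded;
    }

    synchronized int pending() { return jobs.size(); }
    synchronized int abandoned() { return abandoned; }
}
```

The user's SAVE request returns as soon as `accept` does, so a temporary OTHER_API outage never surfaces as a failed save.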
I'm building an application with distributed parts.
That is, while one part (the writer) may be inserting or updating information in a database, the other part (the reader) is reading that information and acting on it.
Now, I wish to trigger an action event in the reader and reload information from the DB whenever I insert something from the writer.
Is there a simple way about this?
Would this be a good idea? :
    // READER
    while (true) {
        connect();
        // reload info from DB
        executeQuery("select * from foo");
        disconnect();
    }
EDIT : More info
This is a restaurant Point-of-sale system.
Whenever the cashier punches an order into the DB, the application in the kitchen gets the notification. This is the kind of system you would see at McDonald's.
The applications needn't be connected in any way. But each part will connect to a single MySQL server.
And I believe I'd expect immediate notifications.
You might consider setting up an embedded JMS server in your application. I would recommend ActiveMQ, as it is super easy to embed.
For what you want to do a JMS Topic is a perfect fit. When the cashier punches in an order the order is not written to the database but in a message on the Topic, let's name it newOrders.
On the topic there are 2 subscribers: NewOrderPersister and KitchenNotifier. Each has an onMessage(Message msg) method that receives the details of the order. One saves it to the database; the other adds it to a screen or yells it through the kitchen with text-to-speech, whatever.
The nice part of this is that the poster does not need to know which subscribers, or how many, are waiting for the messages. So if you want a NewOrderCounter in the back office to keep a live count of how much money the owner has made today, or add a "FrenchFriesOrderListener" to have a special display near the deep fryer, nothing has to change in the rest of the application. They just subscribe to the topic.
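The decoupling described above can be illustrated with a tiny in-process publish/subscribe sketch. This is not the JMS API and not ActiveMQ: `Topic` and `NewOrdersExample` are hypothetical stand-ins that only show how the publisher stays unaware of its subscribers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** In-process stand-in for a topic; a real setup would use a JMS broker instead. */
class Topic<M> {
    private final List<Consumer<M>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<M> subscriber) {
        subscribers.add(subscriber);
    }

    /** Delivers the message to every current subscriber; returns how many received it. */
    int publish(M message) {
        int delivered = 0;
        for (Consumer<M> s : subscribers) {
            s.accept(message);
            delivered++;
        }
        return delivered;
    }
}

class NewOrdersExample {
    /** Wires two subscribers to a newOrders topic and returns what they did, in order. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Topic<String> newOrders = new Topic<>();
        newOrders.subscribe(order -> log.add("persisted:" + order)); // NewOrderPersister role
        newOrders.subscribe(order -> log.add("kitchen:" + order));   // KitchenNotifier role
        newOrders.publish("burger"); // the cashier only knows about the topic
        newOrders.publish("fries");
        return log;
    }
}
```

Adding a third subscriber, such as an order counter, is one more `subscribe` call and requires no change to the publishing side.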
The idea you are talking about is called "polling". As Graphain pointed out you must add a delay in the loop. The amount of delay should be decided based on factors like how quickly you want your reader to detect any changes in database and how fast the writer is expected to insert/update data.
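A next improvement could be to keep a change indicator in the database. Your algorithm would look something like this:
    // READER
    last_change_count = 0;
    while (true) {
        connect();
        // read the change counter instead of the whole table
        change_count = executeQuery("select change_count from change_counters where counter = 'foo'");
        if (change_count > last_change_count) {
            last_change_count = change_count;
            reload();   // reload info from DB only when something changed
        }
        disconnect();
        sleep(POLL_DELAY);   // the delay discussed above (POLL_DELAY is a hypothetical constant)
    }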
The above change will ensure that you do not reload data unnecessarily.
You can further tune the solution to keep a row level change count so that you can reload only the updated rows.
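The counter-based polling loop above could be written in Java along these lines. This is a sketch under assumptions: `ChangePoller` is a hypothetical name, and the counter query is abstracted behind a `LongSupplier` so that no particular JDBC setup is implied.

```java
import java.util.function.LongSupplier;

/** Hypothetical poller: reloads only when the stored change counter has advanced. */
class ChangePoller {
    private final LongSupplier readChangeCount; // e.g. runs the change_count query
    private final Runnable reload;              // reloads the data the reader works on
    private long lastChangeCount;

    ChangePoller(LongSupplier readChangeCount, Runnable reload, long initialChangeCount) {
        this.readChangeCount = readChangeCount;
        this.reload = reload;
        this.lastChangeCount = initialChangeCount;
    }

    /** One polling step; returns true if a reload happened. */
    boolean pollOnce() {
        long current = readChangeCount.getAsLong();
        if (current > lastChangeCount) {
            lastChangeCount = current;
            reload.run();
            return true;
        }
        return false;
    }

    /** Polls with a fixed delay between steps until the thread is interrupted. */
    void runLoop(long delayMillis) throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            pollOnce();
            Thread.sleep(delayMillis); // the delay keeps the reader from hammering the database
        }
    }
}
```

Keeping `pollOnce()` separate from the loop makes the change detection easy to test without any sleeping.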
I don't think it's a good idea to use a database to synchronize processes. The parties using the database should synchronize directly, i.e., the writer should write its orders and then notify the kitchen that there is a new order. Then again the notification could be the order itself (or some ID for the database). The notifications can be sent via a message broker.
It's more or less like in a real restaurant. The kitchen rings a bell when meals are finished and the waiters fetch them. They don't poll unnecessarily.
If you really want to use the database for synchronization, you should look into triggers and stored procedures. I'm fairly sure that most RDBMS allow the creation of stored procedures in Java or C that can do arbitrary things like opening a Socket and communicating with another Computer. While this is possible and not as bad as polling I still don't think of it as a very good idea.
Well, to start with, you'd want some kind of wait timer in there, or it is literally going to poll as often as it possibly can. That would be a pretty bad idea unless you want to simulate what it would be like if Google were hosted on one database.
What kind of environment do the apps run in? Are we talking same machine notification, cross-network, over the net?
How frequently do updates occur and how soon does the reader need to know about them?
I have done something similar before using JGroups. I don't remember the exact details, as it was quite a few years ago, but I had a listener on the "writer" end which would use JGroups to send out a notification of the change, which would cause the receivers to respond accordingly.