I have a microservice that should create a user, but since user creation is complex it uses a queue: the endpoint only accepts the request and returns ok or fail, and the user is actually created by a consumer.
How do I create an acceptance test for these acceptance criteria:
Given: User who wants to register
When: the API is requested to create the user
Then: create user AND set hosting environment_id on new user
For this I have to wait until the environment is actually set up, which takes up to 30 seconds. If I implement a sleep inside my test, I hit the "wait and see" anti-pattern. How do I test this properly without violating best practices?
The most proper approach might be to return a response instantly, say "setup process started" (with a setup process id), and then offer another API method which returns the setup status for that id, proceeding only once the status is "setup completed".
That way nothing is stuck for 30 seconds, neither in tests nor in production. You could also display a progress bar to the user indicating the current status, so they get an estimate of how long it will take rather than the impression that something is stuck or broken.
You can hardly test asynchronously while the setup process itself isn't asynchronous, and long-running tasks without any kind of status indicator are barely acceptable for delivery: they only look fine to someone who knows what is going on in the background, not to anyone who doesn't.
Whenever testing hits an anti-pattern, that is an indicator that the solution itself might be sub-optimal.
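As a minimal, hypothetical sketch of that shape (all class, method, and status names are invented, and it is in-memory rather than HTTP): the start call returns an id immediately, a background worker does the provisioning, and a status call reports progress.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SetupService {
    private final Map<String, String> statuses = new ConcurrentHashMap<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // POST /setups -> returns a setup-process id immediately, before the
    // (up to 30 s) provisioning work has finished.
    public String startSetup() {
        String id = UUID.randomUUID().toString();
        statuses.put(id, "IN_PROGRESS");
        worker.submit(() -> {
            // ...long-running environment provisioning would happen here...
            statuses.put(id, "COMPLETED");
        });
        return id;
    }

    // GET /setups/{id}/status -> current status for that setup process.
    public String getStatus(String id) {
        return statuses.getOrDefault(id, "UNKNOWN");
    }

    // Convenience for tests: poll the status until completed or timed out.
    public boolean awaitCompletion(String id, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if ("COMPLETED".equals(getStatus(id))) return true;
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

A test (or a progress bar) then polls `getStatus` instead of sleeping for a fixed 30 seconds.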
I don't presume to tell you exactly how to code your acceptance tests without more detail about your language or testing stack, but the simplest solution is a dynamic wait that repeatedly polls the state of the system for the desired result before moving forward, breaking the loop (presuming you use some form of loop, but that's up to you) once the expected response has been received.
This "polling" can take many forms such as:
a) querying for an expected update to a database (perhaps a value within a table is updated when the user is created)
b) pinging the dependent service until you receive the proper "signal" you are expecting to indicate user creation. For example, perhaps a GET request to another service (or another endpoint of the same service) returns a status of “created” for the given user, signifying that the user has been created.
Without further technical information I can’t give you exact instructions, but dynamic polling is the solution I use every day to test our asynchronous microservice architecture.
Keep in mind, this dynamic polling solution operates on the assumption that you have access to the service(s) and/or database(s) that contain the indicator you are "polling" for when it is time to move forward with your test. Again, I'm assuming the signal to move forward is something transparent, such as a status change for the newly created user, or the user's existence in a database/table either external or internal to the microservice.
Some other assumptions in this scenario are:
a) sufficient non-functional performance of the System Under Test; poor performance here would be a constraint on polling.
b) a lack of resource constraints, since resources are consumed somewhat heavily during the polling period (think Azure dynamic resource flexing, which can be costly over time).
Note: Be careful of infinite loops. You should insert some constraint that exits the polling loop (and likely fails the test) after a reasonable period of time or number of attempts, at your discretion.
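The dynamic wait described above can be sketched as a small reusable helper (names are my own); it polls a condition at an interval and gives up at a deadline, so the test fails explicitly instead of looping forever:

```java
import java.util.function.Supplier;

public class PollingWait {
    // Polls `condition` until it returns true or `timeoutMillis` elapses.
    // Returns false on timeout so the caller can fail the test explicitly.
    public static boolean pollUntil(Supplier<Boolean> condition,
                                    long timeoutMillis, long intervalMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (condition.get()) return true;  // desired state reached
            try {
                Thread.sleep(intervalMillis);  // back off between probes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;  // timed out: the test should report a failure
    }
}
```

The condition passed in would be option (a) or (b) above: a database query or a GET against the dependent service.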
Create a query service that, given the user attributes (id, name, etc.), will return the status of the user.
The acceptance criteria will then have two parts:
create-user service returns 200
get-status service returns 200 (you can call it in a loop in your test).
This service will be helpful in the long run for various reasons:
You can check how long the async process takes to complete.
At any time you can get the status of any user, including validating whether a user is truly deleted/inactivated, etc.
You can mock this service's results in your end-to-end integration testing.
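A sketch of what that two-part acceptance test could look like. The names here (createUser, getUserStatus, waitForActive) and the in-memory map are invented stand-ins for the real HTTP calls to the create-user and get-status services:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class UserAcceptanceTest {
    // Stand-in for the user store the get-status service would query.
    static final Map<String, String> userStatus = new ConcurrentHashMap<>();

    // Part 1: "create-user service returns 200". The consumer that actually
    // creates the user runs asynchronously, as in the question.
    static int createUser(String id) {
        userStatus.put(id, "PENDING");
        CompletableFuture.runAsync(() -> userStatus.put(id, "ACTIVE"));
        return 200;
    }

    // Part 2: the get-status service, returning the user's current status.
    static String getUserStatus(String id) {
        return userStatus.getOrDefault(id, "NOT_FOUND");
    }

    // The loop from the answer: poll get-status until active or timed out.
    static boolean waitForActive(String id, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if ("ACTIVE".equals(getUserStatus(id))) return true;
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```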
Looking to get opinions on whether or not it would be a good idea to hand persistence operations off to a task queue. For example, a user submits a new 'order', I use bean validation to verify that everything is ok, and then hand over the processing/persisting of the order to a task queue, and respond back to the user faster.
My hesitance is that the persistence 'could' fail, but once I've validated the bean, the chances are low. Are task queues usually used to handle tasks that are relatively trivial? My main concern is what happens if a task in the task queue fails; since it's done asynchronously, how can I notify the user?
Tasks will retry automatically. If the failure is caused by the infrastructure, the task will be completed on a subsequent try, so you need to worry only about cases where a failure was caused by your code (a code bug) or data (a validation bug). If you iron out the bugs, you can use tasks without hesitation and not worry about the notifications.
In either case, if processing an order takes a couple of seconds, I probably wouldn't bother with task queues. From a user-experience perspective, users want to feel that the app did some work with their order, so a 1-2 second delay in the response is acceptable and even expected.
We have implemented a huge app of logistics flows, and some of our processes take 2-3 minutes to read a lot of data from BigQuery, do the work, and send an e-mail with attachments.
To notify the user you can use the Channel API and/or send an e-mail.
You'll have to provide the user id, mail address, or something like that in the task parameters, because the task is run by the system.
You can't ask App Engine for the currently logged-in user; it will be null every time.
As Andrei said:
you need to worry only about cases where a failure was caused by your code (code bug) or data (validation bug).
Don't let an exception escape the task, otherwise the entire task will be run again.
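To illustrate that rule, a task body might catch permanent (code/data) failures and notify the user, while rethrowing transient infrastructure failures so the queue retries the task. All names here (TransientInfraException, ValidationException, Notifier) are hypothetical stand-ins for whatever your stack provides:

```java
public class OrderTask {
    static class TransientInfraException extends RuntimeException {}

    static class ValidationException extends RuntimeException {
        ValidationException(String message) { super(message); }
    }

    interface Notifier { void notifyUser(String message); }

    // Returns true when the task is finished for good: either it succeeded,
    // or it failed permanently and the user was told. Transient failures are
    // rethrown so the queue re-runs the task.
    static boolean runOnce(Runnable work, Notifier notifier) {
        try {
            work.run();
            return true;
        } catch (ValidationException e) {
            // Permanent failure caused by data: notify instead of retrying.
            notifier.notifyUser("Order rejected: " + e.getMessage());
            return true;
        } catch (TransientInfraException e) {
            // Let it escape: the task queue will run the task again.
            throw e;
        }
    }
}
```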
Consider a user cart and checkout: a customer can perform an addItemToCart action, which will be handled by the main DB instance. However, a getUserCartItems action might be performed on a read replica that does not yet contain the result of the first action, due to replica lag. Even if we try to minimize this lag, it's still possible to hit this case, so I'm wondering what solutions you have tried in production.
According to #Henrik's answer, we have 3 options:
1. Wait at user till consistent.
This means we need to perform polling (regular or long polling) on the client and wait until the replica receives the update. I assume the replica lag shouldn't be longer than 1-5 seconds, but note that the smaller we force the replica lag to be, the bigger the performance hit.
2. Ensure consistency through 2PC.
If I understood correctly, we need to combine both the addItemToCart insert and the getUserCartItems select into one aggregate operation on the backend and return getUserCartItems as the addItemToCart response. However, the next request might still not get the updated info due to lag. Yes, this returns immediate confirmation of a successful operation and the application can continue, but proceeding to checkout requires the user's cart items in order to show the price correctly, so we are not fixing the problem anyway.
3. Fool the client.
The application stores/caches all successfully sent data and uses it for display. Yes, this is a solution, but it definitely requires additional business logic to be implemented:
Perform the addItemToCart request;
if (addItemToCart returned success)
    store the addItemToCart data in local storage;
else
    show an error and retry;
Perform the getUserCartItems request;
if (getUserCartItems contains the addItemToCart ID)
    update local storage / cache and proceed with it;
else
    use the existing data from local storage;
How do you deal with eventual inconsistency?
The correct answer is to NOT send SELECT queries to a read slave if the data needs to be immediately available.
You should structure your application such that all real-time requests hit your master, and all other requests hit one of your read slaves.
For things where you don't need real-time results, you can fool the user quite well using something like AJAX requests or websockets (websockets will make your application a lot more resource-friendly, since you won't be hammering your backend servers with repeated AJAX requests).
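One common production refinement of this rule, sketched here with invented names: after a session writes, pin its reads to the master for a window longer than the worst expected replica lag, then fall back to replicas. Timestamps are passed in explicitly to keep the sketch deterministic; real code would pass System.currentTimeMillis():

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReadRouter {
    public enum Source { MASTER, REPLICA }

    private final Map<String, Long> lastWriteAt = new ConcurrentHashMap<>();
    private final long pinMillis; // should exceed worst-case replica lag

    public ReadRouter(long pinMillis) { this.pinMillis = pinMillis; }

    // Call on every write (e.g. addItemToCart).
    public void recordWrite(String sessionId, long nowMillis) {
        lastWriteAt.put(sessionId, nowMillis);
    }

    // Call on every read (e.g. getUserCartItems): recent writers read from
    // the master and are guaranteed to see their own writes; everyone else
    // can safely use a replica.
    public Source routeRead(String sessionId, long nowMillis) {
        Long t = lastWriteAt.get(sessionId);
        boolean recentlyWrote = t != null && nowMillis - t < pinMillis;
        return recentlyWrote ? Source.MASTER : Source.REPLICA;
    }
}
```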
In designing my GWT/GAE app, it has become evident to me that my client-side (GWT) will be generating three types of requests:
Synchronous - "answer me right now! I'm important and require a real-time response!!!"
Asynchronous - "answer me when you can; I need to know the answer at some point, but it's really not all that urgent."
Command - "I don't need an answer. This isn't really a request, it's just a command to do something or process something on the server-side."
My game plan is to implement my GWT code so that I can specify, for each specific server-side request (note: I've decided to go with RequestFactory over traditional GWT-RPC for reasons outside the scope of this question), which type of request it is:
SynchronousRequest - Synchronous (from above); sends a command and eagerly awaits a response that it then uses to update the client's state somehow
AsynchronousRequest - Asynchronous (from above); makes an initial request and, either through polling or the GAE Channel API, is notified when the response is finally received
CommandRequest - Command (from above); makes a server-side request and does not wait for a response (even if the server fails to, or refuses to, oblige the command)
I guess my intention with SynchronousRequest is not to produce a totally blocking request, however it may block the user's ability to interact with a specific Widget or portion of the screen.
The added kicker here is this: GAE strongly enforces a timeout on all of its frontend instances (60 seconds). Backend instances have much more relaxed constraints for timeouts, threading, etc. So it is obvious to me that AsynchronousRequests and CommandRequests should be routed to backend instances so that GAE timeouts do not become an issue with them.
However, if GAE is behaving badly, or if we're hitting peak traffic, or if my code just plain sucks, I have to account for the scenario where a SynchronousRequest is made (which would have to go through a timeout-regulated frontend instance) and will time out unless my GAE server code does something fancy. I know there is a method in the GAE API that I can call to see how many milliseconds a request has left before it times out; although its name escapes me right now, it's what this "fancy" code would be based on. Let's call it public static long GAE.timeLeftOnRequestInMillis() for the sake of this question.
In this scenario, I'd like to detect that a SynchronousRequest is about to timeout, and somehow dynamically convert it into an AsynchronousRequest so that it doesn't time out. Perhaps this means sending an AboutToTimeoutResponse back to the client, and force the client to decide about whether to resend as an AsynchronousRequest or just fail. Or perhaps we can just transform the SynchronousRequest into an AsynchronousRequest and push it to a queue where a backend instance will consume it, process it and return a response. I don't have any preferences when it comes to implementation, so long as the request doesn't fail or timeout because the server couldn't handle it fast enough (because of GAE-imposed regulations).
So then, here is what I'm actually asking here:
How can I wrap a RequestFactory call inside SynchronousRequest, AsynchronousRequest and CommandRequest in such a way that the RequestFactory call behaves the way each of them is intended? In other words, so that the call either partially-blocks (synchronous), can be notified/updated at some point down the road (asynchronous), or can just fire-and-forget (command)?
How can I implement my requirement to let a SynchronousRequest bypass GAE's 60-second timeout and still get processed without failing?
Please note: timeout issues are easily circumvented by re-routing things to backend instances, but backends don't/can't scale. I need scalability here as well (that's primarily why I'm on GAE in the first place!) - so I need a solution that deals with scalable frontend instances and their timeouts. Thanks in advance!
If the computation that you want GAE to do is going to take longer than 60 seconds, then don't wait for the results to be computed before sending a response. According to your problem definition, there is no way to get around this. Instead, clients should submit work orders, and wait for a notification from the server when the results are ready. Requests would consist of work orders, which might look something like this:
class ComputeDigitsOfPiWorkOrder {
    // Parameters for the computation.
    int numberOfDigitsToCompute;

    // Used by the GAE app to contact the requester when results are ready.
    ClientId clientId;
}
This way, your GAE app can respond as soon as the work order is saved (e.g. in Task Queue), and doesn't have to wait until it actually finishes calculating a billion digits of pi before responding. Your GWT client then waits for the result using the Channel API.
In order to give some work orders higher priority, you can use multiple task queues. If you want Task Queue work to scale automatically, you'll want to use push queues. Implementing priority using push queues is a little tricky, but you can configure high priority queues to have faster feed rate.
You could replace Channel API with some other notification solution, but that would probably be the most straightforward.
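A rough sketch of that flow, with a plain callback standing in for the Channel API and all names invented: submit() returns as soon as the work order is accepted, and a worker notifies the client whenever the computation finishes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PiWorkQueue {
    // Stand-in for the Channel API (or any other notification mechanism).
    interface Channel { void send(String clientId, String message); }

    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    private final Channel channel;

    PiWorkQueue(Channel channel) { this.channel = channel; }

    // Like the HTTP handler: enqueue the work order and return immediately,
    // well within any frontend timeout.
    String submit(int numberOfDigitsToCompute, String clientId) {
        workers.submit(() -> {
            String result = compute(numberOfDigitsToCompute); // possibly minutes of work
            channel.send(clientId, result);                   // notify when ready
        });
        return "accepted";
    }

    // Toy placeholder for the actual long-running computation.
    private String compute(int digits) {
        return "3.14159...".substring(0, Math.min(2 + digits, 10));
    }

    // Lets callers wait for outstanding work orders (useful in tests/shutdown).
    void shutdownAndWait() {
        workers.shutdown();
        try {
            workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

On GAE the ExecutorService would be a push queue and the callback a Channel API send, but the shape of the flow is the same.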
So, to explain this, I'll start out by going through the application stack.
The system is running JSP with jQuery on top, talking through a controller layer with a service layer, which in turn utilizes a persistence layer implemented in Hibernate.
Now, traditionally, errors like overlapping contracts have been handled by throwing exceptions up through the layers until they're translated into an error message for the user.
Now I have an object that at any given time can only be tied to one contract. At the moment, when I save a contract, I look at all of these objects and check if they're already covered by an existing contract. However, since multiple clients can be saving at any given time, this introduces the risk of getting past the check on two separate contracts, leading to one object being tied to two contracts at the same time.
To combat this, the idea was to use a queue, put objects into the queue from the main thread, and then have a separate thread take them out one by one, saving them.
However, here's the problem. For one, I would like the user to know that the saving is currently happening, for another, if by accident the scenario before happens, and two contracts with the same object covering the same time is in the queue, the second one will fail, and this needs to be sent back to the user.
My initial attempt was to keep data fields on the object put into the queue, and then check against those in a blocking wait, and then throw an exception or report success based on what happens. That deadlocked the system completely.
Anyone able to point me in the right direction with regards to techniques and patterns I should be using for this?
I can't really tell why you have a deadlock without seeing your code. I can think of some other options though:
Poll the thread to see its state (not as good).
Use some kind of eventing system. You would have an event listener (OverlappingContractEventListener perhaps) and then you would trigger the event from the thread when the scenario happens. The event handler would need to persist this information somehow.
If you are going for this approach, then on the client side you will need to poll.
You can poll a specific controller (using setInterval and AJAX) that looks up the corresponding information for the object to see what state it's in. This information should have been persisted by your event listener.
You can use web workers (supported in Chrome, Firefox, Safari, and Opera; IE will support them in version 10) and perform the polling in the background.
There is one other way that doesn't involve eventing. It depends on you figuring out the source of your deadlock though. Once you fix the source of your deadlock you can do one of two things:
Perform an AJAX call to the controller. The controller will wait for the service to return information. The code to issue feedback to the user will be inside the success handler of your controller.
Use a web worker to perform the call in the background. The web worker would also perform an AJAX call and wait for the response.
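The eventing option could be sketched like this; ContractEvents and the listener interface are invented names following the OverlappingContractEventListener idea above:

```java
import java.util.ArrayList;
import java.util.List;

public class ContractEvents {
    // Invented interface following the OverlappingContractEventListener idea.
    interface OverlappingContractEventListener {
        void onOverlap(String contractId);
    }

    private final List<OverlappingContractEventListener> listeners = new ArrayList<>();

    void register(OverlappingContractEventListener listener) {
        listeners.add(listener);
    }

    // Called from the queue-consuming thread when a save fails its overlap
    // check. A real listener would persist the failure so that client-side
    // polling can report it back to the user.
    void publishOverlap(String contractId) {
        for (OverlappingContractEventListener l : listeners) {
            l.onOverlap(contractId);
        }
    }
}
```

Because the event is published from the worker thread and merely recorded by the listener, no thread ever blocks waiting on another, which avoids the deadlock described in the question.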
Shouldn't you be doing the check for duplicate contracts in the database? Depending on the case, you can do this with a constraint, trigger, or stored procedure. If it fails, send an exception up the stack; that's normally the way to handle things like this. You can then catch the exception in jQuery and display an error:
jQuery Ajax error handling, show custom exception messages
Hope this helps.
I'm building a web service with a RESTful interface (let's call it MY_API). This service relies on another RESTful web service to handle certain aspects (call it OTHER_API). I'd like to determine what best practices I should consider for handling failures of OTHER_API.
Scenario
My UI is a single page javascript application. There are some fairly complex actions a user can take, which can easily take the user a minute or two to complete. When they are done, they click the SAVE button and MY_API is called to save the data.
MY_API has everything it needs to persist the information submitted by the user. However, there is an action that must take place that is handled by OTHER_API. For instance, OTHER_API might handle sending out emails, or perhaps it handles adding line items to my user's billing statement. In both cases, these are critical things that must be completed, but they don't have to happen right now; they just need to happen eventually.
If OTHER_API fails, I don't want to simply tell the user their action has failed, as they spent a lot of time doing it and this will make the experience less than optimal.
Questions
So should I create some sort of Message or Event Queue that can save these failed REST requests to OTHER_API and process them later?
Any advice or suggestions on techniques to go about saving REST requests for delayed processing?
Is there a recommended open source message queue solution that would work for this type of scenario with JSON-based REST web services? Java is preferred as my backend is written in it.
Are there other techniques I should consider?
Rather than approaching this by focusing on the failure state, it'd be faster and more robust to recognize that these actions should be performed asynchronously and out-of-band from the UI's request. You should indeed use a message/event/job queue: pop those jobs onto the queue as quickly as possible and respond to the original request as quickly as possible. Once you've done that, the asynchronous job can be performed independently of the original request and at its own pace, including retries as needed.
If you want your API to indicate that there are aspects of the request which have not completed, you can use the HTTP response Status Code 202 (Accepted).
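As a minimal sketch of that queue (all names invented; the Call interface stands in for an HTTP request to OTHER_API): jobs are enqueued as fast as possible, and a periodic drain pass retries anything that still fails, so the user's save never depends on OTHER_API being up.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RetryQueue {
    // Stand-in for one REST request to OTHER_API; true means it was accepted.
    interface Call { boolean attempt(); }

    private final Deque<Call> pending = new ArrayDeque<>();

    // Called from the request handler: cheap, never fails the user's save.
    void enqueue(Call call) { pending.add(call); }

    // One drain pass, e.g. run by a scheduled background job. Calls that
    // still fail are put back for the next pass. Returns how many completed.
    int drainOnce() {
        int done = 0;
        for (int i = pending.size(); i > 0; i--) {
            Call c = pending.poll();
            if (c.attempt()) done++;
            else pending.add(c);  // retry on a later pass
        }
        return done;
    }

    int pendingCount() { return pending.size(); }
}
```

A production version would persist the queue (e.g. to a broker or database) so pending calls survive restarts, but the retry loop has the same shape.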