I'm implementing a basic Task Queue in Google App Engine. Nothing fancy, as I'm still learning the basics. The (push) queue works fine, but I'd like to send a short confirmation (or failure) message to the user of the relevant HTTP session when the task is finished.
This is how my structure is set up:
an HttpServlet receives an incoming HttpServletRequest
some info is retrieved from the HttpSession and is used to produce a task + store it in the task queue
another HttpServlet (the worker) receives an incoming request from the queue to execute the task
task gets executed
(this is the step I didn't manage to implement:) Send some info to a .jsp for the HttpSession that inserted the task in the queue
Normally I would do this by retrieving the session from the HttpServletRequest, but in this case, the one initiating the request is the queue itself (and not the user who initiated the task). I can't pass the HttpSession as an array of bytes to the parameter since I need to keep a reference to the same session.
I was able to pass the id from the session as a parameter to my worker, but I couldn't figure out how to find the reference back to the session through this id.
It's possible something could be done with FutureValue here: https://github.com/GoogleCloudPlatform/appengine-pipelines/tree/master/java/src/main/java/com/google/appengine/tools/pipeline
but I'm completely lost on how it's used.
Saving http sessions per user in a HashMap / datastore seems like bad practice since the user might not want to keep the data.
So, any ideas on how to send an asynchronous message back to the user that initiated the task upon task completion/failure?
We implemented our own session system on top of App Engine, but I believe what we did would work the same for you. When the job starts, save an entity in the datastore linked to the user's session id, and make sure to timestamp it. Then poll from the client side to check whether the job has finished. Update the job's status when it completes, and purge the entity the next time the client polls for the status. To clean up data left behind when a user closes the session, use an App Engine cron job that deletes old entities after a set time.
This has worked well for our implementation. It is a bit more work, but that's what you are dealing with when your tasks are async.
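A minimal sketch of that polling flow, assuming an in-memory map as a stand-in for the datastore entity. The class and method names (JobStatusStore, start, finish, poll, purgeOlderThan) are made up for illustration and are not App Engine APIs; the time is passed in explicitly so the logic is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the datastore entity described above.
class JobStatusStore {
    enum State { RUNNING, DONE, FAILED }

    // One record per session: the job state plus the time the job was started.
    static final class Job {
        final State state;
        final long startedAtMillis;

        Job(State state, long startedAtMillis) {
            this.state = state;
            this.startedAtMillis = startedAtMillis;
        }
    }

    private final Map<String, Job> jobs = new ConcurrentHashMap<>();

    // Called by the servlet that puts the task on the queue.
    void start(String sessionId, long nowMillis) {
        jobs.put(sessionId, new Job(State.RUNNING, nowMillis));
    }

    // Called by the worker servlet when the task ends (session id passed as a task parameter).
    void finish(String sessionId, boolean success) {
        jobs.computeIfPresent(sessionId, (id, job) ->
                new Job(success ? State.DONE : State.FAILED, job.startedAtMillis));
    }

    // Called by the client's polling request; a finished job is purged once it has been reported.
    State poll(String sessionId) {
        Job job = jobs.get(sessionId);
        if (job == null) {
            return null; // nothing known for this session
        }
        if (job.state != State.RUNNING) {
            jobs.remove(sessionId);
        }
        return job.state;
    }

    // Called by the cron job: drop records older than maxAgeMillis and report how many were removed.
    int purgeOlderThan(long nowMillis, long maxAgeMillis) {
        int before = jobs.size();
        jobs.values().removeIf(job -> nowMillis - job.startedAtMillis > maxAgeMillis);
        return before - jobs.size();
    }
}
```

Only the session id travels with the task, so the worker never needs a reference to the HttpSession itself.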
In a Java web service, I need to return a response to the user once a configured threshold time is reached, and then continue processing after that.
Let's say my service performs step 1 and step 2, and the configured threshold is 1 second. If step 1 completes at the 1-second mark, I want to return an acknowledgment response to the user, continue processing with step 2, and store the final response in a DB or something similar.
Please let me know if anyone has any solutions or thoughts on this problem
There are multiple ways to achieve this
HTTP Layer
On the HTTP layer, if the response comes back before the threshold, then I'd be tempted to send back a 200 OK.
However, if it takes more time than the threshold, you could use 202 Accepted.
Looking at RFC 7231, its use case looks like this:
6.3.3. 202 Accepted
The 202 (Accepted) status code indicates that the request has been
accepted for processing, but the processing has not been completed.
The request might or might not eventually be acted upon, as it might
be disallowed when processing actually takes place. There is no
facility in HTTP for re-sending a status code from an asynchronous
operation.
The 202 response is intentionally noncommittal. Its purpose is to
allow a server to accept a request for some other process (perhaps a
batch-oriented process that is only run once per day) without
requiring that the user agent's connection to the server persist
until the process is completed. The representation sent with this
response ought to describe the request's current status and point to
(or embed) a status monitor that can provide the user with an
estimate of when the request will be fulfilled.
Now, of course, instead of having a mix of 200 and 202, you could just return 202 every time.
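As a rough sketch of the 200-versus-202 decision, assuming the work is wrapped in a CompletableFuture: wait up to the threshold, answer 200 if it finished, and answer 202 otherwise while the work keeps running. ThresholdResponder and statusFor are hypothetical names, and mapping failures to 500 is just one possible choice.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ThresholdResponder {
    // Returns the HTTP status to send: 200 if the work finished within the
    // threshold, 202 if it is still running when the threshold expires.
    static int statusFor(CompletableFuture<?> work, long thresholdMillis) {
        try {
            work.get(thresholdMillis, TimeUnit.MILLISECONDS);
            return 200;
        } catch (TimeoutException e) {
            // The future is not cancelled here, so processing continues after we respond.
            return 202;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return 500;
        } catch (ExecutionException e) {
            return 500; // the work itself failed before the threshold
        }
    }
}
```

The caller would start the work with something like CompletableFuture.supplyAsync(...) and attach a later stage that stores the final result in the DB.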
Application Layer
In your application layer, you'll typically want to make use of asynchronous processing for this purpose.
There are multiple ways to work this way. You can:
Post a message on a queue/topic and let a message broker take care of dispatching it to another part of the app, or another app and let this part do the processing
Save the request inside of a database, and have another service poll the database for new requests, similar to queueing explained above, without JMS
If you're using Java EE, your EJB container allows you to work with @Asynchronous, which will call a method asynchronously and return (so you'll be able to return 202)
If you're using Spring, it has an @Async annotation for the same purpose as above
There are definitely other methods you could use for this use case, but I think the ones I presented are the most common.
My program is a notification service. It basically receives HTTP requests (clients send notifications) and forwards them to a device.
I want it to work the following way:
receive client notification request
save it to the database (yes, I need this step, it's mandatory)
async threads watch new requests in database
async threads forward them to the destination(device).
In this case the program can send the client a confirmation straight away after step 2,
without waiting for the destination to respond (device response time can be too long).
If I stored client notifications in memory, I would use a BlockingQueue. But I need to persist my notifications in the DB. Also, I cannot use message queues, because clients want REST endpoints to send notifications.
Help me to work out the architecture of such a mechanism.
PS In Java, Postgresql
Here are some ideas that can lead to the solution:
Step 2 is probably mandatory to make sure that the request is persisted so that it can be queried later. So we're talking about some "data model" here.
With this in mind, if you "send" the confirmation "right away after step 2", what if later you want to do some action with this data (say, send it somewhere) and this action doesn't succeed? You store it on disk? What happens if the disk is full?
The most important question is what happens to your data model (in the database) in this case. Should the entry in the database still be there, or has the whole "logical" action failed? This is something you should figure out; depending on the actual system, the answers can be different.
The most "strict" solution would use transactions in the following (schematic) way:
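tr = openTransaction();
try {
    saveRequestIntoDB(data);      // step 2: persist the request
    forwardToDestination(data);   // step 4: forward to the device
    tr.commit();
} catch (SomeException ex) {
    tr.rollback();                // undo the insert if anything above failed
}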
With this design, if something goes wrong during the "saveRequest" step, nothing will happen. If the data is stored in the DB but forwardToDestination then fails, the transaction will be rolled back and the record won't be stored in the DB.
If all the operations succeed, the transaction will be committed.
Now, it looks like you can still use a messaging system in step 4. Sending a message can be fast and won't add any significant overhead to the whole request.
On the other hand, the benefits are obvious:
- Who listens to these "notifications"? If you send something and only one service should receive and process the notification how do you make sure that others won't get it? How would you implement the opposite - what if all the services should get the notification and process it independently?
These facilities are already implemented by any decent messaging system.
I can't really understand the statement:
I cannot use Message Queues, because clients want rest endpoints to send notifications.
Since the whole flow is originated by the client's request, I don't see any contradiction here. The code that is called from the REST endpoint (which is, after all, a logical entry point that you implement) can call the database, persist the data, and then send the notification...
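To make the "persist, then hand off" idea concrete, here is a rough sketch. It rests on several assumptions: a map plays the role of the PostgreSQL table, an in-process BlockingQueue plays the role of the messaging system, and the device client is just a Consumer<String>. NotificationService and its methods are illustrative names only.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

class NotificationService {
    enum Status { PENDING, SENT, FAILED }

    // Stand-ins (assumptions): maps play the database table, the queue plays the broker.
    private final Map<Long, String> payloads = new ConcurrentHashMap<>();
    private final Map<Long, Status> statuses = new ConcurrentHashMap<>();
    private final BlockingQueue<Long> outbox = new LinkedBlockingQueue<>();
    private final AtomicLong ids = new AtomicLong();
    private final Consumer<String> device; // hypothetical device client

    NotificationService(Consumer<String> device) {
        this.device = device;
    }

    // Called from the REST endpoint: persist first, then enqueue, then confirm.
    long accept(String payload) {
        long id = ids.incrementAndGet();
        payloads.put(id, payload);
        statuses.put(id, Status.PENDING);
        outbox.add(id);
        return id; // the endpoint can confirm to the client at this point
    }

    // One worker step: forward the next queued notification, if there is one.
    // A worker thread would call this in a loop (or block on outbox.take()).
    boolean forwardNext() {
        Long id = outbox.poll();
        if (id == null) {
            return false;
        }
        try {
            device.accept(payloads.get(id));
            statuses.put(id, Status.SENT);
        } catch (RuntimeException e) {
            statuses.put(id, Status.FAILED); // the record stays stored, so it can be retried
        }
        return true;
    }

    Status statusOf(long id) {
        return statuses.get(id);
    }
}
```

The client still talks to a REST endpoint only; the queue is an internal detail between the endpoint and the forwarding workers.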
I have a task to implement a distributed Queuing System something like the Amazon SQS.
If there is GET Request, I have to deliver the message to the user from the main queue and put the message in the invisible queue. And immediately a DELETE Request should come and I should delete the message from the invisible queue.
In case there is no DELETE Request, I am supposed to increase the redelivery count and send the message back to the main queue. This will happen till the redelivery count becomes 5 after which I will delete the message permanently.
Now my doubt is, how do I know that there has been no DELETE request which means that I should send the message back to the main queue?
My program works for the case where the DELETE Request follows the GET Request. I am using java for this implementation.
First of all, at the design level, the get and delete should be done in one action. Notice that in the JDK, the poll() operation of Queue does both get and delete. If you insist on separate actions, at the very least you should support an optional get-and-delete request type.
Now, there is a problem when you want to detect an action that did not happen, because it can forever "maybe happen in the future". So you need to set a window of time after which you decide that the expected action did not happen.
What is usually done is that you attach a "received" timestamp to the request (and also a redelivery count) before putting it in the invisible queue (a better name would be "pending delete requests" queue). You can wrap the request in a custom Java class that adds these properties.
Actually, I don't think a queue is a good choice of collection here. When a delete request does come, you need random access to the request, so perhaps a hash map is a better choice.
You will need to implement a Timer that invokes a task every x seconds. The task scans the pendingDeleteRequests map for requests that did not receive a delete within the allowed window of time, removes them from the map, and returns them to the main queue (or drops them once the redelivery limit is reached).
Last note: some messaging systems have a "dead letter" feature, which is a destination where notices of failed deliveries are sent. This will help in debugging problems.
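The points above could be sketched roughly as follows. This is only an illustration: RedeliveryQueue, Message, and sweep are made-up names, the current time is passed in explicitly, and the limit is read as "drop the message when its redelivery count reaches 5".

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Queue;

class RedeliveryQueue {
    static final int MAX_REDELIVERIES = 5;

    // Wrapper that adds the bookkeeping properties to the message.
    static final class Message {
        final String id;
        final String body;
        int redeliveries;      // how many delivery windows expired without a DELETE
        long receivedAtMillis; // when it was last handed to a consumer

        Message(String id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    private final Queue<Message> mainQueue = new ArrayDeque<>();
    private final Map<String, Message> pendingDelete = new HashMap<>();

    synchronized void put(String id, String body) {
        mainQueue.add(new Message(id, body));
    }

    // GET: hand out the head of the main queue and park it until a DELETE arrives.
    synchronized Message get(long nowMillis) {
        Message m = mainQueue.poll();
        if (m == null) {
            return null;
        }
        m.receivedAtMillis = nowMillis;
        pendingDelete.put(m.id, m);
        return m;
    }

    // DELETE: random access by id is why a map is used instead of a queue.
    synchronized boolean delete(String id) {
        return pendingDelete.remove(id) != null;
    }

    // Timer task: a message not deleted within the window gets its count increased
    // and goes back to the main queue, unless the count has reached the limit.
    synchronized void sweep(long nowMillis, long windowMillis) {
        Iterator<Message> it = pendingDelete.values().iterator();
        while (it.hasNext()) {
            Message m = it.next();
            if (nowMillis - m.receivedAtMillis < windowMillis) {
                continue; // still inside its window
            }
            it.remove();
            m.redeliveries++;
            if (m.redeliveries < MAX_REDELIVERIES) {
                mainQueue.add(m);
            } // otherwise the message is dropped permanently
        }
    }
}
```

A ScheduledExecutorService could call sweep(System.currentTimeMillis(), window) every few seconds via scheduleAtFixedRate, which plays the role of the Timer.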
I need to run a time-consuming task from a controller. To do it, I have implemented an @Async method in my service so that the controller can return immediately (for example with a 202 Accepted status).
The problem is that the task needs access to some session-scoped beans. With this approach I am getting org.springframework.beans.factory.BeanCreationException: Error creating bean with name (...): Scope 'session' is not active for the current thread (...).
I get the same result when I manually create an ExecutorService instead of using @Async.
Is it possible to somehow make a worker thread attached to the current session?
EDIT
The purpose is to implement a bulk operation, providing a way to monitor the status of processing. Something like described in this answer: https://stackoverflow.com/a/28787774/718590
If I run it synchronously, there will be no indication of the status (how many items processed), and a request timeout may occur.
If I understand correctly, you want to start a long-running asynchronous process from a Spring web application and be able to follow its progress from the session that started it. The processing could also use beans contained in the session.
For a good separation of concerns, I would never have an asynchronous thread know about a session. The session is related to HTTP and can be destroyed at any time before the thread can finish (or even begin, in race conditions) its processing.
IMHO, a correct design would be to create a class containing all the information shared between the web part and the asynchronous processing: the status (whatever it can be), the user that started the processing if it is relevant, and every other relevant piece of information. In your controller (or preferably in the service method called by the controller), you prepare an object of that class and pass it to the @Async method. Then, before returning, the controller stores the object in the session. That way:
the asynchronous processing has all its required information, even if the session is destroyed later. It does not need to know the session; it only cares about its processing and updates its status
the session of the web application knows that the asynchronous processing is running, knows how it was started, and knows what the current status is
It can be adapted to your real problem, but this should meet your requirements.
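A minimal sketch of such a shared object in plain Java, assuming CompletableFuture.runAsync as a stand-in for the @Async method. BulkJobStatus and BulkService are hypothetical names, not Spring classes.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical status holder shared by the web layer and the background task.
class BulkJobStatus {
    final String startedBy;
    final int total;
    private final AtomicInteger processed = new AtomicInteger();
    private volatile boolean finished;

    BulkJobStatus(String startedBy, int total) {
        this.startedBy = startedBy;
        this.total = total;
    }

    void itemDone() { processed.incrementAndGet(); }
    void markFinished() { finished = true; }
    int processed() { return processed.get(); }
    boolean isFinished() { return finished; }
}

class BulkService {
    // Stands in for the @Async service method: it receives the status object
    // and never touches the HTTP session.
    CompletableFuture<Void> process(List<String> items, BulkJobStatus status) {
        return CompletableFuture.runAsync(() -> {
            for (String item : items) {
                // ... the real per-item work would go here ...
                status.itemDone();
            }
            status.markFinished();
        });
    }
}
```

The controller would create the BulkJobStatus, pass it to process(...), store the same object in the session (for example with session.setAttribute), and return 202. A later status request reads processed() and total from the session copy.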
I want to make an AJAX call to my Java webapp. The Java webapp will in turn make an asynchronous return call elsewhere. The result of that call will then be returned as the result of AJAX request.
The crux of my question is what would I do with the HttpRequest whilst I'm waiting for the second call to return?
Do I just block and wait for the call within the AJAX handler method or do I store the request somewhere and wait for a callback? How would I handle errors / timeouts?
For those who care further information as to how I arrived at this situation follows:
This is part of an XMPP-based instant messaging system. There is one global support user which is displayed as an icon on every page in our webapp. I also want to display the presence of this user. I could just use the IM system to request this user's presence on every single page load for every user, and eventually DDoS myself. Instead, I want to have a single user query the presence from the webapp periodically and cache the result.
The AJAX call is therefore to the server which will then either return the cached presence or query the XMPP server asynchronously.
You shouldn't block on the AJAX call in the browser; that is, don't make the AJAX call synchronously. On the Java side, however, find a way to block until the response to your asynchronous call comes back (in other words, make that request effectively synchronous). The performance hit falls only on the first call for any new data; subsequent calls hit the cache, so you should be fine. Maintain a cache for this data and check it first: if the data isn't there, make the call and store the result in the cache; otherwise, take the data from the cache and send it back to the view. Since AJAX is asynchronous, your callback will be called as soon as the data comes back from the server.
Here is what I would do:
when the page starts up, start a job to retrieve the data you need for that specific page; you need to identify the job and the job result for later use
use AJAX from the page to poll for the job result; once the job is done, the polling finishes and returns the data
cache the entries you have requested, as Vivin indicated
cache the job result on your server and give it a time-out option
HTTP requests, i.e. HttpServletRequest objects, are not serializable. Therefore you cannot store them in a persistent store of any sort for the duration of the call. It doesn't make sense to store the request anyway, since its life is limited to the duration of the HTTP request itself, given the stateless nature of the HTTP protocol.
This effectively means that you have to hold on to the HttpServletResponse object for the duration of the call. The HttpServletRequest object is no longer needed, once the parsing of the HTTP request is performed, and once all the data is available to your application; it is the response object that is of importance in your context.
The response could be populated with the cached copy of the user status. If the copy in the cache is stale, you might want to refresh it synchronously from the XMPP server (after all, it affects the performance of just one page load). You could query asynchronously from within the application server, but some result must be returned to the browser (so there might be a few edge cases that need to be taken care of).
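A small sketch of the "serve from cache, refresh synchronously when stale" approach. PresenceCache is a made-up name, the Supplier stands in for whatever call actually queries the XMPP server, and the clock is injected so that staleness is easy to reason about.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class PresenceCache {
    private final Supplier<String> xmppQuery; // hypothetical blocking call to the XMPP server
    private final LongSupplier clock;         // e.g. System::currentTimeMillis
    private final long ttlMillis;
    private String cached;
    private long fetchedAtMillis;

    PresenceCache(Supplier<String> xmppQuery, LongSupplier clock, long ttlMillis) {
        this.xmppQuery = xmppQuery;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    // Called from the AJAX handler: returns the cached presence, refreshing it
    // first if it is missing or older than the TTL. Only the request that finds
    // a stale entry pays for the round trip to the XMPP server.
    synchronized String get() {
        long now = clock.getAsLong();
        if (cached == null || now - fetchedAtMillis >= ttlMillis) {
            cached = xmppQuery.get();
            fetchedAtMillis = now;
        }
        return cached;
    }
}
```

Because get() is synchronized, concurrent page loads that arrive during a refresh wait for that one query instead of each hitting the XMPP server.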