I'm working on a relatively simple workflow using Amazon's Flow framework for Java. I think I have a decent grasp of everything that's going on right now, but I have one area I'm still uncertain about: how should I go about handling timeouts?
The main timeout with my workflow is the executionStartToCloseTimeoutSeconds on the workflow itself, but I'd imagine the process is the same regardless of which timeout fires. It seems that most of the time, when the task times out, it just kind of disappears. I'd like to be able to know when this happens and do something (e.g. send an e-mail or log it somehow). I searched around and couldn't find any example of anything being notified that a timeout happened.
An activity timeout is delivered to the workflow code in the form of an exception and can be handled easily.
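For example, here is a minimal sketch of catching that exception inside the workflow. MyActivitiesClient is the client Flow generates from your activities interface, and processData() / sendTimeoutAlert() are hypothetical activities; TryCatch and ActivityTaskTimedOutException are Flow framework types:

```java
import com.amazonaws.services.simpleworkflow.flow.ActivityTaskTimedOutException;
import com.amazonaws.services.simpleworkflow.flow.core.TryCatch;

public class MyWorkflowImpl {

    // Hypothetical generated activities client with hypothetical activities.
    private final MyActivitiesClient activities = new MyActivitiesClientImpl();

    public void run() {
        new TryCatch() {
            @Override
            protected void doTry() throws Throwable {
                activities.processData(); // may hit its activity timeout
            }

            @Override
            protected void doCatch(Throwable e) throws Throwable {
                if (e instanceof ActivityTaskTimedOutException) {
                    activities.sendTimeoutAlert(); // e.g. send the e-mail / log it
                } else {
                    throw e; // rethrow anything we don't handle
                }
            }
        };
    }
}
```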
IMHO the workflow execution timeout is similar to kill -9 in Unix: it kills the workflow without giving it a chance to perform cleanup. So its main use is to ensure that broken workflow instances do not stay open forever.
For business-level timeouts, do not rely on workflow timeouts; use timers instead. When a timer fires, your workflow code can execute a notification activity and then complete the workflow with an appropriate failure status (see the sketch after the link below).
http://docs.aws.amazon.com/amazonswf/latest/developerguide/swf-timeout-types.html
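To make that concrete, here is a hedged sketch of a business-level timeout built from a timer, racing the real work against the timer. doWork() and notifyTimeout() are hypothetical activity stubs on a hypothetical MyActivitiesClient; WorkflowClock, OrPromise, @Asynchronous and @NoWait come from the Flow framework:

```java
import com.amazonaws.services.simpleworkflow.flow.DecisionContextProviderImpl;
import com.amazonaws.services.simpleworkflow.flow.WorkflowClock;
import com.amazonaws.services.simpleworkflow.flow.annotations.Asynchronous;
import com.amazonaws.services.simpleworkflow.flow.annotations.NoWait;
import com.amazonaws.services.simpleworkflow.flow.core.OrPromise;
import com.amazonaws.services.simpleworkflow.flow.core.Promise;

public class MyWorkflowImpl {

    private final WorkflowClock clock =
            new DecisionContextProviderImpl().getDecisionContext().getWorkflowClock();
    private final MyActivitiesClient activities = new MyActivitiesClientImpl();

    public void run() {
        Promise<Void> timerFired = clock.createTimer(30 * 60); // 30-minute business timeout
        Promise<Void> workDone = activities.doWork();
        // Proceed as soon as either the timer fires or the work completes.
        raceWorkAgainstTimer(new OrPromise(timerFired, workDone), workDone);
    }

    @Asynchronous
    void raceWorkAgainstTimer(Promise<?> trigger, @NoWait Promise<Void> workDone) {
        if (!workDone.isReady()) {
            activities.notifyTimeout(); // e.g. send the e-mail
            throw new RuntimeException("business timeout exceeded"); // fail the workflow
        }
    }
}
```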
For activity related timeouts, the short answer is that your decider (i.e. workflow) logic should handle it. You should not have to worry about things timing out once you validate the logic and have retries in place.
For workflow timeouts you will need to inspect the workflow history / state to figure out that it timed out. You can definitely list workflow executions but you probably have to go through the SWF API directly (i.e. not through Flow). You want to do this anyway to catch failed workflows.
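For example, here is a sketch of polling for timed-out executions through the SWF SDK directly; the domain name and the one-day window are assumptions:

```java
import java.util.Date;
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflow;
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient;
import com.amazonaws.services.simpleworkflow.model.*;

public class TimedOutWorkflowFinder {
    public static void main(String[] args) {
        AmazonSimpleWorkflow swf = new AmazonSimpleWorkflowClient();

        // Ask for executions that closed with TIMED_OUT status in the last day.
        ListClosedWorkflowExecutionsRequest request =
            new ListClosedWorkflowExecutionsRequest()
                .withDomain("MyDomain") // hypothetical domain name
                .withCloseStatusFilter(
                    new CloseStatusFilter().withStatus(CloseStatus.TIMED_OUT))
                .withCloseTimeFilter(new ExecutionTimeFilter()
                    .withOldestDate(new Date(System.currentTimeMillis() - 86_400_000L)));

        WorkflowExecutionInfos infos = swf.listClosedWorkflowExecutions(request);
        for (WorkflowExecutionInfo info : infos.getExecutionInfos()) {
            // This is where you would send the e-mail / write the log entry.
            System.out.println("Timed out: " + info.getExecution().getWorkflowId());
        }
    }
}
```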
A pattern I've used and seen used with SWF is to keep an external record (think a DB) of the work you've dispatched through SWF, and use it to check on work that was started but never completed. The workflow itself updates this record when it completes (or as it completes major pieces of work), so it's trivial to figure out which workflows are problematic.
Related
I am writing a Java application where an image should change whenever the data changes.
My colleagues are asking me to write a scheduler that calls a GET API every second.
My suggestion is to use pub-sub, so that the data is updated only when an event actually happens.
Are a subscriber and a scheduler one and the same?
Publish/subscribe is a nicer option, theoretically.
The differences:
Polling is a kind of busy waiting, and with multiple clients it causes superfluous network traffic. The client is the active party.
Publish/subscribe needs an active server that pushes notifications to all subscribers. There is by now sufficient support for this in HTML5/JavaScript and in Java. The server is the active party.
Unfortunately publish/subscribe will probably be a bit harder to realize. It would be best to make a proof of concept in a separate application; issues like asynchronous Ajax may come up.
Also, some publish/subscribe libraries might still use polling under the hood on the client side, instead of push notifications.
So your colleagues' advice might be based on the simpler, less problematic implementation.
Depending on the leeway you are given, and in the interest of architectural research: a prototype with a load test for both implementations would be fine. Hope never dies.
It's not the same:
A scheduler is when you explicitly choose when to make the request. You can do it every second, every minute, or whatever; each time you create a new request.
Pub-sub is when you create a permanent connection to the source of events and consume each event as it is published. You don't have multiple requests here; it's more like a socket connection.
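To make the difference concrete, here is a minimal, self-contained Java sketch; the DataSource class is a hypothetical in-memory stand-in for your real event source:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class PollingVsPubSub {

    /** Hypothetical data source supporting both styles. */
    static class DataSource {
        private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
        private volatile String data = "initial";

        String getData() { return data; }                                 // pull
        void addChangeListener(Consumer<String> l) { listeners.add(l); }  // push registration

        void publish(String newData) {                                    // fires on a real event
            data = newData;
            listeners.forEach(l -> l.accept(newData));
        }
    }

    static void updateImage(String data) {
        System.out.println("updating image for: " + data);
    }

    public static void main(String[] args) throws InterruptedException {
        DataSource source = new DataSource();

        // 1. Scheduler (polling): the client actively re-requests every second,
        //    whether or not anything changed.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> updateImage(source.getData()), 0, 1, TimeUnit.SECONDS);

        // 2. Pub-sub: the client subscribes once; work happens only on an event.
        source.addChangeListener(PollingVsPubSub::updateImage);

        source.publish("changed"); // only the subscriber reacts immediately
        Thread.sleep(3000);
        scheduler.shutdown();
    }
}
```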
I have a user workflow where at a specific time a webservice is called, and the results are presented to the user.
According to the search request and the queried results, I want to perform some database updates and statistic logging.
As the workflow pauses while the webservice is requested, I thought about creating some kind of background thread that performs these database actions, while the user can already continue the workflow without having to wait for database actions to complete.
Do you think this is a good practice? How could I create such one-time background threads?
If you only need the work to run in the background, then an ExecutorService is a good solution.
If you need to ensure that queued requests survive events like a server restart, then you need a persistent queue like a JMS Queue. There are some nice, free open source JMS implementations that serve this purpose.
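A minimal sketch of the ExecutorService approach, assuming hypothetical callWebService(), updateDatabase() and logStatistics() helpers:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SearchWorkflow {

    // Application-scoped pool; shut it down when the application stops.
    private static final ExecutorService BACKGROUND = Executors.newFixedThreadPool(2);

    public String search(String query) {
        String result = callWebService(query);  // the user waits only for this

        BACKGROUND.submit(() -> {               // fire-and-forget bookkeeping
            updateDatabase(query, result);
            logStatistics(query, result);
        });

        return result;                          // workflow continues immediately
    }

    private String callWebService(String query) { return "results"; } // hypothetical
    private void updateDatabase(String query, String result) { }      // hypothetical
    private void logStatistics(String query, String result) { }       // hypothetical
}
```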
If the service call takes little time (say 1 or 2 seconds), then it is a waste of effort to develop such a feature.
If it takes a significant amount of time, you should do this in the background.
I have a server application which, somewhat simplified, periodically takes measurements via a REST API from a not-beefy-enough server. The values should be cached locally (they are timestamped and immutable), maybe stored as a FloatBuffer where every position corresponds to a measurement sample. There's a web browser application which periodically makes Ajax requests to update some neat statistics on the webpage.
Assuming that the server is up and running, there are still many places where errors could occur:
The REST measurement server could be unreachable (in which case the server just keeps storing measurements locally)
The network connection to the measurement server could be down
The storage could be full or somehow corrupt
The browser could lose contact with the server and have to re-establish it
My strategy for coping with errors in general should be the following:
If there are problems getting values from the measurement service via REST, there should be retries every minute. If the error persists for more than 30 minutes consecutively, the administrator should be notified. In case of disk problems the administrator should be notified at once, or preferably even before the disk fills up.
The end user experience should be as insulated from the errors as possible, but the application should still function as sanely as it can, notifying the user that an error has occurred while still showing the latest data available.
How do I find out which errors to cope with regarding network problems (using clj-http via an agent triggered by a ScheduledThreadPoolExecutor job to make the REST requests) and regarding disk problems when trying to flush the FloatBuffer?
What is a sane way to implement the quite stateful yet algorithmic strategy mentioned above? Should I simply handle the error when the agent reports it and switch to some kind of a recovery-mode job?
In an interaction like this, involving several components across different systems, the end user should be shielded from performing many synchronous operations. It is only the sync operations that are time-bound and require immediate error reporting.
Once the end user's interaction with the system is async, you have a lot of choice in the error handling mechanism too. At the point where the end user interacts with the system, you can have an error mapper that translates all the errors coming from the various components into messages the user can understand.
The user should be given an API to query the status of the request they submitted. That should be able to tell whether the request is complete or whether there is an error. If network connections are going to take more time, the status message can inform the user of that.
Every component in a distributed system will report errors at some point. Some APIs provide error listener interfaces for this, which report errors to the user asynchronously. Have a look at APIs like JMS (http://docs.oracle.com/javaee/5/tutorial/doc/bnceh.html); they have proven themselves in complex systems and have good error handling mechanisms.
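As a concrete sketch of the strategy stated in the question (retry every minute, escalate after 30 consecutive failures), here it is in plain Java; fetchMeasurements() and notifyAdministrator() are hypothetical stand-ins for the clj-http call and the alerting mechanism:

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class MeasurementPoller {

    private static final int MAX_CONSECUTIVE_FAILURES = 30; // 30 minutes at one try/minute
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);

    public void start() {
        executor.scheduleAtFixedRate(this::pollOnce, 0, 1, TimeUnit.MINUTES);
    }

    private void pollOnce() {
        try {
            fetchMeasurements();        // REST call; may throw on network trouble
            consecutiveFailures.set(0); // a success resets the streak
        } catch (Exception e) {
            if (consecutiveFailures.incrementAndGet() == MAX_CONSECUTIVE_FAILURES) {
                notifyAdministrator(e); // escalate exactly once per outage
            }
        }
    }

    private void fetchMeasurements() throws Exception { /* hypothetical REST call */ }
    private void notifyAdministrator(Exception cause)  { /* hypothetical alert   */ }
}
```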
I am building a web application which has Java as a front end and shell scripts as a back end. The concept is that I need to process multiple files in the back end. I will get a date range from the user (for example, July 1st-8th) and process around 100 files for each day, so in total I have 800 files to process.
I get these details from a JSP, delegate a background call to the shell script, get back the results, and display them to the user.
Right now I do all of this in a sequential approach, by which I mean without threads: there is only one main thread executing, and the user has to wait until all 800 files are processed sequentially. This is really slow, so I am thinking of going with threads. Since I am a beginner with threads, I read up on the subject and came up with the following idea:
From what I read, the work has to be split among threads, so I thought of splitting the 8-day job across 4 threads, where each thread performs 2 days' worth of work.
I would like to know whether I am following a correct approach and my major concerns are:
Is it recommended to spawn multiple threads from a web application?
Is this a good approach at all?
Some guidance on how to proceed would help; an example would be great. Thank you.
Yes, you can run the long processing job multi-threaded or in any high-performance environment. You should also use Servlet 3.0 Asynchronous Request Processing to suspend the request thread and wait until the long processing task is done.
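A minimal sketch of that: the container thread is released while the long job runs on an application-owned pool; processFiles() and the pool size are assumptions, not part of the Servlet API:

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/process", asyncSupported = true)
public class ProcessingServlet extends HttpServlet {

    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync(); // suspends the request
        ctx.setTimeout(10 * 60 * 1000);      // allow long-running work

        pool.submit(() -> {
            try {
                String result = processFiles(req.getParameter("from"),
                                             req.getParameter("to"));
                ctx.getResponse().getWriter().write(result);
            } catch (IOException e) {
                // real code should report the failure to the client
            } finally {
                ctx.complete();              // resumes and finishes the response
            }
        });
    }

    private String processFiles(String from, String to) {
        return "done";                       // hypothetical placeholder for the real work
    }
}
```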
Yes, there's nothing wrong with spawning multiple threads from a web application. In fact, if you're running a Servlet container (which you most likely are since you're using Java), it's already spawning multiple threads for you. In general a Servlet container will automatically spawn a new thread (or reuse one out of a pool) to handle each request it receives.
Your approach is fine, though you'll want to fine-tune the number of threads to something suitable given the hardware configuration of your system and the amount of concurrent load on your web service. Also note that while spinning up a bunch of threads will reduce the total amount of time needed to process all the data, it will still leave a potentially large chunk of time before any data is ready to go back to the user. So you might get a better result by doing smaller work units sequentially and posting each batch of results to the user interface as soon as it is ready. Then it will still take a long while for the user to have all the data, but they can start viewing at least a portion of it almost immediately.
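Here is a sketch of that idea: one task per day submitted to a 4-thread pool, so the pool balances the load and each result can be forwarded to the UI as it completes; processDay() is a hypothetical stand-in for the per-day shell-script call:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class DayRangeProcessor {

    public List<String> processRange(List<String> days) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4); // tune to your hardware
        List<Future<String>> futures = new ArrayList<>();

        for (String day : days) {           // one task per day, 4 run at a time
            futures.add(pool.submit(() -> processDay(day)));
        }

        List<String> results = new ArrayList<>();
        for (Future<String> f : futures) {
            results.add(f.get());           // could also stream each batch to the UI here
        }
        pool.shutdown();
        return results;
    }

    private String processDay(String day) {
        return "processed " + day;          // hypothetical placeholder for the real work
    }
}
```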
The way to improve the user experience is not to parallelize at the Servlet level across 100000 threads, but rather to provide incremental rendering of the view. First of all, it would be useful to separate your application into multiple layers, following the MVC pattern for example.
That said, you will have to look at how to:
Create a service that can return partial answers plus a final answer, meaning that all available data has been returned. Each of these answers can be computed in parallel to improve performance.
Fill a web page incrementally, typically by calling back this service, which returns a JSON string you use to add data to the DOM. Every time you get an answer, if it is a partial answer, you call the service again, providing the previous sequence number.
If you look at Liligo, you will see how this works. The technique I described is known as polling, but there are other techniques for obtaining similar asynchronous results at the UI level. In general, you don't want to work directly with the Servlet API, which is a very low-level API, but rather use a reasonable framework or abstraction for that.
If you want a word of advice, have a look at the Play! framework's HTTP streaming: http://www.playframework.org/documentation/2.0.2/JavaStream
Creating threads in a web application is not a good solution. It is bad design because normally it is the container (web server) that is in charge of that activity, so I think you have to find another solution.
I suggest putting the shell scripts in cron, scheduled to run every minute, and "activating" them by touching files that act as semaphores. On each run the scripts check whether the web application has touched the semaphore file; if so, they read the date interval from those files and start processing.
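On the web application's side, the "touch" can be as small as this sketch; the path and the two-line date format are assumptions that your script would have to agree on:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public class SemaphoreTrigger {

    private static final Path SEMAPHORE = Paths.get("/var/tmp/myapp/process.trigger");

    public static void activate(String fromDate, String toDate) throws IOException {
        Files.createDirectories(SEMAPHORE.getParent());
        // The script treats the file's existence as "go" and its content as input.
        Files.write(SEMAPHORE,
                Arrays.asList(fromDate, toDate),
                StandardCharsets.UTF_8);
    }
}
```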
I have a number of backend processes (java applications) which run 24/7. To monitor these backends (i.e. to check if a process is not responding and notify via SMS/EMAIL) I have written another application.
The old backends now log a heartbeat at a regular time interval, and this new application checks that they are doing it regularly and notifies if necessary.
Now, we have two options:
either run it as a scheduled task, which will run every (let's say) 15 minutes and stop after doing its job, or
run it as another backend process with a 15-minute sleep.
The issue we can foresee right now is: what if this monitor application itself goes into a non-responding state? So my question is: is there any difference between the two cases, or are they the same? Which option would suit my case better?
Please note this is a specific case and is not the same as this or this.
Environment: Java, hosted on LINUX server
By scheduled task, do you mean triggered by the system scheduler, or as a scheduled thread in the existing backend processes?
To capture unexpected termination or unresponsive states you would be best running a separate process rather than a thread. However, a scheduled thread would give you closer interaction with the owning process with less IPC overhead.
I would implement both. Maintain a record of the local state in each backend process, with a scheduled task in each process triggering a thread to update the current state of that node. This update could be fairly frequent, since it will be less expensive than communicating with a separate process.
Use your separate "monitoring app" process to routinely gather information about all the backend processes. This should occur less frequently; whether the process runs all the time or is scheduled by a cron job is immaterial, since the state is held in each backend process. If one of the backends becomes unresponsive, this monitoring app will be able to detect the lack of response and perform some meaningful probes to determine what the problem is. It is this component that will then notify your SMS/e-mail utility to send a report.
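A sketch of the two halves with hypothetical names: each backend refreshes a timestamp on a schedule, and the monitor flags any node whose heartbeat is older than a threshold. The in-memory map stands in for whatever shared store (file, DB row, JMX bean) the processes actually use:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HeartbeatMonitor {

    // Stands in for the shared state store the processes would really use.
    private static final Map<String, Long> LAST_SEEN = new ConcurrentHashMap<>();

    /** Inside each backend process: cheap, frequent local update. */
    public static void startHeartbeat(String nodeName) {
        ScheduledExecutorService beat = Executors.newSingleThreadScheduledExecutor();
        beat.scheduleAtFixedRate(
                () -> LAST_SEEN.put(nodeName, System.currentTimeMillis()),
                0, 1, TimeUnit.MINUTES);
    }

    /** Inside the monitoring app: less frequent sweep over all nodes. */
    public static void checkAll(long maxSilenceMillis) {
        long now = System.currentTimeMillis();
        LAST_SEEN.forEach((node, lastSeen) -> {
            if (now - lastSeen > maxSilenceMillis) {
                sendAlert(node); // hypothetical SMS/e-mail hook
            }
        });
    }

    private static void sendAlert(String node) {
        System.err.println("ALERT: " + node + " has stopped heartbeating");
    }
}
```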
I would go for a backend process, as it can maintain state.
Have a look at the Quartz Scheduler from Terracotta:
http://terracotta.org/products/quartz-scheduler
It is resilient to transient conditions, and you only need to provide a simple wrapper, so the monitor app should be robust provided you get the threading settings right in the quartz.properties file.
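A minimal Quartz 2.x sketch of the 15-minute check; MonitorJob and its body are hypothetical, while the builder calls are standard Quartz API:

```java
import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

public class MonitorMain {

    public static class MonitorJob implements Job {
        @Override
        public void execute(JobExecutionContext context) {
            // check heartbeats here and notify via SMS/e-mail if any are stale
        }
    }

    public static void main(String[] args) throws SchedulerException {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        JobDetail job = JobBuilder.newJob(MonitorJob.class)
                .withIdentity("heartbeatCheck")
                .build();

        Trigger trigger = TriggerBuilder.newTrigger()
                .startNow()
                .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                        .withIntervalInMinutes(15)
                        .repeatForever())
                .build();

        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}
```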
You can use Nagios Core as the core and Naptor to monitor your application. It's easy to set up and embed in your application development.
You can check at this link:
https://github.com/agunghakase/Naptor/tree/ver1.0.0