I have a job done with the quartz plugin, but I would like to run it only if my User table contains users with active status.
Thank you
There isn't enough information here to know for sure what the right thing to do is, but one thing to consider: you can schedule the Quartz job to run at whatever frequency you like, have the job itself query the database for users with active status, and react accordingly.
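A minimal sketch of that idea, using the plain Quartz Job API and JDBC (the JDBC URL, credentials, and the user/status table and column names are assumptions; a Grails quartz-plugin job class looks a little different, but the logic is the same):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;

public class ActiveUsersAwareJob implements Job {
    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        // JDBC URL, credentials, table and column names are placeholders
        try (Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", "user", "pass");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM user WHERE status = 'ACTIVE'")) {
            rs.next();
            if (rs.getInt(1) == 0) {
                return; // no active users: skip this run
            }
            // ... the actual work of the job goes here
        } catch (Exception e) {
            throw new JobExecutionException(e);
        }
    }
}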
I think what you want is a webhook: after a PUT/POST/DELETE (i.e., a state-changing call), you call another endpoint.
I need to automate a workflow after an event occurs. I have experience with CRUD applications but not with workflow/batch processing, and I need help designing the system.
Requirement
The workflow involves 5 steps. Each step is a REST call and depends on the previous step.
Example steps: (VerifyIfUserInSystem, CreateUserIfNeeded, EnrollInOpt1, EnrollInOpt2, ...)
My thought process is to maintain 2 DB Tables
WORKFLOW_STATUS Table which contains columns like
(foreign key(referring to primary table), Workflow Status: (NEW, INPROGRESS, FINISHED, FAILED), Completed Step: (STEP1, STEP2,..), Processed Time,..)
EVENT_LOG Table to maintain the track of Events/Exceptions for a particular record
(foreign key, STEP, ExceptionLog)
Question
#1. Is this a correct approach to orchestrate the system (which is not that complex)?
#2. As the steps involve REST calls, I might have to stop the process when a service is not available and resume it at a later point in time. I am not sure how many retry attempts should be made, or how to track the number of attempts made before marking it as FAILED. (Guessing: create another column in the WORKFLOW_STATUS table called RETRY_ATTEMPT and set some limit before marking it FAILED.)
#3. Is the EVENT_LOG table a correct design, and what datatype (CLOB or VARCHAR(2048)) should I use for ExceptionLog? Every step/retry attempt will be inserted as a new record into this table.
#4. How do I reset/restart a FAILED entry after a dependent service is back up?
Please direct me to any blogs/videos/resources if available.
Thanks in advance.
Have you considered using a workflow orchestration engine like Netflix's Conductor? (docs, GitHub)
Conductor comes with a lot of the features you are looking for built in.
Here's an example workflow that uses two sequential HTTP requests (where the 2nd requires a response from the first):
Input supplies an IP address (and an AccuWeather API key)
{
"ipaddress": "98.11.11.125"
}
HTTP request 1 locates the zipcode of the IP address.
HTTP request 2 uses the zipcode (and the API key) to report the weather.
The output from this workflow is:
{
"zipcode": "04043",
"forecast": "rain"
}
Your questions:
I'd use an orchestration tool like Conductor.
Each of these tasks (defined in Conductor) has retry logic built in. How you implement it will vary based on expected timings, etc. Since the 2 APIs I'm calling here are public (and relatively fast), I don't wait very long between retries:
"retryCount": 3,
"retryLogic": "FIXED",
"retryDelaySeconds": 5,
Inside the connection, there are more parameters you can tweak:
"connectionTimeOut": 1600,
"readTimeOut": 1600
There is also exponential retry logic if desired.
The event log is stored in Elasticsearch.
You can build error pathways for all your workflows.
I have this workflow up and running in the Conductor Playground, called "Stack_overflow_sequential_http". Create a free account and run the workflow: click "Run Workflow", select "Stack_overflow_sequential_http", and use the JSON above to see it in action.
The get_weather connection is a very slow API, so it may fail a few times before succeeding. Copy the workflow and play with the timeout values to improve the success rate.
You describe an Enterprise Integration Pattern: enrichment/transformation via REST calls and stateful aggregation of the results over time (which means many such flows may be in progress at any one time). Apache Camel was designed for exactly these scenarios.
See What exactly is Apache Camel?
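A rough sketch of such a route in Camel's Java DSL, purely illustrative (the endpoint URIs, step names, and retry numbers are invented for this example):
import org.apache.camel.builder.RouteBuilder;

public class EnrollmentRoute extends RouteBuilder {
    @Override
    public void configure() {
        // retry a failed REST call a few times before giving up
        onException(Exception.class)
            .maximumRedeliveries(3)
            .redeliveryDelay(5000)
            .handled(true)
            .log("step failed: ${exception.message}");

        from("direct:startEnrollment")               // kicked off by your event
            .to("http://user-service/verify")        // VerifyIfUserInSystem
            .to("http://user-service/create")        // CreateUserIfNeeded
            .to("http://enrollment-service/opt1")    // EnrollInOpt1
            .to("http://enrollment-service/opt2");   // EnrollInOpt2
    }
}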
There are a few tables that the Quartz scheduler uses for scheduling jobs and for identifying which job is currently running. It uses the following tables:
qrtz_fired_triggers
qrtz_simple_triggers
qrtz_simprop_triggers
qrtz_cron_triggers
qrtz_blob_triggers
qrtz_triggers
qrtz_job_details
qrtz_calendars
qrtz_paused_trigger_grps
qrtz_locks
qrtz_scheduler_state
So what is the purpose of each of these tables, and what do they signify?
Thanks in advance.
I had the chance to work with Quartz recently. I'm not 100% clear on this topic myself, but I'll try my best to answer your question from personal experience.
You must remember this basic flow:
1. Create a job.
2. Create a Trigger.
3. Scheduler(job, trigger)
All the above tables are based on the above 3 steps.
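In code, those three steps map onto the standard Quartz 2.x API roughly like this (MyJob, the names and the interval are placeholders):
import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

// 1. Create a job.
JobDetail job = JobBuilder.newJob(MyJob.class)
        .withIdentity("myJob", "myGroup")
        .build();

// 2. Create a trigger.
Trigger trigger = TriggerBuilder.newTrigger()
        .withIdentity("myTrigger", "myGroup")
        .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                .withIntervalInMinutes(10)
                .repeatForever())
        .build();

// 3. Scheduler(job, trigger): with a JDBC job store, this is what fills
//    qrtz_job_details, qrtz_triggers and qrtz_simple_triggers.
Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
scheduler.start();
scheduler.scheduleJob(job, trigger);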
qrtz_triggers is where general information of a trigger is saved.
qrtz_simple_triggers, qrtz_simprop_triggers, qrtz_cron_triggers and qrtz_blob_triggers have a foreign key relation to qrtz_triggers and save the type-specific details. E.g. a cron trigger has a cron expression, which is unique to it.
qrtz_job_details is simply the task to be executed.
qrtz_fired_triggers is a log of all the triggers that were fired.
qrtz_paused_trigger_grps saves information about trigger groups that are paused (not active).
Calendars are useful for excluding blocks of time from the trigger's firing schedule. For instance, you could create a trigger that fires a job every weekday at 9:30 am, but then add a Calendar that excludes all of the business's holidays. (Taken from the Quartz website; I haven't worked with it.)
I honestly haven't worked with the qrtz_locks and qrtz_scheduler_state tables.
Check out this image, which I reverse engineered using MySQL Workbench.
I can provide some inputs for the qrtz_locks and qrtz_scheduler_state tables:
qrtz_locks stores the instance name of the node executing a job, to avoid the scenario of multiple nodes executing the same job.
qrtz_scheduler_state captures the node state, so that if one node goes down or fails to execute one of the jobs, another instance running in clustered mode can pick up the misfired job.
I use the latest Activiti 5.22.0 engine (to be more concrete, I use Alfresco Process Services 1.6.3) and I have implemented a Spring bean that gets executed every 10 minutes to generate a JSON representation of all my processes (process name, startDate, endDate, current taskName(s) and assignee(s)) and send them to an audit server. The problem is that I only need to send the processes that have changed since the last run.
I do not want to send the JSON as soon as a process changes but to do a batch update of my audit system every 10 minutes.
To accomplish this, I've tried different approaches. My latest one:
Create an event listener bean that listens to all PROCESS_STARTED, PROCESS_COMPLETED, PROCESS_CANCELLED, TASK_COMPLETED, ...
Every time one of these events is triggered, store a process variable "_dirty" and set it to true
Every 10 minutes (when my JSON bean is executed), query for all processes with the "_dirty" variable set to true
After sending the JSON to the audit system, set all "_dirty" process variables to false.
The problem with this approach: I am not able to update the "_dirty" variable after a process has ended. At least I don't know how.
My second approach would be to store the processInstanceId on every event in a "global" property, but I don't know how to persist this "global" property to the database in case the server restarts. Is there a way to persist a property or an entity into the DB without creating an extra table, DAO, etc.?
Any ideas on how to solve this task? All tips are very much appreciated!
AFAIK, there's no such option.
But have a look at this and see if it can be helpful in your case:
https://www.activiti.org/userguide/#_database_tables
As Linus suggested, this is not possible, so I needed a completely different approach.
I am now creating an ad-hoc task and storing my properties as local task variables. The ad-hoc task is owned by a system account and not assigned to anybody. This way I can make sure none of my real users tries to "complete" the task. I've also written some code to generate the task if needed, so in case I want to clean it up, it is created automatically the next time I want to store data.
Creating an ad-hoc task is quite easy by autowiring org.activiti.engine.TaskService into my class:
Task task = taskService.newTask();
task.setDelegationState(DelegationState.PENDING);
task.setName("Some name goes here");
task.setTenantId("your tenant id (if any)");
task.setOwner("your system accounts ID");
task.setCategory("i use a special category to later query for the task");
taskService.saveTask(task);
After saving the task to the database, I can use the taskService to store and retrieve variables like this:
taskService.setVariableLocal(task.getId(), "variableKey", "variableValue");
Or query for the task like this:
Task task = taskService.createTaskQuery().taskDelegationState(DelegationState.PENDING).taskCategory("your special category").singleResult();
Not a very nice solution (I recommend caching the task, or even its values, in a bean or something so you don't need to query it all the time), but it works.
I have a scenario where I check for a specific value in the Database every 10 seconds or so. And, if the value is YES, then I execute a bunch of shell scripts from a Java application.
Now, the value in the database is only updated to YES once in a while, depending on a user submitting a job on a web page. Therefore, running a while loop to check for this value in the database seems to be a very bad design, and I would like to implement a much cleaner approach using listeners (Observer design pattern).
How would such an implementation look like? Any examples I can follow to do this?
Yes, there is a much better way. MySQL has something called the binlog (binary log); that's how master-slave sync is done in a MySQL cluster.
So either write your own logic on top of https://github.com/shyiko/mysql-binlog-connector-java, which gives you all the change events on a table,
or use https://github.com/zendesk/maxwell to read events from a particular table; whenever the value changes, check if it matches your condition and execute the script or Java application on that basis, instead of running it as a cron.
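If you go the binlog route, a minimal sketch with mysql-binlog-connector-java could look like the following; the host, credentials and the column index of the flag are placeholders, and filtering by table (via the TABLE_MAP events) is left out:
import java.io.Serializable;
import java.util.Map;
import com.github.shyiko.mysql.binlog.BinaryLogClient;
import com.github.shyiko.mysql.binlog.event.EventData;
import com.github.shyiko.mysql.binlog.event.UpdateRowsEventData;

public class FlagChangeListener {
    public static void main(String[] args) throws Exception {
        // replication user credentials are placeholders
        BinaryLogClient client = new BinaryLogClient("localhost", 3306, "repl_user", "secret");
        client.registerEventListener(event -> {
            EventData data = event.getData();
            if (data instanceof UpdateRowsEventData) {
                for (Map.Entry<Serializable[], Serializable[]> row : ((UpdateRowsEventData) data).getRows()) {
                    Serializable[] after = row.getValue(); // row image after the update
                    // hypothetical: column index 2 holds the YES/NO flag
                    if ("YES".equals(after[2])) {
                        // run the shell scripts here, e.g. via ProcessBuilder
                    }
                }
            }
        });
        client.connect(); // blocks, so run it on its own thread in a real application
    }
}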
The general idea is to use DB triggers, register a DB listener on the Java side, and be notified by the DB when some event has happened.
Please review the proposed solutions:
How to implement a db listener in Java
Can anyone please suggest the best approach for my requirement? I need to automatically update a table value after some specified time, using Java and MySQL as the database.
You can achieve this using the Quartz scheduler. Create a job and run it at the required time; the job fetches the data from the database and, based on that, does what's needed.
Quartz Scheduler Tutorial
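A hedged sketch of that approach with the plain Quartz API; the cron expression, JDBC details and SQL are placeholders for whatever your actual requirement is:
import java.sql.Connection;
import java.sql.DriverManager;
import org.quartz.*;

public class UpdateTableJob implements Job {
    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        try (Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", "user", "pass")) {
            // placeholder SQL: whatever "update the table value" means in your case
            con.createStatement().executeUpdate("UPDATE my_table SET status = 'EXPIRED' WHERE expires_at < NOW()");
        } catch (Exception e) {
            throw new JobExecutionException(e);
        }
    }
}
Then schedule it with a cron trigger (the JobBuilder/Scheduler calls are the usual ones):
Trigger trigger = TriggerBuilder.newTrigger()
        .withSchedule(CronScheduleBuilder.cronSchedule("0 0 2 * * ?")) // example: every night at 2 AM
        .build();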
Create a timer in Java.
Add a task to the timer that updates the value in MySQL.
Start the timer.
Example of Java Timer API: here
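A minimal sketch of that Timer variant (the period and the update logic are placeholders):
import java.util.Timer;
import java.util.TimerTask;

Timer timer = new Timer();
timer.scheduleAtFixedRate(new TimerTask() {
    @Override
    public void run() {
        // JDBC code to update the value in MySQL goes here
    }
}, 0, 60_000); // start immediately, repeat every 60 seconds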