I want to fetch the last JobExecution matching a specific job parameter.
Scenario:
I want to store userEmail as a job parameter, and then fetch the last job execution for that user:
Map<String, JobParameter> parameters = new HashMap<>();
parameters.put("userEmail", new JobParameter("something#xyz.com", true));
JobParameters jobParameters = new JobParameters(parameters);
jobRepository.getLastJobExecution("jobName", jobParameters);
Issue in above approach:
If I use only userEmail as a job parameter, the same user won't be able to trigger another job once the previous one is finished, because it fails with the error: A job instance already exists with the given job parameters.
So I am planning to add userEmail + startTime as job parameters, so that the same user can trigger multiple jobs.
However, now when I want to fetch the last job execution for that user, I need both userEmail + startTime:
Map<String, JobParameter> parameters = new HashMap<>();
parameters.put("userEmail", new JobParameter("something#xyz.com", true));
parameters.put("startTime", new JobParameter(123L, true));
JobParameters jobParameters = new JobParameters(parameters);
jobRepository.getLastJobExecution("jobName", jobParameters);
But I don't have startTime when fetching the last job execution for that user.
Is there any way to fetch last job execution with only 1 job parameter?
Or do I need to write my own JdbcTemplate-based DAO implementation to run the select query?
I tried the long way: fetching all job executions by job name and then filtering.
But this is quite an inefficient approach.
Identifying job parameters are hashed together to produce a key that is used to identify job instances, so there is no direct way to fetch the last job execution of a job instance with a single parameter. You need to fetch it yourself with a custom SQL query, or programmatically by filtering job executions as you did.
That said, this feels like a job parameter design issue rather than a limitation in Spring Batch. The choice of start time as a job parameter is not suitable for your case, as it is not constant over time, hence you won't have it when fetching the last job execution. You need to find a way to uniquely identify job instances that gives you all the information needed when fetching executions.
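For reference, a minimal sketch of what such a custom lookup could look like with JdbcTemplate plus a JobExplorer to rehydrate the execution. It assumes the default BATCH_-prefixed metadata tables with Spring Batch 4 column names (KEY_NAME/STRING_VAL in BATCH_JOB_EXECUTION_PARAMS); adjust to your table prefix and schema version:
// Find the id of the most recent execution carrying the given userEmail parameter.
Long lastExecutionId = jdbcTemplate.queryForObject(
    "SELECT e.JOB_EXECUTION_ID FROM BATCH_JOB_EXECUTION e "
        + "JOIN BATCH_JOB_INSTANCE i ON e.JOB_INSTANCE_ID = i.JOB_INSTANCE_ID "
        + "JOIN BATCH_JOB_EXECUTION_PARAMS p ON p.JOB_EXECUTION_ID = e.JOB_EXECUTION_ID "
        + "WHERE i.JOB_NAME = ? AND p.KEY_NAME = 'userEmail' AND p.STRING_VAL = ? "
        + "ORDER BY e.CREATE_TIME DESC LIMIT 1", // LIMIT syntax is database-specific
    Long.class, "jobName", "something@xyz.com");
JobExecution lastExecution = jobExplorer.getJobExecution(lastExecutionId);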
I am writing a Java application that has to download only scheduled reports from a BusinessObjects server. To schedule the reports I am using InfoView, the following way:
1) Click on the report
2) Action --> Schedule
3) Set Recurrence, Format and Destinations
The report then has a number of instances, as opposed to unscheduled reports, which have zero instances.
In the code, to separate the scheduled reports I am using
com.crystaldecisions.sdk.occa.infostore.ISchedulingInfo
IInfoObject ifo = (IInfoObject) result.get(i);
ISchedulingInfo sche = ifo.getSchedulingInfo();
This should give info about the scheduling, right? But for some reason it returns an object (not null, as I would expect) for unscheduled reports as well.
And the info returned by its methods (say getBeginDate, getEndDate, etc.) is similar for both kinds.
I tried to filter the reports using SI_CHILDREN > 0 in the query:
"SELECT * FROM CI_INFOOBJECTS WHERE SI_PROGID = 'CrystalEnterprise.Webi' "
    + "AND SI_CHILDREN > 0 AND SI_PARENTID = " + String.valueOf(privateFolderId)
    + " ORDER BY SI_NAME ASC "
Is this the right way to filter the scheduled reports?
So Webi, Crystal, etc. implement the ISchedulable interface. This means that your non-instance InfoObject WILL return an ISchedulingInfo, regardless of whether or not it has been scheduled.
If an object is scheduled, an instance is created with SI_SCHEDULE_STATUS = 9 (ISchedulingInfo.ScheduleStatus.PENDING)
The job then runs (SI_SCHEDULE_STATUS = 0), and either completes (SI_SCHEDULE_STATUS = 1) or fails (SI_SCHEDULE_STATUS = 3). It can also be paused (SI_SCHEDULE_STATUS = 8).
So to find all instances that are scheduled, you need a query like:
select * from ci_infoObjects where si_instance=1 and si_schedule_status not in (1,3)
This will get you anything that isn't a success or a failure
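For example, running that query through the SDK could look like this (a sketch; infoStore is assumed to be an already-initialized IInfoStore):
IInfoObjects scheduled = infoStore.query(
    "SELECT * FROM CI_INFOOBJECTS WHERE SI_INSTANCE = 1 AND SI_SCHEDULE_STATUS NOT IN (1, 3)");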
A scheduled report will have a child instance which holds the scheduling information and has the scheduled report as its parent. (You can see this instance in the history list in BI Launch Pad.)
You can retrieve recurrently scheduled child instances from the CMS like this:
SELECT * FROM CI_INFOOBJECTS WHERE SI_PROGID = 'CrystalEnterprise.Webi'
and si_recurring = 1
This will isolate any of the reports which are scheduled to be executed (or, to be more precise, the child "scheduling" instances described above). You can then call getSchedulingInfo() on the child instance to get further info about the scheduling.
Bear in mind that it is the SI_PARENTID field, not the SI_ID field, returned by the above query that gives you the ID of the initial WebI report.
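Putting that together (a sketch; infoStore is assumed to be an already-initialized IInfoStore):
IInfoObjects instances = infoStore.query(
    "SELECT * FROM CI_INFOOBJECTS WHERE SI_PROGID = 'CrystalEnterprise.Webi' AND SI_RECURRING = 1");
for (int i = 0; i < instances.size(); i++) {
    IInfoObject instance = (IInfoObject) instances.get(i);
    ISchedulingInfo sched = instance.getSchedulingInfo(); // scheduling details of this child instance
    int reportId = instance.getParentID();                // ID of the original WebI report
}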
I'm working on a REST method that will perform a job using Spring Batch.
I have a simple job defined:
<job id="myIndexJob" xmlns="http://www.springframework.org/schema/batch">
<step id="step1">
<tasklet>
<chunk reader="myIndexItemReader" processor="myIndexItemProcessor" writer="myIndexItemWriter" commit-interval="1" />
</tasklet>
</step>
</job>
This job mimics a question I posted earlier:
Spring Batch ItemReader list processed only once
But this time, instead of executing the job on a schedule, I want to manually execute it via a REST call.
The problem I'm having is passing a List to the myIndexItemReader. My REST call will generate a List based on some query string. The previous question I posted got its List handed to it via the Spring bean in the XML each time the step ran.
I'd like to do something like this:
@RequestMapping(value="/rest/{regex}", method=RequestMethod.GET)
public void run(@PathVariable String regex) {
List<String> myList = new ArrayList<>();
myList.add("something");
long nanoBits = System.nanoTime() % 1000000L;
if (nanoBits < 0) {
nanoBits *= -1;
}
String dateParam = new Date().toString() + System.currentTimeMillis()
+ "." + nanoBits;
JobParameters param = new JobParametersBuilder()
.addString("date", dateParam)
.toJobParameters();
JobExecution execution = jobLauncher.run(job, param);
}
but I just don't know how to pass myList to the myIndexItemReader.
As of now I can do this by creating a RepeatTemplate and calling iterate on a callback, but the job chunk approach seems cleaner.
Anyone have any ideas or suggestions? Thanks /w
I took an alternate approach and stored information in a database table based on the REST criteria. Then the ItemReader read the table and cleared it after each run.
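In outline, that handoff could look like this (a sketch; the PENDING_ITEM staging table and its VALUE column are made up for illustration):
// REST side: persist the criteria for the job to pick up.
for (String item : myList) {
    jdbcTemplate.update("INSERT INTO PENDING_ITEM (VALUE) VALUES (?)", item);
}
// Reader side: load the staged items at the start of each run, then clear them.
List<String> items = jdbcTemplate.queryForList("SELECT VALUE FROM PENDING_ITEM", String.class);
jdbcTemplate.update("DELETE FROM PENDING_ITEM");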
You can pass queries as job parameters, but you have to be careful because a string job parameter has a finite length (250 characters; see the metadata schema).
If this limit is a problem, you can prepare a properties file like this:
queries.properties
query1=<query string 1>
query2=<query string 2>
query3=<query string 3>
queryn=<query string n>
As job parameters you can pass:
queriesIdsCount (integer): number of queries (0..n)
queryId0 (string): identifier of a query in the queries.properties file (e.g. query2)
queryId1 (string): (e.g. query3)
queryIdn (string): (e.g. query1)
and so on, so you can select queries from your list.
With a Tasklet or a usual Reader/Processor/Writer (as a first step) you can process your job parameters and create the List<> using REST.
Job parameters are available using SpEL; look into late binding.
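For example, a step-scoped reader can receive a job parameter through late binding (a sketch in Java config; queriesProps, the loaded queries.properties, and runQuery() are assumed helpers, not part of the original question):
@Bean
@StepScope
public ItemReader<String> myIndexItemReader(
        @Value("#{jobParameters['queryId0']}") String queryId) {
    String query = queriesProps.getProperty(queryId); // resolve the real query string
    return new ListItemReader<>(runQuery(query));     // runQuery executes it and returns a List
}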
I hope I was clear, English is not my native language.
I have a task that simply creates an entity in the datastore. I queue up many tasks into a named push queue and let it run. When it completes, I see in the log that all of the task requests were run. However, the number of entities created is actually lower than expected.
The following is an example of the code I used to test this. I ran 10000 tasks and the final result only has around 9200 entities in the datastore.
I use RESTEasy to expose URLs for the task queues.
queue.xml
<queue>
<name>testQueue</name>
<rate>5/s</rate>
</queue>
Test Code
@GET
@Path("/queuetest/{numTimes}")
public void queueTest(@PathParam("numTimes") int numTimes) {
for(int i = 1; i <= numTimes; i++) {
Queue queue = QueueFactory.getQueue("testQueue");
TaskOptions taskOptions = TaskOptions.Builder.withUrl("/queuetest/worker/" + i).method(Method.GET);
queue.add(taskOptions);
}
}
@GET
@Path("/queuetest/worker/{index}")
public void queueTestWorker(@PathParam("index") String index) {
DateFormat df = new SimpleDateFormat("MM/dd/yyyy HH:mm:ss");
Date today = Calendar.getInstance().getTime();
String timestamp = df.format(today);
Entity tObj = new Entity("TestObj");
tObj.setProperty("identifier", index);
tObj.setProperty("timestamp", timestamp);
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Key key = datastore.put(tObj);
}
I have run this a few times, and not once have I seen all of the entities created.
Is it possible that tasks can be discarded if there is too much contention on the queue?
Is this the expected behavior for a task queue?
EDIT:
I followed Mitch's suggestion to log the entity IDs that are created and found that they are indeed created as expected. But the logs themselves displayed some weird behavior, in which logs from some tasks appear in another task's log. And when that happens, some tasks show 2 entity IDs in a single request.
For the tasks that display 2 entity IDs, the first one logged is the entity missing from the datastore. Does this mean there is a problem with a high number of puts to the datastore? (The entities I'm creating are NOT part of a larger entity group, i.e. they don't refer to a @parent.)
Why don't you add a log statement after each datastore.put() call which logs the ID of the newly created entity? Then you can compare the log to the datastore contents, and you will be able to tell whether the problem is that datastore.put() is not being invoked successfully 10000 times, or that some of the successful put calls are not resulting in entities that you see in the datastore.
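For example (a sketch; log is assumed to be a java.util.logging.Logger):
// Log the generated key right after the put, so the task logs can be
// diffed against the datastore contents.
Key key = datastore.put(tObj);
log.info("task " + index + " stored entity " + KeyFactory.keyToString(key));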
I'm trying to execute the following Quartz scheduler code in a cluster environment.
scheduler.unscheduleJob("genericJobTrigger", "DEFAULT");
where
Scheduler scheduler = (Scheduler) context.getBean("scheduler");
JobDetail genericJob = (JobDetail) context.getBean("genericJob");
CronTrigger genericJobTrigger = (CronTrigger) context.getBean("genericJobTrigger");
The above piece of code is deleting entries for both the trigger and the job detail. It is supposed to remove only the trigger, right?
Why is Quartz scheduler's unscheduleJob deleting both the trigger and the job detail?
Durability must be set to true on jobs to avoid the jobs being deleted when their triggers are deleted.
Whenever you create a JobDetail object, set storeDurably(); refer to the example below:
return JobBuilder.newJob(ScheduledJob.class)
.setJobData(jobDataMap)
.withDescription("job executes at specified frequency")
.withIdentity(UUID.randomUUID().toString(), "email-jobs")
.storeDurably() // this prevents the job from being deleted automatically
.build();
You can also verify it by checking the value of the IS_DURABLE column in the job details table.
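For example, with the default table prefix (adjust QRTZ_ to your configured prefix):
SELECT JOB_NAME, IS_DURABLE FROM QRTZ_JOB_DETAILS;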
I have a Quartz Job that I can schedule with some Cron Trigger.
ReportSchedule reportSchedule = ... // my object
JobDetail jobDetail = new JobDetail(reportSchedule.getScheduleName(),
reportSchedule.getScheduleGroup(),
ExtendedReportJob.class /* my job */);
jobDetail.getJobDataMap().put("reportSchedule", reportSchedule);
jobDetail.setDescription(reportSchedule.getScheduleDescription());
CronTrigger trigger = ...; // depends on the report schedule
scheduler.scheduleJob(jobDetail, trigger);
This code successfully writes the job and details to a database.
The reportSchedule object contains specific parameters that are required for the job. However, I may want to change the parameters.
I can do this with
scheduler.deleteJob(name, group);
scheduler.scheduleJob(jobDetail, trigger);
// where jobDetail.getJobDataMap() has the updated reportSchedule
Doing this, however, will trigger the job right away, since the trigger depends on the report schedule and I don't want to change it (I want to keep the original date). So my question: is there any way to modify the JobDetail or JobDataMap between jobs without changing the Trigger?
I'm using Quartz 1.6.0.
The solution is simple enough; you just have to know the API.
The Scheduler class has the following method:
Scheduler#addJob(JobDetail, boolean);
In which the passed JobDetail will overwrite the previous one if the boolean argument is set to true.
So
// name and group are the primary key of the job detail
final JobDetail jobDetail = new JobDetail(name, group, ExtendedReportJob.class);
// reportSchedule is the object I've previously modified
jobDetail.getJobDataMap().put(ORStatics.REPORT_SCHEDULE, reportSchedule);
jobDetail.setDescription(reportSchedule.getScheduleDescription());
// overwrite the previous job, however retaining the triggers
scheduler.addJob(jobDetail, true);
will update the job detail in persistent storage. Since the primary key for the table containing the JobDetail remains the same, we don't need to change the triggers; they will still execute it as scheduled.
What about getting the trigger with getTrigger(String triggerName, String triggerGroup) and storing it in a variable? Then create a new job with your new JobDataMap and use the old trigger.
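In Quartz 1.x that could look roughly like this (a sketch; the variable names are illustrative, and it assumes the new JobDetail keeps the same name and group as the old one):
Trigger oldTrigger = scheduler.getTrigger(triggerName, triggerGroup);
scheduler.deleteJob(jobName, jobGroup);       // also removes the attached trigger
JobDetail newDetail = new JobDetail(jobName, jobGroup, ExtendedReportJob.class);
newDetail.getJobDataMap().put("reportSchedule", updatedReportSchedule);
scheduler.scheduleJob(newDetail, oldTrigger); // re-attach the saved trigger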