How to get the JobExecutionId in ItemReader in Spring Batch 4 - java

I need help.
I want to create a batch application with the following scenario:
Read data from the database and then write it to a text file. [Consider this the first step.]
When the first step is done, the second step writes a ctl file which contains the writeCount of the first step.
My approach is to create a StepExecutionListener that puts the jobId into the JobExecutionContext.
So, in the ItemReader of the second step, I can read from the database, but I don't know how to get the jobExecutionId so that I can query MySQL for the right record.
Here is the code:
public class WriteDataCtlFile {
private static final Logger log = LoggerFactory.getLogger(WriteDataCtlFile.class);
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Bean
public Step writeCtlFile(ItemReader<JobContext> ctlReader,
ItemProcessor<JobContext, CtlFile> ctlProcessor,
ItemWriter<CtlFile> ctlWriter){
return stepBuilderFactory.get("writeCtlFile")
.<JobContext, CtlFile>chunk(100)
.reader(ctlReader)
.processor(ctlProcessor)
.writer(ctlWriter)
.build();
}
@JobScope
@Bean
public ItemReader<JobContext> ctlReader(DataSource dataSource, JobContextMapper jobContextMapper) {
JdbcCursorItemReader<JobContext> reader = new JdbcCursorItemReader<>();
reader.setDataSource(dataSource);
reader.setSql("SELECT short_context FROM BATCH_JOB_EXECUTION_CONTEXT WHERE JOB_EXECUTION_ID = ?");
// THIS IS WHERE I WANT TO GET jobId
reader.setPreparedStatementSetter(new JobIdPrepareStatement(jobId));
reader.setRowMapper(jobContextMapper);
return reader;
}
@Bean
public ItemProcessor<JobContext, CtlFile> ctlProcessor(){
return new ItemProcessor<JobContext, CtlFile>() {
@Override
public CtlFile process(JobContext jobContext) throws Exception {
return new CtlFile(jobContext.getShort_context());
}
};
}
@Bean
public FlatFileItemWriter<CtlFile> ctlWriter(){
FlatFileItemWriter<CtlFile> flatFileItemWriter = new FlatFileItemWriter<>();
flatFileItemWriter.setResource(new FileSystemResource("C:\\Users\\wathanyu.phromma\\data-output.ctl"));
flatFileItemWriter.setLineAggregator(new LineAggregator<CtlFile>() {
@Override
public String aggregate(CtlFile ctlFile) {
Gson gson = new Gson();
Map<String, Object> map = gson.fromJson(ctlFile.getWrittenRecordsCount(), Map.class);
return String.valueOf(map.get("writeCount"));
}
});
return flatFileItemWriter;
}
}
public class WriteDataTxtFile {
private static final Logger log = LoggerFactory.getLogger(WriteDataTxtFile.class);
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Bean
public Step writeTxtFile(
ItemReader<Account> reader,
ItemProcessor<Account, Account> processor,
ItemWriter<Account> writer){
return stepBuilderFactory.get("writeTxtFile")
.<Account, Account>chunk(2)
.reader(reader)
.processor(processor)
.writer(writer)
.listener(new WriteDataTxtStepListener())
.build();
}
@Bean
@StepScope
public JdbcCursorItemReader<Account> reader(DataSource dataSource, AccountMapper accountMapper) {
log.info("test");
JdbcCursorItemReader<Account> reader = new JdbcCursorItemReader<>();
reader.setDataSource(dataSource);
reader.setSql("SELECT * FROM account");
reader.setRowMapper(accountMapper);
return reader;
}
@Bean
public ItemProcessor<Account, Account> processor(){
return new ItemProcessor<Account, Account>() {
@Override
public Account process(Account account) throws Exception {
return account;
}
};
}
@Bean
public FlatFileItemWriter<Account> writer(){
FlatFileItemWriter<Account> flatFileItemWriter = new FlatFileItemWriter<>();
flatFileItemWriter.setResource(new FileSystemResource("C:\\Users\\wathanyu.phromma\\data-output.txt"));
flatFileItemWriter.setLineAggregator(new DelimitedLineAggregator<Account>(){{
setDelimiter("|");
setFieldExtractor(new BeanWrapperFieldExtractor<Account>(){{
setNames(new String[]{ "id", "accountId", "accountName","createdAt", "updatedAt"});
}});
}});
return flatFileItemWriter;
}
}
public class WriteDataTxtStepListener implements StepExecutionListener {
private static final Logger log = LoggerFactory.getLogger(WriteDataTxtStepListener.class);
@Override
public void beforeStep(StepExecution stepExecution) {
Date date = new Date();
String currentDate = new SimpleDateFormat("yyyy-MM-dd").format(date);
stepExecution.getJobExecution().getExecutionContext().put("jobId", stepExecution.getJobExecutionId());
stepExecution.getJobExecution().getExecutionContext().put("date", currentDate);
log.info("JobId = " + stepExecution.getJobExecutionId());
log.info("Before Step Count = " + stepExecution.getWriteCount());
}
@Override
public ExitStatus afterStep(StepExecution stepExecution) {
stepExecution.getJobExecution().getExecutionContext().put("writeCount", stepExecution.getWriteCount());
log.info("After Step Count = " + stepExecution.getWriteCount());
log.info("ExitStatus = " + stepExecution.getExitStatus().getExitCode());
return stepExecution.getExitStatus();
}
}
public class WriteDataToFlatFile {
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Bean
public Job readFromApi(Step writeTxtFile, Step writeCtlFile){
return jobBuilderFactory.get("readFromApiToFlatFile")
.incrementer(new RunIdIncrementer())
.start(writeTxtFile)
.next(writeCtlFile)
.build();
}
@Bean
public DataSource dataSource(){
DriverManagerDataSource dataSource = new DriverManagerDataSource();
dataSource.setDriverClassName("com.mysql.jdbc.Driver");
dataSource.setUrl("jdbc:mysql://localhost:3306/xxxx?useSSL=false");
dataSource.setUsername("xxxx");
dataSource.setPassword("xxxx");
return dataSource;
}
}

To get data from the job execution context in the reader of your second step, you can inject the value as a parameter in your bean definition method like this:
@JobScope
@Bean
public ItemReader<JobContext> ctlReader(DataSource dataSource, JobContextMapper jobContextMapper, @Value("#{jobExecutionContext['jobId']}") int jobId) {
// use jobId
}
Hope this helps.
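For completeness, here is a minimal sketch of what the whole second-step reader could look like with the injected value. This is an illustration only, not a tested configuration: getJobExecutionId() returns a Long, and the lambda stands in for the asker's JobIdPrepareStatement class.
@JobScope
@Bean
public ItemReader<JobContext> ctlReader(DataSource dataSource,
        JobContextMapper jobContextMapper,
        @Value("#{jobExecutionContext['jobId']}") Long jobId) {
    JdbcCursorItemReader<JobContext> reader = new JdbcCursorItemReader<>();
    reader.setDataSource(dataSource);
    reader.setSql("SELECT short_context FROM BATCH_JOB_EXECUTION_CONTEXT WHERE JOB_EXECUTION_ID = ?");
    // jobId was put into the job execution context by WriteDataTxtStepListener in the first step
    reader.setPreparedStatementSetter(ps -> ps.setLong(1, jobId));
    reader.setRowMapper(jobContextMapper);
    return reader;
}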

Related

Spring batch with two jobs FlatFileItemWriter and ClassifierCompositeItemWriter working together

Issue
I have to create a Spring Batch project with two jobs that can be executed independently or together. Each job has the necessary code to read from the database and to write using FlatFileItemWriter and ClassifierCompositeItemWriter. I've found that if I execute the jobs independently (-Dspring.batch.job.names=schoolJob, -Dspring.batch.job.names=studentJob), the files are generated fine, but when I execute the jobs together (-Dspring.batch.job.names=schoolJob,studentJob), the files of one job only have the footer and the header. There seems to be something wrong, but I can't find the cause.
Some code
Batch config, job and steps
@Configuration
@EnableBatchProcessing
@SuppressWarnings("rawtypes, unchecked")
public class MyJobConfiguration
{
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Autowired
private JdbcTemplate jdbcTemplate;
@Autowired
private ConfigurableApplicationContext applicationContext;
@Bean
public Step studentStep1() {
return stepBuilderFactory.get("calculateDistinctValuesAndRegisterStudentWriters")
.tasklet(new DynamicStudentWritersConfigurationTasklet(jdbcTemplate,
applicationContext))
.build();
}
@Bean
public Step schoolStep1() {
return stepBuilderFactory.get("calculateDistinctValuesAndRegisterSchoolWriters")
.tasklet(new DynamicSchoolWritersConfigurationTasklet(jdbcTemplate,
applicationContext))
.build();
}
@Bean
@JobScope
public Step studentStep2(StudentReader reader,
@Qualifier("studentClassfierItemWriter")
ClassifierCompositeItemWriter<Student> writer) {
SimpleStepBuilder<Student, Student> studentStep2 = stepBuilderFactory.get(
"readWriteStudents").<Student, Student>chunk(2).reader(reader).writer(writer);
Map<String, FlatFileItemWriter> beansOfType = applicationContext.getBeansOfType(
FlatFileItemWriter.class);
for (FlatFileItemWriter flatFileItemWriter : beansOfType.values())
{
studentStep2.stream(flatFileItemWriter);
}
return studentStep2.build();
}
@Bean
@JobScope
public Step schoolStep2(SchoolReader reader,
@Qualifier("schoolClassfierItemWriter")
ClassifierCompositeItemWriter<School> writer) {
SimpleStepBuilder<School, School> schoolStep2 = stepBuilderFactory.get("readWriteSchools")
.<School, School>chunk(2)
.reader(reader)
.writer(writer);
Map<String, FlatFileItemWriter> beansOfType = applicationContext.getBeansOfType(
FlatFileItemWriter.class);
for (FlatFileItemWriter flatFileItemWriter : beansOfType.values())
{
schoolStep2.stream(flatFileItemWriter);
}
return schoolStep2.build();
}
@Bean
public Job studentJob(Step studentStep1, Step studentStep2) {
return jobBuilderFactory.get("studentJob").start(studentStep1).next(studentStep2).build();
}
@Bean
public Job schoolJob(Step schoolStep1, Step schoolStep2) {
return jobBuilderFactory.get("schoolJob").start(schoolStep1).next(schoolStep2).build();
}
}
Data source configuration
@Configuration
class DatasourceConfig
{
@Bean
public DataSource dataSource()
{
String dbSchema = "/org/springframework/batch/core/schema-h2.sql";
String initData = "data.sql";
return new EmbeddedDatabaseBuilder().setType(EmbeddedDatabaseType.H2)
.addScript(dbSchema)
.addScript(initData)
.build();
}
}
Readers
@Component
class SchoolReader extends JdbcCursorItemReader<School>
{
@Autowired
private DataSource dataSource;
@Override
public void afterPropertiesSet() throws Exception
{
super.setName("schoolItemReader");
super.setDataSource(dataSource);
super.setSql("select * from school");
super.setRowMapper(new BeanPropertyRowMapper<>(School.class));
super.afterPropertiesSet();
}
}
@Component
class StudentReader extends JdbcCursorItemReader<Student>
{
@Autowired
private DataSource dataSource;
@Override
public void afterPropertiesSet() throws Exception
{
super.setName("studentItemReader");
super.setDataSource(dataSource);
super.setSql("select * from student");
super.setRowMapper(new BeanPropertyRowMapper<>(Student.class));
super.afterPropertiesSet();
}
}
Writers
@Configuration
public class SchoolWriter
{
@Autowired
private ConfigurableApplicationContext applicationContext;
@Bean(name = "schoolClassfierItemWriter")
@StepScope
public ClassifierCompositeItemWriter<School> itemWriter()
{
Map<String, FlatFileItemWriter> beansOfType = applicationContext.getBeansOfType(
FlatFileItemWriter.class);
Classifier<School, FlatFileItemWriter<School>> classifier = school -> beansOfType.get(
"school-group" + school.getGroupId() + "Writer");
return new ClassifierCompositeItemWriterBuilder().classifier(classifier).build();
}
}
@Configuration
public class StudentWriter
{
@Autowired
private ConfigurableApplicationContext applicationContext;
@Bean(name = "studentClassfierItemWriter")
@StepScope
public ClassifierCompositeItemWriter<Student> itemWriter()
{
Map<String, FlatFileItemWriter> beansOfType = applicationContext.getBeansOfType(
FlatFileItemWriter.class);
Classifier<Student, FlatFileItemWriter<Student>> classifier = student -> beansOfType.get(
"student-group" + student.getGroupId() + "Writer");
return new ClassifierCompositeItemWriterBuilder().classifier(classifier).build();
}
}
Tasklets
class DynamicSchoolWritersConfigurationTasklet implements Tasklet
{
private JdbcTemplate jdbcTemplate;
private ConfigurableApplicationContext applicationContext;
public DynamicSchoolWritersConfigurationTasklet(JdbcTemplate jdbcTemplate,
ConfigurableApplicationContext applicationContext)
{
this.jdbcTemplate = jdbcTemplate;
this.applicationContext = applicationContext;
}
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext)
{
ConfigurableListableBeanFactory beanFactory = applicationContext.getBeanFactory();
String sql = "select distinct(groupId) from school";
List<Integer> groups = jdbcTemplate.queryForList(sql, Integer.class);
for (Integer group : groups)
{
String name = "school-group" + group + "Writer";
//@f:off
MutablePropertyValues propertyValues = new MutablePropertyValues();
propertyValues.addPropertyValue("name", name);
propertyValues.addPropertyValue("lineAggregator", new PassThroughLineAggregator<>());
propertyValues.addPropertyValue("resource", new FileSystemResource("school-" + group + ".txt"));
propertyValues.addPropertyValue("headerCallback", (FlatFileHeaderCallback) writer -> writer.write("header-school"));
propertyValues.addPropertyValue("footerCallback", (FlatFileFooterCallback) writer -> writer.write("footer-school"));
//@f:on
GenericBeanDefinition beanDefinition = new GenericBeanDefinition();
beanDefinition.setBeanClassName(FlatFileItemWriter.class.getName());
beanDefinition.setPropertyValues(propertyValues);
BeanDefinitionRegistry registry = (BeanDefinitionRegistry) beanFactory;
registry.registerBeanDefinition(name, beanDefinition);
}
return RepeatStatus.FINISHED;
}
}
class DynamicStudentWritersConfigurationTasklet implements Tasklet
{
private JdbcTemplate jdbcTemplate;
private ConfigurableApplicationContext applicationContext;
public DynamicStudentWritersConfigurationTasklet(JdbcTemplate jdbcTemplate,
ConfigurableApplicationContext applicationContext)
{
this.jdbcTemplate = jdbcTemplate;
this.applicationContext = applicationContext;
}
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext)
{
ConfigurableListableBeanFactory beanFactory = applicationContext.getBeanFactory();
String sql = "select distinct(groupId) from student";
List<Integer> groups = jdbcTemplate.queryForList(sql, Integer.class);
for (Integer group : groups)
{
String name = "student-group" + group + "Writer";
//@f:off
MutablePropertyValues propertyValues = new MutablePropertyValues();
propertyValues.addPropertyValue("name", name);
propertyValues.addPropertyValue("lineAggregator", new PassThroughLineAggregator<>());
propertyValues.addPropertyValue("resource", new FileSystemResource("student-" + group + ".txt"));
propertyValues.addPropertyValue("headerCallback", (FlatFileHeaderCallback) writer -> writer.write("header-student"));
propertyValues.addPropertyValue("footerCallback", (FlatFileFooterCallback) writer -> writer.write("footer-student"));
//@f:on
GenericBeanDefinition beanDefinition = new GenericBeanDefinition();
beanDefinition.setBeanClassName(FlatFileItemWriter.class.getName());
beanDefinition.setPropertyValues(propertyValues);
BeanDefinitionRegistry registry = (BeanDefinitionRegistry) beanFactory;
registry.registerBeanDefinition(name, beanDefinition);
}
return RepeatStatus.FINISHED;
}
}
DAO
@Getter
@Setter
@ToString
@NoArgsConstructor
@AllArgsConstructor
public class School
{
private int id;
private String name;
private int groupId;
}
@Getter
@Setter
@ToString
@NoArgsConstructor
@AllArgsConstructor
public class Student
{
private int id;
private String name;
private int groupId;
}
This is similar to https://stackoverflow.com/a/67635289/5019386. I think you need to make your dynamic item writers step-scoped as well, something like:
propertyValues.addPropertyValue("scope", "step");
Please note that I did not try that. That said, I would really recommend making your app do one thing and do it well, i.e. isolate the job definitions and package/run each job separately.
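If you go that route, the scope can also be set directly on the bean definition rather than as a property value. The following is only a sketch of how the tasklet's registration loop might look with that change; it is an assumption on my part and has not been verified against this project.
// hypothetical variation of the dynamic-writer registration loop:
// mark each registered FlatFileItemWriter as step-scoped on the BeanDefinition itself
GenericBeanDefinition beanDefinition = new GenericBeanDefinition();
beanDefinition.setBeanClassName(FlatFileItemWriter.class.getName());
beanDefinition.setPropertyValues(propertyValues);
beanDefinition.setScope("step"); // same intent as addPropertyValue("scope", "step")
registry.registerBeanDefinition(name, beanDefinition);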

running a job in foreach loop with dynamic parameters spring batch

I have created a Spring Batch job with Spring Boot.
I customized the reader to get JSON data from a REST API and convert it to Java objects, and the writer pushes the data to a queue.
I am calling my job in a foreach loop to set the parameters and send requests to the REST API with different languages.
For the first iteration my job runs successfully, but for the other iterations it just displays that it has finished.
Batch configuration :
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
public RestWebClient webClient;
@Bean
public ItemReader<Code> reader() {
return new CodeAndLabelRestItemReader(webClient);
}
@Bean
public CodeAndLabelItemProcessor processor() {
return new CodeAndLabelItemProcessor("France","DP","transaction");
}
@Bean
public ItemWriter<CodeAndLabel> calWriter(AmqpTemplate amqpTemplate) {
return new CodeAndLabelItemWriter(amqpTemplate);
}
#Bean(name = "importJob")
public Job importCodesAndLabelsJob(JobCompletionNotificationListener listener, Step stepJms) {
return jobBuilderFactory.get("importJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(stepJms)
.end()
.build();
}
@Bean
public Step stepJms(ItemWriter<CodeAndLabel> writer) {
return stepBuilderFactory.get("stepJms")
.<Code, CodeAndLabel>chunk(10)
.reader(reader())
.processor(processor())
.writer(writer)
.build();
}
}
Reader :
public class CodeAndLabelRestItemReader implements ItemReader<Code>{
private final RestWebClient webClient;
private int nextCodeIndex;
private List<Code> codes;
public CodeAndLabelRestItemReader(RestWebClient webClient) {
this.webClient = webClient;
nextCodeIndex = 0;
}
@BeforeStep
public void beforeStep(final StepExecution stepExecution) {
JobParameters jobParameters = stepExecution.getJobParameters();
this.webClient.setEndPointSuffix(jobParameters.getString("endPointSuffix"));
}
@Override
public Code read() {
if(codesAndLabelsListNotInitialized()) {
codes = webClient.getCodes();
}
Code nextCode = null;
if (nextCodeIndex < codes.size()) {
nextCode = codes.get(nextCodeIndex);
nextCodeIndex++;
}
return nextCode;
}
private boolean codesAndLabelsListNotInitialized() {
return this.codes == null;
}
}
Processor :
public class CodeAndLabelItemProcessor implements ItemProcessor<Code, CodeAndLabel> {
private String populationId;
private String populationDataProvider;
private String transactionId;
public CodeAndLabelItemProcessor(String populationId, String populationDataProvider, String transactionId) {
this.populationId = populationId;
this.populationDataProvider = populationDataProvider;
this.transactionId = transactionId;
}
@Override
public CodeAndLabel process(Code code) throws Exception {
CodeAndLabel codeAndLabel = new CodeAndLabel();
codeAndLabel.setUid(code.getUid());
System.out.println("Converting (" + code + ") into (" + codeAndLabel + ")");
return codeAndLabel;
}
}
Writer :
public class CodeAndLabelItemWriter implements ItemWriter<CodeAndLabel>{
private AmqpTemplate template;
public CodeAndLabelItemWriter(AmqpTemplate template) {
this.template = template;
}
@Override
public void write(List<? extends CodeAndLabel> items) throws Exception {
if (log.isDebugEnabled()) {
log.debug("Writing to RabbitMQ with " + items.size() + " items."); }
for(CodeAndLabel item : items) {
template.convertAndSend(BatchConfiguration.topicExchangeName,"com.batchprocessing.queue",item);
System.out.println("item : "+item);
}
}
}
Listener :
@Component
public class JobCompletionNotificationListener extends JobExecutionListenerSupport {
@Autowired
private JdbcTemplate jdbcTemplate;
@Override
public void afterJob(JobExecution jobExecution) {
if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
System.out.println("JOB FINISHED");
}
}
}
And the class running the job :
@Component
public class Initialization {
// some code here
String[] languages = processLanguage.split(";");
for(String language : languages) {
JobParameters params = new JobParametersBuilder()
.addString("JobID",String.valueOf(System.currentTimeMillis()))
.addString("endPointSuffix",
"/codeAndLabel".concat(language.toUpperCase()))
.toJobParameters();
jobLauncher.run(job, params);
}
Output :
For the first iteration:
Converting (WFR.SP.2C) into (WFR.SP.2C)
Converting (WFR.SP.3E) into (WFR.SP.3E)
Converting (WFR.SP.FC) into (WFR.SP.FC)
Converting (WFR.SP.FD) into (WFR.SP.FD)
Converting (WFR.SP.FI) into (WFR.SP.FI)
Converting (WFR.SP.FM) into (WFR.SP.FM)
item : WFR.SP.2C
item : WFR.SP.3E
item : WFR.SP.FC
item : WFR.SP.FD
item : WFR.SP.FI
item : WFR.SP.FM
JOB FINISHED
For the second iteration:
JOB FINISHED
I think that in the second iteration the job is not running the reader, processor, and writer beans, and I don't know why.
Can anyone give me some help on this, please?
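A likely cause, not confirmed in this thread: the reader is declared as a plain singleton bean, so its codes list and nextCodeIndex fields survive the first execution and read() returns null immediately on the next run. Making the reader step-scoped would give every job execution a fresh instance, in the spirit of the step-scoping fixes shown elsewhere on this page. A minimal sketch, reusing the existing CodeAndLabelRestItemReader:
@Bean
@StepScope
public ItemReader<Code> reader() {
    // a new reader instance per step execution, so no state (codes list, index)
    // is carried over from a previous job run
    return new CodeAndLabelRestItemReader(webClient);
}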

How to run a step in parallel mode with Spring Batch

I'm working on a Spring Batch job. I have a partitioning step (over a list of objects) and then a slave step with a reader and a writer.
I want to execute the processStep in parallel. So, I want specific reader/writer instances for each partition.
For the moment, the created partitions use the same reader/writer instances, so those operations are done serially: the first partition is read and written, and then the same is done for the next one once the first has completed.
The spring boot configuration class:
@Configuration
@Import({ DataSourceConfiguration.class})
public class BatchConfiguration {
private final static int COMMIT_INTERVAL = 1;
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Autowired
@Qualifier(value="mySqlDataSource")
private DataSource mySqlDataSource;
public static int GRID_SIZE = 3;
public static List<Pojo> myList;
@Bean
public Job myJob() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
return jobBuilderFactory.get("myJob")
.incrementer(new RunIdIncrementer())
.start(partitioningStep())
.build();
}
#Bean(name="partitionner")
public MyPartitionner partitioner() {
return new MyPartitionner();
}
@Bean
public SimpleAsyncTaskExecutor taskExecutor() {
SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
taskExecutor.setConcurrencyLimit(GRID_SIZE);
return taskExecutor;
}
@Bean
public Step partitioningStep() throws NonTransientResourceException, Exception {
return stepBuilderFactory.get("partitioningStep")
.partitioner("processStep", partitioner())
.step(processStep())
.taskExecutor(taskExecutor())
.build();
}
@Bean
public Step processStep() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
return stepBuilderFactory.get("processStep")
.<List<Pojo>, List<Pojo>> chunk(COMMIT_INTERVAL)
.reader(processReader())
.writer(processWriter())
.taskExecutor(taskExecutor())
.build();
}
@Bean
public ProcessReader processReader() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
return new ProcessReader();
}
@Bean
public ProcessWriter processWriter() {
return new ProcessWriter();
}
}
The partitioner class
public class MyPartitionner implements Partitioner{
@Autowired
private IService service;
@Override
public Map<String, ExecutionContext> partition(int gridSize) {
// list of 300 objects, partitioned as below
...
Map<String, ExecutionContext> partitionData = new HashMap<String, ExecutionContext>();
ExecutionContext executionContext0 = new ExecutionContext();
executionContext0.putString("from", Integer.toString(0));
executionContext0.putString("to", Integer.toString(100));
partitionData.put("Partition0", executionContext0);
ExecutionContext executionContext1 = new ExecutionContext();
executionContext1.putString("from", Integer.toString(101));
executionContext1.putString("to", Integer.toString(200));
partitionData.put("Partition1", executionContext1);
ExecutionContext executionContext2 = new ExecutionContext();
executionContext2.putString("from", Integer.toString(201));
executionContext2.putString("to", Integer.toString(299));
partitionData.put("Partition2", executionContext2);
return partitionData;
}
}
The Reader class
public class ProcessReader implements ItemReader<List<Pojo>>, ChunkListener {
@Autowired
private IService service;
private StepExecution stepExecution;
private static List<String> processedIntervals = new ArrayList<String>();
@Override
public List<Pojo> read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
System.out.println("Instance reference: "+this.toString());
if(stepExecution.getExecutionContext().containsKey("from") && stepExecution.getExecutionContext().containsKey("to")){
Integer from = Integer.valueOf(stepExecution.getExecutionContext().get("from").toString());
Integer to = Integer.valueOf(stepExecution.getExecutionContext().get("to").toString());
if(from != null && to != null && !processedIntervals.contains(from + "" + to) && to < BatchConfiguration.myList.size()){
processedIntervals.add(String.valueOf(from + "" + to));
return BatchConfiguration.myList.subList(from, to);
}
}
return null;
}
@Override
public void beforeChunk(ChunkContext context) {
this.stepExecution = context.getStepContext().getStepExecution();
}
@Override
public void afterChunk(ChunkContext context) { }
@Override
public void afterChunkError(ChunkContext context) { }
}
The writer class
public class ProcessWriter implements ItemWriter<List<Pojo>>{
private final static Logger LOGGER = LoggerFactory.getLogger(ProcessWriter.class);
@Autowired
private IService service;
@Override
public void write(List<? extends List<Pojo>> pojos) throws Exception {
if(!pojos.isEmpty()){
for(Pojo item : pojos.get(0)){
try {
service.remove(item.getId());
} catch (Exception e) {
LOGGER.error("Error occured while removing the item [" + item.getId() + "]", e);
}
}
}
}
}
Can you please tell me what is wrong with my code?
Resolved by adding @StepScope to my reader and writer bean declarations:
@Configuration
@Import({ DataSourceConfiguration.class})
public class BatchConfiguration {
...
@Bean
@StepScope
public ProcessReader processReader() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
return new ProcessReader();
}
@Bean
@StepScope
public ProcessWriter processWriter() {
return new ProcessWriter();
}
...
}
This way, I have a different instance of the chunk components (reader and writer) for each partition.
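As a side note on the same idea: once the reader is step-scoped, the partition boundaries can also be injected with late binding instead of being pulled from the StepExecution in a ChunkListener. This is only a rough sketch and assumes ProcessReader were given a (from, to) constructor, which the original class does not have:
@Bean
@StepScope
public ProcessReader processReader(
        @Value("#{stepExecutionContext['from']}") Integer from,
        @Value("#{stepExecutionContext['to']}") Integer to) {
    // each partition gets its own reader instance, bound to its own [from, to] range
    return new ProcessReader(from, to);
}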

spring batch, Itemwriterlistener is not registered and therefore not invoked, why?

I am trying to add a step listener (ItemWriteListener) to my annotation-based batch configuration. There are no errors at all, but it is not invoked. Why? It works in the old XML configuration style, but not when using annotations.
Code below; the reader and processor are left out.
@ImportResource({ "classpath*:transform-delegator-job.xml", "classpath:config/context.xml" })
@SpringBootApplication
public class SpringBootTransformDelegatorJobApplication {
private final Logger logger = LoggerFactory.getLogger(this.getClass());
private static final List<String> OVERRIDDEN_BY_EXPRESSION_LIST = null;
private static final String OVERRIDDEN_BY_EXPRESSION_STRING = null;
@Autowired
private JobBuilderFactory jobBuilders;
@Autowired
private StepBuilderFactory stepBuilders;
@Bean
public JobBuilderFactory jobBuilderFactory(JobRepository jobRepository) {
return new JobBuilderFactory(jobRepository);
}
@Bean
public StepBuilderFactory stepBuilderFactory(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
return new StepBuilderFactory(jobRepository, transactionManager);
}
@Bean
@StepScope
public ItemWriter<Record> fileItemWriter(@Value("#{jobParameters['tews.customer.url']}") String url, @Value("#{jobParameters['tews.customer.user']}") String user,
@Value("#{jobParameters['tews.customer.pwd']}") String pwd) {
FileItemWriter writer = new FileItemWriter();
TewsClient client = TewsClientFactory.getInstance(user, pwd, url);
writer.setTewsClient(client);
writer.setHrObjectDao(hrObjectDao(OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING));
return writer;
}
@Bean
@StepScope
public FlatFileItemReader<FieldSet> reader(@Value("#{jobParameters['input.file.delimitter']}") String delimitter, @Value("#{jobParameters['input.file.names']}") String filePath,
@Value("#{jobParameters['input.file.encoding']}") String encoding) throws Exception {
FlatFileItemReader<FieldSet> reader = new FlatFileItemReader<FieldSet>();
PathResource pathResources = new PathResource(Paths.get(filePath));
Scanner scanner = new Scanner(pathResources.getInputStream());
String names = scanner.nextLine();
scanner.close();
DelimitedLineTokenizer delimitedLineTokenizer = new DelimitedLineTokenizer();
delimitedLineTokenizer.setNames(names.split(delimitter));
delimitedLineTokenizer.setDelimiter(delimitter);
DefaultLineMapper<FieldSet> defaultLineMapper = new DefaultLineMapper<FieldSet>();
defaultLineMapper.setLineTokenizer(delimitedLineTokenizer);
defaultLineMapper.setFieldSetMapper(new PassThroughFieldSetMapper());
reader.setLineMapper(defaultLineMapper);
reader.setLinesToSkip(1);
reader.setEncoding(encoding);
reader.afterPropertiesSet();
return reader;
}
@Bean
@StepScope
public ItemProcessor<FieldSet, Record> csvFeedValidateProcessor(@Value("#{jobParameters['input.file.imeconfig.path']}") String imeConfigPath) {
FieldCollectionConfiguration fieldCollectionConfiguration = null;
try {
XMLUnmarshaller<FieldcollectionType> unmarshaller = new XMLUnmarshaller<FieldcollectionType>();
fieldCollectionConfiguration = fieldCollectionBeanToModelTransform().transform(unmarshaller.unmarshallByFile(FieldcollectionType.class, new File(imeConfigPath)));
} catch (UnmarshallingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
CsvFeedTransformProcessor csvFeedTransformProcessor = new CsvFeedTransformProcessor();
csvFeedTransformProcessor.setFieldCollectionConfiguration(fieldCollectionConfiguration);
return csvFeedTransformProcessor;
}
@Bean
@StepScope
public HRObjectDao hrObjectDao(@Value("#{jobParameters['ldap.customer.url']}") String url, @Value("#{jobParameters['ldap.customer.user']}") String user,
@Value("#{jobParameters['ldap.customer.pwd']}") String pwd, @Value("#{jobParameters['ldap.customer.bcontext']}") String bcontext) {
return new HRObjectDaoImpl(bcontext, url, user, pwd);
}
@Bean
public Transform<FieldcollectionType, FieldCollectionConfiguration> fieldCollectionBeanToModelTransform() {
return new FieldCollectionBeanToModelTransform();
}
@Bean
@StepScope
public MultiResourceItemReader<FieldSet> multiResourceReader(@Value("#{jobParameters['input.file.paths'].split(',')}") List<String> filePathList) throws Exception {
MultiResourceItemReader<FieldSet> multiResourceItemReader = new MultiResourceItemReader<FieldSet>();
multiResourceItemReader.setDelegate(reader(OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING));
PathResource[] pathResources = new PathResource[filePathList.size()];
for (int i = 0; i < filePathList.size(); i++) {
pathResources[i] = new PathResource(Paths.get(filePathList.get(i)));
}
multiResourceItemReader.setResources(pathResources);
return multiResourceItemReader;
}
@Bean
public JobParametersIncrementer jobParametersIncrementer() {
return new RunIdIncrementer();
}
@Bean
public Job job() throws Exception {
return jobBuilders.get("feedfiletransformer-delegate-job").listener(feedJobExecutionListener()).start(step1()).incrementer(jobParametersIncrementer()).build();
}
@Bean
public Step step1() throws Exception {
return stepBuilders.get("step1").listener(fileItemWriteListener(OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING)).<FieldSet, Record>chunk(1)
.reader(multiResourceReader(OVERRIDDEN_BY_EXPRESSION_LIST)).processor(csvFeedValidateProcessor(OVERRIDDEN_BY_EXPRESSION_STRING))
.writer(fileItemWriter(OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING, OVERRIDDEN_BY_EXPRESSION_STRING)).build();
}
@Bean
public FeedFileHandler feedFileHandler() {
return new FeedFileHandlerImpl();
}
@Bean
@StepScope
public ItemWriteListener<Path> fileItemWriteListener(@Value("#{jobParameters['feeddumpDirPath']}") String feeddumpDirPath,
@Value("#{jobParameters['processedOkDirPath']}") String processedOkDirPath, @Value("#{jobParameters['processedFailedDirPath']}") String processedFailedDirPath) {
FileItemWriteListener fileItemWriteListener = new FileItemWriteListener();
fileItemWriteListener.setFeedProcessedFailedDirectory(processedFailedDirPath);
fileItemWriteListener.setFeedProcessedOkDirectory(processedOkDirPath);
fileItemWriteListener.setFeeddumpDirPath(feeddumpDirPath);
fileItemWriteListener.setFeedFileHandler(feedFileHandler());
fileItemWriteListener.setRetryLimit(0);
return fileItemWriteListener;
}
@Bean
public JobExecutionListener feedJobExecutionListener() {
return new FeedJobExecutionListener();
}
public static void main(String[] args) throws Exception {
SpringApplication.run(SpringBootTransformDelegatorJobApplication.class, args);
}
}
For the record, it's been my experience that the best way to handle Spring Java configuration is to return the explicit type from the @Bean method and inject the interface where needed. The reason is that the @Bean method signature provides the type for the BeanDefinition. So if you return an interface, you may be hiding details that the framework needs while gaining virtually no benefit. So in your example, I'd change
@Bean
@StepScope
public ItemWriteListener fileItemWriteListener(@Value("#{jobParameters['feeddumpDirPath']}") String feeddumpDirPath, @Value("#{jobParameters['processedOkDirPath']}") String processedOkDirPath,
@Value("#{jobParameters['processedFailedDirPath']}") String processedFailedDirPath) {
To
@Bean
@StepScope
public FileItemWriteListener fileItemWriteListener(@Value("#{jobParameters['feeddumpDirPath']}") String feeddumpDirPath, @Value("#{jobParameters['processedOkDirPath']}") String processedOkDirPath,
@Value("#{jobParameters['processedFailedDirPath']}") String processedFailedDirPath) {
After a lot of trial and error, I found the reason; it was not easy to see.
@Bean
@StepScope
public ItemWriteListener fileItemWriteListener(@Value("#{jobParameters['feeddumpDirPath']}") String feeddumpDirPath, @Value("#{jobParameters['processedOkDirPath']}") String processedOkDirPath,
@Value("#{jobParameters['processedFailedDirPath']}") String processedFailedDirPath) {
FileItemWriteListener fileItemWriteListener = new FileItemWriteListener();
fileItemWriteListener.setFeedProcessedFailedDirectory(processedFailedDirPath);
fileItemWriteListener.setFeedProcessedOkDirectory(processedOkDirPath);
fileItemWriteListener.setFeeddumpDirPath(feeddumpDirPath);
fileItemWriteListener.setFeedFileHandler(feedFileHandler());
fileItemWriteListener.setRetryLimit(0);
return fileItemWriteListener;
}
The method [public ItemWriteListener fileItemWriteListener]
was previously declared with a type parameter, in the same style as
public ItemWriter<Record> fileItemWriter,
so it only works without the type parameter.

How to optimize my performances using FlatFileItemReader and Asynchronous Processors

I have a simple CSV file with ~400,000 lines (one column only).
It takes a lot of time to read the records and process them.
The processor validates records against Couchbase.
The writer writes to a remote topic.
It takes around 30 minutes, which is far too long.
I read that FlatFileItemReader is not thread safe, so my chunk value is 1.
I read that asynchronous processing could help, but I can't see any improvement.
This is my code:
@Configuration
@EnableBatchProcessing
public class NotificationFileProcessUploadedFileJob {
@Value("${expected.snid.header}")
public String snidHeader;
@Value("${num.of.processing.chunks.per.file}")
public int numOfProcessingChunksPerFile;
@Autowired
private InfrastructureConfigurationConfig infrastructureConfigurationConfig;
private static final String OVERRIDDEN_BY_EXPRESSION = null;
@Inject
private JobBuilderFactory jobs;
@Inject
private StepBuilderFactory stepBuilderFactory;
@Inject
ExecutionContextPromotionListener executionContextPromotionListener;
@Bean
public Job processUploadedFileJob() throws Exception {
return this.jobs.get("processUploadedFileJob").start((processSnidUploadedFileStep())).build();
}
@Bean
public Step processSnidUploadedFileStep() {
return stepBuilderFactory.get("processSnidFileStep")
.<PushItemDTO, PushItemDTO>chunk(numOfProcessingChunksPerFile)
.reader(snidFileReader(OVERRIDDEN_BY_EXPRESSION))
.processor(asyncItemProcessor())
.writer(asyncItemWriter())
// .throttleLimit(20)
// .taskJobExecutor(infrastructureConfigurationConfig.taskJobExecutor())
// .faultTolerant()
// .skipLimit(10) //default is set to 0
// .skip(MySQLIntegrityConstraintViolationException.class)
.build();
}
@Inject
ItemWriter writer;
@Bean
public AsyncItemWriter asyncItemWriter() {
AsyncItemWriter asyncItemWriter=new AsyncItemWriter();
asyncItemWriter.setDelegate(writer);
return asyncItemWriter;
}
@Bean
@Scope(value = "step", proxyMode = ScopedProxyMode.INTERFACES)
public ItemStreamReader<PushItemDTO> snidFileReader(@Value("#{jobParameters[filePath]}") String filePath) {
FlatFileItemReader<PushItemDTO> itemReader = new FlatFileItemReader<PushItemDTO>();
itemReader.setLineMapper(snidLineMapper());
itemReader.setLinesToSkip(1);
itemReader.setResource(new FileSystemResource(filePath));
return itemReader;
}
@Bean
public AsyncItemProcessor asyncItemProcessor() {
AsyncItemProcessor<PushItemDTO, PushItemDTO> asyncItemProcessor = new AsyncItemProcessor();
asyncItemProcessor.setDelegate(processor(OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION,
OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION));
asyncItemProcessor.setTaskExecutor(infrastructureConfigurationConfig.taskProcessingExecutor());
return asyncItemProcessor;
}
@Scope(value = "step", proxyMode = ScopedProxyMode.INTERFACES)
@Bean
public ItemProcessor<PushItemDTO, PushItemDTO> processor(@Value("#{jobParameters[pushMessage]}") String pushMessage,
@Value("#{jobParameters[jobId]}") String jobId,
@Value("#{jobParameters[taskId]}") String taskId,
@Value("#{jobParameters[refId]}") String refId,
@Value("#{jobParameters[url]}") String url,
@Value("#{jobParameters[targetType]}") String targetType,
@Value("#{jobParameters[gameType]}") String gameType) {
return new PushItemProcessor(pushMessage, jobId, taskId, refId, url, targetType, gameType);
}
@Bean
public LineMapper<PushItemDTO> snidLineMapper() {
DefaultLineMapper<PushItemDTO> lineMapper = new DefaultLineMapper<PushItemDTO>();
DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
lineTokenizer.setDelimiter(",");
lineTokenizer.setStrict(true);
String[] splittedHeader = snidHeader.split(",");
lineTokenizer.setNames(splittedHeader);
BeanWrapperFieldSetMapper<PushItemDTO> fieldSetMapper = new BeanWrapperFieldSetMapper<PushItemDTO>();
fieldSetMapper.setTargetType(PushItemDTO.class);
lineMapper.setLineTokenizer(lineTokenizer);
lineMapper.setFieldSetMapper(new PushItemFieldSetMapper());
return lineMapper;
}
}
@Bean
@Override
public SimpleAsyncTaskExecutor taskProcessingExecutor() {
SimpleAsyncTaskExecutor simpleAsyncTaskExecutor = new SimpleAsyncTaskExecutor();
simpleAsyncTaskExecutor.setConcurrencyLimit(300);
return simpleAsyncTaskExecutor;
}
How do you think I could improve the processing performance and make it faster?
Thank you.
ItemWriter code:
@Bean
public ItemWriter writer() {
return new KafkaWriter();
}
public class KafkaWriter implements ItemWriter<PushItemDTO> {
private static final Logger logger = LoggerFactory.getLogger(KafkaWriter.class);
@Autowired
KafkaProducer kafkaProducer;
@Override
public void write(List<? extends PushItemDTO> items) throws Exception {
for (PushItemDTO item : items) {
try {
logger.debug("Writing to kafka=" + item);
sendMessageToKafka(item);
} catch (Exception e) {
logger.error("Error writing item=" + item.toString(), e);
}
}
}
}
Increasing your commit count is where I'd begin. Keep in mind what the commit count means. Since you have it set at 1, you are doing the following for each item:
Start a transaction
Read an item
Process the item
Write the item
Update the job repository
Commit the transaction
Your configuration doesn't show what the delegate ItemWriter is so I can't tell, but at a minimum you are executing multiple SQL statements per item to update the job repository.
You are correct that the FlatFileItemReader is not thread safe. However, you aren't using multiple threads to read, only to process, so there is no reason to set the commit count to 1 from what I can see.
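To illustrate the commit-count advice, here is a rough sketch of the step with a larger commit interval, reusing the bean methods from the question. The value 1000 is an arbitrary example, and the Future generic (java.util.concurrent.Future) reflects how AsyncItemProcessor hands its results to AsyncItemWriter:
@Bean
public Step processSnidUploadedFileStep() {
    // with a commit interval of 1000, the transaction and job-repository bookkeeping
    // happen once per 1000 items instead of once per item
    return stepBuilderFactory.get("processSnidFileStep")
            .<PushItemDTO, Future<PushItemDTO>>chunk(1000)
            .reader(snidFileReader(OVERRIDDEN_BY_EXPRESSION))
            .processor(asyncItemProcessor())
            .writer(asyncItemWriter())
            .build();
}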
