Convert Message to Job to make it Spring Integration with Batch Processing - java

I am trying to process a series of files using Spring Integration in a batch fashion. I have this very old XML config which converts the messages into jobs:
<int:transformer ref="messageToJobTransformer"/>
<batch-int:job-launching-gateway job-launcher="jobLauncher"/>
The messageToJobTransformer is a class which can convert a Message into a Job. The problem is I don't know where this class is now, nor do I want an XML config. I want it to be pure Java DSL. Here is my simple config:
return IntegrationFlows.from(Files.inboundAdapter(directory)
                .preventDuplicates()
                .patternFilter("*.txt"))
        .handle(jobLaunchingGw())
        .get();
And here is my bean for the gateway.
@Autowired
private JobLauncher jobLauncher;

@Bean
public MessageHandler jobLaunchingGw() {
    return new JobLaunchingGateway(jobLauncher);
}
EDIT: Updating the BatchConfig class.
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    @Bean
    public ItemReader<String> reader(@Value("#{jobParameters['input.file.name']}") String filename) throws MalformedURLException {
        FlatFileItemReader<String> reader = new FlatFileItemReader<String>();
        return reader;
    }

    @Bean
    public Job job() throws MalformedURLException {
        return jobs.get("job").start(step()).build();
    }

    @Bean
    public Step step() throws MalformedURLException {
        return steps.get("step").<String, String>chunk(5)
                .reader(reader(null))
                .writer(writer(null))
                .build();
    }

    @Bean
    public ItemWriter<String> writer(@Value("#{jobParameters['input.file.name']}") String filename) {
        FlatFileItemWriter<String> writer = new FlatFileItemWriter<String>();
        return writer;
    }
}

Your question isn't clear. The JobLaunchingGateway expects a JobLaunchRequest as its payload.
Since your integration flow begins from Files.inboundAdapter(directory), I can assume that you have some Job definitions there. So, what you need here is some class which can parse the file and return a JobLaunchRequest.
Something like this from the Spring Batch Reference Manual:
public class FileMessageToJobRequest {

    private Job job;
    private String fileParameterName;

    public void setFileParameterName(String fileParameterName) {
        this.fileParameterName = fileParameterName;
    }

    public void setJob(Job job) {
        this.job = job;
    }

    @Transformer
    public JobLaunchRequest toRequest(Message<File> message) {
        JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
        jobParametersBuilder.addString(fileParameterName,
                message.getPayload().getAbsolutePath());
        return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
    }
}
After defining that class as a @Bean, you can use it from the .transform() EIP method just before your .handle(jobLaunchingGw()).
UPDATE
@Bean
public FileMessageToJobRequest fileMessageToJobRequest(Job job) {
    FileMessageToJobRequest fileMessageToJobRequest = new FileMessageToJobRequest();
    fileMessageToJobRequest.setJob(job);
    fileMessageToJobRequest.setFileParameterName("file");
    return fileMessageToJobRequest;
}
...
@Bean
public IntegrationFlow flowToBatch(FileMessageToJobRequest fileMessageToJobRequest) {
    return IntegrationFlows
            .from(Files.inboundAdapter(directory)
                    .preventDuplicates()
                    .patternFilter("*.txt"))
            .transform(fileMessageToJobRequest)
            .handle(jobLaunchingGw())
            .get();
}

Related

Java Spring Batch - Resource file is not injected into the tasklet

I'm doing the Java examples from the book Spring Batch in Action, chapter 1.
In this example, a tasklet unzips a zip file. The tasklet receives the zip file path as a job parameter.
I implemented a test method that runs the job and passes the parameters.
@StepScope
@Component
public class DecompressTasklet implements Tasklet {

    private static final Logger LOGGER = LogManager.getLogger(DecompressTasklet.class);

    @Value("#{jobParameters['inputResource']}")
    private Resource inputResource;

    @Value("#{jobParameters['targetDirectory']}")
    private String targetDirectory;

    @Value("#{jobParameters['targetFile']}")
    private String targetFile;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        //code here
    }
}
@Configuration
public class DescompressStep {

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private DecompressTasklet decompressTasklet;

    @Bean
    public Step stepDescompress() {
        return stepBuilderFactory
                .get(DescompressStep.class.getSimpleName())
                .tasklet(decompressTasklet)
                .build();
    }
}
@EnableBatchProcessing
@Configuration
public class ImportProductsJob {

    @Autowired
    private DescompressStep descompressStep;

    @Autowired
    private ReadWriteProductStep readWriteProductStep;

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory) {
        return jobBuilderFactory
                .get("importProductsJob")
                .start(descompressStep.stepDescompress())
                .next(readWriteProductStep.stepReaderWriter())
                .incrementer(new RunIdIncrementer())
                .build();
    }
}
Below is the test code that runs the job
@RunWith(SpringRunner.class)
@SpringBootTest
@SpringBatchTest
@AutoConfigureTestDatabase
public class ImportProductsIntegrationTest {

    @Autowired
    private JobRepositoryTestUtils jobRepositoryTestUtils;

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @After
    public void cleanUp() {
        jobRepositoryTestUtils.removeJobExecutions();
    }

    @Test
    public void importProducts() throws Exception {
        jobLauncherTestUtils.launchJob(defaultJobParameters());
    }

    private JobParameters defaultJobParameters() {
        JobParametersBuilder paramsBuilder = new JobParametersBuilder();
        paramsBuilder.addString("inputResource", "classpath:input/products.zip");
        paramsBuilder.addString("targetDirectory", "./target/importproductsbatch/");
        paramsBuilder.addString("targetFile", "products.txt");
        paramsBuilder.addLong("timestamp", System.currentTimeMillis());
        return paramsBuilder.toJobParameters();
    }
}
The products.zip file is in src/main/resources/input
The problem is that when running the test, the following error occurs:
java.lang.NullPointerException: null
at com.springbatch.inaction.ch01.DecompressTasklet.execute(DecompressTasklet.java:62) ~[classes/:na]
I verified that the inputResource property is null. Why does this error occur?
In your job definition, you have:
@Bean
public Job job(JobBuilderFactory jobBuilderFactory) {
    return jobBuilderFactory
            .get("importProductsJob")
            .start(descompressStep.stepDescompress())
            .next(readWriteProductStep.stepReaderWriter())
            .incrementer(new RunIdIncrementer())
            .build();
}
The way you are passing steps to the start and next methods is incorrect (I don't even see how this would compile). What you can do is import the step configuration classes and inject both steps in your job definition. Something like:
@EnableBatchProcessing
@Configuration
@Import({DescompressStep.class, ReadWriteProductStep.class})
public class ImportProductsJob {

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory,
                   Step stepDescompress, Step stepReaderWriter) {
        return jobBuilderFactory
                .get("importProductsJob")
                .start(stepDescompress)
                .next(stepReaderWriter)
                .incrementer(new RunIdIncrementer())
                .build();
    }
}

Spring Batch doesn't call both ItemProcessor and ItemWriter in chunk-flow

I have a Spring Batch application that gets a file from a Samba server and generates a new file in a different folder on the same server.
However, only the ItemReader is called in the flow.
What is the problem? Thanks.
BatchConfiguration:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration extends BaseConfiguration {

    @Bean
    public ValeTrocaItemReader reader() {
        return new ValeTrocaItemReader();
    }

    @Bean
    public ValeTrocaItemProcessor processor() {
        return new ValeTrocaItemProcessor();
    }

    @Bean
    public ValeTrocaItemWriter writer() {
        return new ValeTrocaItemWriter();
    }

    @Bean
    public Job importUserJob(JobCompletionNotificationListener listener) throws Exception {
        return jobBuilderFactory()
                .get("importUserJob")
                .incrementer(new RunIdIncrementer())
                .repository(getJobRepository())
                .listener(listener)
                .start(this.step1())
                .build();
    }

    @Bean
    public Step step1() throws Exception {
        return stepBuilderFactory()
                .get("step1")
                .<ValeTroca, ValeTroca>chunk(10)
                .reader(this.reader())
                .processor(this.processor())
                .writer(this.writer())
                .build();
    }
}
BaseConfiguration:
public class BaseConfiguration implements BatchConfigurer {

    @Bean
    @Override
    public PlatformTransactionManager getTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    @Override
    public SimpleJobLauncher getJobLauncher() throws Exception {
        final SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
        simpleJobLauncher.setJobRepository(this.getJobRepository());
        return simpleJobLauncher;
    }

    @Bean
    @Override
    public JobRepository getJobRepository() throws Exception {
        return new MapJobRepositoryFactoryBean(this.getTransactionManager()).getObject();
    }

    @Bean
    @Override
    public JobExplorer getJobExplorer() {
        MapJobRepositoryFactoryBean repositoryFactory = this.getMapJobRepositoryFactoryBean();
        return new SimpleJobExplorer(repositoryFactory.getJobInstanceDao(), repositoryFactory.getJobExecutionDao(),
                repositoryFactory.getStepExecutionDao(), repositoryFactory.getExecutionContextDao());
    }

    @Bean
    public MapJobRepositoryFactoryBean getMapJobRepositoryFactoryBean() {
        return new MapJobRepositoryFactoryBean(this.getTransactionManager());
    }

    @Bean
    public JobBuilderFactory jobBuilderFactory() throws Exception {
        return new JobBuilderFactory(this.getJobRepository());
    }

    @Bean
    public StepBuilderFactory stepBuilderFactory() throws Exception {
        return new StepBuilderFactory(this.getJobRepository(), this.getTransactionManager());
    }
}
ValeTrocaItemReader:
@Configuration
public class ValeTrocaItemReader implements ItemReader<ValeTroca> {

    @Value(value = "${url}")
    private String url;

    @Value(value = "${user}")
    private String user;

    @Value(value = "${password}")
    private String password;

    @Value(value = "${domain}")
    private String domain;

    @Value(value = "${inputDirectory}")
    private String inputDirectory;

    @Bean
    @Override
    public ValeTroca read() throws MalformedURLException, SmbException, IOException, Exception {
        File tempOutputFile = getInputFile();
        DefaultLineMapper<ValeTroca> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(new DelimitedLineTokenizer() {
            {
                setDelimiter(";");
                setNames(new String[]{"id_participante", "cpf", "valor"});
            }
        });
        lineMapper.setFieldSetMapper(
                new BeanWrapperFieldSetMapper<ValeTroca>() {
                    {
                        setTargetType(ValeTroca.class);
                    }
                });
        FlatFileItemReader<ValeTroca> itemReader = new FlatFileItemReader<>();
        itemReader.setLinesToSkip(1);
        itemReader.setResource(new FileUrlResource(tempOutputFile.getCanonicalPath()));
        itemReader.setLineMapper(lineMapper);
        itemReader.open(new ExecutionContext());
        tempOutputFile.deleteOnExit();
        return itemReader.read();
    }
}
Sample of ItemProcessor:
public class ValeTrocaItemProcessor implements ItemProcessor<ValeTroca, ValeTroca> {

    @Override
    public ValeTroca process(ValeTroca item) {
        //Do anything
        ValeTroca item2 = item;
        System.out.println(item2.getCpf());
        return item2;
    }
}
EDIT: Spring Boot 2.1.2.RELEASE, Spring Batch 4.1.1.RELEASE.
Looking at your configuration, here are a couple of notes:
BatchConfiguration looks good. That's a typical job with a single chunk-oriented step.
BaseConfiguration is actually the default configuration you get when using @EnableBatchProcessing without providing a datasource, so this class can be removed.
Adding @Configuration on ValeTrocaItemReader and marking the method read() with @Bean is not correct. This means you are declaring a bean named read of type ValeTroca in your application context. Moreover, your custom reader uses a FlatFileItemReader but adds no value over a plain FlatFileItemReader. You can declare your reader as a FlatFileItemReader and configure it as needed (resource, line mapper, etc.), as in the sketch below. This will also avoid the mistake of opening the execution context in the read method, which should be done when initializing the reader, or in ItemStream#open if the reader implements ItemStream.
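For illustration, a minimal sketch of such a step-scoped reader bean, assuming the file path is passed as a job parameter (the bean method name and the jobParameters key 'inputFile' are made up; adapt them to how you resolve the Samba file):
@Bean
@StepScope
public FlatFileItemReader<ValeTroca> valeTrocaReader(
        @Value("#{jobParameters['inputFile']}") String inputFile) {
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer(";");
    tokenizer.setNames("id_participante", "cpf", "valor");

    BeanWrapperFieldSetMapper<ValeTroca> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(ValeTroca.class);

    DefaultLineMapper<ValeTroca> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(fieldSetMapper);

    FlatFileItemReader<ValeTroca> reader = new FlatFileItemReader<>();
    reader.setResource(new FileSystemResource(inputFile));
    reader.setLinesToSkip(1);
    reader.setLineMapper(lineMapper);
    // No manual open()/read() here: the step opens and closes the reader
    // for you because FlatFileItemReader implements ItemStream.
    return reader;
}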
Other than that, I don't see from what you shared why the processor and writer are not called.
SOLVED: The problem was that even though I'm not using any databases, Spring Batch, although configured to keep the JobRepository in memory, needs a database (usually H2) to save its metadata tables (job configuration, executions, etc.).
In this case, the JDBC and H2 dependencies were missing from pom.xml. Just adding them to the project solved the problem!
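For reference, a sketch of the kind of dependencies that were missing (Maven coordinates; versions are managed by the Spring Boot parent):
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>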

How to launch Spring Batch Job Asynchronously

I have followed the Spring Batch docs and couldn't get my job running asynchronously.
I am running the job from a web container, and the job is triggered via a REST endpoint.
I want to get the JobInstance ID and pass it in the response before the whole job completes, so callers can check the status of the job later with the JobInstance ID instead of waiting. But I couldn't get it to work. Below is the sample code I tried. Please let me know what I am missing or doing wrong.
BatchConfig to make the JobLauncher asynchronous:
@Configuration
public class BatchConfig {

    @Autowired
    JobRepository jobRepository;

    @Bean
    public JobLauncher simpleJobLauncher() throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(jobRepository);
        jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }
}
Controller
@Autowired
JobLauncher jobLauncher;

@RequestMapping(value = "/trigger-job", method = RequestMethod.GET)
public Long workHard() throws Exception {
    JobParameters jobParameters = new JobParametersBuilder()
            .addLong("time", System.currentTimeMillis())
            .toJobParameters();
    JobExecution jobExecution = jobLauncher.run(batchComponent.customJob("paramhere"), jobParameters);
    System.out.println(jobExecution.getJobInstance().getInstanceId());
    System.out.println("OK RESPONSE");
    return jobExecution.getJobInstance().getInstanceId();
}
And the job builder as a component:
@Component
public class BatchComponent {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    public Job customJob(String someParam) throws Exception {
        return jobBuilderFactory.get("personProcessor")
                .incrementer(new RunIdIncrementer())
                .listener(listener())
                .flow(personProcessStep(someParam))
                .end()
                .build();
    }

    private Step personProcessStep(String someParam) throws Exception {
        return stepBuilderFactory.get("personProcessStep")
                .<PersonInput, PersonOutput>chunk(1)
                .reader(new PersonReader(someParam))
                .faultTolerant()
                .skipPolicy(new DataDuplicateSkipper())
                .processor(new PersonProcessor())
                .writer(new PersonWriter())
                .build();
    }

    private JobExecutionListener listener() {
        return new PersonJobCompletionListener();
    }

    private class PersonInput {

        String firstName;

        public PersonInput(String firstName) {
            this.firstName = firstName;
        }

        public String getFirstName() {
            return firstName;
        }

        public void setFirstName(String firstName) {
            this.firstName = firstName;
        }
    }

    private class PersonOutput {

        String firstName;

        public String getFirstName() {
            return firstName;
        }

        public void setFirstName(String firstName) {
            this.firstName = firstName;
        }
    }

    public class PersonReader implements ItemReader<PersonInput> {

        private List<PersonInput> items;
        private int count = 0;

        public PersonReader(String someParam) throws InterruptedException {
            Thread.sleep(10000L); //to simulate processing
            //manipulate and provide data in the read method
            //just for testing i have given some dummy example
            items = new ArrayList<PersonInput>();
            PersonInput pi = new PersonInput("john");
            items.add(pi);
        }

        @Override
        public PersonInput read() {
            if (count < items.size()) {
                return items.get(count++);
            }
            return null;
        }
    }

    public class DataDuplicateSkipper implements SkipPolicy {

        @Override
        public boolean shouldSkip(Throwable exception, int skipCount) throws SkipLimitExceededException {
            // Skip only duplicate-key failures; let everything else propagate.
            return exception instanceof DataIntegrityViolationException;
        }
    }

    private class PersonProcessor implements ItemProcessor<PersonInput, PersonOutput> {

        @Override
        public PersonOutput process(PersonInput item) throws Exception {
            return null;
        }
    }

    private class PersonWriter implements org.springframework.batch.item.ItemWriter<PersonOutput> {

        @Override
        public void write(List<? extends PersonOutput> results) throws Exception {
            return;
        }
    }

    private class PersonJobCompletionListener implements JobExecutionListener {

        public PersonJobCompletionListener() {
        }

        @Override
        public void beforeJob(JobExecution jobExecution) {
        }

        @Override
        public void afterJob(JobExecution jobExecution) {
            System.out.println("JOB COMPLETED");
        }
    }
}
Main Function
@SpringBootApplication
@EnableBatchProcessing
@EnableScheduling
@EnableAsync
public class SpringBatchTestApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBatchTestApplication.class, args);
    }
}
I am using annotation-based configuration and Gradle with the batch dependency below.
compile('org.springframework.boot:spring-boot-starter-batch')
Please let me know if more info is needed. I couldn't find any example of this common use case.
Thanks for your time.
Try this: in your configuration you need to create a custom JobLauncher with a SimpleAsyncTaskExecutor using @Bean(name = "myJobLauncher"), and the same name is then used with @Qualifier in your controller.
@Bean(name = "myJobLauncher")
public JobLauncher simpleJobLauncher() throws Exception {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(jobRepository);
    jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
    jobLauncher.afterPropertiesSet();
    return jobLauncher;
}
In your Controller
@Autowired
@Qualifier("myJobLauncher")
private JobLauncher jobLauncher;
If I look at your code I see a couple of mistakes.
First of all, your custom config is not loaded, because if it were, the injection would fail with a duplicate bean instance for the same interface.
There's a lot of magic in Spring Boot, but if you don't tell it to do some component scan, nothing will be loaded as expected.
The second problem that I can see is your BatchConfig class: it does not extend DefaultBatchConfigurer, nor override getJobLauncher(), so even if the Boot magic loads everything, you'll get the default one.
Here is a configuration that will work and is compliant with the documented @EnableBatchProcessing API.
BatchConfig
@Configuration
@EnableBatchProcessing(modular = true)
@Slf4j
public class BatchConfig extends DefaultBatchConfigurer {

    @Override
    @Bean
    public JobLauncher getJobLauncher() {
        try {
            SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
            jobLauncher.setJobRepository(getJobRepository());
            jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
            jobLauncher.afterPropertiesSet();
            return jobLauncher;
        } catch (Exception e) {
            log.error("Can't load SimpleJobLauncher with SimpleAsyncTaskExecutor: {} fallback on default", e);
            return super.getJobLauncher();
        }
    }
}
Main Function
@SpringBootApplication
@EnableScheduling
@EnableAsync
@ComponentScan(basePackageClasses = {BatchConfig.class})
public class SpringBatchTestApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBatchTestApplication.class, args);
    }
}
Although you have your custom jobLauncher, you're running the job using the default jobLauncher provided by Spring. Could you please autowire simpleJobLauncher in your controller and give it a try?
I know that this is an old question but I'm posting this answer anyway for future users.
After reviewing your code I can't tell why you have this problem, but I can suggest you use a @Qualifier annotation plus the ThreadPoolTaskExecutor, like so, and see if it solves your problem.
You may also check this tutorial, Asynchronous Spring Batch Job Processing, for more details. It will help you configure a Spring Batch job asynchronously. This tutorial was written by me.
@Configuration
public class BatchConfig {

    @Autowired
    private JobRepository jobRepository;

    @Bean
    public TaskExecutor threadPoolTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setMaxPoolSize(12);
        executor.setCorePoolSize(8);
        executor.setQueueCapacity(15);
        return executor;
    }

    @Bean
    public JobLauncher asyncJobLauncher() throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(jobRepository);
        jobLauncher.setTaskExecutor(threadPoolTaskExecutor());
        return jobLauncher;
    }
}
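Then, as a usage sketch, inject this launcher by its bean name so the default one is not picked up:
@Autowired
@Qualifier("asyncJobLauncher")
private JobLauncher jobLauncher;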
JobExecution jobExecution = jobLauncher.run(batchComponent.customJob("paramhere"), jobParameters);
The JobLauncher will wait until the job has completed before returning anything; that is why your service is probably taking so long to respond, if that is your problem.
If you want asynchronous capabilities, you might want to look at Spring's @EnableAsync and @Async.
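For example, a minimal sketch of that approach (the class and method names here are made up): with @EnableAsync on a configuration class, Spring runs @Async methods on a separate thread, so the HTTP thread is not blocked while the job runs.
@Service
public class JobStarter {

    @Autowired
    private JobLauncher jobLauncher;

    @Async
    public void startInBackground(Job job, JobParameters params) throws Exception {
        // Called from the controller; control returns to the caller immediately,
        // while the job keeps running on the async executor's thread.
        jobLauncher.run(job, params);
    }
}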
According to the Spring documentation, to return a response to the HTTP request asynchronously it is required to use an asynchronous TaskExecutor such as org.springframework.core.task.SimpleAsyncTaskExecutor.
Any implementation of the Spring TaskExecutor interface can be used to control how jobs are asynchronously executed.
From the Spring Batch documentation:
<bean id="jobLauncher"
      class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor">
        <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
    </property>
</bean>
If you're using Lombok this might help you:
TL;DR: Lombok's @AllArgsConstructor doesn't seem to work well with the @Qualifier annotation.
EDIT: you can enable @Qualifier annotations for @AllArgsConstructor by adding this to the lombok.config file:
lombok.copyableAnnotations += org.springframework.beans.factory.annotation.Qualifier
I know this is an old question, however I had the exact same problem and none of the answers solved it.
I configured the async job launcher like this and added the qualifier to make sure this jobLauncher gets injected:
@Bean(name = "asyncJobLauncher")
public JobLauncher simpleJobLauncher(JobRepository jobRepository) throws Exception {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(jobRepository);
    jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
    jobLauncher.afterPropertiesSet();
    return jobLauncher;
}
And injected it like this:
@Qualifier("asyncJobLauncher")
private final JobLauncher jobLauncher;
I was using Lombok's @AllArgsConstructor; after changing it to autowiring, the correct job launcher got injected and the job is now executed asynchronously:
@Autowired
@Qualifier("asyncJobLauncher")
private JobLauncher jobLauncher;
Also, I didn't have to extend my configuration from DefaultBatchConfigurer.

Spring Batch Processor not running ItemProcessorListener

So I have a problem in Spring Batch 3.0.7.RELEASE and Spring 4.3.2.RELEASE where the listeners are not running in my ItemProcessor class. Regular injection at the @StepScope level is working for @Value("#{jobExecutionContext['" + Constants.SECURITY_TOKEN + "']}") as seen below, but it isn't working for beforeProcess or beforeStep; I have tried both the annotation version and the interface version. I'm almost 100% sure this was working at some point, but I can't figure out why it's stopped.
Any ideas? Does it look like I have configured it wrong?
AppBatchConfiguration.java
@Configuration
@EnableBatchProcessing
@ComponentScan(basePackages = "our.org.base")
public class AppBatchConfiguration {

    private final static SimpleLogger LOGGER = SimpleLogger.getInstance(AppBatchConfiguration.class);
    private final static String OUTPUT_XML_FILE_PATH_PLACEHOLDER = null;
    private final static String INPUT_XML_FILE_PATH_PLACEHOLDER = null;

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Bean(name = "cimAppXmlReader")
    @StepScope
    public <T> ItemStreamReader<T> appXmlReader(@Value("#{jobParameters[inputXmlFilePath]}") String inputXmlFilePath) {
        LOGGER.info("Job Parameter => App XML File Path :" + inputXmlFilePath);
        StaxEventItemReader<T> reader = new StaxEventItemReader<T>();
        reader.setResource(new FileSystemResource(inputXmlFilePath));
        reader.setUnmarshaller(mecaUnMarshaller());
        reader.setFragmentRootElementNames(getAppRootElementNames());
        reader.setSaveState(false);
        // Make the StaxEventItemReader thread-safe
        SynchronizedItemStreamReader<T> synchronizedItemStreamReader = new SynchronizedItemStreamReader<T>();
        synchronizedItemStreamReader.setDelegate(reader);
        return synchronizedItemStreamReader;
    }
    @Bean
    @StepScope
    public ItemStreamReader<JAXBElement<AppIBTransactionHeaderType>> appXmlTransactionHeaderReader(@Value("#{jobParameters[inputXmlFilePath]}") String inputXmlFilePath) {
        LOGGER.info("Job Parameter => App XML File Path for Transaction Header :" + inputXmlFilePath);
        StaxEventItemReader<JAXBElement<AppIBTransactionHeaderType>> reader = new StaxEventItemReader<>();
        reader.setResource(new FileSystemResource(inputXmlFilePath));
        reader.setUnmarshaller(mecaUnMarshaller());
        String[] fragmentRootElementNames = new String[] {"AppIBTransactionHeader"};
        reader.setFragmentRootElementNames(fragmentRootElementNames);
        reader.setSaveState(false);
        return reader;
    }

    @Bean
    public Unmarshaller mecaUnMarshaller() {
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setPackagesToScan(ObjectFactory.class.getPackage().getName());
        return marshaller;
    }

    @Bean
    public Marshaller uberMarshaller() {
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setClassesToBeBound(ServiceRequestType.class);
        marshaller.setSupportJaxbElementClass(true);
        return marshaller;
    }
    @Bean(destroyMethod = "") // To stop multiple close calls, see: http://stackoverflow.com/a/23089536
    @StepScope
    public ResourceAwareItemWriterItemStream<JAXBElement<ServiceRequestType>> writer(@Value("#{jobParameters[outputXmlFilePath]}") String outputXmlFilePath) {
        SyncStaxEventItemWriter<JAXBElement<ServiceRequestType>> writer = new SyncStaxEventItemWriter<JAXBElement<ServiceRequestType>>();
        writer.setResource(new FileSystemResource(outputXmlFilePath));
        writer.setMarshaller(uberMarshaller());
        writer.setSaveState(false);
        HashMap<String, String> rootElementAttribs = new HashMap<String, String>();
        rootElementAttribs.put("xmlns:ns1", "http://some.org/corporate/message/2010/1");
        writer.setRootElementAttributes(rootElementAttribs);
        writer.setRootTagName("ns1:SetOfServiceRequests");
        return writer;
    }

    @Bean
    @StepScope
    public <T> ItemProcessor<T, JAXBElement<ServiceRequestType>> appNotificationProcessor() {
        return new AppBatchNotificationItemProcessor<T>();
    }

    @Bean
    public ItemProcessor<JAXBElement<AppIBTransactionHeaderType>, Boolean> appBatchCreationProcessor() {
        return new AppBatchCreationItemProcessor();
    }

    public String[] getAppRootElementNames() {
        //get list of App Transaction Element Names
        return AppProcessorEnum.getValues();
    }
    @Bean
    public Step AppStep() {
        // INPUT_XML_FILE_PATH_PLACEHOLDER and OUTPUT_XML_FILE_PATH_PLACEHOLDER will be overridden
        // by injected jobParameters using late binding (StepScope)
        return stepBuilderFactory.get("AppStep")
                .<Object, JAXBElement<ServiceRequestType>>chunk(10)
                .reader(appXmlReader(INPUT_XML_FILE_PATH_PLACEHOLDER))
                .processor(appNotificationProcessor())
                .writer(writer(OUTPUT_XML_FILE_PATH_PLACEHOLDER))
                .taskExecutor(concurrentTaskExecutor())
                .throttleLimit(1)
                .build();
    }

    @Bean
    public Step BatchCreationStep() {
        return stepBuilderFactory.get("BatchCreationStep")
                .<JAXBElement<AppIBTransactionHeaderType>, Boolean>chunk(1)
                .reader(appXmlTransactionHeaderReader(INPUT_XML_FILE_PATH_PLACEHOLDER))
                .processor(appBatchCreationProcessor())
                .taskExecutor(concurrentTaskExecutor())
                .throttleLimit(1)
                .build();
    }

    @Bean
    public Job AppJob() {
        return jobBuilderFactory.get("AppJob")
                .incrementer(new RunIdIncrementer())
                .listener(AppJobCompletionNotificationListener())
                .flow(AppStep())
                .next(BatchCreationStep())
                .end()
                .build();
    }

    @Bean
    public JobCompletionNotificationListener AppJobCompletionNotificationListener() {
        return new JobCompletionNotificationListener();
    }

    @Bean
    public TaskExecutor concurrentTaskExecutor() {
        SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
        taskExecutor.setConcurrencyLimit(1);
        return taskExecutor;
    }
}
AppBatchNotificationItemProcessor.java
@StepScope
public class AppBatchNotificationItemProcessor<E> extends AppAbstractItemProcessor<E, JAXBElement<ServiceRequestType>> implements ItemProcessor<E, JAXBElement<ServiceRequestType>>, StepExecutionListener {

    // This is populated correctly
    @Value("#{jobExecutionContext['" + Constants.SECURITY_TOKEN + "']}")
    private SecurityToken securityToken;

    @Autowired
    private AppProcessorService processor;

    @Override
    public JAXBElement<ServiceRequestType> process(E item) throws BPException {
        // Do Stuff
        return srRequest;
    }

    @BeforeProcess
    public void beforeProcess(E item) {
        System.out.println("Doesn't execute");
    }

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // Doesn't execute
        System.out.println("Doesn't execute");
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        // Doesn't execute
        System.out.println("Doesn't execute");
    }
}
This is due to the fact that you are returning interfaces instead of implementations in your @Bean methods. IMHO, you should return the most specific type possible when using Java configuration in Spring. Here's why:
When configuring via XML, you provide the class in the XML configuration. This exposes the implementation to Spring so that any interfaces the class implements can be discovered and handled appropriately. When using Java configuration, the return type of the @Bean method serves as the replacement for that information. And therein lies the issue. If your return type is an interface, Spring only knows about that specific interface and not all the interfaces an implementation may implement. By returning the concrete type where you can, you give Spring insight into what you're actually returning so it can better handle the various registration and wiring use cases for you.
For your specific example, since you're returning an ItemProcessor and it's step scoped (and therefore proxied), all Spring knows about are the methods/behaviors expected of the ItemProcessor interface. If you return the implementation (AppBatchNotificationItemProcessor), other behaviors can be autoconfigured, for example:
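A sketch of what that change would look like in your config: the same bean, but the method now advertises the concrete class, so Spring can see that it also implements StepExecutionListener.
@Bean
@StepScope
public <T> AppBatchNotificationItemProcessor<T> appNotificationProcessor() {
    return new AppBatchNotificationItemProcessor<T>();
}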
As far as I remember, you have to register a reader, writer, or processor directly as a listener on the step if you use StepScope.
StepScope prevents the framework from being able to figure out what kind of interfaces, resp. @annotations (e.g. @BeforeProcess), the proxy actually implements/defines, and therefore it is not able to register it as a listener.
So, I assume that if you add
return stepBuilderFactory.get("AppStep")
        .<Object, JAXBElement<ServiceRequestType>>chunk(10)
        .reader(appXmlReader(INPUT_XML_FILE_PATH_PLACEHOLDER))
        .processor(appNotificationProcessor())
        .writer(writer(OUTPUT_XML_FILE_PATH_PLACEHOLDER))
        .listener(appNotificationProcessor())
        .taskExecutor(concurrentTaskExecutor())
        .throttleLimit(1)
        .build();
it will work.

How to optimize my performances using FlatFileItemReader and Asynchronous Processors

I have a simple CSV file with ~400,000 lines (one column only).
It takes me a lot of time to read the records and process them.
The processor validates records against Couchbase; the writer writes into a remote topic.
It takes around 30 minutes. That's insane.
I read that FlatFileItemReader is not thread-safe, so my chunk value is 1.
I read that asynchronous processing could help, but I can't see any improvement.
That's my code:
@Configuration
@EnableBatchProcessing
public class NotificationFileProcessUploadedFileJob {

    @Value("${expected.snid.header}")
    public String snidHeader;

    @Value("${num.of.processing.chunks.per.file}")
    public int numOfProcessingChunksPerFile;

    @Autowired
    private InfrastructureConfigurationConfig infrastructureConfigurationConfig;

    private static final String OVERRIDDEN_BY_EXPRESSION = null;

    @Inject
    private JobBuilderFactory jobs;

    @Inject
    private StepBuilderFactory stepBuilderFactory;

    @Inject
    ExecutionContextPromotionListener executionContextPromotionListener;

    @Bean
    public Job processUploadedFileJob() throws Exception {
        return this.jobs.get("processUploadedFileJob").start((processSnidUploadedFileStep())).build();
    }

    @Bean
    public Step processSnidUploadedFileStep() {
        return stepBuilderFactory.get("processSnidFileStep")
                .<PushItemDTO, PushItemDTO>chunk(numOfProcessingChunksPerFile)
                .reader(snidFileReader(OVERRIDDEN_BY_EXPRESSION))
                .processor(asyncItemProcessor())
                .writer(asyncItemWriter())
                // .throttleLimit(20)
                // .taskJobExecutor(infrastructureConfigurationConfig.taskJobExecutor())
                // .faultTolerant()
                // .skipLimit(10) //default is set to 0
                // .skip(MySQLIntegrityConstraintViolationException.class)
                .build();
    }

    @Inject
    ItemWriter writer;

    @Bean
    public AsyncItemWriter asyncItemWriter() {
        AsyncItemWriter asyncItemWriter = new AsyncItemWriter();
        asyncItemWriter.setDelegate(writer);
        return asyncItemWriter;
    }
    @Bean
    @Scope(value = "step", proxyMode = ScopedProxyMode.INTERFACES)
    public ItemStreamReader<PushItemDTO> snidFileReader(@Value("#{jobParameters[filePath]}") String filePath) {
        FlatFileItemReader<PushItemDTO> itemReader = new FlatFileItemReader<PushItemDTO>();
        itemReader.setLineMapper(snidLineMapper());
        itemReader.setLinesToSkip(1);
        itemReader.setResource(new FileSystemResource(filePath));
        return itemReader;
    }

    @Bean
    public AsyncItemProcessor asyncItemProcessor() {
        AsyncItemProcessor<PushItemDTO, PushItemDTO> asyncItemProcessor = new AsyncItemProcessor();
        asyncItemProcessor.setDelegate(processor(OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION,
                OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION));
        asyncItemProcessor.setTaskExecutor(infrastructureConfigurationConfig.taskProcessingExecutor());
        return asyncItemProcessor;
    }

    @Scope(value = "step", proxyMode = ScopedProxyMode.INTERFACES)
    @Bean
    public ItemProcessor<PushItemDTO, PushItemDTO> processor(@Value("#{jobParameters[pushMessage]}") String pushMessage,
                                                             @Value("#{jobParameters[jobId]}") String jobId,
                                                             @Value("#{jobParameters[taskId]}") String taskId,
                                                             @Value("#{jobParameters[refId]}") String refId,
                                                             @Value("#{jobParameters[url]}") String url,
                                                             @Value("#{jobParameters[targetType]}") String targetType,
                                                             @Value("#{jobParameters[gameType]}") String gameType) {
        return new PushItemProcessor(pushMessage, jobId, taskId, refId, url, targetType, gameType);
    }

    @Bean
    public LineMapper<PushItemDTO> snidLineMapper() {
        DefaultLineMapper<PushItemDTO> lineMapper = new DefaultLineMapper<PushItemDTO>();
        DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
        lineTokenizer.setDelimiter(",");
        lineTokenizer.setStrict(true);
        String[] splittedHeader = snidHeader.split(",");
        lineTokenizer.setNames(splittedHeader);
        BeanWrapperFieldSetMapper<PushItemDTO> fieldSetMapper = new BeanWrapperFieldSetMapper<PushItemDTO>();
        fieldSetMapper.setTargetType(PushItemDTO.class);
        lineMapper.setLineTokenizer(lineTokenizer);
        lineMapper.setFieldSetMapper(new PushItemFieldSetMapper());
        return lineMapper;
    }
}
@Bean
@Override
public SimpleAsyncTaskExecutor taskProcessingExecutor() {
    SimpleAsyncTaskExecutor simpleAsyncTaskExecutor = new SimpleAsyncTaskExecutor();
    simpleAsyncTaskExecutor.setConcurrencyLimit(300);
    return simpleAsyncTaskExecutor;
}
How do you think I could improve the processing performance and make it faster?
Thank you.
ItemWriter code:
@Bean
public ItemWriter writer() {
    return new KafkaWriter();
}

public class KafkaWriter implements ItemWriter<PushItemDTO> {

    private static final Logger logger = LoggerFactory.getLogger(KafkaWriter.class);

    @Autowired
    KafkaProducer kafkaProducer;

    @Override
    public void write(List<? extends PushItemDTO> items) throws Exception {
        for (PushItemDTO item : items) {
            try {
                logger.debug("Writing to kafka=" + item);
                sendMessageToKafka(item);
            } catch (Exception e) {
                logger.error("Error writing item=" + item.toString(), e);
            }
        }
    }
}
Increasing your commit count is where I'd begin. Keep in mind what the commit count means. Since you have it set at 1, you are doing the following for each item:
Start a transaction
Read an item
Process the item
Write the item
Update the job repository
Commit the transaction
Your configuration doesn't show what the delegate ItemWriter is so I can't tell, but at a minimum you are executing multiple SQL statements per item to update the job repository.
You are correct that the FlatFileItemReader is not thread-safe. However, you aren't using multiple threads to read, only to process, so there is no reason to set the commit count to 1 from what I can see.
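For example, raising the chunk size in your step (the value 1000 below is only an illustration to tune against your data) means one transaction and one job-repository update per 1000 items instead of per item:
@Bean
public Step processSnidUploadedFileStep() {
    return stepBuilderFactory.get("processSnidFileStep")
            // one transaction + one job repository update per 1000 items,
            // instead of per single item as with chunk(1)
            .<PushItemDTO, PushItemDTO>chunk(1000)
            .reader(snidFileReader(OVERRIDDEN_BY_EXPRESSION))
            .processor(asyncItemProcessor())
            .writer(asyncItemWriter())
            .build();
}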
