I have a Spring Batch application in which I need to read and process two files daily:
MarketAFile_yyyy_mm_dd.csv
MarketBFile_yyyy_mm_dd.csv
I have configured a Job which first needs to fetch these files from the file share dynamically, based on the date:
@Bean
public Job job() {
return jobBuilderFactory.get("job")
.incrementer(new RunIdIncrementer())
.listener(listener())
.start(getFiles())
.next(step1())
.build();
}
@Bean
public Step getFiles() {
return stepBuilderFactory.get("getFiles")
.tasklet(fileMovingTasklet)
.build();
}
My FileMovingTasklet's execute() method needs access to the job parameters: the file name (derived from the enum) and the corresponding previousWorkingDate for that market. I am iterating over the list of markets, as you can see below, and want to set the file name and corresponding date dynamically to build the final file name, for example:
MarketAFile_2018_02_15.csv
MarketBFile_2018_02_15.csv
How can I pass this final file name so that I have it available in execute() to perform a copy from the file share to my local path? (Sketches of one approach follow the tasklet and the main class below.)
@Component
public class FileMovingTasklet implements Tasklet, InitializingBean {
@Value("${file.suffix}")
private String suffix;
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
try {
//get files to look for, for all markets
//copy from file share to local
} catch (IOException e) {
// log and handle the copy failure here
}
return RepeatStatus.FINISHED;
}
private void copyFiles(...) {
}
}
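One way to make the final name available in execute() is to pass it in as a job parameter and read it from the chunk context. A minimal sketch, assuming the name is stored under a hypothetical "fileName" parameter key:
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
// "fileName" is a hypothetical job parameter key, set by the launcher shown further below
String fileName = chunkContext.getStepContext()
.getStepExecution()
.getJobParameters()
.getString("fileName");
// copy fileName from the file share to the local path here
return RepeatStatus.FINISHED;
}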
Here is my main entry point:
@SpringBootApplication
@EnableBatchProcessing
public class App implements CommandLineRunner {
@Autowired
private JobLauncher jobLauncher;
@Autowired
private Job job;
@Autowired
private PropertyHolder propertyHolder;
public static void main(String[] args) {
SpringApplication.run(App.class, args);
}
@Override
public void run(String... strings) throws Exception {
for (Market market : Market.values()) {
List<MonthDay> listOfHolidays = propertyHolder.getHolidayMap(market);
if (todayIsHoliday(listOfHolidays)) {
String previousWorkingDay = getPreviousWorkingDay(listOfHolidays); //2018_02_15
}
}
// JobParameters jobParameters = buildJobParameters();
// jobLauncher.run(job, jobParameters);
}
private JobParameters buildJobParameters() {
Map<String, JobParameter> confMap = new HashMap<>();
confMap.put(AS_OF_DATE, new JobParameter(new Date()));
return new JobParameters(confMap);
}
}
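On the launching side, the commented-out buildJobParameters() can be made per-market. A sketch, assuming the previousWorkingDay computed in run() and the getFileName() getter shown in the enum below; the "fileName" key matches the one read in the tasklet sketch above:
for (Market market : Market.values()) {
String previousWorkingDay = getPreviousWorkingDay(propertyHolder.getHolidayMap(market));
JobParameters jobParameters = new JobParametersBuilder()
.addString("fileName", market.getFileName() + "_" + previousWorkingDay + ".csv")
.addLong("runId", System.currentTimeMillis()) // keeps each launch a new job instance
.toJobParameters();
jobLauncher.run(job, jobParameters);
}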
Enum class:
public enum Market {
MARKETA("MarketA", "MarketAFile"),
MARKETB("MarketB", "MarketBFile");
private final String market;
private final String fileName;
Market(String market, String fileName) {
this.market = market;
this.fileName = fileName;
}
// getters (not in the original snippet) used by the launch sketch above
public String getMarket() {
return market;
}
public String getFileName() {
return fileName;
}
}
See the related question: Spring Batch accessing job parameter inside step.
Related
I have created a Spring Batch job with Spring Boot.
I customized the Reader to get JSON data from a REST API and convert it to Java objects, and the Writer pushes the data to a queue.
I am calling my job in a foreach loop to set the parameters and send requests to the REST API with different languages.
For the first iteration my job runs successfully, but for the other iterations it just displays that it has finished.
Batch configuration :
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
public RestWebClient webClient;
@Bean
public ItemReader<Code> reader() {
return new CodeAndLabelRestItemReader(webClient);
}
@Bean
public CodeAndLabelItemProcessor processor() {
return new CodeAndLabelItemProcessor("France","DP","transaction");
}
@Bean
public ItemWriter<CodeAndLabel> calWriter(AmqpTemplate amqpTemplate) {
return new CodeAndLabelItemWriter(amqpTemplate);
}
#Bean(name = "importJob")
public Job importCodesAndLabelsJob(JobCompletionNotificationListener listener, Step stepJms) {
return jobBuilderFactory.get("importJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(stepJms)
.end()
.build();
}
@Bean
public Step stepJms(ItemWriter<CodeAndLabel> writer) {
return stepBuilderFactory.get("stepJms")
.<Code, CodeAndLabel>chunk(10)
.reader(reader())
.processor(processor())
.writer(writer)
.build();
}
}
Reader:
public class CodeAndLabelRestItemReader implements ItemReader<Code>{
private final RestWebClient webClient;
private int nextCodeIndex;
private List<Code> codes;
public CodeAndLabelRestItemReader(RestWebClient webClient) {
this.webClient = webClient;
nextCodeIndex = 0;
}
@BeforeStep
public void beforeStep(final StepExecution stepExecution) {
JobParameters jobParameters = stepExecution.getJobParameters();
this.webClient.setEndPointSuffix(jobParameters.getString("endPointSuffix"));
}
@Override
public Code read() {
if(codesAndLabelsListNotInitialized()) {
codes = webClient.getCodes();
}
Code nextCode = null;
if (nextCodeIndex < codes.size()) {
nextCode = codes.get(nextCodeIndex);
nextCodeIndex++;
}
return nextCode;
}
private boolean codesAndLabelsListNotInitialized() {
return this.codes == null;
}
}
Processor:
public class CodeAndLabelItemProcessor implements ItemProcessor<Code, CodeAndLabel> {
private String populationId;
private String populationDataProvider;
private String transactionId;
public CodeAndLabelItemProcessor(String populationId, String populationDataProvider, String transactionId) {
this.populationId = populationId;
this.populationDataProvider = populationDataProvider;
this.transactionId = transactionId;
}
@Override
public CodeAndLabel process(Code code) throws Exception {
CodeAndLabel codeAndLabel = new CodeAndLabel();
codeAndLabel.setUid(code.getUid());
System.out.println("Converting (" + code + ") into (" + codeAndLabel + ")");
return codeAndLabel;
}
}
Writer:
public class CodeAndLabelItemWriter implements ItemWriter<CodeAndLabel>{
// note: this logger was used but not declared in the original snippet
private static final Logger log = LoggerFactory.getLogger(CodeAndLabelItemWriter.class);
private AmqpTemplate template;
public CodeAndLabelItemWriter(AmqpTemplate template) {
this.template = template;
}
@Override
public void write(List<? extends CodeAndLabel> items) throws Exception {
if (log.isDebugEnabled()) {
log.debug("Writing to RabbitMQ with " + items.size() + " items."); }
for(CodeAndLabel item : items) {
template.convertAndSend(BatchConfiguration.topicExchangeName,"com.batchprocessing.queue",item);
System.out.println("item : "+item);
}
}
}
Listener:
@Component
public class JobCompletionNotificationListener extends JobExecutionListenerSupport {
@Autowired
private JdbcTemplate jdbcTemplate;
@Override
public void afterJob(JobExecution jobExecution) {
if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
System.out.println("JOB FINISHED");
}
}
}
And the class running the job:
@Component
public class Initialization {
// some code here
String[] languages = processLanguage.split(";");
for(String language : languages) {
JobParameters params = new JobParametersBuilder()
.addString("JobID",String.valueOf(System.currentTimeMillis()))
.addString("endPointSuffix",
"/codeAndLabel".concat(language.toUpperCase()))
.toJobParameters();
jobLauncher.run(job, params);
}
}
Output:
For the first iteration:
Converting (WFR.SP.2C) into (WFR.SP.2C)
Converting (WFR.SP.3E) into (WFR.SP.3E)
Converting (WFR.SP.FC) into (WFR.SP.FC)
Converting (WFR.SP.FD) into (WFR.SP.FD)
Converting (WFR.SP.FI) into (WFR.SP.FI)
Converting (WFR.SP.FM) into (WFR.SP.FM)
item : WFR.SP.2C
item : WFR.SP.3E
item : WFR.SP.FC
item : WFR.SP.FD
item : WFR.SP.FI
item : WFR.SP.FM
JOB FINISHED
For the second iteration:
JOB FINISHED
I think that in the second iteration the job is not running the Reader, Processor, and Writer beans, and I don't know why.
Can anyone help with this, please?
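One thing worth checking (a hedged suggestion, not a confirmed diagnosis): the reader is a singleton bean, so its codes list and nextCodeIndex survive across launches; after the first run, read() immediately returns null and every later execution completes without reading anything. Declaring the reader step-scoped gives each execution a fresh instance, for example:
@Bean
@StepScope
public CodeAndLabelRestItemReader reader() {
// a new reader (fresh codes list and index) is created for each step execution
return new CodeAndLabelRestItemReader(webClient);
}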
EDITS BASED ON SUGGESTION:
For brevity, I will remove the older code and long parts and rephrase the issue.
I am trying to build an app (Spring Boot + Spring Batch) that takes the date and config information from the command line. Based on the suggestions, can I use the application properties?
The main aim is to use the same job (and its tasklet) to download different files from different hosts, at different times, etc. The properties file gives the information to use for the download, and the compiled jar should read that info and do its tasks.
Main entry point:
@SpringBootApplication
public class CoreApplication implements ApplicationRunner {
@Autowired
JobLauncher jobLauncher;
@Autowired
Job processJob;
#Value("${rundate}")
private String run_date;
private static final Logger logger = LoggerFactory.getLogger(CoreApplication.class);
public static void main(String[] args) {
SpringApplication.run(CoreApplication.class, args);
}
@Override
public void run(ApplicationArguments args) throws Exception {
JobParameters jobParameters = new JobParametersBuilder()
.addLong("JobID", System.currentTimeMillis())
.addString("RunDate", run_date)
.toJobParameters();
try {
jobLauncher.run(processJob, jobParameters);
} catch (Exception e) {
logger.error("Exception while running a batch job {}", e.getMessage());
}
}
}
I rearranged the code to use the values for server, user, etc. from the application.properties file. Please let me know if this is the wrong way to inject the properties.
application.properties file:
spring.datasource.url=jdbc:postgresql://dbhost:1000/db
spring.datasource.username=username
spring.datasource.password=password
spring.datasource.platform=postgresql
spring.batch.job.enabled=false
local.directory="/my/local/path/"
file.name="file_name_20200601.csv"
remote.directory="/remote/ftp/location"
remote.host="remotehost"
remote.port=22
remote.user="remoteuser"
private.key.location="/key/file/location"
My Batch Configuration:
@Configuration
@EnableBatchProcessing
@EnableIntegration
@EnableAutoConfiguration
public class BatchConfiguration {
private Logger logger = LoggerFactory.getLogger(BatchConfiguration.class);
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Bean
public Job ftpJob() {
return jobBuilderFactory.get("FTP Job")
.incrementer(new RunIdIncrementer())
.start(getFilesFromFTPServer())
.build();
}
@Bean
public Step getFilesFromFTPServer() {
return stepBuilderFactory.get("Get file from server")
.tasklet(new RemoteFileInboundTasklet())
.build();
}
}
My Tasklet:
public class RemoteFileInboundTasklet implements Tasklet {
private Logger logger = LoggerFactory.getLogger(RemoteFileInboundTasklet.class);
#Value("${file.name}")
private String fileNamePattern;
private String clientName;
private boolean deleteLocalFiles = true;
private boolean retryIfNotFound = false;
#Value("${local.directory}")
private String local_directory_value;
private File localDirectory;
private int downloadFileAttempts = 12;
private long retryIntervalMilliseconds = 300000;
#Value("${remote.directory}")
private String remoteDirectory;
#Value("${remote.host}")
private String remoteHost;
#Value("${remote.user}")
private String remoteUser;
#Value("${remote.port}")
private int remotePort;
#Value("${private.key.location}")
private String private_key_file;
public SessionFactory<ChannelSftp.LsEntry> clientSessionFactory() {
DefaultSftpSessionFactory ftpSessionFactory = new DefaultSftpSessionFactory();
ftpSessionFactory.setHost(remoteHost);
ftpSessionFactory.setPort(remotePort);
ftpSessionFactory.setUser(remoteUser);
ftpSessionFactory.setPrivateKey(new FileSystemResource(private_key_file));
ftpSessionFactory.setAllowUnknownKeys(true);
return ftpSessionFactory;
}
private SessionFactory sessionFactory = clientSessionFactory();
public SftpInboundFileSynchronizer sftpInboundFileSynchronizer() {
SftpInboundFileSynchronizer sftpInboundFileSynchronizer = new SftpInboundFileSynchronizer(sessionFactory);
sftpInboundFileSynchronizer.setDeleteRemoteFiles(false);
sftpInboundFileSynchronizer.setRemoteDirectory(remoteDirectory);
return sftpInboundFileSynchronizer;
}
private SftpInboundFileSynchronizer ftpInboundFileSynchronizer = sftpInboundFileSynchronizer();
private SftpInboundFileSynchronizingMessageSource sftpInboundFileSynchronizingMessageSource;
public boolean isDeleteLocalFiles() {
return deleteLocalFiles;
}
public void setDeleteLocalFiles(boolean deleteLocalFiles) {
this.deleteLocalFiles = deleteLocalFiles;
}
public SftpInboundFileSynchronizer getFtpInboundFileSynchronizer() {
return ftpInboundFileSynchronizer;
}
public void setFtpInboundFileSynchronizer(SftpInboundFileSynchronizer ftpInboundFileSynchronizer) {
this.ftpInboundFileSynchronizer = ftpInboundFileSynchronizer;
}
public SessionFactory getSessionFactory() {
return sessionFactory;
}
public void setSessionFactory(SessionFactory sessionFactory) {
this.sessionFactory = sessionFactory;
}
public SftpInboundFileSynchronizingMessageSource getSftpInboundFileSynchronizingMessageSource() {
return sftpInboundFileSynchronizingMessageSource;
}
public void setSftpInboundFileSynchronizingMessageSource(SftpInboundFileSynchronizingMessageSource sftpInboundFileSynchronizingMessageSource) {
this.sftpInboundFileSynchronizingMessageSource = sftpInboundFileSynchronizingMessageSource;
}
public String getRemoteDirectory() {
return remoteDirectory;
}
public void setRemoteDirectory(String remoteDirectory) {
this.remoteDirectory = remoteDirectory;
}
private SFTPGateway sftpGateway;
@ServiceActivator(inputChannel = "sftpChannel")
public MessageHandler clientMessageHandler() {
SftpOutboundGateway sftpOutboundGateway = new SftpOutboundGateway(clientSessionFactory(), "mget", "payload");
sftpOutboundGateway.setAutoCreateLocalDirectory(true);
sftpOutboundGateway.setLocalDirectory(new File(local_directory_value));
sftpOutboundGateway.setFileExistsMode(FileExistsMode.REPLACE_IF_MODIFIED);
sftpOutboundGateway.setFilter(new AcceptOnceFileListFilter<>());
return sftpOutboundGateway;
}
private void deleteLocalFiles()
{
if (deleteLocalFiles)
{
localDirectory = new File(local_directory_value);
SimplePatternFileListFilter filter = new SimplePatternFileListFilter(fileNamePattern);
List<File> matchingFiles = filter.filterFiles(localDirectory.listFiles());
if (CollectionUtils.isNotEmpty(matchingFiles))
{
for (File file : matchingFiles)
{
FileUtils.deleteQuietly(file);
}
}
}
}
@Override
public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
deleteLocalFiles();
ftpInboundFileSynchronizer.synchronizeToLocalDirectory(localDirectory);
if (retryIfNotFound) {
SimplePatternFileListFilter filter = new SimplePatternFileListFilter(fileNamePattern);
int attemptCount = 1;
while (filter.filterFiles(localDirectory.listFiles()).size() == 0 && attemptCount <= downloadFileAttempts) {
logger.info("File(s) matching " + fileNamePattern + " not found on remote site. Attempt " + attemptCount + " out of " + downloadFileAttempts);
Thread.sleep(retryIntervalMilliseconds);
ftpInboundFileSynchronizer.synchronizeToLocalDirectory(localDirectory);
attemptCount++;
}
if (attemptCount >= downloadFileAttempts && filter.filterFiles(localDirectory.listFiles()).size() == 0) {
throw new FileNotFoundException("Could not find remote file(s) matching " + fileNamePattern + " after " + downloadFileAttempts + " attempts.");
}
}
return RepeatStatus.FINISHED;
}
public String getFileNamePattern() {
return fileNamePattern;
}
public void setFileNamePattern(String fileNamePattern) {
this.fileNamePattern = fileNamePattern;
}
public String getClientName() {
return clientName;
}
public void setClientName(String clientName) {
this.clientName = clientName;
}
public boolean isRetryIfNotFound() {
return retryIfNotFound;
}
public void setRetryIfNotFound(boolean retryIfNotFound) {
this.retryIfNotFound = retryIfNotFound;
}
public File getLocalDirectory() {
return localDirectory;
}
public void setLocalDirectory(File localDirectory) {
this.localDirectory = localDirectory;
}
public int getDownloadFileAttempts() {
return downloadFileAttempts;
}
public void setDownloadFileAttempts(int downloadFileAttempts) {
this.downloadFileAttempts = downloadFileAttempts;
}
public long getRetryIntervalMilliseconds() {
return retryIntervalMilliseconds;
}
public void setRetryIntervalMilliseconds(long retryIntervalMilliseconds) {
this.retryIntervalMilliseconds = retryIntervalMilliseconds;
}
}
My understanding (please correct me if wrong) is that the application.properties values can be injected into the tasklet as above.
Then I try to build the package.
mvn clean package
I get the following error:
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springframework.batch.core.Step]: Factory method 'getFilesFromFTPServer' threw exception; nested exception is java.lang.IllegalArgumentException: Path must not be null
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:185) ~[spring-beans-5.2.6.RELEASE.jar:5.2.6.RELEASE]
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:651) ~[spring-beans-5.2.6.RELEASE.jar:5.2.6.RELEASE]
... 122 common frames omitted
Caused by: java.lang.IllegalArgumentException: Path must not be null
at org.springframework.util.Assert.notNull(Assert.java:198) ~[spring-core-5.2.6.RELEASE.jar:5.2.6.RELEASE]
at org.springframework.core.io.FileSystemResource.<init>(FileSystemResource.java:80) ~[spring-core-5.2.6.RELEASE.jar:5.2.6.RELEASE]
at com.my.batch.core.tasklet.RemoteFileInboundTasklet.clientSessionFactory(RemoteFileInboundTasklet.java:78) ~[classes/:na]
at com.my.batch.core.tasklet.RemoteFileInboundTasklet.<init>(RemoteFileInboundTasklet.java:83) ~[classes/:na]
at com.my.batch.core.BatchConfiguration.getFilesFromFTPServer(BatchConfiguration.java:71) ~[classes/:na]
at com.my.batch.core.BatchConfiguration$$EnhancerBySpringCGLIB$$17d8a6d9.CGLIB$getFilesFromFTPServer$1(<generated>) ~[classes/:na]
The line in the code is:
ftpSessionFactory.setPrivateKey(new FileSystemResource(private_key_file));
called via BatchConfiguration.java -> getFilesFromFTPServer.
Does this mean my values from application.properties are not being passed?
What changes do I need to make?
And why is the value of the variable being checked while compiling or building the jar?
NEW EDITS:
I tried declaring my tasklet as a bean in the configuration and building the package again. However, it gives the same error.
My application.properties file after change:
spring.datasource.url=jdbc:postgresql://dbhost:1000/db
spring.datasource.username=username
spring.datasource.password=password
spring.datasource.platform=postgresql
spring.batch.job.enabled=false
local.directory=/my/local/path/
file.name=file_name_20200601.csv
remote.directory=/remote/ftp/location
remote.host=remotehost
remote.port=22
remote.user=remoteuser
private.key.location=/key/file/location
No change in tasklet.
Changed Configuration:
@Configuration
@EnableBatchProcessing
@EnableIntegration
@EnableAutoConfiguration
public class BatchConfiguration {
private Logger logger = LoggerFactory.getLogger(BatchConfiguration.class);
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Bean
public RemoteFileInboundTasklet remoteFileInboundTasklet() {
return new RemoteFileInboundTasklet();
}
@Bean
public Job ftpJob() {
return jobBuilderFactory.get("FTP Job")
.incrementer(new RunIdIncrementer())
.start(getFilesFromFTPServer())
.build();
}
@Bean
public Step getFilesFromFTPServer() {
return stepBuilderFactory.get("Get file from server")
.tasklet(remoteFileInboundTasklet())
.build();
}
}
When I try to build the package (mvn clean package), I still get the same error:
Path must not be null.
It is not able to read the properties. Any idea what is wrong?
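A plausible cause (my assumption; it is not confirmed in this thread): the tasklet initializes some fields eagerly, e.g. private SessionFactory sessionFactory = clientSessionFactory();, and field initializers run during construction, before Spring injects the @Value fields, so private_key_file is still null at that point even when the tasklet is a bean. Deferring that work until after injection avoids touching the null path, for example:
// hedged sketch: build the factories only after property injection has happened
@PostConstruct
public void init() {
this.sessionFactory = clientSessionFactory();
this.ftpInboundFileSynchronizer = sftpInboundFileSynchronizer();
}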
EDITS BASED ON DIFFERENT APPROACH:
I dug further into how to use configuration and found the following approach using the @ConfigurationProperties annotation (How to access a value defined in the application.properties file in Spring Boot).
I created a new FTP config class:
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;
@ConfigurationProperties(prefix = "ftp")
@Configuration("coreFtpProperties")
public class CoreFtp {
private String host;
private String port;
private String user;
private String passwordKey;
private String localDirectory;
private String remoteDirectory;
private String fileName;
public String getHost() {
return host;
}
public void setHost(String host) {
this.host = host;
}
public String getPort() {
return port;
}
public void setPort(String port) {
this.port = port;
}
public String getUser() {
return user;
}
public void setUser(String user) {
this.user = user;
}
public String getPasswordKey() {
return passwordKey;
}
public void setPasswordKey(String passwordKey) {
this.passwordKey = passwordKey;
}
public String getLocalDirectory() {
return localDirectory;
}
public void setLocalDirectory(String localDirectory) {
this.localDirectory = localDirectory;
}
public String getRemoteDirectory() {
return remoteDirectory;
}
public void setRemoteDirectory(String remoteDirectory) {
this.remoteDirectory = remoteDirectory;
}
public String getFileName() {
return fileName;
}
public void setFileName(String fileName) {
this.fileName = fileName;
}
}
Minor change to application.properties file:
spring.datasource.url=jdbc:postgresql://dbhost:1000/db
spring.datasource.username=username
spring.datasource.password=password
spring.datasource.platform=postgresql
spring.batch.job.enabled=false
ftp.local_directory=/my/local/path/
ftp.file_name=file_name_20200601.csv
ftp.remote_directory=/remote/ftp/location
ftp.host=remotehost
ftp.port=22
ftp.user=remoteuser
ftp.password_key=/key/file/location
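Worth noting (a general Spring Boot behavior, not something stated in this thread): @ConfigurationProperties uses relaxed binding, so ftp.local_directory, ftp.localDirectory, and FTP_LOCALDIRECTORY all bind to the localDirectory field of CoreFtp; plain @Value injection does not get this relaxed matching.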
In my batch configuration I made these changes:
@Configuration
@EnableBatchProcessing
@EnableIntegration
public class BatchConfiguration {
private Logger logger = LoggerFactory.getLogger(BatchConfiguration.class);
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
private CoreFtp coreFtpProperties;
@Bean
public RemoteFileInboundTasklet remoteFileInboundTasklet() {
RemoteFileInboundTasklet ftpTasklet = new RemoteFileInboundTasklet();
ftpTasklet.setRetryIfNotFound(true);
ftpTasklet.setDownloadFileAttempts(3);
ftpTasklet.setRetryIntervalMilliseconds(10000);
ftpTasklet.setFileNamePattern(coreFtpProperties.getFileName());
ftpTasklet.setRemoteDirectory(coreFtpProperties.getRemoteDirectory());
ftpTasklet.setLocalDirectory(new File(coreFtpProperties.getLocalDirectory()));
ftpTasklet.setSessionFactory(clientSessionFactory());
ftpTasklet.setFtpInboundFileSynchronizer(sftpInboundFileSynchronizer());
ftpTasklet.setSftpInboundFileSynchronizingMessageSource(new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer()));
return ftpTasklet;
}
@Bean
public SftpInboundFileSynchronizer sftpInboundFileSynchronizer() {
SftpInboundFileSynchronizer sftpInboundFileSynchronizer = new SftpInboundFileSynchronizer(clientSessionFactory());
sftpInboundFileSynchronizer.setDeleteRemoteFiles(false);
sftpInboundFileSynchronizer.setRemoteDirectory(coreFtpProperties.getRemoteDirectory());
return sftpInboundFileSynchronizer;
}
#Bean(name = "clientSessionFactory")
public SessionFactory<LsEntry> clientSessionFactory() {
DefaultSftpSessionFactory ftpSessionFactory = new DefaultSftpSessionFactory();
ftpSessionFactory.setHost(coreFtpProperties.getHost());
ftpSessionFactory.setPort(Integer.parseInt(coreFtpProperties.getPort()));
ftpSessionFactory.setUser(coreFtpProperties.getUser());
ftpSessionFactory.setPrivateKey(new FileSystemResource(coreFtpProperties.getPasswordKey()));
ftpSessionFactory.setPassword("");
ftpSessionFactory.setAllowUnknownKeys(true);
return ftpSessionFactory;
}
@Bean
@ServiceActivator(inputChannel = "sftpChannel")
public MessageHandler clientMessageHandler() {
SftpOutboundGateway sftpOutboundGateway = new SftpOutboundGateway(clientSessionFactory(), "mget", "payload");
sftpOutboundGateway.setAutoCreateLocalDirectory(true);
sftpOutboundGateway.setLocalDirectory(new File(coreFtpProperties.getLocalDirectory()));
sftpOutboundGateway.setFileExistsMode(FileExistsMode.REPLACE_IF_MODIFIED);
sftpOutboundGateway.setFilter(new AcceptOnceFileListFilter<>());
return sftpOutboundGateway;
}
@Bean
public Job ftpJob() {
return jobBuilderFactory.get("FTP Job")
.incrementer(new RunIdIncrementer())
.start(getFilesFromFTPServer())
.build();
}
@Bean
public Step getFilesFromFTPServer() {
return stepBuilderFactory.get("Get file from server")
.tasklet(remoteFileInboundTasklet())
.build();
}
}
Accordingly, my tasklet changed to:
public class RemoteFileInboundTasklet implements Tasklet {
private Logger logger = LoggerFactory.getLogger(RemoteFileInboundTasklet.class);
private String fileNamePattern;
private String clientName;
private boolean deleteLocalFiles = true;
private boolean retryIfNotFound = false;
private File localDirectory;
private int downloadFileAttempts = 12;
private long retryIntervalMilliseconds = 300000;
private String remoteDirectory;
private SessionFactory sessionFactory;
private SftpInboundFileSynchronizer ftpInboundFileSynchronizer;
private SftpInboundFileSynchronizingMessageSource sftpInboundFileSynchronizingMessageSource;
public boolean isDeleteLocalFiles() {
return deleteLocalFiles;
}
public void setDeleteLocalFiles(boolean deleteLocalFiles) {
this.deleteLocalFiles = deleteLocalFiles;
}
public SftpInboundFileSynchronizer getFtpInboundFileSynchronizer() {
return ftpInboundFileSynchronizer;
}
public void setFtpInboundFileSynchronizer(SftpInboundFileSynchronizer ftpInboundFileSynchronizer) {
this.ftpInboundFileSynchronizer = ftpInboundFileSynchronizer;
}
public SessionFactory getSessionFactory() {
return sessionFactory;
}
public void setSessionFactory(SessionFactory sessionFactory) {
this.sessionFactory = sessionFactory;
}
public SftpInboundFileSynchronizingMessageSource getSftpInboundFileSynchronizingMessageSource() {
return sftpInboundFileSynchronizingMessageSource;
}
public void setSftpInboundFileSynchronizingMessageSource(SftpInboundFileSynchronizingMessageSource sftpInboundFileSynchronizingMessageSource) {
this.sftpInboundFileSynchronizingMessageSource = sftpInboundFileSynchronizingMessageSource;
}
public String getRemoteDirectory() {
return remoteDirectory;
}
public void setRemoteDirectory(String remoteDirectory) {
this.remoteDirectory = remoteDirectory;
}
private SFTPGateway sftpGateway;
private void deleteLocalFiles()
{
if (deleteLocalFiles)
{
SimplePatternFileListFilter filter = new SimplePatternFileListFilter(fileNamePattern);
List<File> matchingFiles = filter.filterFiles(localDirectory.listFiles());
if (CollectionUtils.isNotEmpty(matchingFiles))
{
for (File file : matchingFiles)
{
FileUtils.deleteQuietly(file);
}
}
}
}
@Override
public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
deleteLocalFiles();
ftpInboundFileSynchronizer.synchronizeToLocalDirectory(localDirectory);
if (retryIfNotFound) {
SimplePatternFileListFilter filter = new SimplePatternFileListFilter(fileNamePattern);
int attemptCount = 1;
while (filter.filterFiles(localDirectory.listFiles()).size() == 0 && attemptCount <= downloadFileAttempts) {
logger.info("File(s) matching " + fileNamePattern + " not found on remote site. Attempt " + attemptCount + " out of " + downloadFileAttempts);
Thread.sleep(retryIntervalMilliseconds);
ftpInboundFileSynchronizer.synchronizeToLocalDirectory(localDirectory);
attemptCount++;
}
if (attemptCount >= downloadFileAttempts && filter.filterFiles(localDirectory.listFiles()).size() == 0) {
throw new FileNotFoundException("Could not find remote file(s) matching " + fileNamePattern + " after " + downloadFileAttempts + " attempts.");
}
}
return RepeatStatus.FINISHED;
}
}
Based on the above changes, I am able to compile the code, create the necessary jar, and run the code using the jar.
You are declaring a bean jobExecutionListener() in which you create new FileSystemResource(config_file_path);. The config_file_path is injected from the job parameters with @Value("#{jobParameters['ConfigFilePath']}"), which are not available at configuration time but only when a job/step is run. This is called late binding.
So in your case, when Spring tries to create the jobExecutionListener() bean, it tries to inject config_file_path, but it is null at that time (at this point Spring is only creating beans to configure the application context) and the job has not been run yet, hence the beforeJob method has not been executed yet. This is the reason you have a NullPointerException. Adding @JobScope on the jobExecutionListener() bean should fix the issue, but I do not recommend that. The reason is that you are trying to configure some properties in the wrong way and in the wrong place, so I would fix that design instead of working around the issue by adding an annotation.
Job parameters are used for business parameters, not technical details. In your case, runDate is a good choice for a job parameter, but ConfigFilePath is not. Moreover, since you use Spring, why do you inject the file path and then do properties = PropertiesLoaderUtils.loadProperties(resource); and Integer.parseInt(properties.getProperty("remote.port"));? Spring will do that for you if you tell it to inject properties where needed.
I would remove this config_file_path job parameter as well as the job listener, and inject the properties in the remoteFileInboundTasklet directly, that is, as close as possible to where these properties are needed.
Edit: Add code example
Can you help to understand where can I declare the tasklet as a bean?
In your step getFilesFromFTPServer, you are creating the tasklet manually, so dependency injection is not performed. You need to declare the tasklet as a Spring bean for this to work, something like:
@Bean
public Tasklet myTasklet() {
return new RemoteFileInboundTasklet();
}
@Bean
public Step getFilesFromFTPServer() {
return stepBuilderFactory.get("Get file from server")
.tasklet(myTasklet())
.build();
}
You need to change the getFilesFromFTPServer bean to job scope and read all the job runtime parameters there:
@Bean
@JobScope
public Step getFilesFromFTPServer() {
// ... build the step here, reading the late-bound job parameters
}
I'm working on a Spring Batch job. I have a partitioning step (over a list of objects) and then a slave step with a Reader and a Writer.
I want to execute the processStep in parallel mode, so I want specific Reader and Writer instances for each partition.
At the moment, the created partitions use the same Reader and Writer instances, so those operations are done serially: read and write the first partition, then do the same for the next one when the first has completed.
The Spring Boot configuration class:
@Configuration
@Import({ DataSourceConfiguration.class})
public class BatchConfiguration {
private final static int COMMIT_INTERVAL = 1;
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Autowired
@Qualifier(value="mySqlDataSource")
private DataSource mySqlDataSource;
public static int GRID_SIZE = 3;
public static List<Pojo> myList;
@Bean
public Job myJob() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
return jobBuilderFactory.get("myJob")
.incrementer(new RunIdIncrementer())
.start(partitioningStep())
.build();
}
#Bean(name="partitionner")
public MyPartitionner partitioner() {
return new MyPartitionner();
}
@Bean
public SimpleAsyncTaskExecutor taskExecutor() {
SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
taskExecutor.setConcurrencyLimit(GRID_SIZE);
return taskExecutor;
}
@Bean
public Step partitioningStep() throws NonTransientResourceException, Exception {
return stepBuilderFactory.get("partitioningStep")
.partitioner("processStep", partitioner())
.step(processStep())
.taskExecutor(taskExecutor())
.build();
}
@Bean
public Step processStep() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
return stepBuilderFactory.get("processStep")
.<List<Pojo>, List<Pojo>> chunk(COMMIT_INTERVAL)
.reader(processReader())
.writer(processWriter())
.taskExecutor(taskExecutor())
.build();
}
@Bean
public ProcessReader processReader() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
return new ProcessReader();
}
@Bean
public ProcessWriter processWriter() {
return new ProcessWriter();
}
}
The partitioner class:
public class MyPartitionner implements Partitioner{
@Autowired
private IService service;
@Override
public Map<String, ExecutionContext> partition(int gridSize) {
// list of 300 object partitionned like bellow
...
Map<String, ExecutionContext> partitionData = new HashMap<String, ExecutionContext>();
ExecutionContext executionContext0 = new ExecutionContext();
executionContext0.putString("from", Integer.toString(0));
executionContext0.putString("to", Integer.toString(100));
partitionData.put("Partition0", executionContext0);
ExecutionContext executionContext1 = new ExecutionContext();
executionContext1.putString("from", Integer.toString(101));
executionContext1.putString("to", Integer.toString(200));
partitionData.put("Partition1", executionContext1);
ExecutionContext executionContext2 = new ExecutionContext();
executionContext2.putString("from", Integer.toString(201));
executionContext2.putString("to", Integer.toString(299));
partitionData.put("Partition2", executionContext2);
return partitionData;
}
}
The reader class:
public class ProcessReader implements ItemReader<List<Pojo>>, ChunkListener {
@Autowired
private IService service;
private StepExecution stepExecution;
private static List<String> processedIntervals = new ArrayList<String>();
@Override
public List<Pojo> read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
System.out.println("Instance reference: "+this.toString());
if(stepExecution.getExecutionContext().containsKey("from") && stepExecution.getExecutionContext().containsKey("to")){
Integer from = Integer.valueOf(stepExecution.getExecutionContext().get("from").toString());
Integer to = Integer.valueOf(stepExecution.getExecutionContext().get("to").toString());
if(from != null && to != null && !processedIntervals.contains(from + "" + to) && to < BatchConfiguration.myList.size()){
processedIntervals.add(String.valueOf(from + "" + to));
return BatchConfiguration.myList.subList(from, to);
}
}
return null;
}
@Override
public void beforeChunk(ChunkContext context) {
this.stepExecution = context.getStepContext().getStepExecution();
}
@Override
public void afterChunk(ChunkContext context) { }
@Override
public void afterChunkError(ChunkContext context) { }
}
The writer class:
public class ProcessWriter implements ItemWriter<List<Pojo>>{
private final static Logger LOGGER = LoggerFactory.getLogger(ProcessWriter.class);
@Autowired
private IService service;
@Override
public void write(List<? extends List<Pojo>> pojos) throws Exception {
if(!pojos.isEmpty()){
for(Pojo item : pojos.get(0)){
try {
service.remove(item.getId());
} catch (Exception e) {
LOGGER.error("Error occured while removing the item [" + item.getId() + "]", e);
}
}
}
}
}
Can you please tell me what is wrong with my code?
Resolved by adding @StepScope to my reader and writer bean declarations:
@Configuration
@Import({ DataSourceConfiguration.class})
public class BatchConfiguration {
...
@Bean
@StepScope
public ProcessReader processReader() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
return new ProcessReader();
}
@Bean
@StepScope
public ProcessWriter processWriter() {
return new ProcessWriter();
}
...
}
This way, I have a different instance of the chunk components (Reader and Writer) for each partition.
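A related convenience (a sketch, assuming the "from"/"to" keys written by MyPartitionner): once the reader is step-scoped, the partition values can be injected directly instead of being captured in beforeChunk():
@Bean
@StepScope
public ProcessReader processReader(
@Value("#{stepExecutionContext['from']}") Integer from,
@Value("#{stepExecutionContext['to']}") Integer to) {
return new ProcessReader(from, to); // hypothetical constructor taking the partition bounds
}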
I am trying to process a series of files using Spring Integration in a batch fashion. I have this very old XML which tries to convert the messages into jobs:
<int:transformer ref="messageToJobTransformer"/>
<batch-int:job-launching-gateway job-launcher="jobLauncher"/>
The messageToJobTransformer is a class which can convert a Message into a Job. The problem is that I don't know where this file is now, and I don't want an XML config anyway. I want pure Java DSL. Here is my simple config:
return IntegrationFlows.from(Files.inboundAdapter(directory)
.preventDuplicates()
.patternFilter("*.txt")))
.handle(jobLaunchingGw())
.get();
And here is my bean for the gateway.
@Autowired
private JobLauncher jobLauncher;
@Bean
public MessageHandler jobLaunchingGw() {
return new JobLaunchingGateway(jobLauncher);
}
EDIT: Updating the batch config class.
@Configuration
@EnableBatchProcessing
public class BatchConfig
{
@Autowired
private JobBuilderFactory jobs;
@Autowired
private StepBuilderFactory steps;
@Bean
public ItemReader<String> reader(@Value("#{jobParameters['input.file.name']}") String filename) throws MalformedURLException
{
FlatFileItemReader<String> reader = new FlatFileItemReader<String>();
return reader;
}
@Bean
public Job job() throws MalformedURLException
{
return jobs.get("job").start(step()).build();
}
@Bean
public Step step() throws MalformedURLException
{
return steps.get("step").<String, String> chunk(5).reader(reader())
.writer(writer()).build();
}
@Bean
public ItemWriter<String> writer(@Value("#{jobParameters['input.file.name']}") String filename)
{
FlatFileItemWriter writer = new FlatFileItemWriter();
return writer;
}
}
Your question isn't clear. The JobLaunchingGateway expects a JobLaunchRequest as its payload.
Since your integration flow begins from the Files.inboundAdapter(directory), I can assume that you have some Job definitions there. So, what you need here is some class which can parse the file and return a JobLaunchRequest.
Something like this from the Spring Batch Reference Manual:
public class FileMessageToJobRequest {
private Job job;
private String fileParameterName;
public void setFileParameterName(String fileParameterName) {
this.fileParameterName = fileParameterName;
}
public void setJob(Job job) {
this.job = job;
}
@Transformer
public JobLaunchRequest toRequest(Message<File> message) {
JobParametersBuilder jobParametersBuilder =
new JobParametersBuilder();
jobParametersBuilder.addString(fileParameterName,
message.getPayload().getAbsolutePath());
return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
}
}
After defining that class as a @Bean, you can use it from the .transform() EIP-method just before your .handle(jobLaunchingGw()).
UPDATE
@Bean
public FileMessageToJobRequest fileMessageToJobRequest(Job job) {
FileMessageToJobRequest fileMessageToJobRequest = new FileMessageToJobRequest();
fileMessageToJobRequest.setJob(job);
fileMessageToJobRequest.setFileParameterName("file");
return fileMessageToJobRequest;
}
...
@Bean
public IntegrationFlow flowToBatch(FileMessageToJobRequest fileMessageToJobRequest) {
return IntegrationFlows
.from(Files.inboundAdapter(directory)
.preventDuplicates()
.patternFilter("*.txt")))
.transform(fileMessageToJobRequest)
.handle(jobLaunchingGw())
.get();
}
I have a simple CSV file with ~400,000 lines (one column only).
It takes me a lot of time to read the records and process them.
The processor validates records against Couchbase.
The writer writes into a remote topic.
It takes around 30 minutes. That's insane.
I read that FlatFileItemReader is not thread-safe, so my chunk value is 1.
I read that asynchronous processing could help, but I can't see any improvement.
That's my code:
@Configuration
@EnableBatchProcessing
public class NotificationFileProcessUploadedFileJob {
#Value("${expected.snid.header}")
public String snidHeader;
#Value("${num.of.processing.chunks.per.file}")
public int numOfProcessingChunksPerFile;
#Autowired
private InfrastructureConfigurationConfig infrastructureConfigurationConfig;
private static final String OVERRIDDEN_BY_EXPRESSION = null;
#Inject
private JobBuilderFactory jobs;
#Inject
private StepBuilderFactory stepBuilderFactory;
#Inject
ExecutionContextPromotionListener executionContextPromotionListener;
#Bean
public Job processUploadedFileJob() throws Exception {
return this.jobs.get("processUploadedFileJob").start((processSnidUploadedFileStep())).build();
}
@Bean
public Step processSnidUploadedFileStep() {
return stepBuilderFactory.get("processSnidFileStep")
.<PushItemDTO, PushItemDTO>chunk(numOfProcessingChunksPerFile)
.reader(snidFileReader(OVERRIDDEN_BY_EXPRESSION))
.processor(asyncItemProcessor())
.writer(asyncItemWriter())
// .throttleLimit(20)
// .taskJobExecutor(infrastructureConfigurationConfig.taskJobExecutor())
// .faultTolerant()
// .skipLimit(10) //default is set to 0
// .skip(MySQLIntegrityConstraintViolationException.class)
.build();
}
@Inject
ItemWriter writer;
@Bean
public AsyncItemWriter asyncItemWriter() {
AsyncItemWriter asyncItemWriter=new AsyncItemWriter();
asyncItemWriter.setDelegate(writer);
return asyncItemWriter;
}
@Bean
@Scope(value = "step", proxyMode = ScopedProxyMode.INTERFACES)
public ItemStreamReader<PushItemDTO> snidFileReader(@Value("#{jobParameters[filePath]}") String filePath) {
FlatFileItemReader<PushItemDTO> itemReader = new FlatFileItemReader<PushItemDTO>();
itemReader.setLineMapper(snidLineMapper());
itemReader.setLinesToSkip(1);
itemReader.setResource(new FileSystemResource(filePath));
return itemReader;
}
@Bean
public AsyncItemProcessor asyncItemProcessor() {
AsyncItemProcessor<PushItemDTO, PushItemDTO> asyncItemProcessor = new AsyncItemProcessor();
asyncItemProcessor.setDelegate(processor(OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION,
OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION, OVERRIDDEN_BY_EXPRESSION));
asyncItemProcessor.setTaskExecutor(infrastructureConfigurationConfig.taskProcessingExecutor());
return asyncItemProcessor;
}
#Scope(value = "step", proxyMode = ScopedProxyMode.INTERFACES)
#Bean
public ItemProcessor<PushItemDTO, PushItemDTO> processor(#Value("#{jobParameters[pushMessage]}") String pushMessage,
#Value("#{jobParameters[jobId]}") String jobId,
#Value("#{jobParameters[taskId]}") String taskId,
#Value("#{jobParameters[refId]}") String refId,
#Value("#{jobParameters[url]}") String url,
#Value("#{jobParameters[targetType]}") String targetType,
#Value("#{jobParameters[gameType]}") String gameType) {
return new PushItemProcessor(pushMessage, jobId, taskId, refId, url, targetType, gameType);
}
@Bean
public LineMapper<PushItemDTO> snidLineMapper() {
DefaultLineMapper<PushItemDTO> lineMapper = new DefaultLineMapper<PushItemDTO>();
DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
lineTokenizer.setDelimiter(",");
lineTokenizer.setStrict(true);
String[] splittedHeader = snidHeader.split(",");
lineTokenizer.setNames(splittedHeader);
BeanWrapperFieldSetMapper<PushItemDTO> fieldSetMapper = new BeanWrapperFieldSetMapper<PushItemDTO>();
fieldSetMapper.setTargetType(PushItemDTO.class);
lineMapper.setLineTokenizer(lineTokenizer);
lineMapper.setFieldSetMapper(new PushItemFieldSetMapper());
return lineMapper;
}
}
@Bean
@Override
public SimpleAsyncTaskExecutor taskProcessingExecutor() {
SimpleAsyncTaskExecutor simpleAsyncTaskExecutor = new SimpleAsyncTaskExecutor();
simpleAsyncTaskExecutor.setConcurrencyLimit(300);
return simpleAsyncTaskExecutor;
}
How do you think I could improve the processing performance and make it faster?
Thank you.
ItemWriter code:
@Bean
public ItemWriter writer() {
return new KafkaWriter();
}
public class KafkaWriter implements ItemWriter<PushItemDTO> {
private static final Logger logger = LoggerFactory.getLogger(KafkaWriter.class);
@Autowired
KafkaProducer kafkaProducer;
@Override
public void write(List<? extends PushItemDTO> items) throws Exception {
for (PushItemDTO item : items) {
try {
logger.debug("Writing to kafka=" + item);
sendMessageToKafka(item);
} catch (Exception e) {
logger.error("Error writing item=" + item.toString(), e);
}
}
}
}
Increasing your commit count is where I'd begin. Keep in mind what the commit count means. Since you have it set at 1, you are doing the following for each item:
Start a transaction
Read an item
Process the item
Write the item
Update the job repository
Commit the transaction
Your configuration doesn't show what the delegate ItemWriter is, so I can't tell, but at a minimum you are executing multiple SQL statements per item just to update the job repository.
You are correct that the FlatFileItemReader is not thread-safe. However, you aren't using multiple threads to read, only to process, so from what I can see there is no reason to set the commit count to 1.
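To make that concrete, here is a sketch of the suggested change, reusing the step definition from the question; 1000 is an illustrative commit interval to tune against memory and latency:
@Bean
public Step processSnidUploadedFileStep() {
return stepBuilderFactory.get("processSnidFileStep")
.<PushItemDTO, PushItemDTO>chunk(1000) // commit every 1000 items instead of every single item
.reader(snidFileReader(OVERRIDDEN_BY_EXPRESSION))
.processor(asyncItemProcessor())
.writer(asyncItemWriter())
.build();
}
As an aside (my suggestion, not part of the original answer): SimpleAsyncTaskExecutor spawns a new thread per task, so a concurrency limit of 300 can thrash; a bounded ThreadPoolTaskExecutor is usually a better fit for the async processor:
@Bean
public ThreadPoolTaskExecutor taskProcessingExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(16); // illustrative sizing; tune per workload
executor.setMaxPoolSize(32);
executor.setQueueCapacity(100);
return executor;
}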