How do I store the whole read line using Spring Batch? - java

I am using Spring Boot 1.4 and Spring Batch 1.4 to read in a file and of course, parse the data into the database.
What I would like to do is store the entire line read in the database before the fields are mapped. The entire row would be stored as a string in the database. This is for auditing purposes, therefore I do not want to rebuild the row string from its components.
We have all seen the common mappers in use to get the data from the delimited line:
@Bean
@StepScope
public FlatFileItemReader<Claim> claimFileReader(@Value("#{jobParameters[fileName]}") String pathToFile) {
    logger.debug("Setting up FlatFileItemReader for claim");
    logger.debug("Job Parameter for input filename: " + pathToFile);
    FlatFileItemReader<Claim> reader = new FlatFileItemReader<Claim>();
    reader.setResource(new FileSystemResource(pathToFile));
    reader.setLineMapper(claimLineMapper());
    logger.debug("Finished setting up FlatFileItemReader for claim");
    return reader;
}
@Bean
public LineMapper<Claim> claimLineMapper() {
    logger.debug("Setting up lineMapper");
    DefaultLineMapper<Claim> lineMapper = new DefaultLineMapper<Claim>();
    DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
    lineTokenizer.setDelimiter("|");
    lineTokenizer.setStrict(false);
    lineTokenizer.setNames(new String[] { "RX_NUMBER", "SERVICE_DT", "CLAIM_STS", "PROCESSOR_CLAIM_ID", "CARRIER_ID", "GROUP_ID", "MEM_UNIQUE_ID" });
    BeanWrapperFieldSetMapper<Claim> fieldSetMapper = new BeanWrapperFieldSetMapper<Claim>();
    fieldSetMapper.setTargetType(Claim.class);
    lineMapper.setLineTokenizer(lineTokenizer);
    // use the mapper configured above
    lineMapper.setFieldSetMapper(fieldSetMapper);
    logger.debug("Finished Setting up lineMapper");
    return lineMapper;
}
If this is my row:
463832|20160101|PAID|504419000000|XYZ|GOLD PLAN|561868
I would want to store "463832|20160101|PAID|504419000000|XYZ|GOLD PLAN|561868" as the string in the database (probably with some additional data such as job_instance_id).
Any ideas on how to hook this in during the file reading process?

Instead of using DefaultLineMapper directly, you can subclass it (for example as CustomLineMapper) and intercept each line before it is mapped:
public class CustomLineMapper extends DefaultLineMapper<Claim> {
    @Override
    public Claim mapLine(String line, int lineNumber) throws Exception {
        // 'line' is the raw, untokenized text; handle it here (for example, store it for auditing)
        return super.mapLine(line, lineNumber);
    }
}
The line argument contains the raw text exactly as it was read from the file, before it is mapped to an object.
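If the raw line has to travel with the item all the way to the writer, one option is to wrap the existing mapper and return the mapped item together with the untouched line. The following is a minimal sketch under stated assumptions, not a definitive implementation: to stay self-contained it declares a local LineMapper interface with the same single method as Spring Batch's LineMapper, and RawLineCapturingMapper, AuditedItem and the toy delegate in demo() are hypothetical names that are not part of Spring Batch.

```java
import java.util.Arrays;
import java.util.List;

final class RawLineCaptureSketch {

    /** Local stand-in with the same shape as Spring Batch's LineMapper (assumption for self-containment). */
    interface LineMapper<T> {
        T mapLine(String line, int lineNumber) throws Exception;
    }

    /** Hypothetical holder pairing the mapped item with the line it came from. */
    static final class AuditedItem<T> {
        final String rawLine;
        final int lineNumber;
        final T item;

        AuditedItem(String rawLine, int lineNumber, T item) {
            this.rawLine = rawLine;
            this.lineNumber = lineNumber;
            this.item = item;
        }
    }

    /** Delegates the parsing and keeps the raw line next to the result. */
    static final class RawLineCapturingMapper<T> implements LineMapper<AuditedItem<T>> {
        private final LineMapper<T> delegate;

        RawLineCapturingMapper(LineMapper<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public AuditedItem<T> mapLine(String line, int lineNumber) throws Exception {
            return new AuditedItem<>(line, lineNumber, delegate.mapLine(line, lineNumber));
        }
    }

    /** Toy run: the delegate simply splits on '|', standing in for a real tokenizer and field mapper. */
    static AuditedItem<List<String>> demo() {
        LineMapper<List<String>> splitter = (line, n) -> Arrays.asList(line.split("\\|"));
        try {
            return new RawLineCapturingMapper<>(splitter)
                    .mapLine("463832|20160101|PAID|504419000000|XYZ|GOLD PLAN|561868", 1);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a real job the writer could then persist rawLine (plus, for example, the job instance id) to an audit table alongside the mapped item.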

Related

Spring batch multiple output file for input files with MultiResourceItemReader

I am using MultiResourceItemReader to read multiple CSV files whose lines map to ObjectX (field1, field2, field3, ...).
The problem is that once the processor finishes, the writer receives the ObjectX lines from all the files together.
I have to write the accepted data to a file with the same name as the input file.
I am using DelimitedLineAggregator.
Is there a way to have one writer per file while using MultiResourceItemReader, given that the writer accepts only one resource at a time?
This is an example of what I have:
@Bean
public MultiResourceItemReader<ObjectX> multiResourceItemReader()
{
    MultiResourceItemReader<ObjectX> resourceItemReader = new MultiResourceItemReader<ObjectX>();
    resourceItemReader.setResources(inputResources);
    resourceItemReader.setDelegate(reader());
    return resourceItemReader;
}
@Bean
public FlatFileItemReader<ObjectX> flatFileItemReader() {
    FlatFileItemReader<ObjectX> flatFileItemReader = new FlatFileItemReader<>();
    flatFileItemReader.setComments(new String[]{});
    flatFileItemReader.setLineMapper(lineMapper());
    return flatFileItemReader;
}
@Override
@StepScope
public Sinistre process(ObjectX objectX) throws Exception {
    //business logic
    return objectX;
}
@Bean
@StepScope
public FlatFileItemWriter<Sinistre> flatFileItemWriter(
        @Value("${doneFile}") FileSystemResource doneFile,
        @Value("#{stepExecution.jobExecution}") JobExecution jobExecution
) {
    FlatFileItemWriter writer = new FlatFileItemWriter<Sinistre>() {
        private String resourceName;
        @Override
        public String doWrite(List<? extends ObjectX> items) {
            //business logic
            //business logic
            //business logic
            return super.doWrite(items);
        }
    };
    DelimitedLineAggregator delimitedLineAggregator = new DelimitedLineAggregator();
    delimitedLineAggregator.setDelimiter(";");
    BeanWrapperFieldExtractor beanWrapperFieldExtractor = new BeanWrapperFieldExtractor();
    beanWrapperFieldExtractor.setNames(new String[]{"field1", "field2", "field3", "field4".......});
    delimitedLineAggregator.setFieldExtractor(beanWrapperFieldExtractor);
    writer.setResource(doneFile);
    writer.setLineAggregator(delimitedLineAggregator);
    // how to write the header
    writer.setHeaderCallback(new FlatFileHeaderCallback() {
        @Override
        public void writeHeader(Writer writer) throws IOException {
            writer.write((String) jobExecution.getExecutionContext().get("header"));
        }
    });
    writer.setAppendAllowed(false);
    writer.setFooterCallback(new FlatFileFooterCallback() {
        @Override
        public void writeFooter(Writer writer) throws IOException {
            writer.write("#--- fin traitement ---");
        }
    });
    return writer;
}
This is the class I refer to as ObjectX:
public class SinistreDto implements ResourceAware {
private String codeCompagnieA;//A
private String numPoliceA;//B
private String numAttestationA;//C
private String immatriculationA;//D
private String numSinistreA;//E
private String pctResponsabiliteA;//F
private String dateOuvertureA;//G
private String codeCompagnieB;//H
private String numPoliceB;//I
private String numAttestationB;//J
private String immatriculationB;//K
private String numSinistreB;//L
private Resource resource;
}
This is the data in a CSV file (I will have many files with data exactly like this):
38;5457;16902-A;0001-02-34;84485;000;20221010 12:15;55;5457;W3456;22-A555
76;544687;16902;1234-56;8448;025;20221010 12:15;22;544687;WW456;22-A555
65;84987;16902;WW 123456;74478;033;20221010 12:15;88;84987;WW3456;22-A555
This is the output file I expect for each input file:
#header
38;5457;16902-A;0001-02-34;84485;000;20221010 12:15;55;5457;W3456;22-A555
76;544687;16902;1234-56;8448;025;20221010 12:15;22;544687;WW456;22-A555
65;84987;16902;WW 123456;74478;033;20221010 12:15;88;84987;WW3456;22-A555
#--- fin traitement ---
I see no difference between the input file and output file except the header and trailer lines. But that is not an issue, you probably omitted the processing part as it is not relevant to the question.
I believe the MultiResourceItemReader is not suitable for your case as data from different input files can end up in the same chunk, and hence written to the same output file, which is not what you want.
I think a good option for your use case is to use partitioning, where each partition is a file. This way, each input file will be read, processed and written to a corresponding output file. Spring Batch provides the MultiResourcePartitioner that will create a partition per file. You can find an example here: https://github.com/spring-projects/spring-batch/blob/main/spring-batch-samples/src/main/resources/jobs/iosample/multiResource.xml.
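With one partition per input file, each step-scoped writer needs an output path derived from its own input file. The helper below is a minimal sketch of that derivation, assuming the partition's step execution context holds the input file's URL as a string (MultiResourcePartitioner stores it under the key fileName by default). The class name, the method outputFor and the directory names are hypothetical.

```java
import java.net.URI;

final class OutputFileNaming {

    /**
     * Returns outputDir + "/" + the last path segment of the input URL,
     * so the output file carries the same name as the input file.
     */
    static String outputFor(String inputFileUrl, String outputDir) {
        String path = URI.create(inputFileUrl).getPath();
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        // avoid a double slash when the directory already ends with one
        String dir = outputDir.endsWith("/")
                ? outputDir.substring(0, outputDir.length() - 1)
                : outputDir;
        return dir + "/" + fileName;
    }
}
```

In a step-scoped writer bean, the URL could be injected with @Value("#{stepExecutionContext['fileName']}") and the result wrapped in a FileSystemResource before being passed to the writer's setResource.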

Spring Batch PathMatchingResourcePatternResolver.getResources() not working for https url

I am trying to read multiple CSV files located at "https://raw.githubusercontent.com/Shrutika09/SpringBatchTemplateUploaderPOC/main/order-data-*.csv" and insert them into a database in parallel using Spring Batch.
When I use the URL of a single CSV file (https://raw.githubusercontent.com/Shrutika09/SpringBatchTemplateUploaderPOC/main/order-data-1.csv), all records are read and inserted into the database.
But when I try to read all files matching a naming pattern (https://raw.githubusercontent.com/Shrutika09/SpringBatchTemplateUploaderPOC/main/order-data-*.csv), the files are not recognized and the job does not work as expected.
Is there any way to read all files matching a particular naming pattern from a GitHub location?
I am using a Spring Batch Partitioner.
Partitioner:
@Bean
public Partitioner partitioner() throws Exception {
    System.out.println("In Partitioner");
    MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
    PathMatchingResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
    partitioner.setResources(resolver.getResources("https://raw.githubusercontent.com/Shrutika09/SpringBatchTemplateUploaderPOC/main/order-data-*.csv"));
    partitioner.partition(5);
    return partitioner;
}
Reader:
@Bean
@StepScope
public FlatFileItemReader<Orders> reader(@Value("#{stepExecutionContext['fileName']}") String path)
        throws MalformedURLException {
    System.out.println("In Reader");
    System.out.println("In Reader" + path);
    FlatFileItemReader<Orders> reader = new FlatFileItemReader<Orders>();
    reader.setResource(new UrlResource(path));
    reader.setLineMapper(new DefaultLineMapper<Orders>() {
        {
            setLineTokenizer(new DelimitedLineTokenizer() {
                {
                    setNames(new String[] { "id", "firstName", "lastName" });
                }
            });
            setFieldSetMapper(new BeanWrapperFieldSetMapper<Orders>() {
                {
                    setTargetType(Orders.class);
                }
            });
        }
    });
    return reader;
}

Spring batch FlatFileItemWriter write as csv from Object

I am using Spring batch and have an ItemWriter as follows:
public class MyItemWriter implements ItemWriter<Fixing> {
    private final FlatFileItemWriter<Fixing> writer;
    private final FileSystemResource resource;
    public MyItemWriter () {
        this.writer = new FlatFileItemWriter<>();
        this.resource = new FileSystemResource("target/output-teste.txt");
    }
    @Override
    public void write(List<? extends Fixing> items) throws Exception {
        this.writer.setResource(new FileSystemResource(resource.getFile()));
        this.writer.setLineAggregator(new PassThroughLineAggregator<>());
        this.writer.afterPropertiesSet();
        this.writer.open(new ExecutionContext());
        this.writer.write(items);
    }
    @AfterWrite
    private void close() {
        this.writer.close();
    }
}
When I run my spring batch job, the items are written to file as:
Fixing{id='123456', source='TEST', startDate=null, endDate=null}
Fixing{id='1234567', source='TEST', startDate=null, endDate=null}
Fixing{id='1234568', source='TEST', startDate=null, endDate=null}
1/ How can I write just the data so that the values are comma separated and where it is null, it is not written. So the target file should look like this:
123456,TEST
1234567,TEST
1234568,TEST
2/ Secondly, I am having an issue where only when I exit spring boot application, I am able to see the file get created. What I would like is once it has processed all the items and written, the file to be available without closing the spring boot application.
There are multiple options for writing a CSV file. Regarding your second question, flushing the writer will solve the issue.
https://howtodoinjava.com/spring-batch/flatfileitemwriter-write-to-csv-file/
We prefer to use OpenCSV with Spring Batch because it gives us more speed and control on huge files. An example snippet is below:
class DocumentWriter implements ItemWriter<BaseDTO>, Closeable {
    private static final Logger LOG = LoggerFactory.getLogger(DocumentWriter.class);
    private ColumnPositionMappingStrategy<Statement> strategy;
    private static final String[] columns = new String[] { "csvcolumn1", "csvcolumn2", "csvcolumn3",
            "csvcolumn4", "csvcolumn5", "csvcolumn6", "csvcolumn7"};
    private BufferedWriter writer;
    private StatefulBeanToCsv<Statement> beanToCsv;
    public DocumentWriter() throws Exception {
        strategy = new ColumnPositionMappingStrategy<Statement>();
        strategy.setType(Statement.class);
        strategy.setColumnMapping(columns);
        // 'env' and 'processCount' are fields that this snippet does not show
        String filename = env.getProperty("globys.statement.cdf.path") + "-" + processCount + ".dat";
        File cdf = new File(filename);
        // append to an existing file, otherwise create a new one
        if (cdf.exists()) {
            writer = Files.newBufferedWriter(Paths.get(filename), StandardCharsets.UTF_8, StandardOpenOption.APPEND);
        } else {
            writer = Files.newBufferedWriter(Paths.get(filename), StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW);
        }
        beanToCsv = new StatefulBeanToCsvBuilder<Statement>(writer).withQuotechar(CSVWriter.NO_QUOTE_CHARACTER)
                .withMappingStrategy(strategy).withSeparator(',').build();
    }
    @Override
    public void write(List<? extends BaseDTO> items) throws Exception {
        List<Statement> settlementList = new ArrayList<Statement>();
        for (int i = 0; i < items.size(); i++) {
            BaseDTO baseDTO = items.get(i);
            settlementList.addAll(baseDTO.getStatementList());
        }
        beanToCsv.write(settlementList);
        // flush after every chunk so the data is visible in the file right away
        writer.flush();
    }
    @PreDestroy
    @Override
    public void close() throws IOException {
        writer.close();
    }
}
Since you are using PassThroughLineAggregator, which simply calls item.toString() to write each object, overriding toString() in Fixing (or in the classes extending it) so that it returns the comma-separated values should fix it.
1/ How can I write just the data so that the values are comma separated and where it is null, it is not written.
You need to provide a custom LineAggregator that filters out null fields.
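A null-filtering aggregator could look like the sketch below. It is a plain-Java illustration under assumptions rather than a drop-in class: it declares a local LineAggregator interface with the same single method as Spring Batch's LineAggregator, and the names NullSkippingAggregator and the simplified Fixing class (with String fields) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

final class NullSkippingAggregatorSketch {

    /** Local stand-in with the same shape as Spring Batch's LineAggregator (assumption for self-containment). */
    interface LineAggregator<T> {
        String aggregate(T item);
    }

    /** Hypothetical item mirroring the fields shown in the question's output. */
    static final class Fixing {
        final String id;
        final String source;
        final String startDate;
        final String endDate;

        Fixing(String id, String source, String startDate, String endDate) {
            this.id = id;
            this.source = source;
            this.startDate = startDate;
            this.endDate = endDate;
        }
    }

    /** Extracts the fields in order, drops the null ones, and joins the rest with the delimiter. */
    static final class NullSkippingAggregator<T> implements LineAggregator<T> {
        private final String delimiter;
        private final List<Function<T, Object>> extractors;

        NullSkippingAggregator(String delimiter, List<Function<T, Object>> extractors) {
            this.delimiter = delimiter;
            this.extractors = extractors;
        }

        @Override
        public String aggregate(T item) {
            return extractors.stream()
                    .map(extractor -> extractor.apply(item))
                    .filter(Objects::nonNull)
                    .map(Object::toString)
                    .collect(Collectors.joining(delimiter));
        }
    }

    /** Aggregates the first item from the question. */
    static String demo() {
        List<Function<Fixing, Object>> fields = new ArrayList<>();
        fields.add(f -> f.id);
        fields.add(f -> f.source);
        fields.add(f -> f.startDate);
        fields.add(f -> f.endDate);
        return new NullSkippingAggregator<>(",", fields)
                .aggregate(new Fixing("123456", "TEST", null, null));
    }
}
```

Note that dropping null values shifts the remaining columns to the left, which is harmless here because the null fields are at the end of the line; if a field in the middle can be null, writing an empty value instead keeps the columns aligned.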
2/ Secondly, I am having an issue where only when I exit spring boot application, I am able to see the file get created
This is probably because you are calling this.writer.open in the write method, which is not correct. You need to make your item writer implement ItemStream and call this.writer.open and this.writer.close in ItemStream#open and ItemStream#close respectively.
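The lifecycle this points to is: open the file once, write and flush each chunk, close once at the end of the step. The sketch below shows that shape in plain Java so it can be run on its own. ChunkFileWriter and its methods are hypothetical; in Spring Batch the three methods would correspond to ItemStream#open, ItemWriter#write and ItemStream#close, each delegating to the wrapped FlatFileItemWriter.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

final class WriterLifecycleSketch {

    /** Hypothetical writer that separates open, write and close. */
    static final class ChunkFileWriter implements AutoCloseable {
        private final Path target;
        private BufferedWriter out;

        ChunkFileWriter(Path target) {
            this.target = target;
        }

        /** Called once before the first chunk (compare ItemStream#open). */
        void open() throws IOException {
            out = Files.newBufferedWriter(target, StandardCharsets.UTF_8);
        }

        /** Called once per chunk (compare ItemWriter#write); flushes so the data is visible immediately. */
        void write(List<String> lines) throws IOException {
            if (out == null) {
                throw new IllegalStateException("Writer must be open before it can be written to");
            }
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
            out.flush();
        }

        /** Called once after the last chunk (compare ItemStream#close). */
        @Override
        public void close() throws IOException {
            if (out != null) {
                out.close();
                out = null;
            }
        }
    }

    /** Writes two chunks to a temporary file and returns the lines read back from it. */
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("fixings", ".csv");
            try (ChunkFileWriter writer = new ChunkFileWriter(file)) {
                writer.open();
                writer.write(Arrays.asList("123456,TEST", "1234567,TEST"));
                writer.write(Arrays.asList("1234568,TEST"));
            }
            List<String> written = Files.readAllLines(file, StandardCharsets.UTF_8);
            Files.delete(file);
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

When the delegate is registered with the step as a stream, or the custom writer itself implements ItemStream, the framework calls open and close at the step boundaries, so the file is complete as soon as the step ends rather than when the application exits.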

How to create a generic FlatFileItemReader to read CSV files with different headers?

I'm creating a job that will read and process different .csv files based on an input parameter. There are 3 different types of .csv files with different headers. I want to map each line of a file to a POJO using a generic FlatFileItemReader.
Each type of file will have its own POJO implementation, and all "File Specific POJOs" are subclassed from an abstract GenericFilePOJO.
A tasklet will first read the input parameter to decide which file type needs to be read, and construct a LineTokenizer with the appropriate header columns. It places this information in the infoHolder for retrieval at the reader step.
@Bean
public FlatFileItemReader<GenericFilePOJO> reader() {
    FlatFileItemReader<GenericFilePOJO> reader = new FlatFileItemReader<GenericFilePOJO>();
    reader.setLinesToSkip(1); // header
    reader.setLineMapper(new DefaultLineMapper<GenericFilePOJO>() {
        {
            // The infoHolder will contain the file-specific LineTokenizer
            setLineTokenizer(infoHolder.getLineTokenizer());
            setFieldSetMapper(new BeanWrapperFieldSetMapper<GenericFilePOJO>() {
                {
                    setTargetType(GenericFilePOJO.class);
                }
            });
        }
    });
    return reader;
}
Can this reader handle the different File Specific POJOs despite returning the GenericFilePOJO?
You wrote:
A tasklet will first read the input parameter to decide which file
type needs to be read.
Because the tasklet (or the infoHolder) already knows the type of file, it can also create the matching FieldSetMapper instance for that file type.
Here is a demo of how this could be implemented:
public class Solution<T extends GenericFilePOJO> {
    private InfoHolder infoHolder = new InfoHolder();
    @Bean
    public FlatFileItemReader<T> reader()
    {
        FlatFileItemReader<T> reader = new FlatFileItemReader<T>();
        reader.setLinesToSkip(1);
        reader.setLineMapper(new DefaultLineMapper<T>() {
            {
                setLineTokenizer(infoHolder.getLineTokenizer());
                setFieldSetMapper(infoHolder.getFieldSetMapper());
            }
        });
        return reader;
    }
    private class InfoHolder {
        DelimitedLineTokenizer getLineTokenizer() {
            return <some already existent logic>;
        }
        FieldSetMapper<T> getFieldSetMapper() {
            if (some condition for specific file POJO 1){
                return new BeanWrapperFieldSetMapper<T>() {
                    {
                        setTargetType(FileSpecificPOJO_1.class);
                    }
                };
            } else if (some condition for specific file POJO 2){
                return new BeanWrapperFieldSetMapper<T>() {
                    {
                        setTargetType(FileSpecificPOJO_2.class);
                    }
                };
            }
            // no known file type matched
            throw new IllegalStateException("Unsupported file type");
        }
    }
}
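The choice of target class can also be expressed as a lookup table instead of an if/else chain, which keeps the mapping in one place as file types are added. This is a minimal plain-Java sketch under assumptions: the empty POJO classes stand in for the real ones, and the keys "TYPE_1" and "TYPE_2" and the method targetTypeFor are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class TargetTypeRegistrySketch {

    // Empty stand-ins for the POJO hierarchy described in the question
    abstract static class GenericFilePOJO { }
    static final class FileSpecificPOJO_1 extends GenericFilePOJO { }
    static final class FileSpecificPOJO_2 extends GenericFilePOJO { }

    // Hypothetical file-type keys mapped to the class each line should be bound to
    private static final Map<String, Class<? extends GenericFilePOJO>> TARGET_TYPES = new HashMap<>();
    static {
        TARGET_TYPES.put("TYPE_1", FileSpecificPOJO_1.class);
        TARGET_TYPES.put("TYPE_2", FileSpecificPOJO_2.class);
    }

    /** Returns the target class for a file type, failing fast on an unknown type. */
    static Class<? extends GenericFilePOJO> targetTypeFor(String fileType) {
        Class<? extends GenericFilePOJO> type = TARGET_TYPES.get(fileType);
        if (type == null) {
            throw new IllegalArgumentException("Unsupported file type: " + fileType);
        }
        return type;
    }
}
```

The returned class can then be passed to setTargetType on a BeanWrapperFieldSetMapper<GenericFilePOJO>, so a single reader definition produces the file-specific subclass at runtime.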

Using ClassifierCompositeItemWriter and FlatFileItemWriter to write to multiple files

I'm trying to create a spring batch job that will read from MySQL database and write the data to different files depending on a value from the database. I am getting an error :
org.springframework.batch.item.WriterNotOpenException: Writer must be open before it can be written to
at org.springframework.batch.item.file.FlatFileItemWriter.write(FlatFileItemWriter.java:255)
Here's my ClassifierCompositeItemWriter
ClassifierCompositeItemWriter<WithdrawalTransaction> classifierCompositeItemWriter = new ClassifierCompositeItemWriter<WithdrawalTransaction>();
classifierCompositeItemWriter.setClassifier(new Classifier<WithdrawalTransaction,
        ItemWriter<? super WithdrawalTransaction>>() {
    @Override
    public ItemWriter<? super WithdrawalTransaction> classify(WithdrawalTransaction wt) {
        ItemWriter<? super WithdrawalTransaction> itemWriter = null;
        if(wt.getPaymentMethod().equalsIgnoreCase("PDDTS")) { // condition
            itemWriter = pddtsWriter();
        } else {
            itemWriter = swiftWriter();
        }
        return itemWriter;
    }
});
As you can see, I only used two file writers for now.
@Bean("pddtsWriter")
private FlatFileItemWriter<WithdrawalTransaction> pddtsWriter()
And
@Bean("swiftWriter")
private FlatFileItemWriter<WithdrawalTransaction> swiftWriter()
I also added them as stream
@Bean
public Step processWithdrawalTransactions() throws Exception {
    return stepBuilderFactory.get("processWithdrawalTransactions")
            .<WithdrawalTransaction, WithdrawalTransaction> chunk(10)
            .processor(withdrawProcessor())
            .reader(withdrawReader)
            .writer(withdrawWriter)
            .stream(swiftWriter)
            .stream(pddtsWriter)
            .listener(headerWriter())
            .build();
}
Am I doing something wrong?
