Using ClassifierCompositeItemWriter and FlatFileItemWriter to write to multiple files - java

I'm trying to create a Spring Batch job that will read from a MySQL database and write the data to different files depending on a value from the database. I am getting this error:
org.springframework.batch.item.WriterNotOpenException: Writer must be open before it can be written to
at org.springframework.batch.item.file.FlatFileItemWriter.write(FlatFileItemWriter.java:255)
Here's my ClassifierCompositeItemWriter
ClassifierCompositeItemWriter<WithdrawalTransaction> classifierCompositeItemWriter =
        new ClassifierCompositeItemWriter<WithdrawalTransaction>();
classifierCompositeItemWriter.setClassifier(new Classifier<WithdrawalTransaction,
        ItemWriter<? super WithdrawalTransaction>>() {
    @Override
    public ItemWriter<? super WithdrawalTransaction> classify(WithdrawalTransaction wt) {
        ItemWriter<? super WithdrawalTransaction> itemWriter = null;
        if (wt.getPaymentMethod().equalsIgnoreCase("PDDTS")) { // condition
            itemWriter = pddtsWriter();
        } else {
            itemWriter = swiftWriter();
        }
        return itemWriter;
    }
});
As you can see, I only used two file writers for now.
#Bean("pddtsWriter")
private FlatFileItemWriter<WithdrawalTransaction> pddtsWriter()
And
#Bean("swiftWriter")
private FlatFileItemWriter<WithdrawalTransaction> swiftWriter()
I also added them as streams:
@Bean
public Step processWithdrawalTransactions() throws Exception {
    return stepBuilderFactory.get("processWithdrawalTransactions")
            .<WithdrawalTransaction, WithdrawalTransaction>chunk(10)
            .processor(withdrawProcessor())
            .reader(withdrawReader)
            .writer(withdrawWriter)
            .stream(swiftWriter)
            .stream(pddtsWriter)
            .listener(headerWriter())
            .build();
}
Am I doing something wrong?
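For comparison, here is a minimal sketch of the same wiring with the classifier returning the injected writer beans themselves (the @Qualifier fields are assumptions, not the original configuration). The delegates still have to be registered as streams, because ClassifierCompositeItemWriter is not an ItemStream and never opens or closes its delegates on its own:

@Autowired
@Qualifier("pddtsWriter")
private FlatFileItemWriter<WithdrawalTransaction> pddtsWriter;

@Autowired
@Qualifier("swiftWriter")
private FlatFileItemWriter<WithdrawalTransaction> swiftWriter;

@Bean
public Step processWithdrawalTransactions() throws Exception {
    ClassifierCompositeItemWriter<WithdrawalTransaction> classifierWriter =
            new ClassifierCompositeItemWriter<>();
    // Hand back the injected bean instances, so the writers the classifier picks
    // are exactly the ones the step opens and closes.
    classifierWriter.setClassifier(wt ->
            wt.getPaymentMethod().equalsIgnoreCase("PDDTS") ? pddtsWriter : swiftWriter);

    return stepBuilderFactory.get("processWithdrawalTransactions")
            .<WithdrawalTransaction, WithdrawalTransaction>chunk(10)
            .reader(withdrawReader)
            .processor(withdrawProcessor())
            .writer(classifierWriter)
            // Register the delegates so they are opened before the first write
            .stream(swiftWriter)
            .stream(pddtsWriter)
            .listener(headerWriter())
            .build();
}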

Related

Spring batch, how do I write simple strings into a file?

I'm new to Spring Batch. So far I've studied how to take a list of objects as input from CSV, XML, or JSON files or databases and write those same objects to external files or databases.
However, I just realized that I don't know how to output simple strings. For example, I've made this simple processor:
public class ProductProcessorObjToStr implements ItemProcessor<Product, String> {

    @Override
    public String process(Product product) throws Exception {
        return product.getProductName();
    }
}
So that I get a simple list of names, but I have no idea how to create the correct item writer.
I've studied these kinds of writers, where I map the various object fields:
@Bean
@StepScope
public FlatFileItemWriter flatFileItemWriter(@Value("#{jobParameters['outputFile']}") FileSystemResource outputFile) {
    FlatFileItemWriter writer = new FlatFileItemWriter<Product>();
    writer.setResource(outputFile);
    writer.setLineAggregator(new DelimitedLineAggregator() {
        {
            setDelimiter("|");
            setFieldExtractor(new BeanWrapperFieldExtractor() {
                {
                    setNames(new String[]
                            {"productID", "productName", "productDesc", "price", "unit"});
                }
            });
        }
    });
    writer.setHeaderCallback(new FlatFileHeaderCallback() {
        @Override
        public void writeHeader(Writer writer) throws IOException {
            writer.write("productID,productName,ProductDesc,price,unit");
        }
    });
    writer.setFooterCallback(new FlatFileFooterCallback() {
        @Override
        public void writeFooter(Writer writer) throws IOException {
            // write the footer
            writer.write("****** File created at " + new SimpleDateFormat().format(new Date()) + " ******");
        }
    });
    return writer;
}
Which writer do I use for strings and how do I create it?
Thank you in advance for your suggestions!
Have a nice day.
For this case I think java.io.PrintWriter should do it; you can use its println() method to print each string on its own line, if that is what you want.
If you want to write an object and later read it back and load it into your app, use java.io.ObjectOutputStream to write the object and java.io.ObjectInputStream to read it back into an instance of your class.
Note that the class must implement java.io.Serializable for this to work.
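If you would rather stay within Spring Batch, a FlatFileItemWriter<String> combined with a PassThroughLineAggregator also writes one string per line. A minimal sketch (reusing the outputFile job parameter from the question; the bean name is just a placeholder):

@Bean
@StepScope
public FlatFileItemWriter<String> stringItemWriter(
        @Value("#{jobParameters['outputFile']}") FileSystemResource outputFile) {
    FlatFileItemWriter<String> writer = new FlatFileItemWriter<>();
    writer.setResource(outputFile);
    // PassThroughLineAggregator simply calls toString(), so each String becomes one line
    writer.setLineAggregator(new PassThroughLineAggregator<>());
    return writer;
}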

Why does my Spring Batch job run step2 while step1 is still going on?

First of all, thank you for checking out my post.
I'll list the technologies used, what I need to achieve, and what is happening.
Services used:
Spring Boot in Groovy
AWS: AWS Batch, S3Bucket, SNS, Parameter Store, CloudFormation
What I need to achieve:
Read two csv files from the S3 bucket (student.csv with columns id,studentName and score.csv with columns id,studentId,score).
Check who failed by comparing the two files. A student whose score is below 50 has failed.
Create a new csv file in the S3 bucket and store it as failedStudents.csv.
Send the failedStudents.csv that was just created by email via an SNS topic.
In the Spring Batch config class, steps 1-3 are implemented as step1 and step 4 as step2.
@Bean
Step step1() {
    return this.stepBuilderFactory
        .get("step1")
        .<BadStudent, BadStudent>chunk(100)
        .reader(new IteratorItemReader<Student>(this.StudentLoader.Students.iterator()) as ItemReader<? extends BadStudent>)
        .processor(this.studentProcessor as ItemProcessor<? super BadStudent, ? extends BadStudent>)
        .writer(this.csvWriter())
        .build()
}

@Bean
Step step2() {
    return this.stepBuilderFactory
        .get("step2")
        .tasklet(new PublishSnsTopic())
        .build()
}

@Bean
Job job() {
    return this.jobBuilderFactory
        .get("scoring-students-batch")
        .incrementer(new RunIdIncrementer())
        .start(this.step1())
        .next(this.step2())
        .build()
}
ItemWriter
@Component
@EnableContextResourceLoader
class BadStudentWriter implements ItemWriter<BadStudent> {
    @Autowired
    ResourceLoader resourceLoader
    @Resource
    FileProperties fileProperties
    WritableResource resource
    PrintStream writer
    CSVPrinter csvPrinter

    @PostConstruct
    void setup() {
        this.resource = this.resourceLoader.getResource("s3://students/failedStudents.csv") as WritableResource
        this.writer = new PrintStream(this.resource.outputStream)
        this.csvPrinter = new CSVPrinter(this.writer, CSVFormat.DEFAULT.withDelimiter('|' as char).withHeader('id', 'studentsName', 'score'))
    }

    @Override
    void write(List<? extends BadStudent> items) throws Exception {
        this.csvPrinter.with {CSVPrinter csvPrinter ->
            items.each {BadStudent badStudent ->
                csvPrinter.printRecord(
                    badStudent.id,
                    badStudent.studentsName,
                    badStudent.score
                )
            }
        }
    }

    @AfterStep
    void aferStep() {
        this.csvPrinter.close()
    }
}
PublishSnsTopic
@Configuration
@Service
class PublishSnsTopic implements Tasklet {
    @Autowired
    ResourceLoader resourceLoader
    List<BadStudent> badStudents
    @Autowired
    FileProperties fileProperties

    @PostConstruct
    void setup() {
        String badStudentCSVFileName = "s3://students/failedStudents.csv"
        Reader badStudentReader = new InputStreamReader(
            this.resourceLoader.getResource(badStudentCSVFileName).inputStream
        )
        this.badStudents = new CsvToBeanBuilder(badStudentReader)
            .withSeparator((char)'|')
            .withType(BadStudent)
            .withFieldAsNull(CSVReaderNullFieldIndicator.BOTH)
            .build()
            .parse()
        String messageBody = ""
        messageBody += this.badStudents.collect {it -> return "${it.id}"}
        SnsClient snsClient = SnsClient.builder().build()
        if(snsClient) {
            publishTopic(snsClient, messageBody, this.fileProperties.topicArn)
            snsClient.close()
        }
    }

    void publishTopic(SnsClient snsClient, String message, String arn) {
        try {
            PublishRequest request = PublishRequest.builder()
                .message(message)
                .topicArn(arn)
                .build()
            PublishResponse result = snsClient.publish(request)
        } catch (SnsException e) {
            log.error "SOMETHING WENT WRONG"
        }
    }

    @Override
    RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        return RepeatStatus.FINISHED
    }
}
The issue here is that while step1 is executing, step2 also gets executed, which leads to a serious problem.
This batch job runs every day. Let's say today is Tuesday: failedStudents.csv is generated and stored fine, but because step2 runs instead of waiting for step1 to finish, it sends the failedStudents.csv that was generated on Monday. So the csv is regenerated with the new data set, yet step2 sends the day-old student list.
Also, if failedStudents.csv is not in S3 (or has been moved to a Glacier bucket), the batch job fails: step2 crashes with a FileNotFoundException while step1 is still running and failedStudents.csv has not been created yet.
This is my first AWS and Spring Batch project, and so far step1 is working perfectly, but I am guessing step2 has a problem.
I appreciate you reading my long post, and I would be very happy if anyone could answer or help me with this.
I have been reworking PublishSnsTopic and the ItemWriter, but this is where I have been stuck for a while and I'm not sure how to figure it out.
Thank you so much again for reading.
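One thing worth keeping in mind about the code above: a @PostConstruct method runs when Spring creates the bean, i.e. at application startup, not when the owning step runs; only the body of Tasklet#execute (and ItemWriter#write) is tied to step execution. A minimal sketch of a tasklet that does its work inside execute() (shown in Java; the class and method contents are placeholders, not the original code):

import java.io.InputStreamReader;
import java.io.Reader;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.io.Resource;
import org.springframework.core.io.ResourceLoader;
import org.springframework.stereotype.Component;

@Component
public class PublishSnsTopicTasklet implements Tasklet {

    @Autowired
    private ResourceLoader resourceLoader;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // This body runs only when step2 runs, i.e. after step1 has finished,
        // so the file read here is the one step1 just wrote.
        Resource resource = resourceLoader.getResource("s3://students/failedStudents.csv");
        try (Reader reader = new InputStreamReader(resource.getInputStream())) {
            // parse the csv and publish the SNS message here
        }
        return RepeatStatus.FINISHED;
    }
}

With the file access inside execute(), step2 only touches failedStudents.csv after step1 has completed, because the job runs the steps in order.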

Spring batch FlatFileItemWriter write as csv from Object

I am using Spring batch and have an ItemWriter as follows:
public class MyItemWriter implements ItemWriter<Fixing> {

    private final FlatFileItemWriter<Fixing> writer;
    private final FileSystemResource resource;

    public MyItemWriter() {
        this.writer = new FlatFileItemWriter<>();
        this.resource = new FileSystemResource("target/output-teste.txt");
    }

    @Override
    public void write(List<? extends Fixing> items) throws Exception {
        this.writer.setResource(new FileSystemResource(resource.getFile()));
        this.writer.setLineAggregator(new PassThroughLineAggregator<>());
        this.writer.afterPropertiesSet();
        this.writer.open(new ExecutionContext());
        this.writer.write(items);
    }

    @AfterWrite
    private void close() {
        this.writer.close();
    }
}
When I run my spring batch job, the items are written to file as:
Fixing{id='123456', source='TEST', startDate=null, endDate=null}
Fixing{id='1234567', source='TEST', startDate=null, endDate=null}
Fixing{id='1234568', source='TEST', startDate=null, endDate=null}
1/ How can I write just the data so that the values are comma separated and null values are not written? The target file should look like this:
123456,TEST
1234567,TEST
1234568,TEST
2/ Secondly, I am having an issue where the file is only created once I exit the Spring Boot application. What I would like is for the file to be available as soon as all the items have been processed and written, without closing the Spring Boot application.
There are multiple options for writing the csv file. Regarding the second question, flushing the writer will solve the issue.
https://howtodoinjava.com/spring-batch/flatfileitemwriter-write-to-csv-file/
We prefer to use OpenCSV with Spring Batch, as we get more speed and control over huge files. An example snippet is below:
class DocumentWriter implements ItemWriter<BaseDTO>, Closeable {

    private static final Logger LOG = LoggerFactory.getLogger(StatementWriter.class);
    private ColumnPositionMappingStrategy<Statement> strategy;
    private static final String[] columns = new String[] { "csvcolumn1", "csvcolumn2", "csvcolumn3",
            "csvcolumn4", "csvcolumn5", "csvcolumn6", "csvcolumn7" };
    private BufferedWriter writer;
    private StatefulBeanToCsv<Statement> beanToCsv;

    public DocumentWriter() throws Exception {
        strategy = new ColumnPositionMappingStrategy<Statement>();
        strategy.setType(Statement.class);
        strategy.setColumnMapping(columns);
        filename = env.getProperty("globys.statement.cdf.path") + "-" + processCount + ".dat";
        File cdf = new File(filename);
        if (cdf.exists()) {
            writer = Files.newBufferedWriter(Paths.get(filename), StandardCharsets.UTF_8, StandardOpenOption.APPEND);
        } else {
            writer = Files.newBufferedWriter(Paths.get(filename), StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW);
        }
        beanToCsv = new StatefulBeanToCsvBuilder<Statement>(writer).withQuotechar(CSVWriter.NO_QUOTE_CHARACTER)
                .withMappingStrategy(strategy).withSeparator(',').build();
    }

    @Override
    public void write(List<? extends BaseDTO> items) throws Exception {
        List<Statement> settlementList = new ArrayList<Statement>();
        for (int i = 0; i < items.size(); i++) {
            BaseDTO baseDTO = items.get(i);
            settlementList.addAll(baseDTO.getStatementList());
        }
        beanToCsv.write(settlementList);
        writer.flush();
    }

    @PreDestroy
    @Override
    public void close() throws IOException {
        writer.close();
    }
}
Since you are using PassThroughLineAggregator, which calls item.toString() to write the object, overriding the toString() method of the classes extending Fixing should fix it.
1/ How can I write just the data so that the values are comma separated and where it is null, it is not written.
You need to provide a custom LineAggregator that filters out null fields.
2/ Secondly, I am having an issue where only when I exit spring boot application, I am able to see the file get created
This is probably because you are calling this.writer.open in the write method, which is not correct. You need to make your item writer implement ItemStream and call this.writer.open and this.writer.close in ItemStream#open and ItemStream#close respectively.
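A minimal sketch of both suggestions combined (assuming Fixing exposes getId(), getSource(), getStartDate() and getEndDate(); the field list is an assumption based on the sample output):

import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStream;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.core.io.FileSystemResource;

public class MyItemWriter implements ItemWriter<Fixing>, ItemStream {

    private final FlatFileItemWriter<Fixing> writer = new FlatFileItemWriter<>();

    public MyItemWriter() {
        writer.setResource(new FileSystemResource("target/output-teste.txt"));
        // Custom LineAggregator: join only the non-null fields with commas
        writer.setLineAggregator(item -> Stream.of(item.getId(), item.getSource(),
                        item.getStartDate(), item.getEndDate())
                .filter(Objects::nonNull)
                .map(Object::toString)
                .collect(Collectors.joining(",")));
    }

    @Override
    public void write(List<? extends Fixing> items) throws Exception {
        writer.write(items);
    }

    @Override
    public void open(ExecutionContext executionContext) {
        writer.open(executionContext);   // open the file once, when the step starts
    }

    @Override
    public void update(ExecutionContext executionContext) {
        writer.update(executionContext);
    }

    @Override
    public void close() {
        writer.close();                  // flush and close when the step ends
    }
}

When this writer is set as the step's writer, the step detects that it implements ItemStream and calls open and close around the step execution, so the file is written and closed as soon as the step finishes rather than when the application exits.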

Choose between different steps depending on argument in spring batch

I'm using Spring Batch to write an application that reads from a table and then writes the output to a csv file. The application receives several input params; one of them is the database table to read. I want to write a single job that reads the correct table depending on the input param. This is my configuration class:
@Configuration
public class ExtractorConfiguration {

    @Bean(name="readerA")
    @StepScope
    public JdbcCursorItemReader<ClassA> readerA(
            @Value("#{jobParameters['REF_DATE']}") String dataRef
    ) {
        ...
        return reader;
    }

    @Bean(name="writerA")
    @StepScope
    public FlatFileItemWriter<ClassA> writerA(
            @Value("#{jobParameters['OUTPUT_FILE_PATH']}") String outputPath
    ) {
        ...
        return writer;
    }
    //endregion

    @Bean(name="extractStep")
    @StepScope
    public Step extractStep(
            @Value("#{jobParameters['DATABASE_TABLE']}") String tableName
    ) throws Exception {
        switch (tableName) {
            case tableA:
                return steps.get("extractStep")
                        .<ClassA, ClassA>chunk(applicationProperties.getChunkSize())
                        .reader(readerA(""))
                        .writer(writerA(""))
                        .build();
            default:
                throw new Exception("Wrong table: " + tableName);
        }
    }

    @Bean(name = "myJob")
    public Job myJob() throws Exception {
        return jobs.get("myJob")
                .flow(extractStep(""))
                .end()
                .build();
    }
}
The idea was to add inside extractStep a second case in the switch (something like this):
case tableB:
    return steps.get("extractStep")
            .<ClassB, ClassB>chunk(applicationProperties.getChunkSize())
            .reader(readerB(""))
            .writer(writerB(""))
            .build();
and then write the corresponding readerB and writerB methods; with this approach I'm receiving this error:
Caused by: java.lang.IllegalStateException: No context holder available for step scope
I would like to know:
1- what is the error?
2- is there a method to get the JobParameters inside myJob rather than inside the steps?
3- is there a better approach?
Thanks.
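For context, a common way to handle this kind of parameter-driven branching is to keep one unconditional step per table and let a JobExecutionDecider look at the job parameters at runtime. A minimal sketch (TableDecider, extractStepA and extractStepB are placeholders, not beans from the original configuration):

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.job.builder.FlowBuilder;
import org.springframework.batch.core.job.flow.Flow;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;
import org.springframework.batch.core.job.flow.support.SimpleFlow;

public class TableDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        // Job parameters are available here at execution time, not at configuration time
        String table = jobExecution.getJobParameters().getString("DATABASE_TABLE");
        return new FlowExecutionStatus(table); // e.g. "TABLE_A" or "TABLE_B"
    }
}

@Bean(name = "myJob")
public Job myJob() throws Exception {
    JobExecutionDecider decider = new TableDecider();
    Flow flow = new FlowBuilder<SimpleFlow>("extractFlow")
            .start(decider)
            .on("TABLE_A").to(extractStepA())
            .from(decider).on("TABLE_B").to(extractStepB())
            .build();
    return jobs.get("myJob").start(flow).end().build();
}

Here extractStepA and extractStepB would be ordinary Step beans, each wired with its own reader and writer; only the readers and writers need @StepScope to receive job parameters.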

How to create a generic FlatFileItemReader to read CSV files with different headers?

I'm creating a job that will read and process different .csv files based on an input parameter. There are 3 different types of .csv files with different headers. I want to map each line of a file to a POJO using a generic FlatFileItemReader.
Each type of file will have its own POJO implementation, and all "File Specific POJOs" are subclassed from an abstract GenericFilePOJO.
A tasklet will first read the input parameter to decide which file type needs to be read, and construct a LineTokenizer with the appropriate header columns. It places this information in the infoHolder for retrieval at the reader step.
@Bean
public FlatFileItemReader<GenericFilePOJO> reader() {
    FlatFileItemReader<GenericFilePOJO> reader = new FlatFileItemReader<GenericFilePOJO>();
    reader.setLinesToSkip(1); // header
    reader.setLineMapper(new DefaultLineMapper() {
        {
            // The infoHolder will contain the file-specific LineTokenizer
            setLineTokenizer(infoHolder.getLineTokenizer());
            setFieldSetMapper(new BeanWrapperFieldSetMapper<GenericFilePOJO>() {
                {
                    setTargetType(GenericFilePOJO.class);
                }
            });
        }
    });
    return reader;
}
Can this reader handle the different File Specific POJOs despite returning the GenericFilePOJO?
You wrote:
A tasklet will first read the input parameter to decide which file type needs to be read.
Because the tasklet or infoHolder knows the type of file, you can implement the creation of a specific FieldSetMapper instance.
This is a demo example of how it can be implemented:
public class Solution<T extends GenericFilePOJO> {

    private InfoHolder infoHolder = new InfoHolder();

    @Bean
    public FlatFileItemReader<T> reader() {
        FlatFileItemReader<T> reader = new FlatFileItemReader<T>();
        reader.setLinesToSkip(1);
        reader.setLineMapper(new DefaultLineMapper() {
            {
                setLineTokenizer(infoHolder.getLineTokenizer());
                setFieldSetMapper(infoHolder.getFieldSetMapper());
            }
        });
        return reader;
    }

    private class InfoHolder {
        DelimitedLineTokenizer getLineTokenizer() {
            return <some already existent logic>;
        }

        FieldSetMapper<T> getFieldSetMapper() {
            if (some condition for specific file POJO 1) {
                return new BeanWrapperFieldSetMapper<T>() {
                    {
                        setTargetType(FileSpecificPOJO_1.class);
                    }
                };
            } else if (some condition for specific file POJO 2) {
                return new BeanWrapperFieldSetMapper<T>() {
                    {
                        setTargetType(FileSpecificPOJO_2.class);
                    }
                };
            }
        }
    }
}
