I am trying to load data from SQL Server, apply some transformations, and write it to a CSV file using the Spring Batch scheduler. Everything works fine when it is all in the same class.
This is my code:
package com.abc.tools.bootbatch;
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
public DataSource dataSource;
private static final String qry = "select top 20 colA, colB, colC from ABC";
private Resource outputResource = new FileSystemResource("output/outputData.csv");
@Bean
public DataSource dataSource() {
final DriverManagerDataSource dataSource = new DriverManagerDataSource();
dataSource.setDriverClassName(driver_class);
dataSource.setUrl("db_url");
dataSource.setUsername(usr);
dataSource.setPassword(pwd);
return dataSource;
}
@Bean
ItemReader<Trade> reader() {
JdbcCursorItemReader<Trade> databaseReader = new JdbcCursorItemReader<>();
databaseReader.setDataSource(dataSource);
databaseReader.setSql(qry);
databaseReader.setRowMapper(new BeanPropertyRowMapper<>(Trade.class));
return databaseReader;
}
@Bean
public TradeProcessor processor() {
return new TradeProcessor();
}
@Bean
public FlatFileItemWriter<Trade> writer()
{
//Create writer instance
FlatFileItemWriter<Trade> writer = new FlatFileItemWriter<>();
//Set output file location
writer.setResource(outputResource);
//All job repetitions should "append" to same output file
writer.setAppendAllowed(true);
//Name field values sequence based on object properties
writer.setLineAggregator(new DelimitedLineAggregator<Trade>() {
{
setDelimiter(",");
setFieldExtractor(new BeanWrapperFieldExtractor<Trade>() {
{
setNames(new String[] { "colA", "colB", "colC" });
}
});
}
});
return writer;
}
@Bean
public Step step1() {
return stepBuilderFactory.get("step1").<Trade, Trade> chunk(10)
.reader(reader())
.processor(processor())
.writer(writer())
.build();
}
@Bean
public Job exportUserJob() {
return jobBuilderFactory.get("exportUserJob")
.incrementer(new RunIdIncrementer())
.flow(step1())
.end()
.build();
}
}
When I separate the reading, processing and loading into different classes, autowiring works fine until I run the batch job. When the batch job runs, it fails with an error while instantiating the data source.
So I removed the autowiring and tried something like this:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
public DBConfig dbConfig;
public DataConnection dataconnection=new DataConnection();
DataReader reader=new DataReader();
TradeProcessor processor=new TradeProcessor();
FlatFileWriter flatFileWriter=new FlatFileWriter();
DataSource ds=dataconnection.getDataSource(dbConfig);
@Bean
public Step step1() {
return stepBuilderFactory.get("step1").<Trade, Trade> chunk(10)
.reader(reader.reader(ds))
.processor(processor.processor())
.writer(flatFileWriter.writer())
.build();
}
@Bean
public Job exportUserJob() {
return jobBuilderFactory.get("exportUserJob")
.incrementer(new RunIdIncrementer())
.flow(step1())
.end()
.build();
}
}
This gives Failed to initialize BatchConfiguration:
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'batchConfiguration'
I think I am missing something to tie it all together. I am new to Spring; any help is appreciated.
In your first example, you are autowiring a DataSource and declaring a DataSource bean in the same class, which is incorrect. In the second example, instead of autowiring DBConfig, you can import it with @Import(DBConfig.class) and autowire the DataSource into your job configuration as needed. Here is a typical configuration:
@Configuration
public class DBConfig {
@Bean
public DataSource dataSource() {
final DriverManagerDataSource dataSource = new DriverManagerDataSource();
dataSource.setDriverClassName(driver_class);
dataSource.setUrl("db_url");
dataSource.setUsername(usr);
dataSource.setPassword(pwd);
return dataSource;
}
}
@Configuration
@EnableBatchProcessing
@Import(DBConfig.class)
public class BatchConfiguration {
@Bean
ItemReader<Trade> reader(DataSource datasource) {
// use datasource to configure the reader
}
}
Since you use Spring Boot, you can remove the DBConfig class entirely, configure the datasource in your application.properties file, and it will be automatically injected into your BatchConfiguration.
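For example, with the SQL Server database from the question, the relevant properties might look roughly like this (URL, credentials and driver class are placeholders to adapt):
spring.datasource.url=jdbc:sqlserver://localhost:1433;databaseName=mydb
spring.datasource.username=usr
spring.datasource.password=pwd
spring.datasource.driver-class-name=com.microsoft.sqlserver.jdbc.SQLServerDriver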
Issue
Since I started using separate threads to run the same job several times at the same time, the records that have been processed are not being inserted into the database by the Writer. The batch appears to run correctly when I run two sets of data at the same time:
Records processed dataSet1: 3606 (expected 3606).
Records processed dataSet2: 1776 (expected 1776).
The number of records read and written as reported by Spring Batch also matches these figures.
Context
In this project I'm using MySQL as the database, together with Hibernate.
Some code
Batch config, job and steps
@Configuration
@EnableBatchProcessing
public class BatchConfig extends DefaultBatchConfigurer
{
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Autowired
private StepSkipListener stepSkipListener;
@Autowired
private MainJobExecutionListener mainJobExecutionListener;
@Bean
public TaskExecutor taskExecutor()
{
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setMaxPoolSize(10);
taskExecutor.setThreadNamePrefix("batch-thread-");
return taskExecutor;
}
@Bean
public JobLauncher jobLauncher() throws Exception
{
SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
jobLauncher.setJobRepository(getJobRepository());
jobLauncher.setTaskExecutor(taskExecutor());
jobLauncher.afterPropertiesSet();
return jobLauncher;
}
@Bean
public Step mainStep(ReaderImpl reader, ProcessorImpl processor, WriterImpl writer)
{
return stepBuilderFactory.get("step")
.<List<ExcelLoad>, Invoice>chunk(10)
.reader(reader)
.processor(processor)
.writer(writer)
.faultTolerant().skipPolicy(new ExceptionSkipPolicy())
.listener(stepSkipListener)
.build();
}
@Bean
public Job mainJob(Step mainStep)
{
return jobBuilderFactory.get("mainJob")
.listener(mainJobExecutionListener)
.incrementer(new RunIdIncrementer())
.start(mainStep)
.build();
}
}
Writer
@Override
public void write(List<? extends Invoice> list)
{
invoiceRepository.saveAll(list);
}
Repository
@Repository
public interface InvoiceRepository extends JpaRepository<Invoice, Integer>
{}
Properties
spring.main.allow-bean-definition-overriding=true
spring.batch.initialize-schema=always
spring.batch.job.enabled=false
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/bd_dev?autoReconnect=true&useTimezone=true&useLegacyDatetimeCode=false&serverTimezone=Europe/Paris&zeroDateTimeBehavior=convertToNull
spring.datasource.username=root
spring.datasource.password=password
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5InnoDBDialect
Before using the separate threads, the processed records were inserted into the database correctly. What could be happening?
If you decide to use a multi-threaded step, you need to make sure your batch artefacts (reader, writer, etc.) are thread-safe. From what you shared, the write method is not synchronized between threads and hence is not thread-safe. This is explained in the Multi-threaded Step section of the documentation.
You need to either synchronize it (by using the synchronized keyword, a Lock, etc.) or wrap your writer in a SynchronizedItemStreamWriter.
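For reference, the wrapping option could look roughly like this, assuming Spring Batch 4.3 or later (where SynchronizedItemStreamWriter is available) and a delegate writer that implements ItemStreamWriter; the bean and delegate names are illustrative:
// Serializes calls to write() across the threads of the multi-threaded step
@Bean
public SynchronizedItemStreamWriter<Invoice> synchronizedInvoiceWriter(ItemStreamWriter<Invoice> delegate) {
    SynchronizedItemStreamWriter<Invoice> writer = new SynchronizedItemStreamWriter<>();
    writer.setDelegate(delegate);
    return writer;
}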
To help with the implementation, in case someone comes across this question, here is the code that, in my case, solved the problem:
Batch config, job and steps
@Configuration
@EnableBatchProcessing
public class BatchConfig extends DefaultBatchConfigurer
{
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Autowired
private StepSkipListener stepSkipListener;
@Autowired
private MainJobExecutionListener mainJobExecutionListener;
@Bean
public PlatformTransactionManager getTransactionManager()
{
return new JpaTransactionManager();
}
@Bean
public TaskExecutor taskExecutor()
{
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setMaxPoolSize(10);
taskExecutor.setThreadNamePrefix("batch-thread-");
return taskExecutor;
}
@Bean
public JobLauncher jobLauncher() throws Exception
{
SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
jobLauncher.setJobRepository(getJobRepository());
jobLauncher.setTaskExecutor(taskExecutor());
jobLauncher.afterPropertiesSet();
return jobLauncher;
}
@Bean
public Step mainStep(ReaderImpl reader, ProcessorImpl processor, WriterImpl writer)
{
return stepBuilderFactory.get("step")
.transactionManager(getTransactionManager())
.<List<ExcelLoad>, Invoice>chunk(10)
.reader(reader)
.processor(processor)
.writer(writer)
.faultTolerant().skipPolicy(new ExceptionSkipPolicy())
.listener(stepSkipListener)
.build();
}
@Bean
public Job mainJob(Step mainStep)
{
return jobBuilderFactory.get("mainJob")
.listener(mainJobExecutionListener)
.incrementer(new RunIdIncrementer())
.start(mainStep)
.build();
}
}
Writer
@Component
public class WriterImpl implements ItemWriter<Invoice>
{
@Autowired
private InvoiceRepository invoiceRepository;
@Override
public synchronized void write(List<? extends Invoice> list)
{
invoiceRepository.saveAll(list);
}
}
My Multithreaded Spring Batch Step is behaving almost erratically. I haven't been able to discern any kind of pattern in the ways it's failing. Sometimes it reads and writes too many records from the database and sometimes it doesn't read enough.
I'm using a RepositoryItemReader to execute a custom native query. I've defined a countQuery for it and I've used the reader's setMaxItemCount(totalLimit) method, but it seems to treat that more as a suggestion than as an actual hard maximum. Because with a thread count of 4, and just 1 intentionally bad record that causes 1 skip in the processor logic, I've seen...
limit | pageSize | chunkSize || actual writes
100 | 10 | 5 || 110 unique writes
800 | 100 | 25 || 804 unique writes, and 37 duplicate writes (WHY?)
800 | 100 | 25 || 663 unique writes, and 165 duplicate writes (WHYYYY???)
My project is using Spring Boot 2.1.11.RELEASE, and the version of spring-batch-infrastructure being pulled in is 4.1.3.RELEASE. Does anyone have any idea why Spring Batch is performing either too many or duplicate writes when just 1 skip occurs on one of the pages?
Maybe it has something to do with the way I've configured my in-memory JobRepository...
Here's my repository class:
@Repository
public interface MyEntityRepository extends JpaRepository<MyEntity, Integer> {
String FROM_MY_ENTITY_TABLE_LEFT_JOINED_WITH_ANOTHER_TABLE = "from {h-schema}my_entity e " +
"left join {h-schema}another_table a " +
"on e.fk = a.pk ";
@Query(
value = "select e.id, e.name, a.additional_info " +
FROM_MY_ENTITY_TABLE_LEFT_JOINED_WITH_ANOTHER_TABLE +
"where e.status <> :status and e.add_date < :date",
countQuery = "select count(*) " +
FROM_MY_ENTITY_TABLE_LEFT_JOINED_WITH_ANOTHER_TABLE +
"where e.status <> :status and e.add_date < :date",
nativeQuery = true)
Page<MyProjection> findMyProjectionsWithoutStatusBeforeDate(@Param("status") String status,
@Param("date") Date date,
Pageable page);
}
And here's how I've configured my job:
@Configuration
public class ConversionBatchJobConfig {
@Bean
public SimpleCompletionPolicy processChunkSize(#Value("${commit.chunk.size:5}") Integer chunkSize) {
return new SimpleCompletionPolicy(chunkSize);
}
@Bean
@StepScope
public ItemStreamReader<MyProjection> dbReader(
MyEntityRepository myEntityRepository,
@Value("#{jobParameters[startTime]}") Date startTime,
@Value("#{jobParameters[pageSize]}") Integer pageSize,
@Value("#{jobParameters[limit]}") Integer limit) {
RepositoryItemReader<MyProjection> myProjectionRepositoryReader = new RepositoryItemReader<>();
myProjectionRepositoryReader.setRepository(myEntityRepository);
myProjectionRepositoryReader.setMethodName("findMyProjectionsWithoutStatusBeforeDate");
myProjectionRepositoryReader.setArguments(new ArrayList<Object>() {{
add("REMOVED");
add(startTime);
}});
myProjectionRepositoryReader.setSort(new HashMap<String, Sort.Direction>() {{
put("e.id", Sort.Direction.ASC);
}});
myProjectionRepositoryReader.setPageSize(pageSize);
myProjectionRepositoryReader.setMaxItemCount(limit);
myProjectionRepositoryReader.setSaveState(false);
return myProjectionRepositoryReader;
}
@Bean
@StepScope
public ItemProcessor<MyProjection, JsonMessage> dataConverter(AdditionalDbDataRetrievalService dataRetrievalService) {
return new MyProjectionToJsonMessageConverter(dataRetrievalService); // <== simple ItemProcessor implementation
}
@Bean
@StepScope
public ItemWriter<JsonMessage> jsonPublisher(GcpPubsubPublisherService publisherService) {
return new JsonMessageWriter(publisherService); // <== simple ItemWriter implementation
}
@Bean
public Step conversionProcess(SimpleCompletionPolicy processChunkSize,
ItemStreamReader<MyProjection> dbReader,
ItemProcessor<MyProjection, JsonMessage> dataConverter,
ItemWriter<JsonMessage> jsonPublisher,
StepBuilderFactory stepBuilderFactory,
TaskExecutor conversionThreadPool,
@Value("${conversion.failure.limit:20}") int maximumFailures) {
return stepBuilderFactory.get("conversionProcess")
.<MyProjection, JsonMessage>chunk(processChunkSize)
.reader(dbReader)
.processor(dataConverter)
.writer(jsonPublisher)
.faultTolerant()
.skipPolicy(new MyCustomConversionSkipPolicy(maximumFailures))
// ^ for now this returns true for everything until 20 failures
.listener(new MyConversionSkipListener(processStatus))
// ^ for now this just logs the error
.taskExecutor(conversionThreadPool)
.build();
}
@Bean
public Job conversionJob(Step conversionProcess,
JobBuilderFactory jobBuilderFactory) {
return jobBuilderFactory.get("conversionJob")
.start(conversionProcess)
.build();
}
}
And here's how I've configured my in-memory Job Repository:
@Configuration
@EnableBatchProcessing
public class InMemoryBatchManagementConfig {
@Bean
public ResourcelessTransactionManager resourcelessTransactionManager() {
ResourcelessTransactionManager resourcelessTransactionManager = new ResourcelessTransactionManager();
return resourcelessTransactionManager;
}
@Bean
public MapJobRepositoryFactoryBean mapJobRepositoryFactory(ResourcelessTransactionManager resourcelessTransactionManager)
throws Exception {
MapJobRepositoryFactoryBean factory = new MapJobRepositoryFactoryBean(resourcelessTransactionManager);
factory.afterPropertiesSet();
return factory;
}
@Bean
public JobRepository jobRepository(MapJobRepositoryFactoryBean factory) throws Exception {
return factory.getObject();
}
@Bean
public SimpleJobLauncher jobLauncher(JobRepository jobRepository) throws Exception {
SimpleJobLauncher launcher = new SimpleJobLauncher();
launcher.setJobRepository(jobRepository);
launcher.afterPropertiesSet();
return launcher;
}
@Bean
public JobExplorer jobExplorer(MapJobRepositoryFactoryBean factory) {
return new SimpleJobExplorer(factory.getJobInstanceDao(), factory.getJobExecutionDao(),
factory.getStepExecutionDao(), factory.getExecutionContextDao());
}
@Bean
public BatchConfigurer batchConfigurer(MapJobRepositoryFactoryBean mapJobRepositoryFactory,
ResourcelessTransactionManager resourceslessTransactionManager,
SimpleJobLauncher jobLauncher,
JobExplorer jobExplorer) {
return new BatchConfigurer() {
@Override
public JobRepository getJobRepository() throws Exception {
return mapJobRepositoryFactory.getObject();
}
@Override
public PlatformTransactionManager getTransactionManager() throws Exception {
return resourceslessTransactionManager;
}
@Override
public JobLauncher getJobLauncher() throws Exception {
return jobLauncher;
}
@Override
public JobExplorer getJobExplorer() throws Exception {
return jobExplorer;
}
};
}
}
EDIT
I was able to get Spring Batch working with an H2 database instead of a Map-based repository, but I'm still seeing the same issue. Here's how I configured Batch to use H2:
I imported the H2 driver:
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<version>1.4.200</version>
</dependency>
I configured my primary DB config to point to my JPA entities:
@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(basePackages = "com.company.project.jpa.repository", transactionManagerRef = "transactionManager")
@EntityScan(basePackages = "com.company.project.jpa.entity")
public class DbConfig {
@Bean
@Primary
@ConfigurationProperties("oracle.datasource")
public DataSource dataSource() {
return DataSourceBuilder.create().build();
}
@Bean
@Primary
public LocalContainerEntityManagerFactoryBean entityManagerFactory(DataSource dataSource,
EntityManagerFactoryBuilder builder) {
return builder.dataSource(dataSource).packages("com.company.project.jpa").build();
}
@Bean
@Primary
public PlatformTransactionManager transactionManager(
@Qualifier("entityManagerFactory") LocalContainerEntityManagerFactoryBean entityManagerFactory) {
return new JpaTransactionManager(entityManagerFactory.getObject());
}
}
And then I configured my in-memory Batch management like this:
@Configuration
@EnableBatchProcessing
public class InMemoryBatchManagementConfig {
@Bean(destroyMethod = "shutdown")
public EmbeddedDatabase h2DataSource() {
return new EmbeddedDatabaseBuilder().setType(EmbeddedDatabaseType.H2)
.addScript("classpath:org/springframework/batch/core/schema-drop-h2.sql")
.addScript("classpath:org/springframework/batch/core/schema-h2.sql")
.build();
}
@Bean
public LocalContainerEntityManagerFactoryBean h2EntityManagerFactory(EmbeddedDatabase h2DataSource,
EntityManagerFactoryBuilder builder) {
return builder.dataSource(h2DataSource).packages("org.springframework.batch.core").build();
}
@Bean
public PlatformTransactionManager h2TransactionManager(
@Qualifier("h2EntityManagerFactory") LocalContainerEntityManagerFactoryBean h2EntityManagerFactory) {
return new JpaTransactionManager(h2EntityManagerFactory.getObject());
}
@Bean
public JobRepository jobRepository(EmbeddedDatabase h2DataSource,
@Qualifier("h2TransactionManager") PlatformTransactionManager h2TransactionManager) throws Exception {
final JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
factory.setDatabaseType(DatabaseType.H2.getProductName());
factory.setDataSource(h2DataSource);
factory.setTransactionManager(h2TransactionManager);
return factory.getObject();
}
@Bean
public SimpleJobLauncher jobLauncher(JobRepository jobRepository) throws Exception {
SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
jobLauncher.setJobRepository(jobRepository);
jobLauncher.afterPropertiesSet();
return jobLauncher;
}
@Bean
public JobRepositoryFactoryBean jobRepositoryFactoryBean(EmbeddedDatabase h2DataSource,
@Qualifier("h2TransactionManager") PlatformTransactionManager h2TransactionManager) {
JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
jobRepositoryFactoryBean.setDataSource(h2DataSource);
jobRepositoryFactoryBean.setTransactionManager(h2TransactionManager);
return jobRepositoryFactoryBean;
}
@Bean
public BatchConfigurer batchConfigurer(JobRepository jobRepository,
SimpleJobLauncher jobLauncher,
@Qualifier("h2TransactionManager") PlatformTransactionManager h2TransactionManager,
JobExplorer jobExplorer) {
return new BatchConfigurer() {
@Override
public JobRepository getJobRepository() {
return jobRepository;
}
@Override
public PlatformTransactionManager getTransactionManager() {
return h2TransactionManager;
}
@Override
public JobLauncher getJobLauncher() {
return jobLauncher;
}
@Override
public JobExplorer getJobExplorer() {
return jobExplorer;
}
};
}
}
Does the placement of beans make a difference when loading them into a scoped context? Is this a bug or an instantiation-timing issue?
If I include the @StepScope and @Bean directly in the BatchConfiguration class, everything works seamlessly with StepScope. However, if I define another class, say "BatchProcessProcessor" as included below, and mark a method within that other class as a @Bean with @StepScope, it does not resolve properly. The actual symptom in Spring Batch is StepScope not triggering and the beans being loaded as singletons.
Something about providing the @Bean and @StepScope from another class that is loaded via constructor injection into the BatchConfiguration does not resolve properly.
The setup described above is included below:
Main batch configuration class
@Slf4j
@Configuration
@EnableAutoConfiguration
@EnableBatchProcessing
public class BatchConfiguration extends DefaultBatchConfigurer {
private BatchProcessProcessor processor;
@Override
public void setDataSource(DataSource dataSource) {
// override to do not set datasource even if a datasource exist.
// initialize will use a Map based JobRepository (instead of database)
}
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
public BatchConfiguration(BatchProcessProcessor processor){
this.processor = processor;
}
@Bean
@StepScope
public ListItemReader<String> reader() {
List<String> stringList = new ArrayList<>();
stringList.add("test");
stringList.add("another test");
log.info("LOGGING A BUNCH OF STUFF THIS IS UNIQUE" + String.valueOf(System.currentTimeMillis()));
return new ListItemReader<>(stringList);
}
@Bean
@StepScope
public CustomWriter writer() {
return new CustomWriter();
}
@Bean
public Job importUserJob(JobCompletionNotificationListener listener, Step step1) {
return jobBuilderFactory.get("importUserJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(step1)
.end()
.build();
}
@Bean
public Step step1() {
return stepBuilderFactory.get("step1")
.<String, String> chunk(10)
.reader(reader())
.processor(processor.processor())
.writer(writer()).build();
}
}
Processor container class
@Component
public class BatchProcessProcessor {
private MyService service;
@Autowired
BatchProcessProcessor(MyService service){
this.service= service;
}
/**
* Generate processor utilized for processing
* #return StringProcessor for testing
*/
@Bean
@StepScope
public DeploymentProcesser processor() {
return new DeploymentProcesser(service);
}
}
Actual Processor
@Slf4j
@Component
public class DeploymentProcesser implements ItemProcessor<Deployment, Model> {
private MyService service;
@Autowired
public DeploymentProcesser(MyService service){
this.service= service;
}
@Override
public Model process(final Deployment deployment) {
log.info(String.format("Processing %s details", deployment.getId()));
Model model = new Model();
model.setId(deployment.getId());
return model;
}
}
As far as I understand, when the BatchConfiguration loads it should inject the BatchProcessProcessor and load the bean with step scope, but that doesn't seem to work.
As I said before, just copy-pasting the @Bean/@StepScope method directly into the BatchConfiguration and returning the same processor works perfectly and StepScope resolves.
Is this a lifecycle issue?
It does not make sense to declare a bean in a class annotated with @Component:
@Component
public class BatchProcessProcessor {
private MyService service;
@Autowired // This is correct, you can autowire collaborators
public DeploymentProcesser(MyService service){
this.service= service;
}
@Bean // THIS IS NOT CORRECT
@StepScope
public DeploymentProcesser processor() {
return new DeploymentProcesser(service);
}
}
You should rather do it in a configuration class annotated with @Configuration. That's why it works when you do it in BatchConfiguration.
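For example, the step-scoped processor from the question could be declared in its own configuration class instead, roughly like this (reusing the class names from the question):
@Configuration
public class ProcessorConfiguration {

    @Bean
    @StepScope
    public DeploymentProcesser processor(MyService service) {
        // same processor as before, but declared where @Bean methods belong
        return new DeploymentProcesser(service);
    }
}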
I have a Spring Batch application that reads a file from a samba server and generates a new file in a different folder on the same server. However, only the ItemReader is called in the flow. What is the problem? Thanks.
BatchConfiguration:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration extends BaseConfiguration {
@Bean
public ValeTrocaItemReader reader() {
return new ValeTrocaItemReader();
}
@Bean
public ValeTrocaItemProcessor processor() {
return new ValeTrocaItemProcessor();
}
@Bean
public ValeTrocaItemWriter writer() {
return new ValeTrocaItemWriter();
}
@Bean
public Job importUserJob(JobCompletionNotificationListener listener) throws Exception {
return jobBuilderFactory()
.get("importUserJob")
.incrementer(new RunIdIncrementer())
.repository(getJobRepository())
.listener(listener)
.start(this.step1())
.build();
}
@Bean
public Step step1() throws Exception {
return stepBuilderFactory()
.get("step1")
.<ValeTroca, ValeTroca>chunk(10)
.reader(this.reader())
.processor(this.processor())
.writer(this.writer())
.build();
}
}
BaseConfiguration:
public class BaseConfiguration implements BatchConfigurer {
@Bean
@Override
public PlatformTransactionManager getTransactionManager() {
return new ResourcelessTransactionManager();
}
@Bean
@Override
public SimpleJobLauncher getJobLauncher() throws Exception {
final SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
simpleJobLauncher.setJobRepository(this.getJobRepository());
return simpleJobLauncher;
}
@Bean
@Override
public JobRepository getJobRepository() throws Exception {
return new MapJobRepositoryFactoryBean(this.getTransactionManager()).getObject();
}
@Bean
@Override
public JobExplorer getJobExplorer() {
MapJobRepositoryFactoryBean repositoryFactory = this.getMapJobRepositoryFactoryBean();
return new SimpleJobExplorer(repositoryFactory.getJobInstanceDao(), repositoryFactory.getJobExecutionDao(),
repositoryFactory.getStepExecutionDao(), repositoryFactory.getExecutionContextDao());
}
@Bean
public MapJobRepositoryFactoryBean getMapJobRepositoryFactoryBean() {
return new MapJobRepositoryFactoryBean(this.getTransactionManager());
}
@Bean
public JobBuilderFactory jobBuilderFactory() throws Exception {
return new JobBuilderFactory(this.getJobRepository());
}
@Bean
public StepBuilderFactory stepBuilderFactory() throws Exception {
return new StepBuilderFactory(this.getJobRepository(), this.getTransactionManager());
}
}
ValeTrocaItemReader:
@Configuration
public class ValeTrocaItemReader implements ItemReader<ValeTroca>{
@Value(value = "${url}")
private String url;
@Value(value = "${user}")
private String user;
@Value(value = "${password}")
private String password;
@Value(value = "${domain}")
private String domain;
@Value(value = "${inputDirectory}")
private String inputDirectory;
@Bean
@Override
public ValeTroca read() throws MalformedURLException, SmbException, IOException, Exception {
File tempOutputFile = getInputFile();
DefaultLineMapper<ValeTroca> lineMapper = new DefaultLineMapper<>();
lineMapper.setLineTokenizer(new DelimitedLineTokenizer() {
{
setDelimiter(";");
setNames(new String[]{"id_participante", "cpf", "valor"});
}
});
lineMapper.setFieldSetMapper(
new BeanWrapperFieldSetMapper<ValeTroca>() {
{
setTargetType(ValeTroca.class);
}
});
FlatFileItemReader<ValeTroca> itemReader = new FlatFileItemReader<>();
itemReader.setLinesToSkip(1);
itemReader.setResource(new FileUrlResource(tempOutputFile.getCanonicalPath()));
itemReader.setLineMapper(lineMapper);
itemReader.open(new ExecutionContext());
tempOutputFile.deleteOnExit();
return itemReader.read();
}
Sample of ItemProcessor:
public class ValeTrocaItemProcessor implements ItemProcessor<ValeTroca, ValeTroca> {
@Override
public ValeTroca process(ValeTroca item) {
//Do anything
ValeTroca item2 = item;
System.out.println(item2.getCpf());
return item2;
}
EDIT:
- Spring Boot 2.1.2.RELEASE - Spring Batch 4.1.1.RELEASE
Looking at your configuration, here are a couple of notes:
BatchConfiguration looks good. That's a typical job with a single chunk-oriented step.
BaseConfiguration is actually the default configuration you get when using @EnableBatchProcessing without providing a datasource, so this class can be removed.
Adding @Configuration on ValeTrocaItemReader and marking the method read() with @Bean is not correct. This means you are declaring a bean named read of type ValeTroca in your application context. Moreover, your custom reader wraps a FlatFileItemReader but adds no value over using a FlatFileItemReader directly. You can declare your reader as a FlatFileItemReader bean and configure it as needed (resource, line mapper, etc.); see the sketch after these notes. This will also avoid the mistake of opening the execution context in the read method, which should be done when initializing the reader or in the ItemStream#open method if the reader implements ItemStream.
Other than that, I don't see from what you shared why the processor and writer are not called.
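For example, the custom reader could be replaced by a plain FlatFileItemReader bean configured with the same tokenizer and field set mapper as in the question; the resource path below is a placeholder for the file fetched from the samba share:
@Bean
public FlatFileItemReader<ValeTroca> reader() {
    FlatFileItemReader<ValeTroca> itemReader = new FlatFileItemReader<>();
    itemReader.setLinesToSkip(1);
    // placeholder path: point this at the file downloaded from the samba share
    itemReader.setResource(new FileSystemResource("/tmp/vale-troca.csv"));
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer(";");
    tokenizer.setNames("id_participante", "cpf", "valor");
    DefaultLineMapper<ValeTroca> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    BeanWrapperFieldSetMapper<ValeTroca> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(ValeTroca.class);
    lineMapper.setFieldSetMapper(fieldSetMapper);
    itemReader.setLineMapper(lineMapper);
    return itemReader;
}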
SOLVED: The problem was that even though I'm not using a database of my own, Spring Batch, although configured to keep the JobRepository in memory, needs a database (usually H2) to store its metadata tables, jobs, etc.
In this case, the JDBC and H2 dependencies were missing from pom.xml. I just added them to the project and the problem was solved!
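For reference, the dependencies in question would look roughly like this in pom.xml (versions can be left to Spring Boot's dependency management):
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>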
I have a Spring Boot project that uses MongoDB, so in my pom I have this dependency:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-mongodb</artifactId>
</dependency>
So I'm able to access the database with this repository class:
package it.de.marini.server.dal;
import org.springframework.data.mongodb.repository.MongoRepository;
import it.de.marini.server.model.Role;
public interface RoleRepository extends MongoRepository<Role, String> {
}
I need to initialize my data in the MongoDB database, for example by inserting a default Role. What is the best way to do this with the Spring Boot framework?
There are several ways to do this; I suggest using a CommandLineRunner.
Try:
@Bean
public CommandLineRunner initConfig(MyRepo repo) {
return args -> {
if (repo.count() == 0) { // i.e. the data does not exist yet
repo.save(...); // save your default data here
}
};
}
Otherwise you can use @PostConstruct to initialize it.
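A minimal sketch of the @PostConstruct variant, using the RoleRepository from the question (how the default Role is built is up to you):
@Component
public class DataInitializer {

    @Autowired
    private RoleRepository roleRepository;

    @PostConstruct
    public void init() {
        if (roleRepository.count() == 0) { // only seed when no data exists yet
            // build and save your default Role(s) here
        }
    }
}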
If you need something like Liquibase for an RDBMS, check out mongobee: https://github.com/mongobee/mongobee
As @Jaiwo99 says, I understand there is no standard way to do this, so I decided to do the work with Spring Batch. I wrote a batch job that loads the roles and permissions of my application from CSV files.
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
public MongoTemplate mongoTemplate;
@Bean
public PlatformTransactionManager transactionManager() {
return new ResourcelessTransactionManager();
}
@Bean
public Tasklet defaultRolePermissionTasklet() {
return new DefaultRolePermissionTasklet();
}
public <T> FlatFileItemReader<T> readerFile(String fileName,String[] fields,Class<T> type) {
FlatFileItemReader<T> reader = new FlatFileItemReader<T>();
reader.setStrict(false);
reader.setResource(new ClassPathResource(fileName));
reader.setLineMapper(new DefaultLineMapper<T>() {
{
setLineTokenizer(new DelimitedLineTokenizer() {
{
setNames(fields);
setStrict(false);
}
});
setFieldSetMapper(new BeanWrapperFieldSetMapper<T>() {
{
setTargetType(type);
}
});
}
});
return reader;
}
@Bean
public PermissionItemProcessor permissionProcessor() {
return new PermissionItemProcessor();
}
@Bean
public RoleItemProcessor roleProcessor() {
return new RoleItemProcessor();
}
@Bean
public Job initAuthenticationData(JobCompletionNotificationListener listener) {
return jobBuilderFactory.get("initAuthenticationData").incrementer(new RunIdIncrementer()).listener(listener)
.start(stepPermission())
.next(stepRole())
.next(stepDefaultRolePermissions())
.build();
}
@Bean
public Step stepDefaultRolePermissions() {
return stepBuilderFactory.get("stepDefaultRolePermissions").tasklet(defaultRolePermissionTasklet()).build();
}
@Bean
public Step stepPermission() {
MongoItemWriter<Permission> writer = new MongoItemWriter<Permission>();
writer.setTemplate(mongoTemplate);
return stepBuilderFactory.get("stepPermission").<Permission, Permission>chunk(20)
.reader(readerFile("permission-data.csv",new String[] {"name","description"},Permission.class))
.processor(permissionProcessor())
.writer(writer)
.build();
}
@Bean
public Step stepRole() {
MongoItemWriter<Role> writer = new MongoItemWriter<Role>();
writer.setTemplate(mongoTemplate);
return stepBuilderFactory.get("stepRole").<Role, Role>chunk(20)
.reader(readerFile("role-data.csv",new String[] {"name","description"},Role.class))
.processor(roleProcessor())
.writer(writer)
.build();
}
}
Finally, there is one more way to initialize data in MongoDB using Spring Boot. You can do it in your configuration like this:
@Configuration
public class AppConfiguration {
@Autowired
public void prepare(ReactiveMongoOperations mongoOperations,
UserRepository userRepository) {
mongoOperations.createCollection("users",
CollectionOptions.empty()
.maxDocuments(1_000)
.size(1024 * 8)
.capped()).block();
userRepository
.insert(List.of(
User.builder()
.name("Joe Doe")
.build()
))
.blockLast();
}
}
And of course, you must check whether the collection already exists, so that you do not create it again if the database has already been initialized.
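A rough sketch of that guard, reusing the ReactiveMongoOperations from the example above:
// Only create and seed the collection when it does not exist yet
Boolean exists = mongoOperations.collectionExists("users").block();
if (!Boolean.TRUE.equals(exists)) {
    // call createCollection(...) and insert the default users as shown above
}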