Rollback Spring Batch Job - java

Actually I have a Job with several steps and two tasklets, as follows:
@Bean
public Step stepItem() {
    return stepBuilderFactory.get("stepItem")
            .<Item, Item> chunk(10)
            .reader(itemReader())
            .processor(itemProcessor())
            .faultTolerant()
            .writer(itemWriter())
            .build();
}

@Bean
public Step deleteAllItemStep() {
    return stepBuilderFactory.get("deleteAllItemStep")
            .tasklet(itemStepDeleteAllTasklet())
            .build();
}
Here is my custom tasklet:
public class ItemStepDeleteAllTasklet implements Tasklet {

    private ItemRepository itemRepository;

    public ItemStepDeleteAllTasklet(ItemRepository itemRepository) {
        this.itemRepository = itemRepository;
    }

    @Override
    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
        itemRepository.deleteAllInBatch();
        return null;
    }
}
This is my configuration with the main job.
@Configuration
@Import({BatchItemConfiguration.class, BatchCriticalComponentConfiguration.class})
public class BatchConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private Step stepCriticalComponent;

    @Autowired
    private Step stepItem;

    @Autowired
    private Step deleteAllItemStep;

    @Autowired
    private Step deleteAllCriticalComponentStep;

    @Bean
    public Job importJob(JobCompletionNotificationListener listener) {
        return jobBuilderFactory.get("importJob")
                .incrementer(new RunIdIncrementer())
                .listener(listener)
                .start(deleteAllCriticalComponentStep)
                .next(deleteAllItemStep)
                .next(stepItem)
                .next(stepCriticalComponent)
                .build();
    }
}
At the moment, if a step (stepItem or stepCriticalComponent) fails, the data that was deleted by the tasklet is gone and I cannot recover it.
Is there a way to roll back the entire Job, or to roll back to the state before a specific step/tasklet?

There is no concept of rolling back an entire job or step in Spring Batch. However, you can use a listener (JobExecutionListener or StepExecutionListener) to execute compensating logic and perform the rollback by hand.
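For illustration, a minimal sketch of such compensating logic in a JobExecutionListener, assuming the items are snapshotted (for example to a staging table) before the delete steps run; ItemBackupService and its methods are hypothetical names, not part of the original code:

import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;

public class CompensatingJobListener implements JobExecutionListener {

    private final ItemBackupService backupService; // hypothetical collaborator

    public CompensatingJobListener(ItemBackupService backupService) {
        this.backupService = backupService;
    }

    @Override
    public void beforeJob(JobExecution jobExecution) {
        // Snapshot the data before the delete steps run.
        backupService.snapshotItems();
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.FAILED) {
            // Compensating action: restore the snapshot taken in beforeJob.
            backupService.restoreItems();
        }
    }
}

The listener can then be registered on the job with an additional .listener(...) call next to the existing JobCompletionNotificationListener.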

Spring Batch processes the records, but is not inserting them into the database

Issue
Since I started using separate threads to run the same job several times concurrently, the records that have been processed and should be inserted by the Writer are not being inserted into the database. The batch appears to run correctly when I run two sets of data at the same time:
Records processed dataSet1: 3606 (expected 3606).
Records processed dataSet2: 1776 (expected 1776).
The number of records read and written reported by Spring Batch is as expected.
Context
In this project I'm using MySQL as the database and Hibernate as the ORM.
Some code
Batch config, job and steps
@Configuration
@EnableBatchProcessing
public class BatchConfig extends DefaultBatchConfigurer
{
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private StepSkipListener stepSkipListener;

    @Autowired
    private MainJobExecutionListener mainJobExecutionListener;

    @Bean
    public TaskExecutor taskExecutor()
    {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setMaxPoolSize(10);
        taskExecutor.setThreadNamePrefix("batch-thread-");
        return taskExecutor;
    }

    @Bean
    public JobLauncher jobLauncher() throws Exception
    {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(getJobRepository());
        jobLauncher.setTaskExecutor(taskExecutor());
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }

    @Bean
    public Step mainStep(ReaderImpl reader, ProcessorImpl processor, WriterImpl writer)
    {
        return stepBuilderFactory.get("step")
                .<List<ExcelLoad>, Invoice>chunk(10)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .faultTolerant().skipPolicy(new ExceptionSkipPolicy())
                .listener(stepSkipListener)
                .build();
    }

    @Bean
    public Job mainJob(Step mainStep)
    {
        return jobBuilderFactory.get("mainJob")
                .listener(mainJobExecutionListener)
                .incrementer(new RunIdIncrementer())
                .start(mainStep)
                .build();
    }
}
Writer
@Override
public void write(List<? extends Invoice> list)
{
    invoiceRepository.saveAll(list);
}
Repository
@Repository
public interface InvoiceRepository extends JpaRepository<Invoice, Integer> {}
Properties
spring.main.allow-bean-definition-overriding=true
spring.batch.initialize-schema=always
spring.batch.job.enabled=false
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/bd_dev?autoReconnect=true&useTimezone=true&useLegacyDatetimeCode=false&serverTimezone=Europe/Paris&zeroDateTimeBehavior=convertToNull
spring.datasource.username=root
spring.datasource.password=password
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5InnoDBDialect
Before using the separate threads, the processed records were inserted into the database correctly. What could be happening?
If you decide to use a multi-threaded step, you need to make sure your batch artefacts (reader, writer, etc.) are thread-safe. From what you shared, the write method is not synchronized between threads and hence is not thread-safe. This is explained in the Multi-threaded Step section of the documentation.
You need to either synchronize it (using the synchronized keyword, a Lock, etc.) or wrap your writer in a SynchronizedItemStreamWriter.
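For illustration, a minimal sketch of the first option, assuming the WriterImpl shown above; marking the write method as synchronized serializes concurrent writes from the worker threads:

@Component
public class WriterImpl implements ItemWriter<Invoice>
{
    @Autowired
    private InvoiceRepository invoiceRepository;

    @Override
    public synchronized void write(List<? extends Invoice> list)
    {
        // Only one worker thread at a time can enter this method.
        invoiceRepository.saveAll(list);
    }
}

Note that SynchronizedItemStreamWriter delegates to an ItemStreamWriter, so the second option would require the writer to implement ItemStreamWriter (or to wrap a writer that already does, such as a FlatFileItemWriter).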
To help with the implementation, in case someone else comes across this question, here is the code that, in my case, solved the problem:
Batch config, job and steps
@Configuration
@EnableBatchProcessing
public class BatchConfig extends DefaultBatchConfigurer
{
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private StepSkipListener stepSkipListener;

    @Autowired
    private MainJobExecutionListener mainJobExecutionListener;

    @Bean
    public PlatformTransactionManager getTransactionManager()
    {
        return new JpaTransactionManager();
    }

    @Bean
    public TaskExecutor taskExecutor()
    {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setMaxPoolSize(10);
        taskExecutor.setThreadNamePrefix("batch-thread-");
        return taskExecutor;
    }

    @Bean
    public JobLauncher jobLauncher() throws Exception
    {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(getJobRepository());
        jobLauncher.setTaskExecutor(taskExecutor());
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }

    @Bean
    public Step mainStep(ReaderImpl reader, ProcessorImpl processor, WriterImpl writer)
    {
        return stepBuilderFactory.get("step")
                .transactionManager(getTransactionManager())
                .<List<ExcelLoad>, Invoice>chunk(10)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .faultTolerant().skipPolicy(new ExceptionSkipPolicy())
                .listener(stepSkipListener)
                .build();
    }

    @Bean
    public Job mainJob(Step mainStep)
    {
        return jobBuilderFactory.get("mainJob")
                .listener(mainJobExecutionListener)
                .incrementer(new RunIdIncrementer())
                .start(mainStep)
                .build();
    }
}
Writer
@Component
public class WriterImpl implements ItemWriter<Invoice>
{
    @Autowired
    private InvoiceRepository invoiceRepository;

    @Override
    public void write(List<? extends Invoice> list)
    {
        invoiceRepository.saveAll(list);
    }
}

Using Multiple DataSource in Spring Batch Tasklet

I am new to Spring Batch, and I'm encountering an issue when using multiple data sources in my batch.
Let me explain.
I am using 2 databases in my server with Spring Boot.
So far everything worked fine with my implementation of RoutingDataSource.
@Component("dataSource")
public class RoutingDataSource extends AbstractRoutingDataSource {

    @Autowired
    @Qualifier("datasourceA")
    DataSource datasourceA;

    @Autowired
    @Qualifier("datasourceB")
    DataSource datasourceB;

    @PostConstruct
    public void init() {
        setDefaultTargetDataSource(datasourceA);
        final Map<Object, Object> map = new HashMap<>();
        map.put(Database.A, datasourceA);
        map.put(Database.B, datasourceB);
        setTargetDataSources(map);
    }

    @Override
    protected Object determineCurrentLookupKey() {
        return DatabaseContextHolder.getDatabase();
    }
}
The implementation requires a DatabaseContextHolder; here it is:
public class DatabaseContextHolder {

    private static final ThreadLocal<Database> contextHolder = new ThreadLocal<>();

    public static void setDatabase(final Database dbConnection) {
        contextHolder.set(dbConnection);
    }

    public static Database getDatabase() {
        return contextHolder.get();
    }
}
When I receive a request on my server, a basic interceptor sets the current database, based on some input in the request, with the method DatabaseContextHolder.setDatabase(db). Everything works fine with my existing controllers.
It gets more complicated when I try to run a job with one tasklet.
One of my controllers starts an async task like this.
@GetMapping("/batch")
public void startBatch() {
    jobLauncher.run("myJob", new JobParameters());
}
@EnableBatchProcessing
@Configuration
public class MyBatch extends DefaultBatchConfigurer {

    @Autowired private JobBuilderFactory jobs;
    @Autowired private StepBuilderFactory steps;
    @Autowired private MyTasklet tasklet;

    @Bean
    public Job job(Step step) {
        return jobs.get("myJob").start(step).build();
    }

    @Bean
    protected Step registeredDeliveryTask() {
        return steps.get("myTask").tasklet(tasklet).build();
    }

    /** Overriding the jobLauncher getter to make it asynchronous. */
    @Override
    public JobLauncher getJobLauncher() {
        try {
            SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
            jobLauncher.setJobRepository(super.getJobRepository());
            jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
            jobLauncher.afterPropertiesSet();
            return jobLauncher;
        } catch (Exception e) {
            throw new BatchConfigurationException(e);
        }
    }
}
And my Tasklet:
@Component
public class MyTasklet implements Tasklet {

    @Autowired
    private UserRepository repository;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // Do stuff with the repository.
        return RepeatStatus.FINISHED;
    }
}
But the RoutingDataSource doesn't work, even if I set my context before starting the job. For example, if I set my database to B, the repository still works on database A.
It is always the default datasource that is selected (because of the line setDefaultTargetDataSource(datasourceA)).
I tried to set the database inside the tasklet, by passing the value in the job parameters, but I still got the same issue.
@GetMapping("/batch")
public void startBatch() {
    Map<String, JobParameter> parameters = new HashMap<>();
    parameters.put("database", new JobParameter(DatabaseContextHolder.getCircaDatabase().toString()));
    jobLauncher.run("myJob", new JobParameters(parameters));
}
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
    String database =
            chunkContext.getStepContext().getStepExecution().getJobParameters().getString("database");
    DatabaseContextHolder.setDatabase(Database.valueOf(database));
    // Do stuff with the repository.
    return RepeatStatus.FINISHED;
}
I feel like the problem is that the database was set in a different thread, because my job is asynchronous, so it cannot see the database that was set before launching the job. But I couldn't find any solution so far.
Regards
Your routing datasource is being used for Spring Batch's meta-data, which means the job repository will interact with a different database depending on the thread processing the request. This is not needed for batch jobs. You need to configure Spring Batch to work with a fixed data source.
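For illustration, a minimal sketch of pinning the batch meta-data to one of the two data sources, assuming the datasourceA qualifier shown above; since MyBatch already extends DefaultBatchConfigurer, the fixed data source can simply be passed to it through its constructor (the batchDataSource parameter name is illustrative):

import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Configuration;

@EnableBatchProcessing
@Configuration
public class MyBatch extends DefaultBatchConfigurer {

    // Pin the JobRepository and its transaction manager to a fixed data source,
    // bypassing the routing data source for Spring Batch meta-data.
    @Autowired
    public MyBatch(@Qualifier("datasourceA") DataSource batchDataSource) {
        super(batchDataSource);
    }

    // ... the existing job, step and JobLauncher definitions stay unchanged
}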

Java Spring Batch - Resource file is not injected into the tasklet

I'm working through the Java examples from the book Spring Batch in Action, chapter 1.
In this example, a tasklet unzips a zip file. The tasklet receives the zip file path as a job parameter.
I implemented a test method that runs the job and passes the parameters.
@StepScope
@Component
public class DecompressTasklet implements Tasklet {

    private static final Logger LOGGER = LogManager.getLogger(DecompressTasklet.class);

    @Value("#{jobParameters['inputResource']}")
    private Resource inputResource;

    @Value("#{jobParameters['targetDirectory']}")
    private String targetDirectory;

    @Value("#{jobParameters['targetFile']}")
    private String targetFile;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        //code here
    }
}
@Configuration
public class DescompressStep {

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private DecompressTasklet decompressTasklet;

    @Bean
    public Step stepDescompress() {
        return stepBuilderFactory
                .get(DescompressStep.class.getSimpleName())
                .tasklet(decompressTasklet)
                .build();
    }
}
@EnableBatchProcessing
@Configuration
public class ImportProductsJob {

    @Autowired
    private DescompressStep descompressStep;

    @Autowired
    private ReadWriteProductStep readWriteProductStep;

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory) {
        return jobBuilderFactory
                .get("importProductsJob")
                .start(descompressStep.stepDescompress())
                .next(readWriteProductStep.stepReaderWriter())
                .incrementer(new RunIdIncrementer())
                .build();
    }
}
Below is the test code that runs the job
@RunWith(SpringRunner.class)
@SpringBootTest
@SpringBatchTest
@AutoConfigureTestDatabase
public class ImportProductsIntegrationTest {

    @Autowired
    private JobRepositoryTestUtils jobRepositoryTestUtils;

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @After
    public void cleanUp() {
        jobRepositoryTestUtils.removeJobExecutions();
    }

    @Test
    public void importProducts() throws Exception {
        jobLauncherTestUtils.launchJob(defaultJobParameters());
    }

    private JobParameters defaultJobParameters() {
        JobParametersBuilder paramsBuilder = new JobParametersBuilder();
        paramsBuilder.addString("inputResource", "classpath:input/products.zip");
        paramsBuilder.addString("targetDirectory", "./target/importproductsbatch/");
        paramsBuilder.addString("targetFile", "products.txt");
        paramsBuilder.addLong("timestamp", System.currentTimeMillis());
        return paramsBuilder.toJobParameters();
    }
}
The products.zip file is in src/main/resources/input
The problem is that when I run the test, the following error occurs:
java.lang.NullPointerException: null
at com.springbatch.inaction.ch01.DecompressTasklet.execute(DecompressTasklet.java:62) ~[classes/:na]
I verified that the inputResource property is null. Why does this error occur?
In your job definition, you have:
@Bean
public Job job(JobBuilderFactory jobBuilderFactory) {
    return jobBuilderFactory
            .get("importProductsJob")
            .start(descompressStep.stepDescompress())
            .next(readWriteProductStep.stepReaderWriter())
            .incrementer(new RunIdIncrementer())
            .build();
}
The way you are passing steps to the start and next methods is incorrect (I don't even see how this would compile). What you can do instead is import the step configuration classes and inject both steps into your job definition. Something like:
@EnableBatchProcessing
@Configuration
@Import({DescompressStep.class, ReadWriteProductStep.class})
public class ImportProductsJob {

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory,
                   Step stepDescompress, Step stepReaderWriter) {
        return jobBuilderFactory
                .get("importProductsJob")
                .start(stepDescompress)
                .next(stepReaderWriter)
                .incrementer(new RunIdIncrementer())
                .build();
    }
}

Spring Batch Bean Placement

Does the placement of beans make a difference when loading them into a scoped context? Is this a bug or a timing-of-instantiation issue?
If I include the @StepScope and @Bean directly in the BatchConfiguration class, everything works seamlessly with StepScope. However, if I define another class, say "BatchProcessProcessor" as included below, and mark a method within that other class as a @Bean with @StepScope, it does not resolve properly. The actual symptom in Spring Batch is StepScope not triggering and the beans being loaded as singletons.
Something about providing the @Bean and @StepScope from another class that is loaded via constructor injection into the BatchConfiguration does not resolve properly.
The format described above is included below:
Main batch configuration class
@Slf4j
@Configuration
@EnableAutoConfiguration
@EnableBatchProcessing
public class BatchConfiguration extends DefaultBatchConfigurer {

    private BatchProcessProcessor processor;

    @Override
    public void setDataSource(DataSource dataSource) {
        // Override to not set a datasource even if one exists.
        // initialize() will then use a Map-based JobRepository (instead of a database).
    }

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    public BatchConfiguration(BatchProcessProcessor processor) {
        this.processor = processor;
    }

    @Bean
    @StepScope
    public ListItemReader<String> reader() {
        List<String> stringList = new ArrayList<>();
        stringList.add("test");
        stringList.add("another test");
        log.info("LOGGING A BUNCH OF STUFF THIS IS UNIQUE" + String.valueOf(System.currentTimeMillis()));
        return new ListItemReader<>(stringList);
    }

    @Bean
    @StepScope
    public CustomWriter writer() {
        return new CustomWriter();
    }

    @Bean
    public Job importUserJob(JobCompletionNotificationListener listener, Step step1) {
        return jobBuilderFactory.get("importUserJob")
                .incrementer(new RunIdIncrementer())
                .listener(listener)
                .flow(step1)
                .end()
                .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .<String, String> chunk(10)
                .reader(reader())
                .processor(processor.processor())
                .writer(writer())
                .build();
    }
}
Processor container class
@Component
public class BatchProcessProcessor {

    private MyService service;

    @Autowired
    BatchProcessProcessor(MyService service) {
        this.service = service;
    }

    /**
     * Generate processor utilized for processing
     * @return StringProcessor for testing
     */
    @Bean
    @StepScope
    public DeploymentProcesser processor() {
        return new DeploymentProcesser(service);
    }
}
Actual Processor
@Slf4j
@Component
public class DeploymentProcesser implements ItemProcessor<Deployment, Model> {

    private MyService service;

    @Autowired
    public DeploymentProcesser(MyService service) {
        this.service = service;
    }

    @Override
    public Model process(final Deployment deployment) {
        log.info(String.format("Processing %s details", deployment.getId()));
        Model model = new Model();
        model.setId(deployment.getId());
        return model;
    }
}
As far as I understand, when the BatchConfiguration loads, it should inject the BatchProcessProcessor and load the bean with step scope, but that doesn't seem to work.
As I said before, just copy-pasting the @Bean/@StepScope method directly into the BatchConfiguration and returning the same DeploymentProcesser works perfectly, and StepScope resolves.
Is this a lifecycle issue?
It does not make sense to declare a bean in a class annotated with @Component:
@Component
public class BatchProcessProcessor {

    private MyService service;

    @Autowired // This is correct, you can autowire collaborators
    public BatchProcessProcessor(MyService service) {
        this.service = service;
    }

    @Bean // THIS IS NOT CORRECT
    @StepScope
    public DeploymentProcesser processor() {
        return new DeploymentProcesser(service);
    }
}
You should rather do it in a configuration class annotated with @Configuration. That's why it works when you do it in BatchConfiguration.
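For illustration, a minimal sketch of moving the step-scoped bean into its own configuration class (the class name here is hypothetical); BatchConfiguration can then inject the resulting DeploymentProcesser bean instead of calling processor.processor():

import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class BatchProcessorConfiguration {

    // Declared in a @Configuration class, so @StepScope is honored and a
    // step-scoped proxy is created instead of a plain singleton.
    @Bean
    @StepScope
    public DeploymentProcesser processor(MyService service) {
        return new DeploymentProcesser(service);
    }
}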

Spring Batch doesn't call both ItemProcessor and ItemWriter in chunk-flow

I have a Spring Batch application that gets a file from a Samba server and generates a new file in a different folder on the same server.
However, only the ItemReader is called in the flow.
What is the problem? Thanks.
BatchConfiguration:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration extends BaseConfiguration {

    @Bean
    public ValeTrocaItemReader reader() {
        return new ValeTrocaItemReader();
    }

    @Bean
    public ValeTrocaItemProcessor processor() {
        return new ValeTrocaItemProcessor();
    }

    @Bean
    public ValeTrocaItemWriter writer() {
        return new ValeTrocaItemWriter();
    }

    @Bean
    public Job importUserJob(JobCompletionNotificationListener listener) throws Exception {
        return jobBuilderFactory()
                .get("importUserJob")
                .incrementer(new RunIdIncrementer())
                .repository(getJobRepository())
                .listener(listener)
                .start(this.step1())
                .build();
    }

    @Bean
    public Step step1() throws Exception {
        return stepBuilderFactory()
                .get("step1")
                .<ValeTroca, ValeTroca>chunk(10)
                .reader(this.reader())
                .processor(this.processor())
                .writer(this.writer())
                .build();
    }
}
BaseConfiguration:
public class BaseConfiguration implements BatchConfigurer {

    @Bean
    @Override
    public PlatformTransactionManager getTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    @Override
    public SimpleJobLauncher getJobLauncher() throws Exception {
        final SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
        simpleJobLauncher.setJobRepository(this.getJobRepository());
        return simpleJobLauncher;
    }

    @Bean
    @Override
    public JobRepository getJobRepository() throws Exception {
        return new MapJobRepositoryFactoryBean(this.getTransactionManager()).getObject();
    }

    @Bean
    @Override
    public JobExplorer getJobExplorer() {
        MapJobRepositoryFactoryBean repositoryFactory = this.getMapJobRepositoryFactoryBean();
        return new SimpleJobExplorer(repositoryFactory.getJobInstanceDao(), repositoryFactory.getJobExecutionDao(),
                repositoryFactory.getStepExecutionDao(), repositoryFactory.getExecutionContextDao());
    }

    @Bean
    public MapJobRepositoryFactoryBean getMapJobRepositoryFactoryBean() {
        return new MapJobRepositoryFactoryBean(this.getTransactionManager());
    }

    @Bean
    public JobBuilderFactory jobBuilderFactory() throws Exception {
        return new JobBuilderFactory(this.getJobRepository());
    }

    @Bean
    public StepBuilderFactory stepBuilderFactory() throws Exception {
        return new StepBuilderFactory(this.getJobRepository(), this.getTransactionManager());
    }
}
ValeTrocaItemReader:
@Configuration
public class ValeTrocaItemReader implements ItemReader<ValeTroca> {

    @Value(value = "${url}")
    private String url;

    @Value(value = "${user}")
    private String user;

    @Value(value = "${password}")
    private String password;

    @Value(value = "${domain}")
    private String domain;

    @Value(value = "${inputDirectory}")
    private String inputDirectory;

    @Bean
    @Override
    public ValeTroca read() throws MalformedURLException, SmbException, IOException, Exception {
        File tempOutputFile = getInputFile();
        DefaultLineMapper<ValeTroca> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(new DelimitedLineTokenizer() {
            {
                setDelimiter(";");
                setNames(new String[]{"id_participante", "cpf", "valor"});
            }
        });
        lineMapper.setFieldSetMapper(
                new BeanWrapperFieldSetMapper<ValeTroca>() {
                    {
                        setTargetType(ValeTroca.class);
                    }
                });
        FlatFileItemReader<ValeTroca> itemReader = new FlatFileItemReader<>();
        itemReader.setLinesToSkip(1);
        itemReader.setResource(new FileUrlResource(tempOutputFile.getCanonicalPath()));
        itemReader.setLineMapper(lineMapper);
        itemReader.open(new ExecutionContext());
        tempOutputFile.deleteOnExit();
        return itemReader.read();
    }
}
Sample of ItemProcessor:
public class ValeTrocaItemProcessor implements ItemProcessor<ValeTroca, ValeTroca> {

    @Override
    public ValeTroca process(ValeTroca item) {
        // Do anything
        ValeTroca item2 = item;
        System.out.println(item2.getCpf());
        return item2;
    }
}
EDIT:
Spring Boot 2.1.2.RELEASE - Spring Batch 4.1.1.RELEASE
Looking at your configuration, here are a couple of notes:
BatchConfiguration looks good. That's a typical job with a single chunk-oriented step.
BaseConfiguration is actually the default configuration you get when using @EnableBatchProcessing without providing a datasource, so this class can be removed.
Adding @Configuration on ValeTrocaItemReader and marking the method read() with @Bean is not correct. This means you are declaring a bean named read of type ValeTroca in your application context. Moreover, your custom reader uses a FlatFileItemReader but has no added value compared to a plain FlatFileItemReader. You can declare your reader as a FlatFileItemReader and configure it as needed (resource, line mapper, etc.). This will also avoid the mistake of opening the execution context in the read method, which should be done when initializing the reader, or in the ItemStream#open method if the reader implements ItemStream.
Other than that, I don't see from what you shared why the processor and writer are not called.
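For illustration, a minimal sketch of declaring the reader as a plain FlatFileItemReader bean with the same delimited layout as above; the resource path is hypothetical and would point to the file downloaded from the Samba server:

@Bean
public FlatFileItemReader<ValeTroca> valeTrocaItemReader() {
    // Tokenize ";"-delimited lines into named fields.
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setDelimiter(";");
    tokenizer.setNames("id_participante", "cpf", "valor");

    // Map each field set onto a ValeTroca instance.
    BeanWrapperFieldSetMapper<ValeTroca> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(ValeTroca.class);

    DefaultLineMapper<ValeTroca> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(fieldSetMapper);

    FlatFileItemReader<ValeTroca> itemReader = new FlatFileItemReader<>();
    itemReader.setResource(new FileSystemResource("/tmp/vale-troca-input.csv")); // hypothetical local copy of the Samba file
    itemReader.setLinesToSkip(1);
    itemReader.setLineMapper(lineMapper);
    return itemReader;
}

The step can then reference this bean directly with .reader(valeTrocaItemReader()), and Spring Batch will open and close it as part of the chunk-oriented processing.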
SOLVED: The problem was that even though I'm not using a database of my own, Spring Batch, although configured to keep the JobRepository in memory, still needs a database (usually H2) for its meta-data tables (jobs, executions, etc.).
In this case, the JDBC and H2 dependencies were missing from pom.xml. Adding them to the project solved the problem!
