Updating a processed item in Spring Batch

I have a Spring Batch app that uses ItemProcessor to process items.
@Component
@StepScope
public class MemberProcessor<T> implements ItemProcessor<T, Member> {
    @Override
    public Member process(T t) throws Exception {
        return processMember((UnprocessedMember) t);
    }
}

@Component
@StepScope
public class MemberWriter implements ItemWriter<Member> {
    @Override
    public void write(List<? extends Member> members) throws Exception {
        // store the members in the database
        saveToDb(members);
    }
}
I want to know whether it is possible to update an item after it has been processed, so that the update is already applied when the item reaches the ItemWriter. For example, I process one item, and then I process another one that may need to change a property of the previous item. Because the previous item has not yet reached the writer, I can't apply the update in the database, and the first item gets written without it.

Spring Batch provides ItemProcessListener for hooking into the processing of an item before it is handed to the writer. Implementations of this interface are notified before and after an item is passed to the ItemProcessor, and when the processor throws an exception.
So you need to create a custom listener that implements ItemProcessListener and register it with the step that runs the processor.
Example:
// T is the input item type, R is the type returned by the processor
public class MyCustomItemProcessListener<T, R> implements ItemProcessListener<T, R> {
    @Override
    public void beforeProcess(T item) {
        System.out.println("MyCustomItemProcessListener - beforeProcess");
    }

    @Override
    public void afterProcess(T item, R result) {
        System.out.println("MyCustomItemProcessListener - afterProcess");
        // Apply your custom logic to act on the object before write
    }

    @Override
    public void onProcessError(T item, Exception e) {
        System.out.println("MyCustomItemProcessListener - onProcessError");
    }
}
Now you can add the listener to your step. How exactly depends on your step definition and configuration, but essentially it looks something like this:
stepBuilderFactory.get("Stepx")
    ...
    .processor(someProcessor).listener(new MyCustomItemProcessListener())
    ...
Or
steps.get("stepx")
    .tasklet(new MyProcessorTask())
    .listener(new MyCustomItemProcessListener())
    .build();
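A minimal sketch of the kind of logic that could go into afterProcess, written in plain Java so that it runs without Spring Batch on the classpath. It assumes a single-threaded, non-fault-tolerant chunk-oriented step, in which the writer receives the same object instances the processor returned, so changing an earlier item from the same chunk is visible when the chunk is written. Member, replacesId, supersededBy and ChunkMemberTracker are hypothetical names used only for illustration. Items from earlier chunks have already been written and cannot be reached this way.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SameChunkUpdateSketch {

    // Hypothetical processed item. "supersededBy" is the property a later item may change.
    static class Member {
        final String id;
        final String replacesId; // id of an earlier member this one affects, or null
        String supersededBy;

        Member(String id, String replacesId) {
            this.id = id;
            this.replacesId = replacesId;
        }
    }

    // Plain-Java stand-in for a listener; afterProcess mirrors ItemProcessListener.afterProcess.
    static class ChunkMemberTracker {
        private final Map<String, Member> processedInChunk = new HashMap<>();

        void afterProcess(Object item, Member result) {
            if (result == null) {
                return; // the processor filtered this item
            }
            Member earlier = processedInChunk.get(result.replacesId);
            if (earlier != null) {
                earlier.supersededBy = result.id; // mutate the not-yet-written instance
            }
            processedInChunk.put(result.id, result);
        }

        // Forget the chunk once it has been written, e.g. from a chunk listener.
        void afterChunk() {
            processedInChunk.clear();
        }
    }

    public static void main(String[] args) {
        ChunkMemberTracker tracker = new ChunkMemberTracker();
        List<Member> chunk = new ArrayList<>();
        for (Member m : new Member[] {new Member("A", null), new Member("B", "A")}) {
            tracker.afterProcess(m, m); // in a real step the framework makes this call
            chunk.add(m);
        }
        // The "writer" sees the same instances, including the later update.
        System.out.println(chunk.get(0).id + " supersededBy=" + chunk.get(0).supersededBy);
        // prints: A supersededBy=B
        tracker.afterChunk();
    }
}
```

The tracker is cleared after each chunk so that it never hands out references to items that were already written.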

Related

Spring Batch Automatically Commits Updates On Database Even Without Specifying It on ItemWriter

I'm testing Spring Batch for my next project. In my step I configured an ItemReader, an ItemProcessor, and an ItemWriter.
In my ItemReader I fetch only the IDs using JdbcPagingItemReader and pass them to the ItemProcessor, which fetches the whole entity using JpaRepository.findOne(). I'm applying the Driving Query Pattern here.
In the same ItemProcessor implementation, I also set one of the entity's fields: setUpdated(new Date()).
Then in my ItemWriter, I just log the entity passed from the ItemProcessor.
My question: when I check the logs, Hibernate is updating the entity's values in the table even though my ItemWriter only logs it.
ItemProcessor implementation:
public class IdToContractItemProcessor implements ItemProcessor<BigDecimal, Contract> {
    @Override
    public Contract process(BigDecimal id) {
        Contract contract = repo.findOne(id.longValue());
        contract.setUpdated(new Date());
        return contract;
    }
}
ItemWriter implementation:
public class CustomItemWriter implements ItemWriter<Contract> {
    @Override
    public void write(List<? extends Contract> list) {
        for (Contract c : list) {
            log.info("id {}", c.getId());
        }
    }
}
Step bean:
@Bean
public Step step() {
    return stepBuilderFactory
            .get("myStep")
            .<BigDecimal, Contract>chunk(3)
            .reader(jdbcPagingItemReader)
            .processor(idToContractItemProcessor)
            .writer(customWriter)
            .build();
}
Contract is an entity class.
Why does Hibernate log an update statement after every 3 Contracts (one chunk) even though I'm not saving anything in the ItemWriter?

Fetch recently inserted row id using Room library

I'm using the Room persistence library to update the database. I'm stuck at the point where I want to fetch the ID of the most recently inserted record.
I know that declaring long as the return type of the insert method returns the ID. But I access this DAO method through a ViewModel.
My DAO method is as follows:
// MyDao
@Insert
long insert(RecordItem record);
This method is accessed from a repository by doing this:
// MyRepository
public class MyRepository {
    private MyDao myDao;

    public MyRepository(@NonNull Application application) {
        MainDatabase mainDatabase = MainDatabase.getInstance(application);
        myDao = mainDatabase.myDao();
    }

    public void insert(RecordItem record) {
        MainDatabase.dbWriteExecutor.execute(() -> {
            myDao.insert(record);
        });
    }
}
And the repository method is called from the ViewModel as follows:
// MyViewModel
public void insert(RecordItem record) {
    repository.insert(record);
}
And finally the ViewModel method is called from the activity:
// MyActivity
myViewModel.insert(record);
My problem is that I don't know how to get the long returned through the ViewModel method. I tried doing this in the repository:
// MyRepository
public class MyRepository {
    private MyDao myDao;
    private long id;

    public MyRepository(@NonNull Application application) {
        MainDatabase mainDatabase = MainDatabase.getInstance(application);
        myDao = mainDatabase.myDao();
    }

    public long insert(RecordItem record) {
        MainDatabase.dbWriteExecutor.execute(() -> {
            id = myDao.insert(record);
        });
        return id;
    }
}
and made the corresponding changes to the ViewModel method as well.
However, it returns 0, which I suppose happens because the insert runs on a different thread and id is returned as soon as the return statement is reached (correct me if I'm wrong).
Thanks in advance.

One way to solve this is with a callback.
Create a callback interface as below:
public interface DbInsertCallback {
    void onInsert(long insertedItemId);
}
Then use this interface in your repository's insert(RecordItem record) method, like this:
public class MyRepository {
    // ... Some repo code ...
    public void insert(RecordItem record, DbInsertCallback callback) {
        MainDatabase.dbWriteExecutor.execute(() -> {
            long id = myDao.insert(record);
            callback.onInsert(id);
        });
    }
    // ... Rest of repo code ...
}
Then make the corresponding changes at the call sites (i.e. the ViewModel and the Activity) so that they pass an implementation of this callback as a parameter. You can either create a dedicated object that implements the interface, or have the calling class implement it and pass this.
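The callback flow can be tried outside Android with a small plain-Java sketch. FakeDao, Repository and insertAndAwaitId are hypothetical stand-ins for the Room DAO, the repository and a caller; only DbInsertCallback comes from the answer above. The blocking helper exists so the sketch can print the result and is not something UI code should do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class InsertCallbackSketch {

    interface DbInsertCallback {
        void onInsert(long insertedItemId);
    }

    // Hypothetical stand-in for the Room DAO: hands out increasing row ids.
    static class FakeDao {
        private final AtomicLong nextId = new AtomicLong(1);

        long insert(String record) {
            return nextId.getAndIncrement();
        }
    }

    static class Repository {
        private final FakeDao dao = new FakeDao();
        private final ExecutorService dbWriteExecutor = Executors.newSingleThreadExecutor();

        // The id is delivered to the callback on the executor thread, not returned.
        void insert(String record, DbInsertCallback callback) {
            dbWriteExecutor.execute(() -> callback.onInsert(dao.insert(record)));
        }

        void shutdown() {
            dbWriteExecutor.shutdown();
        }
    }

    // Demo helper only: blocks until the callback fires. UI code would not block like this.
    static long insertAndAwaitId(Repository repository, String record) throws InterruptedException {
        AtomicLong received = new AtomicLong(-1);
        CountDownLatch done = new CountDownLatch(1);
        repository.insert(record, id -> {
            received.set(id);
            done.countDown();
        });
        if (!done.await(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("insert callback was not invoked");
        }
        return received.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Repository repository = new Repository();
        System.out.println(insertAndAwaitId(repository, "first"));  // prints 1
        System.out.println(insertAndAwaitId(repository, "second")); // prints 2
        repository.shutdown();
    }
}
```

Note that the callback runs on the executor's background thread. In an Android ViewModel the id would typically be handed back to the main thread before touching the UI, for example through LiveData.postValue.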

You can also use RxJava for this problem, where the insert method returns Single<Long>.
@Insert
Single<Long> insert(RecordItem item);
Then, when calling insert, you subscribe to get the returned ID, or use flatMap for any further actions using RxJava.
myDao.insert(record).subscribeWith(new DisposableSingleObserver<Long>() {
    @Override
    public void onSuccess(Long id) {
        // handle the id
    }

    @Override
    public void onError(Throwable e) {
        // handle the error case
    }
});
I suggest taking a look at RxJava further down the line, since it makes asynchronous programming much more natural and easier to work with, and Room supports it out of the box.

How to access current chunk context programmatically from CustomItemWriter?

I have a job with a multi-threaded chunk-oriented step and I need to count how many written items satisfy some business rules. (PS: For legacy reasons, I'm using Spring Batch 3.0.x)
I have to keep in mind that if a rollback happens, the items already counted within the same transaction (i.e. the same chunk) must be ignored. So I can't just update the JobExecutionContext straight from the writer. Instead, I update an attribute in the ChunkContext and use a CustomChunkListener to update the JobExecutionContext only after the chunk succeeds (as you can see in the code below).
Before making the step multi-threaded, I had the following implementation, which worked as expected (I simplified the code as much as I could to focus on the issue):
CustomItemWriter
public class CustomItemWriter implements ItemWriter<String[]> {
    private ChunkContext chunkContext;

    @Override
    public void write(List<? extends String[]> items) throws Exception {
        for (String[] item : items) {
            ((AtomicLong) this.chunkContext.getAttribute("chunkCounter")).incrementAndGet();
        }
    }

    @BeforeChunk
    private void beforeChunk(ChunkContext chunkContext) {
        this.chunkContext = chunkContext;
    }
}
CustomChunkListener
public class CustomChunkListener extends ChunkListenerSupport {
    @Override
    public void beforeChunk(ChunkContext context) {
        context.setAttribute("chunkCounter", new AtomicLong());
    }

    @Override
    public void afterChunk(ChunkContext context) {
        AtomicLong jobCounter = (AtomicLong) context.getStepContext().getJobExecutionContext().get("jobCounter");
        AtomicLong chunkCounter = (AtomicLong) context.getAttribute("chunkCounter");
        jobCounter.addAndGet(chunkCounter.get());
    }
}
CustomJobListener
public class CustomJobListener extends JobExecutionListenerSupport {
    @Override
    public void beforeJob(JobExecution jobExecution) {
        jobExecution.getExecutionContext().put("jobCounter", new AtomicLong());
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        System.out.println("jobCounter = " + jobExecution.getExecutionContext().get("jobCounter"));
    }
}
However, when I configured the job to run the step in a multi-threaded fashion, the counter was no longer updated properly, and I know that this was because of the way I was getting access to the ChunkContext in the CustomItemWriter.
The CustomItemWriter bean is step-scoped (as far as I know, there is no "chunk scope" available), so each time a thread started a new ChunkContext, the beforeChunk method in CustomItemWriter overwrote the previous ChunkContext and messed everything up (items counted so far were lost, since I no longer had a reference to the previous ChunkContext instances).
So, I managed to fix the issue by using ThreadLocal, like below:
CustomItemWriter (v2)
public class CustomItemWriter implements ItemWriter<String[]> {
    private ThreadLocal<ChunkContext> chunkContext = new ThreadLocal<ChunkContext>();

    @Override
    public void write(List<? extends String[]> items) throws Exception {
        for (String[] item : items) {
            ((AtomicLong) this.chunkContext.get().getAttribute("chunkCounter")).incrementAndGet();
        }
    }

    @BeforeChunk
    private void beforeChunk(ChunkContext chunkContext) {
        this.chunkContext.set(chunkContext);
    }
}
Although I have managed to solve the problem, I'm wondering if there is a better way to access the current ChunkContext (for the current thread) from within the CustomItemWriter. Is there a way to get it programmatically? To do it "the Spring way", perhaps a [new] chunk scope should be implemented in newer versions of Spring Batch?
PS: Although the problem is solved, I thought it would be helpful to write this question up so it can help someone with the same needs.

"Although I have managed to solve the problem, I'm wondering if there is a better way to access the current ChunkContext (for the current thread) from within the CustomItemWriter"
No, I see no other way to do that. However, it is not recommended to play with the chunk context in a multi-threaded step. This context is a mutable state which is shared between all threads processing the step (with all the nasty bugs of using shared mutable state in a multi-threaded environment).
The documentation will be updated in that regard, see https://github.com/spring-projects/spring-batch/pull/591/files#diff-177ad333794c9242aaa9ec2d0bec1842R147-R150.
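The overwrite described in the question can be reproduced without Spring Batch. The following is a minimal sketch in plain Java: Context is a hypothetical stand-in for ChunkContext, and the two holder classes mimic the writer's beforeChunk method with a shared field and with a ThreadLocal. A barrier forces both threads to call beforeChunk before either one reads the context back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadLocalContextSketch {

    // Hypothetical stand-in for ChunkContext: remembers which thread's chunk it belongs to.
    static class Context {
        final String owner;

        Context(String owner) {
            this.owner = owner;
        }
    }

    interface Holder {
        void beforeChunk(Context context);

        Context current();
    }

    // One field shared by all threads: the last beforeChunk call wins.
    static class SharedFieldHolder implements Holder {
        private volatile Context context;

        public void beforeChunk(Context context) { this.context = context; }

        public Context current() { return context; }
    }

    // One slot per thread: each thread only sees what it stored itself.
    static class ThreadLocalHolder implements Holder {
        private final ThreadLocal<Context> context = new ThreadLocal<>();

        public void beforeChunk(Context context) { this.context.set(context); }

        public Context current() { return context.get(); }
    }

    // Runs two "chunks" on two threads and counts the threads that read back their own context.
    static int threadsSeeingOwnContext(Holder holder) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CyclicBarrier bothStored = new CyclicBarrier(2);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String name : new String[] {"t1", "t2"}) {
                results.add(pool.submit(() -> {
                    holder.beforeChunk(new Context(name));
                    bothStored.await(); // wait until both threads have stored a context
                    return name.equals(holder.current().owner);
                }));
            }
            int count = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    count++;
                }
            }
            return count;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(threadsSeeingOwnContext(new SharedFieldHolder()));  // prints 1
        System.out.println(threadsSeeingOwnContext(new ThreadLocalHolder()));  // prints 2
    }
}
```

With the shared field, exactly one of the two threads ends up reading the other thread's context, which matches the lost counts described in the question.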

How to restore a state machine from a context built at runtime?

I have a state machine
@EnableStateMachine
@Configuration
public class StateMachineConfiguration extends EnumStateMachineConfigurerAdapter<Status, Event> {
    @Override
    public void configure(StateMachineStateConfigurer<Status, Event> states) throws Exception {
        states.withStates()
                .initial(Status.DRAFT)
                .states(EnumSet.allOf(Status.class));
    }

    @Override
    public void configure(StateMachineTransitionConfigurer<Status, Event> transitions) throws Exception {
        transitions
                .withExternal()
                .target(Status.INVITATION).source(Status.DRAFT)
                .event(Event.INVITED)
                .guard(new Guard())
                .action(new ActionInvited())
                .and()
                .withExternal()
                .target(Status.DECLINED).source(Status.INVITATION)
                .event(Event.DECLINED)
                .action(new ActionDeclined());
    }

    @Override
    public void configure(StateMachineConfigurationConfigurer<Status, Event> config) throws Exception {
        config.withConfiguration().autoStartup(true);
    }
}
and I have a model, for example Order.
The model is persisted in the DB. When I load it from the DB, it has the status Order.status == INVITATION. I want to continue processing the model with the state machine, but a new state machine instance starts from the initial state DRAFT, whereas I need to continue from INVITATION. In other words, I want to execute
stateMachine.sendEvent(MessageBuilder
        .withPayload(Event.DECLINED)
        .setHeader("orderId", order.id)
        .build()
);
and have the action ActionDeclined() executed. I don't want to persist the state machine context in the DB. I want to set the state of the state machine to the state of my model at runtime. What is the right way to do that? By using the DefaultStateContext constructor, or is there another, cleaner way?

One possible approach is to create the StateMachine on the fly and rehydrate it from the DB using the state of the Order.
In this case you need to do the following steps:
- Reset the StateMachine in all regions
- Load the Order status from the DB
- Create a new DefaultStateMachineContext and populate it accordingly
Let's assume you have a build method that returns new state machines for processing order events (using a StateMachineFactory), but for an existing order rehydrates the state from the database.
StateMachine<Status, Event> build(long orderId) {
    return orderService.getOrder(orderId) // returns Optional
            .map(order -> {
                StateMachine<Status, Event> sm = stateMachineFactory.getStateMachine(Long.toString(orderId));
                sm.stop();
                rehydrateState(sm, sm.getExtendedState(), order.getStatus());
                sm.start();
                return sm;
            })
            .orElseGet(() -> createNewStateMachine(orderId));
}

void rehydrateState(StateMachine<Status, Event> newStateMachine, ExtendedState extendedState, Status orderStatus) {
    // Reset every region of the machine to the status stored on the order
    newStateMachine.getStateMachineAccessor().doWithAllRegions(sma ->
            sma.resetStateMachine(new DefaultStateMachineContext<>(orderStatus, null, null, extendedState)));
}

Best approach to pass data between ItemProcessors in spring batch?

I need to pass data related to processing an item between item processors. I don't need to persist the data. What is the best approach? (Note: I'm currently using StepSynchronizationManager to access the StepExecution and storing the data in its ExecutionContext.)

What makes you think that your way (storing the data in the step's ExecutionContext) is bad, or not the best way?
You could try it without saving the data in the StepExecution and instead change the item type between the processors:
public class FirstProcessor implements ItemProcessor<String, String> {...}
public class SecondProcessor implements ItemProcessor<String, OtherClass> {
    public OtherClass process(String item) throws Exception {
        return otherClassObjectWithDataForNextProcessor;
    }
}
public class ThirdProcessor implements ItemProcessor<OtherClass, TargetClass> {...}
public class CustomItemWriter implements ItemWriter<TargetClass> {...}
See the Spring Batch documentation on chaining item processors.
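The idea of carrying the extra data in the item itself can be sketched in plain Java. The ItemProcessor interface below is a local stand-in with the same shape as the Spring Batch one, and ParsedLine and chain are hypothetical names for illustration. In Spring Batch itself the chaining would normally be done with CompositeItemProcessor and its list of delegates rather than a hand-written chain method.

```java
public class ProcessorChainSketch {

    // Local stand-in with the same shape as Spring Batch's ItemProcessor.
    interface ItemProcessor<I, O> {
        O process(I item) throws Exception;
    }

    // Hypothetical intermediate item: carries the data the next processor needs.
    static class ParsedLine {
        final String text;
        final int length;

        ParsedLine(String text, int length) {
            this.text = text;
            this.length = length;
        }
    }

    // Feeds the output of the first processor into the second one.
    // A null result stops the chain, mirroring how a null item is treated as filtered.
    static <A, B, C> ItemProcessor<A, C> chain(ItemProcessor<A, B> first, ItemProcessor<B, C> second) {
        return item -> {
            B intermediate = first.process(item);
            return intermediate == null ? null : second.process(intermediate);
        };
    }

    public static void main(String[] args) throws Exception {
        ItemProcessor<String, ParsedLine> parse = raw -> {
            String trimmed = raw.trim();
            return trimmed.isEmpty() ? null : new ParsedLine(trimmed, trimmed.length());
        };
        ItemProcessor<ParsedLine, String> format = parsed -> parsed.text + ":" + parsed.length;

        ItemProcessor<String, String> composite = chain(parse, format);
        System.out.println(composite.process("  abc ")); // prints abc:3
        System.out.println(composite.process("   "));    // prints null
    }
}
```

Because the data travels inside the intermediate item, nothing has to be stored in the step's ExecutionContext, and the data is scoped to a single item.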
