I have a job with a multi-threaded chunk-oriented step and I need to count how many written items satisfy some business rules. (PS: For legacy reasons, I'm using Spring Batch 3.0.x)
I have to keep in mind that if a rollback happens, items already counted within the same transaction (i.e. the same chunk) must be ignored. So I can't update the JobExecutionContext straight from the writer; instead, I update an attribute in the ChunkContext and use a CustomChunkListener to update the JobExecutionContext only after the chunk succeeds (as you can see in the code below).
Before making the step multi-threaded, I had the following implementation, which worked as expected (I simplified the code as much as I could to focus on the issue):
CustomItemWriter
public class CustomItemWriter implements ItemWriter<String[]> {

    private ChunkContext chunkContext;

    @Override
    public void write(List<? extends String[]> items) throws Exception {
        for (String[] item : items) {
            ((AtomicLong) this.chunkContext.getAttribute("chunkCounter")).incrementAndGet();
        }
    }

    @BeforeChunk
    private void beforeChunk(ChunkContext chunkContext) {
        this.chunkContext = chunkContext;
    }
}
CustomChunkListener
public class CustomChunkListener extends ChunkListenerSupport {

    @Override
    public void beforeChunk(ChunkContext context) {
        context.setAttribute("chunkCounter", new AtomicLong());
    }

    @Override
    public void afterChunk(ChunkContext context) {
        AtomicLong jobCounter = (AtomicLong) context.getStepContext().getJobExecutionContext().get("jobCounter");
        AtomicLong chunkCounter = (AtomicLong) context.getAttribute("chunkCounter");
        jobCounter.addAndGet(chunkCounter.get());
    }
}
CustomJobListener
public class CustomJobListener extends JobExecutionListenerSupport {

    @Override
    public void beforeJob(JobExecution jobExecution) {
        jobExecution.getExecutionContext().put("jobCounter", new AtomicLong());
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        System.out.println("jobCounter = " + jobExecution.getExecutionContext().get("jobCounter"));
    }
}
However, when I configured the job to run the step in a multi-threaded fashion, the counter was no longer updated properly, and I know it was because of the way I was getting access to the ChunkContext in the CustomItemWriter.
The CustomItemWriter bean is step-scoped (as far as I know, there is no "chunk scope" available), so each time a thread started a new chunk, the beforeChunk method in CustomItemWriter overwrote the previous ChunkContext and messed everything up (the items counted so far were lost, since I no longer had a reference to the previous ChunkContext instances).
So, I managed to fix the issue by using ThreadLocal, like below:
CustomItemWriter (v2)
public class CustomItemWriter implements ItemWriter<String[]> {

    private ThreadLocal<ChunkContext> chunkContext = new ThreadLocal<ChunkContext>();

    @Override
    public void write(List<? extends String[]> items) throws Exception {
        for (String[] item : items) {
            ((AtomicLong) this.chunkContext.get().getAttribute("chunkCounter")).incrementAndGet();
        }
    }

    @BeforeChunk
    private void beforeChunk(ChunkContext chunkContext) {
        this.chunkContext.set(chunkContext);
    }
}
Although I have managed to solve the problem, I'm wondering if there is a better way to access the current ChunkContext (for the current thread) from within the CustomItemWriter. Is there a way to get it programmatically? To do it "the Spring way", should a [new] chunk scope perhaps be implemented in newer versions of Spring Batch?
PS: Although the problem is solved, I thought it would be helpful to write this question up, since it may help someone with the same needs.
Although I have managed to solve the problem, I'm wondering if there is a better way to access the current ChunkContext (for the current thread) from within the CustomItemWriter
No, I see no other way to do that. However, it is not recommended to play with the chunk context in a multi-threaded step. This context is mutable state shared between all the threads processing the step (with all the nasty bugs that come with shared mutable state in a multi-threaded environment).
The documentation will be updated in that regard, see https://github.com/spring-projects/spring-batch/pull/591/files#diff-177ad333794c9242aaa9ec2d0bec1842R147-R150.
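As an aside: if counting is all that's needed, one rollback-safe alternative that avoids the shared ChunkContext entirely is to register a transaction synchronization from the writer, so that a shared, thread-safe counter is updated only once the chunk's transaction commits. A minimal sketch, assuming the writer runs inside the chunk transaction (the class name and counter wiring are illustrative, not from the original post):
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

import org.springframework.batch.item.ItemWriter;
import org.springframework.transaction.support.TransactionSynchronizationAdapter;
import org.springframework.transaction.support.TransactionSynchronizationManager;

public class CommitCountingItemWriter implements ItemWriter<String[]> {

    // Shared across all step threads; AtomicLong makes the updates thread-safe.
    private final AtomicLong jobCounter;

    public CommitCountingItemWriter(AtomicLong jobCounter) {
        this.jobCounter = jobCounter;
    }

    @Override
    public void write(List<? extends String[]> items) {
        final long matched = items.size(); // apply your business rules here instead

        // Defer the update until the chunk's transaction commits, so items
        // from rolled-back chunks are never counted.
        TransactionSynchronizationManager.registerSynchronization(
                new TransactionSynchronizationAdapter() {
                    @Override
                    public void afterCommit() {
                        jobCounter.addAndGet(matched);
                    }
                });
    }
}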
Related
I have multiple methods in my codebase annotated with Spring's @Transactional with different propagation levels (let's ignore the reasoning behind choosing the propagation levels). Example:
public class X {

    @Transactional(propagation = Propagation.NOT_SUPPORTED)
    public void A() { do_something; }

    @Transactional(propagation = Propagation.REQUIRED)
    public void B() { do_something; }

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void C() { do_something; }
}
Now I have a new use case where I want to perform all these operations in a single transaction (for this specific use case only, without modifying existing behavior), overriding any annotated propagation levels. Example:
public class Y {

    private X x;

    // Stores the application's global state
    private GlobalState globalState;

    @Transactional
    public void newOperation() {
        // Set the current operation as the new operation in the global state,
        // in case this info might be required somewhere
        globalState.setCurrentOperation("newOperation");

        // For this new operation, A, B and C should be performed in the current
        // transaction regardless of the propagation level defined on them
        x.A();
        x.B();
        x.C();
    }
}
Does Spring provide some way to achieve this? Is it not possible?
One way I can think of is to split the original methods:
@Transactional(propagation = Propagation.NOT_SUPPORTED)
public void A() { A_actual(); }

// Call A_actual from A and from newOperation
public void A_actual() { do_something; }
But this might not be as simple as the example suggests (there can be a lot of such methods, so this approach might not scale), and it doesn't look very clean either.
The use case might also appear counterintuitive, but let's keep that out of the scope of this question.
I believe the only option is to replace the TransactionInterceptor via a BeanPostProcessor, something like:
public class TransactionInterceptorExt extends TransactionInterceptor {

    @Override
    public Object invoke(MethodInvocation invocation) throws Throwable {
        // here some logic determining how to proceed with the invocation
        return super.invoke(invocation);
    }
}
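To make that placeholder comment concrete, here is one hedged possibility: wrap the original TransactionAttributeSource so that, while a thread-local flag is active (newOperation would set it before calling A/B/C and clear it in a finally block), every method's declared propagation is overridden with REQUIRED. The flag and class name below are hypothetical, not an established Spring API:
import java.lang.reflect.Method;

import org.springframework.transaction.TransactionDefinition;
import org.springframework.transaction.interceptor.DefaultTransactionAttribute;
import org.springframework.transaction.interceptor.TransactionAttribute;
import org.springframework.transaction.interceptor.TransactionAttributeSource;

public class PropagationOverridingAttributeSource implements TransactionAttributeSource {

    // Hypothetical flag: newOperation() sets it to true before calling A/B/C
    // and resets it in a finally block.
    public static final ThreadLocal<Boolean> FORCE_REQUIRED =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    private final TransactionAttributeSource delegate;

    public PropagationOverridingAttributeSource(TransactionAttributeSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public TransactionAttribute getTransactionAttribute(Method method, Class<?> targetClass) {
        TransactionAttribute attribute = delegate.getTransactionAttribute(method, targetClass);
        if (attribute == null || !FORCE_REQUIRED.get()) {
            return attribute;
        }
        // Copy the resolved attribute but force PROPAGATION_REQUIRED so the
        // method joins the transaction opened by newOperation().
        DefaultTransactionAttribute overridden = new DefaultTransactionAttribute(attribute);
        overridden.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRED);
        return overridden;
    }
}
In the post-processor below you would then wrap the source before passing it on, e.g. result.setTransactionAttributeSource(new PropagationOverridingAttributeSource(interceptor.getTransactionAttributeSource())).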
public class TransactionInterceptorPostProcessor implements BeanFactoryPostProcessor, BeanPostProcessor, BeanFactoryAware {

    @Setter
    private BeanFactory beanFactory;

    @Override
    public void postProcessBeanFactory(@NonNull ConfigurableListableBeanFactory beanFactory) throws BeansException {
        beanFactory.addBeanPostProcessor(this);
    }

    @Override
    public Object postProcessBeforeInitialization(@NonNull Object bean, @NonNull String beanName) throws BeansException {
        if (bean instanceof TransactionInterceptor) {
            TransactionInterceptor interceptor = (TransactionInterceptor) bean;
            TransactionInterceptor result = new TransactionInterceptorExt();
            result.setTransactionAttributeSource(interceptor.getTransactionAttributeSource());
            result.setTransactionManager(interceptor.getTransactionManager());
            result.setBeanFactory(beanFactory);
            return result;
        }
        return bean;
    }
}
@Configuration
public class CustomTransactionConfiguration {

    @Bean
    //@ConditionalOnBean(TransactionInterceptor.class)
    public static BeanFactoryPostProcessor transactionInterceptorPostProcessor() {
        return new TransactionInterceptorPostProcessor();
    }
}
However, I would agree with @jim-garrison's suggestion to refactor your Spring beans.
UPD.
But you favour refactoring the beans instead of following this approach. So for the sake of completeness, can you please mention any issues/shortcomings with it?
Well, there are plenty of things/concepts/ideas in the Spring framework which were implemented without understanding/anticipating the consequences (I believe the goal was to make the framework attractive to inexperienced developers), and the @Transactional annotation is one of those things. Let's consider the following code:
@Transactional(propagation = Propagation.REQUIRED)
public void doSomething() {
    do_something;
}
The question is: why do we put the @Transactional(propagation = Propagation.REQUIRED) annotation on that method? Someone might say something like this:
that method modifies multiple rows/tables in the DB and we would like to avoid inconsistencies in our DB; moreover, Propagation.REQUIRED does not hurt anything, because according to the contract it either starts a new transaction or joins the existing one.
and that would be wrong:
the @Transactional annotation poisons stack traces with irrelevant information
in case of an exception, it marks the existing transaction it joined as rollback-only; after that, the caller has no way to compensate for the exception
In most cases developers should not use @Transactional(propagation = Propagation.REQUIRED); technically, we just need a simple assertion about the transaction status.
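As a sketch of that "simple assertion", using Spring's TransactionSynchronizationManager instead of the annotation:
import org.springframework.transaction.support.TransactionSynchronizationManager;

public void doSomething() {
    // Fail fast if the caller forgot to open a transaction, instead of
    // letting @Transactional silently decide the transaction boundaries.
    if (!TransactionSynchronizationManager.isActualTransactionActive()) {
        throw new IllegalStateException("doSomething() requires an active transaction");
    }
    // do_something;
}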
Using @Transactional(propagation = Propagation.REQUIRES_NEW) is even more harmful:
when a transaction already exists, it acquires another JDBC connection from the connection pool, so you start using 2+ connections per thread; this hurts performance and capacity sizing
you need to watch the data you are working with carefully; data corruption and self-locks are consequences of using @Transactional(propagation = Propagation.REQUIRES_NEW), because you now have two incarnations of the same data within the same thread
In most cases, @Transactional(propagation = Propagation.REQUIRES_NEW) is an indicator that your code requires refactoring.
So the general idea about the @Transactional annotation is: do not use it everywhere just because we can. Your question actually confirms this idea: you couldn't tie three methods together precisely because the developer made assumptions about how those methods would be executed.
I have created an OSGi service. I want a new instance of my service to be created each time a service request comes in.
The code looks like this:
@Component(immediate = true)
@Service(serviceFactory = true)
@Property(name = EventConstants.EVENT_TOPIC, value = { DEPLOY, UNDEPLOY })
public class XyzHandler implements EventHandler {

    private Consumer consumer;

    public void setConsumer(Consumer consumer) {
        this.consumer = consumer;
    }

    @Override
    public void handleEvent(final Event event) {
        consumer.notifyConsumer();
    }
}
public class Consumer {

    private DataSourceCache cache;

    // named notifyConsumer() because notify() is final in Object and cannot be redeclared
    public void notifyConsumer() {
        updateCache(cache);
        System.out.println("cache updated");
    }

    public void updateCache(DataSourceCache cache) {
        cache = null;
    }
}
In my Consumer class, I want to access the service instance of XyzHandler and set its consumer attribute. I would also like a new service instance of XyzHandler to be created for each request.
I found a few articles mentioning that this can be achieved using the OSGi Declarative Services annotations.
OSGi how to run mutliple instances of one service
But I want to achieve this without using DS 1.3.
How can I do this without using annotations, or how can it be done using DS 1.2?
To me this looks like a case of asking a question based on what you think the answer is, rather than describing what you're trying to achieve. If we take a few steps back, a more elegant solution exists.
In general, injecting objects into stateful services is a bad pattern in OSGi. It forces you to be really careful about the lifecycle, and it risks memory leaks. From the example code, it appears that what you really want is for your Consumer to be notified when an event occurs on an Event Admin topic. The easiest way to do this would be to remove the XyzHandler from the equation and make the Consumer an EventHandler, like this:
@Component(property = { EventConstants.EVENT_TOPIC + "=" + DEPLOY,
                        EventConstants.EVENT_TOPIC + "=" + UNDEPLOY })
public class Consumer implements EventHandler {

    private DataSourceCache cache;

    @Override
    public void handleEvent(final Event event) {
        notifyConsumer();
    }

    public void notifyConsumer() {
        updateCache(cache);
        System.out.println("cache updated");
    }

    public void updateCache(DataSourceCache cache) {
        cache = null;
    }
}
If you really don't want to make your Consumer an EventHandler then it would still be easier to register the Consumer as a service and use the whiteboard pattern to get it picked up by a single XyzHandler:
@Component(service = Consumer.class)
public class Consumer {

    private DataSourceCache cache;

    public void notifyConsumer() {
        updateCache(cache);
        System.out.println("cache updated");
    }

    public void updateCache(DataSourceCache cache) {
        cache = null;
    }
}
@Component(property = { EventConstants.EVENT_TOPIC + "=" + DEPLOY,
                        EventConstants.EVENT_TOPIC + "=" + UNDEPLOY })
public class XyzHandler implements EventHandler {

    // Use a thread-safe list for dynamic references!
    private List<Consumer> consumers = new CopyOnWriteArrayList<>();

    @Reference(cardinality = MULTIPLE, policy = DYNAMIC)
    void addConsumer(Consumer consumer) {
        consumers.add(consumer);
    }

    void removeConsumer(Consumer consumer) {
        consumers.remove(consumer);
    }

    @Override
    public void handleEvent(final Event event) {
        consumers.forEach(this::notify);
    }

    private void notify(Consumer consumer) {
        try {
            consumer.notifyConsumer();
        } catch (Exception e) {
            // TODO log this?
        }
    }
}
Using the whiteboard pattern in this way avoids you needing to track which XyzHandler needs to be created/destroyed when a bundle is started or stopped, and will keep your code much cleaner.
It sounds like your service needs to be a prototype scope service. This was introduced in Core R6. DS 1.3, from Compendium R6, includes support for components to be prototype scope services.
But DS 1.2 predates Core R6 and thus has no knowledge or support for prototype scope services.
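For reference, a DS 1.3 prototype scope declaration would look something like the sketch below, using the standard org.osgi.service.component.annotations (the topic strings are placeholders for the DEPLOY/UNDEPLOY constants above):
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.ServiceScope;
import org.osgi.service.event.Event;
import org.osgi.service.event.EventHandler;

// DS 1.3+ only: with PROTOTYPE scope, each requesting bundle (or each
// ServiceObjects.getService() call) gets its own component instance.
@Component(
        scope = ServiceScope.PROTOTYPE,
        property = { "event.topics=some/topic/DEPLOY",
                     "event.topics=some/topic/UNDEPLOY" })
public class XyzHandler implements EventHandler {

    @Override
    public void handleEvent(Event event) {
        // handle the event
    }
}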
The title might be incorrect, but I will try to explain my issue. My project is a Spring Boot project. I have services which make calls to external REST endpoints.
I have a service method which makes several calls to other services of mine. Every individual call can succeed or fail. Each call goes to a REST endpoint, and there can be issues: the web service may be unavailable, for example, or it may throw an unknown exception in rare cases. Whatever happens, I need to be able to track which calls were successful, and if any one of them fails, I want to roll back to the original state as if nothing happened, a bit like the @Transactional annotation. The REST calls go to different endpoints, need to be called separately, and belong to an external party I have no influence over. Example:
public class MyServiceImpl implements MyService {

    @Autowired
    private Process1Service process1Service;

    @Autowired
    private Process2Service process2Service;

    @Autowired
    private Process3Service process3Service;

    @Autowired
    private Process4Service process4Service;

    public void bundledProcess() {
        process1Service.createFileRESTcall();
        process2Service.addFilePermissionsRESTcall();
        process3Service.addFileMetadataRESTcall(); // <-- might fail, for example
        process4Service.addFileTimestampRESTcall();
    }
}
If, for example, process3Service.addFileMetadataRESTcall fails, I want to do something like an undo (in reverse order) for every step before process3:
process2Service.removeFilePermissionsRESTcall();
process1Service.deleteFileRESTcall();
I read about the Command pattern, but that seems to be used for undo actions inside an application, as a sort of history of performed actions, not inside a Spring web application. Is it suitable for my use case too, or should I track for each method/web-service call whether it was successful? Is there a best practice for this?
I guess that however I track it, I need to know which call failed and, from there, perform my 'undo' REST calls. Although in theory even these undo calls might fail, of course.
My main goal is to avoid ending up with files (in my example) on which the subsequent processes were never performed. It should be all or nothing; a sort of transaction.
Update 1: improved pseudo-implementation based on the comments:
public class Process1ServiceImpl implements Process1Service {

    public void createFileRESTcall() throws MyException {
        // Call an external REST API, pseudo code:
        if (REST-call fails) {
            throw new MyException("External REST api failed");
        }
    }
}
public class BundledProcessEvent {

    private boolean createFileSuccess;
    private boolean addFilePermissionsSuccess;
    private boolean addFileMetadataSuccess;
    private boolean addFileTimestampSuccess;

    // Getters and setters
}
public class MyServiceImpl implements MyService {

    @Autowired
    private Process1Service process1Service;

    @Autowired
    private Process2Service process2Service;

    @Autowired
    private Process3Service process3Service;

    @Autowired
    private Process4Service process4Service;

    @Autowired
    private ApplicationEventPublisher applicationEventPublisher;

    @Transactional(rollbackOn = MyException.class)
    public void bundledProcess() throws MyException {
        BundledProcessEvent bundledProcessEvent = new BundledProcessEvent();
        this.applicationEventPublisher.publishEvent(bundledProcessEvent);

        process1Service.createFileRESTcall();
        bundledProcessEvent.setCreateFileSuccess(true);
        process2Service.addFilePermissionsRESTcall();
        bundledProcessEvent.setAddFilePermissionsSuccess(true);
        process3Service.addFileMetadataRESTcall();
        bundledProcessEvent.setAddFileMetadataSuccess(true);
        process4Service.addFileTimestampRESTcall();
        bundledProcessEvent.setAddFileTimestampSuccess(true);
    }

    @TransactionalEventListener(phase = TransactionPhase.AFTER_ROLLBACK)
    public void rollback(BundledProcessEvent bundledProcessEvent) {
        // If the last process call succeeded, we should not even be in this
        // rollback method
        //if (bundledProcessEvent.isAddFileTimestampSuccess()) {
        //    // remove timestamp
        //}
        if (bundledProcessEvent.isAddFileMetadataSuccess()) {
            // remove metadata
        }
        if (bundledProcessEvent.isAddFilePermissionsSuccess()) {
            // remove file permissions
        }
        if (bundledProcessEvent.isCreateFileSuccess()) {
            // remove file
        }
    }
}
Your operation looks like a transaction, so you can use the @Transactional annotation. From your code I can't really tell how you are handling the HTTP responses for each of those operations, but you should consider having your service methods return them, and then decide on a rollback depending on the responses. You can create an array of operations like so, but exactly how you structure the logic is up to you.
private Process[] restCalls = new Process[] {
    new Process() { public void call() { process1Service.createFileRESTcall(); } },
    new Process() { public void call() { process2Service.addFilePermissionsRESTcall(); } },
    new Process() { public void call() { process3Service.addFileMetadataRESTcall(); } },
    new Process() { public void call() { process4Service.addFileTimestampRESTcall(); } },
};

interface Process {
    void call();
}
@Transactional(rollbackOn = Exception.class)
public void bundledProcess() {
    restCalls[0].call();
    ... // say, see which process returned a wrong response code
}

@TransactionalEventListener(phase = TransactionPhase.AFTER_ROLLBACK)
public void rollback() {
    // handle rollback according to the failed method index
}
Check this article. Might come in handy.
The answer to this question is quite broad. There are too many ways of doing distributed transactions to go through them all here. However, since you are using Java and Spring, your best bet is something like JTA (Java Transaction API), which enables distributed transactions across multiple services/instances. Fortunately, Spring Boot supports JTA using either Atomikos or Bitronix. You can read the docs here.
One approach to enabling distributed transactions is to go through a message broker such as JMS, RabbitMQ, Kafka, ActiveMQ, etc., and use a protocol like XA transactions (two-phase commit). For external services that do not support distributed transactions, one approach is to write a wrapper service around the external service that understands XA transactions.
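If full XA support is not available for these external REST endpoints, the undo idea from the question can also be implemented directly as a small compensation (saga-style) helper: run each step, record its undo action only after it succeeds, and on failure replay the recorded undo actions in reverse order. A minimal sketch; the class and method names are illustrative:
import java.util.ArrayDeque;
import java.util.Deque;

public class CompensatingRunner {

    // Deque used as a stack: the most recent compensation is undone first.
    private final Deque<Runnable> compensations = new ArrayDeque<>();

    /** Runs the action; records its compensation only if the action succeeded. */
    public void step(Runnable action, Runnable compensation) {
        action.run();
        compensations.push(compensation);
    }

    /** Undoes all successful steps in reverse order; an undo may itself fail. */
    public void undoAll() {
        while (!compensations.isEmpty()) {
            try {
                compensations.pop().run();
            } catch (RuntimeException e) {
                e.printStackTrace(); // log and keep undoing the remaining steps
            }
        }
    }
}
Used against the question's services it would read roughly like the following, assuming the calls throw unchecked exceptions (the remove* counterparts for the last two steps are hypothetical names):
CompensatingRunner runner = new CompensatingRunner();
try {
    runner.step(process1Service::createFileRESTcall, process1Service::deleteFileRESTcall);
    runner.step(process2Service::addFilePermissionsRESTcall, process2Service::removeFilePermissionsRESTcall);
    runner.step(process3Service::addFileMetadataRESTcall, process3Service::removeFileMetadataRESTcall);
    runner.step(process4Service::addFileTimestampRESTcall, process4Service::removeFileTimestampRESTcall);
} catch (RuntimeException e) {
    runner.undoAll();
    throw e;
}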
I've been using Spring Batch for a few months now.
I used to store execution-related variables (like page count, item count, the current position of a batch, and so on) in beans. Those beans were then wired into the ItemReader, ItemProcessor, and ItemWriter through setters and getters, and shared among threads with manual synchronization.
But now I've found out this could be the wrong way of doing batch jobs. Beans wired into ItemReaders can't be persisted in the JobRepository and are therefore unable to record state for stopping and restarting a job. So I need to go back and use StepExecution/JobExecution.
The examples I found online are all based either on XML config or, worse, on SpEL autowired into a setter method.
I use pure Java config. Is there a Java-config or code-oriented way of accessing the StepExecution? What's the best practice for accessing the various sorts of ExecutionContext?
To get access to the StepExecution and the JobExecution, your ItemReader, ItemProcessor, or ItemWriter will have to implement StepExecutionListener.
For instance:
public class MyCustomItemWriter implements ItemWriter<Object>, StepExecutionListener {

    private StepExecution stepExecution;

    @Override
    public void beforeStep(StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        return stepExecution.getExitStatus();
    }

    @Override
    public void write(List<? extends Object> list) throws Exception {
        if (null == list || list.isEmpty()) {
            throw new Exception("Cannot write null or empty list");
        }

        ExecutionContext stepExecContext = this.stepExecution.getExecutionContext();
        ExecutionContext jobExecContext = this.stepExecution.getJobExecution().getExecutionContext();

        // TODO: Write your code here
    }
}
To get access to the StepExecution and JobExecution, you can either annotate methods with the annotations from the org.springframework.batch.core.annotation package or implement interfaces like JobExecutionListener and StepExecutionListener, depending on your needs.
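For example, the same writer using the annotation style (a sketch; the writer still has to be registered as a listener on the step for the annotated method to be picked up):
import java.util.List;

import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.annotation.BeforeStep;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemWriter;

public class MyAnnotatedItemWriter implements ItemWriter<Object> {

    private StepExecution stepExecution;

    @BeforeStep
    public void saveStepExecution(StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }

    @Override
    public void write(List<? extends Object> items) throws Exception {
        ExecutionContext stepContext = this.stepExecution.getExecutionContext();
        ExecutionContext jobContext = this.stepExecution.getJobExecution().getExecutionContext();
        // use the contexts as needed
    }
}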
I'm trying to unit test an existing ZK controller, and I want to find a way to handle a call like the following while unit testing it:
Sessions.getCurrent().setAttribute("from", from.getValue());
I'd be happy either to replace the offending code or to find a way around it for the unit test. My goal is testability, which means dealing with the NullPointerException.
My test is simple (it's not too bad a place to start...):
@Test
public void zkControllerDoesMockingInitialisedSuccessfully() throws Exception {
    T2TripBigDaoInterface tripBigDao = createMock(T2TripBigDao.class);
    ZkFieldValidator fieldValidator = createMock(ZkTextFieldValidator.class);
    FieldRangeValidator rangeValidator = createMock(DefaultFieldRangeValidator.class);
    TripController controller = new TripController(tripBigDao, fieldValidator, rangeValidator);
    replay(tripBigDao, fieldValidator, rangeValidator);

    controller.onClick$getTrips(new Event("getTrips"));

    verify(tripBigDao, fieldValidator, rangeValidator);
    // Test purpose: just get a unit test of the controller running to start with...
}
Extract of the controller:
public class TripController extends GenericForwardComposer {
    ....

    public void onClick$getTrips(Event event) throws Exception {
        Sessions.getCurrent().setAttribute("from", from.getValue());
        Sessions.getCurrent().setAttribute("to", to.getValue());
        ....
    }
    ....
Extract of the stack trace:
java.lang.NullPointerException
at com.t2.webservice.controller.alert.TripController.onClick$getTrips(TripController.java:72)
at com.t2.webservice.controller.alert.TripControllerTest.zkControllerDoesMockingInitialisedSuccessfully(TripControllerTest.java:45)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
This is one of the things I dislike most about ZK: their use of singletons and the impact that has on testability.
What I end up doing is removing any references to their singletons (Sessions, Executions, Selectors) from my controllers. In normal operation these singletons get used, but in tests they can be mocked out.
How you go about this is up to you; I still haven't found a pattern I'm in love with.
Here's one idea:
public class TripController extends GenericForwardComposer {

    private final TripSessionManager tripSessionManager;

    public TripController() {
        // ZK calls the default constructor
        this(new ZKTripSessionManager());
    }

    protected TripController(TripSessionManager tripSessionManager) {
        this.tripSessionManager = tripSessionManager;
    }

    public void onClick$getTrips(Event event) throws Exception {
        tripSessionManager.setTo(to.getValue());
        tripSessionManager.setFrom(from.getValue());
    }
}
Your TripSessionManager would then look like this:
public interface TripSessionManager {
    void setTo(String to);
    void setFrom(String from);
}
With the default ZK implementation relying on the Sessions singleton:
public class ZKTripSessionManager implements TripSessionManager {

    public void setTo(String to) {
        setAttribute("to", to);
    }

    public void setFrom(String from) {
        setAttribute("from", from);
    }

    private void setAttribute(String name, String value) {
        // only valid if called in a ZK-managed thread
        Sessions.getCurrent().setAttribute(name, value);
    }
}
By abstracting out the implementation, you can test your controller with a mock TripSessionManager:
@Test
public void test() throws Exception {
    TripSessionManager mockTripSessionManager = mock(TripSessionManager.class);
    TripController controller = new TripController(mockTripSessionManager);

    controller.onClick$getTrips(new Event("getTrips"));

    // setTo/setFrom return void, so verify the interactions rather than stubbing them
    verify(mockTripSessionManager).setTo(anyString());
    verify(mockTripSessionManager).setFrom(anyString());
}
You could also imagine different ways of managing these new dependencies (e.g. avoiding new ZKTripSessionManager()), using dependency injection frameworks like Spring or Guice.