Spring Boot batch - MultiResourceItemReader: move to next file on error

In a batch service, I read multiple XML files using a MultiResourceItemReader, which delegates to a StaxEventItemReader.
If an error is raised while reading a file (a parsing exception, for example), I would like to tell Spring to start reading the next matching file, for example by using the @OnReadError annotation and/or a SkipPolicy.
Currently, when a reading exception is raised, the batch stops.
Does anyone have an idea how to do this?
EDIT: I see MultiResourceItemReader has a method readNextItem(), but it's private -_-

I haven't used SB for a while, but looking at the MultiResourceItemReader code, I suppose you can write your own ResourceAwareItemReaderItemStream wrapper that checks a flag: if the flag is set, it moves to the next file; otherwise it performs a standard read using a delegate.
This flag can be stored in the execution context or in your wrapper, and it should be cleared after moving to the next file.
class MoveNextReader<T> implements ResourceAwareItemReaderItemStream<T> {
    private ResourceAwareItemReaderItemStream<T> delegate;
    private boolean skipThisFile = false;
    public void setSkipThisFile(boolean value) {
        skipThisFile = value;
    }
    public void setResource(Resource resource) {
        skipThisFile = false;
        delegate.setResource(resource);
    }
    public T read() throws Exception {
        if (skipThisFile) {
            skipThisFile = false;
            // This forces MultiResourceItemReader to move to the next resource
            return null;
        }
        return delegate.read();
    }
    // open(), update() and close() are omitted here; they should forward to the delegate
}
Use this class as the delegate for MultiResourceItemReader, and in @OnReadError inject MoveNextReader and set MoveNextReader.skipThisFile to true.
I can't test this code myself, but I hope it is a good starting point.
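The mechanism relies on one point made in the code comment above: a null returned by the delegate is what makes the outer reader move on to the next resource. The following is a minimal plain-JDK simulation of that idea, not Spring Batch code. SimpleReader, SkipFlagReader and SkipFlagDemo are hypothetical names, the loop in run() only stands in for MultiResourceItemReader, and the catch block plays the role of the @OnReadError callback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for an item reader: null means "no more items". */
interface SimpleReader<T> {
    T read();
}

/** Wrapper with a skip flag, in the spirit of MoveNextReader. */
class SkipFlagReader<T> implements SimpleReader<T> {
    private SimpleReader<T> delegate;
    private boolean skipThisFile = false;

    void setDelegate(SimpleReader<T> delegate) {
        this.delegate = delegate;
        this.skipThisFile = false; // a new source always starts with a clean flag
    }

    void setSkipThisFile(boolean value) {
        this.skipThisFile = value;
    }

    @Override
    public T read() {
        if (skipThisFile) {
            skipThisFile = false;
            return null; // pretend the current source is exhausted
        }
        return delegate.read();
    }
}

class SkipFlagDemo {
    static List<String> run() {
        // Two fake "files"; the item "BAD" simulates a parsing error.
        List<List<String>> files = Arrays.asList(
                Arrays.asList("a1", "BAD", "a3"),
                Arrays.asList("b1", "b2"));
        SkipFlagReader<String> reader = new SkipFlagReader<>();
        List<String> out = new ArrayList<>();
        for (List<String> file : files) {
            Iterator<String> it = file.iterator();
            reader.setDelegate(() -> {
                if (!it.hasNext()) {
                    return null;
                }
                String item = it.next();
                if (item.equals("BAD")) {
                    throw new IllegalStateException("parse error");
                }
                return item;
            });
            while (true) {
                try {
                    String item = reader.read();
                    if (item == null) {
                        break; // end of this source: continue with the next file
                    }
                    out.add(item);
                } catch (IllegalStateException e) {
                    reader.setSkipThisFile(true); // what the read-error callback would do
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [a1, b1, b2]
    }
}
```

Everything after the failing item in the first file is dropped, which matches the intent of skipping the whole resource rather than a single record.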

Here are my final classes to read multiple XML files and jump to the next file when a read error occurs on one (thanks to Luca's idea).
My custom ItemReader, extended from MultiResourceItemReader :
public class MyItemReader extends MultiResourceItemReader<InputElement> {
    private SkippableResourceItemReader<InputElement> reader;
    public MyItemReader() throws IOException {
        super();
        // Resources
        PathMatchingResourcePatternResolver resourceResolver = new PathMatchingResourcePatternResolver();
        this.setResources( resourceResolver.getResources( "classpath:input/inputFile*.xml" ) );
        // Delegate reader
        reader = new SkippableResourceItemReader<InputElement>();
        StaxEventItemReader<InputElement> delegateReader = new StaxEventItemReader<InputElement>();
        delegateReader.setFragmentRootElementName("inputElement");
        Jaxb2Marshaller unmarshaller = new Jaxb2Marshaller();
        unmarshaller.setClassesToBeBound( InputElement.class );
        delegateReader.setUnmarshaller( unmarshaller );
        reader.setDelegate( delegateReader );
        this.setDelegate( reader );
    }
    [...]
    @OnReadError
    public void onReadError( Exception exception ){
        reader.setSkipResource( true );
    }
}
And the ItemReader-in-the-middle used to skip the current resource :
public class SkippableResourceItemReader<T> implements ResourceAwareItemReaderItemStream<T> {
    private ResourceAwareItemReaderItemStream<T> delegate;
    private boolean skipResource = false;
    @Override
    public void close() throws ItemStreamException {
        delegate.close();
    }
    @Override
    public T read() throws UnexpectedInputException, ParseException, NonTransientResourceException, Exception {
        if( skipResource ){
            skipResource = false;
            return null;
        }
        return delegate.read();
    }
    @Override
    public void setResource( Resource resource ) {
        skipResource = false;
        delegate.setResource( resource );
    }
    @Override
    public void open( ExecutionContext executionContext ) throws ItemStreamException {
        delegate.open( executionContext );
    }
    @Override
    public void update( ExecutionContext executionContext ) throws ItemStreamException {
        delegate.update( executionContext );
    }
    public void setDelegate(ResourceAwareItemReaderItemStream<T> delegate) {
        this.delegate = delegate;
    }
    public void setSkipResource( boolean skipResource ) {
        this.skipResource = skipResource;
    }
}

Related

Resilience4j context propagator not able to propagate thread local values

I am trying to migrate my circuit breaker code from Hystrix to Resilience4j. The communication is between two applications out of which one is an artifact containing all the resilience 4j config in the java code itself and the second application which is a microservice uses it directly.
There's a RequestId which is generated in the microservice and propagated to the artifact context, where it gets printed in the logs. With Hystrix this worked perfectly fine, but ever since I moved to Resilience4j, I am getting null for the request ID.
Below is my config for the bulkhead and the context propagator:
ThreadPoolBulkheadConfig bulkheadConfig = ThreadPoolBulkheadConfig.custom()
    .maxThreadPoolSize(maxThreadPoolSize)
    .coreThreadPoolSize(coreThreadPoolSize)
    .queueCapacity(queueCapacity)
    .contextPropagator(new DummyContextPropagator())
    .build();
// Bulkhead registry
ThreadPoolBulkheadRegistry bulkheadRegistry = ThreadPoolBulkheadRegistry.of(bulkheadConfig);
// Create bulkhead
ThreadPoolBulkhead bulkhead = bulkheadRegistry.bulkhead(name, bulkheadConfig);
Dummy Context Propagator :
public class DummyContextPropagator implements ContextPropagator {
    private static final Logger log = LoggerFactory.getLogger( DummyContextPropagator.class);
    @Override
    public Supplier<Optional<Object>> retrieve() {
        return () -> (Optional<Object>) get();
    }
    @Override
    public Consumer<Optional<Object>> copy() {
        return (t) -> t.ifPresent(e -> {
            clear();
            put(e);
        });
    }
    @Override
    public Consumer<Optional<Object>> clear() {
        return (t) -> DummyContextHolder.clear();
    }
    public static class DummyContextHolder {
        private static final ThreadLocal threadLocal = new ThreadLocal();
        private DummyContextHolder() {
        }
        public static void put(Object context) {
            if (threadLocal.get() != null) {
                clear();
            }
            threadLocal.set(context);
        }
        public static void clear() {
            if (threadLocal.get() != null) {
                threadLocal.set(null);
                threadLocal.remove();
            }
        }
        public static Optional<Object> get() {
            return Optional.ofNullable(threadLocal.get());
        }
    }
}
However, nothing seems to work so that I can get the RequestId.
Am I doing everything right or is there another way to do that ?
I think you want to read the parent thread's thread-local values from within the child thread. Hystrix does this by using its command model to decorate the Callable task.
In Resilience4j, I think you can fix it like this:
@Resource
DispatcherServlet dispatcherServlet;
@PostConstruct
public void changeThreadLocalModel() {
    dispatcherServlet.setThreadContextInheritable(true);
}
I found that my previous answer may lead to problems: calling dispatcherServlet.setThreadContextInheritable(true)
may pollute the ThreadLocal map of your custom thread pool.
So here is my final solution, which only works with Resilience4j:
@Resource
Resilience4jBulkheadProvider resilience4jBulkheadProvider;
@PostConstruct
public void concurrentThreadContextStrategy() {
    ThreadPoolBulkheadConfig threadPoolBulkheadConfig = ThreadPoolBulkheadConfig.custom().contextPropagator(new CustomInheritContextPropagator()).build();
    resilience4jBulkheadProvider.configureDefault(id -> new Resilience4jBulkheadConfigurationBuilder()
        .bulkheadConfig(BulkheadConfig.ofDefaults()).threadPoolBulkheadConfig(threadPoolBulkheadConfig)
        .build());
}
private static class CustomInheritContextPropagator implements ContextPropagator<RequestAttributes> {
    @Override
    public Supplier<Optional<RequestAttributes>> retrieve() {
        // Hand out a reference to the request context held in the ThreadLocal.
        // This method is called by the web container thread (Tomcat, Jetty or Undertow, depending on what you use).
        return () -> Optional.ofNullable(RequestContextHolder.getRequestAttributes());
    }
    @Override
    public Consumer<Optional<RequestAttributes>> copy() {
        // Load the request context into the thread that performs the actual call.
        // This method is called by the Resilience4j bulkhead thread.
        return requestAttributes -> requestAttributes.ifPresent(context -> {
            RequestContextHolder.resetRequestAttributes();
            RequestContextHolder.setRequestAttributes(context);
        });
    }
    @Override
    public Consumer<Optional<RequestAttributes>> clear() {
        // Finally, clean up the request context.
        // This method is called by the Resilience4j bulkhead thread.
        return requestAttributes -> RequestContextHolder.resetRequestAttributes();
    }
}
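The retrieve/copy/clear life cycle described in the comments above can be mimicked with plain JDK classes, which may make the threading easier to follow. This is only a sketch and not Resilience4j code: RequestIdPropagationDemo, REQUEST_ID and propagating() are hypothetical names, and a single-thread executor stands in for the bulkhead's thread pool.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RequestIdPropagationDemo {
    // Hypothetical holder, playing the role of DummyContextHolder / RequestContextHolder.
    static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    /** Wraps a task so the caller's request id is visible while it runs on a pool thread. */
    static <T> Callable<T> propagating(Callable<T> task) {
        // "retrieve": runs on the submitting thread and captures its value.
        Optional<String> captured = Optional.ofNullable(REQUEST_ID.get());
        return () -> {
            // "copy": runs on the worker thread before the task.
            captured.ifPresent(REQUEST_ID::set);
            try {
                return task.call();
            } finally {
                // "clear": pooled threads are reused, so always clean up.
                REQUEST_ID.remove();
            }
        };
    }

    /** Returns what the worker thread sees without and with propagation. */
    static String[] run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        REQUEST_ID.set("req-42");
        try {
            Callable<String> readId = () -> String.valueOf(REQUEST_ID.get());
            String plain = pool.submit(readId).get();
            String wrapped = pool.submit(propagating(readId)).get();
            return new String[] { plain, wrapped };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            REQUEST_ID.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println(result[0] + " / " + result[1]); // prints null / req-42
    }
}
```

The undecorated task sees no value because a plain ThreadLocal is never shared between threads; the value has to be captured on the calling thread and set again on the worker.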
I had the same problem with Spring Boot 2.5 and Spring Cloud 2020.0.6,
and I solved it with an implementation of ContextPropagator:
public class SleuthPropagator implements ContextPropagator<TraceContext> {
    ThreadLocal<ScopedSpan> scopedSpanThreadLocal = new ThreadLocal<>();
    @Override
    public Supplier<Optional<TraceContext>> retrieve() {
        return this::getCurrentcontext;
    }
    @Override
    public Consumer<Optional<TraceContext>> copy() {
        return c -> {
            if (!c.isPresent()) {
                return;
            }
            TraceContext traceContext = c.get();
            ScopedSpan resilience4jSpan = getTracer()
                .map(t -> t.startScopedSpanWithParent("Resilience4j", traceContext))
                .orElse(null);
            scopedSpanThreadLocal.set(resilience4jSpan);
        };
    }
    @Override
    public Consumer<Optional<TraceContext>> clear() {
        return t -> {
            try {
                ScopedSpan resilience4jSpan = scopedSpanThreadLocal.get();
                if (resilience4jSpan != null) {
                    resilience4jSpan.finish();
                }
            } finally {
                scopedSpanThreadLocal.remove();
            }
        };
    }
    private static Optional<Tracer> getTracer() {
        return Optional.ofNullable(Tracing.current())
            .map(Tracing::tracer);
    }
    private Optional<TraceContext> getCurrentcontext() {
        return getTracer()
            .map(Tracer::currentSpan)
            .map(Span::context);
    }
}
Then register the propagator by adding this to your application.properties:
resilience4j.thread-pool-bulkhead.instances.YOUR_BULKHEAD_CONFIG.context-propagators=com.your.package.SleuthPropagator

JUnit/Mockito: How to mock or create a private member variable

I have a private String variable filePath that will be set in the SpringBoot's execute(..) method and then the value will be used in another method that will be called from inside this execute(..).
@Component("filebatchjobtask")
public class FileBatchJobTask extends BaseFileBatchJobTask implements Tasklet {
    private String filePath; // PRIVATE VARIABLE THAT WILL BE USED IN A CALL
    private static final CalLogger LOGGER = CalLoggerFactory.getLogger(FileBatchJobTask.class);
    @Override
    public RepeatStatus execute(final StepContribution stepContribution, final ChunkContext chunkContext) throws Exception {
        // INITIALIZE PRIVATE VARIABLE HERE
        filePath = chunkContext.getStepContext().getJobParameters().get(Constants.FILEPATH).toString();
        processeFile(); // METHOD CALL WHERE FILEPATH INITIALIZED ABOVE WILL BE USED
        return RepeatStatus.FINISHED;
    }
    @Override
    protected void processeFile() throws IOException {
        LOGGER.warn("FileBatchJobTask:processeFile():: Directory to process files: " + filePath);
        File[] filelist = geteFiles(filePath); // THIS IS THE CALL I WANT TO MOCK
        if (filelist == null || filelist.length < 1) {
            LOGGER.warn("FileBatchJobTask: No eFiles available to process");
            return;
        }
        LOGGER.warn("Total number of files to process: " + filelist.length);
    }
}
Its corresponding test is below:
//@RunWith(PowerMockRunner.class)
@RunWith(MockitoJUnitRunner.class)
public class FileBatchJobTaskTest extends BaseFileBatchJobTaskTest {
    @InjectMocks
    FileBatchJobTask fileBatchJobTask;
    @Override
    BaseFileBatchJobTask createFileBatchJobTask() {
        return fileBatchJobTask;
    }
    @Test
    public void processeFile() {
        BaseFileBatchJobTask batchJobTask = Mockito.spy(createFileBatchJobTask());
        // THIS resourceDir IS WHAT I WANT TO USE INSTEAD OF THE filePath VARIABLE IN TESTS, TO PICK FILES FROM THIS TEST RESOURCE PATH
        Path resourceDir = Paths.get("src", "test", "resources", "data", "validation");
        resourcePath = resourceDir.toFile().getAbsolutePath();
        File fileDir = new File(resourcePath);
        File[] files = fileDir.listFiles(new FileFilter() {
            @Override
            public boolean accept(final File pathname) {
                String name = pathname.getName().toLowerCase();
                return name.endsWith(".xml") && pathname.isFile();
            }
        });
        doReturn(files).when(batchJobTask).geteFiles(anyString()); // THIS IS THE CALL I AM TRYING TO MOCK
        try {
            fileBatchJobTask.processeFile();
            Assert.assertTrue(true);
        } catch (...) {
        }
    }
}
This is the base class
class BaseFileBatchJobTask {
    protected File[] geteFiles(final String eFileDirPath) {
        File fileDir = new File(eFileDirPath); // NPE as eFileDirPath is null
        File[] files = fileDir.listFiles(new FileFilter() {
            @Override
            public boolean accept(final File pathname) {
                String name = pathname.getName().toLowerCase();
                return name.endsWith(".xml") && pathname.isFile();
            }
        });
        return files;
    }
}
ERROR: I am getting an NPE because, when the test runs, geteFiles() is executed and filePath is null. Since I am mocking the call, it shouldn't go into the actual implementation of the method. However, it seems it's not being mocked as expected, so I need help figuring out the issue.
Also looked up a lot of SO posts but couldn't figure out the issue so please don't mark as duplicate if you don't know the answer :)
You need to call processeFile() on the spied version of your jobTask, not on the original one. Think about a spy being a wrapper around the spied object, that intercepts the mocked calls.
In short, just use batchJobTask inside the try-catch block, like this:
try {
    batchJobTask.processeFile();
    Assert.assertTrue(true);
} catch (...) {
}

How to edit ExecutionContext spring batch

Is there any way to add an entry to the ExecutionContext other than in the read(), update(), and open() methods?
In the code below, I'm trying to add an entry in the close() method.
public class MyFileReader extends FlatFileItemReader<AccountDetails>{
    private long currentRowProcessedCount = 0;
    @Autowired
    private ExecutionContext executionContext;
    @Override
    public synchronized AccountDetails read() throws Exception, UnexpectedInputException, ParseException {
        AccountDetails accDetailsObj = super.read();
        currentRowProcessedCount++;
        return accDetailsObj;
    }
    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        super.open(executionContext);
        currentRowProcessedCount = executionContext.getLong(Constants.CONTEXT_COUNT_KEY.getStrValue(),0);
        this.executionContext = executionContext;
    }
    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        executionContext.putLong(Constants.CONTEXT_COUNT_KEY.getStrValue(), currentRowProcessedCount);
    }
    @Override
    public void close() throws ItemStreamException {
        System.out.println("close --------------"+currentRowProcessedCount);
        System.out.println(executionContext.getLong(Constants.CONTEXT_COUNT_KEY.getStrValue()));
        this.executionContext.putLong(Constants.CONTEXT_COUNT_KEY.getStrValue(), currentRowProcessedCount);
    }
}
In the above example I'm not able to update the entry.
It only works as read-only: I can read data but not write it.
class abc{
    @Autowired
    private ExecutionContext executionContext;
    public AccountDetails mapFieldSet(FieldSet fieldSet) throws BindException {
        executionContext.putLong(Constants.CONTEXT_COUNT_KEY.getStrValue(), 47);
        return accDetailsObj;
    }
}
I need to update the executionContext in other classes as well.
Is there any way?
You can just use put(String key, Object value) to overwrite an existing value.
ExecutionContext is backed by a ConcurrentHashMap, so if you really want to, you can get a reference to it via reflection and then use computeIfAbsent, etc.
Also, counting is already implemented in AbstractItemCountingItemStreamItemReader; if you inherit from it (and you do), this should already be solved.

Spring Batch FlatFileItemReader populating field value across all items/lines read

I am reading a flat file using the Spring Batch FlatFileItemReader.
I have a requestId field which I need to populate with the same unique value for all records read from one flat file.
E.g., when I read file1, I want to set the requestId field to 1 for all Item objects created. For file2, I need to set requestId to 2.
My requestId is uniquely generated by a separate class.
How can I achieve this using Spring Batch?
There are some possible solutions.
Use a ResourceAware item:
public class MyItem implements ResourceAware {
    private Resource resource;
    public String getId() {
        return createIdFromResource(resource);
    }
    private String createIdFromResource(final Resource resource) {
        // create your ID here
        return resource.getFilename();
    }
    @Override
    public void setResource(final Resource resource) {
        this.resource = resource;
    }
}
Use a listener (shown here with interfaces; a less verbose version using annotations is possible too):
public class TestListener implements StepExecutionListener, ItemReadListener<String> {
    private StepExecution stepExecution;
    private static final String CURRENT_ID = "currentId";
    @Override
    public void beforeStep(final StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }
    @Override
    public ExitStatus afterStep(final StepExecution stepExecution) {
        return null;
    }
    @Override
    public void beforeRead() {
    }
    @Override
    public void afterRead(final String item) {
        String currentId = null;
        if (stepExecution.getExecutionContext().containsKey(CURRENT_ID)) {
            currentId = stepExecution.getExecutionContext().getString(CURRENT_ID);
        } else {
            String fileName = stepExecution.getExecutionContext().getString("fileName");
            // ... create ID from FileName
            currentId = fileName + "foo";
            stepExecution.getExecutionContext().put(CURRENT_ID, currentId);
        }
    }
    @Override
    public void onReadError(final Exception ex) {
    }
}
In the above example the current fileName is available in the step ExecutionContext; you may instead have to pull it from the JobParameters and extract the file name:
String paramValue = stepExecution.getJobExecution().getJobParameters().getString("paramName");
// extract fileName from paramValue

Logging within a JUL log handler

I've created a new log handler for JUL which extends java.util.logging.Handler.
Is there any standardized way how I can react to errors that occur during the processing of the LogRecord (like LogLog in log4j)?
Just using a JUL-Logger results in another LogRecord that has to be processed by the very same handler.
So, I'm basically looking for a (standard) way how to log messages without creating an endless loop. At the moment, I'm comparing the sourceClassName to prevent such a loop.
Exceptions that occur inside of the handler should be reported using Handler.reportError(String, Exception, int). This reports failures to the installed ErrorManager which can be customized. Using that should take care of most of the endless loops.
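As a minimal sketch of that first approach, the handler below routes a publishing failure to a custom ErrorManager instead of a Logger, so no new LogRecord is created. ReportingHandler, CollectingErrorManager and ReportErrorDemo are hypothetical names, and the write step fails on purpose for the sake of the demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.ErrorManager;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

/** Hypothetical handler whose publishing step may fail. */
class ReportingHandler extends Handler {
    @Override
    public void publish(LogRecord record) {
        if (!isLoggable(record)) {
            return;
        }
        try {
            write(record);
        } catch (Exception e) {
            // Goes to the ErrorManager, not to a Logger, so no new LogRecord is created.
            reportError("could not publish record", e, ErrorManager.WRITE_FAILURE);
        }
    }

    // Stand-in for the real output step; it always fails in this demo.
    private void write(LogRecord record) throws Exception {
        throw new IllegalStateException("backend unavailable");
    }

    @Override
    public void flush() {
    }

    @Override
    public void close() {
    }
}

/** ErrorManager that collects failures instead of printing to System.err. */
class CollectingErrorManager extends ErrorManager {
    final List<String> errors = new ArrayList<>();

    @Override
    public synchronized void error(String msg, Exception ex, int code) {
        errors.add(code + ": " + msg + " (" + ex.getMessage() + ")");
    }
}

class ReportErrorDemo {
    static List<String> run() {
        ReportingHandler handler = new ReportingHandler();
        CollectingErrorManager manager = new CollectingErrorManager();
        handler.setErrorManager(manager);
        handler.publish(new LogRecord(Level.INFO, "hello"));
        return manager.errors;
    }

    public static void main(String[] args) {
        // prints [1: could not publish record (backend unavailable)]
        System.out.println(run());
    }
}
```

If no ErrorManager is installed, the default one writes only the first reported failure to System.err and ignores later ones, so a custom manager is mainly useful when every failure should be visible.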
However, if the act of publishing relies on a library that also generates log records, then you have to resort to detecting the loop. Use a java.lang.ThreadLocal and some sort of enum to track the state changes.
public class HandlerReentrance extends Handler {
    private static final Level PUBLISH = Level.ALL;
    private static final Level REPORT = Level.OFF;
    private static final ThreadLocal<Level> LOCAL = new ThreadLocal<>();
    @Override
    public void publish(LogRecord record) {
        if (LOCAL.get() == null) {
            LOCAL.set(PUBLISH);
            try {
                doPublish(record);
            } finally {
                LOCAL.remove();
            }
        } else {
            final Level last = LOCAL.get();
            if (PUBLISH.equals(last)) {
                LOCAL.set(REPORT);
                try {
                    reportLoop(record);
                } finally {
                    LOCAL.set(last);
                }
            }
        }
    }
    private void doPublish(LogRecord record) {
        if (isLoggable(record)) {
            //Insert code.
        }
    }
    private void reportLoop(LogRecord record) {
        //Insert code.
    }
    @Override
    public void flush() {
    }
    @Override
    public void close() {
    }
}
At the moment, I'm going for this solution:
public class MyHandler extends Handler {
    private final static String className = MyHandler.class.getName();
    private final static Logger logLogger = Logger.getLogger(className);
    ...
    @Override
    public void publish(LogRecord record) {
        if (!super.isLoggable(record)) {
            return;
        }
        String loggerName = record.getLoggerName(); //EDIT: Was getSourceClassName before
        if (loggerName != null && loggerName.equals(className)) {
            // This is our own log line; returning immediately to prevent endless loops
            return;
        }
        try {
            // ... do actual handling...
        } catch(Exception e) {
            logLogger.logp(Level.SEVERE, className, "publish", "something went wrong", e);
        }
    }
}
