Handle json parse error without crashing Kafka stream processor application - java

I have a Kafka Streams application which maps/transforms JSON messages and streams the output to a topic.
KStream<String, String> logMessageStream = builder.stream(inputTopic, Consumed.with(stringSerde, stringSerde));
logMessageStream.map((k, v) -> { //Map record
try { // Map record to (requestId, message)
// readValue throws IOException, JsonParseException, JsonMappingException
LogMessage logMessage = objectMapper.readValue(v, LogMessage.class);
return new KeyValue<>(logMessage.requestId(), logMessage);
} catch (IOException e) {
e.printStackTrace();
}
return null; // <== RETURNS null due to caught exception
}).to(outputTopic);
Now, if the input record contains JSON with invalid syntax, parsing fails and the stream application crashes with:
java.lang.NullPointerException
at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:42)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:146)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:129)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:93)
....
I want to consume this error while mapping and continue processing the other messages. Is there any handler I can set to consume the exception? Looking for suggestions.
Thanks..

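You can also take advantage of the StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG property, as detailed on https://docs.confluent.io/current/streams/faq.html#handling-corrupted-records-and-deserialization-errors-poison-pill-records . Note that this handler only covers exceptions thrown while a record is being deserialized, so it helps here only if the JSON parsing is done by the value serde rather than inside map().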
Properties streamsSettings = new Properties();
streamsSettings.put(
StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
LogAndContinueExceptionHandler.class.getName()
);

Instead of using map() you can use flatMap() that allows you to return zero elements. Returning null from a map() is not allowed as pointed out in the JavaDocs:
The provided {@link KeyValueMapper} must return a {@link KeyValue} type and must not return {@code null}.
Note that flatMap() does not allow returning null either. But it accepts anything you can iterate over (i.e., an Iterable). For example, you can return Collections.singleton() on success and Collections.emptySet() on failure.
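A minimal sketch of that pattern, using only the JDK so it runs without Kafka on the classpath. The parse helper, the flatMapAll loop and the Integer payload are hypothetical stand-ins for objectMapper.readValue, the flatMap operator and LogMessage; in the real topology the body of parse would go into the lambda passed to flatMap() and would return KeyValue instances.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class FlatMapSkipSketch {

    // Hypothetical stand-in for the JSON parsing step: one entry on success,
    // an empty collection (never null) when the value cannot be parsed.
    static Iterable<Map.Entry<String, Integer>> parse(String key, String value) {
        try {
            Map.Entry<String, Integer> entry = new SimpleEntry<>(key, Integer.valueOf(value));
            return Collections.singleton(entry);
        } catch (NumberFormatException e) {
            // Log here if needed; the bad record is simply dropped.
            return Collections.emptySet();
        }
    }

    // Mimics what a flatMap-style operator does: it forwards every element of
    // each returned Iterable, so empty results contribute nothing downstream.
    static List<Map.Entry<String, Integer>> flatMapAll(List<Map.Entry<String, String>> records) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, String> record : records) {
            for (Map.Entry<String, Integer> mapped : parse(record.getKey(), record.getValue())) {
                out.add(mapped);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> input = new ArrayList<>();
        input.add(new SimpleEntry<>("a", "1"));
        input.add(new SimpleEntry<>("b", "not-a-number"));
        input.add(new SimpleEntry<>("c", "3"));
        // Only the two parseable records survive.
        System.out.println(flatMapAll(input));
    }
}
```

The point of the design is that a failed record becomes "zero outputs" rather than a null output, so no separate filter step is needed afterwards.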

Just take a look at setUncaughtExceptionHandler method:
KafkaStreams streams = new KafkaStreams(topology, props);
streams.setUncaughtExceptionHandler((Thread t, Throwable e) -> {
// your logic here
});

Related

kafka streams: publish/send messages even when few record transformation throw exceptions?

A typical Kafka Streams application flow is as below (not including all steps, such as props/serdes) -
final StreamsBuilder builder = new StreamsBuilder();
final KStream<String, String> textLines = builder.stream(inputTopic);
final KStream<String, String> textTransformation_1 = textLines.processValues(value -> value+"firstTranstormation");
final KStream<String, String> textTransformation_2 = textTransformation_1.processValues(value -> value+"secondTranstormation");
//my concern is at this stage -
final KStream<String, String> textTransformation_3 = textTransformation_2.processValues(this::processValueAndDoRelatedStuff);
....
....
textTransformation_x.to(outputTopic, Produced.with(Serdes.String(), Serdes.Long()));
final KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfiguration);
streams.start();
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
Now if the processValueAndDoRelatedStuff(String input) method throws an error, I don't want the program to crash. I want Kafka to skip sending that one transformation output to outputTopic (i.e. ignore the transformation of that one record) and continue processing the rest of the incoming messages normally.
Is the above possible?
In general, there is a way to skip sending transformation output to outputTopic based on a predicate. If I catch the exception in processValueAndDoRelatedStuff(String input) and return some marker value, I can add a filter in the next stage that drops records carrying that value.
final KStream<String, String> textTransformation_4 = textTransformation_3.filter((k,v) -> !v.equals("badrecord"));
But I am more interested in the case where the exception is not handled but thrown from the mapper function. Is it possible for Kafka to ignore the one record causing an exception and still proceed with the rest of the processing?
The default behavior is to stop the topology on any uncaught exception.
If you want to catch them, simply don't use a method reference. Use a try-catch around the function call:
final KStream<String, String> textTransformation_3 = textTransformation_2.processValues(value -> {
try {
return processValueAndDoRelatedStuff(value);
} catch (Exception e) {
// log, if you want
return null;
}
}).filter((k, v) -> Objects.nonNull(v)); // remove events that caused exceptions
Otherwise, you can set exception handlers, as well - https://developer.confluent.io/learn-kafka/kafka-streams/error-handling/

why do I receive an empty string when mapping Mono<Void> to Mono<String>?

I am developing a REST API using Spring WebFlux, but I have problems when uploading files. They are stored, but I don't get the expected return value.
This is what I do:
Receive a Flux<Part>
Cast Part to FilePart.
Save parts with transferTo() (this returns a Mono<Void>)
Map the Mono<Void> to Mono<String>, using file name.
Return Flux<String> to client.
I expect file name to be returned, but client gets an empty string.
Controller code
@PostMapping(value = "/muscles/{id}/image")
public Flux<String> updateImage(@PathVariable("id") String id, @RequestBody Flux<Part> file) {
log.info("REST request to update image to Muscle");
return storageService.saveFiles(file);
}
StorageService
public Flux<String> saveFiles(Flux<Part> parts) {
log.info("StorageService.saveFiles({})", parts);
return
parts
.filter(p -> p instanceof FilePart)
.cast(FilePart.class)
.flatMap(file -> saveFile(file));
}
private Mono<String> saveFile(FilePart filePart) {
log.info("StorageService.saveFile({})", filePart);
String filename = DigestUtils.sha256Hex(filePart.filename() + new Date());
Path target = rootLocation.resolve(filename);
try {
Files.deleteIfExists(target);
File file = Files.createFile(target).toFile();
return filePart.transferTo(file)
.map(r -> filename);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
FilePart.transferTo() returns Mono<Void>, which signals when the operation is done - this means the reactive Publisher will only publish an onComplete/onError signal and will never publish a value before that.
This means that the map operation was never executed, because it's only given elements published by the source.
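To make this concrete without pulling in Reactor, here is a toy model written against the JDK only. The Source interface and its map/thenReturn methods are hypothetical, heavily simplified stand-ins for Mono and its operators (no errors, no backpressure); they only illustrate that map reacts to emitted values while thenReturn reacts to completion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

public class EmptySourceSketch {

    // Toy stand-in for a reactive source: zero or one onNext, then onComplete.
    interface Source<T> {
        void subscribe(Consumer<T> onNext, Runnable onComplete);

        // map only transforms values that are actually emitted.
        default <R> Source<R> map(Function<T, R> mapper) {
            return (onNext, onComplete) ->
                    subscribe(value -> onNext.accept(mapper.apply(value)), onComplete);
        }

        // thenReturn ignores emitted values and emits a fixed value on completion.
        default <R> Source<R> thenReturn(R value) {
            return (onNext, onComplete) ->
                    subscribe(ignored -> { }, () -> {
                        onNext.accept(value);
                        onComplete.run();
                    });
        }
    }

    // Like transferTo(): signals completion without ever emitting a value.
    static Source<Void> transfer() {
        return (onNext, onComplete) -> onComplete.run();
    }

    // Subscribes and gathers everything the source emits.
    static <T> List<T> collect(Source<T> source) {
        List<T> seen = new ArrayList<>();
        source.subscribe(seen::add, () -> { });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(collect(transfer().map(v -> "file.bin")));   // prints []
        System.out.println(collect(transfer().thenReturn("file.bin"))); // prints [file.bin]
    }
}
```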
You can return the name of the file and still chain reactive operators, like this:
return part.transferTo(file).thenReturn(part.filename());
It is forbidden to use the block operator within a reactive pipeline and it even throws an exception at runtime as of Reactor 3.2.
Using subscribe as an alternative is not good either, because subscribe will decouple the transferring process from your request processing, making those happen in different execution sequences. This means that your server could be done processing the request and close the HTTP connection while the other part is still trying to read the file part to copy it on disk. This is likely to fail in subtle ways at runtime.
FilePart.transferTo() returns a Mono<Void>, which is always empty, so the map after it was never executed. I solved it by doing this:
private Mono<String> saveFile(FilePart filePart) {
log.info("StorageService.saveFile({})", filePart);
String filename = DigestUtils.sha256Hex(filePart.filename() + new Date());
Path target = rootLocation.resolve(filename);
try {
Files.deleteIfExists(target);
File file = Files.createFile(target).toFile();
return filePart
.transferTo(file)
.doOnSuccess(data -> log.info("do something..."))
.thenReturn(filename);
} catch (IOException e) {
throw new RuntimeException(e);
}
}

RxJava: OnErrorFailedException. Identifying the correct cause

Being inspired by T.Nurkiewicz's "Reactive Programming with RxJava" I tried to apply it in a project that I am working on and here's the issue that I am facing.
I have a Rest end point that takes an input stream and a username and either returns a link for the updated username or returns a Bad Request error. Here's how I tried to implement this using RxJava:
@PUT
@Path("{username}")
public Response updateCredential(@PathParam("username") final String username, InputStream stream) {
CredentialCandidate candidate = new CredentialCandidate();
Observable.just(repository.getByUsername(username))
.subscribe(
credential -> {
serializeCandidate(candidate, stream);
try {
repository.updateCredential(build(credential, candidate));
} catch (Exception e) {
String msg = "Failed to update credential +\""+username+"\": "+e.getMessage();
throw new BadRequestException(msg, Response.status(Response.Status.BAD_REQUEST).build());
}
},
ex -> {
String msg = "Couldn't update credential \""+username+"\""
+ ". A credential with such username doesn't exist: " + ex.getMessage();
logger.error(msg);
throw new BadRequestException(msg, Response.status(Response.Status.BAD_REQUEST).build());
});//if the Observable completes without exceptions we have a success case
Map<String, String> map = new HashMap<>();
map.put("path", "credential/" + username);
return Response.ok(getJsonRepr("link", uriGenerator.apply(appsUriBuilder, map).toASCIIString())).build();
}
My issue is at line 11 (the catch clause of the onNext method). This is the log output that quickly demonstrates what happens:
19:23:50.472 [http-listener(4)] ERROR com.vgorcinschi.rimmanew.rest.services.CredentialResourceService - Couldn't update credential "admin". A credential with such username doesn't exist: Failed to update credential +"admin": Password too weak!
So the exception thrown in the onNext method ends up in the onError method! Apparently this works as designed, but I am confused as to how I could return the correct reason for the Bad Request error. After all, in my test case a credential with that username was found by the repository; the correct error was that the suggested password was too weak. This is the helper method that generated the error:
private Credential build(Credential credential, CredentialCandidate candidate) {
if(!isOkPsswd.test(candidate.getPassword())){
throw new BadRequestException("Password too weak!", Response.status(Response.Status.BAD_REQUEST).build());
}
...
}
I am still fairly new to Reactive Programming so I realise I may be missing something that is obvious. Skimming through the book didn't get me to an answer, so I would appreciate any help.
Just in case, this is the full stack trace:
updateCredentialTest(com.vgorcinschi.rimmanew.services.CredentialResourceServiceTest) Time elapsed: 0.798 sec <<< ERROR!
rx.exceptions.OnErrorFailedException: Error occurred when trying to propagate error to Observer.onError
at com.vgorcinschi.rimmanew.rest.services.CredentialResourceService.lambda$updateCredential$9(CredentialResourceService.java:245)
at rx.internal.util.ActionSubscriber.onNext(ActionSubscriber.java:39)
at rx.observers.SafeSubscriber.onNext(SafeSubscriber.java:134)
at rx.internal.util.ScalarSynchronousObservable$WeakSingleProducer.request(ScalarSynchronousObservable.java:276)
at rx.Subscriber.setProducer(Subscriber.java:209)
at rx.Subscriber.setProducer(Subscriber.java:205)
at rx.internal.util.ScalarSynchronousObservable$JustOnSubscribe.call(ScalarSynchronousObservable.java:138)
at rx.internal.util.ScalarSynchronousObservable$JustOnSubscribe.call(ScalarSynchronousObservable.java:129)
at rx.Observable.subscribe(Observable.java:10238)
at rx.Observable.subscribe(Observable.java:10205)
at rx.Observable.subscribe(Observable.java:10045)
at com.vgorcinschi.rimmanew.rest.services.CredentialResourceService.updateCredential(CredentialResourceService.java:238)
at com.vgorcinschi.rimmanew.services.CredentialResourceServiceTest.updateCredentialTest(CredentialResourceServiceTest.java:140)
It seems you haven't quite grasped the principles of reactive programming.
First, Observables are asynchronous by their API, while you are trying to force a synchronous API onto them by returning the Response value directly from the method, instead of returning an Observable<Response> that emits this Response value over time through its onNext() notification.
That's why you are struggling with the exception. Each notification lambda (onNext/onError) is wrapped by the Observable mechanism in order to create a proper stream that obeys some rules (the Observable contract). One of those rules is that errors are redirected to the onError() method, which acts as the exception catch block. You shouldn't throw there: throwing from onError is considered a fatal error and surfaces as an OnErrorFailedException.
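A simplified model of that wrapping, written against the JDK only, may help to see why the log message in the question blames the wrong cause. This is a sketch of the behaviour described above, not RxJava's actual implementation; the deliver method and ErrorHandlerFailedException are hypothetical names standing in for the subscriber wrapper and OnErrorFailedException.

```java
import java.util.function.Consumer;

public class SafeSubscriberSketch {

    // Hypothetical stand-in for rx.exceptions.OnErrorFailedException.
    static class ErrorHandlerFailedException extends RuntimeException {
        ErrorHandlerFailedException(Throwable cause) {
            super("Error occurred when trying to propagate error to onError", cause);
        }
    }

    // Anything thrown by onNext is redirected to onError; anything thrown by
    // onError has nowhere left to go and is treated as fatal.
    static <T> void deliver(T item, Consumer<T> onNext, Consumer<Throwable> onError) {
        try {
            onNext.accept(item);
        } catch (RuntimeException fromOnNext) {
            try {
                onError.accept(fromOnNext);
            } catch (RuntimeException fromOnError) {
                throw new ErrorHandlerFailedException(fromOnError);
            }
        }
    }

    public static void main(String[] args) {
        try {
            deliver("admin",
                    user -> { throw new IllegalArgumentException("Password too weak!"); },
                    err -> { throw new IllegalStateException("No such user: " + err.getMessage()); });
        } catch (ErrorHandlerFailedException e) {
            // The onError handler wrapped the onNext failure in its own, misleading message.
            System.out.println(e.getCause().getMessage()); // prints No such user: Password too weak!
        }
    }
}
```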
Ideally it will be something like this:
public Observable<Response> updateCredential(@PathParam("username") final String username,
InputStream stream) {
return Observable.fromCallable(() -> {
CredentialCandidate candidate = new CredentialCandidate();
Credential credential = repository.getByUsername(username);
serializeCandidate(candidate, stream);
repository.updateCredential(build(credential, candidate));
Map<String, String> map = new HashMap<>();
map.put("path", "credential/" + username);
return Response.ok(getJsonRepr("link", uriGenerator.apply(appsUriBuilder, map).toASCIIString())).build();
})
.onErrorReturn(throwable -> {
String msg = "Failed to update credential +\"" + username + "\": " + throwable.getMessage();
throw new BadRequestException(msg, Response.status(Response.Status.BAD_REQUEST).build());
});
}
Use fromCallable so that the request happens on subscription (whereas Observable.just(repository.getByUsername(username)) calls the repository synchronously when the Observable is constructed). The success path is within the callable itself, and if any error occurs, you transform it into your custom exception using the onErrorReturn operator.
With this approach you return an Observable that acts when you subscribe to it. You get all the benefits of Observable and the reactive approach, such as being able to compose it with other operations and to specify from outside whether it runs synchronously (on the current thread) or asynchronously on another thread (using a Scheduler).
For a more detailed explanation of reactive programming, I suggest starting with this great tutorial from André Staltz.

How do you handle EmptyResultDataAccessException with Spring Integration?

I have a situation where, before I process an input file, I want to check whether certain information is set up in the database. In this particular case it is a client's name and the parameters used for processing. If this information is not set up, the file import shall fail.
In many StackOverflow pages, the users resolve handling EmptyResultDataAccessException exceptions generated by queryForObject returning no rows by catching them in the Java code.
The issue is that Spring Integration is catching the exception well before my code is catching it and in theory, I would not be able to tell this error from any number of EmptyResultDataAccessException exceptions which may be thrown with other queries in the code.
Example code segment showing try...catch with queryForObject:
MapSqlParameterSource mapParameters = new MapSqlParameterSource();
// Step 1 check if client exists at all
mapParameters.addValue("clientname", clientName);
try {
clientID = this.namedParameterJdbcTemplate.queryForObject(FIND_BY_NAME, mapParameters, Long.class);
} catch (EmptyResultDataAccessException e) {
SQLException sqle = (SQLException) e.getCause();
logger.debug("No client was found");
logger.debug(sqle.getMessage());
return null;
}
return clientID;
In the above code, no row was returned and I want to handle that properly (I have not coded that portion yet). However, the catch block is never triggered; my generic error handler and its associated error channel are triggered instead.
Segment from file BatchIntegrationConfig.java:
@Bean
@ServiceActivator(inputChannel="errorChannel")
public DefaultErrorHandlingServiceActivator errorLauncher(JobLauncher jobLauncher){
logger.debug("====> Default Error Handler <====");
return new DefaultErrorHandlingServiceActivator();
}
Segment from file DefaultErrorHandlingServiceActivator.java:
public class DefaultErrorHandlingServiceActivator {
@ServiceActivator
public void handleThrowable(Message<Throwable> errorMessage) throws Throwable {
// error handling code should go here
}
}
Tested Facts:
queryForObject expects a row to be returned and throws an
exception otherwise; therefore you have to handle the exception
or use a different query which returns a row.
Spring Integration is monitoring exceptions and catching them before
my own code can handle them.
What I want to be able to do:
Catch the very specific condition and log it or let the end user know what they need to do to fix the problem.
Edit on 10/26/2016 per recommendation from @Artem:
Changed my existing input channel to Spring provided Handler Advice:
@Transformer(inputChannel = "memberInputChannel", outputChannel = "commonJobGateway", adviceChain="handleAdvice")
Added support Bean and method for the advice:
@Bean
ExpressionEvaluatingRequestHandlerAdvice handleAdvice() {
ExpressionEvaluatingRequestHandlerAdvice advice = new ExpressionEvaluatingRequestHandlerAdvice();
advice.setOnFailureExpression("payload");
advice.setFailureChannel(customErrorChannel());
advice.setReturnFailureExpressionResult(true);
advice.setTrapException(true);
return advice;
}
private MessageChannel customErrorChannel() {
return new DirectChannel();
}
I initially had some issues with wiring up this feature, but in the end, I realized that it is creating yet another channel which will need to be monitored for errors and handled appropriately. For simplicity, I have chosen to not use another channel at this time.
Although potentially not the best solution, I switched to checking for row counts instead of returning actual data. In this situation, the data exception is avoided.
The main code above moved to:
MapSqlParameterSource mapParameters = new MapSqlParameterSource();
mapParameters.addValue("clientname", clientName);
// Step 1 check if client exists at all; if exists, continue
// Step 2 check if client enrollment rules are available
if (this.namedParameterJdbcTemplate.queryForObject(COUNT_BY_NAME, mapParameters, Integer.class) == 1) {
if (this.namedParameterJdbcTemplate.queryForObject(CHECK_RULES_BY_NAME, mapParameters, Integer.class) != 1) return null;
} else return null;
return findClientByName(clientName);
I then check the data upon return to the calling method in Spring Batch:
if (clientID != null) {
logger.info("Found client ID ====> " + clientID);
}
else {
throw new ClientSetupJobExecutionException("Client " +
fileNameParts[1] + " does not exist or is improperly setup in the database.");
}
Although not needed, I created a custom Java Exception which could be useful at a later point in time.
A Spring Integration Service Activator can be supplied with the ExpressionEvaluatingRequestHandlerAdvice, which works like a try...catch and lets you perform some logic in its onFailureExpression: http://docs.spring.io/spring-integration/reference/html/messaging-endpoints-chapter.html#expression-advice
Your problem might be that you catch (EmptyResultDataAccessException e), but by the time the error reaches you it is a cause nested inside another exception, not the top-level exception of the this.namedParameterJdbcTemplate.queryForObject() invocation.
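If the exception does arrive wrapped, one way to tell this specific condition apart in a generic error handler is to walk the cause chain. The sketch below uses only the JDK; NoRowsException is a hypothetical stand-in for EmptyResultDataAccessException, and the outer RuntimeException stands in for whatever wrapper the framework applies.

```java
import java.util.Optional;

public class CauseChainSketch {

    // Hypothetical stand-in for EmptyResultDataAccessException.
    static class NoRowsException extends RuntimeException {
        NoRowsException(String message) {
            super(message);
        }
    }

    // Returns the first throwable of the requested type found in the cause
    // chain, starting with the thrown exception itself.
    static <T extends Throwable> Optional<T> findCause(Throwable thrown, Class<T> type) {
        for (Throwable current = thrown; current != null; current = current.getCause()) {
            if (type.isInstance(current)) {
                return Optional.of(type.cast(current));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException("handler failed", new NoRowsException("no client found"));
        // The wrapper hides the specific type from a plain catch clause,
        // but the chain still carries it.
        findCause(wrapped, NoRowsException.class)
                .ifPresent(e -> System.out.println("Specific cause: " + e.getMessage()));
    }
}
```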

Catch exception thrown by custom function in JEXL

I added some functions to the JEXL engine which can be used in JEXL expressions:
Map<String, Object> functions = new HashMap<String, Object>();
mFunctions = new ConstraintFunctions();
functions.put(null, mFunctions);
mEngine.setFunctions(functions);
However, some functions can throw exceptions, for example:
public String chosen(String questionId) throws NoAnswerException {
Question<?> question = mQuestionMap.get(questionId);
SingleSelectAnswer<?> answer = (SingleSelectAnswer<?>) question.getAnswer();
if (answer == null) {
throw new NoAnswerException(question);
}
return answer.getValue().getId();
}
The custom function is called when I interpret an expression. The expression of course holds a call to this function:
String expression = "chosen('qID')";
Expression jexl = mEngine.createExpression(expression);
String questionId = (String) mExpression.evaluate(mJexlContext);
Unfortunately, when this function is called in the course of interpretation and throws the NoAnswerException, the interpreter does not propagate it to me but throws a general JexlException. Is there any way to catch exceptions from custom functions? I use the Apache Commons JEXL engine for this, which is used as a library jar in my project.
After some investigation, I found an easy solution!
When an exception is thrown in a custom function, JEXL will throw a general JexlException. However, it wraps the original exception in the JexlException as its cause. So if we want to get at the original, we can write something like this:
try {
String questionId = (String) mExpression.evaluate(mJexlContext);
} catch (JexlException e) {
Throwable original = e.getCause();
// do something with the original
}
