Catch exception thrown by custom function in JEXL - java

I added some functions to the JEXL engine which can be used in JEXL expressions:
Map<String, Object> functions = new HashMap<String, Object>();
mFunctions = new ConstraintFunctions();
functions.put(null, mFunctions);
mEngine.setFunctions(functions);
However, some functions can throw exceptions, for example:
public String chosen(String questionId) throws NoAnswerException {
    Question<?> question = mQuestionMap.get(questionId);
    SingleSelectAnswer<?> answer = (SingleSelectAnswer<?>) question.getAnswer();
    if (answer == null) {
        throw new NoAnswerException(question);
    }
    return answer.getValue().getId();
}
The custom function is called when I evaluate an expression. The expression, of course, contains a call to this function:
String expression = "chosen('qID')";
Expression jexl = mEngine.createExpression(expression);
String questionId = (String) jexl.evaluate(mJexlContext);
Unfortunately, when this function is called in the course of evaluation and throws the NoAnswerException, the interpreter does not propagate it to me but throws a general JexlException instead. Is there any way to catch exceptions from custom functions? I use the Apache Commons JEXL engine for this, which is included as a library jar in my project.

After some investigation, I found an easy solution!
When an exception is thrown in a custom function, JEXL throws a general JexlException. However, it wraps the original exception inside the JexlException as its cause. So if we want to catch the original exception, we can write something like this:
try {
    String questionId = (String) jexl.evaluate(mJexlContext);
} catch (JexlException e) {
    Throwable original = e.getCause();
    // do something with the original exception
}
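For example, to handle the NoAnswerException from above specifically (a minimal sketch reusing the names from the question):
try {
    String questionId = (String) jexl.evaluate(mJexlContext);
} catch (JexlException e) {
    Throwable original = e.getCause();
    if (original instanceof NoAnswerException) {
        // the question referenced in the expression has not been answered yet
        NoAnswerException noAnswer = (NoAnswerException) original;
        // ... handle the missing answer
    } else {
        // some other evaluation problem; rethrow
        throw e;
    }
}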

Related

Handle json parse error without crashing Kafka stream processor application

I have a Kafka Streams application which maps/transforms JSON messages and streams the output to a topic.
KStream<String, String> logMessageStream = builder.stream(inputTopic, Consumed.with(stringSerde, stringSerde));
logMessageStream.map((k, v) -> {
    try { // Map record to (requestId, message)
        // readValue throws IOException, JsonParseException, JsonMappingException
        LogMessage logMessage = objectMapper.readValue(v, LogMessage.class);
        return new KeyValue<>(logMessage.requestId(), logMessage);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null; // <== RETURNS null due to caught exception
}).to(outputTopic);
Now, if an input record contains invalid JSON syntax, I get a parse error and the stream application crashes with:
java.lang.NullPointerException
at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:42)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:146)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:129)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:93)
....
I want to handle this error while mapping and continue processing the other messages. Is there any handler I can set to consume the exception? Looking for suggestions. Thanks.
You can also take advantage of the StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG property, as detailed on https://docs.confluent.io/current/streams/faq.html#handling-corrupted-records-and-deserialization-errors-poison-pill-records .
Properties streamsSettings = new Properties();
streamsSettings.put(
        StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
        LogAndContinueExceptionHandler.class.getName()
);
Instead of using map() you can use flatMap(), which allows you to return zero elements. Returning null from a map() is not allowed, as pointed out in the JavaDocs:
The provided {@link KeyValueMapper} must return a {@link KeyValue} type and must not return {@code null}.
Note that flatMap() does not allow returning null either, but it accepts anything you can iterate over (i.e., an Iterable). For example, you can return a Collections.singleton() on success and Collections.emptySet() on failure.
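A minimal sketch of that idea, reusing the objectMapper, LogMessage, and topic name from the question (Collections is java.util.Collections, and requestId() is assumed to return a String):
logMessageStream.flatMap((k, v) -> {
    try {
        // valid JSON: emit exactly one (requestId, message) record
        LogMessage logMessage = objectMapper.readValue(v, LogMessage.class);
        return Collections.singletonList(new KeyValue<>(logMessage.requestId(), logMessage));
    } catch (IOException e) {
        // poison record: log it and emit nothing, so the stream keeps running
        e.printStackTrace();
        return Collections.<KeyValue<String, LogMessage>>emptyList();
    }
}).to(outputTopic);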
Just take a look at the setUncaughtExceptionHandler method:
KafkaStreams streams = new KafkaStreams(topology, props);
streams.setUncaughtExceptionHandler((Thread t, Throwable e) -> {
    // your logic here
});

How to write code that returns a boolean value upon completing JSON Schema Validation

I need help with writing code to do a JSON Schema validation. I've already written code that does the schema validation, but unfortunately, I cannot return a boolean value because the method returns void.
I don't know of any other library that would help with this.
Library:
org.json.JSONObject;
org.everit.json.schema.Schema;
org.everit.json.schema.loader.SchemaLoader
This is my current code:
StringReader reader = new StringReader(response);
JSONObject jsonSchema = new JSONObject(
        new JSONTokener(JSONSchemaValidation.class.getResourceAsStream("/biographics_schema.schema.json")));
JSONObject jsonSubject = new JSONObject(new JSONTokener(reader));
Schema schema = SchemaLoader.load(jsonSchema);
schema.validate(jsonSubject);
As you can see, there's no way of checking whether the validation succeeded or failed.
Can someone help me write code that checks to see if schema validation was successful?
Thanks
From the latest API available from the org.everit.json GitHub project, an invocation of Schema#validate will throw a ValidationException if the document is invalid against your schema.
You may want to try/catch and return false in the catch block.
Catch the exception that validate() throws:
try {
    schema.validate(jsonSubject);
    return true;
} catch (ValidationException e) {
    return false;
}
from the javadoc:
public void validate(Object subject) Performs the schema validation.
Parameters: subject - the object to be validated Throws:
ValidationException - if the subject is invalid against this schema.
So if you want to check whether it was successful, you need to catch the exception. Since ValidationException extends RuntimeException, it is an unchecked exception, which means you are not forced to use a try-catch block.
So what you can do is the following:
boolean valid;
try {
    schema.validate(jsonSubject);
    valid = true;
} catch (ValidationException e) {
    valid = false;
}
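If you also want to report why validation failed, the thrown ValidationException carries the individual violations; a minimal sketch using getCausingExceptions() from the everit-org/json-schema library:
boolean valid;
try {
    schema.validate(jsonSubject);
    valid = true;
} catch (ValidationException e) {
    valid = false;
    // top-level summary, e.g. "#: 2 schema violations found"
    System.err.println(e.getMessage());
    // one message per individual violation (may be empty if there is only the top-level one)
    for (ValidationException cause : e.getCausingExceptions()) {
        System.err.println(cause.getMessage());
    }
}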

StackOverflowError while using distinct in Apache Spark

I use Spark 2.0.1.
I am trying to find distinct values in a JavaRDD as below
JavaRDD<String> distinct_installedApp_Ids = filteredInstalledApp_Ids.distinct();
I see that this line throws the exception below:
Exception in thread "main" java.lang.StackOverflowError
at org.apache.spark.rdd.RDD.checkpointRDD(RDD.scala:226)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:84)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:84)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:84)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:84)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:84)
..........
The same stacktrace is repeated again and again.
The input filteredInstalledApp_Ids is large, with millions of records. Could the issue be the number of records, or is there a more efficient way to find distinct values in a JavaRDD? Any help would be much appreciated. Thanks in advance. Cheers.
Edit 1:
Adding the filter method
JavaRDD<String> filteredInstalledApp_Ids = installedApp_Ids
        .filter(new Function<String, Boolean>() {
            @Override
            public Boolean call(String v1) throws Exception {
                return v1 != null;
            }
        }).cache();
Edit 2:
Added the method used to generate installedApp_Ids
public JavaRDD<String> getIdsWithInstalledApps(String inputPath, JavaSparkContext sc,
        JavaRDD<String> installedApp_Ids) {
    JavaRDD<String> appIdsRDD = sc.textFile(inputPath);
    try {
        JavaRDD<String> appIdsRDD1 = appIdsRDD.map(new Function<String, String>() {
            @Override
            public String call(String t) throws Exception {
                String delimiter = "\t";
                String[] id_Type = t.split(delimiter);
                StringBuilder temp = new StringBuilder(id_Type[1]);
                if ((temp.indexOf("\"")) != -1) {
                    String escaped = temp.toString().replace("\\", "");
                    escaped = escaped.replace("\"{", "{");
                    escaped = escaped.replace("}\"", "}");
                    temp = new StringBuilder(escaped);
                }
                // To remove empty character in the beginning of a string
                JSONObject wholeventObj = new JSONObject(temp.toString());
                JSONObject eventJsonObj = wholeventObj.getJSONObject("eventData");
                int appType = eventJsonObj.getInt("appType");
                if (appType == 1) {
                    try {
                        return (String.valueOf(appType));
                    } catch (JSONException e) {
                        return null;
                    }
                }
                return null;
            }
        }).cache();
        if (installedApp_Ids != null)
            return sc.union(installedApp_Ids, appIdsRDD1);
        else
            return appIdsRDD1;
    } catch (Exception e) {
        e.printStackTrace();
    }
    return null;
}
I assume the main dataset is in inputPath. From the parsing code it appears to be a tab-separated file with JSON-encoded values.
I think you could make your code a bit simpler by combining Spark SQL's DataFrames with the from_json function. I'm using Scala and leave converting the code to Java as a home exercise :)
The lines where you load the inputPath text file and parse it can be as simple as the following:
import org.apache.spark.sql.SparkSession
val spark: SparkSession = ...
val dataset = spark.read.csv(inputPath)
You can display the content using the show operator.
dataset.show(truncate = false)
You should see the JSON-encoded lines.
It appears that the JSON lines contain eventData and appType fields.
val jsons = dataset.withColumn("asJson", from_json(...))
See the functions object for reference.
With JSON lines, you can select the fields of your interest:
val apptypes = jsons.select("eventData.appType")
And then union it with installedApp_Ids.
I'm sure the code gets easier to read (and hopefully to write, too). The migration will give you extra optimizations that you may or may not be able to write yourself using the assembler-like RDD API.
And the best part is that filtering out nulls is as simple as using the na operator, which gives you DataFrameNaFunctions like drop. I'm sure you'll like them.
It does not necessarily answer your initial question, but this java.lang.StackOverflowError may well go away just by doing the code migration, and the code gets easier to maintain, too.
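For reference, a rough Java sketch of the same DataFrame approach (not the answer's exact code: it assumes Spark 2.1+, where from_json is available, a tab separator, and that the JSON payload ends up in the default _c1 column, per the original parsing code):
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_json;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public Dataset<Row> getAppTypes(SparkSession spark, String inputPath) {
    // read the tab-separated input; the JSON payload is assumed to be in the second column (_c1)
    Dataset<Row> dataset = spark.read()
            .option("sep", "\t")
            .csv(inputPath);

    // schema of the embedded JSON, limited to the field of interest
    StructType payloadSchema = new StructType()
            .add("eventData", new StructType()
                    .add("appType", DataTypes.IntegerType));

    // parse the JSON column, pull out eventData.appType, drop failed parses, keep appType == 1
    return dataset
            .withColumn("asJson", from_json(col("_c1"), payloadSchema))
            .select(col("asJson.eventData.appType").as("appType"))
            .na().drop()
            .filter(col("appType").equalTo(1));
}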

Handling errors in ANTLR4

The default behavior when the parser doesn't know what to do is to print messages to the terminal like:
line 1:23 missing DECIMAL at '}'
This is a good message, but in the wrong place. I'd rather receive this as an exception.
I've tried using the BailErrorStrategy, but this throws a ParseCancellationException without a message (caused by an InputMismatchException, also without a message).
Is there a way I can get it to report errors via exceptions while retaining the useful info in the message?
Here's what I'm really after--I typically use actions in rules to build up an object:
dataspec returns [DataExtractor extractor]
    @init {
        DataExtractorBuilder builder = new DataExtractorBuilder(layout);
    }
    @after {
        $extractor = builder.create();
    }
    : first=expr { builder.addAll($first.values); } (COMMA next=expr { builder.addAll($next.values); })* EOF
    ;

expr returns [List<ValueExtractor> values]
    : a=atom { $values = Arrays.asList($a.val); }
    | fields=fieldrange { $values = values($fields.fields); }
    | '%' { $values = null; }
    | ASTERISK { $values = values(layout); }
    ;
Then when I invoke the parser I do something like this:
public static DataExtractor create(String dataspec) {
    CharStream stream = new ANTLRInputStream(dataspec);
    DataSpecificationLexer lexer = new DataSpecificationLexer(stream);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    DataSpecificationParser parser = new DataSpecificationParser(tokens);
    return parser.dataspec().extractor;
}
All I really want is
for the dataspec() call to throw an exception (ideally a checked one) when the input can't be parsed
for that exception to have a useful message and provide access to the line number and position where the problem was found
Then I'll let that exception bubble up the callstack to whereever is best suited to present a useful message to the user--the same way I'd handle a dropped network connection, reading a corrupt file, etc.
I did see that actions are now considered "advanced" in ANTLR4, so maybe I'm going about things in a strange way, but I haven't looked into what the "non-advanced" way to do this would be since this way has been working well for our needs.
Since I've had a little bit of a struggle with the two existing answers, I'd like to share the solution I ended up with.
First of all I created my own version of an ErrorListener like Sam Harwell suggested:
public class ThrowingErrorListener extends BaseErrorListener {

    public static final ThrowingErrorListener INSTANCE = new ThrowingErrorListener();

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e)
            throws ParseCancellationException {
        throw new ParseCancellationException("line " + line + ":" + charPositionInLine + " " + msg);
    }
}
Note the use of a ParseCancellationException instead of a RecognitionException since the DefaultErrorStrategy would catch the latter and it would never reach your own code.
Creating a whole new ErrorStrategy like Brad Mace suggested is not necessary since the DefaultErrorStrategy produces pretty good error messages by default.
I then use the custom ErrorListener in my parsing function:
public static String parse(String text) throws ParseCancellationException {
    MyLexer lexer = new MyLexer(new ANTLRInputStream(text));
    lexer.removeErrorListeners();
    lexer.addErrorListener(ThrowingErrorListener.INSTANCE);

    CommonTokenStream tokens = new CommonTokenStream(lexer);

    MyParser parser = new MyParser(tokens);
    parser.removeErrorListeners();
    parser.addErrorListener(ThrowingErrorListener.INSTANCE);

    ParserRuleContext tree = parser.expr();
    MyParseRules extractor = new MyParseRules();

    return extractor.visit(tree);
}
(For more information on what MyParseRules does, see here.)
This will give you the same error messages as would be printed to the console by default, only in the form of proper exceptions.
When you use the DefaultErrorStrategy or the BailErrorStrategy, the ParserRuleContext.exception field is set for any parse tree node in the resulting parse tree where an error occurred. The documentation for this field reads (for people that don't want to click an extra link):
The exception which forced this rule to return. If the rule successfully completed, this is null.
Edit: If you use DefaultErrorStrategy, the parse context exception will not be propagated all the way out to the calling code, so you'll be able to examine the exception field directly. If you use BailErrorStrategy, the ParseCancellationException thrown by it will include a RecognitionException if you call getCause().
if (pce.getCause() instanceof RecognitionException) {
    RecognitionException re = (RecognitionException) pce.getCause();
    ParserRuleContext context = (ParserRuleContext) re.getCtx();
}
Edit 2: Based on your other answer, it appears that you don't actually want an exception, but what you want is a different way to report the errors. In that case, you'll be more interested in the ANTLRErrorListener interface. You want to call parser.removeErrorListeners() to remove the default listener that writes to the console, and then call parser.addErrorListener(listener) for your own special listener. I often use the following listener as a starting point, as it includes the name of the source file with the messages.
public class DescriptiveErrorListener extends BaseErrorListener {

    public static DescriptiveErrorListener INSTANCE = new DescriptiveErrorListener();

    // set to false to silence syntax error output (same flag as in the C# version below)
    private static final boolean REPORT_SYNTAX_ERRORS = true;

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg, RecognitionException e)
    {
        if (!REPORT_SYNTAX_ERRORS) {
            return;
        }

        String sourceName = recognizer.getInputStream().getSourceName();
        if (!sourceName.isEmpty()) {
            sourceName = String.format("%s:%d:%d: ", sourceName, line, charPositionInLine);
        }

        System.err.println(sourceName + "line " + line + ":" + charPositionInLine + " " + msg);
    }
}
With this class available, you can register it as follows:
lexer.removeErrorListeners();
lexer.addErrorListener(DescriptiveErrorListener.INSTANCE);
parser.removeErrorListeners();
parser.addErrorListener(DescriptiveErrorListener.INSTANCE);
A much more complicated example of an error listener that I use to identify ambiguities which render a grammar non-SLL is the SummarizingDiagnosticErrorListener class in TestPerformance.
What I've come up with so far is based on extending DefaultErrorStrategy and overriding its reportXXX methods (though it's entirely possible I'm making things more complicated than necessary):
public class ExceptionErrorStrategy extends DefaultErrorStrategy {

    @Override
    public void recover(Parser recognizer, RecognitionException e) {
        throw e;
    }

    @Override
    public void reportInputMismatch(Parser recognizer, InputMismatchException e) throws RecognitionException {
        String msg = "mismatched input " + getTokenErrorDisplay(e.getOffendingToken());
        msg += " expecting one of " + e.getExpectedTokens().toString(recognizer.getTokenNames());
        RecognitionException ex = new RecognitionException(msg, recognizer, recognizer.getInputStream(), recognizer.getContext());
        ex.initCause(e);
        throw ex;
    }

    @Override
    public void reportMissingToken(Parser recognizer) {
        beginErrorCondition(recognizer);
        Token t = recognizer.getCurrentToken();
        IntervalSet expecting = getExpectedTokens(recognizer);
        String msg = "missing " + expecting.toString(recognizer.getTokenNames()) + " at " + getTokenErrorDisplay(t);
        throw new RecognitionException(msg, recognizer, recognizer.getInputStream(), recognizer.getContext());
    }
}
This throws exceptions with useful messages, and the line and position of the problem can be gotten from either the offending token, or if that's not set, from the current token by using ((Parser) re.getRecognizer()).getCurrentToken() on the RecognitionException.
I'm fairly happy with how this is working, though having six reportX methods to override makes me think there's a better way.
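For example, at the call site the position can be recovered roughly like this (a sketch assuming the ExceptionErrorStrategy above has been installed via parser.setErrorHandler(new ExceptionErrorStrategy()); wrapping in IllegalArgumentException is just an illustrative choice):
try {
    return parser.dataspec().extractor;
} catch (RecognitionException re) {
    // prefer the offending token; fall back to the parser's current token
    Token token = re.getOffendingToken();
    if (token == null) {
        token = ((Parser) re.getRecognizer()).getCurrentToken();
    }
    int line = token.getLine();
    int charPositionInLine = token.getCharPositionInLine();
    throw new IllegalArgumentException(
            "line " + line + ":" + charPositionInLine + " " + re.getMessage(), re);
}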
For anyone interested, here's the ANTLR4 C# equivalent of Sam Harwell's answer:
using System;
using System.IO;
using Antlr4.Runtime;
public class DescriptiveErrorListener : BaseErrorListener, IAntlrErrorListener<int>
{
    public static DescriptiveErrorListener Instance { get; } = new DescriptiveErrorListener();

    public void SyntaxError(TextWriter output, IRecognizer recognizer, int offendingSymbol, int line, int charPositionInLine, string msg, RecognitionException e) {
        if (!REPORT_SYNTAX_ERRORS) return;
        string sourceName = recognizer.InputStream.SourceName;
        // never ""; might be "<unknown>" == IntStreamConstants.UnknownSourceName
        sourceName = $"{sourceName}:{line}:{charPositionInLine}";
        Console.Error.WriteLine($"{sourceName}: line {line}:{charPositionInLine} {msg}");
    }

    public override void SyntaxError(TextWriter output, IRecognizer recognizer, IToken offendingSymbol, int line, int charPositionInLine, string msg, RecognitionException e) {
        this.SyntaxError(output, recognizer, 0, line, charPositionInLine, msg, e);
    }

    static readonly bool REPORT_SYNTAX_ERRORS = true;
}
lexer.RemoveErrorListeners();
lexer.AddErrorListener(DescriptiveErrorListener.Instance);
parser.RemoveErrorListeners();
parser.AddErrorListener(DescriptiveErrorListener.Instance);
For people who use Python, here is the solution in Python 3 based on Mouagip's answer.
First, define a custom error listener:
from antlr4.error.ErrorListener import ErrorListener
from antlr4.error.Errors import ParseCancellationException
class ThrowingErrorListener(ErrorListener):
    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        ex = ParseCancellationException(f'line {line}: {column} {msg}')
        ex.line = line
        ex.column = column
        raise ex
Then set it on the lexer and parser:
lexer = MyScriptLexer(script)
lexer.removeErrorListeners()
lexer.addErrorListener(ThrowingErrorListener())
token_stream = CommonTokenStream(lexer)
parser = MyScriptParser(token_stream)
parser.removeErrorListeners()
parser.addErrorListener(ThrowingErrorListener())
tree = parser.script()

Faster alternative to JsonPath in Java

JsonPath seems to be pretty slow for large JSON files.
In my project, I'd like a user to be able to pass an entire query as a string. I used JsonPath because it lets you do an entire query like $.store.book[3].price all at once by doing JsonPath.read(fileOrString, "$.store.book[3].price", new Filter[0]). Is there a faster method to interact with JSON files in Javascript? It would be ideal to be able to pass the entire query as a string, but I'll write a parser if I have to. Any ideas?
Even small optimizations would be helpful. For instance, I'm currently reading from a JSON file every time I query. Would it be better just to copy the entire file into a string at the beginning and query to the string instead?
EDIT: To those of you saying "this is Javascript, not Java", well, it actually is Java. JsonPath is a Javascript-like query language, but the file I am writing is most assuredly Java. Only the query is written in Javascript. Here's some info about JsonPath, and a snippet of code: https://code.google.com/p/json-path/
List toRet;
String query = "$.store.book[3].price";
try {
    // if output is a list, good
    toRet = (List) JsonPath.read(filestring_, query, new Filter[0]);
} catch (ClassCastException cce) {
    // if output isn't a list, put it in a list
    Object outObj = null;
    try {
        outObj = JsonPath.read(filestring_, query, new Filter[0]);
    } catch (Exception e) {
        throw new DataSourceException("Invalid file!\n", e, DataSourceException.UNKNOWN);
    }
