How to stop ANTLR from suppressing syntax errors?

So I'm writing a compiler in Java using ANTLR, and I'm a little puzzled by how it deals with errors.
The default behavior seems to be to print an error message and then attempt, by means of token insertion and such, to recover from the error and continue parsing. I like this in principle; it means that (in the best case) if the user has committed more than one syntax error, they'll get one message per error, but it'll mention all the errors instead of forcing them to recompile to discover the next one. The default error message is fine for my purposes. The trouble comes when it's done reading all the tokens.
I am, of course, using ANTLR's tree constructors to build abstract syntax trees. While it's nice for the parse to continue through syntax errors so the user can see all the errors, once it's done parsing I want to get an exception or some kind of indication that the input wasn't syntactically valid; that way I can stop the compilation and tell the user "sorry, fix your syntax errors and then try again". What I don't want is for it to spit out an incomplete AST based on what it thinks the user was trying to say, and continue to the next phase of compilation with no indication that anything went wrong (other than the error messages which went to the console and I can't see). Yet by default, it does exactly that.
The Definitive ANTLR Reference offers a technique to stop parsing as soon as a syntax error is detected: override the mismatch and recoverFromMismatchedSet methods to throw RecognitionExceptions, and add a @rulecatch action to do the same. This would seem to lose the benefit of recovering from parse errors, but more importantly, it only partially works. If a necessary token is missing (for instance, if a binary operator only has an expression on one side of it), it throws an exception just as expected, but if an extraneous token is added, ANTLR inserts the token that it thinks belongs there and continues on its merry way, producing an AST with no indication of a syntax error except a console message. (To make matters worse, the token it inserted was EOF, so the rest of the file didn't even get parsed.)
I'm sure I could fix this by, say, adding something like an isValid field to the parser and overriding methods and adding actions so that, at the end of the parse, it throws an exception if there were any errors. But is there a better way? I can't imagine that what I'm trying to do is unusual among ANTLR users.

... [O]nce it's done parsing I want to get an exception or some kind of indication that the input wasn't syntactically valid; that way I can stop the compilation...
You can call getNumberOfSyntaxErrors on both the lexer and the parser after parsing to determine if there was an error that was covertly accommodated by ANTLR. This doesn't tell you what those errors were, obviously, but I think these methods address the "once it's done parsing ... stop the compilation" part of your question.
The Definitive ANTLR Reference offers a technique to stop parsing as soon as a syntax error is detected: override the mismatch and recoverFromMismatchedSet methods to throw RecognitionExceptions, and add a @rulecatch action to do the same.
I don't think you mentioned which version of ANTLR you're using, but the documentation in the ANTLR v3.4 code for the method recoverFromMismatchedSet says it's "not currently used" and an Eclipse "global usage" scan found no callers. Neither here nor there to your main problem, but I wanted to mention it for the record. It may be the correct method to override for your version.
If a necessary token is missing ..., [the overridden code] throws an exception just as expected, but if an extraneous token is added, ANTLR inserts the token that it thinks belongs there and continues on its merry way...
Method recoverFromMismatchedToken tests for a recoverable missing or extraneous token by delegating to methods mismatchIsMissingToken and mismatchIsUnwantedToken respectively. If the appropriate method determines that an insertion or deletion will solve the problem, recoverFromMismatchedToken makes the appropriate correction. If it is determined that no operation solves the mismatched token problem, recoverFromMismatchedToken throws a MismatchedTokenException.
If a recovery operation takes place, reportError is called, which calls displayRecognitionError with the details.
This applies to ANTLR v3.4 and possibly earlier versions.
This gives you at least two options:
Override recoverFromMismatchedToken and handle errors at a fine-grained level. From here you can delegate the call to the super implementation, roll your own recovery code, or bail out with an exception. Whatever the case, your code will be called and thus will be aware that a mismatch error occurred, recoverable or otherwise. This option is probably equivalent to overriding recoverFromMismatchedSet.
Override displayRecognitionError and handle the errors at a coarse-grained level. Method reportError does some state juggling, so I wouldn't recommend overriding it unless the overriding implementation calls the super-implementation. Method displayRecognitionError appears to be one of the last calls in the recovered-token call chain, so it would be a reasonable place to determine whether or not to continue. I would prefer it had a name that indicated that it was a reasonable place for that, but oh well. Here is an answer that demonstrates this option.
I'm partial towards overriding displayRecognitionError because it provides the error message text easily enough and because I know it's going to be called only after a token recovery operation and required state juggling -- no need for my parser to figure out how to recover for itself. This coupled with getNumberOfSyntaxErrors appears to give you the options that you're looking for, assuming that you're working with a relevant version of ANTLR and that I fully understood your problem.
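The combination described above (collect the messages in an overridden displayRecognitionError, then check the error count once parsing has finished) might look roughly like the sketch below. It is only an illustration that compiles without the ANTLR runtime: RecognizerStub is a hypothetical stand-in for the generated parser's base class with a simplified displayRecognitionError(String) signature, and CollectingRecognizer and SyntaxErrorGate are invented names. In a real ANTLR v3 grammar the override would go in the parser's members section and use the real method signature.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated parser's base class (NOT the ANTLR API).
// It mimics only the behavior described above: every recovered error is counted
// and then passed on to displayRecognitionError.
class RecognizerStub {
    private int syntaxErrors = 0;

    // Plays the role of reportError: count first, then display.
    void reportError(String message) {
        syntaxErrors++;
        displayRecognitionError(message);
    }

    // Default behavior: print to the console and carry on.
    void displayRecognitionError(String message) {
        System.err.println(message);
    }

    int getNumberOfSyntaxErrors() {
        return syntaxErrors;
    }
}

// Overrides the display hook so the messages are kept for the caller.
class CollectingRecognizer extends RecognizerStub {
    private final List<String> errors = new ArrayList<>();

    @Override
    void displayRecognitionError(String message) {
        errors.add(message);
    }

    List<String> getErrors() {
        return errors;
    }
}

// Invented helper: call once parsing has finished, before touching the AST.
class SyntaxErrorGate {
    static void requireCleanParse(CollectingRecognizer parser) {
        if (parser.getNumberOfSyntaxErrors() > 0) {
            throw new IllegalStateException(
                "Fix your syntax errors and try again: " + parser.getErrors());
        }
    }
}
```

With this shape the parser still recovers and reports every error it finds, and the compiler driver decides after the parse whether to move on to the next phase.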

Related

TAINTED_SOURCE - os_command_sink

Further to Tainted_source JAVA, I want to add more information regarding the error os_command_sink I am getting.
Below is the section of code that is the entry point for data from the front end and marks the parameter as tainted_source
Now when the DTO - CssEmailWithAttachment is sent to the static method of CommandUtils, it reports the os_command_sink issue. Below is the code for the method
I tried various ways to sanitize the source in the controller method - referenceDataExport, i.e. using an allowlist and using the @Pattern annotation, but Coverity reports os_command_sink every time.
I understand the reason as any data coming from http is marked as tainted by default. And the code is using the data to construct an OS command hence the issue is reported.
Coverity provides below information regarding the issue
So I tried strict validation of entityType that it should be one of the known values only but that also doesn't remove the issue.
Is there any way this can be resolved?
Thanks

The main issue is that the code, as it currently stands, is insecure. To summarize the Coverity report:
entityType comes from an HTTP parameter, hence is under attacker control.
entityType is concatenated into tagline.
tagline is passed as the body and subject of CdsEmailWithAttachment. (You haven't included the constructor of that class, so this is partially speculation on my part.)
The subject and body are concatenated into an sh command line. Consequently, anyone who can invoke your HTTP service can execute arbitrary command lines on your server backend!
There is an attempt at validation in sendEmailWithAttachment, where certain shell metacharacters are filtered out. However, the filtering is incomplete (missing at least single and double quote) and is not applied to the subject.
So, your first task here is to fix the vulnerability. The Coverity tool has correctly reported that there is a problem, but making Coverity happy is not the goal, and even if it stops reporting after you make a change, that does not necessarily mean the vulnerability is fixed.
There are at least two straightforward ways I see to fix this code:
Use a whitelist filter on entityType, rejecting the request if the value is not among a fixed list of safe strings. You mentioned trying the @Pattern annotation, and that could work if used correctly. Be sure to test that your filter works and provides a sensible error message.
Instead of invoking mailx via sh, invoke it directly using ProcessBuilder. This way you can safely transport arbitrary data into mailx without the risks of a shell command line.
Personally, I would do both of these. It appears that entityType is meant to be one of a fixed set of values, so should be validated regardless of any vulnerability potential; and using sh is both risky from a security perspective and makes controlling the underlying process difficult (e.g., implementing a timeout).
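A minimal sketch of both fixes together, under stated assumptions: the class name SafeMailCommand, the allowed values (CUSTOMER, ORDER, PRODUCT), and the subject text are placeholders rather than anything from the original code, and the recipient is assumed to come from trusted configuration. Only the POSIX `-s` subject option of mailx is used. The message body would be written to the process's standard input instead of being placed on a command line.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SafeMailCommand {
    // Hypothetical allowlist; replace with the real set of entity types.
    private static final Set<String> ALLOWED_ENTITY_TYPES =
        Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("CUSTOMER", "ORDER", "PRODUCT")));

    // Rejects anything that is not an exact match for a known value.
    static String requireKnownEntityType(String entityType) {
        if (entityType == null || !ALLOWED_ENTITY_TYPES.contains(entityType)) {
            throw new IllegalArgumentException("Unknown entity type");
        }
        return entityType;
    }

    // Builds the mailx invocation as an argument vector. No shell is involved,
    // so shell metacharacters in an argument are passed through as literal text.
    // The recipient is assumed to be trusted (for example, read from configuration).
    static ProcessBuilder buildMailCommand(String entityType, String recipient) {
        String subject = "Export for " + requireKnownEntityType(entityType);
        return new ProcessBuilder("mailx", "-s", subject, recipient);
    }
}
```

Because each argument is handed to the process as a separate string, there is no command line for an attacker to break out of, and the allowlist check still rejects unexpected values before anything is executed.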
Whatever you decide to do, test the fix. In fact, I recommend first (before changing the code) demonstrating that the code is vulnerable by constructing an exploit, as that will be needed later to test any fix, and is a valuable exercise in its own right. When you think you have fixed the problem, write more tests to really be sure. Think like an attacker; be devious!
Finally, I suspect you may be inexperienced at dealing with potential security vulnerabilities (I apologize if I'm mistaken). If so, please understand that code security is very important, and getting it right is difficult. If you have the option, I recommend consulting with someone in your organization who has more experience with this topic. Do not rely only on Coverity.

How can I gather information about exception?

I work on a project's maintenance team, so I often have to deal with the same exceptions occurring again and again in the same place, but the stack trace does not contain enough information about them. As a result, even though they occur very often, I cannot do anything about them because the cause is unknown.
So I was wondering whether there are tools that would let me gather more information about the context (probably by connecting to the JVM somehow).
Is this where I should use these tools:
jinfo, jhat, jmap, jsadebugd, jstack?
Or are there handier or more powerful ways?
What I would expect:
dump the values of variables (in the invoked method and, if possible, in the other methods in the stack trace)
the solution doesn't change the code itself
check whether certain other methods were invoked before the exception occurred

The stack trace provides you with the exact line of code where the problem occurred along with a (hopefully) useful message to explain the problem, which should be enough to identify the problem.
Generally, if it's not, the best thing to do is to modify the code in the problematic area, to give you more information. That might be to log the state of variables just before the line where the exception occurs, in all cases (not just when there's an exception), or to increase the amount of information provided when the exception does occur.
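As a rough illustration of the second idea (providing more information when the exception does occur), a catch block can rethrow with the relevant state attached and the original exception kept as the cause. The names OrderProcessor, remainingAfterOrder, and orderId are made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class OrderProcessor {
    private final Map<String, Integer> stock = new HashMap<>();

    OrderProcessor() {
        stock.put("apple", 3);
    }

    int remainingAfterOrder(String orderId, String item, int quantity) {
        try {
            // Unboxing a null Integer throws NullPointerException for unknown items.
            int available = stock.get(item);
            return available - quantity;
        } catch (RuntimeException e) {
            // Rethrow with the state a maintainer needs; keep the original as the cause.
            throw new IllegalStateException(
                "Failed order " + orderId + " (item=" + item + ", quantity=" + quantity + ")", e);
        }
    }
}
```

The stack trace of the rethrown exception then shows both the contextual message and, under "Caused by", the original failure.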
Alternatively, if there is more than one thing happening on the line where the exception occurs, you could split the code there into multiple lines, which makes the line number in all subsequent exceptions more helpful. For example, if you have a line like this:
Foo myFoo = myBar.getBaz().getFoo();
and you get a NullPointerException on that line, you might not be able to identify whether myBar is null, or whether its getBaz() method returns null. If you split that into:
Baz myBaz = myBar.getBaz();
Foo myFoo = myBaz.getFoo();
the next time you encounter the same NullPointerException the line number will give you a better clue as to what is null.

When is it suitable to throw an Exception?

I've seen some code recently where the author was throwing exceptions for almost every constructor and throwing runtime exceptions for things like the code below, in a method that returns int:
if (condition) {
    return 1;
}
if (condition) {
    return 2;
}
if (condition) {
    return 3;
}
throw new RuntimeException("Unreachable code");
// method ends here
I wouldn't have personally thrown an exception there, because I would have structured it using if and else if statements, and in this particular case your code would be fundamentally wrong for it not to satisfy one of the conditions anyway.
There are plenty of places where you could throw runtime exceptions that would never be reached if your code is working correctly; sometimes, as in the code block above, it just seems like the author doesn't trust the code to work. Also, every constructor could throw an exception if it doesn't initialize correctly, but you could also structure it so that the object would be null, which you could then check for in main, for instance.
What I'm asking, basically, is when is it worth throwing an exception?

The point of exceptions is to communicate exceptional situations.
In that sense: if it is absolutely unexpected that all your conditions are false in your example, and that there is also no valid return value to indicate that situation, then throwing that RuntimeException is the reasonable thing to do here; but I would probably change the message to:
throw new RuntimeException("All conditions failed: " + some data)
As said: it is about communicating; in this case to the person debugging the problem. So it might be helpful here to include the information that is required to understand why exactly all those checks turned out false.
The point is: there is a contract for that method; and that contract should include such details. Meaning: if that method is public, you should probably add a @throws RuntimeException with a clear description.
And it is also a valid practice to use RuntimeException for such situations; as you do not want to pollute your method signatures with checked exceptions all over the place.
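To make the points about the message and the contract concrete, here is a small sketch. The class Grader, the method band, and the score ranges are invented for illustration; IllegalArgumentException is used as one possible RuntimeException subclass, which is a choice of this sketch and not something the question's code does.

```java
class Grader {
    /**
     * Maps a score to a grade band.
     *
     * @param score a value from 0 to 100
     * @return 1, 2, or 3
     * @throws IllegalArgumentException if the score is outside 0..100
     */
    static int band(int score) {
        if (score >= 0 && score < 50) {
            return 1;
        }
        if (score >= 50 && score < 80) {
            return 2;
        }
        if (score >= 80 && score <= 100) {
            return 3;
        }
        // Include the offending value so whoever debugs this can see
        // why every check turned out false.
        throw new IllegalArgumentException("All conditions failed for score: " + score);
    }
}
```

A caller that passes 101 gets a message naming the value that slipped past every check, and the Javadoc tells them in advance that this can happen.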
Edit: of course, balancing is required. Example: my classes often look like:
import java.util.Objects;

public class Whatever {
    private final Foo theFoo;

    public Whatever(Foo theFoo) {
        Objects.requireNonNull(theFoo, "theFoo must not be null");
        this.theFoo = theFoo;
    }
}
So, there might be an NPE thrown from my constructors; yes. But: only there. All my methods can rely on the fact that all fields were initialized to non-null; and they are final, so they will always be non-null.
Meaning: one has to stay reasonable; and "develop" a feeling for: which problems are exceptional but possible; and which ones are so impossible that you don't pollute your code all over the place to check for them.
Finally; just to make that clear - adding exceptions is only one part of the equation. When something throws, then you need something to catch! Therefore, as said: balancing comes in. Whatever you do in your code has to "add value" to it. If your code doesn't fulfill a clear, defined purpose, then chances are: you don't need it!

GhostCat has basically covered everything that needs to be said about when and why we should use exceptions. To take it further, the best thing to do is to weigh the costs and benefits of including an exception. The cost here is performance and a less friendly experience for clients of the application; the benefit is that the application runs smoothly and stays user-friendly. In my opinion, one should first distinguish between application and system errors. These errors then need to be scrutinized further after dividing them into compile-time and runtime errors (note that compile-time errors normally do not need to be handled with exceptions; to debug them and find the underlying issues, use debugging tools such as assert in C++). Although the details of where to include exception handlers depend on the specific application, one can generally take the following principles as a starting point:
1- Identify the critical crash hotspots in the code;
2- Distinguish between system and application errors;
3- Identify runtime and compile-time errors;
4- Handle compile-time errors using debugging tools such as assert or preprocessor directives, and use exception handlers, tracing, or debugging output to handle runtime errors;
5- Weigh the consequences of handling exceptions at runtime;
6- Provide a testable framework, normally exercised during unit testing, to identify where exceptions do or do not need to be included;
7- Decide where to include exception handlers in your production code, taking into account the factors you consider decisive;
8- Finally, have a crash-proof exception handler that is triggered in the unlikely event that the application crashes, with fallback handling for application state, to keep the application user-friendly.

When is the right time to throw a RuntimeException?

I'm developing a library for Android, which I intend to open source and naturally I want to tick all of the boxes before I publish it - so users are suitably impressed with my code. Ahem.
As with many libraries, there are certain basic configurations necessary in order for the library to function.
public static final String API_KEY = "your_api_key_here";
In the above instance, when a user passes their API key to the library, I'm putting a simple string match in for "your_api_key_here" and if it matches, I'm going to throw a RuntimeException, as they quite simply haven't read the basic instructions and I want their app to die.
Is this a valid use of a RuntimeException? If it isn't, then in Java what is?
EDIT - My motivation for posting this is due to this post, where the OP is lynched by shouts of "why!?" for asking how to throw one.
ANSWER - In this instance, it seems to be more a matter of preference than right or wrong either way - at least no one has so far objected. This scenario should only occur during the testing phase for a developer and never in production. If this wasn't the case, I wouldn't have chosen an uncaught exception.
I've marked an answer as correct due to the most upvotes and following @mech's comment below, I have created a custom ReadTheDocumentationException which provides a suitably persuasive message.

I think you should use IllegalArgumentException, which is a subclass of java.lang.RuntimeException. You can do something like this:
if (API_KEY.equals("your_api_key_here")) {
    throw new IllegalArgumentException("your message here");
}
For more info see this

You should create your own exception by extending RuntimeException or any other Exception. IllegalStateException would work for a case when someone terribly misbehaves.

It sounds like part of your question deals with what is the proper use of RuntimeException, and partly deals with how your library should behave if misconfigured. I'll deal with mostly the former.
In Java, there are two types of exceptions, checked and unchecked.
RuntimeException and all of its subclasses are "unchecked" exceptions, meaning there is no requirement from the compiler to catch them. You can use these to crash your process if something is very wrong. The caller can still catch and handle them on their own, so be prepared that the caller may continue to call into your lib incorrectly.
Exception and all of its subclasses (except RuntimeException) are "checked", meaning that the compiler requires the caller to catch them or declare in a method declaration that it could be thrown. You use this in cases where you expect the caller to try to recover from whatever condition caused the exception to be thrown.
In your case, you can throw a RuntimeException with a meaningful message, or a custom subclass of RuntimeException with a message to indicate to the caller exactly what went wrong and how to remedy it. It doesn't really matter what you choose, but many people choose to subclass for clarity. I'd just make sure that the exception is never thrown by surprise in order to have clear rules for engagement for your lib.
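A custom unchecked exception along these lines could be sketched as follows. ReadTheDocumentationException is the name mentioned in the question's edit; LibraryConfig and validateApiKey are hypothetical, since the library's real initialization API is not shown.

```java
// Unchecked: callers are not forced to catch it, so a misconfigured app fails fast.
class ReadTheDocumentationException extends RuntimeException {
    ReadTheDocumentationException(String message) {
        super(message);
    }
}

class LibraryConfig {
    static final String PLACEHOLDER_KEY = "your_api_key_here";

    // Hypothetical entry point; the real library's setup method is not shown
    // in the question.
    static String validateApiKey(String apiKey) {
        if (apiKey == null || PLACEHOLDER_KEY.equals(apiKey)) {
            throw new ReadTheDocumentationException(
                "API key is missing or still the placeholder; "
                    + "set your own key before calling the library.");
        }
        return apiKey;
    }
}
```

Running the check once at initialization means the exception is thrown at a predictable point instead of surprising the caller later, and the subclass name makes the cause obvious in a stack trace.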

.Net equivalent to Java's AssertionError

In Java, I will occasionally throw an AssertionError directly, to assert that a particular line will not be reached. An example of this would be to assert that the default case in a switch statement cannot be reached (see this JavaSpecialists page for an example).
I would like to use a similar mechanism in .Net. Is there an equivalent exception that I could use? Or is there another method that could be used with the same effect?
Edit - To clarify, I'm looking for a mechanism to flag failures at runtime, in released code, to indicate that there has been a (possibly catastrophic) failure of some invariant in the code. The linked example generates a random integer between 0 and 2 (inclusive) and asserts that the generated number is always 0, 1 or 2. If this assertion doesn't hold, it would be better to stop execution completely rather than continue with some unknown corrupt state of the system.
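For reference, the Java idiom being described might look like the sketch below. It is written from the description above and is not a copy of the linked example; the class and method names are invented.

```java
import java.util.Random;

class SwitchInvariant {
    static String describe(int value) {
        switch (value) {
            case 0:
                return "zero";
            case 1:
                return "one";
            case 2:
                return "two";
            default:
                // Reaching this line means an invariant was broken, so stop
                // immediately instead of continuing in an unknown state.
                throw new AssertionError("Unexpected value: " + value);
        }
    }

    static String describeRandom(Random random) {
        // nextInt(3) yields 0, 1, or 2, so the default case should never run.
        return describe(random.nextInt(3));
    }
}
```

Unlike the assert keyword, an explicitly thrown AssertionError does not depend on assertions being enabled at runtime, which is why it still fires in released code.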

I'd normally throw InvalidOperationException or ArgumentOutOfRangeException depending on where the value came from.
Alternatively, there's Debug.Assert (which will only fail when you've got the DEBUG preprocessor symbol defined) or in .NET 4.0 you could use Contract.Fail, Contract.Assert or Contract.Assume depending on the situation. Explicitly throwing an exception has the benefit that the compiler knows that the next statement is unreachable though.
I'm not a big fan of Debug.Assert - it's usually inappropriate for a release (as it throws up an assertion box rather than just failing) and by default it won't be triggered in release anyway. I prefer exceptions which are always thrown, as they prevent your code from carrying on regardless after the opportunity to detect that "stuff is wrong".
Code Contracts changes the game somewhat, as there are all kinds of options for what gets preserved at execution time, and the static checker can help to prove that you won't get into that state. You still need to choose the execution time policy though...

You can use the Trace.Assert method, which will work on release builds (if you have the TRACE compilation symbol defined, which is defined by default on Visual Studio projects). You can also customize the way your application reacts on assertion errors by way of a TraceListener. The default is (unsurprisingly) the DefaultTraceListener, which will show the assertion in a dialog box if the application is running in interactive mode. If you want to throw an exception, for example, you can create your own TraceListener and throw it on the method Fail. You can then remove the DefaultTraceListener and use your own, either programmatically or in the configuration file.
This looks like a lot of trouble, and is only justifiable if you want to dynamically change the way your application handles assertions by way of the trace listeners. For violations that you always want to fail, create your own AssertionException class and throw it right away.
For .NET 4.0, I'd definitely look at the Contract.Assert method. But, this method is only compiled when the symbols DEBUG or CONTRACTS_FULL are defined. DEBUG won't work on release builds, and CONTRACTS_FULL will also turn on all other contracts checking, some of which you might not want to be present in release builds.
