TAINTED_SOURCE - os_command_sink - java

Further to my earlier Tainted_source (Java) question, I want to add more information about the os_command_sink error I am getting.
Below is the section of code that is the entry point for data from the front end and marks the parameter as a tainted_source.
Now, when the DTO (CdsEmailWithAttachment) is sent to a static method of CommandUtils, Coverity reports an os_command_sink issue. Below is the code for the method.
I tried various ways to sanitize the source in the controller method (referenceDataExport), e.g. using an allowlist and the @Pattern annotation, but Coverity reports os_command_sink every time.
I understand the reason: any data coming from HTTP is marked as tainted by default, and the code uses that data to construct an OS command, hence the issue is reported.
Coverity provides the following information regarding the issue.
So I tried strict validation of entityType, requiring it to be one of a set of known values, but that also doesn't remove the issue.
Is there any way this can be resolved?
Thanks

The main issue is that the code, as it currently stands, is insecure. To summarize the Coverity report:
entityType comes from an HTTP parameter, hence is under attacker control.
entityType is concatenated into tagline.
tagline is passed as the body and subject of CdsEmailWithAttachment. (You haven't included the constructor of that class, so this is partially speculation on my part.)
The subject and body are concatenated into an sh command line. Consequently, anyone who can invoke your HTTP service can execute arbitrary command lines on your server backend!
There is an attempt at validation in sendEmailWithAttachment, where certain shell metacharacters are filtered out. However, the filtering is incomplete (missing at least single and double quote) and is not applied to the subject.
So, your first task here is to fix the vulnerability. The Coverity tool has correctly reported that there is a problem, but making Coverity happy is not the goal, and even if it stops reporting after you make a change, that does not necessarily mean the vulnerability is fixed.
There are at least two straightforward ways I see to fix this code:
Use a whitelist filter on entityType, rejecting the request if the value is not among a fixed list of safe strings. You mentioned trying the @Pattern annotation, and that could work if used correctly. Be sure to test that your filter works and provides a sensible error message.
Instead of invoking mailx via sh, invoke it directly using ProcessBuilder. This way you can safely transport arbitrary data into mailx without the risks of a shell command line.
Personally, I would do both of these; a sketch combining them appears below. It appears that entityType is meant to be one of a fixed set of values, so it should be validated regardless of any vulnerability potential; and using sh is both risky from a security perspective and makes controlling the underlying process difficult (e.g., implementing a timeout).
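A minimal sketch of that combination, under heavy assumptions: the allowlist values, the recipient address, and the class and method names are all invented for illustration, and the mailx flags (-s for subject, -a for attachment) vary by mailx implementation, so check yours:

import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.TimeUnit;

public class ExportMailer {
    // Allowlist: the entity types the endpoint actually supports (assumed values).
    private static final Set<String> EXPORT_TYPES = Set.of("CUSTOMER", "ACCOUNT", "TRANSACTION");

    public static void sendExport(String entityType, Path attachment)
            throws IOException, InterruptedException {
        if (!EXPORT_TYPES.contains(entityType)) {
            throw new IllegalArgumentException("Unknown entity type: " + entityType);
        }
        // No shell involved: each argument is passed to mailx verbatim,
        // so shell metacharacters in subject or body are never interpreted.
        ProcessBuilder pb = new ProcessBuilder(
                "mailx",
                "-s", "Reference data export: " + entityType,
                "-a", attachment.toString(),
                "ops@example.com");
        Process p = pb.start();
        try (OutputStream body = p.getOutputStream()) {
            body.write(("Attached: " + entityType + " export")
                    .getBytes(StandardCharsets.UTF_8));
        }
        if (!p.waitFor(30, TimeUnit.SECONDS)) { // easy timeout control, unlike sh
            p.destroyForcibly();
            throw new IOException("mailx timed out");
        }
        if (p.exitValue() != 0) {
            throw new IOException("mailx exited with code " + p.exitValue());
        }
    }
}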
Whatever you decide to do, test the fix. In fact, I recommend first (before changing the code) demonstrating that the code is vulnerable by constructing an exploit, as that will be needed later to test any fix, and is a valuable exercise in its own right. When you think you have fixed the problem, write more tests to really be sure. Think like an attacker; be devious!
Finally, I suspect you may be inexperienced at dealing with potential security vulnerabilities (I apologize if I'm mistaken). If so, please understand that code security is very important, and getting it right is difficult. If you have the option, I recommend consulting with someone in your organization who has more experience with this topic. Do not rely only on Coverity.

Related

Why use enums when it creates dependency across teams?

I know enums are used when we are expecting only a set of values to be passed; we don't want the caller to pass anything outside that well-defined set.
And this works very well inside a project, because you know what you have to pass.
But consider two projects, where I use the models of the first project in the second.
The second project has a method like this:
public void updateRefundMode(RefundMode refundMode)
enum RefundMode { CASH, CARD, GIFT_VOUCHER }
Now I realise RefundMode can also be PHONEPE, so if I start passing it to the first project, it fails at their end (unable to deserialize enum value PHONEPE), although I've added this constant at my end.
Which is fine: if my first project doesn't know about PHONEPE, it doesn't know how to handle it, so they have to update their models too.
But here is my problem: imagine I am passing a complex object that also contains this RefundMode. When I pass a new RefundMode, shouldn't just this field become null or be ignored at their end, rather than the whole object being rejected and the entire flow/request breaking?
Is there a way I can tell Jackson (JsonProperties) to just ignore that field if an unknown value is passed? Curious to know (although in that case I am breaking the rules of the enum). So why not keep a String, which solves all these problems?
It's all about contracts.
When you are in a client/server situation, being a mobile app and a web server, or a Java library (jar) and another Java project, you have to keep the contracts in mind.
As you observed, a change in a contract needs to be propagated to both parties: the client and the server (supplier).
One way of working with this is to use versioning. You may say, "Version 1: these are the refund modes." Then the mobile app may call the web server by specifying the contract version in the URL: /api/v1/refund?mode=CASH
When the contract needs to be changed, you need to consider what to do with the clients. In the case of mobile apps, the users might not have updated their app to the latest version, so their app may still be calling /api/v1 (and not supporting new refund modes). In that case, you may want to support both /api/v1 and /api/v2 (with the new refund mode) in your web server.
As your example shows, it is not always possible to transparently adapt one contract version to another (in your example, there is no good equivalent to PHONEPE in the original enum). If you have to deal with contract updates, I suggest explicitly writing code for them (you can use dedicated JSON schemas, classes, and services) instead of trying to bridge the gaps. Think of what would happen with a third or fourth version. A sketch of version-by-version endpoints follows.
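A minimal sketch of that idea, assuming Spring MVC (the controller, paths, and enum names are invented for illustration, not from the question):

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RefundController {

    public enum RefundModeV1 { CASH, CARD, GIFT_VOUCHER }
    public enum RefundModeV2 { CASH, CARD, GIFT_VOUCHER, PHONEPE }

    // v1 clients (e.g., older mobile apps) keep working against the old contract.
    @GetMapping("/api/v1/refund")
    public String refundV1(@RequestParam("mode") RefundModeV1 mode) {
        return "refunded via " + mode;
    }

    // v2 adds the new refund mode without breaking v1 callers.
    @GetMapping("/api/v2/refund")
    public String refundV2(@RequestParam("mode") RefundModeV2 mode) {
        return "refunded via " + mode;
    }
}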
Edit: to answer your last question, you can ignore unknown fields in JSON by following this answer (with the caveats explained above): https://stackoverflow.com/a/59307683/2223027
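As an illustration of that idea, here is a minimal sketch using Jackson's READ_UNKNOWN_ENUM_VALUES_AS_NULL feature (class and field names are invented; the linked answer may use a slightly different mechanism):

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class RefundModeDemo {
    enum RefundMode { CASH, CARD, GIFT_VOUCHER }

    static class Refund {
        public RefundMode refundMode;
        public String orderId;
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper()
                .enable(DeserializationFeature.READ_UNKNOWN_ENUM_VALUES_AS_NULL);
        // "PHONEPE" is unknown to this contract version: refundMode becomes
        // null, but the rest of the object still deserializes.
        Refund r = mapper.readValue(
                "{\"refundMode\":\"PHONEPE\",\"orderId\":\"42\"}", Refund.class);
        System.out.println(r.refundMode + " / " + r.orderId); // prints: null / 42
    }
}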
Edit 2: in general, using Enums is a form of strong typing. Sure, you could use Strings, or even bits, but then it would be easier to make mistakes, like using GiftVoucher instead of GIFT_VOUCHER.

Custom Validator for Static Use of Reflection vs. Custom Rule in SonarQube (Java, Eclipse)

There may be some related questions, but I think my situation is peculiar enough to justify a question on its own.
I'm working on a historically grown, huge Java project (well over one million LOC; for unrelated reasons we're still bound to Java 6 at the moment), where reflection is used to display data in tables. Reflection is not used to dynamically change the displayed data, but just as a kind of shortcut in the code. A simplified part of the code looks like this:
TableColumns taco = new TableColumns(Bean.class);
taco.add(new TableColumn("myFirstMember"));
taco.add(new TableColumn("mySecondMember"));
...
List<Bean> dataList = getDataFromDB(myFilterSettings);
taco.displayTable(dataList);
So the values of the table cells of each row are stored in an instance of Bean. The value for the first cell comes from calling itemOfDataList.getMyFirstMember() (this is the reflection part of the code). The rendering of the table cells is done depending on the return type of itemOfDataList.getMyFirstMember().
This way, it's easy to add new columns to the table, getting them rendered in a standard way without caring about any details.
The problem with this approach: when a getter name changes, the compiler doesn't notice, and there will be an exception at runtime if Bean.getMyFirstMember() was renamed to Bean.getMyFirstMemberChanged().
While reflection is used to determine which getter is called, the needed info is in fact available at compile time; no variables are used for the column info.
My goal: having a validator that will check at compile time whether the needed getter methods in the Bean class do exist.
Possible solutions:
Modifying the code (using more specific info, writing an adapter, using annotations, or whatever else the compiler can check at compile time): I explicitly don't want a solution of this kind, due to the huge code base. I just need to guarantee that the reflection won't fail at runtime.
Writing a custom validator: I guess this shouldn't be too complex, but I have no real idea where to start. We use Eclipse as our IDE, so it should be possible to write such a custom validator. Any hints for a good starting point?
The validator should show a warning in Eclipse if the parameter in TableColumn(parameter) isn't final (it should be a literal or constant). The validator should show an error in Eclipse if the TableColumn is added to TableColumns and the corresponding Bean.getParameter() doesn't exist.
As we use SonarQube for quality checking, we could also implement a custom rule checking whether the methods exist. I'm not completely sure such a custom rule is possible (probably yes).
Maybe there are other solutions that would give fast feedback within Eclipse that some tables won't render correctly after getter methods were renamed.
What I'm asking for:
what will be easier in this situation: writing a custom validator for eclipse or writing a custom rule for SonarQube?
hints where to start either approach
hints for other solutions
Thanks for your help.
Some alternatives:
You could migrate this pattern to more modern Java; it is a prime candidate for method references. Then your IDE of choice can automatically take care of the problem when you refactor/rename. This can be done bit by bit as the opportunity/necessity arises (see the sketch after this list).
You could write your own custom annotations:
Which you can probably get SonarQube to scan for
Which could allow you to take advantage of javax.validation.* goodies, so your code may look/feel more like 'standard' Java EE code.
Annotations can be handled by an annotation processor during the build step (various build tools have ways to hook this up), and the processor can do more advanced/costly introspection, so you can push the validation to compile time as opposed to run time.
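To illustrate the first alternative, a minimal sketch of a method-reference-based column. TypedTableColumn is an invented stand-in for the existing TableColumn, and this requires Java 8+, so it only applies once the project can migrate:

import java.util.function.Function;

// An invented, typed stand-in for the existing TableColumn: the getter is
// now a method reference, so the compiler verifies it exists, and IDE
// renames update the call site automatically.
class TypedTableColumn<B> {
    private final String header;
    private final Function<B, ?> getter;

    TypedTableColumn(String header, Function<B, ?> getter) {
        this.header = header;
        this.getter = getter;
    }

    Object valueFor(B row) {
        return getter.apply(row); // no reflection, no string lookup
    }
}

// Usage: renaming Bean.getMyFirstMember() now causes a compile error here.
// taco.add(new TypedTableColumn<>("First member", Bean::getMyFirstMember));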

How to stop ANTLR from suppressing syntax errors?

So I'm writing a compiler in Java using ANTLR, and I'm a little puzzled by how it deals with errors.
The default behavior seems to be to print an error message and then attempt, by means of token insertion and such, to recover from the error and continue parsing. I like this in principle; it means that (in the best case) if the user has committed more than one syntax error, they'll get one message per error, but it'll mention all the errors instead of forcing them to recompile to discover the next one. The default error message is fine for my purposes. The trouble comes when it's done reading all the tokens.
I am, of course, using ANTLR's tree constructors to build abstract syntax trees. While it's nice for the parse to continue through syntax errors so the user can see all the errors, once it's done parsing I want to get an exception or some kind of indication that the input wasn't syntactically valid; that way I can stop the compilation and tell the user "sorry, fix your syntax errors and then try again". What I don't want is for it to spit out an incomplete AST based on what it thinks the user was trying to say, and continue to the next phase of compilation with no indication that anything went wrong (other than the error messages which went to the console and I can't see). Yet by default, it does exactly that.
The Definitive ANTLR Reference offers a technique to stop parsing as soon as a syntax error is detected: override the mismatch and recoverFromMismatchedSet methods to throw RecognitionExceptions, and add a @rulecatch action to do the same. This would seem to lose the benefit of recovering from parse errors, but more importantly, it only partially works. If a necessary token is missing (for instance, if a binary operator only has an expression on one side of it), it throws an exception just as expected, but if an extraneous token is added, ANTLR inserts the token that it thinks belongs there and continues on its merry way, producing an AST with no indication of a syntax error except a console message. (To make matters worse, the token it inserted was EOF, so the rest of the file didn't even get parsed.)
I'm sure I could fix this by, say, adding something like an isValid field to the parser and overriding methods and adding actions so that, at the end of the parse, it throws an exception if there were any errors. But is there a better way? I can't imagine that what I'm trying to do is unusual among ANTLR users.
... [O]nce it's done parsing I want to get an exception or some kind of indication that the input wasn't syntactically valid; that way I can stop the compilation...
You can call getNumberOfSyntaxErrors on both the lexer and the parser after parsing to determine if there was an error that was covertly accommodated by ANTLR. This doesn't tell you what those errors were, obviously, but I think these methods address the "once it's done parsing ... stop the compilation" part of your question.
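A minimal sketch of that check, assuming ANTLR 3.x; ExprLexer, ExprParser, and the start rule program are invented generated names standing in for whatever your grammar produces:

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;

public class FrontEnd {
    public static Object parseOrFail(String source) throws RecognitionException {
        ExprLexer lexer = new ExprLexer(new ANTLRStringStream(source));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExprParser parser = new ExprParser(tokens);
        Object ast = parser.program(); // error messages still reach the console
        if (lexer.getNumberOfSyntaxErrors() > 0
                || parser.getNumberOfSyntaxErrors() > 0) {
            // Errors were reported (and recovered from) during the parse:
            // stop here rather than compile a patched-up AST.
            throw new RuntimeException("input contains syntax errors; aborting");
        }
        return ast;
    }
}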
The Definitive ANTLR Reference offers a technique to stop parsing as soon as a syntax error is detected: override the mismatch and recoverFromMismatchedSet methods to throw RecognitionExceptions, and add a @rulecatch action to do the same.
I don't think you mentioned which version of ANTLR you're using, but the documentation in the ANTLR v3.4 code for the method recoverFromMismatchedSet says it's "not currently used" and an Eclipse "global usage" scan found no callers. Neither here nor there to your main problem, but I wanted to mention it for the record. It may be the correct method to override for your version.
If a necessary token is missing ..., [the overridden code] throws an exception just as expected, but if an extraneous token is added, ANTLR inserts the token that it thinks belongs there and continues on its merry way...
Method recoverFromMismatchedToken tests for a recoverable missing and extraneous token by delegating to methods mismatchIsMissingToken and mismatchIsUnwantedToken respectively. If the appropriate method determines that an insertion or deletion will solve the problem, recoverFromMismatchedToken makes the appropriate correction. If it is determined that no operation solves the mismatched token problem, recoverFromMismatchedToken throws a MismatchedTokenException.
If a recovery operation takes place, reportError is called, which calls displayRecognitionError with the details.
This applies to ANTLR v3.4 and possibly earlier versions.
This gives you at least two options:
Override recoverFromMismatchedToken and handle errors at a fine-grained level. From here you can delegate the call to the super implementation, roll your own recovery code, or bail out with an exception. Whatever the case, your code will be called and thus will be aware that a mismatch error occurred, recoverable or otherwise. This option is probably equivalent to overriding recoverFromMismatchedSet.
Override displayRecognitionError and handle the errors at a coarse-grained level. Method reportError does some state juggling, so I wouldn't recommend overriding it unless the overriding implementation calls the super implementation. Method displayRecognitionError appears to be one of the last calls in the recovered-token call chain, so it would be a reasonable place to determine whether or not to continue. I would prefer it had a name that indicated that it was a reasonable place for that, but oh well. Here is an answer that demonstrates this option.
I'm partial towards overriding displayRecognitionError because it provides the error message text easily enough and because I know it's going to be called only after a token recovery operation and required state juggling -- no need for my parser to figure out how to recover for itself. This coupled with getNumberOfSyntaxErrors appear to give you the options that you're looking for, assuming that you're working with a relevant version of ANTLR and that I fully understood your problem.
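A minimal sketch of the displayRecognitionError option, again assuming ANTLR 3.x and the invented ExprParser/program names from the earlier sketch:

import java.util.ArrayList;
import java.util.List;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;

public class CollectingFrontEnd {
    public static void parseOrFail(CommonTokenStream tokens) throws RecognitionException {
        final List<String> errors = new ArrayList<String>();
        ExprParser parser = new ExprParser(tokens) {
            @Override
            public void displayRecognitionError(String[] tokenNames, RecognitionException e) {
                // Capture the message instead of printing; recovery still happens.
                errors.add(getErrorHeader(e) + " " + getErrorMessage(e, tokenNames));
            }
        };
        parser.program();
        if (!errors.isEmpty()) {
            // Report every collected message, then abort compilation.
            throw new RuntimeException(errors.size() + " syntax error(s): " + errors);
        }
    }
}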

Is it possible to prevent a class from using a method in java?

Suppose I have a class called Foo. This class will be modified by many people, and WILL print information to the console. To this effect, we have the following method:
private void print(String message){ ... }
which prints out to the screen in the format we want.
However, while reviewing code from other devs I see that they constantly call System.out.println(...)
instead, which results in barely-readable printouts.
My question is the following: is it possible to prevent any and every use of System.out.println() in Foo.java? If so, how?
I've tried looking this up, but all I found had to do with inheritance, which is not related to my question.
Thanks a lot!
N.S.
EDIT: I know that whatever I have to do to prevent the use of a method could be removed by a dev, but we have as a policy never to remove code marked //IMPORTANT so it could still be used as a deterrent.
EDIT2: I know I can simply tell the devs not to do it or use code reviews to filter the "errors" out but 1) I'm already doing it and it costs a lot of time and 2) the question is whether this is possible or not, NOT how to deal with my devs.
public methods are just that - public. There is no way to restrict access to them.
This kind of problem is usually "solved" by setting up a code checker like PMD or Checkstyle and integrating it into the continuous integration build, so that violations of this kind get emailed to someone with a big hammer :-)
Although communicating that developers should not use System.out directly would be preferable, you could set System.out to another PrintStream, then use the alternative PrintStream in the private method. That way, when people use System.out.println they won't output anything, but you'll still be able to use the alternative PrintStream... something like they do here: http://halyph.blogspot.com/2011/07/how-to-disable-systemout.html
Pre-commit hooks for your revision control system (SVN, Git, Mercurial) can grep for uses of System.{err,out} and prevent commit if they occur.
http://stuporglue.org/svn-pre-commit-hook-which-can-syntax-check-all-files/ is an example that takes an action for different changed files based on file extension, for SVN. You should be able to modify that example to take an action based on some subset of Java files and reject the commit if something like the following matches:
egrep -q '\bSystem\.(err|out)\b'
You can redirect System.out calls to a stream that ignores the output or that redirects it to your logging system.
System.setOut(printStream);
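A minimal sketch of this approach; the ConsoleGuard class and the [app] prefix are invented for illustration:

import java.io.OutputStream;
import java.io.PrintStream;

public final class ConsoleGuard {
    // Keep a private handle on the real stdout for the sanctioned helper.
    private static final PrintStream REAL_OUT = System.out;

    public static void muteSystemOut() {
        System.setOut(new PrintStream(new OutputStream() {
            @Override
            public void write(int b) {
                // Swallow anything written via System.out.println.
            }
        }));
    }

    // Foo's private print(String) would delegate to something like this:
    static void print(String message) {
        REAL_OUT.println("[app] " + message);
    }
}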
You can also kill those using System.out.println in a production environment.
You can replace the OutputStream behind System.out with your own implementation that either throws an exception or redirects the call to your own print implementation (which you would need to make public).
No, it's not possible to 100% prevent a class from ever using a specific method in Java.
Having that said...
My suggestion would be to add code analysis to your build process and failing the build on any occurrence of System.out.println. A good place to start if you're interested in going this route would be to check out PMD.
Also... have some constructive discussions with your developers and talk about why they're doing what they're doing. Good luck.

Writing long test method names to describe tests vs using in code documentation

For writing unit tests, I know it's very popular to write test methods that look like
public void Can_User_Authenticate_With_Bad_Password()
{
...
}
While this makes it easy to see what the test is testing for, I think it looks ugly and doesn't display well in auto-generated documentation (like Sandcastle or Javadoc).
I'm interested to see what people think about a naming scheme where the name is the method being tested, then an underscore and "Test", then the test number, using the XML code documentation (.NET) or Javadoc comments to describe what is being tested:
/// <summary>
/// Tests for user authentication with a bad password.
/// </summary>
public void AuthenticateUser_Test1()
{
...
}
By doing this I can easily group my tests by the methods they are testing, I can see how many tests I have for a given method, and I still have a full description of what is being tested.
We have some regression tests that run against a data source (an XML file), and these files may be updated by someone without access to the source code (a QA monkey), who needs to be able to read what is being tested, and where, in order to update the data sources.
I prefer the "long names" version - although only to describe what happens. If the test needs a description of why it happens, I'll put that in a comment (with a bug number if appropriate).
With the long name, it's much clearer what's gone wrong when you get a mail (or whatever) telling you which tests have failed.
I would write it in terms of what it should do though:
LogInSucceedsWithValidCredentials
LogInFailsWithIncorrectPassword
LogInFailsForUnknownUser
I don't buy the argument that it looks bad in autogenerated documentation - why are you running JavaDoc over the tests in the first place? I can't say I've ever done that, or wanted generated documentation. Given that test methods typically have no parameters and don't return anything, if the method name can describe them reasonably that's all the information you need. The test runner should be capable of listing the tests it runs, or the IDE can show you what's available. I find that more convenient than navigating via HTML - the browser doesn't have a "Find Type" which lets me type just the first letters of each word of the name, for example...
Does the documentation show up in your test runner? If not that's a good reason for using long, descriptive names instead.
Personally I prefer long names and rarely see the need to add comments to tests.
I've done my dissertation on a related topic, so here are my two cents: Any time you rely on documentation to convey something that is not in your method signature, you are taking the huge risk that nobody would read the documentation.
When developers are looking for something specific (e.g., scanning a long list of methods in a class to see if what they're looking for is already there), most of them are not going to bother to read the documentation. They want to deal with one type of information that they can easily see and compare (e.g., names), rather than have to start redirecting to other materials (e.g., hover long enough to see the JavaDocs).
I would strongly recommend conveying everything relevant in your signature.
Personally I prefer using the long method names. Note that you can also include the name of the method under test, as in:
Can_AuthenticateUser_With_Bad_Password()
I suggest smaller, more focussed (test) classes.
Why would you want to javadoc tests?
What about changing
Can_User_Authenticate_With_Bad_Password
to
AuthenticateDenyTest
AuthenticateAcceptTest
and naming the suite something like User?
As a group, how do we feel about a hybrid naming scheme like this?
/// <summary>
/// Tests for user authentication with a bad password.
/// </summary>
public void AuthenticateUser_Test1_With_Bad_Password()
{
...
}
This way we get the best of both.
