How to require/assert a condition in KNIME? - java

If I want to specify preconditions on the input arguments, what is the idiomatic way when developing KNIME nodes?
Using assert condition : message; might be efficient and simple, though the check only runs when the VM is started with the argument -ea.
Manually checking with if (condition) throw new IllegalArgumentException(message); seems better, but it does not provide extra semantic information when only checking for nulls for example.
There is also the org.knime.core.node.InvalidSettingsException exception. Should that be used for this purpose?
Is there a collection of methods that should be used in KNIME?

Yes, there is a recommended way to signal incorrect inputs: the specialized methods in org.knime.core.node.util.CheckUtils (from the bundle org.knime.core.util). It has methods for:
non-null checks: checkNotNull, checkArgumentNotNull, checkSettingNotNull
arguments: checkArgument
state: checkState
setting (from UI or flow variable): checkSetting
files: checkDestinationFile, checkSourceFile, checkDestinationDirectory
These allow using templates in the messages, which are only expanded when the check fails.
You can find example usages with this query.
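The idea behind such check methods can be sketched without any KNIME dependency. The class below is a hypothetical stand-in (the names Preconditions and ColumnSelector are assumptions for illustration, not part of KNIME); it only shows how a message template can stay unformatted until a check actually fails.

```java
/** Hypothetical stand-in that mimics the style of check described above; not the KNIME class itself. */
final class Preconditions {
    private Preconditions() {}

    /** Throws IllegalArgumentException if the condition is false; the template is formatted only then. */
    static void checkArgument(boolean condition, String template, Object... args) {
        if (!condition) {
            throw new IllegalArgumentException(String.format(template, args));
        }
    }

    /** Returns the value if it is non-null, otherwise throws with the lazily formatted message. */
    static <T> T checkArgumentNotNull(T value, String template, Object... args) {
        if (value == null) {
            throw new IllegalArgumentException(String.format(template, args));
        }
        return value;
    }
}

/** Hypothetical caller that validates its inputs before doing any work. */
final class ColumnSelector {
    static String select(String columnName, int index) {
        Preconditions.checkArgumentNotNull(columnName, "Column name must not be null");
        Preconditions.checkArgument(index >= 0, "Index must be non-negative, got %s", index);
        return columnName + "#" + index;
    }
}
```

Unlike assert, these checks run regardless of the -ea flag, and the string formatting cost is paid only on the failure path.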

Related

What to do in findXY method if no such object exists?

Assume I have a Java method
public X findX(...)
which tries to find an Object of type X fulfilling some conditions (given in the parameters). Often such functions cannot guarantee to find such an object. I can think of different ways to deal with this:
One could write a public boolean existsX(...) method with the same signature which should be called first. This avoids any kind of exceptions and null handling, but you will probably end up with some duplicate logic.
One could just return null (and explain this in javadoc). The caller has to handle it.
One could throw a checked exception (which one would fit for this?).
What would you suggest?
The new Java 8 Optional class was made for this purpose.
If the object exists then you return Optional.of(x) where x is the object, if it doesn't then return Optional.empty(). You can check if an Optional has an object present by using the isPresent() method and you can get the object using get().
https://docs.oracle.com/javase/8/docs/api/java/util/Optional.html
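A minimal sketch of a finder that returns Optional, assuming Java 8 or later. The class UserDirectory and its method findByPrefix are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical lookup class used only to illustrate the Optional return type. */
final class UserDirectory {
    private final List<String> names;

    UserDirectory(List<String> names) {
        this.names = names;
    }

    /** Returns the first name starting with the prefix, or an empty Optional if none matches. */
    Optional<String> findByPrefix(String prefix) {
        for (String name : names) {
            if (name.startsWith(prefix)) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }
}
```

A caller can then write directory.findByPrefix("bo").orElse("none") or test isPresent() before calling get(), so the "not found" case is visible in the method signature rather than hidden in the javadoc.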
If you can't use Optional I would go with option 2.
In the case of 1. You'd be doing double work, first you have to check if X exists, but if it does, you're basically discarding the result, and you have to do the work again in findX. Although the result of existsX could be cached and checked first when calling findX, this would still be an extra step over just returning X.
In the case of 3. To me this comes down to usability. Sometimes you just know that findX will return a result (and if it doesn't, there is a mistake somewhere else), but with a checked exception, you would still have to write the try and (most likely empty) catch block.
So option 2 is the winner to me. It doesn't do extra work, and checking the result is optional. As long as you document the null return, there should be no problems.
Guava (potentially other libraries, too) also offers an Optional class that might be worth exploring if your project uses Guava since you seem to not use Java 8.
If you
don't/can't/won't use Java 8 and its accompanying Optional type
don't want another library dependency (like Guava) for a single class implementation
but
you want something more robust than methods that can return null for which you have to have a check in all consumers (which was the standard before Java 8)
then writing your own Optional as an util class is the easiest option.
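What such a hand-written utility class might look like is sketched below. The name Maybe and its method set are assumptions; it deliberately covers only a small subset of what java.util.Optional offers and avoids Java 8 language features.

```java
import java.util.NoSuchElementException;

/** Hypothetical minimal Optional-like holder for code bases without Java 8 or Guava. */
final class Maybe<T> {
    private static final Maybe<Object> EMPTY = new Maybe<Object>(null);

    private final T value;

    private Maybe(T value) {
        this.value = value;
    }

    /** Wraps a non-null value; null is rejected so that "empty" has exactly one representation. */
    static <T> Maybe<T> of(T value) {
        if (value == null) {
            throw new NullPointerException("value must not be null");
        }
        return new Maybe<T>(value);
    }

    /** Returns the shared empty instance. */
    @SuppressWarnings("unchecked")
    static <T> Maybe<T> empty() {
        return (Maybe<T>) EMPTY;
    }

    boolean isPresent() {
        return value != null;
    }

    /** Returns the value, or throws if this instance is empty. */
    T get() {
        if (value == null) {
            throw new NoSuchElementException("no value present");
        }
        return value;
    }

    /** Returns the value if present, otherwise the given fallback. */
    T orElse(T fallback) {
        return value != null ? value : fallback;
    }
}
```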

Some java method are null safe, some are not, how do I know?

Some Java methods are null-safe, but some are not. How can I distinguish them?
I assume you mean in terms of the parameters? The documentation should state whether or not the arguments can be null, and when they can be null, what semantic meaning is inferred from nullity.
Unfortunately not all documentation is clear like this - and likewise it may not specify whether the return value might be null or not... in which case all you can do is experiment or look at the source code where possible :(
In general, I would suggest that you assume that you cannot pass null as a parameter unless the documentation clearly states that you can and what the corresponding behaviour is.
A problem with taking the default assumption that a parameter might be "null-safe" is that, even if that turns out to be true, it's not always clear without documentation what the corresponding behaviour actually is. "Not throwing an exception" doesn't actually indicate what alternative behaviour/default parameter/assumptions are then going to occur instead.
If you're designing an API, then where it's practical, I would suggest not actually encouraging null to be passed as a parameter to exposed methods/constructors, but rather having separate method signatures that include or omit the various optional parameters. And in any case, you may then need to document in some way what actual behaviour is being taken to make up for the missing parameter.
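The "separate signatures instead of null" suggestion could look like the following sketch. Greeter and its default value are hypothetical; the point is only that the optional parameter gets its own overload and that null is rejected with a message that names the parameter.

```java
import java.util.Objects;

/** Hypothetical API that offers an overload instead of accepting null for an optional parameter. */
final class Greeter {
    private static final String DEFAULT_GREETING = "Hello";

    private Greeter() {}

    /** Uses the documented default greeting; callers never need to pass null. */
    static String greet(String name) {
        return greet(name, DEFAULT_GREETING);
    }

    /** Both arguments are required; null is rejected with a message naming the parameter. */
    static String greet(String name, String greeting) {
        Objects.requireNonNull(name, "name must not be null");
        Objects.requireNonNull(greeting, "greeting must not be null");
        return greeting + ", " + name + "!";
    }
}
```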
If you're lucky, the parameter will be documented or annotated, or both. Unfortunately, most Java APIs lack both.
Some static analysis tools can use annotations to check whether you're passing a null value inappropriately. For example, the FindBugs tool includes support for these annotations:
@NonNull - The value must not be null.
@CheckForNull - The value may contain null.
@Nullable - Whether the value may contain null or not depends on context.
Read the javadocs of the methods you are trying to call. If the javadocs don't specify this, then trial and error in a unit test is probably your best bet.

Java source refactoring of 7000 references

I need to change the signature of a method used all over the codebase.
Specifically, the method void log(String) will take two additional arguments (Class c, String methodName), which need to be provided by the caller, depending on the method where it is called. I can't simply pass null or similar.
To give an idea of the scope, Eclipse found 7000 references to that method, so if I change it the whole project will go down. It will take weeks for me to fix it manually.
As far as I can tell, Eclipse's refactoring plugin is not up to the task, but I really want to automate it.
So, how can I get the job done?
Great, I can copy a previous answer of mine and I just need to edit a tiny little bit:
I think what you need to do is use a source code parser like javaparser to do this.
For every java source file, parse it to a CompilationUnit, create a Visitor, probably using ModifierVisitor as base class, and override (at least) visit(MethodCallExpr, arg). Then write the changed CompilationUnit to a new File and do a diff afterwards.
I would advise against changing the original source file, but creating a shadow file tree may be a good idea (e.g. old file: src/main/java/com/mycompany/MyClass.java, new file: src/main/refactored/com/mycompany/MyClass.java; that way you can diff the entire directories).
Eclipse is able to do that using Refactor -> Change Method signature and provide default values for the new parameters.
For the class parameter the defaultValue should be this.getClass() but you are right in your comment I don't know how to do for the method name parameter.
IntelliJ IDEA shouldn't have any trouble with this.
I'm not a Java expert, but something like this could work. It's not a perfect solution (it may even be a very bad solution), but it could get you started:
Change the method signature with IntelliJ's refactoring tools, and specify default values for the 2 new parameters:
c: this.getClass()
methodName: Thread.currentThread().getStackTrace()[1].getMethodName()
or better yet, simply specify null as the default values.
I think that there are several steps to dealing with this, as it is not just a technical issue but a 'situation':
Decline to do it in short order due to the risk.
Point out the issues caused by not using standard frameworks but reinventing the wheel (as Paul says).
Insist on using Log4j or equivalent if making the change.
Use Eclipse refactoring in sensible chunks to make the changes and deal with the varying defaults.
I have used Eclipse refactoring on quite large changes for fixing old smelly code - nowadays it is fairly robust.
Maybe I'm being naive, but why can't you just overload the method name?
void thing(paramA) {
thing(paramA, THE_DEFAULT_B, THE_DEFAULT_C)
}
void thing(paramA, paramB, paramC) {
// new method
}
Do you really need to change the calling code and the method signature? What I'm getting at is it looks like the added parameters are meant to give you the calling class and method to add to your log data. If the only requirement is just adding the calling class/method to the log data then Thread.currentThread().getStackTrace() should work. Once you have the StackTraceElement[] you can get the class name and method name for the caller.
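A minimal sketch of that idea, under the assumption that the log line only needs the direct caller's class and method. It uses new Throwable().getStackTrace() instead of Thread.currentThread().getStackTrace(); both return a StackTraceElement[], but with the Throwable variant the first element is the frame that created it. CallerAwareLog and OrderService are hypothetical names.

```java
/** Hypothetical logger that derives the caller from the stack instead of taking extra parameters. */
final class CallerAwareLog {
    private CallerAwareLog() {}

    /** Builds and prints a log line prefixed with the class and method that called log(String). */
    static String log(String message) {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is this log method, stack[1] is its direct caller.
        // A VM is allowed to omit frames, so fall back to "unknown" if the caller is missing.
        String origin = stack.length > 1
                ? stack[1].getClassName() + "." + stack[1].getMethodName()
                : "unknown";
        String line = origin + ": " + message;
        System.out.println(line);
        return line;
    }
}

/** Hypothetical caller; its signature for log stays log(String). */
final class OrderService {
    static String placeOrder() {
        return CallerAwareLog.log("order placed");
    }
}
```

With this approach none of the 7000 call sites has to change, at the price of creating a stack trace on every log call.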
If the lines you need replaced fall into a small number of categories, then what you need is Perl:
find -name '*.java' | xargs perl -pi -e 's/log\(([^,)]*?)\)/log(\1, "foo", "bar")/g'
I'm guessing that it wouldn't be too hard to hack together a script which would put the classname (derived from the filename) in as the second argument. Getting the method name in as the third argument is left as an exercise to the reader.
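The same text-based rewrite, with the class name derived from the file name, might be sketched in Java as follows. This is an illustration of the regex approach and shares its limits: it assumes single-argument calls whose argument contains no comma or parenthesis, and it only inserts a placeholder for the method name. LogCallRewriter, the added word boundary, and TODO_METHOD are assumptions, not part of the original one-liner.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical text-based rewriter for simple single-argument log(...) calls. */
final class LogCallRewriter {
    // Matches log(<arg>) where <arg> contains no comma and no closing parenthesis.
    // The leading \b keeps identifiers such as catalog(...) from matching.
    private static final Pattern SINGLE_ARG_LOG = Pattern.compile("\\blog\\(([^,)]*?)\\)");

    private LogCallRewriter() {}

    /** Derives a simple class name from a path such as src/com/acme/Foo.java. */
    static String classNameOf(String fileName) {
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        String base = fileName.substring(slash + 1);
        return base.endsWith(".java") ? base.substring(0, base.length() - ".java".length()) : base;
    }

    /** Appends the class literal and a method-name placeholder to every matching call. */
    static String rewrite(String source, String fileName) {
        String extraArgs = ", " + classNameOf(fileName) + ".class, \"TODO_METHOD\")";
        // $1 re-inserts the original argument; the rest is quoted so it is taken literally.
        String replacement = "log($1" + Matcher.quoteReplacement(extraArgs);
        return SINGLE_ARG_LOG.matcher(source).replaceAll(replacement);
    }
}
```

Calls such as log(msg.trim()) or log("a, b") are outside what this pattern can handle, which is why a real parser is the safer tool for anything beyond a small number of call shapes.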
Try refactoring with IntelliJ. It has a feature called SSR (Structural Search and Replace). You can refer to classes, method names, etc. for context. (seanizer's answer is more promising; I upvoted it.)
I agree with Seanizer's answer that you want a tool that can parse Java. That's necessary but not sufficient; what you really want is a tool that can carry out a reliable mass-change.
To do this, you want a tool that can parse Java, can pattern match against the parsed code, install the replacement call, and spit out the answer without destroying the rest of the source code.
Our DMS Software Reengineering Toolkit can do all of this for a variety of languages, including Java. It parses complete java systems of source, builds abstract syntax trees (for the entire set of code).
DMS can apply pattern-directed, source-to-source transformations to achieve the desired change.
To achieve the OP's effect, he would apply the following program transformation:
rule replace_legacy_log(s:STRING): expression -> expression
" log(\s) " -> " log( \s, \class\(\), \method\(\) ) "
What this rule says is, find a call to log which has a single string argument, and replace it with a call to log with two more arguments determined by auxiliary functions class and method.
These functions determine the containing method name and containing class name for the AST node root where the rule finds a match.
The rule is written in "source form", but actually matches against the AST and replaces found ASTs with the modified AST.
To get back the modified source, you ask DMS to simply prettyprint (to make a nice layout) or fidelity print (if you want the layout of the old code preserved). DMS preserves comments, number radixes, etc.
If the existing application has more than one definition of the "log" function, you'll need to add a qualifier:
... if IsDesiredLog().
where IsDesiredLog uses DMS's symbol table and inheritance information to determine if the specific log refers to the definition of interest.
In fact, your problem cannot be solved by a click'n'play engine that simply replaces all occurrences of
log("some weird message");
by
log(this.getClass(), new Exception().getStackTrace()[1].getMethodName());
as such a replacement has little chance of working in all cases (static methods, for example).
I would suggest taking a look at Spoon. This tool allows source code parsing and transformation, letting you carry out the change in a (obviously code-based) slow but controlled operation.
However, you could also consider replacing your current method with one that explores the stack trace to get the information or, even better, internally use log4j and a log formatter that displays the correct information.
I would search and replace log( with log(#class, #methodname,
Then write a little script in any language (even java) to find the class name and the method names and to replace the #class and #method tokens...
Good luck
If the class and method name are required for "where did this log come from?" type data, then another option is to print out a stack trace in your log method. E.g.
import java.io.PrintWriter;
import java.io.StringWriter;

public void log(String text)
{
    StringWriter sw = new StringWriter();
    PrintWriter pw = new PrintWriter(sw, true);
    // Capture the current call stack as text.
    new Throwable().printStackTrace(pw);
    pw.flush();
    sw.flush();
    String stackTraceAsLog = sw.toString();
    //do something with text and stackTraceAsLog
}

spirit of a jUnit test

Suppose that you have the following logic in place:
processMissing(masterKey, masterValue, p.getPropertiesData().get(i).getDuplicates());
public StringBuffer processMissing(String keyA, String valueA, Set<String> dupes) {
// do some magic
}
I would like to write a jUnit test for processMissing, testing its behavior in the event that dupes is null.
Am I doing the right thing here? Should I check how the method behaves when given null, or perhaps test the method call to make sure null is never sent?
Generally speaking, what is the approach here? We can't test everything for everything. We also can't handle every possible case.
How should one think when deciding what tests to write?
I was thinking about it as this:
I have a certain expectation with the method
The test should define my expectation and confirm that the method works under that condition
Is this the right way to think about it?
Thanks and please let me know
First, define whether null is a valid value for the parameter or not.
If it is, then yes, definitely test the behavior of the method with null.
If it is not, then:
Specify that constraint via parameter documentation.
Annotate that constraint on the parameter itself (using an annotation compatible with the tool below).
Use a static analysis tool to verify that null is never passed.
No unit test is required for the invalid value unless you're writing code to check for it.
The static analysis tool FindBugs supports annotations such as @NonNull, with some limited data-flow analysis.
I personally think it would be unnecessarily expensive within large Java codebases to always write and maintain explicit checks for NULL and corresponding, non-local unit tests.
If you want to ensure that people don't call your API with a null argument, you may want to consider using annotations to make this explicit. JSR 305 covers this, and it's used in Guava. Otherwise you're relying on users reading the javadoc.
As for testing, you're spot on that you can't handle every possible case. Assuming you don't want to support null values, I'd say that you may want to throw an IllegalArgumentException rather than a NullPointerException so you can be explicit about what is null. Then you can just test for that exception being thrown (see the JUnit docs).
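A sketch of that approach without any test framework, so it stays self-contained. The body of processMissing is invented for illustration (the question only says "do some magic"), and rejectsNullDupes is a hypothetical plain-Java equivalent of what an exception test in JUnit would assert.

```java
import java.util.Set;

/** Hypothetical holder for the method under test. */
final class MissingProcessor {
    private MissingProcessor() {}

    /** Rejects a null set explicitly so the failure names the offending parameter. */
    static StringBuffer processMissing(String keyA, String valueA, Set<String> dupes) {
        if (dupes == null) {
            throw new IllegalArgumentException("dupes must not be null");
        }
        // Placeholder logic standing in for the real processing.
        StringBuffer out = new StringBuffer();
        out.append(keyA).append('=').append(valueA).append(" dupes=").append(dupes.size());
        return out;
    }

    /** Returns true if passing null for dupes fails with the expected exception and message. */
    static boolean rejectsNullDupes() {
        try {
            processMissing("key", "value", null);
            return false; // no exception means the precondition is not enforced
        } catch (IllegalArgumentException expected) {
            return "dupes must not be null".equals(expected.getMessage());
        }
    }
}
```

The test then documents the contract: null is invalid, and the method says so loudly instead of failing somewhere deeper with an anonymous NullPointerException.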

Best practice with respect to NPE and multiple expressions on single line

I'm wondering if it is an accepted practice or not to avoid multiple calls on the same line with respect to possible NPEs, and if so in what circumstances. For example:
anObj.doThatWith(myObj.getThis());
vs
Object o = myObj.getThis();
anObj.doThatWith(o);
The latter is more verbose, but if there is an NPE, you immediately know what is null. However, it also requires creating a name for the variable and more import statements.
So my questions around this are:
Is this problem something worth designing around? Is it better to go for the first or second possibility?
Is the creation of a variable name something that would have an effect performance-wise?
Is there a proposal to change the exception message to be able to determine what object is null in future versions of Java ?
Is this problem something worth designing around? Is it better to go for the first or second possibility?
IMO, no. Go for the version of the code that is most readable.
If you get an NPE that you cannot diagnose then modify the code as required. Alternatively, run it using the debugger and use breakpoints and single stepping to find out where the null pointer is coming from.
Is the creation of a variable name something that would have an effect performance-wise?
Adding an extra variable may increase the stack frame size, or may extend the time that some objects remain reachable. But both effects are unlikely to be significant.
Is there a proposal to change the exception message to be able to determine what object is null in future versions of Java ?
Not that I am aware of. Implementing such a feature would probably have significant performance downsides.
The Law of Demeter explicitly says not to do this at all.
If you are sure that getThis() cannot return a null value, the first variant is ok. You can use contract annotations in your code to check such conditions. For instance Parasoft JTest uses an annotation like @post $result != null and flags all methods without the annotation that use the return value without checking.
If the method can return null your code should always use the second variant, and check the return value. Only you can decide what to do if the return value is null, it might be ok, or you might want to log an error:
Object o = getThis();
if (null == o) {
log.error("mymethod: Could not retrieve this");
} else {
o.doThat();
}
Personally I dislike the one-liner code "design pattern", so I side with all those who say to keep your code readable. That said, I have seen much worse lines of code in existing projects, similar to this:
someMap.put(
someObject.getSomeThing().getSomeOtherThing().getKey(),
someObject.getSomeThing().getSomeOtherThing())
I think that no one would argue that this is not the way to write maintainable code.
As for using annotations: unfortunately, not all developers use the same IDE, and Eclipse users would not benefit from the @Nullable and @NotNull annotations. And without the IDE integration these do not have much benefit (apart from some extra documentation). However, I do recommend using assert. While it only helps at run time, it does help to find most NPE causes, has no performance effect, and makes the assumptions your code makes clearer.
If it were me I would change the code to your latter version but I would also add logging (maybe print) statements with a framework like log4j so if something did go wrong I could check the log files to see what was null.
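A middle ground between the two variants in the question is to attach a message to each dereference with java.util.Objects.requireNonNull (available since Java 7), so the exception itself says what was null. This is a sketch; NullPinpoint and its nested Source class are hypothetical stand-ins for the objects in the question.

```java
import java.util.Objects;

/** Hypothetical example showing null checks that name the offending expression. */
final class NullPinpoint {
    private NullPinpoint() {}

    /** Stand-in for myObj in the question. */
    static final class Source {
        private final Object value;

        Source(Object value) {
            this.value = value;
        }

        Object getThis() {
            return value;
        }
    }

    /** Fails with a message that identifies which reference was null. */
    static String describe(Source myObj) {
        Objects.requireNonNull(myObj, "myObj is null");
        Object o = Objects.requireNonNull(myObj.getThis(), "myObj.getThis() returned null");
        return o.toString();
    }
}
```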
