Code analyzers: PMD & FindBugs - java

1. Regarding PMD:
1.1 How do I configure the PMD checks so that some of them are ignored, like "Variable name is too short, or too long" or "Remove empty constructor"? If I follow that last advice, another warning appears saying that the class must have some static methods. Basically, the class is empty, reserved for later development, and I would like to leave it that way for now.
1.2 Is it necessary to follow this warning advice?
A class which only has private constructors should be final
1.3 What is that supposed to mean?
The class 'Dog' has a Cyclomatic Complexity of 3 (Highest = 17)
1.4 What about this one? I would love to change this, but nothing crosses my mind at the moment regarding the change:
Assigning an Object to null is a code smell. Consider refactoring.
2. Regarding FindBugs:
2.1 Is it really that bad to write to a static field, at some point later than its declaration? The following code gives me a warning:
Main.appCalendar = Calendar.getInstance();
Main.appCalendar.setTimeInMillis(System.currentTimeMillis());
where appCalendar is a static variable.
2.2 This code:
strLine = objBRdr.readLine().trim();
gives the warning:
Immediate dereference of the result of readLine()
where objBRdr is a BufferedReader(FileReader). What could happen? readLine() could be null?
The code is nested in while (objBRdr.ready()) test, and so far, I have zero problems there.
Update1: 2.2 was fixed when I replaced the code with:
strLine = objBRdr.readLine();
if (strLine != null) {
strLine = strLine.trim();
}

1.1 How do I set the PMD checks [...]
PMD stores rule configuration in a special repository referred to as the Ruleset XML file. This configuration file carries information about currently installed rules and their attributes.
These files are located in the rulesets directory of the PMD distribution. When using PMD with Eclipse, check Customizing PMD.
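Besides editing the ruleset, individual findings can also be silenced in the source itself. The sketch below is only an illustration: it assumes the PMD rule names ShortVariable and UnnecessaryConstructor (check the exact names your PMD version reports), and the class EmptyForNow is hypothetical.

```java
// Hypothetical placeholder class, kept nearly empty for later development.
// @SuppressWarnings("PMD.<RuleName>") silences one PMD rule for the annotated element;
// @SuppressWarnings("PMD") would silence every PMD rule for it.
@SuppressWarnings({"PMD.UnnecessaryConstructor", "PMD.ShortVariable"})
class EmptyForNow {

    EmptyForNow() {
        // intentionally empty
    }

    static int twice(int n) { // a one-letter name would normally be reported as too short
        return n * 2;
    }

    static int answer() {
        int x = 21; // NOPMD - this marker suppresses violations reported on this line
        return twice(x);
    }
}
```

The annotation keeps the suppression next to the code it concerns, while the ruleset file is the better place for rules you never want to run.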
1.2 Is it necessary to follow this warning advice?
A class which only has private constructors should be final
All constructors always begin by calling a superclass constructor. If the constructor explicitly contains a call to a superclass constructor, that constructor is used. Otherwise the no-argument constructor is implied. If the no-argument constructor does not exist or is not visible to the subclass, you get a compile-time error.
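So it's generally not possible to derive a subclass from a class whose every constructor is private (the only exception is a class nested inside the same top-level class, which can see the private constructor). Marking such a class as final is thus a good idea (but not necessary), as it explicitly prevents subclassing.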
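A minimal sketch of what the rule asks for, using a hypothetical utility class MathUtil:

```java
// Hypothetical utility class: its only constructor is private, so it is declared final.
final class MathUtil {

    private MathUtil() {
        // guards against accidental instantiation from inside the class
        throw new AssertionError("no instances");
    }

    static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }
}

// Even without 'final', a top-level subclass would be rejected by the compiler,
// because its implicit super() call cannot see the private constructor:
// class Broken extends MathUtil { }
```

The final keyword changes nothing for callers here; it only documents the intent that the compiler would enforce anyway.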
1.3 What is that supposed to mean?
The class 'Dog' has a Cyclomatic Complexity of 3 (Highest = 17)
The complexity is the number of decision points in a method plus one for the method entry. The decision points are 'if', 'while', 'for', and 'case labels'. Generally, 1-4 is low complexity, 5-7 indicates moderate complexity, 8-10 is high complexity, and 11+ is very high complexity.
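To make the counting concrete, here is a small sketch with hypothetical methods, annotated using the definition above (one for the method entry plus one per decision point). Some tools count additional constructs, so the number a given analyzer reports may differ.

```java
final class ComplexityDemo {

    // Cyclomatic complexity 4: 1 (method entry) + if + for + if
    static int countPositive(int[] values) {
        if (values == null) {        // +1
            return 0;
        }
        int count = 0;
        for (int v : values) {       // +1
            if (v > 0) {             // +1
                count++;
            }
        }
        return count;
    }

    // Cyclomatic complexity 1: no decision points at all
    static int twice(int n) {
        return n * 2;
    }
}
```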
That said, I'll just quote some parts of "Aggregate Cyclomatic complexity is meaningless":
[...] This metric only has meaning in the context of a single method. Mentioning that a class has a Cyclomatic complexity of X is essentially useless.
Because Cyclomatic complexity measures pathing in a method, every method has at least a Cyclomatic complexity of 1, right? So, the following getter method has a CCN value of 1:
public Account getAccount(){
return this.account;
}
It’s clear from this boogie method that account is a property of this class. Now imagine that this class has 15 properties and follows the typical getter/setter paradigm for each property and those are the only methods available. That means the class has 30 simple methods, each with a Cyclomatic complexity value of 1. The aggregate value of the class is then 30.
Does this value have any meaning, man? Of course, watching it over time may yield something interesting; however, on its own, as an aggregate value, it is essentially meaningless. 30 for the class means nothing, 30 for a method means something though.
The next time you find yourself reading a copasetic aggregate Cyclomatic complexity value for a class, make sure you understand how many methods the class contains. If the aggregate Cyclomatic complexity value of a class is 200– it shouldn’t raise any red flags until you know the count of methods. What’s more, if you find that the method count is low yet the Cyclomatic complexity value is high, you will almost always find the complexity localized to a method.
Right on!
So to me, this PMD rule should be taken with care (and is actually not very valuable).
1.4 What about this one? I would love to change this, but nothing crosses my mind at the moment regarding the change:
Assigning an Object to null is a code smell. Consider refactoring.
Not sure what you don't get about this one.
2.1 Is it really that bad to write to a static field, at some point later than its declaration? [...]
My guess is that you get a warning because the method contains an unsynchronized lazy initialization of a non-volatile static field. And because the compiler or processor may reorder instructions, threads are not guaranteed to see a completely initialized object, if the method can be called by multiple threads. You can make the field volatile to correct the problem.
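As an illustration only, two common ways to avoid an unsynchronized lazy write to a static field are sketched below. The class AppClock and its member names are hypothetical, not taken from the question's code.

```java
import java.util.Calendar;

final class AppClock {

    // Option 1: assign once at declaration; class initialization is thread-safe.
    static final Calendar STARTED_AT = Calendar.getInstance();

    // Option 2: keep the lazy initialization, but guard it so that only one
    // thread at a time can run the check-then-create sequence.
    private static Calendar lazyCalendar;

    static synchronized Calendar lazy() {
        if (lazyCalendar == null) {
            lazyCalendar = Calendar.getInstance();
        }
        return lazyCalendar;
    }

    private AppClock() { }
}
```

If the value is always needed, option 1 is the simpler choice; the synchronized accessor only pays off when creating the object is expensive and may never be required.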
2.2 [...] Immediate dereference of the result of readLine()
If there are no more lines of text to read, readLine() will return null and dereferencing that will generate a null pointer exception. So you need indeed to check if the result is null.
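The usual idiom tests the result of readLine() directly in the loop condition instead of relying on ready(). A minimal sketch, with a hypothetical helper class and an in-memory reader standing in for the file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class LineReaderDemo {

    // Reads every line and trims it; the loop ends when readLine() returns null (end of stream).
    static List<String> readTrimmedLines(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line.trim());
        }
        return lines;
    }

    // Convenience wrapper for in-memory text, used here only to demonstrate the loop.
    static List<String> readTrimmedLines(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return readTrimmedLines(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a real file, the same loop works on a BufferedReader wrapped around a FileReader, as in the question.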

Here are some ideas / answers:
1.4 What is the reason for assigning null to an object? If you reuse the same variable, there's no reason to set it to null first.
2.1 The point of this warning is to make sure that every instance of the class Main sees the same value for its static fields. In your Main class, you could write:
static Calendar appCalendar = Calendar.getInstance() ;
As for 2.2, you're right: with the null check, you can be sure you won't get a NullPointerException. You never know when your BufferedReader might block or fail; it doesn't happen often (in my experience), but you never know when a hard drive will crash.

Related

Memory/Performance differences of declaring variable for return result of method call versus inline method call

Are there any performance or memory differences between the two snippets below? I tried to profile them using visualvm (is that even the right tool for the job?) but didn't notice a difference, probably due to the code not really doing anything.
Does the compiler optimize both snippets down to the same bytecode? Is one preferable over the other for style reasons?
boolean valid = loadConfig();
if (valid) {
// OK
} else {
// Problem
}
versus
if (loadConfig()) {
// OK
} else {
// Problem
}
The real answer here: it doesn't even matter much what javap tells you about the corresponding bytecode!
If that piece of code is executed like "once"; then the difference between those two options would be in the range of nanoseconds (if at all).
If that piece of code is executed like "zillions of times" (often enough to "matter"); then the JIT will kick in. And the JIT will optimize that bytecode into machine code; very much dependent on a lot of information gathered by the JIT at runtime.
Long story short: you are spending time on a detail so subtle that it doesn't matter in practical reality.
What matters in practical reality: the quality of your source code. In that sense: pick that option that "reads" the best; given your context.
Given the comment: I think in the end, this is (almost) a pure style question. Using the first way it might be easier to trace information (assuming the variable isn't boolean, but more complex). In that sense: there is no "inherently" better version. Of course: option 2 comes with one line less; uses one variable less; and typically: when one option is as readable as another; and one of the two is shorter ... then I would prefer the shorter version.
If you are going to use the variable only once then the compiler/optimizer will resolve the explicit declaration.
Another thing is the code quality. There is a very similar rule in sonarqube that describes this case too:
Local Variables should not be declared and then immediately returned or thrown
Declaring a variable only to immediately return or throw it is a bad practice.
Some developers argue that the practice improves code readability, because it enables them to explicitly name what is being returned. However, this variable is an internal implementation detail that is not exposed to the callers of the method. The method name should be sufficient for callers to know exactly what will be returned.
https://jira.sonarsource.com/browse/RSPEC-1488
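A short sketch of the pattern that rule describes, using a hypothetical validity check:

```java
final class ConfigCheck {

    // Style the rule flags: the local variable exists only to be returned.
    static boolean isValidVerbose(String config) {
        boolean valid = config != null && !config.isEmpty();
        return valid;
    }

    // Style the rule prefers: return the expression directly.
    static boolean isValid(String config) {
        return config != null && !config.isEmpty();
    }
}
```

Both methods behave identically; the difference is purely one of readability.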

Why is it bad practice not to initialize primitive fields in a class?

Java primitive fields always have a default value of 0 in memory (booleans are false = 0). So why is it considered bad practice not to initialize them, given that they already have a predefined value through this mechanism? And in arrays, even after creating a new int[8], the values in it aren't explicitly initialized, but that isn't frowned upon...
By explicitly defining a value, it's clear that you intended that value at that point of execution. If not, another reader might interpret it as if you either forgot to initialize this variable or you don't care at that point (and will set it somewhere else later).
In other words, it's some kind of implicit documentation. Generally, it's considered better practice to write verbose code for better readability; i.e. never use abbreviations for methods names, write them out!
Also, if you have to write line comments (//), you can almost always replace them by wrapping the following code into a well-named method. Implicit documentation ftw! :)
ALL instance variables are initialized. If you don't specify a value, the default value is used.
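A small sketch of those defaults, using a hypothetical class whose fields are deliberately left without initializers:

```java
final class Defaults {
    int count;                // defaults to 0
    boolean enabled;          // defaults to false
    double ratio;             // defaults to 0.0
    String name;              // references default to null
    int[] slots = new int[8]; // every array element defaults to 0
}
```

Reading any of these fields right after new Defaults() yields the value given in the comment.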
Who says it's bad practice to not initialize instance variables? I tend not to initialize them unless it's to a non-default value, but it's not a big deal either way. It's about readability and reducing "code noise" improves readability. Useless initializing is code noise IMHO.
Say I am writing a small game and every single entity (enemy, player, etc.) starts with 100 health: there is no point in calling a sethealth(100) method every time a new entity is created.
So basically, imo unless you need to use a certain value other than zero, I would not initialize them. Same goes for booleans, unless you need something to be true right off the bat, no point in touching it.
It's not bad practice, and I've seen experienced developers who do initialise, and those who don't.
My preference is to initialise, as it shows the developer has taken the time to consider what the values should be on start-up and is not just relying on the compiler using defaults.
It is not bad practice to initialize the instance variables of a class, but it is not necessary either: if you forget or choose not to initialize one, the default value is assigned to it.
Initialization is required when you want the instance variables to hold a specific value at the time the object is created.

Is there a zero-time, startup (no recompilation) switchable condition flag in Java?

I'm looking for a way to provide the fastest possible on/off flag for an if condition (I mean zero-time: resolved at compilation, class-loading, or JIT time). Of course, this condition will be changed only once per application run, at startup.
I know that "compile-time constant if conditions" can be conditionally compiled and the whole condition can be removed from the code. But what is the fastest (and possibly simplest) alternative that does not require recompiling the sources?
Can I move the condition into a separate .jar containing a single class and method with the condition, produce two versions of that .jar, and switch between those versions on the classpath at application startup? Will the JIT remove the call to a method in a separate .jar if it discovers that the method is empty?
Can I do it by providing two classes on the classpath implementing "ClassWithMyCondition", where one of those classes has a real implementation and the other has just an empty method, and instantiating one of them with Class.forName and .newInstance()? Will the JIT remove the call to the empty method from my primary, deeply loop-nested method?
What would be the simplest bytecode manipulation solution to this problem?
A standard way to do this sort of logic is to create an interface for the functionality you want, and then to create two (or more) implementations for that functionality. Only one of the implementations will be loaded in your runtime, and that implementation can make the assumptions it needs to in order to avoid the if condition entirely.
This has the advantage that each implementation is mutually exclusive, and things like the JIT compiler can ignore all the useless code for this particular run.
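A sketch of that interface approach. All names here (Tracer, NoOpTracer, CountingTracer, TracerFactory, and the "tracing" system property) are hypothetical, and whether the JIT actually eliminates the empty call depends on the JVM and on how the call site is used.

```java
interface Tracer {
    void trace(String message);
}

// Does nothing; a call site that only ever sees this class is a good inlining candidate.
final class NoOpTracer implements Tracer {
    @Override
    public void trace(String message) { }
}

// Stand-in for the "real" implementation: it just counts the calls.
final class CountingTracer implements Tracer {
    int calls;

    @Override
    public void trace(String message) {
        calls++;
    }
}

final class TracerFactory {

    // Picks the implementation once; the rest of the code never tests the flag again.
    static Tracer create(boolean enabled) {
        return enabled ? new CountingTracer() : new NoOpTracer();
    }

    // Reads the hypothetical -Dtracing=true switch at startup.
    static Tracer fromSystemProperty() {
        return create(Boolean.getBoolean("tracing"));
    }
}
```

The application would store the result of fromSystemProperty() in a final field at startup and call trace() unconditionally from the hot loop.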
The simplest solution works here. Don't overcomplicate things for yourself.
Just put a final static boolean that isn't a compile-time constant (as defined in the JLS) somewhere and reference it wherever you want the "conditional" compilation. The JVM will evaluate it the first time it sees it, and by the time the code gets JIT'ed, the JVM will know that the value won't change and can then remove the check and, if the value is false, the block.
Some sources: Oracle has a wiki page on performance techniques which says to use constants when possible (note that in this context, the compiler is the JVM/JIT, and therefore a final field counts as a constant even if it isn't a compile-time constant by JLS standards). That page links to an index of performance tactics the JIT takes, which mentions techniques such as constant folding and flow-sensitive rewrites, including dead code removal.
You can pass custom values in the command line, and then check for that value once. So in your code, have something like this:
final static boolean customProp = "true".equalsIgnoreCase(System.getProperty("customProp"));
Depending on your command line parameters, the static final value will change. This will set the value to true:
java -DcustomProp="true" -jar app.jar
While this will set the value to false:
java -jar app.jar
This gives you the benefits of a static final boolean, but allows the value to be altered without recompiling.
[Edit]
As indicated in the comments, this approach does not allow for optimizations at compile time. The value of the static final boolean is set on classload, and is unchanged from there. "Normal" execution of the bytecode will likely need to evaluate every if (customProp). However, JIT happens at runtime, compiling bytecode down to native code. At this point, since the bytecode has the runtime value, more aggressive optimizations like inlining or excluding code are possible. Note that you cannot predict exactly if or when the JIT will kick in, though.
You should load the value from a properties file so that you can avoid having to recompile each time it changes. Simply update the text file and, on the next program run, it uses the new value. Here's an example I wrote a long time ago:
https://github.com/SnakeDoc/JUtils/blob/master/src/net/snakedoc/jutils/Config.java
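Independently of the linked class, a minimal sketch of the idea with java.util.Properties could look like this. The class FlagLoader and the key "feature.enabled" are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class FlagLoader {

    // Reads the hypothetical key "feature.enabled"; a missing key counts as false.
    static boolean loadFlag(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Boolean.parseBoolean(props.getProperty("feature.enabled", "false"));
    }
}
```

In an application, the Reader would come from the properties file on disk, and the result would be stored once in a static final field at startup.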
The JIT recompiles the code every time you run it. You are doing this already whether you know it or not. This means that if you have a field which the JIT believes is not changed (it doesn't even have to be final), it will be inlined and the check and code optimised away.
Trying to outsmart the JIT is getting harder over time.

Java variable declaration efficiency

As I understand it, in the case of an array, Java checks the index against the size of the array.
So instead of using array[i] multiple times in a loop, it is better to declare a variable which stores the value of array[i], and use that variable multiple times.
My question is, if I have a class like this:
public class MyClass {
int value;
public MyClass(int value) {
this.value = value;
}
}
If I create an instance of this class somewhere else (MyClass myobject = new MyClass(7)) and I have to use the object's value multiple times, is it okay to use myobject.value often, or would it be better to declare a variable which stores that value and use that multiple times, or would it be the same?
In your case, it wouldn't make any difference, since referencing myobject.value is as fast and effective as referencing a new int variable.
Also, the JVM is usually able to optimize these kinds of things, and you shouldn't spend time worrying about it unless you have a highly performance critical piece of code. Just concentrate on writing clear, readable code.
The short answer is yes (in fact, in the array case, it does not only have to check the index limit but to calculate the actual memory position of the reference you are looking for -as in i=7, get the base position of the array and add 7 words-).
The long answer is that, unless you are really using that value a lot (and I mean a lot) and you are really constrained due to speed, it is not worth the added complexity of the code. Add to that that the local variable means that your JVM uses more memory, may hit a cache fault, and so on.
In general, you should worry more about the efficiency of your algorithm (the O(n)) and less about these tiny things.
The Java compiler is no bozo. It will do that optimization for you. There is 0 speed difference between all the options you give, usually.
I say 'usually' because accessing the original object and accessing your local copy aren't always the same thing. If your array is globally visible, and another thread is accessing it, the two forms will yield different results, and the compiler cannot optimize one into the other. It is possible that something confuses the compiler into thinking there may be a problem, even though there isn't. Then it won't apply a legal optimization.
However, if you aren't doing funny stuff, the compiler will see what you're doing and optimize variable access for you. Really, that's what a compiler does. That's what it's for.
You need to optimize at least one level above that. This one isn't for you.

Why does javac complain about not initialized variable?

For this Java code:
String var;
clazz.doSomething(var);
Why does the compiler report this error:
Variable 'var' might not have been initialized
I thought all variables or references were initialized to null. Why do you need to do:
String var = null;
??
Instance and class variables are initialized to null (or 0), but local variables are not.
See §4.12.5 of the JLS for a very detailed explanation which says basically the same thing:
Every variable in a program must have a value before its value is used:
Each class variable, instance variable, or array component is initialized with a default value when it is created:
[snipped out list of all default values]
Each method parameter is initialized to the corresponding argument value provided by the invoker of the method.
Each constructor parameter is initialized to the corresponding argument value provided by a class instance creation expression or explicit constructor invocation.
An exception-handler parameter is initialized to the thrown object representing the exception.
A local variable must be explicitly given a value before it is used, by either initialization or assignment, in a way that can be verified by the compiler using the rules for definite assignment.
It's because Java is being very helpful (as much as possible).
It will use this same logic to catch some very interesting edge-cases that you might have missed. For instance:
int x;
if(cond2)
x=2;
else if(cond3)
x=3;
System.out.println("X was:"+x);
This will fail because there was an else case that wasn't specified. The fact is, an else case here should absolutely be specified, even if it's just an error (The same is true of a default: condition in a switch statement).
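For comparison, a version that javac accepts, because every path either assigns x or leaves the method. The class and method names are hypothetical.

```java
final class DefiniteAssignment {

    static int pick(boolean cond2, boolean cond3) {
        int x; // deliberately not initialized
        if (cond2) {
            x = 2;
        } else if (cond3) {
            x = 3;
        } else {
            // Without this branch, javac rejects the read of x below.
            throw new IllegalStateException("neither condition holds");
        }
        return x;
    }
}
```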
What you should take away from this, interestingly enough, is don't ever initialize your local variables until you figure out that you actually have to do so. If you are in the habit of always saying "int x=0;" you will prevent this fantastic "bad logic" detector from functioning. This error has saved me time more than once.
Ditto on Bill K. I add:
The Java compiler can protect you from hurting yourself by failing to set a variable before using it within a function. Thus it explicitly does NOT set a default value, as Bill K describes.
But when it comes to class variables, it would be very difficult for the compiler to do this for you. A class variable could be set by any function in the class. It would be very difficult for the compiler to determine all possible orders in which functions might be called. At the very least it would have to analyze all the classes in the system that call any function in this class. It might well have to examine the contents of any data files or database and somehow predict what inputs users will make. At best the task would be extremely complex, at worst impossible. So for class variables, it makes sense to provide a reliable default. That default is, basically, to fill the field with bits of zero, so you get null for references, zero for integers, false for booleans, etc.
As Bill says, you should definitely NOT get in the habit of automatically initializing variables when you declare them. Only initialize variables at declaration time if this really makes sense in the context of your program. Like, if 99% of the time you want x to be 42, but inside some IF condition you might discover that this is a special case and x should be 666, then fine, start out with "int x=42;" and inside the IF override this. But in the more normal case, where you figure out the value based on whatever conditions, don't initialize to an arbitrary number. Just fill it with the calculated value. Then if you make a logic error and fail to set a value under some combination of conditions, the compiler can tell you that you screwed up rather than the user.
PS I've seen a lot of lame programs that say things like:
HashMap myMap=new HashMap();
myMap=getBunchOfData();
Why create an object to initialize the variable when you know you are promptly going to throw this object away a millisecond later? That's just a waste of time.
Edit
To take a trivial example, suppose you wrote this:
int foo;
if (bar<0)
foo=1;
else if (bar>0)
foo=2;
processSomething(foo);
This will produce an error at compile time, because the compiler will notice that when bar==0, you never set foo, but then you try to use it.
But if you initialize foo to a dummy value, like
int foo=0;
if (bar<0)
foo=1;
else if (bar>0)
foo=2;
processSomething(foo);
Then the compiler will see that no matter what the value of bar, foo gets set to something, so it will not produce an error. If what you really want is for foo to be 0 when bar is 0, then this is fine. But if what really happened is that you meant one of the tests to be <= or >= or you meant to include a final else for when bar==0, then you've tricked the compiler into failing to detect your error. And by the way, that's why I think such a construct is poor coding style: Not only can the compiler not be sure what you intended, but neither can a future maintenance programmer.
I like Bill K's point about letting the compiler work for you- I had fallen into initializing every automatic variable because it 'seemed like the Java thing to do'. I'd failed to understand that class variables (ie persistent things that constructors worry about) and automatic variables (some counter, etc) are different, even though EVERYTHING is a class in Java.
So I went back and removed the initialization I'd been using, for example
List<Thing> somethings = new ArrayList<Thing>();
somethings.add(somethingElse); // <--- this is completely unnecessary
Nice. I'd been getting a compiler warning for
List<Thing> somethings = new ArrayList();
and I'd thought the problem was lack of initialization. WRONG. The problem was that I hadn't understood the rules: I needed the <Thing> identified in the "new", not any actual items of type <Thing> created.
(Next I need to learn how to put literal less-than and greater-than signs into HTML!)
I don't know the logic behind it, but local variables are not initialized to null. I guess to make your life easy. They could have done it with class variables if it were possible. It doesn't mean you have to have it initialized in the beginning. This is fine :
MyClass cls;
if (condition) {
cls = something;
} else {
cls = something_else;
}
Sure, if you've really got two lines on top of each other as you show (declare it, fill it), there is no need for a default constructor. But, for example, if you want to declare something once and use it several or many times, the default constructor or null declaration is relevant. Or is the pointer to an object so lightweight that it's better to allocate it over and over inside a loop, because the allocation of the pointer costs so much less than the instantiation of the object? (Presumably there's a valid reason for a new object at each step of the loop).
Bill IV
