Check Comments/Javadoc number of words with Checkstyle


Is it possible to set up a Checkstyle rule that counts the number of words in a comment and reports a problem if the count is under a defined limit? I searched the Checkstyle Javadoc properties but did not find anything useful.
For Example:
/**
* Stores the database connection.
*/
private Connection mConnection;
The comment contains more than 3 words and would be correct. But if the number of words were under the limit, Checkstyle should mark it as a problem.
If this is possible, it would be great if the rule applied to every part of a comment (the method description, the parameters, the description of the return value and so on).

There isn't a standard Checkstyle check to do that. However you could write your own:
The "Writing Checks" page provides an introduction.
You could use the source code for the classes in the ...checks.javadoc package to help you understand how to deal with comments.
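As a rough starting point, the word-counting part of such a check could look like the sketch below. It is plain Java with no Checkstyle dependency. The class name JavadocWordCounter, the method countWords and the rule "count whitespace-separated tokens after stripping the comment delimiters" are all assumptions made for illustration. Wiring this into a real check still has to follow the "Writing Checks" page.

```java
final class JavadocWordCounter {

    private JavadocWordCounter() {
    }

    /**
     * Counts whitespace-separated words in a raw Javadoc comment.
     * Assumption: block tags such as @param are counted as ordinary words.
     */
    static int countWords(String rawComment) {
        String text = rawComment
                .replace("/**", " ")
                .replace("*/", " ")
                // Drop the leading '*' that decorates each comment line.
                .replaceAll("(?m)^\\s*\\*", " ")
                .trim();
        if (text.isEmpty()) {
            return 0;
        }
        return text.split("\\s+").length;
    }

    public static void main(String[] args) {
        String comment = "/**\n* Stores the database connection.\n*/";
        System.out.println(countWords(comment));
    }
}
```

For the comment in the question, countWords returns 4, so a hypothetical minimum of 3 would be satisfied.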
Comment: word count is not a valid measure of javadoc quality. It is not bad style to have terse or minimal descriptions for things whose meaning is blatantly self-evident in the context. Besides, counting words is only going to encourage people to pad out their javadocs with useless or meaningless stuff, e.g. "yadda, yadda, yadda" :-).
If you want to make a meaningful assessment of javadoc quality, a human being needs to read them.

Related

How to abbreviate "number of <items>" in source code?

I'm new to learning Java and I'm bad at English but I try my best to write good, understandable source code.
I want to create variables that store the "number of cars" or "number of items". How do I abbreviate "number of ..." without using symbols like # that don't work in source code?
Thanks

You have several choices:
numberOfItems (verbose, but clear in meaning)
numItems (it's ok)
itemCount (probably, the best — what I'd have used)
items (shortest, but can't know if it is an integer or a list of items)

I try my best to write good, understandable source code
Then my best advice would be not to abbreviate variable names.
Just go with numberOfCars.
Why?
I know you've probably seen a lot of programs where people use one-letter variables or stuff like numCars.
Abbreviating your variables makes your code less clear for others (including you in 6 months).
We all have great text editors with auto-completion on variables; use that.

CheckStyle with warning '100' is a magic number

In my code, Checkstyle shows a warning with the message '100' is a magic number. See the code below:
int randomNo = generator.nextInt(100);
I read about it in "What is a magic number, and why is it bad?", but my doubt is that declaring 100 as a static variable will occupy more space, since I am using it in only one place. Is this a correct way to solve it?
public static final int HUNDRED = 100;
Any suggestions?

well HUNDRED really is kinda silly, but why did you choose 100 and what is its meaning?
something like:
public static final int RANDOM_UPPER_LIMIT=100;
or something even more informative, depending on what you use the value for:
public static final int MAX_NUMBER_OF_COLORS=100;
would make more sense and improve readability.
Space saving should not be a consideration in this case; the space overhead of declaring a variable, if there is any, is completely negligible.
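Put together, the call from the question might end up looking like the sketch below. The class name RandomPicker and the method nextRandomNo are made up for illustration, and RANDOM_UPPER_LIMIT is only a placeholder until you know what the 100 actually stands for.

```java
import java.util.Random;

final class RandomPicker {

    // Hypothetical name: it says what the bound means instead of repeating its value.
    private static final int RANDOM_UPPER_LIMIT = 100;

    private final Random generator;

    RandomPicker(Random generator) {
        this.generator = generator;
    }

    /** Returns a value from 0 (inclusive) to RANDOM_UPPER_LIMIT (exclusive). */
    int nextRandomNo() {
        return generator.nextInt(RANDOM_UPPER_LIMIT);
    }
}
```

The constant is private here because, as far as the question shows, it is only used in one place.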

It doesn't really have to do with storage; it's about readability. If you want to change some number, it can be difficult to find in the code; if it's up at the top, that's better (and if it's in a config file, better yet in many cases).
Yes, that's a good solution.
If you don't need it outside that file, you should make it "private", and you might want to be even more readable and use a name that indicates what it really means, like:
MAX_RANDOM_NUMBER=100
better yet include what it's used for
MAX_RANDOM_FOR_CARD_SELECTION
or something like that.
In this way when you go to look into that file 5 months from now because you added 20 new cards, it's completely obvious what you have to change without even glancing at the code.

It's good to write not the shortest code, but code which is easy to understand and maintain. By storing this 100 in a constant you can add a good name explaining why it's really 100. For example, if you want to generate a random score and your maximum possible score is 100, then you can define
static final int MAX_SCORE = 100;
After that you can use it in other places as well. Everybody will understand why it's 100 and nothing else. And if someday you need to change it, say, to 200, you will have to replace it in only one place without searching through the code.
Also, it's possible that in some other part of your program you will have a 100 which has a different meaning (say, MAX_PERCENT). If you want to change MAX_SCORE to 200 but leave MAX_PERCENT as is, it will be much easier if you have separate constants.
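A small sketch of that last point, assuming a hypothetical GameLimits class with two validation helpers: both constants happen to be 100 today, but each can change without touching the other.

```java
final class GameLimits {

    // Same value today, different meaning: each can be changed independently.
    static final int MAX_SCORE = 100;
    static final int MAX_PERCENT = 100;

    private GameLimits() {
    }

    static boolean isValidScore(int score) {
        return score >= 0 && score <= MAX_SCORE;
    }

    static boolean isValidPercent(int percent) {
        return percent >= 0 && percent <= MAX_PERCENT;
    }
}
```

Raising MAX_SCORE to 200 later would change isValidScore only; isValidPercent keeps its meaning.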

Check out Robert Martin's (Uncle Bob) book Clean Code for a thorough explanation (or any other guide on coding style). Basically a hard coded '100' doesn't mean anything to the reader of your code. It won't mean anything to you in 6 months either, after you are done with your app. It is a magic number since it appears in the code - in 99 out of 100 cases - as almost out of nowhere. Why is it 100 and not 99 or 101? Why should your app limit the random number generation to 100 and not above (or below)?
To wrap up, it's about the readability of your code, for yourself and for present or future readers; it's a matter of coding style.

Most compilers, including JIT compilers, will inline constant primitives, including statics. I.e., at compile time the reference to the constant is replaced by its value, so the call is expanded back to
generator.nextInt(100);
So in practice there won't be space trade-offs, and you're improving readability.
Be careful about expanding the idea out to more complicated code though. There are compiler and language dependent rules on when it is able to do certain optimisations.

what is the standard number of parameters that a java method should have?

I am writing a program (a code-smell detector) that checks the number of parameters of a method and prints a warning message if there are more than the standard allows. The problem is that I don't know what the agreed number is. I have looked around and not had any luck. Can anyone tell me, or at least point me in the right direction?

There is no standard limit on the number of parameters you can specify in Java, but according to "Code Complete" (see this post) you should limit the number of parameters to about 7; any more will have a negative effect on the readability of your code.

This really has nothing to do with Java specifically. And you should definitely make it configurable, because there are quite different views on this.
In "Clean Code", Robert Martin argues that the ideal number of method parameters is 0, 1 is OK, 2 needs strong justification, and 3 or more requires special dispensation from the pope.
Most people will consider this way too strict and wouldn't blink twice at a method with 3 parameters. You can probably get broad agreement that 6 parameters is too many.
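Since opinions differ this much, a configurable threshold could be sketched like this. It uses reflection on compiled classes, which is an assumption: your tool may well work on source code instead. The names ParameterCountSmell, check and Sample are invented for the example.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class ParameterCountSmell {

    private final int maxParameters;

    /** The limit is passed in rather than hard coded, because there is no agreed number. */
    ParameterCountSmell(int maxParameters) {
        this.maxParameters = maxParameters;
    }

    /** Returns one warning per declared method that exceeds the configured limit. */
    List<String> check(Class<?> type) {
        List<String> warnings = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            int count = method.getParameterCount();
            if (count > maxParameters) {
                warnings.add(method.getName() + " has " + count
                        + " parameters (limit " + maxParameters + ")");
            }
        }
        return warnings;
    }

    /** Hypothetical class used only to demonstrate the check. */
    static class Sample {
        void ok(int a) {
        }

        void tooMany(int a, int b, int c, int d) {
        }
    }
}
```

With a limit of 3, only Sample.tooMany is reported; with the limit of 7 mentioned for "Code Complete", nothing is.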

In Java you can't define more than 255 parameters for a method. This is the hard restriction.
As for advice, Uncle Bob says in Clean Code that the maximum parameter count should be three.
Too many parameters, parameter xxxxxxx is exceeding the limit of 255 words eligible for method parameters

Checkstyle is a popular tool for checking Java coding standards.
Here is the link to the ParameterNumber rule: ParameterNumber

My honest opinion is that there is no defined limit to the number of parameters. My personal preference is not to have more than 3, or at most 4, since more can affect readability and mental mapping (it is difficult to remember more than 4 parameters). You can also have a quick look at Uncle Bob's Clean Code and Steve McConnell's Code Complete regarding this.
There is a similar thread on Stack Overflow; see "When a method has too many parameters?"

There really is not a standard number of parameters.

You can use any number of arguments in a function in Java; as far as I know, there is no standard limit. IMO, as a practice you should not have more than 4 arguments for a function, but this is not a standard.

There's no hard limit, but I'd say more than five is a code smell in a language that has no keyword arguments (such as Java).

comparing "the likes" smartly

Suppose you need to perform some kind of comparison between 2 files. You only need to do it when it makes sense; in other words, you wouldn't want to compare a JSON file with a properties file, or a .txt file with a .jar file.
Additionally, suppose that you have a mechanism in place to sort all of these things out, and what it comes down to now is the actual file name. You would want to compare "myFile.txt" with "myFile.txt", but not with "somethingElse.txt". The goal is to be as close to "apples to apples" rules as possible.
So here we are: on one side you have "myFile.txt" and on the other side you have "_myFile.txt", "_m_y_f_i_l_e.txt" and "somethingReallyClever.txt".
The task is to pick the closest name to compare later. Unfortunately, an identical name is not found.
Looking at the character composition, it is not hard to figure out what the relationship is. My algo says:
_myFile.txt to _m_y_f_i_l_e.txt 0.312
_myFile.txt to somethingReallyClever.txt 0.16
So _m_y_f_i_l_e.txt is closer to _myFile.txt than somethingReallyClever.txt is. Fantastic. But it also says that it is only 2 times closer, whereas in reality we can look at the 2 files and would never think to compare somethingReallyClever.txt with _myFile.txt.
Why?
What logic would you suggest I apply to not only figure out likelihood from having chars in the same place, but also test whether the determined weight makes sense?
In my example, somethingReallyClever.txt should have had a weight of 0.0.
I hope I am being clear.
Please share your experience and thoughts on this.
(Whatever approach you suggest should not depend on the number of characters the filename consists of.)

Possibly helpful previous question which highlights several possible algorithms:
Word comparison algorithm
These algorithms are based on how many changes would be needed to get from one string to the other - where a change is adding a character, deleting a character, or replacing a character.
Certainly any sensible metric here should have a low score as meaning close (think distance between the two strings) and larger scores as meaning not so close.

Sounds like you want the Levenshtein distance, perhaps modified by preconverting both words to the same case and normalizing spaces (e.g. replace all spaces and underscores with the empty string).
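A minimal sketch of that suggestion, assuming a hypothetical FileNameDistance class: normalize lowercases the name and strips spaces and underscores, and levenshtein is the textbook two-row dynamic-programming version. Note that a lower result means a closer match, which is the opposite direction from the weights in the question.

```java
import java.util.Locale;

final class FileNameDistance {

    private FileNameDistance() {
    }

    /** Lowercases the name and removes underscores and spaces. */
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replace("_", "").replace(" ", "");
    }

    /** Number of single-character insertions, deletions or substitutions to turn a into b. */
    static int levenshtein(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    /** Distance between two file names after normalization. */
    static int distance(String a, String b) {
        return levenshtein(normalize(a), normalize(b));
    }
}
```

After normalization, "_myFile.txt" and "_m_y_f_i_l_e.txt" both become "myfile.txt", so their distance is 0, while "somethingReallyClever.txt" stays far away.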

Java's String.replace() vs. String.replaceFirst() vs. homebrew

I have a class that is doing a lot of text processing. For each string, which is anywhere from 100 to 2000 characters long, I am performing 30 different string replacements.
Example:
String modified;
for (int i = 0; i < num_strings; i++) {
    modified = runReplacements(strs[i]);
    // do stuff
}
public String runReplacements(String str) {
    str = str.replace("foo", "bar");
    str = str.replace("baz", "beef");
    ....
    return str;
}
'foo', 'baz', and all other "targets" are only expected to appear once and are string literals (no need for an actual regex).
As you can imagine, I am concerned about performance :)
Given this,
replaceFirst() seems a bad choice because it won't use Pattern.LITERAL and will do extra processing that isn't required.
replace() seems a bad choice because it will traverse the entire string looking for multiple instances to be replaced.
Additionally, since my replacement texts are the same every time, it seems to make sense to write my own code; otherwise String.replaceFirst() or String.replace() will be doing a Pattern.compile every single time in the background. Thinking that I should write my own code, this is my thought:
Perform a Pattern.compile() only once for each literal replacement desired (no need to recompile every single time) (i.e. p1 - p30)
Then do the following for each pX: p1.matcher(str).replaceFirst(Matcher.quoteReplacement("desiredReplacement"));
This way I abandon ship on the first replacement (instead of traversing the entire string), and I am using literal vs. regex, and I am not doing a re-compile every single iteration.
So, which is the best for performance?
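For reference, the precompiled-pattern idea described above could be sketched as follows, shown for two of the thirty targets. The class name PrecompiledReplacer and the constant names are assumptions. Whether this is actually faster than String.replace() is exactly what would need measuring.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrecompiledReplacer {

    // Compiled once; Pattern.LITERAL treats the target as plain text, not as a regex.
    private static final Pattern FOO = Pattern.compile("foo", Pattern.LITERAL);
    private static final Pattern BAZ = Pattern.compile("baz", Pattern.LITERAL);

    // Quoted once so that '$' and '\' in a replacement would be taken literally.
    private static final String BAR = Matcher.quoteReplacement("bar");
    private static final String BEEF = Matcher.quoteReplacement("beef");

    private PrecompiledReplacer() {
    }

    /** Replaces only the first occurrence of each target. */
    static String runReplacements(String str) {
        str = FOO.matcher(str).replaceFirst(BAR);
        str = BAZ.matcher(str).replaceFirst(BEEF);
        return str;
    }
}
```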

So, which is the best for performance?
Measure it! ;-)
ETA: Since a two word answer sounds irretrievably snarky, I'll elaborate slightly. "Measure it and tell us..." since there may be some general rule of thumb about the performance of the various approaches you cite (good ones, all) but I'm not aware of it. And as a couple of the comments on this answer have mentioned, even so, the different approaches have a high likelihood of being swamped by the application environment. So, measure it in vivo and focus on this if it's a real issue. (And let us know how it goes...)

First, run and profile your entire application with a simple match/replace. This may show you that:
your application already runs fast enough, or
your application is spending most of its time doing something else, so optimizing the match/replace code is not worthwhile.
Assuming that you've determined that match/replace is a bottleneck, write yourself a little benchmarking application that allows you to test the performance and correctness of your candidate algorithms on representative input data. It's also a good idea to include "edge case" input data that is likely to cause problems; e.g. for the substitutions in your example, input data containing the sequence "bazoo" could be an edge case. On the performance side, make sure that you avoid the traps of Java micro-benchmarking; e.g. JVM warmup effects.
Next implement some simple alternatives and try them out. Is one of them good enough? Done!
In addition to your ideas, you could try concatenating the search terms into a single regex (e.g. "(foo|baz)" ), use Matcher.find(int) to find each occurrence, use a HashMap to lookup the replacement strings and a StringBuilder to build the output String from input string substrings and replacements. (OK, this is not entirely trivial, and it depends on Pattern/Matcher handling alternates efficiently ... which I'm not sure is the case. But that's why you should compare the candidates carefully.)
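A sketch of that single-pass idea, with some assumptions: the class name MultiReplacer is made up, it uses the no-argument Matcher.find() in a loop rather than find(int), and it replaces every occurrence rather than only the first. If one target is a prefix of another, the regex takes whichever alternative it tries first, so real input may need the targets ordered by length.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultiReplacer {

    private final Map<String, String> replacements;
    private final Pattern pattern;

    MultiReplacer(Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            throw new IllegalArgumentException("at least one replacement is required");
        }
        this.replacements = new HashMap<>(replacements);
        // Build one alternation such as \Qfoo\E|\Qbaz\E from the literal targets.
        StringBuilder alternation = new StringBuilder();
        for (String target : this.replacements.keySet()) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(target));
        }
        this.pattern = Pattern.compile(alternation.toString());
    }

    /** Scans the input once and swaps each matched target for its replacement. */
    String apply(String input) {
        Matcher matcher = pattern.matcher(input);
        StringBuilder output = new StringBuilder(input.length());
        int copiedUpTo = 0;
        while (matcher.find()) {
            output.append(input, copiedUpTo, matcher.start());
            output.append(replacements.get(matcher.group()));
            copiedUpTo = matcher.end();
        }
        output.append(input, copiedUpTo, input.length());
        return output.toString();
    }
}
```

Because the output is built in one pass, replacement text is never rescanned, which differs from chaining replace() calls where a later replacement can match text produced by an earlier one.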
In the (IMO unlikely) event that a simple alternative doesn't cut it, this wikipedia page has some leads which may help you to implement your own efficient match/replacer.

Isn't it frustrating when you ask a question and get a bunch of advice telling you to do a whole lot of work and figure it out for yourself?!
I say use replaceAll();
(I have no idea if it is, indeed, the most efficient, I just don't want you to feel like you wasted your money on this question and got nothing.)
[edit]
PS. After that, you might want to measure it.
[edit 2]
PPS. (and tell us what you found)
