Is it an accepted practice in the Java programming language to end a brace for a code block with a comment that briefly explains what code block the brace closes off? I personally would think that they are useless comments that clutter the readability of the code, but perhaps I could be wrong. For example:
public void myMethod(int foo) {
// some code
if (foo == 2) {
for (int i = 0; i < myMax; i++) {
while (true) {
// some more code
} // end while
} // end for
} // end if
} // end myMethod(int)
Is the practice of commenting code blocks in a similar manner an accepted practice?
My take on it is that, as a rule, it is NOT a good practice. As always with rules, there may be exceptions, but they are very rare.
It is not a good practice because:
Modern editors highlight the opening bracket when you place the cursor on the closing one, and vice versa.
Most important: if you cannot see the beginning of the clause, it means that the method is huge (more than half a page), which is itself a bad practice.
It adds noise to the code that will confuse readers who are used to a more conventional Java coding style.
Incorporating LordScree-Joachim Sauer comment: These comments will be a pain in the neck to maintain, so most likely they will not be maintained and the information will end up out of sync with reality.
This is not exactly a bad practice, BUT it is a deadly side effect of poor Object-Oriented coding practice!
Also, this violates style guidelines and the tenets of "self-documenting code". You should never have that many brackets or a method long enough to confuse the reader about bracket placement; instead, encapsulate that functionality in another method that is well documented.
Brackets imply either looping or complex if-else logic chains. Good programming practice is to have a method do exactly one thing and do it well, then build your program from these smaller, atomized methods. I would read Butler Lampson's seminal piece "Hints for Computer System Design". It goes into some of the details on how to design good software.
So essentially, don't comment like this because:
It violates style guidelines
It shows poor Object-Oriented programming - atomize your functionality!
It is a deadly side effect of coding practice which goes against the underlying concepts of why Java was created - encapsulation, specifically information hiding
It violates the concept of self-documenting code
Other programmers will make fun of you.
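The advice above to encapsulate nested logic in small methods can be sketched as follows. This is only an illustration: the class and method names (BatchTotals, totalOfValidBatches, isValid, sum) are made up for the example and do not come from the question.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example: all names are illustrative only.
class BatchTotals {

    // The top-level method reads like a summary, so no "end for" / "end if"
    // comments are needed to keep track of the braces.
    static int totalOfValidBatches(List<int[]> batches, int limit) {
        int total = 0;
        for (int[] batch : batches) {
            if (isValid(batch, limit)) {
                total += sum(batch);
            }
        }
        return total;
    }

    // A batch is valid when it is non-empty and every value is within the limit.
    static boolean isValid(int[] batch, int limit) {
        if (batch.length == 0) {
            return false;
        }
        for (int value : batch) {
            if (value > limit) {
                return false;
            }
        }
        return true;
    }

    static int sum(int[] batch) {
        int sum = 0;
        for (int value : batch) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<int[]> batches =
                Arrays.asList(new int[] {1, 2, 3}, new int[] {50}, new int[] {});
        // Only {1, 2, 3} is valid for limit 10, so this prints 6.
        System.out.println(totalOfValidBatches(batches, 10));
    }
}
```

Each method has at most two levels of nesting, so the matching brace is always visible on screen.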
I only do this when there is a place in the code that has a lot of closing braces one after another, but not for every brace. Mainly I seem to use it for loops, so that you can easily see what is a repeated code block.
Something that looks like this:
// ...
file.Close();
}
}
}
}
}
}
Then it helps to add some comments:
// ...
file.Close();
}
}
}
}//for: each file
}
}
It depends on how complicated the method or function is. If you have something that can be easily read and understood, then there really isn't any point in ending EVERY block with a comment that explains that the part has ended. That's what indentation and line breaks are for. However, if you have something that is truly complex and affects a major part of your code, then you should denote where that code ends and what that section does.
If there are nested blocks of the same kind, it may confuse more than it helps. Let's say you have 4 or 5 nested if statements (keep in mind that this is only an example to demonstrate the situation, regardless of "oh you should separate those" or "make a method" suggestions); in that case you will have 4 or 5 different //end if comments in sequence. After a while, you lose track of which "end if" belongs to which statement, and you unconsciously spend extra effort seeing through to the actual code, because real code is not always as clean and short as your example.
IMO, it does not help much. The code inside a loop or an if statement should not be too big.
It would have helped if there were 500 lines inside the loop.
When I see code from others, I mainly see two types of method styling.
One looks like this, having many nested ifs:
void doSomething(Thing thing) {
if (thing.hasOwner()) {
Entity owner = thing.getOwner();
if (owner instanceof Human) {
Human humanOwner = (Human) owner;
if (humanOwner.getAge() > 20) {
//...
}
}
}
}
And the other style, looks like this:
void doSomething(Thing thing) {
if (!thing.hasOwner()) {
return;
}
Entity owner = thing.getOwner();
if (!(owner instanceof Human)) {
return;
}
Human humanOwner = (Human) owner;
if (humanOwner.getAge() <= 20) {
return;
}
//...
}
My question is: are there names for these two code styles? And if so, what are they called?
The early-returns in the second example are known as guard clauses.
Prior to the actual thing the method is going to do, some preconditions are checked, and if they fail, the method immediately returns. It is a kind of fail-fast mechanism.
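A small sketch of guard clauses, loosely modelled on the question's example. The types Item and Owner are hypothetical stand-ins invented for this illustration, and the returned messages are arbitrary.

```java
// Hypothetical example: Item and Owner are made-up types for illustration.
class GuardClauseDemo {

    // Each guard clause rejects one failed precondition and returns at once,
    // so the main logic at the bottom runs without extra nesting.
    static String describe(Item item) {
        if (item == null) {
            return "no item";
        }
        if (item.owner == null) {
            return "no owner";
        }
        if (item.owner.age <= 20) {
            return "owner too young";
        }
        // All preconditions hold from here on.
        return "owner is " + item.owner.age;
    }

    public static void main(String[] args) {
        System.out.println(describe(new Item(null)));          // prints "no owner"
        System.out.println(describe(new Item(new Owner(35)))); // prints "owner is 35"
    }
}

class Owner {
    final int age;

    Owner(int age) {
        this.age = age;
    }
}

class Item {
    final Owner owner; // null when the item has no owner

    Item(Owner owner) {
        this.owner = owner;
    }
}
```

Because every guard returns its own value, each failed check can also report its own reason to the caller.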
There's a lot of debate around those return statements. Some think that it's bad to have multiple return statements within a method. Others think that it avoids wrapping your code in a bunch of if statements, like in the first example.
My own humble opinion is in line with this post: minimize the number of returns, but use them if they enhance readability.
Related:
Should a function have only one return statement?
Better Java syntax: return early or late?
Guard clauses may be all you need
I don't know if there is a recognized name for the two styles, but in structured programming terms, they can be described as "single exit" versus "multiple exit" control structures. (This also includes continue and break statements in loop constructs.)
The classical structured programming paradigm advocated single exit over multiple exit, but most programmers these days are happy with either style, depending on the context. Even classically, relaxation of the "single exit" rule was acceptable when the resulting code was more readable.
(One needs to remember that structured programming was viewed as the antidote to "spaghetti" programming, particularly in assembly language, where the sole control constructs were conditional and unconditional branches.)
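To make the "single exit" versus "multiple exit" distinction concrete, here is the same logic written both ways. The grading thresholds and the names ExitStyles, gradeSingleExit and gradeMultipleExit are invented for the sketch.

```java
// Hypothetical example: names and thresholds are illustrative only.
class ExitStyles {

    // Single exit: the result is carried in a local variable and
    // the method has exactly one return, at the end.
    static String gradeSingleExit(int score) {
        String grade;
        if (score >= 90) {
            grade = "A";
        } else if (score >= 75) {
            grade = "B";
        } else {
            grade = "C";
        }
        return grade;
    }

    // Multiple exit: each branch returns as soon as the answer is known.
    static String gradeMultipleExit(int score) {
        if (score >= 90) {
            return "A";
        }
        if (score >= 75) {
            return "B";
        }
        return "C";
    }

    public static void main(String[] args) {
        System.out.println(gradeSingleExit(80));   // prints "B"
        System.out.println(gradeMultipleExit(80)); // prints "B"
    }
}
```

Both versions behave identically; which one reads better is the judgment call discussed in the answers here.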
I would say it's about readability. The 2nd style, which I prefer, gives you the opportunity to send, for example, a message to the user/program for each check that should stop the program.
One could call it "multiple returns" and "single return". But I wouldn't call it a style, you may want to use both approaches, depending on readability in any particular case.
Single return is considered a better practice in general, since it allows you to write more readable code with the least surprise for the reader. In a complex method, it may be quite complicated to understand at which point the program will exit for any particular arguments, and what side effects may occur.
But if in any particular case you feel multiple returns improve readability of your code, there's nothing wrong with using them.
I've seen a lot of methods in these two styles:
1.
void foo() {
if(!good) {
return;
}
doFoo();
}
2.
void foo() {
if(good) {
doFoo();
}
}
I wonder if this is just a matter of taste or one is actually better than the other. Does anyone know any evidence about this? (Style-checker rules, book, etc.)
I'm thinking mainly about this code written in Java.
Depends on your project team's code style rules.
If the if statement checks a method's contract, I prefer the fail-fast approach.
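A minimal sketch of what a fail-fast contract check might look like. The method repeat and its contract are invented for the example; Objects.requireNonNull and IllegalArgumentException are standard JDK.

```java
import java.util.Objects;

// Hypothetical example: the method and its contract are illustrative only.
class FailFastDemo {

    // Contract: name is non-null and non-blank, copies is positive.
    // Violations are rejected up front, before any work is done.
    static String repeat(String name, int copies) {
        Objects.requireNonNull(name, "name must not be null");
        if (name.trim().isEmpty()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        if (copies <= 0) {
            throw new IllegalArgumentException("copies must be positive, got " + copies);
        }

        StringBuilder result = new StringBuilder();
        for (int i = 0; i < copies; i++) {
            result.append(name);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeat("ab", 3)); // prints "ababab"
    }
}
```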
I believe it all depends on your program logic and requirements. In your first example, doFoo() will be called only if the if condition is not satisfied, and in your second example, doFoo() will be called when the if condition is satisfied.
You can read the Code Conventions for the Java Programming Language; they are pretty good: http://www.oracle.com/technetwork/java/codeconvtoc-136057.html
Between these two, the second option is better IMHO. If there are more if-else branches, you could end up with multiple return statements, which makes the code hard to navigate (in most cases; sometimes it does your code good).
The first variant avoids extra nesting. It depends on your project's code style conventions and your method logic.
I advise you to read Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin.
Currently I am working on a bit of code which (I believe) requires quite a few embedded if statements. Is there some standard for how many if statements to embed? Most of my googling has turned up things dealing with Excel; I don't know why.
If there is a standard, why? Is it for readability or is it to keep code running more smoothly? In my mind, it makes sense that it would be mainly for readability.
An example of my if-structure:
if (!all_fields_are_empty):
if (id_search() && validId()):
// do stuff
else if (name_search):
if (name_exists):
if (match < 1):
// do stuff
else:
// do stuff
else if (name_search_type_2):
if (exists):
if (match < 1):
// do stuff
else:
// do stuff
else:
// you're stupid
I have heard that there's a limit to 2-3 nested for/while loops, but is there some standard for if-statements?
Update:
I have some years under my belt now. Please don't use this many if statements. If you need this many, your design is probably bad. Today, I LOVE when I can find an elegant way to do these things with minimal if statements or switch cases. The code ends up cleaner, easier to test, and easier to maintain. Normally.
As Randy mentioned, the cause of this kind of code is in most cases a poor application design. In a case like yours, I usually try to use "processor" classes.
For example, given that there is some generic parameter named "operation" and 30 different operations with different parameters, you could make an interface:
interface OperationProcessor {
boolean validate(Map<String, Object> parameters);
boolean process(Map<String, Object> parameters);
}
Then implement lots of processors for each operation you need, for example:
class PrinterProcessor implements OperationProcessor {
    // Interface methods are implicitly public, so the implementations must be public too.
    public boolean validate(Map<String, Object> parameters) {
        return (parameters.get("outputString") != null);
    }
    public boolean process(Map<String, Object> parameters) {
        System.out.println(parameters.get("outputString"));
        return true; // report success to the caller
    }
}
Next step - you register all your processors in a map when the application is initialized:
public void init() {
this.processors = new HashMap<String, OperationProcessor>();
this.processors.put("print",new PrinterProcessor());
this.processors.put("name_search", new NameSearchProcessor());
....
}
So your main method becomes something like this:
String operation = (String) parameters.get("operation"); // For example, it could be "name_search"
OperationProcessor processor = this.processors.get(operation);
if (processor != null && processor.validate(parameters)) { // Such an operation is registered, and it validated all parameters as appropriate
    processor.process(parameters);
} else {
    System.out.println("You are dumb");
}
Sure, this is just an example, and your project would require a somewhat different approach, but I guess it could be similar to what I described.
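Pulling the pieces of this answer together, a self-contained version might look like the sketch below. The wrapper class ProcessorDemo and its dispatch method are assumptions added to make it runnable; only the "print" processor from the answer is registered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper class; only OperationProcessor and PrinterProcessor
// follow the answer, the rest is scaffolding for the sketch.
class ProcessorDemo {

    interface OperationProcessor {
        boolean validate(Map<String, Object> parameters);
        boolean process(Map<String, Object> parameters);
    }

    static class PrinterProcessor implements OperationProcessor {
        @Override
        public boolean validate(Map<String, Object> parameters) {
            return parameters.get("outputString") != null;
        }

        @Override
        public boolean process(Map<String, Object> parameters) {
            System.out.println(parameters.get("outputString"));
            return true;
        }
    }

    private final Map<String, OperationProcessor> processors = new HashMap<>();

    ProcessorDemo() {
        processors.put("print", new PrinterProcessor());
    }

    // Looks up the processor by operation name. Returns false for unknown
    // operations and for parameters that fail validation.
    boolean dispatch(Map<String, Object> parameters) {
        OperationProcessor processor = processors.get(parameters.get("operation"));
        if (processor == null || !processor.validate(parameters)) {
            return false;
        }
        return processor.process(parameters);
    }

    public static void main(String[] args) {
        Map<String, Object> request = new HashMap<>();
        request.put("operation", "print");
        request.put("outputString", "hello");
        new ProcessorDemo().dispatch(request); // prints "hello"
    }
}
```

Adding a new operation then means registering one more processor, with no change to the dispatch logic and no new if branch.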
I don't think there is a limit, but I wouldn't recommend embedding more than two: it's too hard to read, difficult to debug, and hard to unit test. Consider taking a look at a couple of great books like Refactoring, Design Patterns, and maybe Clean Code.
Technically, I am not aware of any limitation to nesting.
It might be an indicator of poor design if you find yourself going very deep.
Some of what you posted looks like it may be better served as a case statement.
I would be concerned with readability, and code maintenance for the next person which really means it will be difficult - even for the first person (you) - to get it all right in the first place.
edit:
You may also consider having a class that is something like SearchableObject(). You could make a base class of this with common functionality, then inherit for ID, Name, etc, and this top level control block would be drastically simplified.
Technically you can have as many as you like but if you have a lot it can quickly make the code unreadable.
What i'd normally do is something like:
if (all_fields_are_empty) {
    abuseuser;
    return;
}
if (id_search() && validId()) {
    // do stuff
    return;
}
if (name_search) {
    if (name_exists) {
        // do stuff
        return;
    } else {
        // do stuff
        return;
    }
}
I'm sure you get the picture.
Tl;Dr: You don't really want any more than 10-15 paths through any one method.
What you're essentially referring to here is cyclomatic complexity.
Cyclomatic complexity is a software metric (measurement), used to
indicate the complexity of a program. It is a quantitative measure of
the number of linearly independent paths through a program's source
code. It was developed by Thomas J. McCabe, Sr. in 1976.
So every if statement is potentially a new path through your code and increases its cyclomatic complexity. There are tools that will measure this for you and highlight areas of high complexity for potential refactoring.
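As a rough illustration of how the count adds up, here is a small method with its decision points marked. The counting rule used (one per if, loop, and short-circuit operator, plus one for the method itself) is the one many static-analysis tools apply, but exact rules differ between tools, so treat the number as approximate. The method summarize is made up for the example.

```java
// Hypothetical example: the method exists only to show the counting.
class ComplexityDemo {

    // Decision points: for, if, &&, else-if = 4, so complexity is 4 + 1 = 5
    // under the counting rule assumed above.
    static String summarize(int[] values) {
        int evenPositives = 0;
        int negatives = 0;
        for (int v : values) {            // +1
            if (v > 0 && v % 2 == 0) {    // +1 for the if, +1 for &&
                evenPositives++;
            } else if (v < 0) {           // +1
                negatives++;
            }
        }
        return evenPositives + " even positives, " + negatives + " negatives";
    }

    public static void main(String[] args) {
        // prints "2 even positives, 2 negatives"
        System.out.println(summarize(new int[] {2, 3, -1, 4, -5}));
    }
}
```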
Is there some standard to how many if statements to embed?
Yes and no. It's generally regarded (and McCabe himself argued) that a Cyclomatic complexity of over about 10 or 15 is too high and a sign that the code should be refactored.
One of McCabe's original applications was to limit the complexity of
routines during program development; he recommended that programmers
should count the complexity of the modules they are developing, and
split them into smaller modules whenever the cyclomatic complexity of
the module exceeded 10.[2] This practice was adopted by the NIST
Structured Testing methodology, with an observation that since
McCabe's original publication, the figure of 10 had received
substantial corroborating evidence, but that in some circumstances it
may be appropriate to relax the restriction and permit modules with a
complexity as high as 15. As the methodology acknowledged that there
were occasional reasons for going beyond the agreed-upon limit, it
phrased its recommendation as: "For each module, either limit
cyclomatic complexity to [the agreed-upon limit] or provide a written
explanation of why the limit was exceeded."[7]
This isn't really a hard rule though and can be disregarded in some circumstances. See this question What is the highest Cyclomatic Complexity of any function you maintain? And how would you go about refactoring it?.
why? Is it for readability or is it to keep code running more
smoothly?
Essentially this is for readability, which should make your code run smoothly. To quote Martin Fowler:
Any fool can write code that a computer can understand. Good
programmers write code that humans can understand.
The only technical limit to the number of nested if/else blocks in Java will probably be the size of your stack. Style is another matter.
Btw: What's with the colons?
I just learned today that the following Java code is perfectly legal:
myBlock: {
/* ... code ... */
if (doneExecutingThisBlock())
break myBlock;
/* ... more code ... */
}
Note that myBlock isn't a loop - it's just a block of code I've delimited with curly braces.
This seems like a rather strange feature to have. It means that you can use a named break to break out of an if statement or anonymous block, though you can't normally use a break statement in these contexts.
My question is this: is there a good reason for this design decision? That is, why make it so that you can only break out of certain enclosing statements using labeled breaks but not regular breaks? And why allow for this behavior at all? Given how (comparatively) well-designed Java is as a language I would assume there's a reason for this, but I honestly can't think of one.
It is plausible that this was done for simplicity. If the labeled break could originally only break out of loop statements, it would have been immediately clear to the language designers that the restriction isn't necessary: the semantics work the same for all statements. For the economy of the language spec, for simpler compiler implementations, or just out of a habit of generality, labeled break is defined for any statement, not just loop statements.
Now we can look back and judge this choice. Does it benefit programmers, by giving them extra expression power? Seems very little, the feature is rarely used. Does it cost programmers in learning and understanding? Seems so, as evidenced by this discussion.
If you could go back in time and change it, would you? I can't say I would. We have a fetish for generality.
If in a parallel universe it was limited to loop statements only, there is still a chance, probably much smaller, that someone posts the question on stackoverflow: why couldn't it work on arbitrary statements?
Think of it as a return statement that returns from the block instead of from the entire function. The same reasoning you apply to object to break being scattered anywhere can also be applied to return being allowed anywhere except at the end of a function.
The issue with goto is that it can jump forward, past code. A labeled break cannot do that (it can only go backwards). IIRC C++ has to deal with goto jumping past code (it has been over 17 years since I cared about that, though, so I am not sure I am remembering that right).
Java was designed to be used by C/C++ programmers, so many things were done to make it familiar to those developers. It is possible to do a reasonable translation from C/C++ to Java (though some things are not trivial).
It is reasonable to think that they put that into the language to give C/C++ developers a safe goto (where you can only go backwards in the code) to make it more comfortable to some programmers converting over.
I have never seen that in use, and I have rarely seen a labeled break at all in 16+ years of Java programming.
You cannot break forward:
public class Test
{
    public static void main(final String[] argv)
    {
        int val = 1;
        X:
        {
            if(argv.length == 0)
            {
                break X;
            }
            if(argv.length == 1)
            {
                break Y; // <--- forward break will not compile
            }
        }
        val = 0;
        Y:
        {
            // if forward breaks were allowed this would print out 1 not 0.
            System.out.println(val);
        }
    }
}
Why make it so that you can only break out of certain enclosing statements using labeled breaks but not regular breaks
Consider:
while (true) {
if (condition) {
break;
}
}
If the break did as you suggest, this code would perform unexpectedly. Breaks would become a lot more difficult to use.
And why allow for this behavior at all?
I don't use it, but it is a feature and allows for certain unique control-flow constructs. I'd ask you, why not allow it?
is there a good reason for this design decision?
Yes. Because it works.
In the labelled break case, the fact that you don't need to be inside a loop or switch lets you express things that are harder to express in other ways. (Admittedly, people rarely do use labelled break this way ... but that's not a fault of the language design.)
In the unlabelled break case, the behavior is to break out of the innermost enclosing loop or switch. If it was to break out of the innermost enclosing statement, then a lot of things would be much harder to express, and many would probably require a labelled block. For example:
while (...) {
/* ... */
if (something) break;
/* ... */
}
If break broke out of the innermost enclosing statement, then it wouldn't break out of the loop.
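A short runnable illustration of that point: the unlabelled break below sits inside an if, yet it terminates the enclosing for loop. The method name firstNegativeIndex is invented for the sketch.

```java
// Hypothetical example: the method name is illustrative only.
class InnermostBreakDemo {

    // Returns the index of the first negative value, or -1 if there is none.
    static int firstNegativeIndex(int[] values) {
        int index = -1;
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0) {
                index = i;
                break; // exits the for loop, not merely the enclosing if
            }
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(firstNegativeIndex(new int[] {3, -1, -2})); // prints 1
        System.out.println(firstNegativeIndex(new int[] {1, 2}));      // prints -1
    }
}
```

If break only left the if statement, the loop would keep running and the first call would return 2 (the last negative index) instead of 1.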
There is another possible reason / rationale. Remember that Java was a brand new language and a relatively early adopter of exceptions and exception handling.
Consider this:
try {
/* ... code ... */
if (doneExecutingThisBlock())
throw new OuttaHere();
/* ... more code ... */
} catch (OuttaHere e) {
/* do nothing */
}
According to the dogma, that is bad code. You shouldn't use exceptions for "normal" flow control.
(Pragmatically, it is also very inefficient due to the overheads of exception creation and handling. Exception performance was improved significantly in Java 8, I think, but that was ~20 years later.)
Now imagine that you are a language designer, and you feel that you have to provide an alternative to the "exceptions as flow control" anti-pattern. The "break to label" construct does exactly that. Compare the above with the example in the question.
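For a side-by-side comparison, the sketch below runs the same steps once with the exception-based exit and once with a labelled break, recording what was executed. The class FlowControlComparison and the trace strings are assumptions made for the illustration; OuttaHere mirrors the name used above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example comparing the two ways of leaving a block early.
class FlowControlComparison {

    // Marker exception, used here only to show the anti-pattern.
    static class OuttaHere extends Exception {
        private static final long serialVersionUID = 1L;
    }

    static List<String> withException(boolean done) {
        List<String> trace = new ArrayList<>();
        try {
            trace.add("code");
            if (done) {
                throw new OuttaHere();
            }
            trace.add("more code");
        } catch (OuttaHere e) {
            // intentionally empty: the exception is only a jump
        }
        return trace;
    }

    static List<String> withLabeledBreak(boolean done) {
        List<String> trace = new ArrayList<>();
        block: {
            trace.add("code");
            if (done) {
                break block; // same early exit, no exception object created
            }
            trace.add("more code");
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(withException(true));    // prints [code]
        System.out.println(withLabeledBreak(true)); // prints [code]
    }
}
```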
In hindsight, this is unnecessary. The above can be done in other ways; i.e. without labelled break. In practice this construct is used so rarely that many (maybe most) programmers don't even know it exists in Java.
The ability to leave a sequence of statements has been implemented in several programming languages before Java. Two examples:
Algol-68 had exit to terminate the execution of the smallest closed-clause (very loosely speaking, a begin ... end sequence).
BLISS had labelled BEGIN … END blocks, with a LEAVE statement to terminate execution.
Implementations with labels (as in Java) are more flexible in that they can exit nested blocks (or compound statements, or whatever you call them in your language of choice); without the label, you're limited to exiting a single "level" only.
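The extra flexibility of labels can be shown in Java with two nested labelled blocks: the same inner code can leave one level or both. The class NestedLabelDemo, the levels parameter and the trace strings are all invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: names and trace strings are illustrative only.
class NestedLabelDemo {

    // levels = 0: run everything; 1: leave the inner block; 2: leave both blocks.
    static List<String> run(int levels) {
        List<String> trace = new ArrayList<>();
        outer: {
            inner: {
                trace.add("inner start");
                if (levels == 2) {
                    break outer; // skips the rest of both blocks
                }
                if (levels == 1) {
                    break inner; // skips only the rest of the inner block
                }
                trace.add("inner rest");
            }
            trace.add("outer rest");
        }
        trace.add("done");
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(0)); // prints [inner start, inner rest, outer rest, done]
        System.out.println(run(1)); // prints [inner start, outer rest, done]
        System.out.println(run(2)); // prints [inner start, done]
    }
}
```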
Answering the direct question, "why" -- because it's been found to be a useful construct in other, prior, languages.
Adding to Stephen C's answer: with a plain if (something) break; you can only leave the innermost loop, not the enclosing ones. These situations do happen in numerical algorithms. In the simple example here, you cannot break out of the i-loop without the labeled for. Hope this helps.
public class JBreak {
    private int brj;
    public JBreak (String arg) {
        brj = Integer.parseInt (arg);
    }
    public void print () {
        jbreak:
        for (int i = 1 ; i < 3 ; i++) {
            for (int j = 0 ; j < 5 ; j++) {
                if ((i*j) == brj)
                    break jbreak;
                System.out.println ("i,j: " + i + "," + j);
            }
        }
    }
    public static void main (String[] args) {
        new JBreak(args[0]).print();
    }
}
It's the "structured" equivalent to a goto, useful in certain circumstances.
I quite often use such a label to create named sub-blocks in a method, to tightly limit the scope of variables or simply to label a block of code which is not appropriate to break out into a separate function. That is, I use it to label a block so that the code structure around braces is preserved. Here's an example in C for a JNI call, and I do the same in Java:
JNIEXPORT void JNICALL Java_xxx_SystemCall_jniChangePassword(JNIEnv *jep, jobject thsObj,
jlong handle, jbyteArray rndkey, jbyteArray usrprf, jbyteArray curpwd, jbyteArray newpwd, jint pwdccs, jint tmosec) {
Message rqs,rpy;
thsObj=thsObj;
SetupRequest: {
memset(&rqs,0,sizeof(rqs));
setOpcode(&rqs,"CHGPWD");
if(!setField(mFldAndLen(rqs.rnd ),null ,jep,rndkey,"Random Key")) {
return;
}
if(!setField(mFldAndLen(rqs.dta.chgpwd.user ),&rqs.dta.chgpwd.userLen ,jep,usrprf,"User Profile")) {
return;
}
if(!setField(mFldAndLen(rqs.dta.chgpwd.curPass),&rqs.dta.chgpwd.curPassLen,jep,curpwd,"Cur Password")) {
return;
}
if(!setField(mFldAndLen(rqs.dta.chgpwd.newPass),&rqs.dta.chgpwd.newPassLen,jep,newpwd,"New Password")) {
return;
}
rqs.dta.chgpwd.ccsid=pwdccs;
}
...
The break statement terminates the labeled statement; it does not transfer the flow of control to the label. Control flow is transferred to the statement immediately following the labeled (terminated) statement.
It seems to be useful to exit nested loops. See http://download.oracle.com/javase/tutorial/java/nutsandbolts/branch.html
It's semantically the same as what is discussed in "Is there an equivalent of Java's labelled break in C# or a workaround?"
Today I had a coworker suggest I refactor my code to use a label statement to control flow through 2 nested for loops I had created. I've never used them before because personally I think they decrease the readability of a program. I am willing to change my mind about using them if the argument is solid enough however. What are people's opinions on label statements?
Many algorithms are expressed more easily if you can jump across two loops (or a loop containing a switch statement). Don't feel bad about it. On the other hand, it may indicate an overly complex solution. So stand back and look at the problem.
Some people prefer a "single entry, single exit" approach to all loops. That is to say avoiding break (and continue) and early return for loops altogether. This may result in some duplicate code.
What I would strongly avoid doing is introducing auxiliary variables. Hiding control-flow within state adds to confusion.
Splitting labeled loops into two methods may well be difficult. Exceptions are probably too heavyweight. Try a single entry, single exit approach.
Labels are like gotos: use them sparingly, and only when they make your code faster and, more importantly, more understandable.
E.g., if you are in big loops six levels deep and you encounter a condition that makes the rest of the loop pointless to complete, there's no sense in having 6 extra trap doors in your condition statements to exit out of the loop early.
Labels (and goto's) aren't evil, it's just that sometimes people use them in bad ways. Most of the time we are actually trying to write our code so it is understandable for you and the next programmer who comes along. Making it uber-fast is a secondary concern (be wary of premature optimization).
When Labels (and goto's) are misused they make the code less readable, which causes grief for you and the next developer. The compiler doesn't care.
There are few occasions when you need labels and they can be confusing because they are rarely used. However if you need to use one then use one.
BTW: this compiles and runs, because http: is parsed as a label and everything after the // is a comment.
class MyFirstJavaProg {
public static void main(String args[]) {
http://www.javacoffeebreak.com/java101/java101.html
System.out.println("Hello World!");
}
}
I'm curious to hear what your alternative to labels is. I think this is pretty much going to boil down to the argument of "return as early as possible" vs. "use a variable to hold the return value, and only return at the end."
Labels are pretty standard when you have nested loops. The only way they really decrease readability is when another developer has never seen them before and doesn't understand what they mean.
I have used a Java labeled loop in an implementation of a sieve method to find prime numbers (done for one of the Project Euler math problems), which made it 10x faster compared to nested loops. E.g. if (certain condition), go back to the outer loop.
private static void testByFactoring() {
primes: for (int ctr = 0; ctr < m_toFactor.length; ctr++) {
int toTest = m_toFactor[ctr];
for (int ctr2 = 0; ctr2 < m_divisors.length; ctr2++) {
// max (int) Math.sqrt(m_numberToTest) + 1 iterations
if (toTest != m_divisors[ctr2]
&& toTest % m_divisors[ctr2] == 0) {
continue primes;
}
} // end of the divisor loop
} // end of primes loop
} // method
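A runnable variant of the same idea is sketched below. Note that it uses plain trial division rather than the sieve described above, and the names LabeledContinueDemo and primesUpTo are assumptions for the example; it is meant only to show continue with a label skipping to the next outer iteration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: trial division, not the original poster's sieve.
class LabeledContinueDemo {

    // Returns all n in 2..max that have no divisor d with d * d <= n.
    // Intended for small max; d * d could overflow for very large values.
    static List<Integer> primesUpTo(int max) {
        List<Integer> primes = new ArrayList<>();
        candidates:
        for (int n = 2; n <= max; n++) {
            for (int d = 2; d * d <= n; d++) {
                if (n % d == 0) {
                    continue candidates; // composite: go straight to the next n
                }
            }
            primes.add(n); // reached only when no divisor was found
        }
        return primes;
    }

    public static void main(String[] args) {
        System.out.println(primesUpTo(20)); // prints [2, 3, 5, 7, 11, 13, 17, 19]
    }
}
```

Without the label, the inner loop would need a flag variable that the outer loop checks before adding n.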
I asked a C++ programmer how bad labeled loops are, he said he would use them sparingly, but they can occasionally come in handy. For example, if you have 3 nested loops and for certain conditions you want to go back to the outermost loop.
So they have their uses, it depends on the problem you were trying to solve.
I've never seen labels used "in the wild" in Java code. If you really want to break across nested loops, see if you can refactor your method so that an early return statement does what you want.
Technically, I guess there's not much difference between an early return and a label. Practically, though, almost every Java developer has seen an early return and knows what it does. I'd guess many developers would at least be surprised by a label, and probably be confused.
I was taught the single entry / single exit orthodoxy in school, but I've since come to appreciate early return statements and breaking out of loops as a way to simplify code and make it clearer.
I'd argue in favour of them in some locations, I found them particularly useful in this example:
nextItem: for(CartItem item : user.getCart()) {
nextCondition : for(PurchaseCondition cond : item.getConditions()) {
if(!cond.check())
continue nextItem;
else
continue nextCondition;
}
purchasedItems.add(item);
}
I think with the new for-each loop, the label can be really clear.
For example:
sentence: for(Sentence sentence: paragraph) {
for(String word: sentence) {
// do something
if(isDone()) {
continue sentence;
}
}
}
I think that looks really clear by having your label the same as your variable in the new for-each. In fact, maybe Java should be evil and add implicit labels for-each variables heh
I never use labels in my code. I prefer to create a guard and initialize it to null or other unusual value. This guard is often a result object. I haven't seen any of my coworkers using labels, nor found any in our repository. It really depends on your style of coding. In my opinion using labels would decrease the readability as it's not a common construct and usually it's not used in Java.
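One possible reading of the guard approach described above, as a sketch: the result variable starts as null and both loop conditions test it, so the loops stop once a match is stored. The method firstLongWord and its parameters are invented for the illustration.

```java
// Hypothetical example: the method and its parameters are illustrative only.
class GuardVariableDemo {

    // Finds the first word longer than minLength, or null when there is none.
    static String firstLongWord(String[][] lines, int minLength) {
        String result = null; // the guard: stays null until a match is found
        for (int i = 0; i < lines.length && result == null; i++) {
            for (int j = 0; j < lines[i].length && result == null; j++) {
                if (lines[i][j].length() > minLength) {
                    result = lines[i][j];
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] lines = {{"a", "bb"}, {"cccc", "ddddd"}};
        System.out.println(firstLongWord(lines, 2));  // prints "cccc"
        System.out.println(firstLongWord(lines, 10)); // prints "null"
    }
}
```

The trade-off is that the exit condition now lives in two loop headers instead of one break statement.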
Yes, you should avoid using label unless there's a specific reason to use them (the example of it simplifying implementation of an algorithm is pertinent). In such a case I would advise adding sufficient comments or other documentation to explain the reasoning behind it so that someone doesn't come along later and mangle it out of some notion of "improving the code" or "getting rid of code smell" or some other potentially BS excuse.
I would equate this sort of question with deciding when one should or shouldn't use the ternary if. The chief rationale being that it can impede readability and unless the programmer is very careful to name things in a reasonable way then use of conventions such as labels might make things a lot worse. Suppose the example using 'nextCondition' and 'nextItem' had used 'loop1' and 'loop2' for his label names.
Personally labels are one of those features that don't make a lot of sense to me, outside of Assembly or BASIC and other similarly limited languages. Java has plenty of more conventional/regular loop and control constructs.
I found labels to be sometimes useful in tests, to separate the usual setup, exercise and verify phases and group related statements. For example, using the BDD terminology:
@Test
public void should_Clear_Cached_Element() throws Exception {
given: {
elementStream = defaultStream();
elementStream.readElement();
Assume.assumeNotNull(elementStream.lastRead());
}
when:
elementStream.clearLast();
then:
assertThat(elementStream.lastRead()).isEmpty();
}
Your formatting choices may vary but the core idea is that labels, in this case, provide a noticeable distinction between the logical sections comprising your test, better than comments can. I think the Spock library just builds on this very feature to declare its test phases.
Personally whenever I need to use nested loops with the innermost one having to break out of all the parent loops, I just write everything in a method with a return statement when my condition is met, it's far more readable and logical.
Example Using method:
private static boolean exists(int[][] array, int searchFor) {
for (int[] nums : array) {
for (int num : nums) {
if (num == searchFor) {
return true;
}
}
}
return false;
}
Example Using label (less readable imo):
boolean exists = false;
existenceLoop:
for (int[] nums : array) {
for (int num : nums) {
if (num == searchFor) {
exists = true;
break existenceLoop;
}
}
}
return exists;