This question already has answers here:
Eliminating `switch` statements [closed]
(23 answers)
Closed 5 years ago.
In the codebase I'm currently working on, it's common to take a String passed in from further up the chain and use it as a key to look up a different String. The current standard idiom is a switch statement; however, for larger switch statements (think ~20-30 cases) SonarQube flags it as a code smell and asks for the cyclomatic complexity to be reduced. My current solution is a static HashMap, like so:
private static final HashMap<String, String> sortMap;

static {
    sortMap = new HashMap<>();
    sortMap.put("foo1", "bar1");
    sortMap.put("foo2", "bar2");
    sortMap.put("foo3", "bar3");
    // etc...
}

protected String mapSortKey(String key) {
    return sortMap.get(key);
}
However, this doesn't actually seem any cleaner, and if anything seems more confusing for maintainers. Is there a better way to solve this? Or should SonarQube be ignored in this situation? I am aware of using polymorphism (i.e. Ways to eliminate switch in code), but that seems like overkill for this problem, as the switch statements are being used as makeshift data structures rather than as rudimentary polymorphism. Other similar questions I've found about reducing switch-case cyclomatic complexity aren't really applicable in this instance.
If, by your example, this is just the case of choosing a mapped value from a key, a table or properties file would be a more appropriate way to handle this.
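For example, a minimal sketch of the properties-file approach (the file name sort-keys.properties and the SortKeys holder class are made up for illustration):

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

final class SortKeys {
    private static final Properties SORT_MAP = new Properties();

    static {
        // sort-keys.properties on the classpath contains lines like: foo1=bar1
        try (InputStream in = SortKeys.class.getResourceAsStream("/sort-keys.properties")) {
            SORT_MAP.load(in);
        } catch (IOException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String mapSortKey(String key) {
        return SORT_MAP.getProperty(key);
    }
}

Maintainers can then add or change mappings by editing the file, with no recompile and no giant switch or list of put calls in the code.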
If you're talking about logic within the different switch statements, you might find that a rules engine would suit better.
You hit upon the major requirement: maintainability. If we are coding in too much logic or too much data, we have made brittle code. Choose a design pattern suited to the type of switched information and export the functionality into a maintainable place for whoever must make changes later... because with a long list like this, chances are high that changes will be occurring with some frequency.
Related
I need to modify a local variable inside a lambda expression in a JButton's ActionListener and since I'm not able to modify it directly, I came across the AtomicInteger type.
I implemented it and it works just fine but I'm not sure if this is a good practice or if it is the correct way to solve this situation.
My code is the following:
newAnchorageButton.addActionListener(e -> {
    AtomicInteger anchored = new AtomicInteger();
    anchored.set(0);
    cbSets.forEach(cbSet ->
        cbSet.forEach(cb -> {
            if (cb.isSelected())
                anchored.incrementAndGet();
        })
    );
    // more code where I use the 'anchored' variable...
});
I'm not sure if this is the right way to solve this since I've read that AtomicInteger is used mostly for concurrency-related applications and this program is single-threaded, but at the same time I can't find another way to solve this.
I could simply use two nested for-loops to go over those arrays, but I'm trying to reduce the method's cognitive complexity as much as I can according to the SonarLint VS Code extension, and leaving those for-loops theoretically increases the method's complexity and therefore hurts its readability and maintainability.
Replacing the for-loops with lambda expressions reduces the cognitive complexity but maybe I shouldn't pay that much attention to it.
While it is safe enough in single-threaded code, it would be better to count them in a functional way, like this:
long anchored = cbSets.stream()        // get a stream of the sets
    .flatMap(List::stream)             // flatten to list of cb's
    .filter(JCheckBox::isSelected)     // only selected ones
    .count();                          // count them
Instead of mutating an accumulator, we limit the flattened stream to only the ones we're interested in and ask for the count.
More generally, though, it is always possible to sum things up or generally aggregate the values without a mutable variable. Consider:
record Country(int population) { }

int total = countries.stream()
    .mapToInt(Country::population)
    .reduce(0, Math::addExact);
Note: we never mutate any values; instead, we combine each successive value with the preceding one, producing a new value. One could use sum() but I prefer reduce(0, Math::addExact) to avoid the possibility of overflow.
leaving those for-loops theoretically increases the method's complexity and therefore hurts its readability and maintainability
This is obvious horsepuckey. x.forEach(foo -> bar) is not 'cognitively simpler' than for (var foo : x) bar; - you can map each AST node straight over from one to the other.
If a definition is being used to define complexity which concludes that one is significantly more complex than the other, then the only correct conclusion is that the definition is silly and should be fixed or abandoned.
To make it practical: Yes, introducing AtomicInteger, whilst performance wise it won't make one iota of difference, does make the code way more complicated. AtomicInteger's simple existence in the code suggests that concurrency is relevant here. It isn't, so you'd have to add a comment to explain why you're using it. Comments are evil. (They imply the code does not speak for itself, and they cannot be tested in any way). They are often the least evil, but evil they are nonetheless.
The general 'trick' for keeping lambda-based code cognitively easily followed is to embrace the pipeline:
You write some code that 'forms' a stream. This can be as simple as list.stream(), but sometimes you do some stream joining or flatmapping a collection of collections.
You have a pipeline of operations that operate on single elements in the stream and do not refer to the whole or to any neighbour.
At the end, you reduce (using collect, reduce, max - some terminator) such that the reducing method returns what you need.
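Concretely, a minimal sketch of that three-step shape (Order and LineItem are hypothetical types):

BigDecimal total = orders.stream()                   // 1. form the stream
    .flatMap(order -> order.lineItems().stream())    //    (including any flattening)
    .filter(LineItem::taxable)                       // 2. per-element operations only
    .map(LineItem::price)
    .reduce(BigDecimal.ZERO, BigDecimal::add);       // 3. terminal reduction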
The above model (and the other answer follows it precisely) tends to result in code that is as readable/complex as the 'old style' code, and rarely (but sometimes!) more readable, and significantly less complicated. Deviate from it and the result is virtually always considerably more complicated - a clear loser.
Not all for loops in Java fit the above model. If it doesn't fit, then trying to force that particular square peg into the round hole will take a lot of effort and almost always results in code that is significantly worse: either an order of magnitude slower or considerably more cognitively complicated.
It also means that it is virtually never 'worth' rewriting perfectly fine readable non-stream based code into stream based code; at best it becomes a percentage point more readable according to some personal tastes, with no significant universally agreed upon improvement.
Turn off that silly linter rule. The fact that it considers the above 'less' complex, and that it evidently determines that for (var foo : x) bar; is 'more complicated' than x.forEach(foo -> bar) is proof enough that it's hurting way more than it is helping.
I have the following to add to the two other answers:
Your code runs up against two general good practices:
Lambdas shouldn't be longer than 3-4 lines
Except in some specific cases, lambdas in stream operations should be stateless.
For #1, when a lambda is getting too long, consider extracting its code to a private method, for example.
You will probably gain in readability, and you will probably also gain a better separation of UI from business logic.
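A minimal sketch of that extraction, applied to the listener from the question (the method name countSelected and the List<List<JCheckBox>> type are assumptions for illustration):

newAnchorageButton.addActionListener(e -> {
    long anchored = countSelected(cbSets);
    // more code where the 'anchored' variable is used...
});

private long countSelected(List<List<JCheckBox>> cbSets) {
    return cbSets.stream()
        .flatMap(List::stream)
        .filter(JCheckBox::isSelected)
        .count();
}

The listener body now reads like a sentence, and the counting logic can be tested without touching the UI.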
For #2, you are probably not concerned since you are working in a single thread at the moment, but streams can be parallelized, and they may not always execute exactly as you think they do.
For that reason, it's always better to keep stream pipeline operations stateless. Otherwise you might be surprised.
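For example, a deliberately broken sketch (items is a hypothetical List<String>): the stateful version may look fine sequentially but breaks under parallelism, while the stateless version is safe either way.

// BROKEN: the lambda mutates shared, non-thread-safe state
List<String> unsafe = new ArrayList<>();
items.parallelStream()
    .map(String::toUpperCase)
    .forEach(unsafe::add);        // race condition: elements can be lost or reordered

// Stateless: let the terminal operation assemble the result
List<String> safe = items.parallelStream()
    .map(String::toUpperCase)
    .collect(Collectors.toList());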
More generally, streams are very good, very concise, but sometimes it's just better to do the same with good old loops.
Don't hesitate to come back to classic loops.
When Sonar tells you that the complexity is too high, what you should really do is refactor your code: split it into smaller methods, improve the model of your objects, etc.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I'm wondering what the best way is in Java 8 to work with all the values of an enum. Specifically when you need to get all the values and add it to somewhere, for example, supposing that we have the following enum:
public enum Letter {
    A, B, C, D;
}
I could of course do the following:
for (Letter l : Letter.values()) {
    foo(l);
}
But, I could also add the following method to the enum definition:
public static Stream<Letter> stream() {
    return Arrays.stream(Letter.values());
}
And then replace the for from above with:
Letter.stream().forEach(l -> foo(l));
Is this approach OK or does it have some fault in design or performance? Moreover, why don't enums have a stream() method?
I'd go for EnumSet. Because forEach() is also defined on Iterable, you can avoid creating the stream altogether:
EnumSet.allOf(Letter.class).forEach(x -> foo(x));
Or with a method reference:
EnumSet.allOf(Letter.class).forEach(this::foo);
Still, the oldschool for-loop feels a bit simpler:
for (Letter x : Letter.values()) {
    foo(x);
}
Three questions: three-part answer:
Is it okay from a design point of view?
Absolutely. Nothing wrong with it. If you need to do lots of iterating over your enum, the stream API is the clean way to go, and hiding the boilerplate behind a little method is fine. Although I'd consider OldCumudgeon's version even better.
Is it okay from a performance point of view?
It most likely doesn’t matter. Most of the time, enums are not that big. Therefore, whatever overhead there is for one method or the other probably doesn’t matter in 99.9% of the cases.
Of course, there are the 0.1% where it does. In that case: measure properly, with your real-world data and consumers.
If I had to bet, I'd expect the for-each loop to be faster, since it maps more directly to the memory model, but don't guess when talking performance, and don't tune before there is actual need for tuning. Write your code in a way that is correct first and easy to read second, and only then worry about performance.
Why aren’t Enums properly integrated into the Stream API?
If you compare Java’s Stream API to the equivalent in many other languages, it appears seriously limited. There are various pieces that are missing (reusable Streams and Optionals as Streams, for example). On the other hand, implementing the Stream API was certainly a huge change for the API. It was postponed multiple times for a reason. So I guess Oracle wanted to limit the changes to the most important use cases. Enums aren’t used that much anyway. Sure, every project has a couple of them, but they’re nothing compared to the number of Lists and other Collections. Even when you have an Enum, in many cases you won’t ever iterate over it. Lists and Sets, on the other hand, are probably iterated over almost every time. I assume that these were the reasons why the Enums didn’t get their own adapter to the Stream world. We’ll see whether more of this gets added in future versions. And until then you always can use Arrays.stream.
My guess is that enums are limited in size (i.e. the size is not limited by the language but by usage) and thus they don't need a native stream API. Streams are very good when you have to manipulate, transform and re-collect the elements of a stream; these are not common use cases for an Enum (usually you iterate over enum values, but rarely do you need to transform, map and collect them).
If you only need to perform an action on each element, perhaps you should expose only a forEach method:
public static void forEach(Consumer<Letter> action) {
    Arrays.stream(Letter.values()).forEach(action);
}

// example of usage:
Letter.forEach(e -> System.out.println(e));
I think the shortest code to get a Stream of enum constants is Stream.of(Letter.values()). It's not as nice as Letter.values().stream() but that's an issue with arrays, not specifically enums.
Moreover, why don't enums have a stream() method?
You are right that the nicest possible call would be Letter.stream(). Unfortunately a class cannot have two methods with the same signature, so it would not be possible to implicitly add a static method stream() to every enum (in the same way that every enum has an implicitly added static method values()) as this would break every existing enum that already has a static or instance method without parameters called stream().
Is this approach OK?
I think so. The drawback is that stream is a static method, so there is no way to avoid code duplication; it would have to be added to every enum separately.
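If the duplication bothers you, one possible workaround is a small generic helper (the EnumStreams name is made up for illustration):

import java.util.Arrays;
import java.util.stream.Stream;

final class EnumStreams {
    private EnumStreams() {}

    static <E extends Enum<E>> Stream<E> of(Class<E> type) {
        return Arrays.stream(type.getEnumConstants());
    }
}

// usage:
EnumStreams.of(Letter.class).forEach(l -> foo(l));

It's less pretty than Letter.stream(), but it only has to be written once.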
Recent events on the blogosphere have indicated that a possible performance problem with Scala is its use of closures to implement for.
What are the reasons for this design decision, as opposed to a C or Java-style "primitive for" - that is one which will be turned into a simple loop?
(I'm making a distinction between Java's for and its "foreach" construct here, as the latter involves an implicit Iterator).
More detail, following up from Peter. This bit of Scala:
object ScratchFor {
  def main(args: Array[String]): Unit = {
    for (s <- args) {
      println(s)
    }
  }
}
creates three classes: ScratchFor.class, ScratchFor$.class and ScratchFor$$anonfun$main$1.class.
ScratchFor::main just forwards to the companion object, ScratchFor$.MODULE$::main, which spins up a ScratchFor$$anonfun$main$1 (an implementation of AbstractFunction1).
It's in the apply() method of this anonymous inner impl of AbstractFunction1 that the actual code lives, which is effectively the loop body.
I don't see HotSpot being able to rewrite this into a simple loop. Happy to be proved wrong on this, though.
Traditional for loops are clumsy, verbose and error-prone. I think it is proof enough of this that "for-each" loops were added to Java, C# and C++, but if you want more details you may check item 46 of Effective Java.
Now, for-each loops are still much faster than Scala for-comprehension, but they are also much less powerful (and more clumsy) because they cannot return values. If you want to transform or filter a collection (or do both to a group of collections), you'll still have to handle all the mechanical details of constructing the result collection in addition to computing the values. Not to mention it inevitably uses some mutable state.
Finally, even though for-each loops are adequate enough for collections, they are not suited to other monadic classes (of which collections are a subset).
So Scala has a general method which takes care of all of the above. Yes, it is slower, but the goal is to have the compiler effectively optimise it well enough so that this doesn't become a hindrance (and, of course, JIT could help here as well).
That has not been accomplished to this date, but -optimise has closed a lot of the gap between common for-each loops and for-comprehensions in the latest versions of Scala. If performance is essential, you can always use while or tail recursion.
Now, it would be possible for Scala to have common for loops or for-each loops as special cases specifically targeted at performance issues (since for-comprehensions can do everything they do). However, that violates two principles that guide Scala's design:
Reduce complexity. Yes, contrary to what some say, that is a design goal, and special cases that serve no other purpose other than optimise performance -- even though a workable solution exists for performance cases -- would needlessly increase the complexity of the language.
Scalability. This is in the sense that the user can scale the language for any size of problem by writing libraries. The point here is that having the compiler optimise one particular class, such as Range, would make it impossible for the user to create a replacement class that would perform just as well.
The for comprehension in Scala is a powerful general-purpose looping and pattern-matching construct. Look at what it can do:
case class Person(first: String, last: String) {}
val people = List(Person("Isaac","Newton"), Person("Michael","Jordan"))
val lastfirst = for (Person(f,l) <- people) yield l+", "+f
for (n <- lastfirst) println(n)
The second case looks pretty straightforward--take each item in a collection and print it. But the first takes apart a list containing a custom data structure and transforms it into a different collection type!
The first for there highlights only a small portion of the capability of the construct; it is both extremely powerful and extremely general. In order to maintain this power, the for must be able to turn into something very general, which means closures. Then the question is: do you also introduce special cases that operate on known collections in simple ways with improved performance? The answer thus far has been mostly no, instead preferring solutions that optimize the general closure-taking methods that for turns into.
Whether this is useful for you in particular depends on whether you are using the general capabilities a lot (in which case you will be glad) or not (in which case you may wish progress was faster).
Still, try -optimize. It often usefully speeds up simple for-comprehensions these days.
The for-comprehension is much more than a simple loop.
If you need an imperative loop, use while. If you want to write performant code in Scala, you need to know this. Just like you have to know about language implementation when you want to write fast code in every other language.
So, since the for-comprehension is not a simple loop, I hope you understand that it's not compiled down to a simple loop.
I would assume using a closure is a general solution. A more optimal solution in some cases would be to "inline" the closure as a loop and eliminate the need to create an object. Perhaps the Scala designers feel the JIT should do this, rather than having the compiler do it.
Let's say in Java this is the same as writing
public static void main(String... args) {
    // note: the varargs parameter must come last in Java, so the function comes first
    for_loop(new Function<String>() {
        public void apply(String s) {
            System.out.println(s);
        }
    }, args);
}

interface Function<T> {
    void apply(T s);
}

public static <T> void for_loop(Function<T> tFunc, T... ts) {
    for (T t : ts) tFunc.apply(t);
}
This is fairly easy to inline (if you're a human). What is surprising is that Scala doesn't have an intrinsic to perform the optimisation to eliminate the need for a new object. Certainly the JIT could do it in theory, but in practise, it might be a while before it handles this specific case.
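For comparison, the hand-inlined version is just the loop you would have written in the first place:

public static void main(String... args) {
    for (String s : args) {
        System.out.println(s);
    }
}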
I'm surprised that no one has mentioned one of the pitfalls you can get into if for does not create a closure.
In Python for example:
ls = [None] * 3
for i in [0, 1, 2]:
    ls[i] = lambda: i

print(ls[0]())
print(ls[1]())
print(ls[2]())
This prints 2 2 2, because i has a longer lifetime than the for loop. I run into this trap all the time in Python and R.
So even in the very simplest of cases, it is important for for in Scala to be implemented using an anonymous function, because it creates an environment to store variables.
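For contrast, a small Java sketch of the same pattern, which is safe there because the enhanced-for variable is a fresh binding on each iteration (and javac requires captured locals to be effectively final):

List<Supplier<Integer>> ls = new ArrayList<>();
for (int i : new int[] {0, 1, 2}) {
    ls.add(() -> i);                 // legal: 'i' is a new variable each iteration
}
System.out.println(ls.get(0).get()); // 0
System.out.println(ls.get(1).get()); // 1
System.out.println(ls.get(2).get()); // 2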
When you're designing the API for a code library, you want it to be easy to use well, and hard to use badly. Ideally you want it to be idiot proof.
You might also want to make it compatible with older systems that can't handle generics, like .Net 1.1 and Java 1.4. But you don't want it to be a pain to use from newer code.
I'm wondering about the best way to make things easily iterable in a type-safe way... Remembering that you can't use generics so Java's Iterable<T> is out, as is .Net's IEnumerable<T>.
You want people to be able to use the enhanced for loop in Java (for Item i : items), and the foreach / For Each loop in .Net, and you don't want them to have to do any casting. Basically you want your API to be now-friendly as well as backwards compatible.
The best type-safe option that I can think of is arrays. They're fully backwards compatible and they're easy to iterate in a typesafe way. But arrays aren't ideal because you can't make them immutable. So, when you have an immutable object containing an array that you want people to be able to iterate over, to maintain immutability you have to provide a defensive copy each and every time they access it.
In Java, doing (MyObject[]) myInternalArray.clone(); is super-fast. I'm sure that the equivalent in .Net is super-fast too. If you have like:
class Schedule {
    private Appointment[] internalArray;

    public Appointment[] appointments() {
        return (Appointment[]) internalArray.clone();
    }
}
people can do like:
for (Appointment a : schedule.appointments()) {
    a.doSomething();
}
and it will be simple, clear, type-safe, and fast.
But they could do something like:
for (int i = 0; i < schedule.appointments().length; i++) {
    Appointment a = schedule.appointments()[i];
}
And then it would be horribly inefficient because the entire array of appointments would get cloned twice for every iteration (once for the length test, and once to get the object at the index). Not such a problem if the array is small, but pretty horrible if the array has thousands of items in it. Yuk.
Would anyone actually do that? I'm not sure... I guess that's largely my question here.
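(For reference, the caller-side fix is trivial: hoist the defensive copy into a local variable, as in this sketch, so the array is cloned exactly once.)

Appointment[] appointments = schedule.appointments(); // one defensive copy
for (int i = 0; i < appointments.length; i++) {
    Appointment a = appointments[i];
    a.doSomething();
}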
You could call the method toAppointmentArray() instead of appointments(), and that would probably make it less likely that anyone would use it the wrong way. But it would also make it harder for people to find when they just want to iterate over the appointments.
You would, of course, document appointments() clearly, to say that it returns a defensive copy. But a lot of people won't read that particular bit of documentation.
Although I'd welcome suggestions, it seems to me that there's no perfect way to make it simple, clear, type-safe, and idiot proof. Have I failed if a minority of people are unwittingly cloning arrays thousands of times, or is that an acceptable price to pay for simple, type-safe iteration for the majority?
NB I happen to be designing this library for both Java and .Net, which is why I've tried to make this question applicable to both. And I tagged it language-agnostic because it's an issue that could arise for other languages too. The code samples are in Java, but C# would be similar (albeit with the option of making the Appointments accessor a property).
UPDATE: I did a few quick performance tests to see how much difference this made in Java. I tested:
1. cloning the array once, and iterating over it using the enhanced for loop
2. iterating over an ArrayList using the enhanced for loop
3. iterating over an unmodifiable ArrayList (from Collections.unmodifiableList) using the enhanced for loop
4. iterating over the array the bad way (cloning it repeatedly in the length check and when getting each indexed item)
The relative speeds (doing multiple repeats and taking the median) were roughly as follows, with the columns in the order of the four tests above:

Objects    (1) clone once    (2) ArrayList    (3) unmodifiable    (4) repeated clone
10         1,000             1,300            1,300               5,000
100        1,300             4,900            6,300               85,500
1000       6,400             51,700           56,200              7,000,300
10000      68,000            445,000          651,000             655,180,000
Rough figures for sure, but enough to convince me of two things:
Cloning, then iterating is definitely not a performance issue. In fact, it's consistently faster than using a List. (This is why Java's enum.values() method returns a defensive copy of an array instead of an immutable list.)

If you repeatedly call the method, repeatedly cloning the array unnecessarily, performance becomes more and more of an issue the larger the arrays in question. It's pretty horrible. No surprises there.
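For the curious, a minimal sketch of the kind of quick-and-dirty harness that could produce relative numbers like these (the method names are invented; a serious benchmark would use JMH instead):

static long timeCloneOnce(Schedule schedule, int reps) {
    long sink = 0;                                      // defeat dead-code elimination
    long start = System.nanoTime();
    for (int r = 0; r < reps; r++) {
        for (Appointment a : schedule.appointments()) { // one clone per pass
            sink += a.hashCode();
        }
    }
    long elapsed = System.nanoTime() - start;
    System.out.println(sink);
    return elapsed;
}

static long timeRepeatedClone(Schedule schedule, int reps) {
    long sink = 0;
    long start = System.nanoTime();
    for (int r = 0; r < reps; r++) {
        for (int i = 0; i < schedule.appointments().length; i++) { // clone per length check
            sink += schedule.appointments()[i].hashCode();          // and per element access
        }
    }
    long elapsed = System.nanoTime() - start;
    System.out.println(sink);
    return elapsed;
}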
clone() is fast, but not what I would describe as super fast.
If you don't trust people to write loops efficiently, I would not let them write a loop (which also avoids the need for a clone())
interface AppointmentHandler {
    public void onAppointment(Appointment appointment);
}

class Schedule {
    private Appointment[] internalArray;

    public void forEachAppointment(AppointmentHandler ah) {
        for (Appointment a : internalArray)
            ah.onAppointment(a);
    }
}
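Callers then can't clone repeatedly even if they try; pre-generics usage would look something like this (anonymous class syntax works back to old Java versions):

schedule.forEachAppointment(new AppointmentHandler() {
    public void onAppointment(Appointment a) {
        a.doSomething();
    }
});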
Since you can't really have it both ways, I would suggest that you create a pre-generics and a generics version of your API. Ideally, the underlying implementation can be mostly the same, but the fact is, if you want it to be easy to use for anyone using Java 1.5 or later, they will expect the usage of generics and Iterable and all the newer language features.
I think the usage of arrays should be non-existent. It does not make for an easy to use API in either case.
NOTE: I have never used C#, but I would expect the same holds true.
As far as failing a minority of the users, those that would call the same method to get the same object on each iteration of the loop would be asking for inefficiency regardless of API design. I think as long as that's well documented, it's not too much to ask that the users obey some semblance of common sense.
Currently I am working on a bit of code which (I believe) requires quite a few embedded if statements. Is there some standard for how many if statements to embed? Most of my googling has turned up things dealing with Excel... don't know why.
If there is a standard, why? Is it for readability or is it to keep code running more smoothly? In my mind, it makes sense that it would be mainly for readability.
An example of my if-structure:
if (!all_fields_are_empty):
    if (id_search() && validId()):
        // do stuff
    else if (name_search):
        if (name_exists):
            if (match < 1):
                // do stuff
            else:
                // do stuff
    else if (name_search_type_2):
        if (exists):
            if (match < 1):
                // do stuff
            else:
                // do stuff
else:
    // you're stupid
I have heard that there's a limit to 2-3 nested for/while loops, but is there some standard for if-statements?
Update:
I have some years under my belt now. Please don't use this many if statements. If you need this many, your design is probably bad. Today, I LOVE when I can find an elegant way to do these things with minimal if statements or switch cases. The code ends up cleaner, easier to test, and easier to maintain. Normally.
As Randy mentioned, the cause of this kind of code is in most cases a poor design of an application. In cases like yours, I usually try to use "processor" classes.
For example, given that there is some generic parameter named "operation" and 30 different operations with different parameters, you could make an interface:
interface OperationProcessor {
    boolean validate(Map<String, Object> parameters);
    boolean process(Map<String, Object> parameters);
}
Then implement lots of processors for each operation you need, for example:
class PrinterProcessor implements OperationProcessor {
    public boolean validate(Map<String, Object> parameters) {
        return (parameters.get("outputString") != null);
    }

    public boolean process(Map<String, Object> parameters) {
        System.out.println(parameters.get("outputString"));
        return true; // report success
    }
}
Next step - you register all your processors in some array when application is initialized:
public void init() {
    this.processors = new HashMap<String, OperationProcessor>();
    this.processors.put("print", new PrinterProcessor());
    this.processors.put("name_search", new NameSearchProcessor());
    // ...
}
So your main method becomes something like this:
String operation = (String) parameters.get("operation"); // for example it could be 'name_search'
OperationProcessor processor = this.processors.get(operation);
if (processor != null && processor.validate(parameters)) { // the operation is registered and its parameters validated
    processor.process(parameters);
} else {
    System.out.println("You are dumb");
}
Sure, this is just an example, and your project would require a slightly different approach, but I guess it could be similar to what I described.
I don't think there is a limit, but I wouldn't recommend embedding more than two - it's too hard to read, difficult to debug and hard to unit test. Consider taking a look at a couple of great books like Refactoring, Design Patterns, and maybe Clean Code.
Technically, I am not aware of any limitation to nesting.
It might be an indicator of poor design if you find yourself going very deep.
Some of what you posted looks like it may be better served as a case statement.
I would be concerned with readability, and code maintenance for the next person which really means it will be difficult - even for the first person (you) - to get it all right in the first place.
edit:
You may also consider having a class that is something like SearchableObject(). You could make a base class of this with common functionality, then inherit for ID, Name, etc, and this top level control block would be drastically simplified.
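A rough sketch of that idea (all class and method names here are invented for illustration):

abstract class SearchableObject {
    abstract boolean matches(String query); // each subclass supplies its own test
    abstract void doStuff(String query);    // and its own handling
}

class IdSearch extends SearchableObject {
    boolean matches(String query) { return validId(query); }
    void doStuff(String query) { /* id-specific stuff */ }
    private boolean validId(String query) { return true; /* placeholder */ }
}

// the top-level control block then shrinks to:
for (SearchableObject search : searches) {
    if (search.matches(query)) {
        search.doStuff(query);
        break;
    }
}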
Technically you can have as many as you like but if you have a lot it can quickly make the code unreadable.
What I'd normally do is something like:
if (all_fields_are_empty) {
    abuseUser();
    return;
}

if (id_search() && validId()) {
    // do stuff
    return;
}

if (name_search) {
    if (name_exists) {
        // do stuff
    } else {
        // do stuff
    }
    return;
}
I'm sure you get the picture
TL;DR: You don't really want any more than 10-15 paths through any one method.
What you're essentially referring to here is cyclomatic complexity.
Cyclomatic complexity is a software metric (measurement), used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program's source code. It was developed by Thomas J. McCabe, Sr. in 1976.
So every if statement is potentially a new path through your code and increases its cyclomatic complexity. There are tools that will measure this for you and highlight areas of high complexity for potential refactoring.
Is there some standard to how many if statements to embed?
Yes and no. It's generally regarded (and McCabe himself argued) that a Cyclomatic complexity of over about 10 or 15 is too high and a sign that the code should be refactored.
One of McCabe's original applications was to limit the complexity of routines during program development; he recommended that programmers should count the complexity of the modules they are developing, and split them into smaller modules whenever the cyclomatic complexity of the module exceeded 10.[2] This practice was adopted by the NIST Structured Testing methodology, with an observation that since McCabe's original publication, the figure of 10 had received substantial corroborating evidence, but that in some circumstances it may be appropriate to relax the restriction and permit modules with a complexity as high as 15. As the methodology acknowledged that there were occasional reasons for going beyond the agreed-upon limit, it phrased its recommendation as: "For each module, either limit cyclomatic complexity to [the agreed-upon limit] or provide a written explanation of why the limit was exceeded."[7]
This isn't really a hard rule though and can be disregarded in some circumstances. See this question What is the highest Cyclomatic Complexity of any function you maintain? And how would you go about refactoring it?.
why? Is it for readability or is it to keep code running more smoothly?
Essentially this is for readability, which should make your code run smoothly. To quote Martin Fowler:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
The only technical limit to the number of nested if/else blocks in Java will probably be the size of your stack. Style is another matter.
Btw: What's with the colons?