Does using a map truly reduce cyclomatic complexity? - java

Suppose I have the original method below.
public String someMethod(String str) {
    String returnStr = null; // must be initialized, or the code does not compile
    if ("BLAH".equals(str)) {
        returnStr = "ok";
    } else if ("BLING".equals(str)) {
        returnStr = "not ok";
    } else if ("BONG".equals(str)) {
        returnStr = "ok";
    }
    return returnStr;
}
Does converting to below truly reduce CC?
private final Map<String, String> validator = new HashMap<>();
{
    validator.put("BLAH", "ok");
    validator.put("BLING", "not ok");
    validator.put("BONG", "ok");
}

public String someMethod(String str) {
    return validator.get(str);
}

Yes, in your case. In simple terms, cyclomatic complexity is the number of linearly independent ways to reach the end of a piece of code from its starting point. So any conditional operator increases the CC of your code.
(If OP's question is somehow related to the testing tag:) However, reducing CC doesn't reduce the number of unit tests that have to be written to cover your code; CC only gives you a lower bound on the test count. For good coverage, unit tests should cover all the specific cases, and the second version doesn't reduce the number of specific cases; it only makes your code more readable.

Yes, because cyclomatic complexity is defined as the number of linearly independent paths in the control flow graph, plus one. In your second example there is only one path, while the first has multiple paths through the if branches. However, cyclomatic complexity does not really seem to be the problem here. You could rewrite your method like this to make it more readable:
public String someMethod(String str) {
    switch (str) {
        case "BLAH":
        case "BONG": return "ok";
        case "BLING": return "not ok";
        default: return null;
    }
}

Short answer: yes, using a HashMap in your case does reduce cyclomatic complexity.
Detailed answer: cyclomatic complexity, per Wikipedia, is
It is a quantitative measure of the number of linearly independent paths through a program's source code.
There are various ways to tackle if-else cases. Chains of if-else statements make code less readable and harder to understand. They are also problematic because every addition, deletion, or modification of a case forces you to edit files whose other business logic has not changed, and every such edit means re-testing those files. This leads to maintenance issues, and every call site has to remember to handle all the cases. The same issues exist with switch statements, although they are a little more readable.
The approach you have used also reduces the number of distinct logical paths of execution. Another alternative approach is as follows.
You can create an interface, say IPair, that defines an abstract method public String getValue(). Then define one class per case, e.g. BlahMatch.java, Bling.java and Bong.java, each implementing IPair and returning the appropriate String from its getValue() method.
public String someMethod(IPair pair) {
    return pair.getValue();
}
The advantage of this approach is that when you have a new pair later, you can simply create a new class that implements IPair and pass in an object of that class.
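A minimal sketch of that interface-based approach, using the class names suggested above (the method bodies are assumptions, since the answer only describes them):

```java
public class PairDemo {
    // The interface the answer describes: one method returning the mapped value.
    interface IPair {
        String getValue();
    }

    // One small class per case, as suggested above.
    static class BlahMatch implements IPair {
        public String getValue() { return "ok"; }
    }

    static class Bling implements IPair {
        public String getValue() { return "not ok"; }
    }

    static class Bong implements IPair {
        public String getValue() { return "ok"; }
    }

    // The original method collapses to a single path.
    public static String someMethod(IPair pair) {
        return pair.getValue();
    }
}
```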

Related

Dynamic comparator or 2 separate comparators?

I'm working on a solution that includes 2 types of sorting based on a property. Which do you think is the better approach?
public class ABComparator implements Comparator<O> {
    private final Property property;

    public ABComparator(Request request) {
        property = request.getProperty();
    }

    @Override
    public int compare(O o1, O o2) {
        if (property.someLogic()) {
            // first type of sorting
        } else {
            // another type of sorting
        }
    }
}
Or is it better to have 2 classes, each with its own logic, and choose one in the class where the sort actually happens?
Thanks
Problems grow exponentially as you incorporate more mutations and more "choices" into your algorithms.
You can check what Sonarcloud thinks about cognitive complexity and cyclomatic complexity: https://www.sonarsource.com/resources/cognitive-complexity/
To put it simply: fewer choices and less state to manage make for more robust code and fewer bugs. Adding 2 classes is low on complexity, especially if they don't depend on anything else, and the added code volume should be low. I would personally use 2 classes with 2 implementations that share nothing between themselves if possible, so there is no if condition in the implementation of your compare function.
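As a sketch, the two-class variant could look like this (the O type and the two sort keys are made up here, since the original code elides them):

```java
import java.util.Comparator;

public class ComparatorDemo {
    // Placeholder for the O type in the question.
    record O(String name, int rank) {}

    // Each class holds exactly one sorting rule; no if inside compare().
    static class ByNameComparator implements Comparator<O> {
        @Override
        public int compare(O o1, O o2) {
            return o1.name().compareTo(o2.name());
        }
    }

    static class ByRankComparator implements Comparator<O> {
        @Override
        public int compare(O o1, O o2) {
            return Integer.compare(o1.rank(), o2.rank());
        }
    }

    // The caller picks the comparator once, based on the property.
    static Comparator<O> pick(boolean sortByName) {
        return sortByName ? new ByNameComparator() : new ByRankComparator();
    }
}
```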

Can the java compiler optimize loops to return early?

I'm working with an external library that decided to handle collections on its own. Not using it, or updating it, is outside my control. To work with elements of this third-party "collection", it only returns iterators.
A question came up during a code review about having multiple returns in the code to gain performance. We all agree (within the team) the code is more readable with a single return, but some are worried about optimizations.
I'm aware premature optimization is bad. That is a topic for another day.
I believe the JIT compiler can handle this and skip the unneeded iterations, but could not find any info to back this up. Is JIT capable of such a thing?
A code sample of the issue at hand:
public boolean contains(MyThings things, String valueToFind) {
    Iterator<Thing> thingIterator = things.iterator();
    boolean valueFound = false;
    while (thingIterator.hasNext()) {
        Thing thing = thingIterator.next();
        if (valueToFind.equals(thing.getValue())) {
            valueFound = true;
        }
    }
    return valueFound;
}
VS
public boolean contains(MyThings things, String valueToFind) {
    Iterator<Thing> thingIterator = things.iterator();
    while (thingIterator.hasNext()) {
        Thing thing = thingIterator.next();
        if (valueToFind.equals(thing.getValue())) {
            return true;
        }
    }
    return false;
}
We all agree the code is more readable with a single return.
Not really. This is just old school structured programming when functions were typically not kept small and the paradigms of keeping values immutable weren't popular yet.
Although subject to debate, there is nothing wrong with having very small methods (a handful of lines of code), which return at different points. For example, in recursive methods, you typically have at least one base case which returns immediately, and another one which returns the value returned by the recursive call.
Often you will find that creating an extra result variable just to hold the return value, and then making sure no other part of the function overwrites it when you already know you could simply return, just creates noise and makes the code less readable, not more. The reader has to carry the cognitive load of checking that the result is not modified further down. During debugging this is even more painful.
I don't think your example is premature optimisation. Returning early is a logical and integral part of your search algorithm; that is why you can break out of loops or, in your case, just return the value. I don't think the JIT could easily work out that it should break out of the loop: it doesn't know whether you want to change the variable back to false if you find something else in the collection (and I don't think it is smart enough to realise that valueFound is never changed back to false).
In my opinion, your second example is not only more readable (the valueFound variable is just extra noise) but also faster, because it returns as soon as it has done its job. The first example would be just as fast if you put a break after setting valueFound = true. If you don't, and you have a million items to check and the item you need is the first one, you will compare all the others for nothing.
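For completeness, the break variant mentioned above would look like this (keeping the single return but stopping the scan as soon as the value is found; Thing and MyThings are simplified to plain strings here):

```java
import java.util.Iterator;

public class ContainsDemo {
    public static boolean contains(Iterable<String> things, String valueToFind) {
        Iterator<String> thingIterator = things.iterator();
        boolean valueFound = false;
        while (thingIterator.hasNext()) {
            if (valueToFind.equals(thingIterator.next())) {
                valueFound = true;
                break; // stop scanning once found; keeps the single return below
            }
        }
        return valueFound;
    }
}
```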
The Java compiler cannot do an optimization like that, because in the general case it would change the logic of the program.
Specifically, adding an early return would change the number of invocations of thingIterator.hasNext(), because your first code block keeps iterating the collection to the end.
Java could potentially replace a break with an early return, but that would not have any effect on the timing of the program.

How to compare Java function object to a specific method? [duplicate]

Say I have a List of objects which were defined using lambda expressions (closures). Is there a way to inspect them so they can be compared?
The code I am most interested in is
List<Strategy> strategies = getStrategies();
Strategy a = (Strategy) this::a;
if (strategies.contains(a)) { // ...
The full code is
import java.util.Arrays;
import java.util.List;

public class ClosureEqualsMain {
    interface Strategy {
        void invoke(/*args*/);

        default boolean equals(Object o) { // doesn't compile
            return Closures.equals(this, o);
        }
    }

    public void a() { }
    public void b() { }
    public void c() { }

    public List<Strategy> getStrategies() {
        return Arrays.asList(this::a, this::b, this::c);
    }

    private void testStrategies() {
        List<Strategy> strategies = getStrategies();
        System.out.println(strategies);
        Strategy a = (Strategy) this::a;
        // prints false
        System.out.println("strategies.contains(this::a) is " + strategies.contains(a));
    }

    public static void main(String... ignored) {
        new ClosureEqualsMain().testStrategies();
    }

    enum Closures {;
        public static <Closure> boolean equals(Closure c1, Closure c2) {
            // This doesn't compare the contents
            // like other immutables, e.g. String
            return c1.equals(c2);
        }

        public static <Closure> int hashCode(Closure c) {
            return // a hashCode which can detect duplicates for a Set<Strategy>
        }

        public static <Closure> String asString(Closure c) {
            return // something better than Object.toString();
        }
    }

    public String toString() {
        return "my-ClosureEqualsMain";
    }
}
It would appear the only solution is to define each lambda as a field and only use those fields. If you want to print out the method called, you are better off using Method. Is there a better way with lambda expressions?
Also, is it possible to print a lambda and get something human-readable? If you print this::a, instead of
ClosureEqualsMain$$Lambda$1/821270929@3f99bd52
you would get something like
ClosureEqualsMain.a()
or even use this.toString and the method.
my-ClosureEqualsMain.a();
This question could be interpreted as being about either the specification or the implementation. Obviously implementations can change, but you might be willing to rewrite your code when that happens, so I'll answer both.
It also depends on what you want to do. Are you looking to optimize, or are you looking for ironclad guarantees that two instances are (or are not) the same function? (If the latter, you're going to find yourself at odds with computational physics, in that even problems as simple as asking whether two functions compute the same thing are undecidable.)
From a specification perspective, the language spec promises only that the result of evaluating (not invoking) a lambda expression is an instance of a class implementing the target functional interface. It makes no promises about the identity, or degree of aliasing, of the result. This is by design, to give implementations maximal flexibility to offer better performance (this is how lambdas can be faster than inner classes; we're not tied to the "must create unique instance" constraint that inner classes are.)
So basically, the spec doesn't give you much, except obviously that two lambdas that are reference-equal (==) are going to compute the same function.
From an implementation perspective, you can conclude a little more. There is (currently, may change) a 1:1 relationship between the synthetic classes that implement lambdas, and the capture sites in the program. So two separate bits of code that capture "x -> x + 1" may well be mapped to different classes. But if you evaluate the same lambda at the same capture site, and that lambda is non-capturing, you get the same instance, which can be compared with reference equality.
If your lambdas are serializable, they'll give up their state more easily, in exchange for sacrificing some performance and security (no free lunch.)
One area where it might be practical to tweak the definition of equality is with method references because this would enable them to be used as listeners and be properly unregistered. This is under consideration.
I think what you're trying to get at is: if two lambdas are converted to the same functional interface, are represented by the same behavior function, and have identical captured args, they're the same.
Unfortunately, this is both hard to do (for non-serializable lambdas, you can't get at all the components of that) and not enough (because two separately compiled files could convert the same lambda to the same functional interface type, and you wouldn't be able to tell.)
The EG discussed whether to expose enough information to be able to make these judgments, as well as discussing whether lambdas should implement more selective equals/hashCode or more descriptive toString. The conclusion was that we were not willing to pay anything in performance cost to make this information available to the caller (bad tradeoff, punishing 99.99% of users for something that benefits .01%).
A definitive conclusion on toString was not reached but left open to be revisited in the future. However, there were some good arguments made on both sides on this issue; this is not a slam-dunk.
To compare lambdas I usually let the interface extend Serializable and then compare the serialized bytes. Not very nice, but it works in most cases.
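A sketch of that serialization trick (hedged: it relies on equivalent method references producing byte-identical serialized forms, which holds on current JDKs but is not guaranteed by the specification):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class LambdaBytes {
    // The functional interface must extend Serializable for this to work.
    interface Strategy extends Serializable {
        void invoke();
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Equal bytes imply the same implementation method and captured state.
    static boolean sameLambda(Strategy s1, Strategy s2) {
        return Arrays.equals(serialize(s1), serialize(s2));
    }

    static void a() { }
    static void b() { }
}
```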
I don't see a way to get this information from the closure itself; closures don't expose their state.
But you can use Java reflection if you want to inspect and compare the methods.
Of course that is not a very beautiful solution, because of the performance cost and the exceptions that have to be caught. But this way you do get the meta-information.

Call Method based on user preferences, which is faster/better

We have X methods and we want to call the one corresponding to the user's settings. Which of the following runs faster?
Case 1:
int userSetting = 1;
Method method = getClass().getDeclaredMethod("Method" + userSetting);
method.invoke(this);
Case 2:
int userSetting = 1;
switch (userSetting) {
    case 0:
        Method0();
        break;
    case 1:
        Method1();
        break;
    ...
}
Case 3:
int userSetting = 1;
if(userSetting == 0){
Method0();
} else if(userSetting == 1){
Method1();
} else....
Also:
Do you think one of them, even if slower, is better practice than the others? If yes, why?
If there is another way which is better/faster, please tell us.
Thanks
Option 1 uses reflection, and thus will probably be slower, as the javadocs indicate:
Performance Overhead
Because reflection involves types that are dynamically resolved, certain Java
virtual machine optimizations can not be performed. Consequently, reflective
operations have slower performance than their non-reflective counterparts,
and should be avoided in sections of code which are called frequently in
performance-sensitive applications.
However, this option is easier to maintain than options 2 and 3.
I would suggest a completely different option: use the strategy design pattern. It is likely to be faster and much more readable than the alternatives.
As amit points out, this is a case for the Strategy design pattern. Additionally, I want to give a short example:
Pseudo-Code:
public interface Calculator {
public int calc(...);
}
public class FastCalc implements Calculator {
public int calc(...) {
// Do the fast stuff here
}
}
public class SlowCalc implements Calculator {
public int calc(...) {
// Do the slow stuff here
}
}
Your main program then decides which strategy to use, based on the user preferences:
Calculator calc = userPreference.getBoolean("fast") ? new FastCalc() : new SlowCalc();
int result = calc.calc(...);
The benefit is that later you can use the Factory pattern to create multiple strategies for various operations:
// Slow case:
Factory factory = new SlowFactory();
Calculator calc = factory.createCalculator();
Operation op = factory.createSomeOtherOperation();

// Fast case:
Factory factory = new FastFactory();
Calculator calc = factory.createCalculator();
Operation op = factory.createSomeOtherOperation();
As you can see, the code for the Slow case and the Fast case is the same except for the factory class, which you can choose based on the user preference. Especially if you have more such operations, like Calculator and my Operation example, you will want your code to depend on the user preference in only a single place, not everywhere.
I think the obvious slowest version is number one: reflection is complex and happens at runtime. For numbers 2 and 3, have a look at Java: case-statement or if-statement efficiency perspective.
Another thought: can the user's configuration change during execution? If not, make the decision only once, at start-up.
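If the setting really is fixed at start-up, the lookup can be resolved once into a field, for example like this (class and method names here are placeholders):

```java
import java.util.List;
import java.util.function.Supplier;

public class StartupChoice {
    private final Supplier<String> chosen;

    public StartupChoice(int userSetting) {
        // Resolve the user setting to a concrete method exactly once, at start-up.
        List<Supplier<String>> methods =
                List.of(StartupChoice::method0, StartupChoice::method1);
        chosen = methods.get(userSetting);
    }

    // Every later call is a plain invocation: no switch, no reflection.
    public String call() {
        return chosen.get();
    }

    static String method0() { return "result 0"; }
    static String method1() { return "result 1"; }
}
```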
Case 1 uses reflection and suffers a performance hit compared to approaches 2 and 3.
Between approaches 2 and 3 the performance difference would be marginal at most. You must ask yourself whether any possible performance gain really justifies hurting code readability. Unless you are on a truly limited microchip or similar, I would always answer no.
Performance aside, as @Hovercraft Full Of Eels already pointed out, you are probably better off redesigning your program to avoid the series of conditional clauses entirely.
As all others have said, #1 will most likely be the slowest.
The differences between 2 and 3 are negligible, but generally #2 shouldn't be slower than #3, because the compiler can change a switch into a cascaded if when it thinks that would be faster. Also, since the switch is clearly more readable than the if/else cascade, I'd go with the second anyhow.
Although I'm quite sure this isn't the bottleneck anyway, even with reflection.

Refactoring advice and tools

I have some code that consists of a lot (several hundred LOC) of ugly conditionals, i.e.
SomeClass someClass = null;
if ("foo".equals(fooBar)) {
    // do something, possibly involving more if-else statements,
    // and possibly modify the someClass variable among others...
} else if ("bar".equals(fooBar)) {
    // same as above but with some slight variations
} else if ("baz".equals(fooBar)) {
    // and yet again as above
// ... lots of more else-ifs
} else {
    // and if nothing matches it is probably an error...
    // so there is some error handling here
}
// Some code that acts on someClass
GenerateOutput(someClass);
Now I had the idea of refactoring this kind of code something along the lines of:
abstract class CheckPerform<S, T, Q> {
    private CheckPerform<S, T, Q> next;

    CheckPerform(CheckPerform<S, T, Q> next) {
        this.next = next;
    }

    protected abstract T perform(S arg);

    protected abstract boolean check(Q toCheck);

    public T checkPerform(S arg, Q toCheck) {
        if (check(toCheck)) {
            return perform(arg);
        }
        // Check if this CheckPerform is the last in the chain...
        return next == null ? null : next.checkPerform(arg, toCheck);
    }
}
And for each if statement, generate a subclass of CheckPerform, e.g.
class CheckPerformFoo extends CheckPerform<SomeInput, SomeClass, String> {
    CheckPerformFoo(CheckPerform<SomeInput, SomeClass, String> next) {
        super(next);
    }

    protected boolean check(String toCheck) {
        // same check as in the "foo" if-statement above
        return "foo".equals(toCheck);
    }

    protected SomeClass perform(SomeInput arg) {
        // Perform the same actions (as in the "foo" if-statement)
        // and return a SomeClass instance (in the
        // same state as in the "foo" if-statement)
    }
}
I could then inject the different CheckPerforms into each other so that the checks are made in the same order and the corresponding actions taken. In the original class I would only need to inject one CheckPerform object. Is this a valid approach to this type of problem? The number of classes in my project is likely to explode, but at least I will get more modular and testable code. Should I do this some other way?
Since these if-else-if-...-else chains are a recurring theme in the code base, I would like to do this refactoring as automatically as possible. So what tools could I use to automate it?
a) Some customizable refactoring feature hidden somewhere in an IDE that I have missed (either in Eclipse or IDEA preferably)
b) Some external tool that can parse Java code and give me fine grained control of transformations
c) Should I hack it myself using Scala?
d) Should I manually go over each class and do the refactoring using the features I am familiar with in my IDE?
Ideally the output of the refactoring should also include some basic test code template that I can run (preferably also test cases for the original code that can be run on both new and old as a kind of regression test... but that I leave for later).
Thanks for any input and suggestions!
What you have described is the Chain of Responsibility pattern, and it sounds like it could be a good choice for your refactor. There are some possible downsides:
Readability: because you are going to inject the order of the CheckPerformers using Spring or some such, it is difficult to see at first glance what the code will actually do.
Maintenance: if someone after you wants to add a new condition, as well as adding a whole new class they also have to edit some Spring config. Choosing the correct place to add their new CheckPerformer could be difficult and error-prone.
Many classes: depending on how many conditions you have, and how much code is repeated within those conditions, you could end up with a lot of new classes. Even though the long list of if-elses is ugly, it keeps the logic in one place, which again aids readability.
To answer the more general part of your question: I don't know of any tools for automatic refactoring beyond basic IDE support, but if you want to know what to look for when refactoring, have a look at the Refactoring catalog. The specifics of your question are covered by Replace Conditional with Polymorphism and Replace Conditional with Visitor.
To me the easiest approach would involve a Map<String, Action>, i.e. mapping various strings to specific actions to perform. This way the lookup would be simpler and more performant than the manual comparison in your CheckPerform* classes, getting rid of much duplicated code.
The actions can be implemented similar to your design, as subclasses of a common interface, but it may be easier and more compact to use an enum with overridden method(s). You may see an example of this in an earlier answer of mine.
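A sketch of that Map<String, Action> dispatch (the Action type is represented by a plain Function here, and the handler bodies are placeholders for the real branch bodies):

```java
import java.util.Map;
import java.util.function.Function;

public class DispatchDemo {
    // Each value replaces one if-branch; the key replaces the string comparison.
    private static final Map<String, Function<String, String>> ACTIONS = Map.of(
            "foo", input -> "handled foo: " + input,
            "bar", input -> "handled bar: " + input,
            "baz", input -> "handled baz: " + input);

    public static String dispatch(String fooBar, String input) {
        Function<String, String> action = ACTIONS.get(fooBar);
        if (action == null) {
            // The final else branch: nothing matched, treat it as an error.
            throw new IllegalArgumentException("Unknown key: " + fooBar);
        }
        return action.apply(input);
    }
}
```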
Unfortunately I don't know of any automatic refactoring which could help you much in this. Earlier when I did somewhat similar refactorings, I wrote unit tests and did the refactoring step-by-step, manually, using automated support at the level of Move Method et al. Of course since the unit tests were pretty similar to each other in their structure, I could reuse part of the code there.
Update
@Sebastien pointed out in his comment that I missed the possible sub-ifs within the bigger if blocks. One can indeed use a hierarchy of maps to resolve this. However, if the hierarchy becomes really complex, with a lot of duplicated functionality, a further improvement might be to implement a DSL, moving the whole mapping out of the code into a config file or DB. In its simplest form it might look something like
foo -> com.foo.bar.SomeClass.someMethod
biz -> com.foo.bar.SomeOtherClass.someOtherMethod
baz -> com.foo.bar.YetAnotherClass.someMethod
bar -> com.foo.bar.SomeOtherClass.someMethod
biz -> com.foo.bar.DifferentClass.aMethod
baz -> com.foo.bar.AndAnotherClass.anotherMethod
where the indented lines configure the sub-conditions for each bigger case.
