When should function call another vs when to use method dependency injection - java

Consider code where I need to find the square of the sum of the x largest elements in an array. (This is NOT a "which data structure" question, so please don't post replies recommending a heap etc.)
I initially code it up:
OPTION1
singleFunction {
// code to sort
// code to sum
// code to square
return;
}
Soon, I realize I could leverage helper functions and break them into functions.
OPTION 2
getFinalAnswer() {
// sort;
return sumAndSquare();
}
sumAndSquare() {
// sum
return square();
}
square() {
// return square.
}
Now I realize sort, sum and square can be used as utility methods rather than simply helper methods.
Now I break down functionality into 3 functions (1) sort (2) sum x (3) square()
OPTION3
someFunction(int[] arr, int x) {
sort(arr);
b = sumOfLastXElements(arr, x);
c = square(b);
return c;
}
Now questions:
Looks like option 3 is the best of the lot, still so many times we find a function calling another. What is an advantage of option 2 over option 3 ?
A method by definition is supposed to do a single task/responsibility. but somefunction is doing 3 different things. What are such functions called ?

First, being strict, I must say that Java has no functions, only methods due to its OO nature.
1) Looks like option 3 is the best of the lot, still so many times we find a function calling another. What is an advantage of option 2 over option 3 ?
As you said, the sort, sum and square methods each hold a single responsibility, so there's no need for a single monster method that does all three. Also, each one can be reused later in other methods.
Option 2 has a sumAndSquare method that may or may not be reusable. This will depend heavily on your needs. You will know you need this method if you have lots of these in your code (and by lots, I mean at least 10 times in different methods):
long theSum = sum(array);
long theSquare = square(theSum);
2) A method by definition is supposed to do a single task/responsibility. but somefunction is doing 3 different things. What are such functions called?
Its task or responsibility is:
sort a list of numbers (I guess?)
sum the largest numbers
apply the square of the sum
So, the method is doing its task as expected. IMO you can even split the sumOfLastXElements into two methods: int[] findLastXElements(array) and long sum(array).
To answer this: What are such functions called? there's no specific nor special name, they are just methods. But the process to go from option 1 to option 3 is called Code Refactoring
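As a rough sketch of option 3 in real Java: the class name, the long return type, and the choice to sort a copy instead of the caller's array are my own assumptions, not something from the question.

```java
import java.util.Arrays;

// Hypothetical utility class illustrating option 3 from the question.
final class SquareOfSum {

    // Sums the last x elements of an array that is already sorted ascending.
    // If x exceeds the length, the whole array is summed.
    static long sumOfLastXElements(int[] sorted, int x) {
        long sum = 0;
        for (int i = Math.max(0, sorted.length - x); i < sorted.length; i++) {
            sum += sorted[i];
        }
        return sum;
    }

    static long square(long value) {
        return value * value;
    }

    // Composes the three small steps: sort, sum, square.
    static long squareOfSumOfLargest(int[] arr, int x) {
        int[] copy = arr.clone(); // assumption: leave the caller's array untouched
        Arrays.sort(copy);
        return square(sumOfLastXElements(copy, x));
    }

    public static void main(String[] args) {
        // The two largest elements are 5 and 4, so this prints 81.
        System.out.println(squareOfSumOfLargest(new int[] {5, 1, 4, 2, 3}, 2));
    }
}
```

Each helper can be tested and reused on its own, while squareOfSumOfLargest stays a short description of the overall task.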

Answer to your first question:
Option 2 has slightly more overhead, since each extra method call adds a stack frame.
Option 2 has more flexibility, and if you're going to use those individual pieces all the time, might be worth your time. However, if they are only used in code together, consider grouping them together.
As a wise professor of mine once said, separate what changes from what stays the same.
If they are only used together, no need to have individual methods/functions.
Answer to your second question:
Difference between a method and a function
(don't get too hung up on terminology, IMHO.)
Hope this helps.

Related

Use compareTo or not

I have to compare objects of a custom class, say A.
Comparison is simple one based on a int member, say mem, of A.
So in comparator implementation, I can either do:
(A a1, A a2) -> {return ((Integer) a1.getMem()).compareTo(a2.getMem());}
Or, I can do comparison on my own:
(A a1, A a2) -> {
if(a1.getMem() > a2.getMem()){
return 1;
}else{
if(a1.getMem() < a2.getMem()) {
return -1;
}else{
return 0;
}
}
}
Which one is a better approach?
The first approach has far fewer lines of code, but internally compareTo does the same thing we are doing in the second approach.
It's usually better not to re-invent the wheel. Therefore the first approach is better.
You can even write less code with Comparator.comparingInt:
Comparator.comparingInt(A::getMem)
Go for the first approach. It is more readable (how do we compare two As? compare their getMem) than a bunch of if statements and returning magic numbers. Also, using a method from the library like compareTo is less error prone than writing a bunch of comparison logic yourself. Imagine having mistyped a -1 as a 1 or a < as a >.
But, there is an even better approach:
Comparator.comparingInt(A::getMem)
One of the most basic rules to get to a "good" code base: avoid code duplication like the plague!
It is not only about writing the minimum amount of code to solve a problem. It is really about: not having the same logic in more than one place.
Why? Because when you decide to change that logic at some point, you have to remember to update all places that contain that logic.
There are studies that show that code duplication in larger projects sooner or later leads to having multiple almost identical clones of some piece of logic. And guess what: that is where bugs are hiding. You copy 9 lines out of 10, and you make a subtle modification within those 9 lines. And either you just added a bug, or you fixed a problem in those 9 lines, but not in the original 10 lines. And now two places in your code do slightly different things. Rarely a good thing.
So follow the two other answers, but understand why you should do that.
And make no mistake: at some point, you might decide that this compareTo implementation is no longer what you need. Then it is perfectly fine to change it to something else, and write that down in this place in full length. But until that day: re-use that already existing code!
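For completeness, here is a small self-contained sketch of the Comparator.comparingInt suggestion. The class A and getMem come from the question; the field name mem, the constructor, and the ComparatorDemo helper are assumptions made only so the example runs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Minimal stand-in for the question's class A (constructor is assumed).
class A {
    private final int mem;

    A(int mem) {
        this.mem = mem;
    }

    int getMem() {
        return mem;
    }
}

// Hypothetical demo class, not part of the original question.
class ComparatorDemo {

    // Builds A objects from the given values, sorts them by getMem,
    // and returns the sorted member values.
    static List<Integer> sortedMems(int... mems) {
        List<A> items = new ArrayList<>();
        for (int m : mems) {
            items.add(new A(m));
        }
        items.sort(Comparator.comparingInt(A::getMem));

        List<Integer> result = new ArrayList<>();
        for (A a : items) {
            result.add(a.getMem());
        }
        return result;
    }
}
```

The comparison logic lives in one library call, so there is no hand-written -1/0/1 branch to get wrong.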

What are the differences between returning values and side effect coding?

My questions are motivated by some C++ code which is not mine and that I am currently trying to understand. Nevertheless, I think this question can be answered by OO developers in general (because I have also seen this in Java code, for example).
Reading through the code, I noticed that the developer always works using side effects (most functions have a void return type, except for getters and some rare cases) instead of returning results directly. He sometimes uses return values, but only for control flow (error codes instead of exceptions).
Here are two possible examples of his prototypes (in pseudo-code):
For a function that should return min, max and avg of the float values in a matrix M:
void computeStatistics(float min, float max, float avg, Matrix M);
OR
void computeStatistics(List myStat, Matrix M);
For a function that should return some objects in a given list that verifies a certain criteria and the number of objects found:
int controlValue findObjects(List result, int nbObjectsFound, Object myCriteria, List givenList)
I am not familiar with C++ as you can probably see in my very-pseudo-code... But rather with Matlab where it is possible to return everything you want from a function for example an int and a List side by side (which could be useful for the second example). I know it is not possible in C++ and that could explain the second prototype but it doesn't explain the choice for the first example where he could have done:
List myStat computeStat(Matrix M)
Finally, here are my questions:
What are the possible reasons that could motivate this choice? Is it a good practice, a convention or just a development choice? Are there advantages of one way over the other (returning values vs. side effects way)?
In terms of C++:
IMO using return values is clearer than passing values by reference and, in most cases, adds no overhead (please have a look at RVO and copy elision).
However if you do use return values for your control flow, using references is not a bad thing and is still clear for most developers.
So I guess we could say that the choice is yours.
Keep also in mind that many developers are not aware of what black magic your C++ compiler is doing and so using return values might offend them.
In the past it was common practice to use reference parameters as outputs, since returning complex objects was very slow without return value optimization and move semantics. Today I believe that in most cases returning the value is the best choice.
Want Speed? Pass by Value.
Provided that List is copyable, I would consider writing the following inappropriate:
void computeStatistics(List myStat, Matrix M);
Instead (provided that List is copyable) you should write:
List myStat computeStat(Matrix M)
However, the call-by-reference approach can be justified if your object cannot be copied: then you won't need to allocate it on the heap; instead you can allocate it on the stack and pass your function a pointer to it.
Regarding:
void computeStatistics(float min, float max, float avg, Matrix M);
My personal opinion is that best-practice is one method one purpose, so I would do this like:
float min computeMin(Matrix M);
float max computeMax(Matrix M);
float avg computeAvg(Matrix M);
The only reason that I can see for making all this in one function would be because the calculations are not done separately (more work to do it in separate functions).
If, however, you need to have several return values in one method, I would do it with call-by-reference. For example:
void SomeMethod(input1, input2, &output1, &output2, &output3)
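Since the question mentions seeing the same pattern in Java, here is a minimal Java sketch of the "return a value" style for the statistics example. Stats, MatrixStats, and the use of float[][] in place of the question's Matrix are all hypothetical names picked for illustration; this is not the original developer's code.

```java
// Hypothetical immutable result object bundling the three values.
final class Stats {
    final float min;
    final float max;
    final float avg;

    Stats(float min, float max, float avg) {
        this.min = min;
        this.max = max;
        this.avg = avg;
    }
}

// Hypothetical helper; a float[][] stands in for the question's Matrix type.
final class MatrixStats {

    // Computes min, max and average in a single pass and returns them
    // together, instead of writing into output parameters.
    static Stats computeStatistics(float[][] matrix) {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        double sum = 0;
        int count = 0;
        for (float[] row : matrix) {
            for (float value : row) {
                min = Math.min(min, value);
                max = Math.max(max, value);
                sum += value;
                count++;
            }
        }
        if (count == 0) {
            throw new IllegalArgumentException("matrix must contain at least one value");
        }
        return new Stats(min, max, (float) (sum / count));
    }
}
```

The caller gets all three results from one call, and the signature makes it obvious what is input and what is output.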

Java: Setting an array of predetermined length as a method parameter?

I have a method that takes 5 double values and performs an action with them. Right now the argument list is five different doubles. Is there any way to pass a double[] as an argument to the method but make sure its length is exactly 5?
One way is this:
private void myMethod(double[] args) {
if (args.length == 5) {
// do something
}
}
but is there a better way?
If you know you need exactly 5 doubles, then I think you are better off asking for 5 distinct doubles. Even when they are listed out with meaningful names (and with IntelliSense or whatever it's called), it will be hard enough to keep the order of the variables straight. If they are in an array, the user will need to consult the documentation to see which value should go in which index.
No. You can't restrict the length of an array passed to a function.
If your goal is to keep the checking code out of the method so it's cleaner, you could delegate the real work to another method.
If your concern is the length of the parameter list you could pass a parameter object.
You could create a class which is a specialization of a Vector limited to 5 doubles, but it seems like overkill. I would just throw an exception if there are too few or too many entries in the array - this is likely a programming problem rather than a runtime exception.
You could put your code in a try-catch block. That lets you skip the explicit length check.
If something does go wrong, the exception handler deals with the problem.
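A minimal sketch combining two of the suggestions above: fail fast with an exception when the length is wrong, and delegate the real work to a second method so the check stays out of it. The class name, method names, and the summing "action" are placeholders, since the question does not say what the method does.

```java
// Hypothetical class; the real action performed on the five values is unknown.
final class FiveDoubles {

    private static final int EXPECTED_LENGTH = 5;

    // Public entry point: validates the argument, then delegates.
    static double process(double[] args) {
        if (args == null || args.length != EXPECTED_LENGTH) {
            throw new IllegalArgumentException(
                    "expected exactly " + EXPECTED_LENGTH + " values");
        }
        return doProcess(args);
    }

    // The real work, free of validation code. Summing is only a placeholder.
    private static double doProcess(double[] args) {
        double sum = 0;
        for (double value : args) {
            sum += value;
        }
        return sum;
    }
}
```

A wrong length is a programming error on the caller's side, so an unchecked IllegalArgumentException seems a reasonable fit here.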

Calling a method n times: should I use a converted for-each loop or a traditional for loop?

Given the need to loop up to an arbitrary int value, is it better programming practice to convert the value into an array and for-each the array, or just use a traditional for loop?
FYI, I am calculating the number of 5 and 6 results ("hits") in multiple throws of 6-sided dice. My arbitrary int value is the dicePool which represents the number of multiple throws.
As I understand it, there are two options:
Convert the dicePool into an array and for-each the array:
public int calcHits(int dicePool) {
int[] dp = new int[dicePool];
for (Integer a : dp) {
// call throwDice method
}
}
Use a traditional for loop:
public int calcHits(int dicePool) {
for (int i = 0; i < dicePool; i++) {
// call throwDice method
}
}
My view is that option 1 is clumsy code and involves unnecessary creation of an array, even though the for-each loop is more efficient than the traditional for loop in Option 2.
At this point, speed isn't important (insert premature-optimization comment ;). What matters is how quickly you can understand what the code does, which is to call a method dicePool times.
The first method allocates an array of size dicePool and iterates through its values, which happens to run the loop body dicePool times (I'll pretend you meant int instead of Integer to avoid the unrelated autoboxing issue). This is potentially inefficient for the computer running the code, but more importantly it's inefficient for the human reading the code as it's conceptually distant from what you wanted to accomplish. Specifically, you force the reader to think about the new array you've just made, AND the value of the variable a, which will be 0 for every iteration of the loop, even though neither of those are related to your end goal.
Any Java programmer looking at the second method will realize that you're executing the loop body dicePool times with i 'counting up' to dicePool. While the latter part isn't especially important, the beginning is exactly what you meant to do. Using this common Java idiom minimizes the unrelated things a reader needs to think about, so it's the best choice.
When in doubt, go with simplicity. :D
Why would you need to allocate an array to loop over a variable that can be safely incremented and used without any need of allocation?
It sounds unnecessarily inefficient. You might need to allocate an array if you needed to swap the order of the ints, but this is not the case. I would go for option 2 for sure.
The foreach is useful when you want to iterate over a collection, but creating a collection just to iterate over it when you don't otherwise need it makes no sense.
(2) is the obvious choice because there's no point in creating the array, based on your description. If there is, of course things change.
What makes you think that the for-each loop is more efficient?
Iterating over a set is very likely less efficient than a simple loop and counter.
It might help if you gave more context about the problem, specifically whether there's more to this question than choosing one syntax over the other. I am having trouble thinking of a problem to which #1 would be a better solution.
I wouldn't write the first one. It's not necessary to use the latest syntax in every setting.
Your instinct is a good one: if it feels and looks clumsy, it probably is.
Go with #2 and sleep at night.
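For reference, a complete version of option 2 might look like the sketch below. The question only says "call throwDice method", so passing the die in as an IntSupplier is my own assumption (it keeps the example deterministic and easy to test); HitCounter is likewise a made-up class name.

```java
import java.util.function.IntSupplier;

// Hypothetical class; the die is injected so the loop can be tested.
final class HitCounter {

    // Throws the die dicePool times and counts results of 5 or 6 ("hits").
    static int calcHits(int dicePool, IntSupplier die) {
        int hits = 0;
        for (int i = 0; i < dicePool; i++) {
            int result = die.getAsInt();
            if (result >= 5) {
                hits++;
            }
        }
        return hits;
    }
}
```

With a real die you could call it as calcHits(6, () -> rng.nextInt(6) + 1), where rng is a java.util.Random. No array is allocated, and the loop says exactly "do this dicePool times".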

Should I avoid using Java Label Statements?

Today I had a coworker suggest I refactor my code to use a label statement to control flow through 2 nested for loops I had created. I've never used them before because personally I think they decrease the readability of a program. I am willing to change my mind about using them if the argument is solid enough however. What are people's opinions on label statements?
Many algorithms are expressed more easily if you can jump across two loops (or a loop containing a switch statement). Don't feel bad about it. On the other hand, it may indicate an overly complex solution. So stand back and look at the problem.
Some people prefer a "single entry, single exit" approach to all loops. That is to say avoiding break (and continue) and early return for loops altogether. This may result in some duplicate code.
What I would strongly avoid doing is introducing auxiliary variables. Hiding control-flow within state adds to confusion.
Splitting labeled loops into two methods may well be difficult. Exceptions are probably too heavyweight. Try a single entry, single exit approach.
Labels are like goto's: use them sparingly, and only when they make your code faster and, more importantly, more understandable.
E.g., if you are in big loops six levels deep and you encounter a condition that makes the rest of the loop pointless to complete, there's no sense in having 6 extra trap doors in your condition statements to exit out of the loop early.
Labels (and goto's) aren't evil, it's just that sometimes people use them in bad ways. Most of the time we are actually trying to write our code so it is understandable for you and the next programmer who comes along. Making it uber-fast is a secondary concern (be wary of premature optimization).
When Labels (and goto's) are misused they make the code less readable, which causes grief for you and the next developer. The compiler doesn't care.
There are few occasions when you need labels and they can be confusing because they are rarely used. However if you need to use one then use one.
BTW: this compiles and runs, because http: is parsed as a label and everything after // is a comment.
class MyFirstJavaProg {
public static void main(String args[]) {
http://www.javacoffeebreak.com/java101/java101.html
System.out.println("Hello World!");
}
}
I'm curious to hear what your alternative to labels is. I think this is pretty much going to boil down to the argument of "return as early as possible" vs. "use a variable to hold the return value, and only return at the end."
Labels are pretty standard when you have nested loops. The only way they really decrease readability is when another developer has never seen them before and doesn't understand what they mean.
I have used a Java labeled loop in an implementation of a sieve method to find prime numbers (done for one of the Project Euler math problems), which made it 10x faster compared to nested loops. E.g. if (certain condition), go back to the outer loop.
private static void testByFactoring() {
primes: for (int ctr = 0; ctr < m_toFactor.length; ctr++) {
int toTest = m_toFactor[ctr];
for (int ctr2 = 0; ctr2 < m_divisors.length; ctr2++) {
// max (int) Math.sqrt(m_numberToTest) + 1 iterations
if (toTest != m_divisors[ctr2]
&& toTest % m_divisors[ctr2] == 0) {
continue primes;
}
} // end of the divisor loop
} // end of primes loop
} // method
I asked a C++ programmer how bad labeled loops are, he said he would use them sparingly, but they can occasionally come in handy. For example, if you have 3 nested loops and for certain conditions you want to go back to the outermost loop.
So they have their uses, it depends on the problem you were trying to solve.
I've never seen labels used "in the wild" in Java code. If you really want to break across nested loops, see if you can refactor your method so that an early return statement does what you want.
Technically, I guess there's not much difference between an early return and a label. Practically, though, almost every Java developer has seen an early return and knows what it does. I'd guess many developers would at least be surprised by a label, and probably be confused.
I was taught the single entry / single exit orthodoxy in school, but I've since come to appreciate early return statements and breaking out of loops as a way to simplify code and make it clearer.
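To illustrate the "refactor so an early return does the job" idea, here is one possible sketch of the inner divisor loop from the factoring example elsewhere in this thread, pulled out into its own method. FactorFilter, hasProperDivisor, and withoutProperDivisor are names I made up, and collecting the survivors into a list is an assumption, since the original snippet does not show what happens to them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical refactoring: the inner loop becomes a method with an early return.
final class FactorFilter {

    // Returns true as soon as some divisor (other than n itself) divides n.
    // Divisors are assumed to be non-zero.
    static boolean hasProperDivisor(int n, int[] divisors) {
        for (int d : divisors) {
            if (n != d && n % d == 0) {
                return true; // plays the role of "continue primes" in the labeled version
            }
        }
        return false;
    }

    // Keeps only the candidates that none of the divisors divide.
    static List<Integer> withoutProperDivisor(int[] candidates, int[] divisors) {
        List<Integer> result = new ArrayList<>();
        for (int candidate : candidates) {
            if (!hasProperDivisor(candidate, divisors)) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

Whether this reads better than the labeled continue is a matter of taste; the trade-off is an extra method in exchange for not needing the label.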
I'd argue in favour of them in some locations, I found them particularly useful in this example:
nextItem: for(CartItem item : user.getCart()) {
nextCondition : for(PurchaseCondition cond : item.getConditions()) {
if(!cond.check())
continue nextItem;
else
continue nextCondition;
}
purchasedItems.add(item);
}
I think with the new for-each loop, the label can be really clear.
For example:
sentence: for(Sentence sentence: paragraph) {
for(String word: sentence) {
// do something
if(isDone()) {
continue sentence;
}
}
}
I think that looks really clear by having your label the same as your variable in the new for-each. In fact, maybe Java should be evil and add implicit labels for-each variables heh
I never use labels in my code. I prefer to create a guard and initialize it to null or other unusual value. This guard is often a result object. I haven't seen any of my coworkers using labels, nor found any in our repository. It really depends on your style of coding. In my opinion using labels would decrease the readability as it's not a common construct and usually it's not used in Java.
Yes, you should avoid using label unless there's a specific reason to use them (the example of it simplifying implementation of an algorithm is pertinent). In such a case I would advise adding sufficient comments or other documentation to explain the reasoning behind it so that someone doesn't come along later and mangle it out of some notion of "improving the code" or "getting rid of code smell" or some other potentially BS excuse.
I would equate this sort of question with deciding when one should or shouldn't use the ternary if. The chief rationale being that it can impede readability and unless the programmer is very careful to name things in a reasonable way then use of conventions such as labels might make things a lot worse. Suppose the example using 'nextCondition' and 'nextItem' had used 'loop1' and 'loop2' for his label names.
Personally labels are one of those features that don't make a lot of sense to me, outside of Assembly or BASIC and other similarly limited languages. Java has plenty of more conventional/regular loop and control constructs.
I found labels to be sometimes useful in tests, to separate the usual setup, exercise and verify phases and group related statements. For example, using the BDD terminology:
@Test
public void should_Clear_Cached_Element() throws Exception {
given: {
elementStream = defaultStream();
elementStream.readElement();
Assume.assumeNotNull(elementStream.lastRead());
}
when:
elementStream.clearLast();
then:
assertThat(elementStream.lastRead()).isEmpty();
}
Your formatting choices may vary but the core idea is that labels, in this case, provide a noticeable distinction between the logical sections comprising your test, better than comments can. I think the Spock library just builds on this very feature to declare its test phases.
Personally whenever I need to use nested loops with the innermost one having to break out of all the parent loops, I just write everything in a method with a return statement when my condition is met, it's far more readable and logical.
Example Using method:
private static boolean exists(int[][] array, int searchFor) {
for (int[] nums : array) {
for (int num : nums) {
if (num == searchFor) {
return true;
}
}
}
return false;
}
Example Using label (less readable imo):
boolean exists = false;
existenceLoop:
for (int[] nums : array) {
for (int num : nums) {
if (num == searchFor) {
exists = true;
break existenceLoop;
}
}
}
return exists;
