I have a method that lists the elements of an ArrayList. Typically it prints the components of a global ArrayList variable, but in one specific instance I need it to print the components of a local variable.
So I have this
public static void listPlayers(ArrayList<Player> characters, boolean beingRolled) {
//print character components (beingRolled specifies which parts of each player to print)
}
and I use that when I'm working with the local ArrayList.
When I want to use the global, I call this version:
public static void listPlayers(boolean beingRolled) {
listPlayers(players, beingRolled);
}
where players is the global variable
Another thing I was thinking about is anywhere I want to use the global I could pass a null reference for characters and write this method
public static void listPlayers(ArrayList<Player> characters, boolean beingRolled) {
if (characters == null) characters = players;
//print components
}
Which is the more professional/recommended version?
In general, stay away from null; the overloaded method signature is a better approach.
That said, one thing you said is scary -- You have a static method operating on a global variable. In general, that is a bad idea. You should consider refactoring to use Object/class scoped state, rather than global, static scoped state.
I would even go so far as to say that you should only use the overload that takes the ArrayList as a parameter, and pass it the global variable every time. Then, even if you must use a global variable, at least you're using it in one fewer place.
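A minimal sketch of the two suggestions above, assuming a hypothetical Roster class that owns the list instead of a static global. The Player fields and the output format are invented for illustration, and the methods return a String rather than printing so the result is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Player type; the fields are assumptions made for this sketch.
class Player {
    final String name;
    final int roll;

    Player(String name, int roll) {
        this.name = name;
        this.roll = roll;
    }
}

// Hypothetical owner of the player list, replacing the static global.
class Roster {
    private final List<Player> players = new ArrayList<>();

    void add(Player player) {
        players.add(player);
    }

    // Convenience overload: callers who want the roster's own list
    // never have to pass null or know where the list lives.
    String listPlayers(boolean beingRolled) {
        return listPlayers(players, beingRolled);
    }

    // General overload: works on any list, including a local one.
    static String listPlayers(List<Player> characters, boolean beingRolled) {
        StringBuilder out = new StringBuilder();
        for (Player p : characters) {
            out.append(p.name);
            if (beingRolled) {
                out.append(" rolled ").append(p.roll);
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

The null-check variant from the question would hide the fallback inside the method body; here the fallback is visible in the signature of the one-argument overload.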
Edit: One of my professors back in college wrote a book on refactoring that is very readable and has a lot of good content (though the typography is a bit odd). It's called Principle-Based Refactoring: Learning Software Design Principles by Applying Refactoring Rules, by Steve Halladay. I highly suggest reading the first half (second half is essentially a reference).
The first version is better because it is easier for someone else to understand. When a function's arguments are all listed, the caller can more easily predict what the function will do. If the function relies on variables which are not visible, say if this is available in a library without the source, the caller will not understand why the function acts in an unexpected way. This would be impossible to understand without good documentation.
Related
I have a method that will process a Collection<Nodes> that is passed in as a parameter. This Collection will be modified, therefore I thought it would be good to first make a copy of it. How do I name the parameter and local variable, e.g. nodes in the example below?
List<Nodes> process(Collection<Nodes> nodes) {
List<Nodes> nodes2 = new ArrayList<>(nodes);
...
}
As another example consider the following where the variable is an int parsed from a String parameter:
public void processUser(final String userId) {
final int userId2 = Integer.parseInt(userId);
...
A good approach to the variable-naming problem is to use names that suggest the actual meaning of the variable. In your example, the names say nothing about what the method does or what the variables mean, which is why it is hard to pick a name.
There are many cases like yours in the JDK, e.g. Arrays#copyOf:
public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
    @SuppressWarnings("unchecked")
    T[] copy = ((Object)newType == (Object)Object[].class)
        ? (T[]) new Object[newLength]
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}
In this case they call the parameter original and the local variable copy which perfectly expresses that the returned value is a copy of the parameter. Precisely, copying is what this method does and it is named accordingly.
Using the same reasoning for your case (consider refactoring to give more meaningful names to your method and variables) I would name your local copy of nodes something like processedNodes, to express what that variable is and to be consistent with your method's name.
Edit:
The name of the new method you added in your edit does not provide hints about what it does either. I'll assume that it modifies some properties (maybe in a database) of the user whose id is passed via parameter.
If that is the case (or similar), I think an appropriate principle to apply is that every method should have a single responsibility. According to its name, your method should process the user, and for that you need an int userId. The responsibility of parsing a String userId should be outside the scope of this method.
Using the proposed approach has, among others, the following advantages:
Your class won't change if you have to add additional validation to your input.
Your class won't be responsible for handling NumberFormatException, which should be the application's responsibility.
Your processUser method won't change if you have to handle different types of inputs (e.g. float userId).
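The split described above might look roughly like this. UserProcessor and UserRequestHandler are hypothetical names, and what "processing" does is an assumption; the point is only that parsing and its NumberFormatException stay with the caller.

```java
// Hypothetical service; what "processing" means here is an assumption.
class UserProcessor {
    private int lastProcessedId = -1;

    // Works only with the canonical int id: no parsing, no NumberFormatException.
    void processUser(int userId) {
        lastProcessedId = userId;
    }

    int lastProcessedId() {
        return lastProcessedId;
    }
}

// Hypothetical caller (e.g. a request handler) that owns parsing and validation.
class UserRequestHandler {
    private final UserProcessor processor;

    UserRequestHandler(UserProcessor processor) {
        this.processor = processor;
    }

    // Returns false for unparsable input instead of passing the problem on to processUser.
    boolean handle(String rawUserId) {
        int userId;
        try {
            userId = Integer.parseInt(rawUserId.trim());
        } catch (NumberFormatException e) {
            return false;
        }
        processor.processUser(userId);
        return true;
    }
}
```

With this shape, extra validation or a different input type changes only the handler, not processUser.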
It ultimately comes down to what you want to communicate to future programmers. The computer obviously doesn't care; it's other people you're talking to. So the biggest factor is going to be what those people need to know:
What is the logical (abstract, conceptual) meaning of this variable?
What aspects of how this variable is used could be confusing to programmers?
What are the most important things about this variable?
Looking at your first example, it's kind of hard to understand enough about your program to really choose a good name. The method is called process; but methods generally speaking implement computational processes, so this name really doesn't tell me anything at all. What are you processing? What is the process? Who are you processing it for, and why? Knowing what the method does, and the class it's in, will help to inform your variable name.
Let's add some assumptions. Let's say you're building an application that locates Wi-fi access points in a building. The Node in question is a wireless node, with subclasses Repeater, AccessPoint, and Client. Let's also say it's an online-processed dataset, so the collection of nodes given may change at any time in response to a background thread receiving updates in what nodes are currently visible. Your reason for copying the collection at the head of the method is to isolate yourself from those changes for the duration of local processing. Finally, let's assume that your method is sorting the nodes by ping time (explaining why the method takes a generic Collection but returns the more specific List type).
Now that we better understand your system, let's use that understanding to choose some names that communicate the logical intention of your system to future developers:
class NetworkScanner {
List<Node> sortByPingTime(Collection<Node> networkNodes) {
final ArrayList<Node> unsortedSnapshot;
synchronized(networkNodes) {
unsortedSnapshot = new ArrayList<>(networkNodes);
}
return Utils.sort(unsortedSnapshot, (x,y) -> x.ping < y.ping);
}
}
So the method is sortByPingTime to define what it does; the argument is networkNodes to describe what kind of node we're looking at. And the variable is called unsortedSnapshot to express two things about it that aren't visible just by reading the code:
It's a snapshot of something (implying that the original is somehow volatile); and
It has no order that matters to us (suggesting that it might have, by the time we're done with it).
We could put nodes in there, but that's immediately visible from the input argument. We could also call this snapshotToSort but that's visible in the fact that we hand it off to a sort routine immediately below.
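For readers who want to run the idea, here is a self-contained sketch of the same method. It swaps the Utils.sort call for the standard List.sort with a Comparator; WirelessNode and its int ping field are assumptions, since the original Node type is not shown.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical node type; the ping field (milliseconds) is an assumption.
class WirelessNode {
    final String name;
    final int ping;

    WirelessNode(String name, int ping) {
        this.name = name;
        this.ping = ping;
    }
}

class PingSorter {
    static List<WirelessNode> sortByPingTime(Collection<WirelessNode> networkNodes) {
        final List<WirelessNode> unsortedSnapshot;
        // Copy while holding the collection's monitor so background updates cannot
        // interleave with the copy (assumes writers synchronize on the same object).
        synchronized (networkNodes) {
            unsortedSnapshot = new ArrayList<>(networkNodes);
        }
        // Sorts the private copy in place; the caller's collection is untouched.
        unsortedSnapshot.sort(Comparator.comparingInt((WirelessNode n) -> n.ping));
        return unsortedSnapshot;
    }
}
```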
This example remains kind of contrived. The method is really too short for the variable name to matter much. In real life I'd probably just call it out, because picking a good name would take longer than anyone will ever waste figuring out how this method works.
Other related notes:
Naming is inherently a bit subjective. My name will never work for everyone, especially when multiple human languages are taken into account.
I find that the best name is often no name at all. If I can get away with making something anonymous, I will--this minimizes the risk of the variable being reused, and reduces symbols in IDE 'find' boxes. Generally this also pushes me to write tighter, more functional code, which I view as a good thing.
Some people like to include the variable's type in its name; I've always found that a bit odd because the type is generally immediately obvious, and the compiler will usually catch me if I get it wrong anyway.
"Keep it Simple" is in full force here, as everywhere. Most of the time your variable name will not help someone avoid future work. My rule of thumb is, name it something dumb, and if I ever end up scratching my head about what something means, choose that occasion to name it something good.
I usually give names that reflect and emphasize the major things, so a potential reader (including myself after a couple of months) can immediately see what is done inside the method just from its signature.
The API under discussion receives an input, does some processing, and returns the output. These are the three main things here.
If it does not matter what processing is done or what the type of the input is, the most generic form is this:
List<Nodes> process(Collection<Nodes> input) {
List<Nodes> output = new ArrayList<>(input);
...
}
and
public void process(final String input) {
final int output = Integer.parseInt(input);
...
If it is important to convey more about the processing and the type of the input, names like processCollection, inputCollection and processUser, inputUserId are more appropriate, but the local variable is still output, which is a clear and self-explanatory name:
List<Nodes> processCollection(Collection<Nodes> inputCollection) {
List<Nodes> output = new ArrayList<>(inputCollection);
...
}
and
public void processUser(final String inputUserId) {
final int output = Integer.parseInt(inputUserId);
...
It depends on the use case and sometimes it is even more appropriate to elaborate the processing, which is done: asArray or asFilteredArray etc instead of processCollection.
Someone may prefer the source-destination terminology to the input-output - I do not see the major difference between them. If this serves telling the method story with its title, it is good enough.
It depends on what you are going to do with the local variable.
For example, in the first example it seems likely that the variable nodes2 will actually be the value returned in the end. My advice is then to simply call it result or output.
In the second example it is less clear what you want to achieve. I guess that userIdAsInt should be fine for the local. However, if an int is always expected here and you still want to keep the parameter as a String (perhaps you want to push that validation out of the method), I think it is more appropriate to name the local variable userId and the parameter userIdAsString or userIdString, which hints that String, although accepted here, is not the canonical representation of a userId, which is an int.
It certainly depends on the actual context. I would not borrow conventions from other languages, such as a _ prefix, which is fine for naming things in bash scripts, for instance. IMO a my prefix is also not a good choice: it looks like code copied from a tutorial (at least in Java).
The most simple solution is to name method parameter nodesParam or nodesBackup and then you can simply go with nodes as a copy or to be more specific you can call it nodesCopy.
Anyway, your method process has some tasks to do and maybe it is not the best place for making copies of the nodes list. You can make a copy in the place where you invoke the method, then you can simply use nodes as a name of your object:
List<Nodes> process(Collection<Nodes> nodes) {
// do amazing things here
// ...
}
// ...
process(new ArrayList<>(nodes))
// ...
Just a guess: if you have a collection and want to keep the original version while modifying a copy, maybe the real solution for you is java.util.stream.Stream.
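A small sketch of the Stream idea, assuming for illustration that the nodes are plain Strings and that "processing" means dropping empty entries and upper-casing the rest. Both the NodeLabels class and those steps are placeholders, not part of the question.

```java
import java.util.List;
import java.util.stream.Collectors;

class NodeLabels {
    // Builds a new list from the input; the input list itself is never modified,
    // so no explicit copy (and no second variable name) is needed.
    static List<String> process(List<String> nodes) {
        return nodes.stream()
                .filter(node -> !node.isEmpty())
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }
}
```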
Simply put, when naming the variable, I consider a few things.
How is the copy created? (Is it converted from one type to another?...)
What am I going to do with the variable?
Is the name short, but/and meaningful?
Considering the same examples you have provided in the question, I will name variables like this:
List<Nodes> process(Collection<Nodes> nodes) {
List<Nodes> nodesCopy = new ArrayList<>(nodes);
...
}
This is probably just a copy of the collection, hence the name nodesCopy. Meaningful and short. If you use nodesList, that can mean it is not just a Collection; but also a List (more specific).
public void processUser(final String userId) {
final int userIdInt = Integer.parseInt(userId);
...
The String userId is parsed and the result is an integer (int)! It is not just a copy. To emphasize this, I would name this as userIdInt.
It is better not to use an underscore _, because it often indicates instance variables. And the my prefix: not much of a meaning there, and it is nooby (local will do better).
When it comes to method parameter naming conventions, if the thing a method parameter represents will not be represented by any other variable, use a method parameter name that makes it very clear what that method parameter is in the context of the method body. For example, primaryTelephoneNumber may be an acceptable method parameter name in a JavaBean setter method.
If there are multiple representations of a thing in a method context (including method parameters and local variables), use names that make it clear to humans what that thing is and how it should be used. For example, providedPrimaryTelephoneNumber, requestedPrimaryTelephoneNumber, dirtyPrimaryTelephoneNumber might be used for the method parameter name and parsedPrimaryTelephoneNumber, cleanPrimaryTelephoneNumber, massagedPrimaryTelephoneNumber might be used for the local variable name in a method that persists a user-provided primary telephone number.
The main objective is to use names that make it clear to humans reading the source code today and tomorrow as to what things are. Avoid names like var1, var2, a, b, etc.; these names add extra effort and complexity in reading and understanding the source code.
Don't get too caught up in using long method parameter names or local variable names; the source code is for human readability and when the class is compiled method parameter names and local variable names are irrelevant to the machine.
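As a rough illustration of the provided/clean naming pair mentioned above, here is a hypothetical ContactDetails class. The normalization rule (keep digits and a plus sign) is an assumption made only so the example does something.

```java
class ContactDetails {
    private String primaryTelephoneNumber;

    // The parameter name says the value is raw user input; the local name
    // says it has been normalized and is what actually gets stored.
    void updatePrimaryTelephoneNumber(String providedPrimaryTelephoneNumber) {
        String cleanPrimaryTelephoneNumber =
                providedPrimaryTelephoneNumber.replaceAll("[^0-9+]", "");
        this.primaryTelephoneNumber = cleanPrimaryTelephoneNumber;
    }

    String getPrimaryTelephoneNumber() {
        return primaryTelephoneNumber;
    }
}
```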
I'm trying to write a very simple piece of code and can't figure out an elegant solution to do it:
int count = 0;
jdbcTemplate.query(readQuery, new RowCallbackHandler() {
    @Override
    public void processRow(ResultSet rs) throws SQLException {
        realProcessRow(rs);
        count++;
    }
});
This obviously doesn't compile. The 2 solutions that I'm aware of both stink:
I don't want to make count a class field because it's really a local variable that I just need for logging purposes.
I don't want to make count an array because it is plain ugly.
This is just silly; there has got to be a reasonable way to do it?
A third possibility is to use a final-mutable-int-object, for example:
final AtomicInteger count = new AtomicInteger(0);
....
count.incrementAndGet();
Apache Commons Lang also has a MutableInt, I believe, but I have not used it.
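A runnable sketch of the AtomicInteger approach. To keep it free of Spring, RowHandler and the query method below are hypothetical stand-ins for RowCallbackHandler and jdbcTemplate.query; only the capture of the final counter reference is the point.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for Spring's RowCallbackHandler.
interface RowHandler {
    void processRow(String row);
}

class RowCounter {
    // Hypothetical stand-in for jdbcTemplate.query: feeds each row to the handler.
    static void query(List<String> rows, RowHandler handler) {
        for (String row : rows) {
            handler.processRow(row);
        }
    }

    static int countRows(List<String> rows) {
        // The reference is final, so the inner class may capture it;
        // the value held inside it is still mutable.
        final AtomicInteger count = new AtomicInteger(0);
        query(rows, new RowHandler() {
            @Override
            public void processRow(String row) {
                count.incrementAndGet();
            }
        });
        return count.get();
    }
}
```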
You seem to already be aware of the solutions (they are different though); and you are probably aware of the reasons (it cannot capture local variables by reference because the variable might not exist by the time the closure is run, so it must capture by value (have multiple copies); it is bad to have the same variable refer to different copies in different scopes that each can be changed independently, so they cannot be changed).
If your closure does not need to share state back to the enclosing scope, then a field in the class is the right thing to do. I don't understand what your objection is. If the closure needs to be able to be called multiple times and it needs to increment each time, then it needs to maintain state in the object. A field (instance variable) properly expresses the storing of state in an object. The field can be initialized with the captured value from the outside scope.
If your closure needs to share state back to the enclosing scope (which is not a very common situation), then using a mutable structure (like an array) is the right thing to do, because it avoids the problem of the lifetime of the local variable.
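The field-based alternative could be sketched like this, with the counter living in the handler object rather than in the enclosing class. RowCallback, CountingCallback and CallbackDemo are hypothetical names used only for this illustration.

```java
import java.util.List;

// Hypothetical stand-in for a row callback interface.
interface RowCallback {
    void processRow(String row);
}

// The counter is state of the handler, so it is a field of the handler object.
class CountingCallback implements RowCallback {
    private int count;

    // The starting value is captured from the enclosing scope at construction time.
    CountingCallback(int initialCount) {
        this.count = initialCount;
    }

    @Override
    public void processRow(String row) {
        count++;
    }

    int count() {
        return count;
    }
}

class CallbackDemo {
    static int countRows(List<String> rows) {
        CountingCallback callback = new CountingCallback(0);
        for (String row : rows) {
            callback.processRow(row);
        }
        // The enclosing method reads the total back from the handler, e.g. for logging.
        return callback.count();
    }
}
```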
I typically make count a class field but add a comment that it is only a field because it is used by an inner closure, Runnable etc...
E.g. 2 method invocations:
myMethod(getHtmlFileName());
or
String htmlFileName=getHtmlFileName();
myMethod(htmlFileName);
Which is the better way, aside from the first requiring less typing?
If you are going to use a return value of a method in more than one place, storing it in a variable and using that variable in your code could be more practical, readable and easily debuggable rather than calling the method every time:
String htmlFileName = getHtmlFileName();
myMethod(htmlFileName);
....
myMethod(htmlFileName + "...");
The second approach would help you debug the return value of getHtmlFileName(), but other than that, neither approach is better than the other in an absolute sense. It's a matter of preference, and perhaps context, I would say. In this particular case I'd go for the first approach, but if you were combining several methods, I'd go for the second, for sake of readability, e.g.:
String first = firstMethod();
String second = secondMethod(first);
String third = thirdMethod(second);
rather than
thirdMethod(secondMethod(firstMethod()));
EDIT: As others have pointed out, if you're going to use the value in more than one place, then obviously you'd use the second approach and keep a reference to the value for later use.
It will probably depend on the context.
If you are going to use the htmlFileName variable elsewhere in your code block, you probably want to store it in a local variable (especially true for heavy method calls):
String htmlFileName=getHtmlFileName();
myMethod(htmlFileName);
If it is a one-off call, then
myMethod(getHtmlFileName());
is probably more elegant and easy to read.
If you use the value returned by getHtmlFileName() later on, and that value is fixed, you will want to use the second form, i.e. assign it to a local variable and reuse it, thereby avoiding redundant calls and object creation.
Otherwise (e.g. if you only call getHtmlFileName() once), you will want to use the first form, which is more concise and avoids a useless local variable assignment. There is no real harm in still using the second form, though (e.g. for debugging).
One of my most common bugs is that I can never remember whether something is a method or a property, so I'm constantly adding or removing parentheses.
So I was wondering if there was good logic behind making the difference between calling on an object's properties and methods explicit.
Obviously, it allows you to have properties and methods that share the same name, but I don't think that comes up much.
The only big benefit I can come up with is readability. Sometimes you might want to know whether something is a method or a property while you're looking at code, but I'm having trouble coming up with specific examples when that would be really helpful. But I am a n00b, so I probably just haven't encountered such a situation yet. I'd appreciate examples of such a situation.
Also, are there other languages where the difference isn't explicit?
Anyways, if you could answer, it will help me be less annoyed every time I make this mistake ^-^.
UPDATE:
Thanks everyone for the awesome answers so far! I only have about a week's worth of js, and 1 day of python, so I had no idea you could reference functions without calling them. That's awesome. I have a little more experience with java, so that's where I was mostly coming from... can anyone come up with an equally compelling argument for that to be the case in java, where you can't reference functions? Aside from it being a very explicit language, with all the benefits that entails :).
Most modern languages require this because referencing a function and calling a function are separate actions.
For example,
def func():
    print "hello"
    return 10
a = func
a()
Clearly, a = func and a = func() have very different meanings.
Ruby--the most likely language you're thinking of in contrast--doesn't require the parentheses; it can do this because it doesn't support taking references to functions.
In languages like Python and JavaScript, functions are first-class objects. This means that you can pass functions around, just like you can pass around any other value. The parentheses after the function name (the () in myfunc()) actually constitute an operator, just like + or *. Instead of meaning "add this number to another number" (in the case of +), () means "execute the preceding function". This is necessary because it is possible to use a function without executing it. For example, you may wish to compare it to another function using ==, or you may wish to pass it into another function, such as in this JavaScript example:
function alertSomething(message) {
alert(message);
}
function myOtherFunction(someFunction, someArg) {
someFunction(someArg);
}
// here we are using the alertSomething function without calling it directly
myOtherFunction(alertSomething, "Hello, araneae!");
In short: it is important to be able to refer to a function without calling it — this is why the distinction is necessary.
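For comparison with the Java background mentioned in the question: Java 8 and later can also refer to a method without calling it, using the separate :: method-reference syntax. The sketch below mirrors the JavaScript example; the FunctionPassing class and the shown list (used instead of a real alert) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class FunctionPassing {
    // Collects messages instead of showing an alert, so the result can be inspected.
    static final List<String> shown = new ArrayList<>();

    static void alertSomething(String message) {
        shown.add(message);
    }

    // Receives a reference to a function and decides when to call it.
    static void myOtherFunction(Consumer<String> someFunction, String someArg) {
        someFunction.accept(someArg);
    }

    public static void main(String[] args) {
        // FunctionPassing::alertSomething refers to the method without calling it.
        myOtherFunction(FunctionPassing::alertSomething, "Hello, araneae!");
        System.out.println(shown);
    }
}
```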
At least in JS, it's because you can pass functions around.
var func = new Function();
you can then do something like
var f = func
f()
so 'f' and 'func' are references to the function, and f() or func() is the invocation of the function.
which is not the same as
var val = f();
which assigns the result of the invocation to a var.
For Java, you cannot pass functions around, at least like you can in JS, so there is no reason the language needs to require a () to invoke a method. But it is what it is.
I can't speak at all for python.
But the main point is different languages might have reasons why syntax may be necessary, and sometimes syntax is just syntax.
I think you answered it yourself:
One of my most common bugs is that I can never remember whether something is a method or a property, so I'm constantly adding or removing parentheses.
Consider the following:
if (colorOfTheSky == 'blue')
vs:
if (colorOfTheSky() == 'blue')
We can tell just by looking that the first checks for a variable called colorOfTheSky, and we want to know if its value is blue. In the second, we know that colorOfTheSky() calls a function (method) and we want to know if its return value is blue.
If we didn't have this distinction it would be extremely ambiguous in situations like this.
To answer your last question, I don't know of any languages that don't have this distinction.
Also, you probably have a design problem if you can't tell the difference between your methods and your properties; as another answer points out, methods and properties have different roles to play. Furthermore it is good practice for your method names to be actions, e.g. getPageTitle, getUserId, etc., and for your properties to be nouns, e.g., pageTitle, userId. These should be easily decipherable in your code for both you and anyone who comes along later and reads your code.
If you're having trouble distinguishing between your properties and methods, you're probably not naming them very well.
In general, your methods should have a verb in them: i.e. write, print, echo, open, close, get, set, and property names should be nouns or adjectives: name, color, filled, loaded.
It's very important to use meaningful method and property names; without them, you'll have difficulty reading your own code.
In Java, I can think of two reasons why the () is required:
1) Java had a specific design goal to have a "C/C++ like" syntax, to make it easy for C and C++ programmers to learn the language. Both C and C++ require the parentheses.
2) The Java syntax specifically requires the parentheses to disambiguate a reference to an attribute or local from a call to a method. This is because method names and attribute / local names are declared in different namespaces. So the following is legal Java:
public class SomeClass {
private int name;
private int name() { ... }
...
int norm = name; // this one
}
If the () was not required for a method call, the compiler would not be able to tell if the labeled statement ("this one") was assigning the value of the name attribute or the result of calling the name() method.
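A compilable version of that point, with return values chosen arbitrarily (1 for the field, 2 for the method) so the two can be told apart. The fromField and fromMethod helpers are additions for this sketch.

```java
class SomeClass {
    private int name = 1;

    private int name() {
        return 2;
    }

    int fromField() {
        return name;      // no parentheses: reads the field
    }

    int fromMethod() {
        return name();    // parentheses: calls the method
    }
}
```

The field and the method share a name without conflict because the parentheses tell the compiler which one is meant.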
The difference isn't always explicit in VBA. This is a call to a Sub (i.e. a method with no return value) which takes no parameters (all examples are from Excel):
Worksheets("Sheet1").UsedRange.Columns.AutoFit
whereas this is accessing an attribute then passing it as a parameter:
MsgBox Application.Creator
As in the previous example, parentheses are also optional around parameters if there is no need to deal with the return value:
Application.Goto Worksheets("Sheet2").Range("A1")
but are needed if the return value is used:
iRows = Len("hello world")
Because referencing and calling a method are two different things. Consider X.method being a method of class X and x being an instance of X: x.method == 'blue' would never be true, because methods are not strings.
You can try this: print a method of an object:
>>> class X(object):
...     def a(self):
...         print 'a'
...
>>> x=X()
>>> print x.a
<bound method X.a of <__main__.X object at 0x0235A910>>
Typically properties are accessors, and methods perform some sort of action. Going on this assumption, it's cheap to use a property, expensive to use a method.
Foo.Bar, for example, would indicate to me that it would return a value, like a string, without lots of overhead.
Foo.Bar() (or more likely, Foo.GetBar()), on the other hand, implies needing to retrieve the value for "Bar", perhaps from a database.
Properties and methods have different purposes and different implications, so they should be differentiated in code as well.
By the way, in all languages I know of the difference in syntax is explicit, but behind the scenes properties are often treated as simply special method calls.
I'm revisiting data structures and algorithms to refresh my knowledge and from time to time I stumble across this problem:
Often, several data structures need to swap elements of the underlying array. So I implement the swap() method in ADT1 and ADT2 as a private non-static method. The good thing is that, being a private method, it doesn't need to check its parameters; the bad thing is the redundancy. But if I put the swap() method in a helper class as a public static method, I need to check the indices for validity every time, making the swap call very inefficient when many swaps are done.
So what should I do? Neglect the performance degradation, or write small but redundant code?
Better design should always trump small inefficiencies. Only address a performance problem once it has actually been proven to be one.
Besides, what kind of checking are you doing anyway? Aren't naturally thrown ArrayIndexOutOfBoundsException and/or NullPointerException good enough?
It's worth noting that while there's a public static Collections.swap(List<?>,int,int), java.util.Arrays makes its swap overloads (for int[], long[], byte[], etc.) all private static.
I'm not sure if Josh Bloch ever explicitly addressed why he did that, but one might guess that it has something to do with Item 25 in his book Effective Java, 2nd Edition: Prefer lists to arrays. Yes, there will be "performance degradation" in using List, but it's negligible, and the many advantages more than make up for it.
If you don't need to make the checks in the private method, don't make them in the static one. This will result in a RuntimeException for invalid calls, but since all your calls are supposed to be valid, it will be as though you've used a private method.
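A minimal sketch of such a shared helper, assuming int[] as the element type. ArrayUtil is a hypothetical name; the method does no explicit validation and relies on the exceptions the array accesses throw by themselves.

```java
final class ArrayUtil {
    private ArrayUtil() {
        // static helper only, not meant to be instantiated
    }

    // No explicit index checks: an invalid index fails on the array access itself
    // with ArrayIndexOutOfBoundsException, and a null array with NullPointerException.
    static void swap(int[] array, int i, int j) {
        int tmp = array[i];
        array[i] = array[j];
        array[j] = tmp;
    }
}
```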
It's always better for your code to be slightly less efficient than to be duplicated (a few constant-time checks are not significant). At least that is what is taught at my university.
Code duplication produces bugs. You should prefer a program that works correctly over one that works a little faster.
If you want to avoid bounds checking, what comes to my mind is that you can either accept the naturally thrown exceptions, as polygenelubricants suggested, or create an abstract superclass for all your array-based data structures. That abstract class would have a protected swap method that does not check its parameters. It's not perfect, but I guess that a protected method that does not check parameters is better than a public method that does not.
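The abstract-superclass idea might be sketched as follows. ArrayBackedStructure and ReversibleArray are hypothetical names, and reverse() merely stands in for any operation that performs many swaps.

```java
// Hypothetical base class for array-backed structures.
abstract class ArrayBackedStructure {
    protected int[] elements;

    protected ArrayBackedStructure(int[] elements) {
        this.elements = elements;
    }

    // Unchecked on purpose: it is reachable only from subclasses (and, in Java,
    // from classes in the same package), which are expected to pass valid indices.
    protected void swap(int i, int j) {
        int tmp = elements[i];
        elements[i] = elements[j];
        elements[j] = tmp;
    }
}

// Hypothetical subclass that reuses the shared swap.
class ReversibleArray extends ArrayBackedStructure {
    ReversibleArray(int[] elements) {
        super(elements);
    }

    void reverse() {
        for (int i = 0, j = elements.length - 1; i < j; i++, j--) {
            swap(i, j);
        }
    }

    int get(int index) {
        return elements[index];
    }
}
```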