toString(): for debugging or for humans? - java

class Address
{
private enum Component
{
NUMBER,
STREET,
STATE,
COUNTRY
}
private Map<Component, String> componentToValue = ...;
}
I'd like my class to contain two methods:
1. One to indicate the value of each address component (so I can debug if anything goes wrong).
2. One to return the address in a form expected by humans: "1600 Amphitheatre Parkway Mountain View, CA 94043".
What is the best practice for Object.toString()? Is it primarily meant for #1 or #2? Is there a best practice for naming these methods?

Would you format an address the same way in an SMS message and in an HTML page? Would you format it the same way in English, French and Japanese?
If not, then you have your answer: the presentation does not belong to the object, but to the presentation layer displaying the object. Unless the object is made specifically for the presentation layer (for example, an HtmlI18nedAddress), use toString for debugging.
Consider Date vs SimpleDateFormat. Date contains the state and SimpleDateFormat returns multiple representations.
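A minimal sketch of that split, in the spirit of Date vs SimpleDateFormat. The class names (PostalAddress, UsAddressFormatter) and the component set are assumptions made for illustration, not the asker's actual code: the object keeps a debug-oriented toString(), and a separate formatter owns the human-readable layout.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch only: the component set and class names are assumptions.
final class PostalAddress {
    enum Component { NUMBER, STREET, CITY, STATE, ZIP }

    private final Map<Component, String> componentToValue = new EnumMap<>(Component.class);

    PostalAddress with(Component component, String value) {
        componentToValue.put(component, value);
        return this;
    }

    String get(Component component) {
        return componentToValue.get(component);
    }

    /** Debug view: dumps every stored component; the format is not a stable contract. */
    @Override
    public String toString() {
        return "PostalAddress" + componentToValue;
    }
}

// Presentation lives outside the object, so other layouts (HTML, other locales)
// can be added without touching PostalAddress.
final class UsAddressFormatter {
    String format(PostalAddress a) {
        return a.get(PostalAddress.Component.NUMBER) + " "
                + a.get(PostalAddress.Component.STREET) + " "
                + a.get(PostalAddress.Component.CITY) + ", "
                + a.get(PostalAddress.Component.STATE) + " "
                + a.get(PostalAddress.Component.ZIP);
    }
}
```

A second formatter (say, for HTML) would be another small class taking the same PostalAddress, which is the point of keeping the layout out of toString().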

I would say the first. Data formatting should not be hard-coded into the ToString() function of the object.
I look at it this way: I try to make my ToString() output data that a matching Parse(string data) function could read (whether that function actually exists is not important). So in this case, if you want a specific format, write a specific function, and leave the generic data dump to ToString().
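To make the "matching Parse" idea concrete, here is a small Java sketch. GridPoint and its text format are hypothetical, chosen only to show a toString() whose output a parse() method can read back.

```java
// Hypothetical value class: the name and the "GridPoint(x,y)" format are assumptions.
final class GridPoint {
    final int x;
    final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    /** Returns "GridPoint(x,y)", the same format that parse(String) accepts. */
    @Override
    public String toString() {
        return "GridPoint(" + x + "," + y + ")";
    }

    /** Reads back the text produced by toString(). */
    static GridPoint parse(String text) {
        String prefix = "GridPoint(";
        if (!text.startsWith(prefix) || !text.endsWith(")")) {
            throw new IllegalArgumentException("Not a GridPoint: " + text);
        }
        String[] parts = text.substring(prefix.length(), text.length() - 1).split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected two coordinates: " + text);
        }
        return new GridPoint(Integer.parseInt(parts[0].trim()), Integer.parseInt(parts[1].trim()));
    }
}
```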

I normally use the Apache Commons ToStringBuilder http://commons.apache.org/lang/api-2.5/org/apache/commons/lang/builder/ToStringBuilder.html with only the parts that I think are absolutely necessary for debugging.

According to Effective Java Item 12: "Always override toString" the contract for toString() is:
The result should be a concise but informative representation that is easy for a person to read. [...] providing a good toString implementation makes your class much more pleasant to use and makes systems using the class easier to debug.
Thus, it is for debugging.
More notes on toString():
Add JavaDoc (the format should be explained here)
As soon as the format is fixed, keep in mind that the format will be used for parsing.
I highly recommend investing in the book "Effective Java". It is a very nice read. Just five to ten minutes for an item, but your Java life will change forever!

You could read from some debug property that you configure dynamically:
@Override
public String toString() {
    if (debug) {
        return debugAddrString();
    }
    return normalAddrString();
}

ToString should generally only be used for debug information. Keep in mind that you're overriding a method on Object; is it conceptually accurate to have a method call on an Object reference return a human readable address? In some cases, depending on the project, this may actually make sense, but it sounds a bit odd to me. I would implement a new method.
Another thing to note is that most modern IDEs use the ToString method to print debug information about objects when inspecting them.

One thing you can do is use a Debug flag to change this as you like:
public String toString(boolean debug) {
if (debug) return debugStringVersion;
else return humanVersion;
}
public String toString() {
return toString(Util.DEBUG);
}
Of course this assumes that you have a utility class set up with a debug flag in it.

Related

Is there a semantic difference between toExternalForm and toString on java.net.URL?

The implementation of one simply delegates to the other, which suggests to me that there is a semantic difference between the two from an interface standpoint -- or at least, someone thought so at some point. Can anyone shed some light there?
Edit: I already know the implementation of toString delegates to toExternalForm. It's the first thing I said. :) I'm asking why this duplication exists - that's what I meant by "semantic" difference.
The javadocs state this for both toString() and toExternalForm(),
Constructs a string representation of this URL. The string is created by calling the toExternalForm method of the stream protocol handler for this object.
In other words, the two methods are specified to return the same value.
Why?
It would be difficult to find the real reason that URL API was designed this way. The decisions were made ~25 years ago. People won't remember, and meeting notes (if they were taken) have probably been lost or disposed of.
However, I imagine the reasoning would have gone something like this:
1. The Object.toString() method has a very loose specification. It basically just returns something that may be useful for debugging.
2. The designers probably decided that they wanted a method that has a clear and specific behavior for stringifying a URL object. They called it URL.toExternalForm().
3. Having designed and implemented URL.toExternalForm() someone probably thought:
"Oh ... now I have a good way to implement URL.toString()".
4. Finally, they probably decided to specify that the two methods return the same thing.
The decision to specify that the two methods return the same thing was made between Java 1.0 and Java 1.1. (Google for the Java 1.0 and 1.1 documentation and look at the respective javadocs.)
This suggests that step 4 was done "after the fact" of the original implementation. (We would need to look at the original source code and commit history to confirm that, and it is not available.)
The OpenJDK code contains the answer:
There is absolutely no difference between java.net.URL.toString() and java.net.URL.toExternalForm() as toString() just calls toExternalForm():
public final class URL implements java.io.Serializable {
    ...
    public String toString() {
        return toExternalForm();
    }
    ...
    public String toExternalForm() {
        return handler.toExternalForm(this);
    }
}
Why is a different question. Neither method has been changed for more than 13 years. Also, some Java 1.1 documentation that is still online indicates that both methods were designed to return the same result from the very beginning of Java. Most likely toExternalForm() is the proper method for getting a String representation of a URL, and for convenience toString() just returns the same result, as toString() is used far more often by most Java developers.
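If you want to check the equivalence yourself, a quick sketch like the following should do. UrlStringForms is a hypothetical helper name; the java.net calls are standard JDK API.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper: compares the two string forms of the same URL object.
final class UrlStringForms {
    static boolean haveSameStringForms(String spec) {
        try {
            URL url = URI.create(spec).toURL();
            // toString() is specified to return the same value as toExternalForm()
            return url.toString().equals(url.toExternalForm());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
    }
}
```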

Is it ok to add toString() to ease debugging?

I work a lot in IntelliJ, and it can be quite convenient for classes to have their own toString() (the one IntelliJ generates works fine), so you can see something more informative than MyClass@1345 when trying to figure out what something is.
My question is: is that OK? I am adding code that has no business value and doesn't affect my test cases or the execution of my software (I am not using toString() for anything more than debugging). Still, it is a part of my process. What is correct here?
The toString() method is mainly intended for debugging.
Except in some exceptional cases, you should favor its use for debugging rather than for displaying information to clients: client needs may differ from what toString() returns, or may match it today and diverge tomorrow.
From the toString() javadoc, you can read :
Returns a string representation of the object. In general, the
toString method returns a string that "textually represents" this
object. The result should be a concise but informative representation
that is easy for a person to read. It is recommended that all
subclasses override this method.
The parts that matter for you are:
The result should be a concise but informative representation
that is easy for a person to read.
and
It is recommended that all
subclasses override this method.
You said that :
Still, it is a part of my process. What is correct here?
The good news: the specification recommends it.
Besides the excellent points by davidxxx, the following things apply:
Consistency matters. People working with your code should not be surprised by what is happening within your classes. So either all (or most) classes override toString() using similar implementations, or none do.
Thus: make sure everybody agrees if/how to implement toString()
Specifically ensure that your toString() implementation is robust
Meaning: you absolutely have to avoid that your implementation throws any exception (for example a NPE because you happen to do someString + fieldX.name() for some fieldX that might be null).
You also have to avoid creating an "expensive" implementation (for example code that does a "deep dive" into some database to return a value from there).
My two cents: I find toString() to be of great value when debugging, but I have also seen real performance impacts from toString() implementations that were too expensive. The thing is, you have no idea how often some trace code might call toString() on your objects, so you had better make sure it returns quickly.
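A small sketch of the robustness point, assuming a made-up Ticket class with nullable fields: the toString() below never dereferences a field and only touches data already in memory, so it can neither throw an NPE nor become expensive.

```java
import java.util.Objects;

// Hypothetical class used only to illustrate a null-safe, cheap toString().
final class Ticket {
    enum Priority { LOW, HIGH }

    private final String title;      // may be null
    private final Priority priority; // may be null

    Ticket(String title, Priority priority) {
        this.title = title;
        this.priority = priority;
    }

    @Override
    public String toString() {
        // priority.name() would throw a NullPointerException when priority is null;
        // Objects.toString substitutes a default instead. Plain concatenation of a
        // null reference is safe and prints "null".
        return "Ticket[title=" + title
                + ", priority=" + Objects.toString(priority, "<unset>") + "]";
    }
}
```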
The docs explain the function of this method:
Returns a string representation of the object. In general, the toString method returns a string that "textually represents" this object. The result should be a concise but informative representation that is easy for a person to read. It is recommended that all subclasses override this method.
As you see, they don't specify a particular use for this method or discourage you from using it for debugging; they only state what it is expected to do and recommend implementing this method in subclasses of Object.
Therefore, strictly speaking, how you use this method is up to you. In the university course I am taking, overriding the toString method is required for some tasks, and in some cases we are asked to use it to demonstrate debugging.
It is perfectly OK and even a good idea. Most classes don't specify the content of toString so it's not wise to use it for logic (the content may change in a future version of the class). But some classes do, for example StringBuilder. And then it is also OK to use the return value for logic.
So for your own classes you may even opt to specify the content and use (and let your users use) the return value for logic.

Is this the right naming convention?

Assuming we have a method which calls another.
public void readXMLFile() {
    // for each line read from the file, parse it into a node
    Node node = parse(line);
}
private Node parse(String line) {
    // ...
}
Now, is it good practice to use a more comprehensive function name like "readXMLFileAndParse"?
Pros:
It gives the caller more comprehensive information about what the function is supposed to do.
Otherwise the client, seeing a function that apparently only reads, may wonder where the "parse" utility is.
In other words, I see a clear advantage in a function name that covers all the activities nested within it. Is this the right thing to do, i.e. is this considered good practice?
It's a guideline that every method can only have one job (single responsibility).
However this will cause problems for the naming where a method will return a result of a combination of sub-methods.
Therefore you should name it to describe its primary function: parsing a file. Reading a file is part of that, but it's not vital to the end user since it's implied.
Then again, you have to think of what this exactly entails: nobody just parses a file just to parse it. Do you retrieve data? Do you write data?
You should describe your actions on that file, but not as literally as 'readfile' or 'parsefile'.
RetrieveCustomers if you're reading customers would be a lot more descriptive.
public List<Customer> RetrieveCustomers() {
// loop over lines
// call parser
}
private Customer ParseCustomer() { }
If you'd share what exactly it is you're trying to parse, that would help a lot.
I think it depends on the complexity of your class. Since the method is private, no one, in theory, should care. Name it descriptively enough that you can read your own code 6 months from now, and stop there.
public methods, on the other hand, should be well-named and well-documented. Extra descriptiveness there can't hurt.

Naming conventions for Java methods that return boolean

I like using a question mark at the end of method/function names in other languages. Java doesn't let me do this. As a workaround, how else can I name boolean-returning methods in Java? Using is, has, should or can at the front of a method name sounds okay for some cases. Is there a better way to name such methods?
For e.g. createFreshSnapshot?
The convention is to ask a question in the name.
Here are a few examples that can be found in the JDK:
isEmpty()
hasChildren()
That way, the names are read like they would have a question mark on the end.
Is the Collection empty?
Does this Node have children?
And, then, true means yes, and false means no.
Or, you could read it like an assertion:
The Collection is empty.
The node has children
Note:
Sometimes you may want to name a method something like createFreshSnapshot?. Without the question mark, the name implies that the method should be creating a snapshot, instead of checking to see if one is required.
In this case you should rethink what you are actually asking. Something like isSnapshotExpired is a much better name, and conveys what the method will tell you when it is called. Following a pattern like this can also help keep more of your functions pure and without side effects.
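As a rough illustration of that renaming, assuming a hypothetical SnapshotPolicy class with a maximum snapshot age: the predicate only answers a question and changes nothing, so the "is" prefix reads naturally at the call site.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical policy class: the name and the maxAge rule are assumptions.
final class SnapshotPolicy {
    private final Duration maxAge;

    SnapshotPolicy(Duration maxAge) {
        this.maxAge = maxAge;
    }

    /** Pure check: true when the snapshot is older than maxAge; nothing is created or modified. */
    boolean isSnapshotExpired(Instant takenAt, Instant now) {
        return Duration.between(takenAt, now).compareTo(maxAge) > 0;
    }
}
```

A caller can then write something like if (policy.isSnapshotExpired(takenAt, Instant.now())) { ... } and do the actual snapshot creation in a separate, command-named method.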
If you do a Google Search for isEmpty() in the Java API, you get lots of results.
If you wish your class to be compatible with the Java Beans specification, so that tools utilizing reflection (e.g. JavaBuilders, JGoodies Binding) can recognize boolean getters, either use getXXXX() or isXXXX() as a method name. From the Java Beans spec:
8.3.2 Boolean properties
In addition, for boolean properties, we allow a getter method to match the pattern:
public boolean is<PropertyName>();
This “is<PropertyName>” method may be provided instead of a “get<PropertyName>” method, or it may be provided in addition to a “get<PropertyName>” method. In either case, if the “is<PropertyName>” method is present for a boolean property then we will use the “is<PropertyName>” method to read the property value. An example boolean property might be:
public boolean isMarsupial();
public void setMarsupial(boolean m);
I want to post this link as it may help people checking this answer and looking for more Java style conventions:
Java Programming Style Guidelines
Item "2.13 is prefix should be used for boolean variables and methods." is specifically relevant and suggests the is prefix.
The style guide goes on to suggest:
There are a few alternatives to the is prefix that fits better in some situations. These are has, can and should prefixes:
boolean hasLicense();
boolean canEvaluate();
boolean shouldAbort = false;
If you follow the Guidelines I believe the appropriate method would be named:
shouldCreateFreshSnapshot()
For methods which may fail, that is, where you specify boolean as the return type to signal success, I would use the prefix try:
if (tryCreateFreshSnapshot())
{
// ...
}
For all other cases use prefixes like is.. has.. was.. can.. allows.. ..
The standard is to use is or has as a prefix. For example isValid, hasChildren.
is is the one I've come across more than any other. Whatever makes sense in the current situation is the best option though.
I want to offer a different view on this general naming convention, e.g.:
see java.util.Set: boolean add(E e)
where the rationale is:
do some processing, then report whether it succeeded or not.
While the return type is indeed boolean, the method's name should describe the processing to perform rather than the result type (boolean in this example).
Your createFreshSnapshot example seems to me more related to this point of view, because it seems to mean: create a fresh snapshot, then report whether the create operation succeeded. Following this reasoning, createFreshSnapshot seems to be the best name for your situation.
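For reference, Set.add returns true only if the set did not already contain the element. A short sketch of a command-named method following the same pattern (SnapshotRegistry and register are made-up names for illustration):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: a command-style method whose boolean reports the outcome,
// mirroring java.util.Set.add.
final class SnapshotRegistry {
    private final Set<String> names = new HashSet<>();

    /** Registers the name; returns true if it was added, false if it was already present. */
    boolean register(String name) {
        return names.add(name);
    }
}
```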

Get Methods: One vs Many

getEmployeeNameByBatchId(int batchID)
getEmployeeNameBySSN(Object SSN)
getEmployeeNameByEmailId(String emailID)
getEmployeeNameBySalaryAccount(SalaryAccount salaryAccount)
or
getEmployeeName(int typeOfIdentifier, byte[] identifier) -> In this methods the typeOfIdentifier tells if identifier is batchID/SSN/emailID/salaryAccount
Which one of the above is better way implement a get method?
These methods would be in a Servlet and calls would be made from an API which would be provided to the customers.
Why not overload the getEmployeeName(??) method?
getEmployeeName(int BatchID)
getEmployeeName(object SSN)(bad idea)
getEmployeeName(String Email)
etc.
Seems a good 'many' approach to me.
You could use something like that:
interface Employee{
public String getName();
int getBatchId();
}
interface Filter{
boolean matches(Employee e);
}
public Filter byName(final String name){
return new Filter(){
public boolean matches(Employee e) {
return e.getName().equals(name);
}
};
}
public Filter byBatchId(final int id){
return new Filter(){
public boolean matches(Employee e) {
return e.getBatchId() == id;
}
};
}
public Employee findEmployee(Filter sel){
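List<Employee> allEmployees = Collections.emptyList(); // placeholder: load the real employee list here (a null list would throw in the loop below)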
for (Employee e:allEmployees)
if (sel.matches(e))
return e;
return null;
}
public void usage(){
findEmployee(byName("Gustav"));
findEmployee(byBatchId(5));
}
If you do the filtering by an SQL query you would use the Filter interface to compose a WHERE clause.
The good thing with this approach is that you can combine two filters easily with:
public Filter and(final Filter f1,final Filter f2){
return new Filter(){
public boolean matches(Employee e) {
return f1.matches(e) && f2.matches(e);
}
};
}
and use it like that:
findEmployee(and(byName("Gustav"),byBatchId(5)));
What you get is similar to the Criteria API in Hibernate.
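On Java 8 or later, much the same idea can probably be expressed with java.util.function.Predicate, which already provides and/or/negate. The sketch below is an adaptation, not the answer's original code; Staff and StaffFilters are hypothetical names, and the in-memory list stands in for whatever data source you actually have.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical stand-in for the Employee type used in the answer above.
final class Staff {
    final String name;
    final int batchId;

    Staff(String name, int batchId) {
        this.name = name;
        this.batchId = batchId;
    }
}

final class StaffFilters {
    static Predicate<Staff> byName(String name) {
        return s -> s.name.equals(name);
    }

    static Predicate<Staff> byBatchId(int id) {
        return s -> s.batchId == id;
    }

    /** Returns the first entry matching the filter, or an empty Optional if none matches. */
    static Optional<Staff> findFirst(List<Staff> all, Predicate<Staff> filter) {
        return all.stream().filter(filter).findFirst();
    }
}
```

Usage would look like StaffFilters.findFirst(all, StaffFilters.byName("Gustav").and(StaffFilters.byBatchId(5))), with Optional replacing the null return.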
I'd go with the "many" approach. It seems more intuitive to me and less prone to error.
I don't like getXByY() - that might be cool in PHP, but I just don't like it in Java (ymmv).
I'd go with overloading, unless you have properties of the same datatype. In that case, I'd do something similar to your second option, but instead of using ints, I'd use an Enum for type safety and clarity. And instead of byte[], I'd use Object (because of autoboxing, this also works for primitives).
These methods are a perfect example of a use case for overloading.
getEmployeeName(int batchID)
getEmployeeName(Object SSN)
getEmployeeName(String emailID)
getEmployeeName(SalaryAccount salaryAccount)
If the methods have common processing inside, just write one more method, getEmployeeNameImpl(...), and extract the common code there to avoid duplication.
First option, no question. Be explicit. It will greatly aid in maintainability and there's really no downside.
@Stephan: it is difficult to overload a case like this (in general) because the parameter types might not be discriminative, e.g.,
getEmployeeNameByBatchId(int batchId)
getEmployeeNameByRoomNumber(int roomNumber)
See also the two methods getEmployeeNameBySSN, getEmployeeNameByEmailId in the original posting.
I will use explicit method names. Everyone who maintains that code, including me later on, will understand what the method is doing without having to write XML comments.
Sometimes it can be more convenient to use the specification pattern.
E.g.: GetEmployee(ISpecification<Employee> specification)
And then start defining your specifications...
class NameSpecification : ISpecification<Employee>
{
    private string name;
    public NameSpecification(string name) { this.name = name; }
    public bool IsSatisfiedBy(Employee employee) { return employee.Name == this.name; }
}
NameSpecification spec = new NameSpecification("Tim");
Employee tim = MyService.GetEmployee(spec);
I would use the first option, or overload it in this case, seeing as you have 4 different parameter signatures. However, being specific helps with understanding the code 3 months from now.
Is the logic inside each of those methods largely the same?
If so, the single method with identifier parameter may make more sense (simple and reducing repeated code).
If the logic/procedures vary greatly between types, a method per type may be preferred.
As others suggested, the first option seems to be the good one. The second might make sense while you're writing the code, but when someone else comes along later on, it's harder to figure out how to use it. (I know, you have comments and you can always dig deep into the code, but getEmployeeNameById is more self-explanatory.)
Note: Btw, usage of Enums might be something to consider in some cases.
In a trivial case like this, I would go with overloading. That is:
getEmployeeName( int batchID );
getEmployeeName( Object SSN );
etc.
Only in special cases would I specify the argument type in the method name, i.e. if the type of argument is difficult to determine, if there are several arguments that have the same data type (batchId and employeeId, both int), or if the methods for retrieving the employee are radically different for each argument type.
I can't see why I'd ever use this
getEmployeeName(int typeOfIdentifier, byte[] identifier)
as it requires both callee and caller to cast the value based on typeOfIdentifier. Bad design.
If you rewrite the question you can end up asking:
"SELECT name FROM ... "
"SELECT SSN FROM ... "
"SELECT email FROM ... "
vs.
"SELECT * FROM ..."
And I guess the answer to this is easy and everyone knows it.
What happens if you change the Employee class? E.g.: You have to remove the email and add a new filter like department. With the second solution you have a huge risk of not noticing any errors if you just change the order of the int identifier "constants".
With the first solution you will always notice if you are using the method in some long forgotten classes you would otherwise forget to modify to the new identifier.
I personally prefer to have the explicit naming "...ByRoomNumber" because if you end up with many "overloads" you will eventually introduce unwanted errors. Being explicit is imho the best way.
I agree with Stephan: One task, one method name, even if you can do it multiple ways.
The method overloading feature exists for exactly your case.
getEmployeeName(int BatchID)
getEmployeeName(String Email)
etc.
And avoid your second solution at all cost. It smells like "thy olde void * of C". Likewise, passing a Java "Object" is almost as poor style as a C "void *".
If you have a good design you should be able to determine if you can use the overloading approach or if you're going to run into a problem where if you overload you're going to end up having two methods with the same parameter type.
Overloading seems like the best way initially, but if you end up not being able to add a method in future and messing things up with naming it's going to be a hassle.
Personally I'd go for the approach of a unique name per method; that way you don't run into problems later when trying to overload methods with the same parameter type. Also, if someone extended your class in the future and implemented another void getEmployeeName(String name) it wouldn't override yours.
To summarise, go with a unique method name for each method, overloading can only cause problems in the long run.
The decoupling between the search process and the search criteria that jrudolf proposes in his example is excellent. I wonder why it isn't the most voted solution. Am I missing something?
I'd go with Query Objects. They work well for accessing tables directly. If you are confined to stored procedures, they lose some of their power, but you can still make it work.
The first is probably the best in Java, considering it is typesafe (unlike the other). Additionally, for "normal" types, the second solution seems to only provide cumbersome usage for the user. However, since you are using Object as the type for SSN (which has a semantic meaning beyond Object), you probably won't get away with that type of API.
All-in-all, in this particular case I would have used the approach with many getters. If all identifiers have their own class type, I might have gone the second route, but switching internally on the class instead of a provided/application-defined type identifier.
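One way to picture "identifiers with their own class type" is the sketch below. BatchId, RoomNumber and EmployeeDirectory are hypothetical names, and the in-memory maps stand in for whatever storage is really used. Because each identifier has its own type, two int-based keys no longer collide and the compiler picks the right lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper types: both wrap an int, but they cannot be confused.
final class BatchId {
    final int value;
    BatchId(int value) { this.value = value; }
}

final class RoomNumber {
    final int value;
    RoomNumber(int value) { this.value = value; }
}

// Hypothetical in-memory directory used only to show overload resolution.
final class EmployeeDirectory {
    private final Map<Integer, String> namesByBatch = new HashMap<>();
    private final Map<Integer, String> namesByRoom = new HashMap<>();

    void add(String name, BatchId batchId, RoomNumber room) {
        namesByBatch.put(batchId.value, name);
        namesByRoom.put(room.value, name);
    }

    // The parameter type, not an int flag, selects the lookup.
    String getEmployeeName(BatchId batchId) {
        return namesByBatch.get(batchId.value);
    }

    String getEmployeeName(RoomNumber room) {
        return namesByRoom.get(room.value);
    }
}
```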
Stick all your options in an enum, then have something like the following:
String getEmployeeName(Identifier identifier)
{
    switch (identifier)
    {
        case eBatchID:
            // Do stuff
            break;
        case eSSN:
            break;
        case eEmailID:
            break;
        case eSalary:
            break;
        default:
            // No match
            return null;
    }
    return null; // placeholder: return the name found above
}
enum Identifier
{
    eBatchID,
    eSSN,
    eEmailID,
    eSalary
}
You are thinking C/C++.
Use objects instead of an identifier byte (or int).
My bad, the overload approach is better, and using the SSN as a primary key is not so good.
public ??? getEmployeeName(Object obj) {
    if (obj instanceof Integer) {
        ...
    } else if (obj instanceof String) {
        ...
    } // else if ... and so on
    else {
        throw new SomeMeaningfulRuntimeException();
    }
    return employeeName;
}
I think it is better to use unchecked exceptions to signal incorrect input.
Document it so the customer knows what objects to expect. Or create your own wrappers. I prefer the first option.
