I work a lot in IntelliJ, and it can be quite convenient for classes to have their own toString() (the one IntelliJ generates works fine), so that you see something more informative than MyClass@1345 when trying to figure out what an object is.
My question is: is that OK? I am adding code that has no business value and doesn't affect my test cases or the execution of my software (I am not using toString() for anything other than debugging). Still, it is a part of my process. What is correct here?
The toString() method is mainly intended for debugging.
Apart from a few exceptional cases, you should use it for debugging and not to display information to clients: what clients need may match the toString() output today but differ tomorrow.
The toString() javadoc says:
Returns a string representation of the object. In general, the
toString method returns a string that "textually represents" this
object. The result should be a concise but informative representation
that is easy for a person to read. It is recommended that all
subclasses override this method.
The parts that matter for you are:
The result should be a concise but informative representation
that is easy for a person to read.
and
It is recommended that all
subclasses override this method.
You said that :
Still, it is a part of my process. What is correct here?
Good thing : the specification recommends it.
Besides the excellent points by davidxxx, the following things apply:
Consistency matters. People working with your code should not be surprised by what happens inside your classes. So either all (or most) classes @Override toString() using similar implementations, or none do.
Thus: make sure everybody agrees on whether and how to implement toString().
Specifically, ensure that your toString() implementation is robust.
Meaning: your implementation must never throw an exception (for example an NPE because you happen to do someString + fieldX.name() for some fieldX that might be null).
You also have to avoid creating an "expensive" implementation (for example code that does a "deep dive" into some database to return a value from there).
My two cents of personal opinion: I find toString() to be of great value when debugging things, but I have also seen real performance impacts from toString() implementations that were too expensive. The thing is: you have no idea how often some trace code might be calling toString() on your objects, so you had better make sure it returns quickly.
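The robustness point can be illustrated with a minimal sketch. Account and Status are made-up names for illustration; the only assumption is that both fields may legitimately be null.

```java
import java.util.Objects;

// Hypothetical types, used only to illustrate a null-safe and cheap toString().
enum Status { ACTIVE, SUSPENDED }

final class Account {
    private final String owner;   // may be null
    private final Status status;  // may be null

    Account(String owner, Status status) {
        this.owner = owner;
        this.status = status;
    }

    @Override
    public String toString() {
        // Risky variant: "status=" + status.name() throws a NullPointerException when status is null.
        // Plain concatenation renders a null reference as "null", and Objects.toString
        // substitutes the given default, so neither part can throw.
        return "Account{owner=" + owner + ", status=" + Objects.toString(status, "<unset>") + "}";
    }
}
```

Only in-memory fields are touched, so the call stays cheap no matter how often logging code invokes it.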
The docs explain the function of this method:
Returns a string representation of the object. In general, the toString method returns a string that "textually represents" this object. The result should be a concise but informative representation that is easy for a person to read. It is recommended that all subclasses override this method.
As you see, they don't specify a particular use for this method or discourage you from using it for debugging; they only state what it is expected to do and recommend implementing it in subclasses of Object.
Therefore, strictly speaking, how you use this method is up to you. In the university course I am taking, overriding the toString method is required for some tasks, and in some cases we are asked to use it to demonstrate debugging.
It is perfectly OK and even a good idea. Most classes don't specify the content of toString so it's not wise to use it for logic (the content may change in a future version of the class). But some classes do, for example StringBuilder. And then it is also OK to use the return value for logic.
So for your own classes you may even opt to specify the content and use (and let your users use) the return value for logic.
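As a small illustration of the StringBuilder case: its documentation states that toString() returns the character sequence built so far, so depending on the result is safe. The class and method names below are made up for the sketch.

```java
// Hypothetical helper showing a toString() whose content is specified and can be relied on.
final class SpecifiedToStringDemo {
    static String join(String left, String right) {
        StringBuilder sb = new StringBuilder();
        sb.append(left).append('-').append(right);
        // StringBuilder.toString() is documented to return the accumulated characters,
        // so using the result in program logic is fine.
        return sb.toString();
    }
}
```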
Related
The implementation of one simply delegates to the other, which suggests to me that there is a semantic difference between the two from an interface standpoint -- or at least, someone thought so at some point. Can anyone shed some light there?
Edit: I already know the implementation of toString delegates to toExternalForm. It's the first thing I said. :) I'm asking why this duplication exists - that's what I meant by "semantic" difference.
The javadocs state this for both toString() and toExternalForm(),
Constructs a string representation of this URL. The string is created by calling the toExternalForm method of the stream protocol handler for this object.
In other words, the two methods are specified to return the same value.
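A quick way to see this for yourself is sketched below. UrlStringDemo is a made-up name, and the URL is created through URI.create(...).toURL() only because the URL constructors are deprecated in recent JDK releases.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical demo class comparing the two string forms of a URL.
final class UrlStringDemo {
    static boolean sameString(String spec) {
        try {
            URL url = URI.create(spec).toURL();
            // Both methods are specified to produce the same text.
            return url.toString().equals(url.toExternalForm());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("not a valid URL: " + spec, e);
        }
    }
}
```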
Why?
It would be difficult to find the real reason that URL API was designed this way. The decisions were made ~25 years ago. People won't remember, and meeting notes (if they were taken) have probably been lost or disposed of.
However, I imagine the reasoning would have gone something like this:
1. The Object.toString() method has a very loose specification. It basically just returns something that may be useful for debugging.
2. The designers probably decided that they wanted a method that has a clear and specific behavior for stringifying a URL object. They called it URL.toExternalForm().
3. Having designed and implemented URL.toExternalForm(), someone probably thought: "Oh ... now I have a good way to implement URL.toString()".
4. Finally, they probably decided to specify that the two methods return the same thing.
The decision to specify that the two methods return the same thing was made between Java 1.0 and Java 1.1. (Google for the Java 1.0 and 1.1 documentation and look at the respective javadocs.)
This suggests that step 4 was done "after the fact" of the original implementation. (We would need to look at the original source code and commit history to confirm that, and it is not available.)
The OpenJDK code contains the answer:
There is absolutely no difference between java.net.URL.toString() and java.net.URL.toExternalForm() as toString() just calls toExternalForm():
public final class URL implements java.io.Serializable {
    ...
    public String toString() {
        return toExternalForm();
    }
    ...
    public String toExternalForm() {
        return handler.toExternalForm(this);
    }
    ...
}
The question of WHY is a different topic. Neither method has been changed in more than 13 years. Also, some Java 1.1 documentation that is still online indicates that both methods were designed to return the same result right from the beginning of Java. Most likely toExternalForm() is the correct method for getting a String representation of a URL, and for convenience the toString() method just returns the same result, as toString() is used far more often by most Java developers.
Can such tests have a good reason to exist?
Some classes use toString for more than just user-readable informative string. Examples are StringBuilder and StringWriter. In such a case it is of course advisable to test the method just like any other business-value method.
Even in the general case it is good practice to smoke-test toString for reliability (no exceptions thrown). The last thing you need is a log statement blowing up your code due to an ill-implemented toString. It has happened to me several times, and the resulting bugs are of the nastiest kind, since you don't even see the toString call in the source code—it's implicitly buried inside a log statement.
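Such a smoke test can be tiny. The sketch below uses plain Java rather than a test framework; ToStringSmokeTest and Fragile are made-up names, and Fragile deliberately contains the kind of bug described above.

```java
// Hypothetical helper: reports whether toString() completes and returns a non-null string.
final class ToStringSmokeTest {
    static boolean toStringIsSafe(Object candidate) {
        try {
            return candidate.toString() != null;
        } catch (RuntimeException e) {
            // Any unchecked exception (typically a NullPointerException) counts as a failure.
            return false;
        }
    }
}

// Hypothetical class whose toString() blows up when label is null.
final class Fragile {
    private final String label; // may be null

    Fragile(String label) {
        this.label = label;
    }

    @Override
    public String toString() {
        return "Fragile[" + label.trim() + "]"; // NPE when label is null
    }
}
```

Running the helper against instances with unset or null fields is what catches the problem before a log statement does.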
The question is not should I test toString(), but do you care about the result of toString()? Is it used for something? If so, then yes, test it.
If a method gets used for something real, then test it.
The obvious answer is „no, it's just a waste of time“. But for many classes, value wrappers above all, toString should be overridden to deliver more information than just org.package.ClassName@2be2befa.
So my proposed test for toString is:
@Test
public final void testToString() {
    assertFalse(new MyClass().toString().contains("@"));
}
It also increases test coverage, which is at least not a bad thing.
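A string check like this can fail for the wrong reason when the object's state itself contains that character (an e-mail address, say). One alternative, sketched here under the assumption that "overridden somewhere in the class hierarchy" is what you want to verify, is to ask reflection where toString() is declared. ToStringOverrideCheck and Plain are made-up names.

```java
// Hypothetical helper: checks whether a class (not an interface) overrides Object.toString().
final class ToStringOverrideCheck {
    static boolean overridesToString(Class<?> type) {
        try {
            // getMethod resolves the public toString() that instances of the class would run,
            // so its declaring class is Object only if nothing in the hierarchy overrides it.
            return type.getMethod("toString").getDeclaringClass() != Object.class;
        } catch (NoSuchMethodException e) {
            throw new AssertionError("expected a class type, got: " + type, e);
        }
    }
}

// Hypothetical class without its own toString().
final class Plain {
}
```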
If the result of the method is important to you, you should test it, otherwise you can just ignore that.
I am going to go against the general advice and say that testing a toString method definitely has its place. The applications I have worked on log a lot, especially with debug or trace level logging turned on. If I am relying on the logs to help identify a bug and some fields from my POJO are missing because some developer forgot to regenerate the toString method, this is a huge setback!
The problem is that the toString method is an absolute pain to test, as there is no fixed format or clear way to test it. I would recommend not writing a test yourself, but using a library such as ToStringVerifier:
@Test
public void testToString()
{
    ToStringVerifier.forClass(User.class).verify();
}
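If adding a library is not an option, the "forgotten field" problem can be approximated by hand. This is a rough sketch, not how ToStringVerifier itself works: it assumes a field counts as present when its name appears in the output, and ToStringFieldCheck and this User class are made up for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists instance fields whose names do not occur in toString().
final class ToStringFieldCheck {
    static List<String> missingFields(Object instance) {
        String text = instance.toString();
        List<String> missing = new ArrayList<>();
        for (Field field : instance.getClass().getDeclaredFields()) {
            // Skip static and compiler-generated fields; only instance state is of interest.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            if (!text.contains(field.getName())) {
                missing.add(field.getName());
            }
        }
        return missing;
    }
}

// Hypothetical POJO whose toString() was not regenerated after "email" was added.
final class User {
    private final String name;
    private final String email;

    User(String name, String email) {
        this.name = name;
        this.email = email;
    }

    @Override
    public String toString() {
        return "User{name=" + name + "}";
    }
}
```

A test would then assert that missingFields(...) is empty. The name-based check is crude (it only looks at declared fields and can be fooled by values that contain a field name), which is one reason a dedicated library is attractive.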
class Address
{
    private enum Component
    {
        NUMBER,
        STREET,
        STATE,
        COUNTRY
    }
    private Map<Component, String> componentToValue = ...;
}
I'd like my class to contain two methods:
1. One to indicate the value of each address component (so I can debug if anything goes wrong).
2. One to return the address in a form expected by humans: "1600 Amphitheatre Parkway Mountain View, CA 94043".
What is the best practice for Object.toString()? Is it primarily meant for #1 or #2? Is there a best practice for the naming of these methods?
Would you format an address the same way in a SMS message and in an HTML page? Would you format it the same way in English, French and Japanese?
If not, then you have your answer: the presentation does not belong to the object, but to the presentation layer displaying the object. Unless the object is made specifically for the presentation layer, for example if it is an HtmlI18nedAddress, use toString for debugging.
Consider Date vs SimpleDateFormat. Date contains the state and SimpleDateFormat returns multiple representations.
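To make that split concrete, here is a minimal sketch in which one Date value is rendered in two ways by separate formatters. DateFormatDemo is a made-up name; the time zone and locale are pinned only so the output does not depend on the machine.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: the Date holds the state, the formatter decides the presentation.
final class DateFormatDemo {
    static String format(Date date, String pattern) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ROOT);
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter.format(date);
    }
}
```

For example, with new Date(0L), the pattern "yyyy-MM-dd" gives 1970-01-01 and "dd/MM/yyyy HH:mm" gives 01/01/1970 00:00, while the Date object itself is unchanged.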
I would say the first. Data formatting should not be hard coded into the ToString() function of the object.
I look at it this way: I try to make my ToString() output data that is readable by a matching Parse(string data) function (if that function actually exists or not is not important). So in this case, if you want a specific formatting, write a specific function, and leave the generic data dump routines to ToString().
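In Java terms, the "output that a matching parse function could read" idea might look like the sketch below. GridPoint and its "x,y" format are assumptions made for illustration.

```java
// Hypothetical value class whose toString() output can be read back by parse().
final class GridPoint {
    final int x;
    final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Format: "x,y" with decimal integers, for example "3,-4".
    @Override
    public String toString() {
        return x + "," + y;
    }

    // Inverse of toString(); rejects input that does not have exactly two parts.
    static GridPoint parse(String text) {
        String[] parts = text.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"x,y\" but got: " + text);
        }
        return new GridPoint(Integer.parseInt(parts[0].trim()), Integer.parseInt(parts[1].trim()));
    }
}
```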
I normally use the Apache Commons ToStringBuilder http://commons.apache.org/lang/api-2.5/org/apache/commons/lang/builder/ToStringBuilder.html with only the parts that I think are absolutely necessary for debugging.
According to Effective Java Item 12: "Always override toString" the contract for toString() is:
The result should be a concise but informative representation that is easy for a person to read. [...] providing a good toString implementation makes your class much more pleasant to use and makes systems using the class easier to debug.
Thus, it is for debugging.
More notes on toString():
Add JavaDoc (the format should be explained there).
As soon as the format is fixed, keep in mind that the format will be used for parsing.
I highly recommend investing in the book "Effective Java". It is a very nice read. Each item takes just five to ten minutes, but your Java life will change forever!
You could read from some debug property that you configure dynamically:
@Override
public String toString() {
    if (debug) {
        return debugAddrString();
    }
    return normalAddrString();
}
ToString should generally only be used for debug information. Keep in mind that you're overriding a method on Object; is it conceptually accurate to have a method call on an Object reference return a human readable address? In some cases, depending on the project, this may actually make sense, but it sounds a bit odd to me. I would implement a new method.
Another thing to note is that most modern IDEs use the ToString method to print debug information about objects when inspecting them.
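A sketch of the "implement a new method" option for the Address class from the question. The method names with and toDisplayString, and the display format, are assumptions for illustration only.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical variant of the Address class: toString() for debugging, a separate method for people.
final class Address {
    enum Component { NUMBER, STREET, STATE, COUNTRY }

    private final Map<Component, String> componentToValue = new EnumMap<>(Component.class);

    // Sets one component and returns this instance so calls can be chained.
    Address with(Component component, String value) {
        componentToValue.put(component, value);
        return this;
    }

    // Debug view: every stored component together with its name, in enum order.
    @Override
    public String toString() {
        return "Address" + componentToValue;
    }

    // Human-readable view; the layout is one possible choice, not a standard.
    String toDisplayString() {
        return componentToValue.get(Component.NUMBER) + " " + componentToValue.get(Component.STREET)
                + ", " + componentToValue.get(Component.STATE)
                + ", " + componentToValue.get(Component.COUNTRY);
    }
}
```

The debugger and log output keep showing the component-by-component view, while UI code calls the dedicated method (or its own formatter).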
One thing you can do is use a Debug flag to change this as you like:
public String toString(boolean debug) {
    if (debug) return debugStringVersion;
    else return humanVersion;
}
public String toString() {
    return toString(Util.DEBUG);
}
Of course, this assumes that you have a utility class set up with a debug flag in it.
I overheard two of my colleagues arguing about whether or not to create a new data model class which only contains one string field and a setter and a getter for it. A program will then create a few objects of the class and put them in an array list. The guy who is storing them argues that there should be a new type, while the guy who is getting the data says there is no point going through all this trouble when you can simply store a string.
Personally I prefer creating a new type so we know what's being stored in the array list, but I don't have strong arguments to persuade the 'getting' data guy. Do you?
Sarah
... a new data model class which only contains one string field and a setter and a getter for it.
If it was just a getter, then it is not possible to say in general whether a String or a custom class is better. It depends on things like:
consistency with the rest of your data model,
anticipating whether you might want to change the representation,
anticipating whether you might want to implement validation when creating an instance, add helper methods, etc,
implications for memory usage or persistence (if they are even relevant).
(Personally, I would be inclined to use a plain String by default, and only use a custom class if for example, I knew that it was likely that a future representation change / refinement would be needed. In most situations, it is not a huge problem to change a String into custom class later ... if the need arises.)
However, the fact that there is proposed to be a setter for the field changes things significantly. Instances of the class will be mutable, where instances of String are not. On the one hand this could possibly be useful; e.g. where you actually need mutability. On the other hand, mutability would make the class somewhat risky for use in certain contexts; e.g. in sets and as keys in maps. And in other contexts you may need to copy the instances. (This would be unnecessary for an immutable wrapper class or a bare String.)
(The simple answer is to get rid of the setter, unless you really need it.)
There is also the issue that the semantics of equals will be different for a String and a custom wrapper. You may therefore need to override equals and hashCode to get a more intuitive semantic in the custom wrapper case. (And that relates back to the issue of a setter, and use of the class in collections.)
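Putting these points together, an immutable wrapper with value-based equals and hashCode might be sketched as follows. ProductCode is a made-up name, since the question does not say what the string represents.

```java
import java.util.Objects;

// Hypothetical immutable wrapper around a single string: no setter, value-based equality.
final class ProductCode {
    private final String value;

    ProductCode(String value) {
        this.value = Objects.requireNonNull(value, "value");
    }

    String value() {
        return value;
    }

    // Two wrappers are equal when the wrapped strings are equal, mirroring String semantics.
    @Override
    public boolean equals(Object other) {
        return other instanceof ProductCode && value.equals(((ProductCode) other).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return "ProductCode[" + value + "]";
    }
}
```

Because the field is final and there is no setter, instances behave safely in sets and as map keys.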
Wrap it in a class, if it matches the rest of your data model's design.
It gives you a label for the string so that you can tell what it represents at run time.
It makes it easier to take your entity and add additional fields and behavior (which is a likely occurrence).
That said, the key is if it matches the rest of your data model's design... be consistent with what you already have.
Counterpoint to mschaef's answer:
Keep it as a string, if it matches the rest of your data model's design. (See how the opening sounds so important, even if I temper it with a sentence that basically says we don't know the answer?)
If you need a label saying what it is, add a comment. Cost = one line, total. Heck, for that matter, you need a line (or three) to comment your new class, anyway, so what's the class declaration for?
If you need to add additional fields later, you can refactor it then. You can't design for everything, and if you tried, you'd end up with a horrible mess.
As Yegge says, "the worst thing that can happen to a code base is size". Add a class declaration, a getter, a setter, now call those from everywhere that touches it, and you've added size to your code without an actual (i.e., non-hypothetical) purpose.
I disagree with the other answers:
It depends whether there's any real possibility of adding behavior to the type later [Matthew Flaschen]
No, it doesn’t. …
Never hurts to future-proof the design [Alex]
True, but not relevant here …
Personally, I would be inclined to use a plain String by default [Stephen C]
But this isn’t a matter of opinion. It’s a matter of design decisions:
Is the entity you store logically a string, a piece of text? If yes, then store a string (ignoring the setter issue).
If not – then do not store a string. That data may be stored as a string is an implementation detail, it should not be reflected in your code.
For the second point it’s irrelevant whether you might want to add behaviour later on. All that matters is that in a strongly typed language, the data type should describe the logical entity. If you handle things that are not text (but may be represented by text, may contain text …) then use a class that internally stores said text. Do not store the text directly.
This is the whole point of abstraction and strong typing: let the types represent the semantics of your code.
And finally:
As Yegge says, "the worst thing that can happen to a code base is size". [Ken]
Well, this is so ironic. Have you read any of Steve Yegge’s blog posts? I haven’t, they’re just too damn long.
It depends whether there's any real possibility of adding behavior to the type later. Even if the getters and setters are trivial now, a type makes sense if there is a real chance they could do something later. Otherwise, clear variable names should be sufficient.
In the time spent discussing whether to wrap it in a class, it could be wrapped and done with. Never hurts to future-proof the design, especially when it only takes minimal effort.
I see no reason why the String should be wrapped in a class. The basic premise of the discussion is that what is needed right now is a String object. If that need grows later, refactor it then. Why add unnecessary code in the name of future-proofing?
Wrapping it in a class provides you with more type safety - in your model you can then only use instances of the wrapper class, and you can't easily make a mistake where you put a string that contains something different into the model.
However, it does add overhead, extra complexity and verbosity to your code.
I know that in Java, it is common practice to use "get" as a prefix to an accessor method. I was wondering what the reason for this is. Is it purely to be able to predict what it is returning?
To clarify: in some Java classes (e.g. String), a value such as the length is accessed by calling "length()" (and elsewhere "size()"). Why are these methods named like this, but others like "getSomeVariable()"?
Thank you for your time.
Edit: Good to see I'm not alone about the confusion & such about the size and length variables
The 'get' prefix (or 'is' for methods returning booleans) is part of the JavaBeans specification, which is used throughout Java but mostly in views in web UIs.
length() and size() are historical artefacts from pre-JavaBeans times; many a UI developer has lamented the fact that Collection has a size() method instead of getSize().
Because properties are nouns and methods are verbs. It is part of the bean pattern that is well-established and therefore expected by anyone using your class.
It might make sense to say:
String txt="I have " + car.GetFuelLevel() + " liters of petrol.";
or ...
String txt="I have " + car.FuelLevel + " liters of petrol.";
but not ...
String txt="I have " + car.FuelLevel() + " liters of petrol.";
I mean, it doesn't make sense to say "Hey, car. Go FuelLevel for me." But to say "Hey, car. Go GetFuelLevel for me." That's more natural.
Now, why did they break rank with String.length() and others? That's always bothered me, too.
The get prefix is particularly useful if you also have set, add, remove, etc., methods. Of course, it's generally better to have an interface full of gets or full of sets. If almost every method has get then it just becomes noise. So, I'd drop the get for immutables and the set for builders. For "fundamental" types, such as collections and strings, these little words are also noisy, IMO.
The get/set conventions stem from the JavaBeans specification, so people strongly tend to use them.
And the .size(), .length(), and even .length attribute of arrays are all examples of Java's failures to follow its own conventions. There are many more, it's "fun" to discover them!
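The three spellings side by side, as a small sketch (SizeNamingDemo is a made-up name):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo: three ways the standard library exposes "how many elements".
final class SizeNamingDemo {
    static int[] sizes() {
        int[] numbers = {1, 2, 3};
        String text = "abc";
        List<String> items = Arrays.asList("a", "b", "c");
        return new int[] {
            numbers.length,  // arrays: a field, no parentheses
            text.length(),   // String: a method named length()
            items.size()     // collections: a method named size()
        };
    }
}
```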
They may be failures to follow the specification, but they improve readability. size and length allow you to read the following line of code:
for (int i=0; i<thing.size(); ++i){
As...
While i is less than the thing's size...
There's no real convention behind this, but it does make it easier to translate into a sentence directly.
The historical reason was that the JavaBean specification stated that accessors to class properties should be done with getPropertyName/setPropertyName. The benefit was that you could then use Introspection APIs to dynamically list the properties of an object, even one that you hadn't previously compiled into your program. An example of where this would be useful is in building a plug-in architecture that needs to load objects and provide the user access to the properties of the object.
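The introspection benefit can be seen with the standard java.beans API. In this sketch, BeanIntrospectionDemo and Car are made-up names; the point is that the property names are derived purely from the getX/setX/isX method names.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;
import java.util.TreeSet;

final class BeanIntrospectionDemo {
    // Hypothetical bean that follows the get/set/is naming convention.
    public static class Car {
        private int fuelLevel;

        public int getFuelLevel() { return fuelLevel; }
        public void setFuelLevel(int fuelLevel) { this.fuelLevel = fuelLevel; }
        public boolean isRunning() { return false; }
    }

    // Lists the bean's property names, discovered at run time from its accessor methods.
    static Set<String> propertyNames(Class<?> beanClass) {
        try {
            // Passing Object.class as the stop class leaves out the inherited "class" property.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Set<String> names = new TreeSet<>();
            for (PropertyDescriptor descriptor : info.getPropertyDescriptors()) {
                names.add(descriptor.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For the Car class above this yields the properties fuelLevel and running, even though the calling code never refers to those methods by name.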
You have different names to retrieve size in different classes simply because they were written by different people and there probably wasn't at the time a design guideline for naming class methods in a consistent manner. Once millions of lines of code had been written using these inconsistent names, it was too late to change.