Defining Strings as Null is better or as Empty Strings

Defining Strings as Null is better or as Empty Strings - java

I come from a C#.net background and whenever I had a string I was declaring it as String.Empty
Now I am looking at a Java code from a co-worker and he has declared his strings in a method like this:
String myStr = null;
I don't like it, and even worse, he is assigning values to these strings in an "IF" block where it may or may not even qualify, and then at the end of the method he is calling a myStr.length() on them.
So my question is what is a preferred way in Java? do you think is it better to define them as String.Empty OR leave them as null and do a null check just before calling .length() on them?

In general, using null values is a bad idea, especially if they are returned from a method or passed to another method. Using the Null Object pattern is better, and in a string's case the Null Object is an empty string. By consistently avoiding nulls, the probability of getting a NullPointerException gets smaller and the code becomes more readable (this is discussed at length in Clean Code chapter 7, pages 110-112).
In Java String s = "" does not allocate any memory, because the JVM will have interned the string literal, so there isn't even a performance difference (and even if there were, it would be a premature optimization).

They both have their uses. Empty strings are in-band, i.e. can occur in the input, so they are no use as sentinels. Null is only useful as a sentinel. If you're going to append to it you should initialize it with "" of course. If you're going to immediately reassign it there's no need to provide any value at all, and you should probably combine the declaration with the reassignment. I see far too much of this:
String s = new String();
s = "some string";
All this tells me is that somebody doesn't understand the language.

thought both are applicable for different scenarios, assigning it null will be better(saves memory and less error prone) provided you later on set some string to that else you will be getting a null pointer exception.

An empty String is often a sign that you should either be using a StringBuilder if you're appending (which is a common case). A null reference is 'better' because it uses less memory and it's faster to check if a String is null compared to checking if it's empty. I would advice against using empty Strings instead of declaring them as null, though you should still check for null reference before you use it unless you're 100% sure that it would have been assigned a value at some stage.

Related

In Java, what is the difference between using the toString method vs explicit String casting vs plus quotes (ex. myObj+"")?

What is the difference between plus quotes (+"") and using a "toString()" method or even explicitly casting with something like (String) myObject? Trade-offs?
myObject.toString()
vs.
myObject+""
or even vs.
(String) myObject
More specifically, is there any time using the myObj+"" method can get you into trouble?
Edited for clarity
EDIT 2:
Seems String.valueOf(myObj); is the prefered method for avoiding a null pointer. That said: Is there ever a time when the following is false?
String.valueOf(myObj).equals(myObj+"")

As of Java 7, if you want to avoid a NullPointerException, you can simply use one of these:
Objects.toString( myObject )
Objects.toString( myObject, "defaultValueWhenMyObjectIsNull" )
In all versions of Java, the first of these can also be accomplished with the following, as noted by #NobuGames in the first comment below:
String.valueOf( myObject )
The mechanisms you cite each has a flaw.
myObject.toString() // throws NullPointerException if myObject is null.
myObject+"" // Hack; impairs understandability.
(String) myObject // throws ClassCastException unless myObject is a String or null
EDIT (after question edit)
is there any time using the myObj+"" method can get you into trouble?
Yes, you can confuse other programmers. The intent of the hack is not clear. This can lead to increased cost in time, and increased risk of someone "fixing" it.
However, in terms of just the compiler, you're fine. From the Java Language Specification, section 15.18: String concatentation operator +:
If only one operand expression is of type String, then string conversion (§5.1.11) is performed on the other operand to produce a string at run time.
And from that cited section 5.1.11: String conversion:
If the reference is null, it is converted to the string "null" (four ASCII characters n, u, l, l).
Otherwise, the conversion is performed as if by an invocation of the toString method of the referenced object with no arguments; but if the result of invoking the toString method is null, then the string "null" is used instead.
This second case leads to a difference that you asked about.
Is there ever a time when the following is false? String.valueOf(myObj).equals(myObj+"")
No, but there's a time when that throws a NullPointerException. When myObj is a non-null reference to an object whose toString() method returns null, then String.valueOf(myObj) will be null. Calling the equals method will throw the NullPointerException.
But I suspect you're asking whether there's ever a time the two have different values. Yes, they can have different values. Objects.toString() and String.valueOf() can return null values. The hack will always have a non-null value.
That said, returning null from toString() is somewhat bad form. The JLS acknowledges that it can happen, but the API implies that it should not. Personally, if I were concerned about this case, I would handle it in some way other than the hack.

This code:
myObject+""
Is translated by the compiler to this:
new StringBuilder().append(myObject).append("").toString()
The StringBuilder append method does a null check on the input argument, appending the text "null".
The String class has an overloaded valueOf method, so you can also do:
String.valueOf(myObject)
Which will do a null check, returning the text "null".

Casting to String is going to be highly contextual, so more than one technique may apply here. If you are expecting to directly cast to String, my advice would be to prepare for it. If there is a chance it can be null, then check for it. Or make the API promise not to hand you a null. That is, this is a separation of concerns and a layering of responsibilities.
However, IMO, any sufficiently complicated class ought to have a toString() method. It can be for debugging, or used as a property for computation, but it ought to be human readable. There are few cases where a human-readable version of the object is not warranted, in my experience.
Relying on overloading the + operator feels like a hack, yes.

Is +"" better than .toString()?

Since myObject.toString() fails when myObject is null (throws a NullPointerException), it is safer to do myObject+"", since it's essentially doing String.valueOf(myObject).concat(""), which doesn't fail, but would instead result in the String "null".
However, is this a good practice? My first thought is that it seems like it might take longer to perform, since it's implicitly calling two methods, but it does help guarantee software that doesn't crash.

You certainly can do myObject+"". But as you already know, that requires some extra method invocation. That will depend upon the application you're using it in. Will that extra method invocation be a bottle-neck for the application? I guess that is rarely an issue. Or, to avoid that extra method call, you can directly use String#valueOf() method. But that would depend upon how you want to handle nulls. I would certainly not proceed normally in these circumstances. At least log a message indicating null reference.
Also, if you're already on Java 7, then you can use Objects.toString(Object) method, that handles null for you. Again, that method returns "null" for null references.
So, now it's your call. I've given you some option. You might want to throw exception, log message and proceed with "null" string, or some default string like "".

If you want that behavior, String.valueOf(myObject) gets you the same while being less hacky. But it also means that a null string and the string "null" are treated the same. It's usually better to check for null values explicitly, unless you're just print to a log or such. But in those cases, most methods take an Object reference and handle nulls for you (e.g. System.out.println, most logging frameworks, etc.)

Is there a purpose to having a constant for null?

I recently saw something in some code that made me curious if it was actually having some kind of optimization or performance impact. It was a line like this in a constants file:
public static final Object NULL = null;
Then, throughout the code, rather than explicitly using the null keyword, it would be referred to with Constants.NULL.
I've seen this kind of thing before with something like:
public static final String EMPTY_STRING = "";
... and that seemed to make at least a little bit of sense, if it's an attempt to avoid creating lots of duplicate "" instances. But does it really make any difference with null, since it's not actually any kind of object? Is there something I'm missing?

Neither makes any sense. Using "" in multiple places in your code does not create multiple instances of a String object. Unless you explicitly call new String(""); the JVM just creates a pool of Strings and references the one instance of each particular String.

I don't think that defining and using NULL in this manner does more than add noise. Perhaps whoever wrote that code came from a C++ background, and preferred the more familiar look of NULL over null.
The second example is also questionable, since using "" many times would not result in a separate String object created for every use. To quote the JLS:
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.

There is a purpose to use the Object.NULL.
Suppose that you have a bunch of code that validates if your object is null or not. So as part of a new feature, you need to use default values in some cases of null object.
With the simple if obj != null you can't track down the places where you object is validates against nullity. But if you use obj != Object.NULL then you can use your IDE/editor to help you to find those usages.

Does not make any difference in memory occupied due to the string pool concept in java and also it makes the code look more dirty when using NULL constant instead of null or EMPTY_STRING instead of "".
Both of them are self explanatory so there should be no need to create a naming constant for them.

java.lang.String(String original) is inadequately documented

Specifically, it throws a NullPointerException if original is null. Yet the javadocs do not mention it.
I know, it says this constructor should not be necessary unless an explicit copy is needed, but I am writing copy constructors for some larger objects that contain Strings, and while it probably isn't strictly necessary to make an explicit copy, since everything else is getting an explicit copy in this case, I'm willing to pay a small price in inefficiency.
But shouldn't there be a throws in this javadoc?

From JavaTM 2 Platform Std. Ed. v1.4.2:
Unless otherwise noted, passing a null argument to a constructor or method in this class will cause a NullPointerException to be thrown.

Strings are immutable. You do not want to create a String myString = new String("blah"); since this would result in a new object in heap. It is better to do String myString = "blah"; and link to the strings pool.
String myString = "blah";
String myOtherString = myString;
myOtherString = "";
// at this point myString == "blah"

NullPointerException is an unchecked exception so although if you're very pedantic, you can add a #throws line to the javadoc, it's better not to. It's assumed that if you pass null where it doesn't make sense, you get an NPE. (Unchecked exceptions are more often than not signal a coding error, rather than an exceptional but valid scenario.)
And new String( null ) doesn't make sense, as a constructor couldn't result in an exact "copy" of null.
Copying strings is totally unnecessary too as they're immutable.

The javadoc for this constructor says:
Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable
So you really shouldn't be calling it.
Also, I think it's reasonable to throw an NPE if you pass a null in

RuntimeException is reserved for exceptions that indicate incorrect use of an API (using a null object), which can occur anywhere in a Java program, so it's not particular to the String class (or any specific class)

The issue of the documentation has been addressed by the accepted answer.
I just want to respond to this comment ... which is probably the root of your problem.
But that assumes it doesn't make sense to pass null. Suppose you are copying some Object that has many data member fields, some of which are Strings, and which may or may not be null. In that case I want String(String) to return null, specifically, I want it to copy the null value.
That is not possible. The JLS specifically states that the new operation yields a new object and not a null.
If you want to be able to "create and return an object or null" you have to embed this logic in a factory method of some kind; e.g.
String myNewString(String s) {
return s == null ? s : new String(s);
}
But it is also worth noting that in most circumstances copying a String in Java is unnecessary and wasteful. You should only do this if your application is specifically making use of the object identity of the strings; i.e. that it uses == to compare string objects and depends on strings with the same characters comparing as false. (Applications that require that are pretty rare ...)

Java : "xx".equals(variable) better than variable.equals("xx") , TRUE?

I'm reviewing a manual of best practices and recommendation coding java I think is doubtful.
Recomendation:
String variable;
"xx".equals(variable) // OK
variable.equals("xx") //Not recomended
Because prevents appearance of NullPointerException that are not controlled
Is this true?

This is a very common technique that causes the test to return false if the variable is null instead of throwing a NullPointerException. But I guess I'll be different and say that I wouldn't regard this as a recommendation that you always should follow.
I definitely think it is something that all Java programmers should be aware of as it is a common idiom.
It's also a useful technique to make code more concise (you can handle the null and not null case at the same time).
But:
It makes your code harder to read: "If blue is the sky..."
If you have just checked that your argument is not null on the previous line then it is unnecessary.
If you forgot to test for null and someone does come with a null argument that you weren't expecting it then a NullPointerException is not necessarily the worst possible outcome. Pretending everything is OK and carrying until it eventually fails later is not really a better alternative. Failing fast is good.
Personally I don't think usage of this technique should be required in all cases. I think it should be left to the programmer's judgement on a case-by-case basis. The important thing is to make sure you've handled the null case in an appropriate manner and how you do that depends on the situation. Checking correct handling of null values could be part of the testing / code review guidelines.

It is true. If variable is null in your example,
variable.equals("xx");
will throw a NPE because you can't call a method (equals) on a null object. But
"xx".equals(variable);
will just return false without error.

Actually, I think that the original recommendation is true. If you use variable.equals("xx"), then you will get a NullPointerException if variable is null. Putting the constant string on the left hand side avoids this possibility.
It's up to you whether this defense is worth the pain of what many people consider an unnatural idiom.

This is a common technique used in Java (and C#) programs. The first form avoids the null pointer exception because the .equals() method is called on the constant string "xx", which is never null. A non-null string compared to a null is false.
If you know that variable will never be null (and your program is incorrect in some other way if it is ever null), then using variable.equals("xx") is fine.

It's true that using any propertie of an object that way helps you to avoid the NPE.
But that's why we have Exceptions, to handle those kind of thing.
Maybe if you use "xx".equals(variable) you would never know if the value of variable is null or just isn't equal to "xx". IMO it's best to know that you are getting a null value in your variable, so you can reasign it, rather than just ignore it.

You are correct about the order of the check--if the variable is null, calling .equals on the string constant will prevent an NPE--but I'm not sure I consider this a good idea; Personally I call it "slop".
Slop is when you don't detect an abnormal condition but in fact create habits to personally avoid it's detection. Passing around a null as a string for an extended period of time will eventually lead to errors that may be obscure and hard to find.
Coding for slop is the opposite of "Fail fast fail hard".
Using a null as a string can occasionally make a great "Special" value, but the fact that you are trying to compare it to something indicates that your understanding of the system is incomplete (at best)--the sooner you find this fact out, the better.
On the other hand, making all variables final by default, using Generics and minimizing visibility of all objects/methods are habits that reduce slop.

If you need to check for null, I find this better readable than
if (variable != null && variable.equals("xx")). It's more a matter of personal preference.

As a side note, here is a design pattern where this code recommendation might not make any difference, since the String (i.e. Optional<String>) is never null because of the .isPresent() call from the design pattern:
Optional<String> gender = Optional.of("MALE");
if (gender.isPresent()) {
System.out.println("Value available.");
} else {
System.out.println("Value not available.");
}
gender.ifPresent(g -> System.out.println("Consumer: equals: " + g.equals("whatever")));

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.