I recently saw something in some code that made me curious if it was actually having some kind of optimization or performance impact. It was a line like this in a constants file:
public static final Object NULL = null;
Then, throughout the code, rather than explicitly using the null keyword, it would be referred to with Constants.NULL.
I've seen this kind of thing before with something like:
public static final String EMPTY_STRING = "";
... and that seemed to make at least a little bit of sense, if it's an attempt to avoid creating lots of duplicate "" instances. But does it really make any difference with null, since it's not actually any kind of object? Is there something I'm missing?
Neither makes any sense. Using "" in multiple places in your code does not create multiple instances of a String object. Unless you explicitly call new String(""); the JVM just creates a pool of Strings and references the one instance of each particular String.
I don't think that defining and using NULL in this manner does more than add noise. Perhaps whoever wrote that code came from a C++ background, and preferred the more familiar look of NULL over null.
The second example is also questionable, since using "" many times would not result in a separate String object created for every use. To quote the JLS:
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (ยง15.28) - are "interned" so as to share unique instances, using the method String.intern.
There is a purpose to use the Object.NULL.
Suppose that you have a bunch of code that validates if your object is null or not. So as part of a new feature, you need to use default values in some cases of null object.
With the simple if obj != null you can't track down the places where you object is validates against nullity. But if you use obj != Object.NULL then you can use your IDE/editor to help you to find those usages.
Does not make any difference in memory occupied due to the string pool concept in java and also it makes the code look more dirty when using NULL constant instead of null or EMPTY_STRING instead of "".
Both of them are self explanatory so there should be no need to create a naming constant for them.
Related
So I have a constructor with 5 different variables, where three of which might be null. It accepts user input and the user does not have to enter anything for three of the five attributes.
Therefore, if the user did not enter anything, the object is created using null for all missing values.
obj = new Object(String, null, null, null, String);
Now I am wondering what would be best practice to cope with this.
I can think of three different scenarios:
Deal with it only in the class using the constructor, i.e. always query whether the value is null (e.g. if(getSomeAttribute == null) { //do something }
Deal with it within the object class, i.e. always return some default value for each missing attribute instead of null
Deal with it within the object lcass, i.e. have helper-methods like isAttributeSet(), returning a boolean value indicating whether the attributes are set, that is: not null.
Although I have problems with the last two, as I think I might run into problems with default values, as sometimes it might hard to know if it is a default value; if I'd always check I could just as well check for null instead of inserting a default value first;
Same with the last one, if I have to check the helper-method, I could just as well check for null directly.
What my problem is with this situation, is that sometimes I might not be the one using the getter and setter methods; how should the one using it know there might be null attributes and which that are.
I know, I should document that within my object class, but still I am wondering if there is a "best practice" way to cope with this.
I believe it should be unusual to always check the documentary (or if there is none, the whole class) for something as simple as this.
Maybe I should not even start with null values within my constructor in the first place? But I think I would run into the same kinds of problems, anyway, so that would not really solve my problem
Read Bloch, Effective Java, 2nd ed. Item 2: "Consider a builder when faced with many constructor parameters."
Excellent advice in an excellent book.
Using a builder pattern would help with the constructor, however the main problem is to do with the design of the class - you want to be sure when you call a get/set method that it isn't one of the null/optional members.
You could use polymorphism to have two objects each with an interface that only exposes the getters and setters supported by the concrete implementation. This makes it impossible to call the wrong getters/setters and it is obvious what the intention is.
Specifically, it throws a NullPointerException if original is null. Yet the javadocs do not mention it.
I know, it says this constructor should not be necessary unless an explicit copy is needed, but I am writing copy constructors for some larger objects that contain Strings, and while it probably isn't strictly necessary to make an explicit copy, since everything else is getting an explicit copy in this case, I'm willing to pay a small price in inefficiency.
But shouldn't there be a throws in this javadoc?
From JavaTM 2 Platform Std. Ed. v1.4.2:
Unless otherwise noted, passing a null argument to a constructor or method in this class will cause a NullPointerException to be thrown.
Strings are immutable. You do not want to create a String myString = new String("blah"); since this would result in a new object in heap. It is better to do String myString = "blah"; and link to the strings pool.
String myString = "blah";
String myOtherString = myString;
myOtherString = "";
// at this point myString == "blah"
NullPointerException is an unchecked exception so although if you're very pedantic, you can add a #throws line to the javadoc, it's better not to. It's assumed that if you pass null where it doesn't make sense, you get an NPE. (Unchecked exceptions are more often than not signal a coding error, rather than an exceptional but valid scenario.)
And new String( null ) doesn't make sense, as a constructor couldn't result in an exact "copy" of null.
Copying strings is totally unnecessary too as they're immutable.
The javadoc for this constructor says:
Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable
So you really shouldn't be calling it.
Also, I think it's reasonable to throw an NPE if you pass a null in
RuntimeException is reserved for exceptions that indicate incorrect use of an API (using a null object), which can occur anywhere in a Java program, so it's not particular to the String class (or any specific class)
The issue of the documentation has been addressed by the accepted answer.
I just want to respond to this comment ... which is probably the root of your problem.
But that assumes it doesn't make sense to pass null. Suppose you are copying some Object that has many data member fields, some of which are Strings, and which may or may not be null. In that case I want String(String) to return null, specifically, I want it to copy the null value.
That is not possible. The JLS specifically states that the new operation yields a new object and not a null.
If you want to be able to "create and return an object or null" you have to embed this logic in a factory method of some kind; e.g.
String myNewString(String s) {
return s == null ? s : new String(s);
}
But it is also worth noting that in most circumstances copying a String in Java is unnecessary and wasteful. You should only do this if your application is specifically making use of the object identity of the strings; i.e. that it uses == to compare string objects and depends on strings with the same characters comparing as false. (Applications that require that are pretty rare ...)
I come from a C#.net background and whenever I had a string I was declaring it as String.Empty
Now I am looking at a Java code from a co-worker and he has declared his strings in a method like this:
String myStr = null;
I don't like it, and even worse, he is assigning values to these strings in an "IF" block where it may or may not even qualify, and then at the end of the method he is calling a myStr.length() on them.
So my question is what is a preferred way in Java? do you think is it better to define them as String.Empty OR leave them as null and do a null check just before calling .length() on them?
In general, using null values is a bad idea, especially if they are returned from a method or passed to another method. Using the Null Object pattern is better, and in a string's case the Null Object is an empty string. By consistently avoiding nulls, the probability of getting a NullPointerException gets smaller and the code becomes more readable (this is discussed at length in Clean Code chapter 7, pages 110-112).
In Java String s = "" does not allocate any memory, because the JVM will have interned the string literal, so there isn't even a performance difference (and even if there were, it would be a premature optimization).
They both have their uses. Empty strings are in-band, i.e. can occur in the input, so they are no use as sentinels. Null is only useful as a sentinel. If you're going to append to it you should initialize it with "" of course. If you're going to immediately reassign it there's no need to provide any value at all, and you should probably combine the declaration with the reassignment. I see far too much of this:
String s = new String();
s = "some string";
All this tells me is that somebody doesn't understand the language.
thought both are applicable for different scenarios, assigning it null will be better(saves memory and less error prone) provided you later on set some string to that else you will be getting a null pointer exception.
An empty String is often a sign that you should either be using a StringBuilder if you're appending (which is a common case). A null reference is 'better' because it uses less memory and it's faster to check if a String is null compared to checking if it's empty. I would advice against using empty Strings instead of declaring them as null, though you should still check for null reference before you use it unless you're 100% sure that it would have been assigned a value at some stage.
I've come across a class that includes multiple uses of a string literal, "foo".
What I'd like to know, is what are the benefits and impact (in terms of object creation, memory usage and speed) of using this approach instead of declaring the String as final and replacing all the literals with the final variable?
For example (although obviously not a real word usage):
private static final String FINAL_STRING = "foo";
public void stringPrinter(){
for(int i=0;i<10;i++){
System.out.println(FINAL_STRING);
}
}
Versus:
public void stringPrinter(){
for(int i=0;i<10;i++){
System.out.println("foo");
}
}
Which is preferable and why (assuming the string value will remain constant)?
Would the above (second) example result in 10 String objects being created or would the JVM realise that only a single literal is actually used, and create a single reference. If so, is there any advantage for declaring the String as final (as in the first example)?
If the interpreted code does replace the string literal with a single reference, does that still apply if the same literal occurs in more than one place:
public void stringPrinter(){
for(int i=0;i<5;i++){
System.out.println("foo"); // first occurence
System.out.println("foo"); // second occurence
}
}
They will be exactly the same. The literal is interned (any compile time constant expression that results in that string shares the same instance as all other constants/literals) in both cases and a smart compiler+runtime should have no trouble reducing both to the most optimized example.
The advantage comes more in maintainability. If you want to change the literal, you would need only change one occurrence with a constant but you would need to search and change every instance if they were included inline.
From the JLS
Compile-time constants of type String are always "interned" so as to share unique instances, using the method String.intern.
So, no, there's gonna be only one string object.
As Mark notes, this is strictly the question of maintainability and not performance.
The advantage is not in performance, but in maintainability and reliability.
Let me take a real example I came across just recently. A programmer created a function that took a String parameter that identified the type of a transaction. Then in the program he did string compares against this type. Like:
if (type.equals("stock"))
{ ... do whatever ... }
Then he called this function, passing it the value "Stock".
Do you notice the difference in capitalization? Neither did the original programmer. It proved to be a fairly subtle bug to figure out, because even looking at both listings, the difference in capitalization didn't strike me.
If instead he had declared a final static, say
final static String stock="stock";
Then the first time he tried to pass in "Stock" instead of "stock", he would have gotten a compile-time error.
Better still in this example would have been to make an enum, but let's assume he actually had to write the string to an output file or something so it had to be a string.
Using final statics gives at least x advantages:
(1) If you mis-spell it, you get a compile-time error, rather than a possibly-subtle run-time error.
(2) A static can assign a meaingful name to a value. Which is more comprehensible:
if (employeeType.equals("R")) ...
or
if (employeeType.equals(EmployeeType.RETIRED)) ...
(3) When there are multiple related values, you can put a group of final statics together at the top of the program, thus informing future readers what all the possible values are. I've had plenty of times when I've seen a function compare a value against two or three literals. And that leaves me wondering: Are there other possible values, or is this it? (Better still is often to have an enum, but that's another story.)
All String literals are kept in a String cache (this is across all classes)
Using a constant can make the code clearer, give the the string some context and make the code easier to maintain esp if the same string appears in multiple places.
Those string literals are internalized, so no new String objects are created in the loop. Using the same literal twice could still be a sign for code smell, though; but not in terms of speed or memory usage.
In the cases you are providing, I believe the biggest reason for having it declared as FINAL_STRING somewhere is to ensure it stays in one centralized location. There will only ever be one instance of that string constant, but the first example is far easier to maintain.
Do you use StringUtils.EMPTY instead of ""?
I mean either as a return value or if you set a the value of a String variable. I don't mean for comparison, because there we use StringUtils.isEmpty()
Of course not.
Do you really think "" is not clear enough ?
Constants have essentially 3 use cases:
Document the meaning of a value (with constant name + javadoc)
Synchronize clients on a common value.
Provide a shortcut to a special value to avoid some init costs
None apply here.
I use StringUtils.EMPTY, for hiding the literal and also to express that return StringUtils.EMPTY was fully expected and there should return an empty string, "" can lead to the assumption that "" can be easily changed into something else and that this was maybe only a mistake. I think the EMPTY is more expressive.
No, just use "".
The literal "" is clear as crystal. There is no misunderstanding as to what was meant. I wouldn't know why you would need a class constant for that. I can only assume that this constant is used throughout the package containing StringUtils instead of "". That doesn't mean you should use it, though.
If there's a rock on the sidewalk, you don't have to throw it.
I'm amazed at how many people are happy to blindly assume that "" is indeed an empty string, and doesn't (accidentally?) contain any of Unicode's wonderful invisible and non-spacing characters. For the love of all that is good and decent, use EMPTY whenever you can.
I will add my two cents here because I don't see anybody talking about String interning and Class initialization:
All String literals in Java sources are interned, making any "" and StringUtils.EMPTY the same object
Using StringUtils.EMPTY can initialize StringUtils class, as it accesses its static member EMPTY only if it is not declared final (the JLS is specific on that point). However, org.apache.commons.lang3.StringUtils.EMPTY is final, so it won't initialize the class.
See a related answer on String interning and on Class initialization, referring to the JLS 12.4.1.
I don't really like to use it, as return ""; is shorter than return StringUtils.EMPTY.
However, one false advantage of using it is that if you type return " "; instead of return "";, you may encounter different behavior (regarding if you test correctly an empty String or not).
If your class doesn't use anything else from commons then it'd be a pity to have this dependency just for this magic value.
The designer of the StringUtils makes heavy use of this constant, and it's the right thing to do, but that doesn't mean that you should use it as well.
I find StringUtils.EMPTY useful in some cases for legibility. Particularly with:
Ternary operator eg.
item.getId() != null ? item.getId() : StringUtils.EMPTY;
Returning empty String from a method, to confirm that yes I really wanted to do that.
Also by using a constant, a reference to StringUtils.EMPTY is created. Otherwise if you try to instantiate the String literal "" each time the JVM will have to check if it exists in the String pool already (which it likely will, so no extra instance creation overhead). Surely using StringUtils.EMPTY avoids the need to check the String pool?
No, because I have more to write. And an empty String is plattform independent empty (in Java).
File.separator is better than "/" or "\".
But do as you like. You can't get an typo like return " ";
Honestly, I don't see much use of either. If you want to compare egainst an empty string, just use StringUtils.isNotEmpty(..)
I am recommending to use this constant as one of the building stones of a robust code, to lower the risk of accidently have nonvisible characters sneak in when assigning an empty string to a variable.
If you have people from all around the world in your team and maybe some of them not so experienced, then it might be a good idea to insist on using this constant in the code.
There are lots of different languages around and people are using their own local alphabet settings on their computers. Sometimes they just forget to switch back when coding and after they switch and delete with backspace, then text editor can leave some junk inside of "". Using StringUtils.EMPTY just eliminate that risk.
However, this does not have any significant impact on the performance of the code, nor on the code readability. Also it does not resolve some fundamental problem you might experience, so it is totally up to your good judgement weather you will use this constant or not.
Yes, it makes sense.
It might not be the only way to go but I can see very little in the way of saying this "doesn't make sense".
In my opinion:
It stands out more than "".
It explains that you meant empty, and that blank will likely not do.
It will still require changing everywhere if you don't define your own variable and use it in multiple places.
If you don't allow free string literals in code then this helps.