Is StringUtils.EMPTY recommended?

Do you use StringUtils.EMPTY instead of ""?
I mean either as a return value or when you set the value of a String variable. I don't mean for comparison, because there we use StringUtils.isEmpty().

Of course not.
Do you really think "" is not clear enough ?
Constants have essentially 3 use cases:
Document the meaning of a value (with constant name + javadoc)
Synchronize clients on a common value.
Provide a shortcut to a special value to avoid some init costs
None apply here.

I use StringUtils.EMPTY, both to hide the literal and to express that returning an empty string was fully intended. A bare "" can lead to the assumption that it can easily be changed into something else, or that it was only a mistake. I think EMPTY is more expressive.

No, just use "".
The literal "" is clear as crystal. There is no misunderstanding as to what was meant. I wouldn't know why you would need a class constant for that. I can only assume that this constant is used throughout the package containing StringUtils instead of "". That doesn't mean you should use it, though.
If there's a rock on the sidewalk, you don't have to throw it.

I'm amazed at how many people are happy to blindly assume that "" is indeed an empty string, and doesn't (accidentally?) contain any of Unicode's wonderful invisible and non-spacing characters. For the love of all that is good and decent, use EMPTY whenever you can.

I will add my two cents here because I don't see anybody talking about String interning and Class initialization:
All String literals in Java sources are interned, making any "" and StringUtils.EMPTY the same object
Using StringUtils.EMPTY would initialize the StringUtils class only if EMPTY were not a constant variable, that is, not declared final with a constant initializer (the JLS is specific on that point). However, org.apache.commons.lang3.StringUtils.EMPTY is final and initialized with the literal "", so accessing it won't initialize the class.
See a related answer on String interning and on Class initialization, referring to the JLS 12.4.1.
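A minimal sketch of the initialization point, using two hypothetical holder classes (the names are made up for illustration and are not taken from Commons Lang). It only shows that reading a final constant does not run the static initializer, while reading a non-final static field does:

```java
import java.util.ArrayList;
import java.util.List;

class InitDemo {
    // Records which holder classes have run their static initializer.
    static final List<String> initialized = new ArrayList<>();

    // Hypothetical stand-in for a class that exposes a constant.
    static class ConstantHolder {
        // final + constant initializer: a compile-time constant, inlined by javac.
        static final String EMPTY = "";
        static { initialized.add("ConstantHolder"); }
    }

    // Same shape, but the field is not final, so reading it is a real field access.
    static class NonConstantHolder {
        static String EMPTY = "";
        static { initialized.add("NonConstantHolder"); }
    }

    // Reads both fields and returns the names of the classes that got initialized.
    static List<String> touchBoth() {
        String a = ConstantHolder.EMPTY;     // does not initialize ConstantHolder
        String b = NonConstantHolder.EMPTY;  // initializes NonConstantHolder
        return initialized;
    }

    public static void main(String[] args) {
        System.out.println(touchBoth()); // [NonConstantHolder]
    }
}
```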

I don't really like to use it, as return ""; is shorter than return StringUtils.EMPTY.
However, one false advantage of using it is that if you type return " "; instead of return "";, you may get different behavior (depending on whether you test for an empty String correctly).

If your class doesn't use anything else from commons then it'd be a pity to have this dependency just for this magic value.
The designer of the StringUtils makes heavy use of this constant, and it's the right thing to do, but that doesn't mean that you should use it as well.

I find StringUtils.EMPTY useful in some cases for legibility. Particularly with:
Ternary operator eg.
item.getId() != null ? item.getId() : StringUtils.EMPTY;
Returning empty String from a method, to confirm that yes I really wanted to do that.
Also by using a constant, a reference to StringUtils.EMPTY is created. Otherwise if you try to instantiate the String literal "" each time the JVM will have to check if it exists in the String pool already (which it likely will, so no extra instance creation overhead). Surely using StringUtils.EMPTY avoids the need to check the String pool?

No, because I have more to write. And an empty String is empty on every platform (in Java).
File.separator is better than "/" or "\".
But do as you like. With the constant, you can't make a typo like return " ";

Honestly, I don't see much use for either. If you want to compare against an empty string, just use StringUtils.isNotEmpty(..)

I recommend using this constant as one of the building blocks of robust code, to lower the risk of accidentally letting invisible characters sneak in when assigning an empty string to a variable.
If you have people from all around the world on your team, and maybe some of them are not so experienced, then it might be a good idea to insist on using this constant in the code.
There are lots of different languages around, and people use their own local alphabet settings on their computers. Sometimes they just forget to switch back when coding, and after they switch and delete with backspace, the text editor can leave some junk inside of "". Using StringUtils.EMPTY just eliminates that risk.
However, this does not have any significant impact on the performance of the code, nor on its readability. Also, it does not resolve any fundamental problem you might experience, so it is totally up to your good judgement whether you use this constant or not.
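To make the invisible-character concern concrete, here is a small sketch. The class name is hypothetical, and the zero-width space (U+200B) is just one example of such a character, written as an escape so it is visible in the source:

```java
class InvisibleCharDemo {
    // A string that renders as nothing but actually holds one zero-width space.
    static final String LOOKS_EMPTY = "\u200B";

    public static void main(String[] args) {
        String empty = "";

        System.out.println(empty.isEmpty());              // true
        System.out.println(LOOKS_EMPTY.isEmpty());        // false
        System.out.println(LOOKS_EMPTY.length());         // 1
        // trim() only strips characters up to U+0020, so it does not help here.
        System.out.println(LOOKS_EMPTY.trim().isEmpty()); // false
    }
}
```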

Yes, it makes sense.
It might not be the only way to go but I can see very little in the way of saying this "doesn't make sense".
In my opinion:
It stands out more than "".
It explains that you meant empty, and that blank will likely not do.
It will still require changing everywhere if you don't define your own variable and use it in multiple places.
If you don't allow free string literals in code then this helps.

Related

blank ("") as constant v/s direct use of blank ("")

I am using a blank value ("") in many places in my Java code.
I want to know: is defining blank ("") as a constant and then using that constant the same as using blank ("") directly, or does it make any difference?
Thanks.

The String literal "" will be added to the String constants pool. So use it directly like this --> "" as it will be more readable. Don't define a static constant called BLANK_VALUE="" and then use it. In terms of performance, the same instance of the String literal will be re-used, so it doesn't matter (You will have a very small overhead for declaring a static field* but that's ok)
See which code makes more sense :
if ("".equals(myString)) { // clear and easy to understand.
    // do something
}
OR
if (MyClass.BLANK_VALUE.equals(myString)) { // you will have to go back to BLANK_VALUE to check its actual value.
    // do something
}
PS : Using constants on LHS will prevent NPEs

Why don't you try the isEmpty() method?
if(myString !=null && myString.isEmpty() ){}

What is "string bashing" and why is it bad?

My boss keeps using the term "string bashing" (we're a Java shop) and usually makes an example out of me whenever I ask him anything (as if, I'm supposed to know it already). I Googled the term only to find results pertaining to theoretical physics and string theory.
I am guessing it has something to do with using String/StringBuilders incorrectly or not in keeping with best practices, but for the life of me, I can't figure out what it is.

"String bashing" is a slang term for cutting up strings and manipulating them: splitting, joining, inserting, tokenizing, parsing, etc.
It's not inherently bad (despite the connotation of "bashing"), but as you point out, in Java, one needs to be careful not to use String when StringBuilder would be more efficient.

Why don't you ask your boss for an example of string bashing?
Don't forget to ask him for the correct way of refactoring the examples he gives you.

Out of context, "string bashing" doesn't really have any meaning in itself. It's not a buzz word for any good or bad behaviour. It would just mean "bashing strings", as in using string operations.
Whether that is good or bad depends on what you are doing, and the role of the strings would not really be important. There are good and bad ways of handling any kind of data.
Sometimes "bashing strings" is actually the best solution. Consider for example that you want to pick out the first three characters of a string. You could create a regular expression that isolates the characters, but that would certainly be overkill as there is a simple string operation that can do the same, which is a lot faster and easier to maintain.
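As a rough illustration of that comparison (class and method names are hypothetical), both versions below pick out the first three characters, but the plain string operation is much easier to read:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstThreeDemo {
    // Plain string operation; Math.min guards against inputs shorter than three chars.
    static String viaSubstring(String s) {
        return s.substring(0, Math.min(3, s.length()));
    }

    // The regex equivalent: workable, but overkill for this job.
    static String viaRegex(String s) {
        Matcher m = Pattern.compile("^.{0,3}", Pattern.DOTALL).matcher(s);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(viaSubstring("bashing")); // bas
        System.out.println(viaRegex("bashing"));     // bas
    }
}
```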

Effective Java has an item about using strings: "Item 50: Avoid strings where other types are more appropriate". Also on stackoverflow: "Stringly typed".

A guess: It might imply something related to creation of unnecessary temporary objects, and in this particular case Strings. For example, if you're constructing a String token by token then it's usually a good idea to use a StringBuilder. If the String is not built using a builder, each concatenation will cause another temporary object to be created (and later garbage collected).
In modern VMs (I'm thinking HotSpot 1.5 or 1.6) this is rarely a problem unless you're in performance critical code or you're building long strings, e.g. in for loops.
Only a guess; might be better to ask what he or she means? I've never heard the term before.
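If that guess is right, the difference would look roughly like this (a sketch with hypothetical names; both methods return the same result, they only differ in how many intermediate objects they create):

```java
class ConcatDemo {
    // Each += produces a new String, so a long loop leaves many temporaries behind.
    static String joinWithConcat(String[] tokens) {
        String result = "";
        for (String token : tokens) {
            result += token;
        }
        return result;
    }

    // One builder is reused for every token; only toString() creates the final String.
    static String joinWithBuilder(String[] tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            sb.append(token);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"str", "ing", " ", "bash", "ing"};
        System.out.println(joinWithConcat(tokens));  // string bashing
        System.out.println(joinWithBuilder(tokens)); // string bashing
    }
}
```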

There are a few results on Google which refer to string bashing in this context. They don't appear to refer to the concern about inefficient temporaries and using StringBuilder.
Instead, it appears to refer to simplistic string parsing. I.e. doing stuff like checking for substrings, slicing the string, etc. In particular, it appears to have the implication of it being a hacky solution to the problem.
It might be seen badly because you should either use real parsing or obtain the data in a non-string format.

Why is there no String.Empty in Java? [closed]

I understand that every time I type the string literal "", the same String object is referenced in the string pool.
But why doesn't the String API include a public static final String Empty = "";, so I could use references to String.Empty?
It would save on compile time, at the very least, since the compiler would know to reference the existing String, and not have to check if it had already been created for reuse, right? And personally I think a proliferation of string literals, especially tiny ones, in many cases is a "code smell".
So was there a Grand Design Reason behind no String.Empty, or did the language creators simply not share my views?

String.EMPTY is 12 characters, and "" is two, and they would both be referencing exactly the same instance in memory at runtime. I'm not entirely sure why String.EMPTY would save on compile time; if anything, I think it would be the opposite.
Especially considering Strings are immutable, it's not like you can first get an empty String, and perform some operations on it - best to use a StringBuilder (or StringBuffer if you want to be thread-safe) and turn that into a String.
Update
From your comment to the question:
What inspired this is actually
TextBox.setText("");
I believe it would be totally legitimate to provide a constant in your appropriate class:
private static final String EMPTY_STRING = "";
And then reference it in your code as
TextBox.setText(EMPTY_STRING);
As this way at least you are explicit that you want an empty String, rather than you forgot to fill in the String in your IDE or something similar.

Use org.apache.commons.lang.StringUtils.EMPTY

If you want to compare with empty string without worrying about null values you can do the following.
if ("".equals(text))
Ultimately you should do what you believe is clearest. Most programmers assume "" means empty string, not a string someone forgot to put anything into.
If you think there is a performance advantage, you should test it. If you don't think it's worth testing for yourself, that's a good indication it really isn't worth it.
It sounds like you are trying to solve a problem which was solved when the language was designed more than 15 years ago.

Don't just say "memory pool of strings is reused in the literal form, case closed". What compilers do under the hood is not the point here. The question is reasonable, especially given the number of up-votes it received.
It's about symmetry; without it, APIs are harder for humans to use. Early Java SDKs notoriously ignored the rule and now it's kind of too late. Here are a few examples off the top of my head; feel free to chip in your "favorite" example:
BigDecimal.ZERO, but no AbstractCollection.EMPTY, String.EMPTY
Array.length but List.size()
List.add(), Set.add() but Map.put(), ByteBuffer.put() and let's not forget StringBuilder.append(), Stack.push()

Apache StringUtils addresses this problem too.
Failings of the other options:
isEmpty() - not null safe. If the string is null, throws an NPE
length() == 0 - again not null safe. Also does not take into account whitespace strings.
Comparison to EMPTY constant - May not be null safe. Whitespace problem
Granted StringUtils is another library to drag around, but it works very well and saves loads of time and hassle checking for nulls or gracefully handling NPEs.
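For readers who don't want the dependency, the null and whitespace points above can be sketched with two hand-written helpers. These are hypothetical and are not the Commons Lang implementation; they only illustrate the kind of checks being described:

```java
class BlankCheckDemo {
    // Null-safe emptiness check: null and "" both count as empty.
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // Null-safe check that also treats whitespace-only strings as blank.
    static boolean isNullOrBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        String missing = null;

        System.out.println(isNullOrEmpty(missing)); // true
        System.out.println(isNullOrEmpty("  "));    // false: whitespace is not empty
        System.out.println(isNullOrBlank("  "));    // true

        try {
            missing.isEmpty(); // calling a method on null
        } catch (NullPointerException e) {
            System.out.println("isEmpty() on null throws an NPE");
        }
    }
}
```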

Seems like this is the obvious answer:
String empty = org.apache.commons.lang.StringUtils.EMPTY;
Awesome because "empty initialization" code no longer has a "magic string" and uses a constant.

If you really want a String.EMPTY constant, you can create a static final utility class named "Constants" (for example) in your project. This class will maintain your constants, including the empty String...
In the same idea, you can create ZERO, ONE int constants... that don't exist in the Integer class, but like I commented, it would be a pain to write and to read :
for(int i=Constants.ZERO; ...) {
    if(myArray.length > Constants.ONE) {
        System.out.println("More than one element");
    }
}
Etc.

All those "" literals are the same object. Why make all that extra complexity? It's just longer to type and less clear (the cost to the compiler is minimal). Since Java's strings are immutable objects, there's never any need at all to distinguish between them except possibly as an efficiency thing, but with the empty string literal that's not a big deal.
If you really want an EmptyString constant, make it yourself. But all it will do is encourage even more verbose code; there will never be any benefit to doing so.

To add on to what Noel M stated, you can look at this question, and this answer shows that the constant is reused.
http://forums.java.net/jive/message.jspa?messageID=17122
String constants are always "interned", so there is not really a need for such a constant.
String s=""; String t=""; boolean b=s==t; // true
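A slightly fuller sketch of the same point (the class name is hypothetical). It contrasts empty literals, which share one interned object, with an empty String created explicitly through a constructor:

```java
class InternDemo {
    // True when both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String s = "";
        String t = "";
        String built = new String(""); // explicitly asks for a distinct object

        System.out.println(sameObject(s, t));              // true: same interned literal
        System.out.println(sameObject(s, built));          // false: separate object
        System.out.println(sameObject(s, built.intern())); // true: intern() returns the pooled one
        System.out.println(s.equals(built));               // true: same contents
    }
}
```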

I understand that every time I type the String literal "", the same String object is referenced in the String pool.
There's no such guarantee made. And you can't rely on it in your application, it's completely up to jvm to decide.
or did the language creators simply not share my views?
Yep. To me, it seems very low priority thing.

Late answer, but I think it adds something new to this topic.
None of the previous answers has answered the original question. Some have attempted to justify the lack of a constant, while others have showed ways in which we can deal with the lack of the constant. But no one has provided a compelling justification for the benefit of the constant, so its lack is still not properly explained.
A constant would be useful because it would prevent certain code errors from going unnoticed.
Say that you have a large code base with hundreds of references to "". Someone modifies one of these while scrolling through the code and changes it to " ". Such a change would have a high chance of going unnoticed into production, at which point it might cause some issue whose source will be tricky to detect.
OTOH, a library constant named EMPTY, if subject to the same error, would generate a compiler error for something like EM PTY.
Defining your own constant is still better. Someone could still alter its initialization by mistake, but because of its wide use, such an error would be much less likely to go unnoticed than an error in a single use case.
This is one of the general benefits that you get from using constants instead of literal values. People usually recognize that using a constant for a value used in dozens of places allows you to easily update that value in just one place. What is less often acknowledged is that this also prevents that value from being accidentally modified, because such a change would show everywhere. So, yes, "" is shorter than EMPTY, but EMPTY is safer to use than "".
So, coming back to the original question, we can only speculate that the language designers were probably not aware of this benefit of providing constants for literal values that are frequently used. Hopefully, we'll one day see string constants added in Java.

For those claiming "" and String.Empty are interchangeable or that "" is better, you are very wrong.
Each time you do something like myVariable = ""; you are creating an instance of an object.
If Java's String object had an EMPTY public constant, there would only be 1 instance of the object ""
E.g: -
String.EMPTY = ""; //Simply demonstrating. I realize this is invalid syntax
myVar0 = String.EMPTY;
myVar1 = String.EMPTY;
myVar2 = String.EMPTY;
myVar3 = String.EMPTY;
myVar4 = String.EMPTY;
myVar5 = String.EMPTY;
myVar6 = String.EMPTY;
myVar7 = String.EMPTY;
myVar8 = String.EMPTY;
myVar9 = String.EMPTY;
10 (11 including String.EMPTY) Pointers to 1 object
Or: -
myVar0 = "";
myVar1 = "";
myVar2 = "";
myVar3 = "";
myVar4 = "";
myVar5 = "";
myVar6 = "";
myVar7 = "";
myVar8 = "";
myVar9 = "";
10 pointers to 10 objects
This is inefficient and throughout a large application, can be significant.
Perhaps the Java compiler or run-time is efficient enough to automatically point all instances of "" to the same instance, but it might not and takes additional processing to make that determination.

Java : "xx".equals(variable) better than variable.equals("xx") , TRUE?

I'm reviewing a manual of Java coding best practices, and I think one recommendation is doubtful.
Recommendation:
String variable;
"xx".equals(variable) // OK
variable.equals("xx") // Not recommended
The reason given is that it prevents uncontrolled NullPointerExceptions.
Is this true?

This is a very common technique that causes the test to return false if the variable is null instead of throwing a NullPointerException. But I guess I'll be different and say that I wouldn't regard this as a recommendation that you always should follow.
I definitely think it is something that all Java programmers should be aware of as it is a common idiom.
It's also a useful technique to make code more concise (you can handle the null and not null case at the same time).
But:
It makes your code harder to read: "If blue is the sky..."
If you have just checked that your argument is not null on the previous line then it is unnecessary.
If you forgot to test for null and someone does pass in a null argument that you weren't expecting, then a NullPointerException is not necessarily the worst possible outcome. Pretending everything is OK and carrying on until it eventually fails later is not really a better alternative. Failing fast is good.
Personally I don't think usage of this technique should be required in all cases. I think it should be left to the programmer's judgement on a case-by-case basis. The important thing is to make sure you've handled the null case in an appropriate manner and how you do that depends on the situation. Checking correct handling of null values could be part of the testing / code review guidelines.

It is true. If variable is null in your example,
variable.equals("xx");
will throw a NPE because you can't call a method (equals) on a null object. But
"xx".equals(variable);
will just return false without error.
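A small sketch of both orderings (class and method names are hypothetical). Objects.equals from java.util is included as a third, order-independent option:

```java
import java.util.Objects;

class NullSafeEqualsDemo {
    // The literal is never null, so this is safe for any argument.
    static boolean literalFirst(String variable) {
        return "xx".equals(variable);
    }

    // Throws a NullPointerException when variable is null.
    static boolean variableFirst(String variable) {
        return variable.equals("xx");
    }

    public static void main(String[] args) {
        System.out.println(literalFirst(null));         // false
        System.out.println(literalFirst("xx"));         // true
        System.out.println(Objects.equals(null, "xx")); // false, also null-safe

        try {
            variableFirst(null);
        } catch (NullPointerException e) {
            System.out.println("variable.equals(\"xx\") threw an NPE");
        }
    }
}
```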

Actually, I think that the original recommendation is true. If you use variable.equals("xx"), then you will get a NullPointerException if variable is null. Putting the constant string on the left hand side avoids this possibility.
It's up to you whether this defense is worth the pain of what many people consider an unnatural idiom.

This is a common technique used in Java (and C#) programs. The first form avoids the null pointer exception because the .equals() method is called on the constant string "xx", which is never null. A non-null string compared to a null is false.
If you know that variable will never be null (and your program is incorrect in some other way if it is ever null), then using variable.equals("xx") is fine.

It's true that using any property of an object that way helps you avoid the NPE.
But that's why we have Exceptions, to handle that kind of thing.
If you use "xx".equals(variable), you will never know whether the value of variable is null or just isn't equal to "xx". IMO it's best to know that you are getting a null value in your variable, so you can reassign it, rather than just ignore it.

You are correct about the order of the check--if the variable is null, calling .equals on the string constant will prevent an NPE--but I'm not sure I consider this a good idea; Personally I call it "slop".
Slop is when you don't detect an abnormal condition but in fact create habits to personally avoid its detection. Passing around a null as a string for an extended period of time will eventually lead to errors that may be obscure and hard to find.
Coding for slop is the opposite of "Fail fast fail hard".
Using a null as a string can occasionally make a great "Special" value, but the fact that you are trying to compare it to something indicates that your understanding of the system is incomplete (at best)--the sooner you find this fact out, the better.
On the other hand, making all variables final by default, using Generics and minimizing visibility of all objects/methods are habits that reduce slop.

If you need to check for null, I find this more readable than
if (variable != null && variable.equals("xx")). It's more a matter of personal preference.

As a side note, here is a design pattern where this code recommendation might not make any difference, since the String (i.e. Optional<String>) is never null because of the .isPresent() call from the design pattern:
Optional<String> gender = Optional.of("MALE");
if (gender.isPresent()) {
    System.out.println("Value available.");
} else {
    System.out.println("Value not available.");
}
gender.ifPresent(g -> System.out.println("Consumer: equals: " + g.equals("whatever")));

Why doesn't Java warn about a == "something"?

This might sound stupid, but why doesn't the Java compiler warn about the expression in the following if statement:
String a = "something";
if (a == "something") {
    System.out.println("a is equal to something");
} else {
    System.out.println("a is not equal to something");
}
I realize why the expression is untrue, but AFAIK, a can never be equal to the String literal "something". The compiler should realize this and at least warn me that I'm an idiot who is coding way too late at night.
Clarification
This question is not about comparing two String object variables, it is about comparing a String object variable to a String literal. I realize that the following code is useful and would produce different results than .equals():
String a = iReturnAString();
String b = iReturnADifferentString();
if (a == b) {
    System.out.println("a is equal to b");
} else {
    System.out.println("a is not equal to b");
}
In this case a and b might actually point to the same area in memory, even if it's not because of interning. In the first example though, the only reason it would be true is if Java is doing something behind the scenes which is not useful to me, and which I can't use to my advantage.
Follow up question
Even if a and the string-literal both point to the same area in memory, how is that useful for me in an expression like the one above. If that expression returns true, there isn't really anything useful I could do with that knowledge, is there? If I was comparing two variables, then yes, that info would be useful, but with a variable and a literal it's kinda pointless.

Actually they can indeed be the same reference if Java chooses to intern the string. String interning is the notion of having only one value for a distinct string at runtime.
http://en.wikipedia.org/wiki/String_intern_pool
Java notes about string interning
http://javatechniques.com/blog/string-equality-and-interning/

Compiler warnings tend to be about things that are either blatantly wrong (conditionals that can never be true or false) or unsafe (unchecked casts). The use of == is valid, and in some rare cases intentional.
I believe all of Checkstyle, FindBugs and PMD will warn about this, and optionally a lot of other bad practices we tend to have when half asleep or otherwise incapacitated ;).

Because:
you might actually want to use ==, if working with constants and interned strings
the compiler should make an exception only for String, and no other type. What I mean is - whenever the compiler encounters == it should check if the operands are Strings in order to issue a warning. What if the arguments are Strings, but are referred to as Object or CharSequence ?
The rationale given by checkstyle for issuing an error is that novice programmers often do this. And if you are a novice, it'd be hard to configure checkstyle (or pmd), or even to know about them.
Another thing is the actual scenario when strings are compared and there is a literal as one of the operands. First, it would be better to use a constant (static final) instead of a literal. And where would the other operand come from? It is likely that it will come from the same constant / literal, somewhere else in the code. So == would work.

Depending on the context, both identity comparisons and value comparisons can be legitimate.
I can think of very few queries where there is a deterministic automated algorithm to figure out unambiguously that one of them is an error.
Therefore, there's no attempt to do this automatically.
If you think about things like caching, then there are situations where you would want to do this test.

Actually, it may sometimes be true, depending on if Java takes an existing String from its internal String cache, creating the first declaration and then storing it, or taking it for both string declarations.

The compiler doesn't care that you're trying to do identity comparison against a literal. It could also be argued that it's not the compiler's job to be a code nanny. Look for a lint-like tool if you want to catch situations like this.

"I realize why the expression is untrue, but AFAIK, a can never be equal to the String literal "something"."
To clarify, in the example given, the expression is always TRUE: a can be both == and equals() to the String literal, and in the example given it always is.
It is ironic that you appear to have given the rare counterexample to your own argument.
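A short sketch of why the question's example prints the "equal" branch, and of a case where == fails (the class name is hypothetical, and the StringBuilder stands in for any string produced at run time):

```java
class LiteralIdentityDemo {
    // Identity comparison against the literal, as in the question.
    static boolean sameAsLiteral(String a) {
        return a == "something";
    }

    public static void main(String[] args) {
        String fromLiteral = "something";
        // Built at run time, so it is a new object that is not in the literal pool.
        String fromBuilder = new StringBuilder("some").append("thing").toString();

        System.out.println(sameAsLiteral(fromLiteral));      // true: same interned literal
        System.out.println(sameAsLiteral(fromBuilder));      // false: equal text, different object
        System.out.println(fromBuilder.equals("something")); // true: value comparison
    }
}
```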

There are cases where you actually care whether you're dealing with exactly the same object rather than whether two objects are equal. In such cases, you need == rather than equals(). The compiler has no way of knowing whether you really wanted to compare the references for equality or the objects that they point to.
Now, it's far less likely that you're going to want == for strings than it would be for a user-defined type, but that doesn't guarantee that you wouldn't want it, and even if it did, that means that the compiler would have to special-case strings and specifically check to make sure that you didn't use == on them.
In addition, because strings are immutable, the JVM is free to make string which would be equal per equals() share the same instance (to save memory), in which case they would also be equal per ==. So, depending on what the JVM does, == could very well return true. And the example that you gave is actually one where there's a decent chance of it because they're both string literals, so it would be fairly easy for the JVM to make them the same string, and it probably would. And, of course, if you want to see whether the JVM is making two strings share the same instance, you would have to use == rather than equals(), so there's a legitimate reason to want to use == on strings right there.
So, the compiler has no way of knowing enough of what you're doing to know that using == instead of equals() should be an error. This can lead to bugs if you're not careful (especially if you're used to a language like C++ which overloads == instead of having a separate equals() function), but the compiler can only do so much for you. There are legitimate reasons for using == instead of equals(), so the compiler isn't going to flag it as an error.

There exist tools that will warn you about these constructs; feel free to use them. However there are valid cases when you want to use == on Strings, and it is much worse language design to warn a user about a perfectly valid construct than to fail to warn them. When you have been using Java a year or so (and I will bet good money that you haven't reached that stage yet) you will find avoiding constructs like this is second nature.
