Java String Comparison: style choice or optimization? - java

I've been taking a look at some GWT code written by various people and there are different ways of comparing strings. I'm curious if this is just a style choice, or if one is more optimized than another:
"".equals(myString);
myString.equals("");
myString.isEmpty();
Is there a difference?

"".equals(myString);
will not throw a NullPointerException if myString is null. That is why a lot of developers use this form.
myString.isEmpty();
is the best way if myString is never null, because it explains what is going on. The compiler may optimize this or myString.equals(""), so it is more of a style choice. isEmpty() shows your intent better than equals(""), so it is generally preferred.

Beware that isEmpty() was added in Java 6, and, unfortunately, there are still people who complain pretty loudly if you don't support Java 1.4.

apache StringUtils provides some convenience methods for, well, String manipulation.
http://commons.apache.org/lang/api/org/apache/commons/lang/StringUtils.html#isBlank(java.lang.CharSequence)
check out that method and associated ones.

myString.isEmpty() is probably best if you are working on a recent version of Java (1.6). It is likely to perform better than myString.equals("") as it only needs to examine one string.
"".equals(myString) has the property of not throwing a null pointer exception if myString is null. However for that reason alone I'd avoid it as it is usually better to fail fast if you hit an unexpected condition. Otherwise some little bug in the future will be very difficult to track down.....
myString.equals("") is the most natural / idiomatic approach for people wanting to keep compatibility with older Java versions, or who just want to be very explicit about what they are comparing to.

Both of the options using "" may require the creation of a temporary String object but the .isEmpty() function shouldn't.
If they bothered to put the .isEmpty() function in I say it is probably best to use it!

Related

Why does ImmutableCollection.contains(null) fail?

Question ahead:
why does in Java the call coll.contains(null) fail for ImmutableCollections?
I know, that immutable collections cannot contain null-elements, and I do not want to discuss whether that's good or bad.
But when I write a Function, that takes a (general, not explicit immutable) Collection, it fails upon checking for nulls. Why does the implementation not return false (which is actually the 'correct' answer)?
And how can I properly check for nulls in a Collection in general?
Edit:
with some discussions (thanks to the commenters!) I realized, that I mixed up two things: ImmutableCollection from the guava library, and the List returned by java.util.List.of, being some class from ImmutableCollections. However, both classes throw an NPE on .contains(null).
My problem was with the List.of result, but technically the same would happen with guaves implementation. [edit: It does not]
I am distressed by this discussion!
Collections that do this have been a pet peeve of mine since before I wrote the first collections that eventually became Guava. If you find any Guava collection that throws NPE just because you asked it a perfectly innocent question like .contains(null), please file a bug! We hate that crap.
EDIT: I was so distressed that I had to go back to look at my 2007 changelist that first created ImmutableSet and saw literally this:
#Override public boolean contains(#Nullable Object target) {
if (target == null) {
return false;
}
ahhhhh.
why does in Java the call coll.contains(null) fail for ImmutableCollections?
Because the design team (the ones who have created guava) decided that, for their collections, null is unwanted, and therefore any interaction between their collections and a null check, even in this case, should just throw to highlight to the programmer, at the earliest possible opportunity, that there is a mismatch. Even where the established behaviour (as per the existing implementations in the core runtime itself, such as ArrayList and friends, as well as the javadoc), rather explicitly go the other way and say that a non-sequitur check (is this pear part of this list of apples?) strongly suggests that the right move is to just return false and not throw.
In other words, guava messed up. But now that they have done so, going back is potentially backwards compatibility breaking. It really isn't very - you are replacing an exception thrown with a false return value; presumably code could be out there that relies on the NPE (catching it and doing something different from what the code would do had contains(null) returned false instead of throwing) - but that's a rare case, and guava breaks backwards compatibility all the time.
And how can I properly check for nulls in a Collection in general?
By calling .contains(null), just as you are. The fact that guava doesn't do it right doesn't change the answer. You might as well ask 'how do I add elements to a list', and counter the answer of "well, you call list.add(item) to do that" with: Well, I have this implementation of the List interface that plays Rick Astley over the speaker instead of adding to the list, so, I reject your answer.
That's.. how java and interfaces work: You can have implementations of them, and the only guardianship that they do what the interface dictates they must, is that the author understands there is a contract that needs to be followed.
Now, normally a library so badly written they break contract for no good reason*, isn't popular. But guava IS popular. Very popular. That gets at a simple truth: No library is perfect. Guava's API design is generally quite good (in my opinion, vastly superior to e.g. Apache commons libraries), and the team actively spends a lot of time debating proper API design, in the sense that the code that one would write using guava is nice (as defined by: Easy to understand, has few surprises, easy to maintain, easy to test, and probably easy to mutate to deal with changing requirements - the only useful definition for nebulous terms like 'nice' or 'elegant' code - it's code that does those things, anything else is pointless aesthetic drivel). In other words, they are actively trying, and they usually get it right.
Just, not in this case. Work around it: return item != null && coll.contains(item); will get the job done.
There is one major argument in favour of guava's choice: They 'contract break' is an implicit break - one would expect that .contains(null) works, and always returns false, but it's not explicitly stated in the javadoc that one must do this. Contrast to e.g. IdentityHashMap, which uses identity equivalence (a==b) and not value equality (a.equals(b)) in its .containsKey etc implementations, which explicitly goes against the javadoc contract as stated in the j.u.Map interface. IHM has an excellent reason for it, and highlights the discrepancy, plus explains the reason, in the javadoc. Guava isn't nearly as clear about their bizarre null behaviour, but, here's a crucial thing about null in java:
Its meaning is nebulous. Sometimes it means 'empty', which is bad design: You should never write if (x == null || x.isEmpty()) - that implies some API is badly coded. If null is semantically equivalent to some value (such as "" or List.of()), then you should just return "" or List.of(), and not null. However, in such a design, list.contains(null) == false) would make sense.
But sometimes null means not found, irrelevant, not applicable, or unknown (for example, if map.get(k) returns null, that's what it means: Not found. Not 'I found an empty value for you'). This matches with what NULL means in e.g. SQL. In all those cases, .contains(null) should be returning neither true nor false. If I hand you a bag of marbles and ask you if there is a marble in there that is grue, and you have no idea what grue means, you shouldn't answer either yes or no to my query: Either answer is a meaningless guess. You should tell me that the question cannot be answered. Which is best represented in java by throwing, which is precisely what guava does. This also matches with what NULL does in SQL. In SQL, v IN (x) returns one of 3 values, not 2 values: It can resolve to true, false, or null. v IN (NULL) would resolve to NULL and not false. It is answering a question that can't be answered with the NULL value, which is to be read as: Don't know.
In other words, guava made a call on what null implies which evidently does not match with your definitions, as you expect .contains(null) to return false. I think your viewpoint is more idiomatic, but the point is, guava's viewpoint is different but also consistent, and the javadoc merely insinuates, but does not explicitly demand, that .contains(null) returns false.
That's not useful whatsoever in fixing your code, but hopefully it gives you a mental model, and answers your question of "why does it work like this?".

What is the best way to check string not null and not empty in Java?

I consider bewteen !StringUtils.isEmpty(string) and string != null && string.length()!=0 Which is better for performance??
I would prefer string != null && !string.isEmpty(). The intent is clear and why rely on an external library for such a simple thing.
There's no relevant performance difference. When the code gets used a lot, the utility method gets inlined and then it's exactly the same. When it doesn't get used, then nobody cares.
Performance is sometimes important, but such trivialities have no impact. Learn Java benchmarking (JMH) before you even start thinking about optimizations.
In case you already use Apache Commons, then use their utility methods. If you don't than don't bring the library in just because of such a triviality. OTOH there are many other useful things in the library (FWIW I started with some trivial methods from Guava and now I use a substantial portion of it).

Would it be a good idea if compiler resolved nulls when Optional<Object> is expected as argument?

That would be so obviously useful that I am starting to think I am missing a rationale to avoid it, since I am sure Oracle would have made it that way. It would be the most valuable feature on Optional for me.
public class TestOptionals{
public static void main(String[] args) {
test(null);
}
public static void test(Optional<Object> optional){
System.out.println(optional.orElse(new DefaultObject()));
}
}
(This throws a NullPointerException)
Without that feature I see too verbose using Optional for the argument.
I prefer a simple Object optional signature and
checking it by if (null = optional) that creating the object Optional for comparing later. It is not valuable if that doesn't help you checking the null
There was a HUGE discussion of Optional on all the various Java mailing lists, comprising hundreds of messages. Do a web search for
site:mail.openjdk.java.net optional
and you'll get links to lots of them. Of course, I can't even hope to summarize all the issues that were raised. There was a lot of controversy, and there was quite a breadth of opinion about how much "optionality" should be added to the platform. Some people thought that a library solution shouldn't be added at all; some people thought that a library solution was useless without language support; some people thought that a library solution was OK, but there was an enormous amount of quibbling about what should be in it; and so forth. See this message from Brian Goetz on the lambda-dev mailing list for a bit of perspective.
One pragmatic decision made by the lambda team was that any optional-like feature couldn't involve any language changes. The language and compiler team already had its hands full with lambda and default methods. These of course were the main priorities. Practically speaking, the choices were either to add Optional as a library class or not at all.
Certainly people were aware of other languages' type systems that support option types. This would be a big change to Java's type system. The fact is that for most of the past 20 years, reference types have been nullable, and there's been a single, untyped null value. Changing this is a massive undertaking. It might not even be possible to do this in a compatible way. I'm not an expert in this area, but most such discussions have tended to go off into the weeds pretty quickly.
A smaller change that might be more tractable (also mentioned by Marko Topolnik) is to consider the relationship between reference types and Optional as one of boxing, and then bring in the support for autoboxing/autounboxing that's already in the language.
Already this is somewhat problematic. When auto(un)boxing was added in Java 5, it made a large number of cases much nicer, but it added a lot of rough edges to the language. For example, with auto-unboxing, one can now use < and > to compare the values of boxed Integer objects. Unfortunately, using == still compares references instead of values! Boxing also made overload resolution more complicated; it's one of the most complicated areas of the language today.
Now let's consider auto(un)boxing between reference types and Optional types. This would let you do:
Optional<String> os1 = "foo";
Optional<String> os2 = null;
In this code, os1 would end up as a boxed string value, and os2 would end up as an empty Optional. So far, so good. Now the reverse:
String s1 = os1;
String s2 = os2;
Now s1 would get the unboxed string "foo", and s2 would be unboxed to null, I guess. But the point of Optional was to make such unboxing explicit, so that programmers would be confronted with a decision about what to do with an empty Optional instead of having it just turn into null.
Hmmm, so maybe let's just do autoboxing of Optional but not autounboxing. Let's return to the OP's use case:
public static void main(String[] args) {
test(null);
}
public static void test(Optional<Object> optional) {
System.out.println(optional.orElse(new DefaultObject()));
}
If you really want to use Optional, you can manually box it one line:
public static void test(Object arg) {
Optional<Object> optional = Optional.ofNullable(arg);
System.out.println(optional.orElse(new DefaultObject()));
}
Obviously it might be nicer if you didn't have to write this, but it would take an enormous amount of language/compiler work, and compatibility risk, to save this line of code. Is it really worth it?
What seems to be going on is that this would allow the caller to pass null in order to have some specific meaning to the callee, such as "use the default object" instead. In small examples this seems fine, but in general, loading semantics onto null increasingly seems like a bad idea. So this is an additional reason not to add specific language support for boxing of null. The Optional.ofNullable() method mainly is there to bridge the gap between code that uses null and code that uses Optional.
If you are committed to using the Optional class, then see the other answers.
On the other hand, I interpreted your question as, "Would it be a good idea to avoid the syntactic overhead of using Optional while still obtaining a guarantee of no null pointer exceptions in your code?" The answer to this question is a resounding yes. Luckily, Java has a feature, type annotations, that enables this. It does not require use of the the Optional class.
You can obtain the same compile-time guarantees, without adding Optional to your code and while retaining backward compatibility with existing code.
Annotate references that might be null with the #Nullable type
annotation.
Run a compiler plugin such as the Checker Framework's Nullness Checker.
If the plugin issues no errors, then you know that your code always checks for null where it needs to, and therefore your code never issues a null pointer exception exception at run time.
The plugin handles the special cases mentioned by #immibis and more, so your code is much less verbose than code using Optional. The plugin is compatible with normal Java code and does not require use of Java 8 as Optional does. It is in daily use at companies such as Google.
Note that this approach requires you to supply a command-line argument to the compiler to tell it to run the plugin. Also note that this approach does not integrate with Optional; it is an alternate approach.

What patterns are there for dereferencing a nullable object?

Are there any patterns which are similar to Groovy's safe navigation operator? This is where a null value does make sense and allowing a NullPointerException to be thrown would be wrong.
Other than checking for null, what are some other options? The String class provides a static method to dereference possibly null values but there doesn't seem to be much use of this pattern apart from the Objects class. Google's Guava project does have an Optional class but that requires the caller supplying a default which seems too much compared to the safe navigation operator and when you would just want it to be null most of the time. Something that looks like this:
com.google.common.base.Optional.of(fooBar).
or(org.mockito.Mockito.mock(FooBar.class));
The null object pattern or the option monad in Scala are not what I'm looking for or proposals of adding such operators into future versions of Java, I'm looking for patterns that can be applied in Java now.
Nope. Just a good ol' fashioned if-else. You can use the ternary operator if that feels more pattern-like. You could fashion something patternerific with anonymous classes and Callables and such, but it would be very, very ugly in Java.
The "safe" navigation operator is in my opinion a really bad idea since it supresses exceptions and allows developers to get away with error-ridden code.
There are basically just three cases to consider:
If you have a null where it is not supposed to be then you have an error and should fail fast and loud. A NullPointerException is a perfectly good way to do this, although some would argue that you should have tests which detected the stray null earlier. Note that Fail-Fast is a very important principle in building robust systems, so you should learn to love your NullPointerExceptions. They are there to guide you to building better software. You should fix the NPE by fixing your code, not by ignoring / suppressing it!
If a null is allowed but requires specific handling (e.g. a default value) then you should be explicit about this in your code. It's perfectly good practice to write an if/else null check for this, or else you can use a helper function that itself handles the null case.
If a null is allowed but requires no specific handling then you don't need to do anything. Just pass it round as usual.

Best practice with respect to NPE and multiple expressions on single line

I'm wondering if it is an accepted practice or not to avoid multiple calls on the same line with respect to possible NPEs, and if so in what circumstances. For example:
anObj.doThatWith(myObj.getThis());
vs
Object o = myObj.getThis();
anObj.doThatWith(o);
The latter is more verbose, but if there is an NPE, you immediately know what is null. However, it also requires creating a name for the variable and more import statements.
So my questions around this are:
Is this problem something worth
designing around? Is it better to go
for the first or second possibility?
Is the creation of a variable name something that would have an effect performance-wise?
Is there a proposal to change the exception
message to be able to determine what
object is null in future versions of
Java ?
Is this problem something worth designing around? Is it better to go for the first or second possibility?
IMO, no. Go for the version of the code that is most readable.
If you get an NPE that you cannot diagnose then modify the code as required. Alternatively, run it using the debugger and use breakpoints and single stepping to find out where the null pointer is coming from.
Is the creation of a variable name something that would have an effect performance-wise?
Adding an extra variable may increase the stack frame size, or may extend the time that some objects remain reachable. But both effects are unlikely to be significant.
Is there a proposal to change the exception message to be able to determine what object is null in future versions of Java ?
Not that I am aware of. Implementing such a feature would probably have significant performance downsides.
The Law of Demeter explicitly says not to do this at all.
If you are sure that getThis() cannot return a null value, the first variant is ok. You can use contract annotations in your code to check such conditions. For instance Parasoft JTest uses an annotation like #post $result != null and flags all methods without the annotation that use the return value without checking.
If the method can return null your code should always use the second variant, and check the return value. Only you can decide what to do if the return value is null, it might be ok, or you might want to log an error:
Object o = getThis();
if (null == o) {
log.error("mymethod: Could not retrieve this");
} else {
o.doThat();
}
Personally I dislike the one-liner code "design pattern", so I side by all those who say to keep your code readable. Although I saw much worse lines of code in existing projects similar to this:
someMap.put(
someObject.getSomeThing().getSomeOtherThing().getKey(),
someObject.getSomeThing().getSomeOtherThing())
I think that no one would argue that this is not the way to write maintainable code.
As for using annotations - unfortunately not all developers use the same IDE and Eclipse users would not benefit from the #Nullable and #NotNull annotations. And without the IDE integration these do not have much benefit (apart from some extra documentation). However I do recommend the assert ability. While it only helps during run-time, it does help to find most NPE causes and has no performance effect, and makes the assumptions your code makes clearer.
If it were me I would change the code to your latter version but I would also add logging (maybe print) statements with a framework like log4j so if something did go wrong I could check the log files to see what was null.

Categories

Resources