Why doesn't Java warn about a == "something"? - java

This might sound stupid, but why doesn't the Java compiler warn about the expression in the following if statement:
String a = "something";
if(a == "something"){
System.out.println("a is equal to something");
}else{
System.out.println("a is not equal to something");
}
I realize why the expression is untrue, but AFAIK, a can never be equal to the String literal "something". The compiler should realize this and at least warn me that I'm an idiot who is coding way to late at night.
Clarification
This question is not about comparing two String object variables, it is about comparing a String object variable to a String literal. I realize that the following code is useful and would produce different results than .equals():
String a = iReturnAString();
String b = iReturnADifferentString();
if(a == b){
System.out.println("a is equal to b");
}else{
System.out.println("a is not equal to b");
}
In this case a and b might actually point to the same area in memory, even if it's not because of interning. In the first example though, the only reason it would be true is if Java is doing something behind the scenes which is not useful to me, and which I can't use to my advantage.
Follow up question
Even if a and the string-literal both point to the same area in memory, how is that useful for me in an expression like the one above. If that expression returns true, there isn't really anything useful I could do with that knowledge, is there? If I was comparing two variables, then yes, that info would be useful, but with a variable and a literal it's kinda pointless.

Actually they can indeed be the same reference if Java chooses to intern the string. String interning is the notion of having only one value for a distinct string at runtime.
http://en.wikipedia.org/wiki/String_intern_pool
Java notes about string interning
http://javatechniques.com/blog/string-equality-and-interning/

Compiler warnings tend to be about things that are either blatantly wrong (conditionals that can never be true or false) or unsafe (unchecked casts). The use of == is valid, and in some rare cases intentional.
I believe all of Checkstyle, FindBugs and PMD will warn about this, and optionally a lot of other bad practices we tend to have when half asleep or otherwise incapacitated ;).

Because:
you might actually want to use ==, if working with constants and interned strings
the compiler should make an exception only for String, and no other type. What I mean is - whenever the compiler encounters == it should check if the operands are Strings in order to issue a warning. What if the arguments are Strings, but are referred to as Object or CharSequence ?
The rationale given by checkstyle for issuing an error is that novice programmers often do this. And if you are novice, I'd be hard to configure checkstyle (or pmd), or even to know about them.
Another thing is the actual scenario when strings are compared and there is a literal as one of the operands. First, it would be better to use a constant (static final) instead of a literal. And where would the other operand come from? It is likely that it will come from the same constant / literal, somewhere else in the code. So == would work.

Depending on the context, both identity comparisons and value comparisons can be legitimate.
I can think of very few queries where there is a deterministic automated algorithm to figure out unambiguously that one of them is an error.
Therefore, there's no attempt to do this automatically.
If you think about things like caching, then there are situations where you would want to do this test.

Actually, it may sometimes be true, depending on if Java takes an existing String from its internal String cache, creating the first declaration and then storing it, or taking it for both string declarations.

The compiler doesn't care that you're trying to do identity comparison against a literal. It could also be argued that it's not the compiler's job to be a code nanny. Look for a lint-like tool if you want to catch situations like this.

"I realize why the expression is untrue, but AFAIK, a can never be equal to the String literal "something"."
To clarify, in the example given, the expersion is always TRUE and a can be == and equals() to the String literal and in the example given it is always == and equals().
It is ironic that you appear have given the rare counter example to your own argument.

There are cases where you actually care whether you're dealing with exactly the same object rather than whether two objects are equal. In such cases, you need == rather than equals(). The compiler has no way of knowing whether you really wanted to compare the references for equality or the objects that they point to.
Now, it's far less likely that you're going to want == for strings than it would be for a user-defined type, but that doesn't guarantee that you wouldn't want it, and even if it did, that means that the compiler would have to special case strings are specifically check to make sure that you didn't use == on them.
In addition, because strings are immutable, the JVM is free to make string which would be equal per equals() share the same instance (to save memory), in which case they would also be equal per ==. So, depending on what the JVM does, == could very well return true. And the example that you gave is actually one where there's a decent chance of it because they're both string literals, so it would be fairly easy for the JVM to make them the same string, and it probably would. And, of course, if you want to see whether the JVM is making two strings share the same instance, you would have to use == rather than equals(), so there's a legitimate reason to want to use == on strings right there.
So, the compiler has no way of knowing enough of what you're doing to know that using == instead of equals() should be an error. This can lead to bugs if you're not careful (especially if you're used to a language like C++ which overloads == instead of having a separate equals() function), but the compiler can only do so much for you. There are legitimate reasons for using == instead of equals(), so the compiler isn't going to flag it as an error.

There exist tools that will warn you about these constructs; feel free to use them. However there are valid cases when you want to use == on Strings, and it is much worse language design to warn a user about a perfectly valid construct than to fail to warn them. When you have been using Java a year or so (and I will bet good money that you haven't reached that stage yet) you will find avoiding constructs like this is second nature.

Related

Purpose of CompareObjectsWithEquals PMD rule

I've just fallen foul of the CompareObjectsWithEquals rule in PMD, because I've compared two object references using '==' instead of equals(), but I'm struggling to see why this is a problem and can't find any justification for this restriction.
I appreciate that Object.equals() compares references and therefore has the same effect, but I'm not using a raw Object, so I can't guarantee that method won't be overridden at some point somewhere in the hierarchy.
I want to do a reference comparison, and I want to be sure that this always will be a reference comparison. Why would PMD try to force me to call equals()?
Is it just me, or is this a really stupid rule??
Edited:
Just to be clear - I am not asking what the difference is between == and equals() (as per What is the difference between == vs equals() in Java?) - I understand this perfectly. I am asking why PMD would force me to always use equals() when the caller may legitimately want to ensure that a reference comparison is performed.
In your case, you know what you are doing, and need to compare reference so for sure the rule does not apply. And you have to use ==.
But most of the time, that's a mistake from new Java developers who try to compare value of objects using == instead of .equals().
Addition to #YMomb:
PMD and those kind of static analysis tools always leave the final decision to user. You have full rights to ignore any rule if you believe that your design is correct.

Does equality test order affect performance in Java?

I commonly find myself writing code like this:
private List<Foo> fooList = new ArrayList<Foo>();
public Foo findFoo(FooAttr attr) {
for(Foo foo : fooList) {
if (foo.getAttr().equals(attr)) {
return foo;
}
}
}
However, assuming I properly guard against null input, I could also express the loop like this:
for(Foo foo : fooList) {
if (attr.equals(foo.getAttr()) {
return foo;
}
}
I'm wondering if one of the above forms has a performance advantage over the other. I'm well aware of the dangers of premature optimization, but in this case, I think the code is equally legible either way, so I'm looking for a reason to prefer one form over another, so I can build my coding habits to favor that form. I think given a large enough list, even a small performance advantage could amount to a significant amount of time.
In particular, I'm wondering if the second form might be more performant because the equals() method is called repeatedly on the same object, instead of different objects? Maybe branch prediction is a factor?
I would offer 2 pieces of advice here:
Measure it
If nothing else points you in any given direction, prefer the form which makes most sense and sounds most natural when you say it out loud (or in your head!)
I think that considering branch prediction is worrying about efficiency at too low of a level. However, I find the second example of your code more readable because you put the consistent object first. Similarly, if you were comparing this to some other object that, I would put the this first.
Of course, equals is defined by the programmer so it could be asymmetric. You should make equals an equivalence relation so this shouldn't be the case. Even if you have an equivalence relation, the order could matter. Suppose that attr is a superclass of the various foo.getAttr and the first test of your equals method checks if the other object is an instance of the same class. Then attr.equals(foo.getAttr()) will pass the first check but foo.getAttr().equals(attr) will fail the first check.
However, worrying about efficiency at this level seldom has benefits.
This depends on the implementation of the equals methods. In this situation I assume that both objects are instances of the same class. So that would mean that the methods are equal. This makes no performance difference.
If both objects are of the same type, then they should perform the same. If not, then you can't really know in advance what's going to happen, but usually it will be stopped quite quickly (with an instanceof or something else).
For myself, I usually start the method with a non-null check on the given parameter and I then use the attr.equals(foo.getAttr()) since I don't have to check for null in the loop. Just a question of preference I guess.
The only thing which does affect performance is code which does nothing.
In some cases you have code which is much the same or the difference is so small it just doesn't matter. This is the case here.
Where its is useful to swap the .equals() around is when you have a known value which cannot be null (This doesn't appear to be the cases here) of the type you are using is known.
e.g.
Object o = (Integer) 123;
String s = "Hello";
o.equals(s); // the type of equals is unknown and a virtual table look might be required
s.equals(o); // the type of equals is known and the class is final.
The difference is so small I wouldn't worry about it.
DEVENTER (n) A decision that's very hard to make because so little depends on it, such as which way to walk around a park
-- The Deeper Meaning of Liff by Douglas Adams and John Lloyd.
The performance should be the same, but in terms of safety, it's usually best to have the left operand be something that you are sure is not null, and have your equals method deal with null values.
Take for instance:
String s1 = null;
s1.equals("abc");
"abc".equals(s1);
The two calls to equals are not equivalent as one would issue a NullPointerException (the first one), and the other would return false.
The latter form is generally preferred for comparing with string constants for exactly this reason.

Not working String comparison in Java

This code is not working:
String name = "Bob";
String name1 = "Anne";
if name = name1;
System.out.println ("Hello");
I am a beginner in Java, please help we with this code. I am trying to compare two strings.
You want:
if (name.equals(name1))
Note that you don't want
if (name == name1)
which would be syntactically correct, but would compare the two string references for equality, rather than comparing whether the objects involved represent the same sequence of characters
Further, note that even the top version will simply perform an ordinal comparison of the UTF-16 code units in the string. If the two strings logically represent the same characters but are in different forms, you may not get the result you expect. If you want to do a culture-sensitive comparison, you should look at Collator.
Additionally, I'd recommend that if you're really new to Java, you start off with console apps and/or unit tests to explore the language instead of JSP - it'll give you a much smoother cycle while you're learning the basics, and a simpler environment to work in.
Even more additionally, the code given at the top will throw a NullPointerException if name is a null reference. You should consider what you want to happen at that point - if the string being null would represent a bug anyway, then the exception is probably appropriate; otherwise, you might want to change the code. One useful method in Guava (which is chock full of good stuff) is Objects.equal:
if (Objects.equal(name, name1))
The method return true if both arguments are null, false if exactly one argument is null, and the result of calling equals() if they're both non-null. Very handy.
Your Mistakes :
You are using "=" operator to compare strings. It is not a conditional operator, it is an Assignment operator. The conditional operator in Java is "==" which is used to compare two values if they are equal or not. You even cannot use this one for Strings.
You are writing like this :
if name = name1;
System.out.println ("Hello");
You have put a semi-colon at the end of if statement. So it will do nothing (if your condition is supposed to be right, which is not in this case) however the condition is true or not.
You are missing parantheses around the condition given in if statement.
Synatx of if statement is : if(condition)
So it is must to write "()" around your condition.
Now, for comparing Strings, String class gives us methods like :
stringOne.equals(stringTwo)
It checks for exactly the same string.
or
stringOne.equalsIgnoreCase(stringTwo)
It will ignore Caps-Small letter case.
You must compare the two variables like this
if (name.equals(name1))
This should work and not the way you did it!!!
if(name.equals(name1))
System.out.println("Hello");
== works only when you compare primitives like int or long.
If you want to compare String you have to use either equals() or compareTo(). Single = is an assignment not comparison by doing name=name1 you essentially assign string name1 to variable name.
Your posted code isn't really Java. In addition, you don't compare '==' with the assignment operator '='. Finally, to do proper comparison of 'Object's or anything descended from Objects you need to use the .equals(...) method.
Comparing with == means "is the same object", not "is an object with the same value". The difference seems to be small; but, if you opt to compare objects by their value, it is not small at all. Two Objects can be created with identical "values", and only .equals(...) allows you to consider those two Objects to be the same.

String intern in equals method

Is it a good practise to use String#intern() in equals method of the class. Suppose we have a class:
public class A {
private String field;
private int number;
#Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final A other = (A) obj;
if ((this.field == null) ? (other.field != null) : !this.field.equals(other.field)) {
return false;
}
if (this.number != other.number) {
return false;
}
return true;
}
}
Will it be faster to use field.intern() != other.field.intern() instead of !this.field.equals(other.field).
No! Using String.intern() implicitly like this is not a good idea:
It will not be faster. As a matter of fact it will be slower due to the use of a hash table in the background. A get() operation in a hash table contains a final equality check, which is what you want to avoid in the first place. Used like this, intern() will be called each and every time you call equals() for your class.
String.intern() has a lot of memory/GC implications that you should not implicitly force on users of this class.
If you want to avoid full blown equality checks when possible, consider the following avenues:
If you know that the set of strings is limited and you have repeated equality checks, you can use intern() for the field at object creation, so that any subsequent equality checks will come down to an identity comparison.
Use an explicit HashMap or WeakHashMap instead of intern() to avoid storing strings in the GC permanent generation - this was an issue in older JVMs, not sure if it is still a valid concern.
Keep in mind that if the set of strings is unbounded, you will have memory issues.
That said, all this sounds like premature optimization to me. String.equals() is pretty fast in the general case, since it compares the string lengths before comparing the strings themselves. Have you profiled your code?
Good practice : Nope. You're doing something tricky, and that makes for brittle, less readable code. Unless this equals() method needs to be crazy performant (and your performance tests validate that it is in fact faster), it's not worth it.
Faster : Could be. But don't forget that you can have unintended side effects from using the intern() method: http://www.onkarjoshi.com/blog/213/6-things-to-remember-about-saving-memory-with-the-string-intern-method/
Any benefit gained by performing an identity comparison on the interned Strings is likely to be outweighed by the associated cost of interning the Strings.
In the above case you could consider interning the String when you instantiate the class, providing the field is constant (in which case you should also mark it as final). You could also check for null on instantiation to avoid having to check on each call to equals (assuming you disallow null Strings).
However, in general these types of micro-optimisation offer little gain in performance.
Let's go through this one step at a time...
The idea here is that if you use String#intern, you'll be given a canonical representation of that String. A pool of Strings is kept internally and each entry is guaranteed to be unique for that pool with regard to equals. If you call intern() on a String, then either a previously pooled identical String is going to be returned, or the String you called intern on is going to be pooled and returned.
So if we have two Strings s1 and s2 and we assume neither is null, then the following two lines of code are considered idempotent:
s1.equals(s2);
s1.intern() == s2.intern();
Let's investigate two assumptions we've made now:
s1.intern() and s2.intern() really will return the same object if s1.equals(s2) evaluates to true.
Using the == operator on two interned references to the same String will be more efficient than using the equals method.
The first assumption is probably the most dangerous of all. The JavaDoc for the intern method tells us that using this method will return a canonical representation for an internally kept pool of Strings. But it doesn't tell us anything about that pool. Once an entry has been added to the pool, can it ever be removed again? Will the pool keep growing indefinitely or will entries occassionally be culled to make it act as a limited-size cache? You'd have to check the actual specifications of the Java Language and Virtual Machine to get any certainty, if they offer it at all. Having to check specs for a limited optimization is usually a big warning sign. Checking the source code for Sun's JDK 7, I see that intern is specified as a native method. So not only is the implementation likely to be vendor-specific, it might vary across platforms as well for VMs from the same vendor. All bets are off regarding stuff that's not in the spec.
On to our second assumption. Let's consider for a moment what it would take to intern a String... First of all, we'll need to check if the String is already in the pool. We'll assume they've tried to get an O(1) complexity going there to keep this fast by using some hashing scheme. But that's assuming we've got a hash of the String. Since this is a native method, I'm not certain what would be used... Some hash of the native representation or simply what hashCode() returns. I know from the source code of Sun's JDK that a String instance caches its hash code. It'll only be calculated the first time the method is called, and after that the calculated value will be returned. So at the very least, a hash must be calculated at least once if we're to use that. Getting a reliable hash of a String will probably involve arithmetic on each and every character, which can be expensive for lenghty values. Even once we have the hash and thus a set of Strings that are candidates for being matches in the interned pool, we'd still have to verify if one of these really is an exact match which would involve... an equality check. Meaning going through each and every character of the Strings and seeing if they match if trivial cases like inequal length can't be applied first. Worse still, we might have to do this for more than one other String like we'd do with a regular equals, since multiple Strings in the pool might have the same hash or end up in the same hash bucket.
So, that stuff we need to do to find out if a String was already interned or not sounds suspiciously like what equals would need to do. Basically, we've gained nothing and might even have made our equals implementation more expensive. At least, if we're going to call intern each and every time. So maybe we should intern the String right away and simply always use that interned reference. Let's check how class A would look if that were the case. I'm assuming the String field is initialized on construction:
public class A {
private final String field;
public A(final String s) {
field = s.intern();
}
}
That's looking a little more sensible. Any Strings that are passed to the constructor and are equal will end up being the same reference. Now we can safely use == between the field field of A instances for equality checks, right?
Well, it'd be useless. Why? If you check the source for equals in class String, you'll find that any implementation made by someone with half a brain will first do a == check to catch the trivial case where the instance and the argument are the same reference first. That could save a potentially heavy char-by-char comparison. I know the JDK 7 source I'm using for reference does this. So you're still better off using equals because it does that reference check anyway.
The second reason this'd be a bad idea is that first point way up above... We simply don't know if the instances are going to be kept in the pool indefinitely. Check this scenario, which may or may not occur depending on JVM implementation:
String s1 = ... //Somehow gets passed a non-interned "test" value
A a1 = new A(s1);
//Lots of time passes... winter comes and goes and spring returns the land to a lush green...
String s2 = ... //Somehow gets passed a non-interned "test" value
A a2 = new A(s2);
a1.equals(a2); //Totally returns the wrong result
What happened? Well, if it turns out the interned String pool will sometimes be culled of certain entries, then that first construction of an A could have s1 interned, only to see it being removed from the pool, to have it later replaced by that s2 instance. Since s1 and s2 are conceivably different instances, the == check fails. Can this happen? I've got no idea. I certainly won't go check the specs and native code to find out. Will the programmer that's going through your code with a debugger to find out why the hell "test" is not considered the same as "test"?
It's no problem if we're using equals. It'll catch the same instance case early for optimal results, which will benefit us when we've interned our Strings, but we won't have to worry about cases where the instances still end up being different because then equals is gonna do the classic compare work. It just goes to show that it's best not to second-guess the actual runtime implementation or compiler, because these things were made by people who know the specs like the back of their hands and really worry about performance.
So String interning manually can be of benefit when you make sure that...
you're not interning each and every time, but just intern a String once like when intializing a field and then keep using that interned instance;
you still use equals to make sure implementation details won't ruin your day and your code doesn't actually rely on that interning, instead relying on the implementation of the method to catch the trivial cases.
After keeping this in mind, surely it's worth using intern()? Well, we still don't know how expensive intern() is. It's a native method so it might be really fast. But we're not sure unless we check the code for our target platform and JVM implementation. We've also had to make sure we understand exactly what interning does and what assumptions we've made about it. Are you sure the next person reading your code will have the same level of understanding? They might be bewildered about this weird method they've never seen before that dabbles in JVM internals and might spend an hour reading the same gibberish I'm typing right now, instead of getting work done.
That's the problem right there... Before, it was simple. You used equals and were done. Now, you've added another little thing that can nestle itself in your mind and cause you to wake up screaming one night because you've just realized that oh my God you've forgot to take out one of the == uses and that piece of code is used in a routine controlling the killer bots' apprisal of citizen disobedience and you've heard its JVM isn't too solid!
Donald Knuth was famously attributed the quote...
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil"
Knuth was clever enough to add in that 97% detail. Sometimes, thoroughly micro-optimizing a small portion of code can make a big difference. Say, if that piece of code takes up 30% of the program's runtime execution. The problem with micro-optimizations is that they tend to work on assumptions. When you start using intern() and believe that from then on it'll be safe to make reference equality checks, you've made a hell of a lot of assumptions. And even if you go down to implementation level to check if they're right, are you sure they will be in the next JRE version?
I myself have used intern() manually. Did it in some piece of code where the same handful of Strings are gonna end up in hundreds if not thousands of object instances as fields. Those fields are gonna be used as keys in HashMaps and are frequently used while doing some validation over those instances. I figured interning was worth it for two purposes: reducing memory overhead by making all those equal Strings one single instance and speeding up the map lookups, since they're using hashCode() and equals. But I've made damn sure that you can take all those intern() calls out of the code and everything will still work fine. The interning is just some icing on the cake in this case, a little extra that may or may not make a bit of difference along the road. But it's not an essential part of my code's correctness.
Long post, eh? Why'd I go through the trouble of typing all of this up? To show you that if you make micro-optimizations, you'd better know damn well what you're doing and willing to document it so thoroughly that you might as well not have bothered.
This is hard to say given that you have not specified hardware. Timing test are difficult to get right and are not universal. Have you done a timing test yourself?
My feeling is that the intern pattern would not be faster as each string would need to be matched to a possible string in a dictionary of all interned strings.

I don't understand this ("string" == "string") example

I found this java code on a java tutorial page:
if ("progress" == evt.getPropertyName())
http://download.oracle.com/javase/tutorial/uiswing/examples/components/index.html
How could this work? I thought we HAVE TO use the equals() method for this situation (string.equals("bla"))? Could we use equals() here too? Would it be better? Any ideas?
Edit: So IF equals() would be better, then I really don't understand why a serious oracle tutorial page didn't use it? Also, I don't understand why it's working because I thought a string is an object. If I say object == object, then that's a big problem.
Yes, equals() would definitely be better and correct. In Java, a pool of string constants is maintained and reused intelligently for performance. So this can work, but it is only guaranteed if evt.getPropertyName() is assured to return constants.
Also, the more correct version would be "progress".equals(evt.getPropertyName()), in case evt.getPropertyName() is null. Note that the implementation of String.equals starts with using == as a first test before doing char-by-char comparison, so performance will not be much affected versus the original code.
Which demo are we looking at?
This explains equals() vs ==
http://www.java-samples.com/showtutorial.php?tutorialid=221
It is important to understand that the equals( ) method and the == operator perform two different operations. As just explained, the equals( ) method compares the characters inside a String object. The == operator compares two object references to see whether they refer to the same instance. The following program shows how two different String objects can contain the same characters, but references to these objects will not compare as equal:
So in your particular example, it is comparing the reference to see if they are the same reference, not to see if the string chars match I believe.
The correct version of this code should be:
if ("progress".equals(evt.getPropertyName()))
This could work because of the way that the JVM handles string constants. Each string constant is intern()ed. So if evt.getPropertyName() is returning a reference to a string constant than using == will work. But it is bad form and in general it will not work.
This only would work if evt.getPropertyName() returns a constant string of value "progress".
With constant string, I mean evaluated at compile-time.
In most cases, when comparing Strings, using equals is best. However, if you know you'll be comparing the exact same String objects (not just two strings that have the same content), or if you're dealing entirely with constant Strings and you really care about performance, using == will be somewhat faster than using equals. You should normally use equals since you normally don't care about performance sufficiently to think about all the other prerequisites for using ==.
In this case, the author of the progress demo should probably have used equals - that code isn't especially performance-critical. However, in this particular case, the code will be dealing entirely with constant strings, so whilst it's probably not the best choice, especially for a demo, it is a valid choice.

Categories

Resources