Behavior of String literals is confusing - java

The behavior of String literals is very confusing in the code below.
I can understand line 1, line 2, and line 3 are true, but why is line 4 false?
When I print the hashcode of both they are the same.
class Hello
{
    public static void main(String[] args)
    {
        String hello = "Hello", lo = "lo";
        System.out.print((Other1.hello == hello) + " "); //line 1
        System.out.print((Other1.hello == "Hello") + " "); //line 2
        System.out.print((hello == ("Hel"+"lo")) + " "); //line 3
        System.out.print((hello == ("Hel"+lo)) + " "); //line 4
        System.out.println(hello == ("Hel"+lo).intern()); //line 5
        System.out.println(("Hel"+lo).hashCode()); //hashcode is 69609650 (machine dependent)
        System.out.println("Hello".hashCode()); //hashcode is same WHY ??.
    }
}
class Other1 { static String hello = "Hello"; }
I know that == checks for reference equality and that literals are looked up in the pool. I know equals() is the right way to compare. I want to understand the concept.
I already checked this question, but it doesn't explain clearly.
I would appreciate a complete explanation.

Every compile-time constant expression that is of type String will be put into the String pool.
Essentially that means: if the compiler can (easily) "calculate" the value of the String without running the program, then it will be put into the pool (the rules are slightly more complicated than that and have a few corner cases, see the link above for all the details).
That's true for all the Strings in lines 1-3.
"Hel"+lo is not a compile-time constant expression, because lo is a non-constant variable.
The hash codes are the same, because the hashCode of a String depends only on its content. That's required by the contract of equals() and hashCode().
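A minimal sketch of the distinction, using the same values as the question. The class name ConstantVsRuntime and its helper methods are made up for illustration and are not part of the original code:

```java
public class ConstantVsRuntime {
    // "Hel" + "lo" is a compile-time constant expression, so the compiler
    // folds it into the single pooled literal "Hello".
    static boolean constantExpressionIsPooled() {
        String hello = "Hello";
        return hello == ("Hel" + "lo");
    }

    // lo is an ordinary (non-final) local variable, so "Hel" + lo is
    // evaluated at run time and yields a newly created String object.
    static boolean runtimeConcatIsPooled() {
        String hello = "Hello", lo = "lo";
        return hello == ("Hel" + lo);
    }

    // hashCode() depends only on the characters, not on object identity.
    static boolean hashCodesMatch() {
        String lo = "lo";
        return "Hello".hashCode() == ("Hel" + lo).hashCode();
    }

    public static void main(String[] args) {
        System.out.println(constantExpressionIsPooled()); // true
        System.out.println(runtimeConcatIsPooled());      // false
        System.out.println(hashCodesMatch());             // true
    }
}
```

The second method is the situation in line 4 of the question: same characters, same hash code, different object.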

Strings computed by concatenation at runtime are newly created and therefore distinct
here is a link to read: http://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.5

A String object can be created in the following ways:
String str = new String("abcd"); // Using the new operator
// The literal "abcd" is a compile-time constant, but new creates a separate String object at run time.
String str = "abcd"; // Using a string literal
// The value "abcd" is known at compile time; str refers to the pooled literal.
String str = "ab" + "cd"; // Using a string constant expression
// The compiler folds this to "abcd", so str refers to the same pooled literal.
String str1 = "cd";
String str = "ab" + str1; // Using a non-constant string expression
// The value "abcd" is computed at run time only, which produces a new object.
In every case the hash code is calculated at run time from the contents of the String object.

It's because the compiler in this instance is not smart enough to work out that it can burn in the same string literal.
hashCode() must always return the same value for strings that are equivalent (calling .equals() on them returns true), so it returns the same result.

It's because the following code
"Hel"+lo
is translated internally (by compilers before Java 9; newer ones use invokedynamic, with the same observable result) to roughly
new StringBuilder().append("Hel").append(lo).toString()
So you can see that an entirely new String instance is created from the other String instances. That is why you get false: the two references point to different objects on the heap.

The hashCode doesn't have anything to do with an object's reference (the == check is a reference comparison). It's possible to have two objects where hashCode() returns the same value and equals() returns true, but == returns false. This happens when they are two different objects with the same value.
I believe the reason line 4 returns false is that its value is computed at runtime, and thus is a different string instance with a different reference.

String literals are stored in a special shared pool. If two literals are exactly the same, they refer to the same object in that pool. If you create a String in any other way, a new object is created outside the pool, so the reference won't be the same.
The intern() method tells the virtual machine to look the string up in that shared pool (adding it if it is not there yet) and to return the pooled instance, which is the same instance an equal literal refers to.
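A short sketch of that behaviour. The class name InternDemo and its methods are hypothetical, chosen only for this illustration:

```java
public class InternDemo {
    // new String(...) always allocates a fresh object, even when an equal
    // literal already exists in the pool.
    static boolean freshCopyIsSameAsLiteral() {
        String copy = new String("Hello");
        return copy == "Hello";
    }

    // intern() returns the canonical pooled instance for this content,
    // which is the instance the literal "Hello" refers to.
    static boolean internedCopyIsSameAsLiteral() {
        String copy = new String("Hello");
        return copy.intern() == "Hello";
    }

    public static void main(String[] args) {
        System.out.println(freshCopyIsSameAsLiteral());    // false
        System.out.println(internedCopyIsSameAsLiteral()); // true
    }
}
```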

As you already know, this is just because of references. When a string comes from the pool it will have the same reference, but when you do manipulations, a new string with a new reference is generated.
You can check this link for pooling concept

The difference between lines 3 and 4 is as follows.
•Strings computed by constant expressions are computed at compile time and then treated as if they were literals.
•Strings computed by concatenation at run time are newly created and therefore distinct.
The above reference is taken from java spec. Please let me know if you need more clarification.
http://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.5

System.identityHashCode() returns the value that the default Object.hashCode() would return for the object, which is typically derived from the object's internal address. String overrides hashCode() to use its contents instead.
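To see the two kinds of hash side by side, here is a small sketch (the class name HashVsIdentity and its helpers are invented for this example). String.hashCode() is specified as s[0]*31^(n-1) + ... + s[n-1], so "Hello" hashes to 69609650 on any conforming JVM, while the identity hash values are implementation specific and are printed without an expected result:

```java
public class HashVsIdentity {
    // Content-based hash: equal strings always produce equal values.
    static boolean sameContentHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    // Reference comparison: true only if both variables name one object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "Hello";
        String copy = new String("Hello"); // distinct object, same content

        System.out.println(literal.hashCode());             // 69609650
        System.out.println(sameContentHash(literal, copy)); // true
        System.out.println(sameObject(literal, copy));      // false

        // Identity hashes usually differ for distinct objects, but the
        // exact values (and even their inequality) are not guaranteed.
        System.out.println(System.identityHashCode(literal));
        System.out.println(System.identityHashCode(copy));
    }
}
```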

Finally I know the answer !
Read Java SE 8 specification section 15.21.3 Reference Equality Operators == and != (http://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.21.3)
While == may be used to compare references of type String, such an
equality test determines whether or not the two operands refer to the
same String object.
The result is false if the operands are distinct
String objects, even if they contain the same sequence of characters(§3.10.5). The contents of two strings s and t can be tested for
equality by the method invocation s.equals(t).
So the following code :
class Test {
    public static void main(String[] args) {
        String hello = "Hello";
        String lo = "lo";
        System.out.println((hello == ("Hel"+lo))); // line 3
    }
}
The expression ("Hel"+lo) in line 3 returns a new String computed by concatenation at run time.
*Strings computed by concatenation at run time are newly created and therefore distinct.
(http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#d5e1634)
So the code above prints:
false
Because
the "Hello" object in this expression:
String hello = "Hello";
and the ("Hel"+lo) object in this expression:
System.out.print((hello == ("Hel"+lo)) + " ");
are different objects, although:
*they both contain the same sequence of characters, which is "Hello".
*they both have the same hashCode.
*hello.equals(("Hel"+lo)) will return true.

Related

Why "abc" in String x = "abc".toUpperCase() is not included in an intern pool?

Whenever I do a string conversion (like to uppercase - using toUpperCase()), does a new string object always get created (not using interning)?
Also, I tried to test something myself with the code below:
String x = "abc".toUpperCase();
String y = "abc";
String z = "ABC";
System.out.println(x == y); // returns false
System.out.println(x == z); // returns false
I thought "abc" would put an object into an intern pool while toUpperCase() seems to be just using new keyword without String.intern().
However, it did not go as I expected.
x == y gave me a false value. Why is "abc" not pushed into the intern pool?
I thought writing a string as a literal, e.g. "abc", would automatically do something like new String("abc").intern()?
Just to clarify, I assume here that the main point of your question was comparing x == z - two uppercase strings - and not x == y which would always be false because one is uppercase and the other is lowercase and this has nothing to do with interning.
JVMS 8 5.1 says:
A string literal is a reference to an instance of class String, and is
derived from a CONSTANT_String_info structure (§4.4.3) in the binary
representation of a class or interface. The CONSTANT_String_info
structure gives the sequence of Unicode code points constituting the
string literal.
The Java programming language requires that identical string literals
(that is, literals that contain the same sequence of code points) must
refer to the same instance of class String (JLS §3.10.5). In addition,
if the method String.intern is called on any string, the result is a
reference to the same class instance that would be returned if that
string appeared as a literal. Thus, the following expression must have
the value true:
("a" + "b" + "c").intern() == "abc"
To derive a string literal, the Java Virtual Machine examines the
sequence of code points given by the CONSTANT_String_info structure.
If the method String.intern has previously been called on an instance of class String containing a sequence of Unicode code points
identical to that given by the CONSTANT_String_info structure, then
the result of string literal derivation is a reference to that same
instance of class String.
Otherwise, a new instance of class String is created containing the sequence of Unicode code points given by the CONSTANT_String_info
structure; a reference to that class instance is the result of string
literal derivation. Finally, the intern method of the new String
instance is invoked.
The important part here is that only string literals, and strings on which .intern() is explicitly called, are put in the so-called string pool. The same goes for "taking" instances from the string pool. So if a string is created in any other way than being specified as a string literal or being received from the intern() method, it won't be an instance from the string pool, nor will it be added there.
The following example shows a little better how it works:
String a = "abc".toUpperCase().intern();
String b = "abc".toUpperCase();
String c = "ABC";
String d = "ABC";
System.out.println(a == b); // returns false
System.out.println(a == c); // returns true
System.out.println(b == c); // returns false
System.out.println(c == d); // returns true

When does the "==" operator successfully compare strings? [duplicate]

I understand that the equality operator compares references to the strings. So, it will check to see if the strings refer to the same object and not if they are equal character by character.
As a first step in learning about search algorithms, I set up the following program where I have an array of names and then I check if a certain name appears in the array.
First Approach :
I declare and initialize the array of the names. And I ask the user to input a name to check if it appears in the array.
Here's the code I used -
import java.util.Scanner;
public class Strawman{
    public static void main(String[] args){
        System.out.println("Enter the name to search for:");
        Scanner scanner = new Scanner(System.in);
        String key = scanner.nextLine();
        String[] names = {"alice", "bob", "carlos", "carol", "craig", "dave", "erin", "eve", "frank", "mallory", "oscar", "peggy", "trent", "walter", "wendy"};
        for (int i = 0; i < names.length; i++){
            if (key == names[i]) {
                System.out.println("Index " + i + " has the name " + key);
            }
        }
    }
}
One of the runs of this program is shown in the following screenshot -
As expected, because I'm using the == operator to compare strings, this fails to find the name "oscar" in the array, even though it appears in the initial array. This output matches my understanding of how the equality operator compares the references of the strings.
But, I don't understand why the program seems to work if instead of asking for user input, I declare the name to search for as a string.
Second Approach:
The name "oscar" to search for has been declared as a string instead of asking for user input -
public class Strawman2{
    public static void main(String[] args){
        String[] names = {"alice", "bob", "carol", "craig", "carlos", "dave", "eve", "fred", "greg", "gregory", "oscar", "peter"};
        String key = "oscar";
        for (int i = 0; i < names.length; i++){
            if (names[i] == key){
                System.out.println("Index " + i + " has name " + key);
            }
        }
    }
}
Now, if I run the program, the name "oscar" is found in the array -
Can someone explain the difference in the two cases?
It's because in the second approach
String key = "oscar";
reuses an instance from the string constant pool populated by
String[] names = {"alice", "bob", "carol", "craig", "carlos", "dave", "eve", "fred", "greg", "gregory", "oscar", "peter"};
If you change the initialization of the key variable to:
String key = new String("oscar");
it will behave the same way as the first approach, because you bypass the String Constant Pool and your key variable now refers to another object in memory.
For more information about the String Constant Pool:
String Constant Pool
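The two approaches can be compared directly in a small sketch. The class name NameSearch and both helper methods are assumptions made for this example; new String("oscar") stands in for a key that arrives at run time, such as Scanner input:

```java
public class NameSearch {
    // Compares by content, so it finds the key however it was created.
    static int indexOf(String[] names, String key) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(key)) {
                return i;
            }
        }
        return -1;
    }

    // Compares references only, as in the question's loops.
    static int indexOfByReference(String[] names, String key) {
        for (int i = 0; i < names.length; i++) {
            if (names[i] == key) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] names = {"alice", "bob", "oscar"};
        String runtimeKey = new String("oscar"); // not the pooled literal

        System.out.println(indexOf(names, runtimeKey));            // 2
        System.out.println(indexOfByReference(names, runtimeKey)); // -1
        System.out.println(indexOfByReference(names, "oscar"));    // 2
    }
}
```

Only the equals() version gives the same answer for both kinds of key, which is why it is the safe choice for the search.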
It's because the compiler reuses string instances from string literals that are known at compile time. Hence they pass the object equality check. Reuse is possible because Strings are immutable objects.
Strings that are not known at compile time, and/or explicitly created as new String objects, are not subject to this optimisation and will always result in new objects.
There are only two situations where == is guaranteed to work (as you want) for string testing:
•You created a String object explicitly and know for sure that you are using the same reference for it in two different places.
•You know for sure that both strings you are comparing have been interned, noting that string literals are always [1] interned.
Technically, an interned string is one that is a result of calling String::intern some time during its lifetime. (See JLS 3.10.5 and the javadoc.) Informally, an interned string is one that "is in the String pool", though the term "the String pool" is not specified anywhere [2].
In any other situation, == is liable to give the wrong answer.
And ... those two cases rarely arise in real-world programs.
1 - Not strictly 100% true: consider literals that are subexpressions in constant expressions. However, that does not affect the behavior of the == operator.
2 - The closest I have found is "a pool of strings, initially empty, is maintained privately by the class String" in the javadocs. But the current javadocs, JLS and JVM spec don't use the phrase "the String pool" or "the String constant pool" or any of the other variations any place that I can find.

Query regarding String Pool

Suppose I have strings:
String a = "hello";
String b = "h";
String c = "ello";
String d = b+c;
When I check for a==d it returns false.
Please correct me if am wrong, the bytecode would contain string d value as hello right? I want to know why is it that during execution of the program,string d is not picked up from string pool as hello is already available in string pool and hence returning false as above when checked for equality.
Please correct me if am wrong, the bytecode would contain string d value as hello right?
You are wrong. (You can see that you are wrong by inspecting the bytecode for yourself.)
The value of d is evaluated at runtime, by concatenating b and c.
It is only if you declare both b and c as final that this becomes true: then they are both compile time constant expressions (*).
This means that the value assigned to d is a compile-time constant expression, so it is evaluated at compile time to be "hello". But only one "hello" is inserted into the constant pool, because no more are necessary.
Hence, a == d would be true.
(*) final-ness is a necessary but not sufficient condition to be a compile-time constant; the other relevant fact is that they are assigned a compile-time constant value, namely a string literal value.
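The effect of final can be checked with a sketch like the following, which reuses the variable names from the question. The class name FinalConcat and the two method names are hypothetical:

```java
public class FinalConcat {
    // b and c are plain locals, so b + c is evaluated at run time
    // and d refers to a newly created String.
    static boolean nonFinalParts() {
        String a = "hello";
        String b = "h";
        String c = "ello";
        String d = b + c;
        return a == d;
    }

    // b and c are final and initialized with literals, so they are
    // constant variables; b + c is folded to "hello" at compile time.
    static boolean finalParts() {
        String a = "hello";
        final String b = "h";
        final String c = "ello";
        String d = b + c;
        return a == d;
    }

    public static void main(String[] args) {
        System.out.println(nonFinalParts()); // false
        System.out.println(finalParts());    // true
    }
}
```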
During the execution StringBuilder will be created which itself will create a String object out of char array.
Sample:
StringBuilder dBuilder = new StringBuilder();
dBuilder.append(b);
dBuilder.append(c);
String d = dBuilder.toString(); // here new String(value, 0, count) will be called,
// where value is the char array and count is the size of the resulting string.
The == operator checks whether two references point to exactly the same object. Here a and d refer to different objects, even though they have the same content and therefore the same hash code. == compares the references themselves: it returns true if they are the same and false otherwise.
If you tell Java to take the value of d from the pool
String d = (b+c).intern();
then a == d will return true; see the example here (https://ideone.com/Y100H4)

Class Hello in a test

I have come across this question in a test:
class Hello {
    public static void main(String[] args){
        String hello = "Hello", lo = "lo";
        System.out.println(hello == ("Hel" + "lo"));
        System.out.println(hello == ("Hel" + lo));
        System.out.println(hello == ("Hel" + lo).intern());
    }
}
The output is:
true
false
true
Why is the second output false?
It prints 'false' because the concatenation of the String constant "Hel" and the String variable lo results in a separate, anonymous String object with its own unique object reference. Thus, the "Hello" String object and the concatenated string are different objects when compared by object reference (the == operator), even though they are equal by String value (String.equals()).
== compares the references of two sides.
Here, for hello == ("Hel"+lo), the references of two sides are not the same. So, it returns false.
For comparing values, use equals() method.
I think it is a literal comparison problem.
This works:
System.out.print((hello.equals("Hel"+lo)) + " ");
System.out.print((hello == ("Hel"+"lo")) + " ");
I think it is because in the second output ("Hel" + lo) is no longer in the string pool. The equality operator == compares object references, not the characters of the String. By default Java puts all string literals into the string pool, but you can also put any string into the pool by calling the intern() method of java.lang.String, for example a string created with the new operator.

Checking a number in String Variable [duplicate]

I am getting a value as 9999912499 from the database.
I have separated it in two parts 99999 and 12499 using substring.
Now I want to check whether the first string is equal to 99999; if it is, I do some processing, otherwise I do some other processing.
But control never gets into the if block.
Following is a snippet:
String strPscId = Long.toString(pscID);
String convPscID = strPscId.substring(5, strPscId.length());
String checkNine = strPscId.substring(0,5);
BigDecimal jpaIdObj = jeuParam.getJpaIdObj();
Long mod_id = modele.getModId();
log.info("outstrPscId == " +strPscId);
log.info("outconvPscID == " +convPscID);
log.info("outcheckNine == " +checkNine);
log.info("outjpaIdObj == " +jpaIdObj);
log.info("outmod_id == " +mod_id);
if(checkNine == "99999") { <method-call> }
else { <another - method - call> }
For some reason, the people that make java decided that == shouldn't be used to compare Strings, so you have to use
checkNine.equals("99999");
Look at the following code:
String str1 = "abc";
String str2 = str1;
In the first line, a string is created and stored in your computer's memory. str1 itself is not that string, but a reference to that string. In the second line, str2 is set to equal str1. str2 is, like str1, only a reference to a place in memory. However, rather than creating an entirely new string, str2 is a reference to the same place in memory that str1 is a reference to. == checks if the references are the same, but .equals() checks if each character in a string is the same as the corresponding character in the other string.
boolean bool1 = (str1 == str2);
boolean bool2 = str1.equals(str2);
If this code were added to the code above that, both bool1 and bool2 would be true.
String str1 = "abc";
String str2 = new String(str1);
boolean bool1 = (str1 == str2);
boolean bool2 = str1.equals(str2);
In this case bool2 is still true, but bool1 is false. This is because str2 isn't set to equal str1, so it isn't a reference to the same place in memory that str1 is a reference to. Instead, new String(str1) creates an entirely new string that has the value of str1. str1 and str2 are references to two different places in memory. They contain the same value, but are fundamentally different in that they are stored in two different places, and therefore are two different things.
If I replaced new String(str1) with "abc" or str1, bool1 would be true, because without the key word new, the JVM only creates a new string to store in memory if absolutely necessary. new forces the JVM to create an entirely new string, whether or not any place in memory already has the same value as the new string being created.
.equals() is somewhat slower but generally more useful than ==, which is faster but often does not give the desired result. There are many times when == can be used with the same result as .equals(), but it can be difficult to tell when those times are. Unless you are a knowledgeable programmer making something where speed is important, I would suggest that you always use .equals().
You need to use the equals() method, rather than ==, to compare strings.
Change from
if(checkNine == "99999")
to
if(checkNine.equals("99999"))
The == operator compares the content of two variables. This works as expected for primitive types (and when a wrapper is compared with a primitive, because of auto-unboxing). However, when == is used on references to objects (e.g., checkNine), the content being compared is the reference to the object, not the value of the object. This is where the equals() method is used.
if("99999".equals(checkNine)){
<method-call>
}
else {
<another - method - call>
}
if(checkNine.equals( "99999")) {
<method-call>
}
else {
<another - method - call>
}
if (strPscId.startsWith("99999"))
{
bla bla
}
else
{
sth else than bla bla
}
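The three checks suggested in the answers above can be put side by side in a short sketch. The class name PrefixCheck and its methods are invented for illustration, and the long parameter stands in for the pscID value from the question:

```java
public class PrefixCheck {
    // substring(0, 5) builds a new String at run time, so comparing it
    // to the literal "99999" with == compares two different objects.
    static boolean byReference(long pscId) {
        String checkNine = Long.toString(pscId).substring(0, 5);
        return checkNine == "99999";
    }

    // equals() compares characters; putting the literal first also
    // avoids a NullPointerException if the other side were null.
    static boolean byContent(long pscId) {
        String checkNine = Long.toString(pscId).substring(0, 5);
        return "99999".equals(checkNine);
    }

    // startsWith() skips the intermediate substring entirely.
    static boolean byPrefix(long pscId) {
        return Long.toString(pscId).startsWith("99999");
    }

    public static void main(String[] args) {
        long pscId = 9999912499L;
        System.out.println(byReference(pscId)); // false
        System.out.println(byContent(pscId));   // true
        System.out.println(byPrefix(pscId));    // true
    }
}
```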
