Please explain the inner workings of the following code:
System.out.println(new String("ABC").intern()==new String("ABC").intern());
The above code prints "true". But according to Java's rules, the new operator always creates a new object, and the intern() method also creates an object in the string pool. So my question is: how many objects does the above code create?
According to me, 3 new objects will be created: one goes into the String pool, and two anonymous objects are created by the new operator. But I am not sure.
If I am wrong, please explain.
Assuming no cleverness in the optimizer, two objects are created. (A smart enough optimizer could optimize this to just an unconditional true, in which case no objects are created.)
tl;dr version: You were almost right with your answer of 3, except that the string that goes into the String pool is not generated as part of this statement; it's already created.
First, let's get the "ABC" literal out of the way. It's represented in the runtime as a String object, but that lives in permgen (on older JVMs; since Java 7 the string pool is on the regular heap) and was created once in the whole life of the JVM. If this is the first class that uses that string literal, it was created at class load time (see JLS 12.5, which states that the String was created when the class was loaded, unless it previously existed).
So, the first new String("ABC") creates one String, which simply copies the reference (but does not create a new object) to the chars array and hash from the String that represents the "ABC" literal (which, again, is not created as part of this line). The .intern() method then looks to see whether an equal String is already in permgen. It is (it's just the String that represents the literal to begin with), so that's what that function returns. So, new String("ABC").intern() == "ABC". See JLS 3.10.5, and in particular:
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
Exactly the same thing happens with the second occurrence of new String("ABC").intern(). And, since both intern() calls return the same object as the "ABC" literal, the two sides of == refer to the same object.
Breaking it down a bit:
String a = new String("ABC"); // a != "ABC"
String aInterned = a.intern(); // aInterned == "ABC"
String b = new String("ABC"); // b != "ABC"
String bInterned = b.intern(); // bInterned == "ABC"
System.out.println(new String("ABC").intern()==new String("ABC").intern());
// ... is equivalent to...
System.out.println(aInterned == bInterned); // ...which is equivalent to...
System.out.println("ABC" == "ABC"); // ...which is always true.
When you call the intern() method, the JVM checks whether an equal string is already in the string pool. If it is, intern() returns a reference to that pooled string; otherwise it adds the string to the pool and returns a reference to it.
In your case: System.out.println(new String("ABC").intern()==new String("ABC").intern());
The first new String("ABC").intern() returns the pooled "ABC" (the literal "ABC" in your code has already put it there; if it were not there, this call would add it). When you call new String("ABC").intern() a second time, the JVM returns a reference to that same pooled string. That is why you get true when comparing the two (both point to the same object).
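To make the lookup described above concrete, here is a minimal sketch. The class and method names are made up for illustration; only new String(...) and String.intern() come from the question.

```java
class InternIdentityDemo {
    // Two separate copies made with new: never the same object.
    static boolean freshCopiesAreSameObject() {
        return new String("ABC") == new String("ABC");
    }

    // intern() maps both copies to the single pooled instance.
    static boolean internedCopiesAreSameObject() {
        return new String("ABC").intern() == new String("ABC").intern();
    }

    // The pooled instance is the one the literal "ABC" refers to.
    static boolean internedCopyIsTheLiteral() {
        return new String("ABC").intern() == "ABC";
    }

    public static void main(String[] args) {
        System.out.println(freshCopiesAreSameObject());    // false
        System.out.println(internedCopiesAreSameObject()); // true
        System.out.println(internedCopyIsTheLiteral());    // true
    }
}
```

The == comparisons are deliberate here: they test object identity, which is exactly what the question is about.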
I believe you are right: the new operator always creates a new object, so there are 2 anonymous objects, and intern() adds a new string to the string pool only if an equal one is not already there, then returns its reference.
I have a scenario like this -
String s = "abc", t="abc"; //LINE 1
System.out.println(s==t); // definitely it would return true; //LINE 2
s=s+"d"; t=t+"d"; //LINE 3
System.out.println(s==t); // output would be false; but why??
s=s.intern(); t=t.intern();
System.out.println(s==t); // it would return true;
I want to know why the second print statement returns false. Please point me to a reference that explains this.
When t was created at line 1, intern was called and it pointed to "abc". Why was intern not called at line 3?
Java strings are immutable.
That means that when you do something like s=s+"d" you're actually creating a whole new string and assigning it to s.
On top of that, the compiler detects string constants, so when you write s="abc", t="abc" it reuses the same reference and your code is effectively s=t="abc".
So you start with the exact same string instance (thanks to the compiler) and turn it into two strings with identical contents but different identities, at which point s==t is false (s.equals(t) would be true, as it compares the contents and not the address in memory).
Next up is intern(). What intern() does is look up an identical string in the string cache and return it. If it doesn't find an identical entry, it places the string it was called on into the cache and returns that string. So s=s.intern() places s into the string cache and returns it (so s is unchanged), but the following call t=t.intern() actually returns s, so that s==t again.
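The sequence from the question can be replayed step by step. This is a small sketch with a hypothetical class name; it records each comparison so the three stages (literals, run-time concatenation, interning) can be seen side by side.

```java
class ConcatIdentityDemo {
    // Returns {literalsSame, concatSame, concatEqual, internedSame}.
    static boolean[] compare() {
        String s = "abc", t = "abc";
        boolean literalsSame = (s == t);     // both refer to the pooled literal

        s = s + "d";                         // run-time concatenation: new object
        t = t + "d";                         // another new object, same contents
        boolean concatSame = (s == t);       // identity differs
        boolean concatEqual = s.equals(t);   // contents match

        s = s.intern();                      // canonical "abcd" from the pool
        t = t.intern();                      // the same canonical instance
        boolean internedSame = (s == t);

        return new boolean[] { literalsSame, concatSame, concatEqual, internedSame };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        // prints: true false true true
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
    }
}
```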
Strings are "special" Java objects.
The JVM tries to reuse the same references (that's why String s = "abc", t="abc"; causes s and t to point to the same instance). However, when you operate on instances (as in t=t+"d") a new instance gets created, so the references are no longer the same.
In order to compare strings you have to use the .equals() method.
intern() returns a canonical representation of the string from the pool maintained by the String class (http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#intern%28%29)
String s = "abc", t="abc";
s == t is true because Java automatically interns String literals. In this case the String literal "abc" has been interned and both s and t point to that same instance. Hence s == t is true.
s = s + "d"; t = t + "d";
Strings in Java are immutable. Hence what you are assigning to s and t are two new Strings that have been constructed. Therefore they do not point to the same instance. This is why s == t returns false.
s = s.intern(); t = t.intern();
Here you have forcibly interned the string in s.intern(). Since both s and t contain the same string values, the JVM sees that t is the same and makes it point to the same interned-instance as s. Hence s == t is true.
As a general note, establishing the equality of strings should be done via .equals() and not ==; == only compares references for reference-types and not values.
Java Language Specification explicitly covers this particular situation. Here is a quote from chapter 3.10.5. "String Literals":
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
As you can see, only constant expressions are interned. So, the first four lines of your code are equivalent to:
String s = "abc".intern(), t="abc".intern();
System.out.println(s==t);
s=s+"d".intern(); t=t+"d".intern();
System.out.println(s==t);
Expressions s+"d" and t+"d" aren't constant and, thus, aren't interned.
JLS even provides an example with useful notes. Here is the relevant part:
package testPackage;
class Test {
public static void main(String[] args) {
String hello = "Hello", lo = "lo";
System.out.print((hello == ("Hel"+lo)));
}
}
Output: false
Note: Strings computed by concatenation at run time are newly created and therefore distinct.
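The distinction the JLS draws between constant expressions and run-time concatenation can be checked directly. The sketch below uses a hypothetical class name and adds one case that is not in the quoted example: declaring the variable final with a constant initializer makes "Hel" + lo a constant expression again.

```java
class ConstantExpressionDemo {
    // lo is an ordinary variable, so "Hel" + lo is computed at run time.
    static boolean withPlainVariable() {
        String hello = "Hello", lo = "lo";
        return hello == ("Hel" + lo);
    }

    // A final variable initialized with a constant is itself a constant,
    // so the compiler folds "Hel" + lo into the literal "Hello".
    static boolean withFinalVariable() {
        final String lo = "lo";
        return "Hello" == ("Hel" + lo);
    }

    // Interning the run-time result yields the pooled literal instance.
    static boolean withExplicitIntern() {
        String lo = "lo";
        return "Hello" == ("Hel" + lo).intern();
    }

    public static void main(String[] args) {
        System.out.println(withPlainVariable());  // false
        System.out.println(withFinalVariable());  // true
        System.out.println(withExplicitIntern()); // true
    }
}
```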
Because when you concatenate Strings you get a new object, except when all the operands are literal Strings.
Note that interning both Strings returns a reference to the same pooled String object.
For the below statement in a program, how many objects will be created in heap memory and in the string constant pool?
I need clarity in object creation. Many sources I've read are not elaborating. I am confused when the object gets destroyed.
String a="MAM"+"BCD"+"EFG"+"GFE";
How many objects will be created?
I am looking for good material about the life cycle of objects, methods and classes and how the JVM handles them when they are dynamically changed and modified.
"MAM"+"BCD"+"EFG"+"GFE" is a compile-time constant expression and it compiles into "MAMBCDEFGGFE" string literal. JVM will create an instance of String from this literal when loading the class containing the above code and will put this String into the string pool. Thus String a = "MAM"+"BCD"+"EFG"+"GFE"; does not create any object, see JLS 15.18.1. String Concatenation Operator +
The String object is newly created (§12.5) unless the expression is a compile-time constant expression (§15.28).
It simply assigns a reference to String object in pool to local var a.
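A quick way to see that the concatenation is folded at compile time is to compare the result with the equivalent single literal by identity. This is a sketch; the class and method names are invented for illustration.

```java
class CompileTimeConcatDemo {
    // All operands are literals, so the whole expression is one constant.
    static boolean foldedIntoOneLiteral() {
        String a = "MAM" + "BCD" + "EFG" + "GFE";
        return a == "MAMBCDEFGGFE";
    }

    // For contrast: one non-constant operand forces run-time concatenation.
    static boolean withVariableOperand() {
        String prefix = "MAM";
        String a = prefix + "BCD" + "EFG" + "GFE";
        return a == "MAMBCDEFGGFE";
    }

    public static void main(String[] args) {
        System.out.println(foldedIntoOneLiteral()); // true
        System.out.println(withVariableOperand());  // false
    }
}
```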
Only one object is created.
String s1 = "java";
String s2 = "ja" + "va";
System.out.println(s1 == s2);
The statement yields true.
String s1 = "java";
String s2 = "ja";
String s3 = s2 + "va";
System.out.println(s1 == s3);
The statement yields false.
So at least one operand must be a variable (not a compile-time constant) for the '+' operator to generate a new string object (outside the constant pool, as if created with new).
The expression in your question has no such variable operand, so it creates only one object.
Exactly one object is created and placed in the constant pool, unless it already exists, in which case the existing object is used. The compiler concatenates string constants together, as specified in JLS 3.10.5 and 15.28.
A long string literal can always be broken up into shorter pieces and written as a (possibly parenthesized) expression using the string concatenation operator +
http://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.5
Most answers focus on two points: a) the complete expression is one compile-time constant, and b) the line itself does not construct a new object but only assigns a reference to an existing one.
However, no one so far has mentioned that the String itself contains a reference to an internal char[] (which is also in the constant pool).
Summary: There are two objects in the constant pool (String and char[]). The line neither creates nor destroys any object.
And regarding:
I am confused when the object gets destroyed.
No object is destroyed, since objects in the constant pool are only destroyed if the class itself is unloaded. At most you can say that the reference a will eventually go out of scope.
Only one object will be created since String a will compile into "MAMBCDEFGGFE".
Answers stating a single heap object in your example are correct. However, consider this code:
public class Tester
{
public String a="MAM";
public String b ="BCD";
public String c = "EFG";
public String d ="GFE";
public Tester()
{
String abcd = a + b + c + d;
}
}
In this example, there are 7 strings being created. a,b,c and d are not compiled into a single constant - they are members. 1 string is then created for each + operator - semantically speaking, + is a concatenation but logically it is creating a new string in memory. The first 2 operator strings are discarded immediately and are now eligible for garbage collection but the memory churn still occurs.
Technically there is an 8th object: the instance of Tester.
Edit: This has been proved to be nonsense in the comments
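As a rough illustration of why the count above does not hold: a chained a + b + c + d on non-constant operands is evaluated as one concatenation, not as a series of intermediate Strings. The exact bytecode depends on the compiler version (a StringBuilder chain in older javac, an invokedynamic call since Java 9), so the StringBuilder version below is only an assumed approximation, and the class and method names are hypothetical.

```java
class FieldConcatDemo {
    String a = "MAM";
    String b = "BCD";
    String c = "EFG";
    String d = "GFE";

    // What the source code says.
    String viaPlus() {
        return a + b + c + d;
    }

    // Roughly what older compilers emit for the expression above:
    // one builder, one final String, no intermediate String per '+'.
    String viaBuilder() {
        return new StringBuilder().append(a).append(b).append(c).append(d).toString();
    }

    public static void main(String[] args) {
        FieldConcatDemo demo = new FieldConcatDemo();
        String plus = demo.viaPlus();
        System.out.println(plus.equals(demo.viaBuilder())); // true: same contents
        System.out.println(plus == "MAMBCDEFGGFE");         // false: built at run time
        System.out.println(plus.intern() == "MAMBCDEFGGFE"); // true: pooled instance
    }
}
```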
I was reading some advice about best practices in Java and came across the following, which made me curious:
Also whenever you want to instantiate a String object, never use its constructor but always instantiate it directly.
For example:
//slow instantiation
String slow = new String("Yet another string object");
//fast instantiation
String fast = "Yet another string object";
Why is this? Doesn't the 'fast' version call the default String constructor?
When you use new you get a new string object, but if you use string literal then see here:
In computer science, string interning is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool. The single copy of each string is called its 'intern' and is typically looked up by a method of the string class, for example String.intern() in Java. All compile-time constant strings in Java are automatically interned using this method.
If you do:
String a = "foo";
String b = "foo";
Then a==b is true !
A String will be created only if it hasn't been interned.
An object will be created in the first time, and it'll be stored in a place called the String constant pool.
But using new creates a different object for each string, so the same comparison yields false.
String a = new String("foo");
String b = new String("foo");
Now a==b is false.
So when using literal it is easier to read, plus easier for the compiler to make optimizations. So.. use it when you can.
The JVM maintains a pool of String literals for optimizations. When you create a String using the constructor,
String s1 = new String("foo");
A new String object is created, and the literal "foo" is added to the pool. After this, anytime you use "foo" in your code, the "foo" refers to the item in the pool and a new object is not created. Since String is immutable, this does not create any problems.
So when you create a String using the "shortcut":
String s2 = "foo";
the JVM looks into the pool and, if "foo" already exists there, makes s2 refer to the item in the pool.
This is a major difference with possible performance impacts: The constructor always creates an object, and adds the literal to the pool if it is not already present there. The shortcut refers to the item in the pool and creates a new object only if the literal is not in the pool.
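One way to observe how many distinct objects are involved is to collect the references in an identity-based set, which compares with == rather than equals(). The helper below is a hypothetical sketch built on the standard IdentityHashMap; it is not part of any answer above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class DistinctInstanceCounter {
    // Counts how many different objects the given references point to.
    static int countDistinct(String... strings) {
        Set<String> seen =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        seen.addAll(Arrays.asList(strings));
        return seen.size();
    }

    public static void main(String[] args) {
        // Two uses of the literal share one pooled object; each new adds one.
        System.out.println(
                countDistinct("foo", "foo", new String("foo"), new String("foo"))); // 3
        // Interning a copy leads back to the pooled object.
        System.out.println(countDistinct("foo", new String("foo").intern()));      // 1
    }
}
```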
In Java, apparently, String s = "foo" is preferred over String s = new String("foo").
Why? Isn't a new string object created in both cases? Why would the first case preclude calling a constructor?
Why?
Because the second approach results in two string objects (the original due to the string literal, plus an explicit copy).
The first case is a string literal, simply a shorthand the language offers you to create a string. The String class constructor still gets called, just not explicitly, which means less typing and less code clutter.
The second case takes the String object already created by the literal and passes it to a constructor, which copies the content to create a new, separate String object. The literal will still be around because literals are interned.
There is rarely a point to using the String constructor (pretty much only when you've created a substring of a very large string and want to release the memory used by the rest of the string, because in JDKs before Java 7 update 6 substrings use the same underlying char array as the original string, just with a different offset and length).
I don't think it's preferable. I assume the only "benefit" you get is that if you wrongly use the "==" operator rather than the equals method, having two different instances of a string will make the comparison fail faster, which will prompt you to fix your code. (Otherwise the == operator may "succeed" and fail unpredictably.)
Unless of course your code requires you to construct two different instances for whatever reason
Why? Isn't a new string object created in both cases?
No, the initial form being a string literal will be interned such that only one instance is created:
String s = "foo";
String s2 = "foo";
s == s2 => true
There are two ways to create a String object:
1) using literal as in String s ="hello" (creates one object)
2) using new as in String s = new String("hello") (creates two objects)
I was wondering why I would ever need to use approach 2)?
If you create a string with new, then you get a different String reference. This can avoid creepy behaviour:
String s = "hello";
String t = "hello";
String u = new String("hello");
System.out.println(s==t);
System.out.println(t==u);
prints true, false. I can't really think of a real bit of software where I'd use this. But in a sense it is 'safer' to create new references, so that == doesn't surprise us.
The basic difference between them is memory allocation.
First option i.e
String s1 = "hello";
Here s1 refers to a string literal; the literal is recorded in the class file at compile time, and the JVM keeps a single pooled String object for it.
But in the 2nd case
String s2 = new String("hello");
In this case s2 refers to a separate String object representing hello.
When you create two string literals using the first form, both refer to the same memory. String literals work with the concept of a string pool: when you create a 2nd string literal with the same content, instead of allocating new space the JVM returns the same reference. Hence you get true when you compare those two literals using the == operator.
But in the 2nd case the JVM creates a new object every time, and you have to compare their contents using the equals() method rather than the == operator.
If you create a string using the 2nd form but don't want to keep a separate object, you can use the intern() method to get the pooled object instead.
String s = "hello";
String s1 = new String("hello").intern();
System.out.println(s == s1);
In this case instead of creating a new object, JVM will return the same reference s. So the output will be true
The only mentally sane occasion where new String("foo") should be used is in unit tests. With it you can make sure that the code does not use == for string comparisons but the proper .equals() method.
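A minimal sketch of that testing idea follows. The class, the method names and the "admin" value are made up for illustration: the buggy version happens to pass when fed a literal, and only an argument created with new String exposes the == mistake.

```java
class EqualityBugDemo {
    // Buggy: compares references, works only by accident for literals.
    static boolean isAdminBuggy(String role) {
        return role == "admin";
    }

    // Correct: compares contents, and is null-safe as written.
    static boolean isAdminCorrect(String role) {
        return "admin".equals(role);
    }

    public static void main(String[] args) {
        String fromLiteral = "admin";
        String fromElsewhere = new String("admin"); // stands in for input read at run time

        System.out.println(isAdminBuggy(fromLiteral));     // true: the bug stays hidden
        System.out.println(isAdminBuggy(fromElsewhere));   // false: the bug is exposed
        System.out.println(isAdminCorrect(fromElsewhere)); // true
    }
}
```

In a real test suite the same check would be an assertion that the method returns true for new String("admin").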
The second approach is just a possibility. In practice it is almost never used by most developers. The first one is a shorter and more convenient version of the latter, so there is no reason to use the second way.
PS. The second just creates a different object wrapping the same literal. Technically they will re-use the same char array. The only difference is that the reference will be different (i.e. == will give false, but NEVER use == for string comparison).
This can be understood as a copy constructor. These are widely used in C++. The net effect is a duplicate of the object passed as a parameter, in this case a String.